Slide 1

Slide 1 text

Ruby meets WebAssembly
Yuta Saito / @kateinoigakukun

Slide 2

Slide 2 text

About this talk
1. Introduction to Ruby on WebAssembly
   1. Motivation
   2. Live demo: “on-browser” usage
   3. About WebAssembly and WASI
   4. Live demo: “non-web” usage
2. Technical details: How to port the Ruby Interpreter
   1. Explore missing pieces for porting: Exceptions, Fiber, Conservative GC
   2. A magic technique: Asyncify
3. FAQ & Recap

Slide 3

Slide 3 text

About me
• Yuta Saito / @kateinoigakukun
• Waseda University B4
• Working at
• A newbie CRuby / Swift / LLVM committer
• Porting languages to WebAssembly

Slide 4

Slide 4 text

Introduction to Ruby on WebAssembly
Motivation

Slide 5

Slide 5 text

What's Good about Ruby
• Designed to make programmers happy
• Fast to write a program
• Well-developed Gem Ecosystem

Slide 6

Slide 6 text

What's Difficult in Ruby?
1. Some platforms can’t run Ruby easily
   • Some restricted platforms can’t install a Ruby interpreter (e.g. Web browsers, mobile devices)
   • Hard to run your Ruby program on a user’s machine
   • Run on a server? Need to maintain a server?
2. Installation battle 💥
   • Making the first step easy is important for beginners
   • How many times have you seen BUILD FAILED?

Slide 7

Slide 7 text

🤔 How can we solve them?

Slide 8

Slide 8 text

Ruby 🤝 WebAssembly

Slide 9

Slide 9 text

WebAssembly is a game changer
• A binary instruction format for a stack-based machine
• Designed to be
  • Portable
  • Language agnostic
  • Size- and load-time efficient
  • Secure by sandbox
  • and more…

Slide 10

Slide 10 text

What WebAssembly solves
1. Some platforms can’t run Ruby easily → ✅ Browsers are everywhere now 🌐
2. Installation battle 💥 → ✅ Beginners can try Ruby in the browser without installation

Slide 11

Slide 11 text

Execution Flow of WebAssembly
C / C++, Swift, Rust, Go, … → .wasm → Web browsers

Slide 12

Slide 12 text

Ruby on WebAssembly
Ruby Interpreter (CRuby) → ruby.wasm
ruby.wasm + app.rb → Web browsers

Slide 13

Slide 13 text

Introduction to Ruby on WebAssembly
Live demo: On browser Ruby

Slide 14

Slide 14 text

github.com/ruby/ruby.wasm
 packages/npm-packages/ruby-wasm-wasi/example/README.md

Slide 15

Slide 15 text

https://irb-wasm.vercel.app

Slide 16

Slide 16 text

Introduction to Ruby on WebAssembly
About WebAssembly and WASI

Slide 17

Slide 17 text

How WebAssembly works
Diagram: CRuby + Musl libc → .wasm (https://emscripten.org/)
• Vanilla WebAssembly has:
  • No file system
  • No system clock
  • No networks
  • …
• Time.now returns … what?

Slide 18

Slide 18 text

How WebAssembly works
• JS provides functions to the WebAssembly module
• Diagram: CRuby + Musl libc (.wasm) → syscall emulation in JS glue code (.js; MemFS, …) → Web API
• Example: Time.now → clock_gettime → Date.now()
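The Time.now → clock_gettime → Date.now() chain can be seen from the C side. The following is a minimal sketch, assuming a POSIX-style libc; the helper name now_unix_seconds is hypothetical and is not taken from CRuby. What actually answers the call (JS glue code, a WASI host, or a native kernel) depends on the toolchain and platform.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical helper: wall-clock time as whole seconds since the Unix epoch.
 * In a browser build, the libc call below is the point where the host
 * environment has to supply the time; natively it is an ordinary system call. */
long long now_unix_seconds(void) {
    struct timespec ts;
    if (clock_gettime(CLOCK_REALTIME, &ts) != 0) {
        return -1; /* no clock available */
    }
    return (long long)ts.tv_sec;
}
```

The C code is the same on every platform; only the layer underneath clock_gettime changes.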

Slide 19

Slide 19 text

WebAssembly is not only for Web
• Sandbox security, architecture portability, and support for many languages are useful in other areas too
  • Serverless platforms
  • Plugin systems
  • and more…
• WebAssembly does not always run on top of JavaScript

Slide 20

Slide 20 text

WASI (WebAssembly System Interface)
• WASI standardizes a platform-independent system call interface
Overview of WASI (diagram): User Application (.wasm, Musl libc) → WASI interface → Host Application (Web Polyfill, Bare Metal, Serverless platform, …)

Slide 21

Slide 21 text

WASI compatible things
Platforms
• Node.js / Deno / Wasmtime
• Fastly Compute@Edge
• Cloudflare Workers
• VSCode Extensions
• Fluent Bit
• Etc…
Languages
• C / C++
• Rust
• Swift
• Ruby (NEW)
• Etc…

Slide 22

Slide 22 text

WASI + VFS = Portable Ruby App
• The CRuby program itself is now portable everywhere!
• However
  • The .rb files need to be distributed as well
  • Most Wasm-integrated tools require “One Binary”
Diagram: User Application (.wasm, Musl libc) → WASI interface → WASI Implementation (Host); app.rb and lib.rb sit on the host side

Slide 23

Slide 23 text

WASI + VFS = Portable Ruby App
• wasi-vfs
  • A VFS layer backed by WASI
  • Read-only in-memory FS
Diagram: User Application (.wasm, Musl libc, wasi-vfs holding app.rb and lib.rb) → WASI interface → WASI Implementation (Host)
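The idea of a read-only in-memory file system can be sketched as a constant table compiled into the binary. This is only an illustration of the concept: the struct, the table contents, and the function mem_vfs_read are hypothetical, and wasi-vfs's real data layout and API are not shown here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical embedded file entry: a path and its contents. */
struct mem_file {
    const char *path;
    const char *contents;
};

/* The table is const, so the "file system" is read-only by construction. */
static const struct mem_file mem_files[] = {
    { "/app.rb", "require_relative 'lib'\nputs greet\n" },
    { "/lib.rb", "def greet; 'hello'; end\n" },
};

/* Returns the embedded contents for path, or NULL when the file is not packed. */
const char *mem_vfs_read(const char *path) {
    for (size_t i = 0; i < sizeof mem_files / sizeof mem_files[0]; i++) {
        if (strcmp(mem_files[i].path, path) == 0) {
            return mem_files[i].contents;
        }
    }
    return NULL;
}
```

Because the files live inside the module's own memory, the single .wasm file is all that needs to be shipped.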

Slide 24

Slide 24 text

Introduction to Ruby on WebAssembly
Live demo: Serverless
github.com/kateinoigakukun/ruby-compute-runtime

Slide 25

Slide 25 text

https://ruby-compute-runtime-demo.edgecompute.app/

Slide 26

Slide 26 text

Ruby 3.2 will support WebAssembly and WASI

Slide 27

Slide 27 text

What’s new around WebAssembly in Ruby 3.2
• Support for the WebAssembly / WASI target
• ruby/ruby.wasm provides npm packages and pre-compiled rubies
• Let’s find interesting use cases!

Slide 28

Slide 28 text

How to Port CRuby to WebAssembly with WASI

Slide 29

Slide 29 text

Port CRuby to WebAssembly with WASI
Me: OK, we already have a C-to-WebAssembly compiler, so it’s easy to port!

Slide 30

Slide 30 text

Port CRuby to WebAssembly with WASI
CRuby has many internal dragons…
• 🐲 Exceptions
  • Depend heavily on setjmp/longjmp, which cannot be implemented in WebAssembly itself
• 🐲 Fiber
  • WebAssembly itself doesn’t support context-switching
• 🐲 Conservative GC
  • Needs to inspect the WebAssembly VM

Slide 31

Slide 31 text

Technical details: Exception implementation

Slide 32

Slide 32 text

CRuby implements exceptions by setjmp/longjmp
• setjmp saves the current program state
• longjmp restores the saved program state
Call chain: raise → rb_raise_jump → ...
Ruby code:
    raise "You got an error"
C code:
    static void
    rb_raise_jump(VALUE mesg, VALUE cause)
    {
        rb_execution_context_t *ec = GET_EC();
        ...
        rb_vm_pop_frame(ec);
        ...
        rb_longjmp(ec, TAG_RAISE, mesg, cause);
    }

    #define EC_EXEC_TAG() \
        (ruby_setjmp(_tag.buf) ? rb_ec_tag_state(VAR_FROM_MEMORY(_ec)) : (EC_REPUSH_TAG(), 0))

    VALUE
    vm_exec(rb_execution_context_t *ec, bool mjit_enable_p)
    {
        enum ruby_tag_type state;
        VALUE result = Qundef;
        VALUE initial = 0;
        EC_PUSH_TAG(ec);
        _tag.retval = Qnil;
        if ((state = EC_EXEC_TAG()) == TAG_NONE) {
            if (!mjit_enable_p || (result = mjit_exec(ec)) == Qundef) {
                result = vm_exec_core(ec, initial);
            }
            goto vm_loop_start; /* fallback to the VM */
        }
        else {
            result = ec->errinfo;
            rb_ec_raised_reset(ec, RAISED_STACKOVERFLOW | RAISED_NOMEMORY);
            while ((result = vm_exec_handle_exception(ec, state, result, &initial)) == Qundef) {
                /* caught a jump, exec the handler */
                result = vm_exec_core(ec, initial);
              vm_loop_start:
                VM_ASSERT(ec->tag == &_tag);
                /* when caught `throw`, `tag.state` is set. */
                if ((state = _tag.state) == TAG_NONE) break;
                _tag.state = TAG_NONE;
            }
        }
        EC_POP_TAG();
        return result;
    }
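The pattern above (setjmp at the handler, longjmp at the raise site) can be reduced to a small standalone program. This is a minimal sketch under stated assumptions, not CRuby's code: the names handler, raise_error, risky, and run_protected are hypothetical, and a single global handler slot stands in for CRuby's tag stack.

```c
#include <setjmp.h>
#include <stdio.h>

static jmp_buf handler;            /* hypothetical single handler slot */
static const char *error_message;  /* payload carried across the jump */

/* Plays the role of `raise`: jump straight back to the saved state. */
static void raise_error(const char *message) {
    error_message = message;
    longjmp(handler, 1);
}

static void risky(int should_fail) {
    if (should_fail) {
        raise_error("You got an error");
    }
    puts("no error");
}

/* Plays the role of begin/rescue: returns 0 normally, 1 when rescued. */
int run_protected(int should_fail) {
    if (setjmp(handler) == 0) {
        /* First return from setjmp: run the protected body. */
        risky(should_fail);
        return 0;
    }
    /* Second return, reached via longjmp: handle the error. */
    printf("rescued: %s\n", error_message);
    return 1;
}
```

run_protected(1) never returns from risky in the usual way; control reappears at the setjmp call, which is exactly the behavior a WebAssembly port has to reproduce.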

Slide 33

Slide 33 text

setjmp/longjmp on x86_64
Save and Restore
• Machine stack position
• Machine registers
• Program counter (Return address)
musl-libc src/setjmp/x86_64/setjmp.s
    /* Copyright 2011-2012 Nicholas J. Kain, licensed under standard MIT license */
    setjmp:
        mov %rbx,(%rdi)         /* rdi is jmp_buf, move registers onto it */
        mov %rbp,8(%rdi)
        mov %r12,16(%rdi)
        mov %r13,24(%rdi)
        mov %r14,32(%rdi)
        mov %r15,40(%rdi)
        lea 8(%rsp),%rdx        /* this is our rsp WITHOUT current ret addr */
        mov %rdx,48(%rdi)
        mov (%rsp),%rdx         /* save return addr ptr for new rip */
        mov %rdx,56(%rdi)
        xor %eax,%eax           /* always return 0 */
        ret
musl-libc src/setjmp/x86_64/longjmp.s
    longjmp:
        xor %eax,%eax
        cmp $1,%esi             /* CF = val ? 0 : 1 */
        adc %esi,%eax           /* eax = val + !val */
        mov (%rdi),%rbx         /* rdi is the jmp_buf, restore regs from it */
        mov 8(%rdi),%rbp
        mov 16(%rdi),%r12
        mov 24(%rdi),%r13
        mov 32(%rdi),%r14
        mov 40(%rdi),%r15
        mov 48(%rdi),%rsp
        jmp *56(%rdi)           /* goto saved address without altering rsp */

Slide 34

Slide 34 text

WebAssembly Execution Model
• Control flow is protected by the WebAssembly VM
• Only goto/call/return are allowed
• Code: Functions[0] … Functions[3]
• Protected Stack (Push on call, Pop on return): each Call Frame holds a Return Address and values such as i32(0x42) and i64(0xffff0a)
  • Can’t jump to an arbitrary address 🙅
  • Can’t read/write the return addresses 🙅

Slide 35

Slide 35 text

Missing pieces for porting
1. Save the current execution state
2. Unwind to the saved execution state

Slide 36

Slide 36 text

Technical details: Fiber Implementation

Slide 37

Slide 37 text

Fiber in CRuby
• Semi-coroutine
• Suspend/Resume programs
Diagram: fib.resume switches from the Main Fiber to fib; Fiber.yield switches back; the pattern repeats
    fib = Fiber.new do
      Fiber.yield x = 0
      Fiber.yield y = 1
      loop do
        x, y = y, x + y
        Fiber.yield y
      end
    end
    3.times { puts fib.resume }

Slide 38

Slide 38 text

Fiber context-switch
• Fiber#resume / Fiber.yield / Fiber#transfer switches “context”
• Context
  • Ruby VM stack (program level)
  • Machine stack (architecture specific)
  • Machine registers (architecture specific)
  • Program counter (architecture specific)
Diagram: Fiber 1 and Fiber 2 each keep their own Context; one of them is the Current Context

Slide 39

Slide 39 text

Missing pieces for porting
1. Save the current execution state
2. Unwind to the saved execution state
   → Restore an execution state from an arbitrary execution state (not limited to being within the call stack)

Slide 40

Slide 40 text

Technical details: Conservative GC Implementation

Slide 41

Slide 41 text

Conservative GC in CRuby
• Scan data space to find object-like values
• CRuby scans:
  • Machine Registers
  • Machine Stack
Diagram: the registers and the machine stack hold a mix of VALUE and non-VALUE words; for some words it is unclear which they are (?)
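“Scanning for object-like values” can be sketched as a range-and-alignment test over raw machine words. This is an illustrative sketch only: the heap layout, SLOT_SIZE, and every function name here are hypothetical and much simpler than CRuby's real heap. Note that the test cannot tell a pointer from an integer that happens to have the same bit pattern, which is why the approach is called conservative.

```c
#include <stddef.h>
#include <stdint.h>

#define SLOT_SIZE 40u   /* hypothetical size of one object slot */
#define SLOT_COUNT 8u

static unsigned char heap[SLOT_SIZE * SLOT_COUNT];

/* A word is "object-like" if it falls inside the heap on a slot boundary. */
static int looks_like_object(uintptr_t word) {
    uintptr_t start = (uintptr_t)heap;
    uintptr_t end = start + sizeof heap;
    return word >= start && word < end && (word - start) % SLOT_SIZE == 0;
}

/* Scan a range of words (a stand-in for the machine stack or registers). */
size_t count_object_like(const uintptr_t *begin, const uintptr_t *end) {
    size_t count = 0;
    for (const uintptr_t *p = begin; p < end; p++) {
        if (looks_like_object(*p)) {
            count++;
        }
    }
    return count;
}

size_t demo_scan(void) {
    uintptr_t stack_words[] = {
        (uintptr_t)&heap[0],                  /* VALUE: first slot */
        42,                                   /* non-VALUE: small integer */
        (uintptr_t)&heap[3 * SLOT_SIZE],      /* VALUE: fourth slot */
        (uintptr_t)&heap[3 * SLOT_SIZE + 1],  /* non-VALUE: inside a slot */
    };
    return count_object_like(stack_words, stack_words + 4);
}
```

The scanner needs raw read access to every word that might hold a reference, which is the access WebAssembly's protected stack does not give.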

Slide 42

Slide 42 text

WebAssembly Execution Model
• Machine Stack = C Stack + Value Stack
• Machine Register = Locals
• Code: Functions[0] … Functions[3]
• Protected Stack (Push on call, Pop on return): each Call Frame holds a Return Address, Locals[0] … Locals[2], and a Value Stack (e.g. i32(0x42), i64(0xffff0a))
  • The current call frame can be read and written
  • Return addresses can’t be read 🙅
  • Other call frames can’t be scanned 🙅
• Linear Memory (C Stack, Data, Heap): reading and writing are OK 🙆
  • The C Stack is the space for local values that are referenced by address

Slide 43

Slide 43 text

Missing pieces for porting
1. Save the current execution state
2. Restore an execution state from an arbitrary execution state (not limited to being within the call stack)
3. Inspect the Locals and Value Stack of all call frames

Slide 44

Slide 44 text

Technical details: Asyncify

Slide 45

Slide 45 text

Asyncify fills the missing pieces 🧙
• Provides low-level support for pausing and resuming programs
• Designed by Alon Zakai to call async JS functions from sync C code
• Works by instrumenting .wasm binaries
  • wasm-opt input.wasm --asyncify -o output.wasm

Slide 46

Slide 46 text

Asyncify Example
• Unwind: Serialize the execution state and return control to the root
• Rewind: Call the entrypoint function again and rebuild the call stack
Call stack: main → foo → sleep (Unwind), then main → foo → sleep again (Rewind)
    void sleep(void) {
      static bool is_sleeping = false;
      if (!is_sleeping) {
        is_sleeping = true;
        asyncify_start_unwind();
      } else {
        is_sleeping = false;
        asyncify_stop_rewind();
      }
    }
    void foo(void) {
      puts("before");
      sleep();
      puts("after");
    }
    int main(void) {
      foo();
      asyncify_stop_unwind();
      puts("sleeping");
      asyncify_start_rewind();
      foo();
    }

Slide 47

Slide 47 text

Asyncify Example
Step: main calls foo() for the first time. Call stack: main → foo
    foo();    /* in main */

Slide 48

Slide 48 text

Asyncify Example
Step: foo prints "before". Call stack: main → foo
    puts("before");    /* in foo */

Slide 49

Slide 49 text

Asyncify Example
Step: foo calls sleep(), which starts the unwind.
    void sleep(void) {
      static bool is_sleeping = false;
      if (!is_sleeping) {
        is_sleeping = true;
        asyncify_start_unwind();
      }
    }
    sleep();    /* in foo */

Slide 50

Slide 50 text

Asyncify Example
Step: the unwind returns from sleep() back into foo.
    sleep();    /* in foo */

Slide 51

Slide 51 text

Asyncify Example
Step: foo returns without running puts("after"), and main stops the unwind.
    puts("before");
    sleep();
    puts("after"); // skipped
    int main(void) {
      foo();
      asyncify_stop_unwind();

Slide 52

Slide 52 text

Asyncify Example
Step: main prints "sleeping", starts the rewind, and calls foo() again.
    puts("sleeping");
    asyncify_start_rewind();
    foo();

Slide 53

Slide 53 text

Asyncify Example
Step: the second foo() call runs in rewinding mode.
    asyncify_start_rewind();
    foo();

Slide 54

Slide 54 text

Asyncify Example
Step: foo skips puts("before") and calls sleep() again; sleep() stops the rewind.
    void sleep(void) {
      static bool is_sleeping = false;
      if (!is_sleeping) {
      } else {
        is_sleeping = false;
        asyncify_stop_rewind();
      }
    }
    void foo(void) {
      // skipped
      sleep();
    }

Slide 55

Slide 55 text

Asyncify Example
Step: execution continues normally and foo prints "after".
    puts("after");    /* in foo */
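The walk-through above can be imitated in plain C without the Asyncify tooling, which may help when tracing the order of events. This is a hand-written simulation under stated assumptions: the state variable, saved_call_index, sim_sleep, and run_sleep_demo are all hypothetical names, and the checks that Asyncify would inject automatically are written out by hand in foo.

```c
#include <stdio.h>
#include <string.h>

enum sim_state { NORMAL, UNWINDING, REWINDING };

static enum sim_state state = NORMAL;
static int saved_call_index = -1;  /* stand-in for the serialized call stack */
static const char *lines[4];       /* record of what was printed */
static int line_count = 0;

static void say(const char *text) {
    lines[line_count++] = text;
    puts(text);
}

static void sim_sleep(void) {
    if (state == NORMAL) {
        state = UNWINDING;   /* plays the role of asyncify_start_unwind() */
    } else if (state == REWINDING) {
        state = NORMAL;      /* plays the role of asyncify_stop_rewind() */
    }
}

static void foo(void) {
    int call_index = -1;
    if (state == REWINDING) {
        call_index = saved_call_index;  /* where were we paused? */
    }
    if (state == NORMAL) {
        say("before");                  /* skipped while rewinding */
    }
    if (state == NORMAL || call_index == 0) {
        sim_sleep();
        if (state == UNWINDING) {
            saved_call_index = 0;       /* remember the paused call */
            return;                     /* "after" is skipped for now */
        }
    }
    say("after");
}

/* Returns 1 when the output order is before, sleeping, after. */
int run_sleep_demo(void) {
    line_count = 0;
    state = NORMAL;
    foo();                 /* prints "before", then unwinds */
    state = NORMAL;        /* plays the role of asyncify_stop_unwind() */
    say("sleeping");
    state = REWINDING;     /* plays the role of asyncify_start_rewind() */
    foo();                 /* rebuilds the stack, then prints "after" */
    return line_count == 3
        && strcmp(lines[0], "before") == 0
        && strcmp(lines[1], "sleeping") == 0
        && strcmp(lines[2], "after") == 0;
}
```

Calling run_sleep_demo() once prints the three lines in that order; foo is entered twice but each puts runs only once.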

Slide 56

Slide 56 text

How Asyncify instruments program
1. Spill out Wasm stack values into Wasm registers
2. Instrument control flow around calls, adding skips for rewinding
3. Instrument saving/restoring of Wasm registers

Slide 57

Slide 57 text

Asyncify: Instrument program
C code:
    static VALUE sym_length(VALUE sym) {
      return rb_str_length(rb_sym2str(sym));
    }
Wasm stack machine code:
    (local.get $sym)
    (call $rb_sym2str)
    (call $rb_str_length)
    (return)

Slide 58

Slide 58 text

1. Spill out Wasm stack values into Wasm registers
Guarantee that each statement doesn’t pop previous results
C code:
    static VALUE sym_length(VALUE sym) {
      register VALUE v1 = rb_sym2str(sym);
      register VALUE v2 = rb_str_length(v1);
      return v2;
    }
Wasm stack machine code:
    (local.get $sym)
    (call $rb_sym2str)
    (local.set $v1)
    (local.get $v1)
    (call $rb_str_length)
    (local.set $v2)
    (local.get $v2)
    (return)

Slide 59

Slide 59 text

2. Instrument control flow around calls, adding skips for rewinding
    static VALUE sym_length(VALUE sym) {
      register VALUE v1;
      register VALUE v2;
      if (__asyncify_state == REWINDING) {
        __asyncify_get_call_index();
      }
      if (__asyncify_state == NORMAL || __asyncify_check_call_index(0)) {
        v1 = rb_sym2str(sym);
        if (__asyncify_state == UNWINDING) {
          __asyncify_unwind(0);
        }
      }
      if (__asyncify_state == NORMAL || __asyncify_check_call_index(1)) {
        v2 = rb_str_length(v1);
        if (__asyncify_state == UNWINDING) {
          __asyncify_unwind(1);
        }
      }
      return v2;
    }

Slide 60

Slide 60 text

3. Instrument saving/restoring of Wasm registers
Pseudocode (the unwind label takes the call index as a parameter):
    static VALUE sym_length(VALUE sym) {
      register int __asyncify_call_index;
      register VALUE v1;
      register VALUE v2;
      if (__asyncify_state == REWINDING) {
        /* pop in the reverse order of the pushes below */
        __asyncify_call_index = *(--__asyncify_data);
        v1 = *(--__asyncify_data);
        v2 = *(--__asyncify_data);
      }
      if (__asyncify_state == NORMAL || __asyncify_check_call_index(0)) {
        v1 = rb_sym2str(sym);
        if (__asyncify_state == UNWINDING) {
          goto __asyncify_unwind(0);
        }
      }
      if (__asyncify_state == NORMAL || __asyncify_check_call_index(1)) {
        v2 = rb_str_length(v1);
        if (__asyncify_state == UNWINDING) {
          goto __asyncify_unwind(1);
        }
      }
      return v2;
    __asyncify_unwind(int call_index):
      *(__asyncify_data++) = v2;
      *(__asyncify_data++) = v1;
      *(__asyncify_data++) = call_index;
      return 0;
    }
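The pseudocode above can be turned into compilable C to check the save/restore order of the locals. This is a sketch under stated assumptions, not the code Asyncify generates: fake_sym2str and fake_str_length are hypothetical stand-ins for rb_sym2str and rb_str_length, values are plain long integers, and fake_str_length pauses exactly once to force one unwind/rewind round trip.

```c
enum { NORMAL, UNWINDING, REWINDING };

static int asyncify_state = NORMAL;
static long asyncify_data[8];   /* stand-in for the Asyncify data buffer */
static int asyncify_top = 0;
static int pause_once = 1;      /* make the first length call pause */

static long fake_sym2str(long sym) { return sym + 100; }

static long fake_str_length(long str) {
    if (asyncify_state == REWINDING) {
        asyncify_state = NORMAL;      /* back at the pause point: resume */
    } else if (pause_once) {
        pause_once = 0;
        asyncify_state = UNWINDING;   /* request a pause */
        return 0;
    }
    return str % 100;
}

static long sym_length(long sym) {
    int call_index = -1;
    long v1 = 0, v2 = 0;
    if (asyncify_state == REWINDING) {
        /* Restore locals in the reverse order of the pushes below. */
        call_index = (int)asyncify_data[--asyncify_top];
        v1 = asyncify_data[--asyncify_top];
        v2 = asyncify_data[--asyncify_top];
    }
    if (asyncify_state == NORMAL || call_index == 0) {
        v1 = fake_sym2str(sym);
        if (asyncify_state == UNWINDING) { call_index = 0; goto unwind; }
    }
    if (asyncify_state == NORMAL || call_index == 1) {
        v2 = fake_str_length(v1);
        if (asyncify_state == UNWINDING) { call_index = 1; goto unwind; }
    }
    return v2;
unwind:
    /* Save locals so a later rewind can rebuild this frame. */
    asyncify_data[asyncify_top++] = v2;
    asyncify_data[asyncify_top++] = v1;
    asyncify_data[asyncify_top++] = call_index;
    return 0;
}

/* Driver: run once, and if the call unwound, rewind into it again. */
long sym_length_with_pause(long sym) {
    long result = sym_length(sym);
    if (asyncify_state == UNWINDING) {
        asyncify_state = REWINDING;
        result = sym_length(sym);
    }
    return result;
}
```

On the rewinding pass fake_sym2str is not called again; v1 comes back from the data buffer, which is the point of step 3.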

Slide 61

Slide 61 text

Missing pieces for porting
1. Save the current execution state
   → ✅ Asyncify serializes the execution state into memory space
2. Restore an execution state from an arbitrary execution state (not limited to being within the call stack)
   → ✅ Asyncify deserializes the execution state from memory space
3. Inspect the Locals and Value Stack of all call frames
   → ✅ Asyncify spills the Value Stack to Locals, and Locals are serialized into the execution state

Slide 62

Slide 62 text

CRuby now runs on WebAssembly 👏

Slide 63

Slide 63 text

FAQ & Recap

Slide 64

Slide 64 text

FAQ: What are the limitations of Ruby on WebAssembly?
• No Thread-related API
  • Wasm and WASI don’t have a thread-spawning API yet
  • Thread-related methods raise NotImplementedError
• C extension libraries must be linked statically
  • The dynamic-linking ABI is not yet well supported

Slide 65

Slide 65 text

FAQ: How large is the .wasm binary?
Build        Contents                                                raw    gzip   Brotli
minimal      No standard extensions                                  8.1M   2.9M   1.7M
full         All standard extensions (like json, yaml, or stringio)  10M    3.4M   2.0M
full+stdlib  With stdlib .rb files                                   25M    7.3M   5.0M

Slide 66

Slide 66 text

FAQ: How fast is CRuby on WebAssembly?
• Near mruby speed
• Asyncify adds significant overhead
• Environment
  • Node.js v17.9.0
  • Ubuntu 20.04
  • AMD Ryzen 9 5900HX
Optcarrot FPS (bigger is better)
Implementation        FPS
master (native)       54.6
mruby                 21.6
master (wasm32-wasi)  18
Opal                  1.51

Slide 67

Slide 67 text

Acknowledgements
• @mame, @ko1, and other Ruby committers gave me a lot of advice
• Ruby Association supported the project as a grant project
• And thanks to all contributors and users

Slide 68

Slide 68 text

Recap
• Ruby 3.2 will support WebAssembly and WASI
• ruby.wasm provides npm packages and pre-compiled rubies
• Try your favorite gems on https://irb-wasm.vercel.app/
• Your feedback is welcome!

Slide 69

Slide 69 text

FAQ: Why not use JVM / .NET CLI / NaCL / eBPF / Lua … ?
WebAssembly is designed to meet a mix of all these goals
Wasm design goal       Similar to
Portability            JVM / .NET CLI
Language independency  .NET CLI
Secure sandbox         NaCL, eBPF
Embeddable runtime     Lua