Slide 1

Slide 1 text

Lightning-Fast Method Calls with Ruby 4.1 ZJIT @k0kubun / RubyKaigi 2026

Slide 2

Slide 2 text

@k0kubun

Slide 3

Slide 3 text

from “Ruby Committers and the World” session

Slide 4

Slide 4 text

Ruby JIT

Slide 5

Slide 5 text

YJIT
- Ruby 3.1+: ruby --yjit
- Production-ready JIT compiler
- Enabled by default on Rails 7.2+

Slide 6

Slide 6 text

https://speed.ruby-lang.org/

Slide 7

Slide 7 text

https://railsatscale.com/2025-01-10-yjit-3-4-even-faster-and-more-memory-efficient/

Slide 8

Slide 8 text

ZJIT
- Ruby 4.0+: ruby --zjit
- Experimental JIT compiler
- To be made production-ready in Ruby 4.1
- We call it “zee-jit”, not “zed-jit”
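A small sketch for checking which JIT is active in the running process. It assumes the `RubyVM::YJIT.enabled?` and `RubyVM::ZJIT.enabled?` predicates; the guards are there because either module may be missing depending on the Ruby version and build options. `jit_enabled?` and `jit_status` are hypothetical helper names, not part of Ruby.

```ruby
# Returns true only when RubyVM::<name> exists and reports itself as enabled.
def jit_enabled?(name)
  return false unless defined?(RubyVM) && RubyVM.const_defined?(name, false)

  jit = RubyVM.const_get(name)
  jit.respond_to?(:enabled?) && jit.enabled? ? true : false
end

# Summarize both JIT compilers for the current process.
def jit_status
  { yjit: jit_enabled?(:YJIT), zjit: jit_enabled?(:ZJIT) }
end

p jit_status
```

Running the script once with `ruby --yjit` and once with `ruby --zjit` should flip the corresponding entry, provided the build includes that JIT.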

Slide 9

Slide 9 text

ZJIT can make Ruby 10x faster https://rubybench.github.io/benchmarks/ruby-bench.html#getivar-module

Slide 10

Slide 10 text

ZJIT can make Ruby 15x faster https://rubybench.github.io/benchmarks/ruby-bench.html#getivar

Slide 11

Slide 11 text

ZJIT can make Ruby 40x faster https://rubybench.github.io/benchmarks/ruby-bench.html#structaset

Slide 12

Slide 12 text

But not as fast as YJIT on Rails (yet) https://rubybench.github.io/benchmarks/ruby-bench.html#railsbench

Slide 13

Slide 13 text

Why?
- Some YJIT optimizations have not been ported to ZJIT yet
- Method calls can be slower in ZJIT than in YJIT or the interpreter (for now)

Slide 14

Slide 14 text

Why Ruby’s method calls are slow

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

No content

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

Method call

Slide 22

Slide 22 text

Ruby VM stack Interpreter

Slide 23

Slide 23 text

Ruby VM stack Interpreter 
 Stack: six

Slide 24

Slide 24 text


 Stack: six Ruby VM stack Interpreter 3

Slide 25

Slide 25 text


 Stack: six Ruby VM stack Interpreter 3 Memory writes: 1

Slide 26

Slide 26 text


 Stack: six Ruby VM stack Interpreter 3 2 Memory writes: 2

Slide 27

Slide 27 text

Ruby VM stack Interpreter 
 Stack: six 3 Locals: add1 2 nil Memory writes: 3

Slide 28

Slide 28 text

Ruby VM stack Interpreter 
 Stack: six 3 Locals: add1 2 Env: add1 method
 entry Memory writes: 4 nil

Slide 29

Slide 29 text

Ruby VM stack Interpreter 
 Stack: six 3 Locals: add1 2 Env: add1 method
 entry block
 handler Memory writes: 5 nil

Slide 30

Slide 30 text

Ruby VM stack Interpreter 
 Stack: six 3 Locals: add1 2 Env: add1 method
 entry block
 handler flags Memory writes: 6 nil

Slide 31

Slide 31 text

Ruby VM stack Interpreter 
 Stack: six 3 Locals: add1 2 Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 pc Memory writes: 7 nil

Slide 32

Slide 32 text

Ruby VM stack Interpreter 
 Stack: six 3 Locals: add1 2 Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 pc Memory writes: 8 sp nil

Slide 33

Slide 33 text

Ruby VM stack Interpreter 
 Stack: six 3 Locals: add1 2 Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 pc Memory writes: 9 sp iseq nil

Slide 34

Slide 34 text

Ruby VM stack Interpreter 
 Stack: six 3 Locals: add1 2 Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 pc Memory writes: 10 sp iseq self nil

Slide 35

Slide 35 text

Ruby VM stack Interpreter 
 Stack: six 3 Locals: add1 2 Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 pc Memory writes: 11 sp iseq self ep nil

Slide 36

Slide 36 text

Ruby VM stack Interpreter 
 Stack: six 3 Locals: add1 2 Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 pc Memory writes: 12 sp iseq self ep block code nil

Slide 37

Slide 37 text

Ruby VM stack Interpreter 
 Stack: six 3 Locals: add1 2 Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 pc Memory writes: 13 sp iseq self ep block code jit return nil

Slide 38

Slide 38 text

Interpreter
Ruby VM stack
  Stack (six): 3
  Locals (add1): 2, nil
  Env (add1): method entry, block handler, flags
  rb_control_frame_t (add1): pc, sp, iseq, self, ep, block code, jit return
Ruby thread
  rb_execution_context_t: vm_stack, ..., cfp
Memory writes: 14

Slide 39

Slide 39 text

Method calls on the interpreter
- Method parameters are passed in memory
- A lot of memory writes for frame fields
- This is the bottleneck of method calls in Ruby
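The 14 memory writes counted in the interpreter walkthrough can be tallied with a short sketch. The field names come from the diagrams; the grouping into a hash and the `total_writes` helper are hypothetical, chosen only to make the arithmetic visible. This is a bookkeeping model, not VM code.

```ruby
# Each entry lists the slots the interpreter writes for one call of add1 from six,
# following the order of the walkthrough slides.
INTERPRETER_WRITES = {
  "values pushed by six"   => ["3", "2"],
  "add1 locals"            => ["nil"],
  "add1 env"               => ["method entry", "block handler", "flags"],
  "rb_control_frame_t"     => ["pc", "sp", "iseq", "self", "ep", "block code", "jit return"],
  "rb_execution_context_t" => ["cfp"],
}.freeze

# Sum the number of written slots across all groups.
def total_writes(groups)
  groups.values.sum(&:size)
end

INTERPRETER_WRITES.each do |group, fields|
  puts format("%-24s %2d  (%s)", group, fields.size, fields.join(", "))
end
puts "total memory writes: #{total_writes(INTERPRETER_WRITES)}"
```

Half of the writes land in `rb_control_frame_t`, which is why the later slides focus on frame fields rather than on argument passing.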

Slide 40

Slide 40 text

Method calls in YJIT

Slide 41

Slide 41 text

Ruby VM stack YJIT 
 Stack: six

Slide 42

Slide 42 text


 Stack: six Ruby VM stack YJIT CPU Reg1 Reg2 Reg3 3

Slide 43

Slide 43 text


 Stack: six Ruby VM stack YJIT CPU Reg1 Reg2 Reg3 3 Register writes: 1

Slide 44

Slide 44 text


 Stack: six Ruby VM stack YJIT CPU Reg1 Reg2 Reg3 3 2 Register writes: 2

Slide 45

Slide 45 text

Ruby VM stack 
 Stack: six 3 YJIT CPU Reg1 Reg2 Reg3 3 2 Register writes: 2 Memory writes: 1

Slide 46

Slide 46 text

Ruby VM stack 
 Stack: six 3 YJIT CPU Reg1 Reg2 Reg3 3 2 Register writes: 2 Memory writes: 2 Locals: add1 nil

Slide 47

Slide 47 text

Ruby VM stack 
 Stack: six 3 YJIT CPU Reg1 Reg2 Reg3 3 2 Register writes: 2 Memory writes: 3 Locals: add1 nil Env: add1 method
 entry

Slide 48

Slide 48 text

Ruby VM stack 
 Stack: six 3 YJIT CPU Reg1 Reg2 Reg3 3 2 Register writes: 2 Memory writes: 4 Locals: add1 nil Env: add1 method
 entry block
 handler

Slide 49

Slide 49 text

Ruby VM stack 
 Stack: six 3 YJIT CPU Reg1 Reg2 Reg3 3 2 Register writes: 2 Memory writes: 5 Locals: add1 nil Env: add1 method
 entry block
 handler flags

Slide 50

Slide 50 text

Ruby VM stack 
 Stack: six 3 YJIT CPU Reg1 Reg2 Reg3 3 2 Register writes: 2 Memory writes: 6 Locals: add1 nil Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 sp

Slide 51

Slide 51 text

Ruby VM stack 
 Stack: six 3 YJIT CPU Reg1 Reg2 Reg3 3 2 Register writes: 2 Memory writes: 7 Locals: add1 nil Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 sp iseq

Slide 52

Slide 52 text

Ruby VM stack 
 Stack: six 3 YJIT CPU Reg1 Reg2 Reg3 3 2 Register writes: 2 Memory writes: 8 Locals: add1 nil Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 sp iseq self

Slide 53

Slide 53 text

Ruby VM stack 
 Stack: six 3 YJIT CPU Reg1 Reg2 Reg3 3 2 Register writes: 2 Memory writes: 9 Locals: add1 nil Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 sp iseq self ep

Slide 54

Slide 54 text

Ruby VM stack 
 Stack: six 3 YJIT CPU Reg1 Reg2 Reg3 3 2 Register writes: 2 Memory writes: 10 Locals: add1 nil Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 sp iseq self ep block code

Slide 55

Slide 55 text

Ruby VM stack 
 Stack: six 3 YJIT CPU Reg1 Reg2 Reg3 3 2 Register writes: 2 Memory writes: 11 Locals: add1 nil Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 sp iseq self ep block code jit return

Slide 56

Slide 56 text

Ruby VM stack 
 Stack: six 3 YJIT CPU Reg1 Reg2 Reg3 3 2 Register writes: 2 Memory writes: 12 Locals: add1 nil Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 sp iseq self ep block code jit return Ruby thread rb_execution_context_t vm_stack ... cfp

Slide 57

Slide 57 text

YJIT
CPU registers (Reg1, Reg2, Reg3): 3, 2
Ruby VM stack
  Stack (six): 3
  Locals (add1): nil
  Env (add1): method entry, block handler, flags
  rb_control_frame_t (add1): sp, iseq, self, ep, block code, jit return
Ruby thread
  rb_execution_context_t: vm_stack, ..., cfp
Register writes: 2
Memory writes: 12 =
  14 (Interpreter)
  - 1 (PC)

Slide 58

Slide 58 text

YJIT
CPU registers (Reg1, Reg2, Reg3): 3, 2
Ruby VM stack
  Stack (six): 3
  Locals (add1): nil
  Env (add1): method entry, block handler, flags
  rb_control_frame_t (add1): pc, sp, iseq, self, ep, block code, jit return
Ruby thread
  rb_execution_context_t: vm_stack, ..., cfp
Register writes: 2
Memory writes: 12 =
  14 (Interpreter)
  - 1 (PC)
  - 2 (Register Allocation; register writes: 2)

Slide 59

Slide 59 text

YJIT
CPU registers (Reg1, Reg2, Reg3): 3, 2
Ruby VM stack
  Stack (six): 3
  Locals (add1): nil
  Env (add1): method entry, block handler, flags
  rb_control_frame_t (add1): pc, sp, iseq, self, ep, block code, jit return
Ruby thread
  rb_execution_context_t: vm_stack, ..., cfp
Register writes: 2
Memory writes: 12 =
  14 (Interpreter)
  - 1 (PC)
  - 2 (Register Allocation; register writes: 2)
  + 1 (Register Spill)

Slide 60

Slide 60 text

Method calls in YJIT
- Method parameters are passed in registers
- Most writes into frame fields (except PC) are not optimized
- This is still the bottleneck of method calls in Ruby

Slide 61

Slide 61 text

Method calls in ZJIT (before Lightweight Frames)

Slide 62

Slide 62 text

Ruby VM stack ZJIT 
 Stack: six

Slide 63

Slide 63 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 ZJIT Arg1 Arg2

Slide 64

Slide 64 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 ZJIT 3 Arg2 Arg1

Slide 65

Slide 65 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 ZJIT Memory writes: 1 3 Arg2 Arg1

Slide 66

Slide 66 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 ZJIT Memory writes: 1 3 Arg2 Arg1 Locals: add1

Slide 67

Slide 67 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 ZJIT Memory writes: 2 3 Arg2 Arg1 Locals: add1 Env: add1 method
 entry

Slide 68

Slide 68 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 ZJIT Memory writes: 3 3 Arg2 Arg1 Locals: add1 Env: add1 method
 entry block
 handler

Slide 69

Slide 69 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 ZJIT Memory writes: 4 3 Arg2 Arg1 Locals: add1 Env: add1 method
 entry block
 handler flags

Slide 70

Slide 70 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 ZJIT Memory writes: 5 3 Arg2 Arg1 Locals: add1 Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 iseq

Slide 71

Slide 71 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 ZJIT Memory writes: 6 3 Arg2 Arg1 Locals: add1 Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 iseq self

Slide 72

Slide 72 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 ZJIT Memory writes: 7 3 Arg2 Arg1 Locals: add1 Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 iseq self ep

Slide 73

Slide 73 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 ZJIT Memory writes: 8 3 Arg2 Arg1 Locals: add1 Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 iseq self ep block code

Slide 74

Slide 74 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 ZJIT 3 Arg2 Arg1 Locals: add1 Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 iseq self ep block code Register writes: 1 Memory writes: 8

Slide 75

Slide 75 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 ZJIT 3 Arg2 Arg1 Locals: add1 Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 iseq self ep block code Register writes: 2 Memory writes: 8 2

Slide 76

Slide 76 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 ZJIT 3 Arg2 Arg1 Locals: add1 Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 iseq self ep block code Register writes: 2 Memory writes: 9 2 C stack Stack: six 3

Slide 77

Slide 77 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 ZJIT 3 Arg2 Arg1 Locals: add1 Env: add1 method
 entry block
 handler flags rb_control_frame_t: add1 iseq self ep block code Register writes: 2 Memory writes: 10 2 C stack Stack: six 3 Ruby thread rb_execution_context_t vm_stack ... cfp

Slide 78

Slide 78 text

Interpreter vs YJIT vs ZJIT (caller stack size: 1)

                  Interpreter  YJIT  ZJIT
Memory writes     14           12    10
Register writes   0            2     2

Slide 79

Slide 79 text

Interpreter vs YJIT vs ZJIT (caller stack size: 4)

                  Interpreter  YJIT  ZJIT
Memory writes     14           15    16
Register writes   0            2     2
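One way to read the two comparisons above (caller stack size 1 and 4) is as a fixed cost per call plus a spill cost per live value on the caller's stack. The constants below are fitted to those two data points only and are an assumption, not something the slides state: YJIT spills each value once, ZJIT (before Lightweight Frames) spills each value to both the VM stack and the C stack, and the interpreter is held at 14 because both comparisons report 14.

```ruby
# Fixed writes per call, independent of the caller's stack size (fitted, hypothetical).
FIXED_WRITES = { interpreter: 14, yjit: 11, zjit: 8 }.freeze

# Extra memory writes per live value on the caller's stack (fitted, hypothetical).
SPILLS_PER_VALUE = { interpreter: 0, yjit: 1, zjit: 2 }.freeze

# Estimated memory writes for one method call under this linear model.
def memory_writes(engine, caller_stack_size)
  FIXED_WRITES.fetch(engine) + SPILLS_PER_VALUE.fetch(engine) * caller_stack_size
end

[1, 4].each do |size|
  counts = FIXED_WRITES.keys.map { |engine| "#{engine}=#{memory_writes(engine, size)}" }
  puts "caller stack size #{size}: #{counts.join(', ')}"
end
```

Under this reading, ZJIT's double spill is what lets it fall behind YJIT once the caller keeps more than three values live.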

Slide 80

Slide 80 text

Method calls in ZJIT
- Method parameters are passed in C ABI registers
- Most writes into frame fields (except PC and jit_return) are not optimized
- ZJIT spills registers into both the VM stack and the C stack

Slide 81

Slide 81 text

Why are we writing method frames?

Slide 82

Slide 82 text

Backtraces
- Backtraces need to reconstruct the full call chain at any point
- Even if most calls never raise, the metadata must always be there

Slide 83

Slide 83 text

Backtraces

Slide 84

Slide 84 text

Backtraces This is based on the frame’s PC and iseq
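A minimal illustration of what a backtrace needs from each frame, using the standard `caller_locations` API. The method names `six` and `add1` follow the diagrams, but their bodies and the extra `trace` parameter are assumptions made for this sketch.

```ruby
# Record the two innermost frames (label and line number), then do the real work.
def add1(x, trace)
  # caller_locations(0, 2) starts at the current frame and returns up to two entries.
  trace.concat(caller_locations(0, 2).map { |loc| [loc.label, loc.lineno] })
  x + 1
end

def six(trace)
  3 + add1(2, trace)
end

trace = []
result = six(trace)

trace.each { |label, lineno| puts "#{label} at line #{lineno}" }
puts "result: #{result}"
```

Each label identifies the method being executed and each line number identifies the position inside it, which is the information the frame's iseq and PC provide. The exact label format differs between Ruby versions.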

Slide 85

Slide 85 text

Exception handling
- The interpreter may suddenly take over JIT-compiled stack slots
- Frame layout must stay compatible with interpreter expectations

Slide 86

Slide 86 text

Exception handling

Slide 87

Slide 87 text

Exception handling On raise, the interpreter takes over the stack: 3
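A sketch of the situation described above: a value is already on the caller's stack when the callee raises, and execution resumes in the caller's rescue clause with that value still needed. The method bodies are assumptions for illustration; only the names `six` and `add1` and the value 3 come from the slides.

```ruby
def add1(x)
  raise ArgumentError, "x must be an Integer" unless x.is_a?(Integer)

  x + 1
end

def six
  # The left operand 3 is evaluated before add1 is called. When add1 raises,
  # the rescue clause runs and the addition still needs that 3.
  3 + begin
    add1(nil)
  rescue ArgumentError
    3
  end
end

puts six
```

If JIT-compiled code kept that 3 only in a register, the value would have to be written to a location the interpreter can find before it takes over.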

Slide 88

Slide 88 text

Local variables
- Debuggers can access local variables outside the current frame
- Ruby must keep locals in a predictable location on the stack

Slide 89

Slide 89 text

Local variables

Slide 90

Slide 90 text

Local variables It overwrites the caller’s local variable
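A plain-Ruby approximation of out-of-frame local access, using the standard `Binding#local_variable_set`. Real debuggers reach the caller's frame without being handed a `Binding`; passing one explicitly is a simplification made here, and `overwrite_x` and `caller_method` are hypothetical names.

```ruby
# Writes to a local variable that belongs to another method's frame.
def overwrite_x(caller_binding)
  caller_binding.local_variable_set(:x, 42)
end

def caller_method
  x = 1
  overwrite_x(binding) # hand our frame's binding to the callee
  x                    # reads the value written from outside this frame
end

puts caller_method
```

Because the write happens through the frame's environment, a JIT cannot assume that a local held in a register is still current after a call like this.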

Slide 91

Slide 91 text

Lightweight Frames

Slide 92

Slide 92 text

https://github.com/Shopify/ruby/issues/909

Slide 93

Slide 93 text

Writing only one field on a method call
- Lightweight Frames reduce frame push to writing a single field
- Everything still works: backtraces, exceptions, debugger access
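The idea of writing one field eagerly and the rest lazily can be modeled in a few lines. This is a conceptual sketch, not ZJIT's data layout: `CallSiteInfo` and `LightweightFrame` are hypothetical names, and the write counter exists only to make the cost visible.

```ruby
# Static per-call-site metadata, prepared once ahead of time (hypothetical model).
CallSiteInfo = Struct.new(:iseq, :pc, :sp_offset, :flags, keyword_init: true)

class LightweightFrame
  attr_reader :writes

  def initialize
    @writes = 0
    @site = nil
    @full = nil
  end

  # Fast path: a call stores a single reference to the precomputed metadata.
  def push(site)
    @site = site
    @writes += 1
    self
  end

  # Slow path: fill in the full frame only when something asks for it,
  # such as a backtrace, an exception, or a debugger.
  def materialize
    @full ||= begin
      fields = @site.to_h
      @writes += fields.size
      fields
    end
  end
end

site = CallSiteInfo.new(iseq: "add1", pc: 7, sp_offset: 1, flags: 0)
frame = LightweightFrame.new.push(site)
puts "writes after push: #{frame.writes}"
puts "pc on demand: #{frame.materialize[:pc]}"
puts "writes after materialize: #{frame.writes}"
```

The common case pays for one write; the full cost is paid only on the rare paths that actually read the frame.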

Slide 94

Slide 94 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 ZJIT: Lightweight Frames 3 Arg2 Arg1

Slide 95

Slide 95 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 Arg2 Arg1 Locals: add1 ZJIT: Lightweight Frames

Slide 96

Slide 96 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 Arg2 Arg1 Locals: add1 Env: add1 ZJIT: Lightweight Frames

Slide 97

Slide 97 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 Memory writes: 1 Arg2 Arg1 Locals: add1 Env: add1 rb_control_frame_t: add1 ZJIT: Lightweight Frames jit return struct zjit_jit_frame method
 entry block
 handler iseq self ep block code JIT Frame pc sp flags

Slide 98

Slide 98 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 Arg2 Arg1 Locals: add1 Env: add1 rb_control_frame_t: add1 ZJIT: Lightweight Frames jit return struct zjit_jit_frame method
 entry block
 handler iseq self ep block code JIT Frame pc sp flags Memory writes: 2 C stack Stack: six 3

Slide 99

Slide 99 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 Arg2 Arg1 Locals: add1 Env: add1 rb_control_frame_t: add1 ZJIT: Lightweight Frames jit return struct zjit_jit_frame method
 entry block
 handler iseq self ep block code JIT Frame pc sp flags Register writes: 1 Memory writes: 2 C stack Stack: six 3

Slide 100

Slide 100 text


 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 Arg2 Arg1 Locals: add1 Env: add1 rb_control_frame_t: add1 ZJIT: Lightweight Frames jit return struct zjit_jit_frame method
 entry block
 handler iseq self ep block code JIT Frame pc sp flags Register writes: 2 Memory writes: 2 C stack Stack: six 3 2

Slide 101

Slide 101 text

struct zjit_jit_frame 
 Stack: six Ruby VM stack CPU Reg1 Reg2 Reg3 3 Arg2 Arg1 Locals: add1 Env: add1 method
 entry block
 handler rb_control_frame_t: add1 iseq self ep block code 2 C stack Stack: six 3 Ruby thread rb_execution_context_t vm_stack ... cfp Register writes: 2 Memory writes: 3 JIT Frame pc sp flags jit return ZJIT: Lightweight Frames

Slide 102

Slide 102 text

Interpreter vs YJIT vs ZJIT Interpreter YJIT ZJIT before ZJIT after Memory writes 14 12 10 3 Register writes 0 2 2 2

Slide 103

Slide 103 text

Challenges in Lightweight Frames
- Materializing JIT frames is slow; we want to avoid writing them
- The register allocator needs to spill registers into locations that the interpreter can retrieve as needed
  - For local variables, and longjmp for exceptions

Slide 104

Slide 104 text

https://github.com/ruby/ruby/pull/16262

Slide 105

Slide 105 text

Lightweight Frames: Current State
- PC, ISEQ, and block_code are already optimized away
  - They are mostly queried, not materialized in many cases
- Future work:
  - SP, EP, self
  - Env: method entry, block handler, flags
  - Spills into the VM stack

Slide 106

Slide 106 text

No content

Slide 107

Slide 107 text

No content

Slide 108

Slide 108 text

No content

Slide 109

Slide 109 text

10% of JIT code is spent on method calls (before Lightweight Frames)
https://github.com/k0kubun/ruby/tree/zjit-perf-frames

Slide 110

Slide 110 text

Method inlining with Lightweight Frames
- Method inlining can use Lightweight Frames to save contexts for profilers and other features
- Lightweight Frames do not remove register spills into the C stack, but method inlining can

Slide 111

Slide 111 text

Conclusion
- Ruby’s method calls are slow because each call has to write many fields that are read out-of-frame
- ZJIT’s Lightweight Frames will make method calls faster by lazily writing metadata

Slide 112

Slide 112 text

At RubyKaigi 2024…

Slide 113

Slide 113 text

No content

Slide 114

Slide 114 text

No content

Slide 115

Slide 115 text

No content

Slide 116

Slide 116 text

Let’s make Ruby faster together
- Anybody can understand and develop ZJIT by asking AI today
- Chat with us on Zulip: https://zjit.zulipchat.com/
- See also: https://github.com/Shopify/ruby/issues