Slide 1

Slide 1 text

BYOJ: Build your own JIT @Faraaz98

Slide 2

Slide 2 text

What is a compiler?

Ahead-of-Time compiler: C, C++, Rust, etc. Compile, then execute.
Just-in-Time compiler: Ruby, Python, Java, etc. Compile while executing.

Slide 3

Slide 3 text

The many JITs of Ruby

MJIT: First ever JIT for Ruby. Introduced in Ruby 2.6. To be discontinued.
YJIT: New first-class JIT released with Ruby 3.1. Default.
RJIT: JIT written in Ruby, shipping with Ruby 3.3. Experimental only.

Slide 4

Slide 4 text

https://browse.arxiv.org/abs/1411.0352

Slide 5

Slide 5 text

The many JITs of Ruby

MJIT: First ever JIT for Ruby. Introduced in Ruby 2.6. To be discontinued.
YJIT: New first-class JIT released with Ruby 3.1. Default.
RJIT: JIT written in Ruby, shipping with Ruby 3.3. Experimental only.

Slide 6

Slide 6 text

What is a machine?

Slide 7

Slide 7 text

Physical Machine: CPU (Core i9, Apple M2)
Virtual Machine: Virtual CPU (Ruby interpreter, YARV)

Slide 8

Slide 8 text

YARV (Yet Another Ruby VM)
A stack machine: stores and retrieves all your objects on the stack.

Source: 1 + 2
Instructions:
  push 1
  push 2
  add
Stack: (empty)

PC: Program counter
SP: Stack pointer

Slide 9

Slide 9 text

YARV (Yet Another Ruby VM)
A stack machine: stores and retrieves all your objects on the stack.

Source: 1 + 2
Instructions:
  push 1
  push 2
  add
Stack: 1

Slide 10

Slide 10 text

YARV (Yet Another Ruby VM)
A stack machine: stores and retrieves all your objects on the stack.

Source: 1 + 2
Instructions:
  push 1
  push 2
  add
Stack: 1, 2

Slide 11

Slide 11 text

YARV (Yet Another Ruby VM)
A stack machine: stores and retrieves all your objects on the stack.

Source: 1 + 2
Instructions:
  push 1
  push 2
  add
Stack: (empty)
Operands popped by add: 2, 1

Slide 12

Slide 12 text

YARV (Yet Another Ruby VM)
A stack machine: stores and retrieves all your objects on the stack.

Source: 1 + 2
Instructions:
  push 1
  push 2
  add
Stack: 3
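The walk-through above can be replayed with a toy interpreter. This is a minimal sketch, assuming a made-up instruction format of `[op, arg]` pairs; `run_stack_machine` and `PROGRAM` are names invented here, and `push`/`add` follow the slide rather than real YARV opcodes.

```ruby
# A toy stack machine; instruction names mirror the slide, not real YARV opcodes.
def run_stack_machine(program)
  stack = []
  program.each do |op, arg|      # each iteration plays the role of advancing the PC
    case op
    when :push
      stack.push(arg)            # the top of the stack (SP) moves up by one
    when :add
      right = stack.pop          # operands come off in reverse order: 2, then 1
      left = stack.pop
      stack.push(left + right)
    else
      raise ArgumentError, "unknown instruction: #{op}"
    end
  end
  stack
end

PROGRAM = [[:push, 1], [:push, 2], [:add]]
p run_stack_machine(PROGRAM)     # => [3]
```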

Slide 13

Slide 13 text

CPU
A register machine: stores and retrieves all your values in registers.

Source: 1 + 2
Instructions:
  mov r1, 1
  mov r2, 2
  add r1, r2

Register  Value
r1        1
r2
r3

Slide 14

Slide 14 text

CPU
A register machine: stores and retrieves all your values in registers.

Source: 1 + 2
Instructions:
  mov r1, 1
  mov r2, 2
  add r1, r2

Register  Value
r1        1
r2        2
r3

Slide 15

Slide 15 text

CPU
A register machine: stores and retrieves all your values in registers.

Source: 1 + 2
Instructions:
  mov r1, 1
  mov r2, 2
  add r1, r2

Register  Value
r1        3
r2        2
r3
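For comparison with the stack machine, the same program can be run on a toy register machine. This is only a sketch: `run_register_machine` and `REG_PROGRAM` are hypothetical names, and the three registers are a plain Hash rather than anything resembling a real CPU.

```ruby
# A toy register machine; register names follow the slide, not a real CPU.
def run_register_machine(program)
  regs = { r1: nil, r2: nil, r3: nil }
  program.each do |op, dst, src|
    case op
    when :mov
      regs[dst] = src                    # load an immediate value into a register
    when :add
      regs[dst] = regs[dst] + regs[src]  # the result overwrites the destination
    else
      raise ArgumentError, "unknown instruction: #{op}"
    end
  end
  regs
end

REG_PROGRAM = [[:mov, :r1, 1], [:mov, :r2, 2], [:add, :r1, :r2]]
p run_register_machine(REG_PROGRAM)      # r1 ends up holding 3, r2 still holds 2
```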

Slide 16

Slide 16 text

Compiling nil

def foo
  nil
end

1. Put nil onto the stack
2. Return nil

Stack: (empty)

Slide 17

Slide 17 text

Compiling nil

def foo
  nil
end

1. Put nil onto the stack
2. Return nil

Stack: nil

Slide 18

Slide 18 text

Compiling nil

def foo
  nil
end

1. Put nil onto the stack
2. Return nil

Stack: nil

WHERE DOES THE nil GO??

Slide 19

Slide 19 text

Instruction sequence
Looking under the hood

$ ruby --dump=insns foo.rb
== disasm: #<ISeq:<main>@foo.rb:1 (1,0)-(3,3)>
0000 definemethod :foo, foo
0003 putobject :foo
0005 leave

== disasm: #<ISeq:foo@foo.rb:1 (1,0)-(3,3)>
0000 putnil
0001 leave
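The same listing can be produced from inside a Ruby program. A minimal sketch, assuming CRuby (the `RubyVM` constant is implementation-specific); the exact text of the dump varies between Ruby versions, so no output is shown. `SOURCE` and `DISASM` are names chosen here.

```ruby
# RubyVM::InstructionSequence is CRuby-specific; other implementations may not provide it.
SOURCE = <<~RUBY
  def foo
    nil
  end
RUBY

# compile turns source text into YARV bytecode; disasm renders it as text,
# including nested sequences such as the body of foo.
DISASM = RubyVM::InstructionSequence.compile(SOURCE).disasm
puts DISASM
```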

Slide 20

Slide 20 text

Stack: nil

0000 putnil
0001 leave

Slide 21

Slide 21 text

Stack: nil

0000 putnil
0001 leave

WHERE DO WE leave FROM??

Slide 22

Slide 22 text

Stack: nil

0000 putnil
0001 leave

WHERE DO WE leave FROM??
WHERE DO WE leave TO??

Slide 23

Slide 23 text

Stack traces
● Shows the exact path from which you errored out
● Traces it all the way to the top level:

def foo
  raise StandardError
end

def bar
  foo
end

bar

frames.rb:2:in `foo': StandardError
    from frames.rb:6:in `bar'
    from frames.rb:9:in `<main>'
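The frames behind such a trace are available programmatically. The sketch below uses `Exception#backtrace_locations` and `Thread::Backtrace::Location#base_label`; `innermost_frames` is a helper name invented here, and only the two innermost frames are inspected because the outer ones depend on how the script is launched.

```ruby
def foo
  raise StandardError
end

def bar
  foo
end

# Collect the two innermost frames of the backtrace as bare method names.
def innermost_frames
  bar
rescue StandardError => e
  e.backtrace_locations.first(2).map(&:base_label)
end

p innermost_frames  # => ["foo", "bar"]
```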

Slide 24

Slide 24 text

A data structure that knows the following:

frames.rb:2:in `foo': StandardError
    from frames.rb:6:in `bar'
    from frames.rb:9:in `<main>'

1. Which method you are in right now

Slide 25

Slide 25 text

A data structure that knows the following:

frames.rb:2:in `foo': StandardError
    from frames.rb:6:in `bar'
    from frames.rb:9:in `<main>'

1. Which method you are in right now
2. Where in the program you are right now

Slide 26

Slide 26 text

A data structure that knows the following:

frames.rb:2:in `foo': StandardError
    from frames.rb:6:in `bar'
    from frames.rb:9:in `<main>'

1. Which method you are in right now
2. Where in the program you are right now
3. What value you need to return

Slide 27

Slide 27 text

A data structure that knows the following:

frames.rb:2:in `foo': StandardError
    from frames.rb:6:in `bar'
    from frames.rb:9:in `<main>'

1. Which method you are in right now
2. Where in the program you are right now
3. What value you need to return

Slide 28

Slide 28 text

A data structure that knows the following:

frames.rb:2:in `foo': StandardError
    from frames.rb:6:in `bar'
    from frames.rb:9:in `<main>'

1. Which method you are in right now
2. Where in the program you are right now
3. What value you need to return

Slide 29

Slide 29 text

A data structure that knows the following:

frames.rb:2:in `foo': StandardError
    from frames.rb:6:in `bar'
    from frames.rb:9:in `<main>'

1. Which method you are in right now
2. Where in the program you are right now
3. What value you need to return

Slide 30

Slide 30 text

A data structure that knows the following:

frames.rb:2:in `foo': StandardError
    from frames.rb:6:in `bar'
    from frames.rb:9:in `<main>'

One copy per line of the backtrace (shown three times on the slide):
1. Which method you are in right now
2. Where in the program you are right now
3. What value you need to return

Needs to be stacked:
1. So every method has its own copy of the context
2. So it can be unrolled when a complete backtrace needs to be shown

Slide 31

Slide 31 text

Control frames
Represent the path taken by your program

rb_control_frame_t [TOP]
rb_control_frame_t [EVAL]
rb_control_frame_t [METHOD:bar]
rb_control_frame_t [METHOD:foo]
CFP: control frame pointer

Fields of rb_control_frame_t:
  pc (program counter)
  sp (stack pointer)
  self
  ...

frames.rb:2:in `foo': StandardError
    from frames.rb:6:in `bar'
    from frames.rb:9:in `<main>'

Slide 32

Slide 32 text

Increment CFP to leave

== disasm: #<ISeq:<main>@main.rb:1 (1,0)-(1,3)> (catch: FALSE)
0000 putnil
0001 leave

leave: Increment Control Frame Pointer
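Why incrementing the pointer pops a frame is easier to see with a small model. This is a sketch under stated assumptions, not the VM's actual layout: `ToyFrame`, `ToyFrameStack` and `demo_leave` are invented names, frames live in a fixed array, and the stack is assumed to grow toward lower indices, so entering a method decrements CFP and leaving increments it.

```ruby
# Toy model of a control frame stack; class and field names are illustrative only.
ToyFrame = Struct.new(:label, :pc, :sp)

class ToyFrameStack
  attr_reader :cfp

  def initialize(capacity)
    @slots = Array.new(capacity)
    @cfp = capacity              # one past the last slot: no frames yet
  end

  # Entering a method claims the next lower slot.
  def push(label)
    raise "frame stack overflow" if @cfp.zero?
    @cfp -= 1
    @slots[@cfp] = ToyFrame.new(label, 0, 0)
  end

  # Leaving a method increments CFP, making the caller's frame current again.
  def leave
    raise "no frame to leave" if @cfp == @slots.size
    @cfp += 1
  end

  def current
    @slots[@cfp]
  end
end

def demo_leave
  frames = ToyFrameStack.new(4)
  %w[TOP METHOD:bar METHOD:foo].each { |label| frames.push(label) }
  before = frames.current.label
  frames.leave
  [before, frames.current.label]
end

p demo_leave  # => ["METHOD:foo", "METHOD:bar"]
```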

Slide 33

Slide 33 text

Building your own JIT (This time for real) Using RJIT

Slide 34

Slide 34 text

Using RJIT

# Replace RJIT with JIT::Compiler
RubyVM::RJIT::Compiler.prepend(Module.new {
  def compile(iseq, _)
    @compiler ||= JIT::Compiler.new
    @compiler.compile(iseq)
  end
})

# Enable JIT compilation (paused by --rjit=pause)
RubyVM::RJIT.resume

Slide 35

Slide 35 text

Building our own JIT

# Replace RJIT with JIT::Compiler
RubyVM::RJIT::Compiler.prepend(Module.new {
  def compile(iseq, _)
    @compiler ||= JIT::Compiler.new
    @compiler.compile(iseq)
  end
})

# Enable JIT compilation (paused by --rjit=pause)
RubyVM::RJIT.resume

We write the JIT::Compiler class
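The hook relies on `Module#prepend`, which places a module in front of a class in the method lookup chain. The sketch below shows that mechanism in isolation; `BuiltinCompiler` and `MyCompiler` are hypothetical stand-ins for `RubyVM::RJIT::Compiler` and `JIT::Compiler`, so it runs without RJIT being enabled.

```ruby
# Hypothetical stand-in for the class being patched.
class BuiltinCompiler
  def compile(iseq, _extra)
    "builtin compiled #{iseq}"
  end
end

# Hypothetical stand-in for the replacement compiler.
class MyCompiler
  def compile(iseq)
    "custom compiled #{iseq}"
  end
end

# A prepended module is searched before the class itself,
# so this compile is found first and the original is never reached.
BuiltinCompiler.prepend(Module.new {
  def compile(iseq, _extra)
    @compiler ||= MyCompiler.new
    @compiler.compile(iseq)
  end
})

puts BuiltinCompiler.new.compile("foo", nil)  # prints "custom compiled foo"
```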

Slide 36

Slide 36 text

The JIT::Compiler class

module JIT
  class Compiler
    # Utilities to call C functions and interact with the Ruby VM.
    C = RubyVM::RJIT::C

    # Metadata for each YARV instruction.
    INSNS = RubyVM::RJIT::INSNS

    # Compile a method
    def compile(iseq)
      # Write machine code to this assembler.
      asm = Assembler.new

      # Iterate over each YARV instruction.
      insn_index = 0
      while insn_index < iseq.body.iseq_size
        # Compile our instructions here
      end
    end
  end
end

Slide 37

Slide 37 text

The JIT::Compiler class

module JIT
  class Compiler
    # Utilities to call C functions and interact with the Ruby VM.
    C = RubyVM::RJIT::C

    # Metadata for each YARV instruction.
    INSNS = RubyVM::RJIT::INSNS

    # Compile a method
    def compile(iseq)
      # Write machine code to this assembler.
      asm = Assembler.new

      # Iterate over each YARV instruction.
      insn_index = 0
      while insn_index < iseq.body.iseq_size
        # Compile our instructions here
      end
    end
  end
end

Highlighted: the C and INSNS constants.

Slide 38

Slide 38 text

The JIT::Compiler class

module JIT
  class Compiler
    # Utilities to call C functions and interact with the Ruby VM.
    C = RubyVM::RJIT::C

    # Metadata for each YARV instruction.
    INSNS = RubyVM::RJIT::INSNS

    # Compile a method
    def compile(iseq)
      # Write machine code to this assembler.
      asm = Assembler.new

      # Iterate over each YARV instruction.
      insn_index = 0
      while insn_index < iseq.body.iseq_size
        # Compile our instructions here
      end
    end
  end
end

Highlighted: the compile method.

Slide 39

Slide 39 text

Compiling putnil

def foo
  nil
end

0000 putnil
0001 leave

Slide 40

Slide 40 text

Compiling putnil

# Iterate over each YARV instruction.
insn_index = 0
while insn_index < iseq.body.iseq_size
  insn = INSNS.fetch(
    C.rb_vm_insn_decode(iseq.body.iseq_encoded[insn_index])
  )
  case insn.name
  in :putnil
    # ...
  end
  insn_index += insn.len
end
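The loop advances by `insn.len` because an encoded sequence interleaves opcodes with their operands. The sketch below walks a hand-written flat array the same way. `INSN_LEN`, `ENCODED` and `each_instruction` are hypothetical names; the lengths are an assumption read off the offsets in the `2 + 3` listing of this deck, and `:cd` is a placeholder for the operand that `opt_plus` carries.

```ruby
# Instruction lengths (opcode plus operands); a hand-written table, not VM metadata.
INSN_LEN = { putobject: 2, opt_plus: 2, leave: 1 }.freeze

# Flat encoding: each opcode is followed directly by its operands.
ENCODED = [:putobject, 2, :putobject, 3, :opt_plus, :cd, :leave].freeze

def each_instruction(encoded)
  decoded = []
  index = 0
  while index < encoded.size
    name = encoded[index]
    len = INSN_LEN.fetch(name)
    decoded << [index, name, encoded[index + 1, len - 1]]
    index += len                 # skip the operands to land on the next opcode
  end
  decoded
end

each_instruction(ENCODED).each do |offset, name, operands|
  puts format("%04d %s %s", offset, name, operands.join(", "))
end
```

The printed offsets are 0000, 0002, 0004 and 0006, matching the disassembly shown for `2 + 3`.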

Slide 41

Slide 41 text

Compiling putnil

STACK = [:r8, :r9]
stack_size = 0

while insn_index < iseq.body.iseq_size
  # ...
  case insn.name
  # ...
  in :putnil
    asm.mov(STACK[stack_size], C.to_value(nil))
    stack_size += 1
  end
end

Stack slots: :r8, :r9
Stack: nil (held in the r8 register)

Slide 42

Slide 42 text

Compiling leave

def foo
  nil
end

0000 putnil
0001 leave

Slide 43

Slide 43 text

Compiling leave

EC = :rdi
CFP = :rsi

while insn_index < iseq.body.iseq_size
  case insn.name
  # ...
  in :leave
    asm.add(CFP, C.rb_control_frame_t.size)
    asm.mov([EC, C.rb_execution_context_t.offsetof(:cfp)], CFP)
  end
end

Slide 44

Slide 44 text

Compiling leave

asm.add(CFP, C.rb_control_frame_t.size)

rb_control_frame_t [TOP]
rb_control_frame_t [EVAL]
rb_control_frame_t [METHOD:bar] CFP

Addresses: 0000 0007 0008 0015 0016 0024

Slide 45

Slide 45 text

Compiling leave

asm.add(CFP, C.rb_control_frame_t.size)
“Pop a control frame from the stack”

rb_control_frame_t [TOP]
rb_control_frame_t [EVAL]
rb_control_frame_t [METHOD:bar] CFP

Addresses: 0000 0007 0008 0015 0016 0024

Slide 46

Slide 46 text

Compiling leave

asm.mov([EC, C.rb_execution_context_t.offsetof(:cfp)], CFP)

rb_control_frame_t [TOP]
rb_control_frame_t [EVAL]
rb_control_frame_t [METHOD:bar] CFP

Fields of rb_control_frame_t:
  pc
  sp
  self
  ...

Slide 47

Slide 47 text

Compiling leave

asm.mov([EC, C.rb_execution_context_t.offsetof(:cfp)], CFP)
“Set execution context to current control frame”

rb_control_frame_t [TOP]
rb_control_frame_t [EVAL]
rb_control_frame_t [METHOD:bar] CFP

Fields of rb_control_frame_t:
  pc
  sp
  self
  ...

Slide 48

Slide 48 text

Compiling leave

EC = :rdi
CFP = :rsi

while insn_index < iseq.body.iseq_size
  case insn.name
  # ...
  in :leave
    asm.add(CFP, C.rb_control_frame_t.size)
    asm.mov([EC, C.rb_execution_context_t.offsetof(:cfp)], CFP)
    asm.mov(:rax, STACK[stack_size - 1])
    asm.ret
  end
end

Slide 49

Slide 49 text

Compiling leave

EC = :rdi
CFP = :rsi

while insn_index < iseq.body.iseq_size
  case insn.name
  # ...
  in :leave
    asm.add(CFP, C.rb_control_frame_t.size)
    asm.mov([EC, C.rb_execution_context_t.offsetof(:cfp)], CFP)
    asm.mov(:rax, STACK[stack_size - 1])
    asm.ret
  end
end

All return values go into :rax
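To see what the two handlers emit for `putnil; leave` without running RJIT, the assembler can be swapped for a recorder that only logs text. Everything here is a sketch: `RecordingAssembler` and `compile_sketch` are invented for illustration, and `QNIL`, `FRAME_SIZE` and `CFP_OFFSET` are symbolic placeholders for the values the real code obtains from `C.to_value(nil)`, `C.rb_control_frame_t.size` and `C.rb_execution_context_t.offsetof(:cfp)`.

```ruby
# Hypothetical recorder standing in for the real assembler: it only logs instructions.
class RecordingAssembler
  attr_reader :lines

  def initialize
    @lines = []
  end

  def mov(dst, src)
    @lines << "mov #{fmt(dst)}, #{fmt(src)}"
  end

  def add(dst, src)
    @lines << "add #{fmt(dst)}, #{fmt(src)}"
  end

  def ret
    @lines << "ret"
  end

  private

  # An Array operand denotes a memory reference: [base, offset].
  def fmt(operand)
    operand.is_a?(Array) ? "[#{operand.join(' + ')}]" : operand.to_s
  end
end

STACK = [:r8, :r9]
EC = :rdi
CFP = :rsi
QNIL = :Qnil              # placeholder for C.to_value(nil)
FRAME_SIZE = :frame_size  # placeholder for C.rb_control_frame_t.size
CFP_OFFSET = :cfp_offset  # placeholder for C.rb_execution_context_t.offsetof(:cfp)

def compile_sketch(insns)
  asm = RecordingAssembler.new
  stack_size = 0
  insns.each do |name|
    case name
    when :putnil
      asm.mov(STACK[stack_size], QNIL)      # push nil into the next stack register
      stack_size += 1
    when :leave
      asm.add(CFP, FRAME_SIZE)              # pop the control frame
      asm.mov([EC, CFP_OFFSET], CFP)        # store the new CFP in the execution context
      asm.mov(:rax, STACK[stack_size - 1])  # return value goes into rax
      asm.ret
    else
      raise NotImplementedError, name.to_s
    end
  end
  asm.lines
end

puts compile_sketch([:putnil, :leave])
```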

Slide 50

Slide 50 text

Compiling 2 + 3

def five
  2 + 3
end

0000 putobject 2
0002 putobject 3
0004 opt_plus
0006 leave

Slide 51

Slide 51 text

Compiling 2 + 3

0000 putobject 2
0002 putobject 3
0004 opt_plus
0006 leave

Stack: (empty)

Slide 52

Slide 52 text

Compiling 2 + 3

0000 putobject 2
0002 putobject 3
0004 opt_plus
0006 leave

Stack: 2

Slide 53

Slide 53 text

Compiling 2 + 3

0000 putobject 2
0002 putobject 3
0004 opt_plus
0006 leave

Stack: 2, 3

Slide 54

Slide 54 text

Compiling 2 + 3

0000 putobject 2
0002 putobject 3
0004 opt_plus
0006 leave

Stack: 5

Slide 55

Slide 55 text

Compiling putobject

while insn_index < iseq.body.iseq_size
  case insn.name
  # ...
  in :putobject
    operand = iseq.body.iseq_encoded[insn_index + 1]
    asm.mov(STACK[stack_size], operand)
    stack_size += 1
  end
end

Slide 56

Slide 56 text

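Compiling opt_plus

while insn_index < iseq.body.iseq_size
  case insn.name
  # ...
  in :opt_plus
    recv = STACK[stack_size - 2]
    obj = STACK[stack_size - 1]
    asm.add(recv, obj)
    # Operands are tagged Fixnums (2n + 1), so remove the extra tag bit.
    asm.sub(recv, 1)
    stack_size -= 1
  end
end

Stack before the add: 2 (in :r8), 3 (in :r9)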

Slide 57

Slide 57 text

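Compiling opt_plus

while insn_index < iseq.body.iseq_size
  case insn.name
  # ...
  in :opt_plus
    recv = STACK[stack_size - 2]
    obj = STACK[stack_size - 1]
    asm.add(recv, obj)
    # Operands are tagged Fixnums (2n + 1), so remove the extra tag bit.
    asm.sub(recv, 1)
    stack_size -= 1
  end
end

Stack after the add: 5 (in :r8); :r9 still holds 3 but is no longer part of the stack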
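One detail behind adding two stack registers: the operands that `putobject` loads are tagged values, not raw integers. On CRuby a small Integer n is stored as the machine word 2n + 1, so a plain machine add counts the tag bit twice. The arithmetic can be checked in ordinary Ruby; `to_tagged`, `from_tagged` and `tagged_add` are helper names made up for this sketch.

```ruby
# CRuby stores a small Integer n as the tagged machine word 2n + 1.
def to_tagged(n)
  (n << 1) | 1
end

def from_tagged(value)
  value >> 1
end

# (2a + 1) + (2b + 1) = 2(a + b) + 2, so subtracting 1 restores a valid tag.
def tagged_add(a, b)
  a + b - 1
end

two = to_tagged(2)      # 5
three = to_tagged(3)    # 7
sum = tagged_add(two, three)
puts from_tagged(sum)   # prints 5
```

Without the subtraction the raw sum of the two words is 12, which has its low bit clear and so is no longer a tagged Integer at all.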

Slide 58

Slide 58 text

and that’s it!

Slide 59

Slide 59 text

Benchmarks (Higher is better) rubybench.github.io

Slide 60

Slide 60 text

One More Thing

Slide 61

Slide 61 text

def sum
  2 + 3
end

puts sum

➜ ruby plus.rb
420

Slide 62

Slide 62 text

Bonus - Monkey patching at the machine level

while insn_index < iseq.body.iseq_size
  case insn.name
  # ...
  in :opt_plus
    recv = STACK[stack_size - 2]
    obj = STACK[stack_size - 1]
    # asm.add(recv, obj)
    asm.mov(recv, C.to_value(420))
    stack_size -= 1
  end
end

Slide 63

Slide 63 text

Credits
Takashi Kokubun @k0kubun
GitHub: k0kubun/ruby-jit-challenge (github.com/k0kubun/ruby-jit-challenge)

Slide 64

Slide 64 text

Thank you!