BYOJ: Build your Own JIT (RubyConfTH 2023)

Syed Faraaz Ahmad
October 06, 2023

Transcript

  1. What is a compiler?

    Ahead-of-Time compiler: C, C++, Rust, etc. Compile, then execute.
    Just-in-Time compiler: Ruby, Python, Java, etc. Compile while executing.
  2. The many JITs of Ruby

    MJIT: First ever JIT for Ruby. Introduced in Ruby 2.6. To be discontinued.
    YJIT: New first-class JIT, released with Ruby 3.1. Default.
    RJIT: JIT written in Ruby, shipping with Ruby 3.3. Experimental only.
  4. YARV (Yet Another Ruby VM)

    A stack machine: stores/retrieves all your objects onto/from the stack.
    PC: Program counter. SP: Stack pointer.

        1 + 2

        push 1
        push 2
        add

    Stack: empty.
  5. YARV (Yet Another Ruby VM)

    After push 1, the stack holds: 1
  6. YARV (Yet Another Ruby VM)

    After push 2, the stack holds: 1, 2
  7. YARV (Yet Another Ruby VM)

    add pops 2 and 1 off the stack. Stack: empty.
  8. YARV (Yet Another Ruby VM)

    add pushes the result. The stack holds: 3
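The push/push/add walkthrough on slides 4 to 8 can be sketched as a toy interpreter. This is a minimal illustration only: `run_stack_machine` and the `:push`/`:add` instruction names are hypothetical and follow the slides' simplified notation, not real YARV opcodes.

```ruby
# Toy stack machine (hypothetical helper, not part of Ruby or YARV).
# Each instruction is an array: [:push, value] or [:add].
def run_stack_machine(program)
  stack = []
  program.each do |op, arg|
    case op
    when :push
      stack.push(arg)          # SP moves up by one slot
    when :add
      right = stack.pop        # pops 2 first ...
      left = stack.pop         # ... then 1
      stack.push(left + right) # pushes the sum back
    else
      raise ArgumentError, "unknown instruction: #{op}"
    end
  end
  stack
end

result = run_stack_machine([[:push, 1], [:push, 2], [:add]])
p result
```

The final stack holds a single value, the sum, which matches the last step of the slide animation.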
  9. CPU

    A register machine: stores/retrieves all your values onto/from registers.

        1 + 2

        mov r1, 1
        mov r2, 2
        add r1, r2

    Register  Value
    r1        1
    r2
    r3
  10. CPU

    Register  Value
    r1        1
    r2        2
    r3
  11. CPU

    Register  Value
    r1        3
    r2        2
    r3
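For comparison with the stack machine, here is a minimal sketch of the register version from slides 9 to 11. `run_register_machine` is a hypothetical helper, and the three-register file mirrors the slide's table rather than any real CPU.

```ruby
# Toy register machine (hypothetical, for illustration only).
# Instructions: [:mov, dst, immediate] and [:add, dst, src_register].
def run_register_machine(program)
  registers = { r1: nil, r2: nil, r3: nil }
  program.each do |op, dst, src|
    case op
    when :mov
      registers[dst] = src                   # load an immediate value
    when :add
      registers[dst] += registers.fetch(src) # dst = dst + src
    else
      raise ArgumentError, "unknown instruction: #{op}"
    end
  end
  registers
end

regs = run_register_machine([[:mov, :r1, 1], [:mov, :r2, 2], [:add, :r1, :r2]])
p regs
```

As in the slide's final table, the result lands in r1 while r2 keeps its operand and r3 stays unused.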
  12. Compiling nil

        def foo
          nil
        end

    1. Put nil onto the stack
    2. Return nil

    Stack: nil (SP). WHERE DOES THE nil GO??
  13. Instruction sequence

    Looking under the hood:

        $ ruby --dump=insns foo.rb
        == disasm: #<ISeq:<main>@foo.rb:1 (1,0)-(3,3)>
        0000 definemethod :foo, foo
        0003 putobject :foo
        0005 leave

        == disasm: #<ISeq:foo@foo.rb:1 (1,0)-(3,3)>
        0000 putnil
        0001 leave
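The same listing can be produced from inside Ruby rather than with the command-line flag. The sketch below uses `RubyVM::InstructionSequence`, which is specific to CRuby; exact offsets and extra annotations in the output vary between Ruby versions.

```ruby
# Compile a snippet and print its YARV instructions (CRuby only).
source = <<~RUBY
  def foo
    nil
  end
RUBY

iseq = RubyVM::InstructionSequence.compile(source)
listing = iseq.disasm # includes the nested iseq for `foo`
puts listing
```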
  14. WHERE DO WE leave FROM?? WHERE DO WE leave TO??

        0000 putnil
        0001 leave

    Stack: nil (SP)
  15. Stack traces

    • Shows the exact path from which you errored out
    • Traces it all the way to the top level: <main>

        def foo
          raise StandardError
        end

        def bar
          foo
        end

        bar

        frames.rb:2:in `foo': StandardError
                from frames.rb:6:in `bar'
                from frames.rb:9:in `<main>'
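The path recorded in a stack trace can also be inspected programmatically. This sketch reuses the slide's `foo` and `bar` and reads the frames from `Exception#backtrace_locations`. The outermost entries depend on how the script is launched, so only the first two are looked at here.

```ruby
def foo
  raise StandardError
end

def bar
  foo
end

# Collect the method name of every frame, innermost first.
labels =
  begin
    bar
  rescue StandardError => e
    e.backtrace_locations.map(&:base_label)
  end

p labels.first(2)
```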
  16. A data structure that knows the following:

    1. Which method you are in right now

        frames.rb:2:in `foo': StandardError
                from frames.rb:6:in `bar'
                from frames.rb:9:in `<main>'
  17. A data structure that knows the following:

    1. Which method you are in right now
    2. Where in the program you are in right now
  18. A data structure that knows the following:

    1. Which method you are in right now
    2. Where in the program you are in right now
    3. What value you need to return
  22. A data structure that knows the following:

    1. Which method you are in right now
    2. Where in the program you are in right now
    3. What value you need to return

    Each of the three backtrace lines needs its own copy of this information:

        frames.rb:2:in `foo': StandardError
                from frames.rb:6:in `bar'
                from frames.rb:9:in `<main>'

    Needs to be stacked:

    1. So every method has its own copy of the context
    2. So it can be unwound when a complete backtrace needs to be shown
  23. Control frames

    Represent the path taken by your program.

        rb_control_frame_t [TOP]
        rb_control_frame_t [EVAL]
        rb_control_frame_t [METHOD:bar]
        rb_control_frame_t [METHOD:foo]   <- CFP

    Each rb_control_frame_t holds: pc (program counter), sp (stack pointer), self, ...

        frames.rb:2:in `foo': StandardError
                from frames.rb:6:in `bar'
                from frames.rb:9:in `<main>'
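To make the idea of stacked frames concrete, here is a rough model in plain Ruby. `ControlFrame` and `FrameStack` are hypothetical names, and the fields are a simplified stand-in for the real `rb_control_frame_t`, which is a C struct inside the VM.

```ruby
# Simplified, hypothetical stand-in for rb_control_frame_t.
ControlFrame = Struct.new(:label, :pc, :sp, :self_object)

class FrameStack
  def initialize
    @frames = []
  end

  # Entering a method pushes a new frame with its own context.
  def push(label, self_object: nil)
    @frames.push(ControlFrame.new(label, 0, 0, self_object))
  end

  # Leaving a method pops the current frame.
  def pop
    @frames.pop
  end

  # The frame a CFP-like pointer would refer to.
  def current
    @frames.last
  end

  # Walk the frames innermost first, as a backtrace does.
  def backtrace
    @frames.reverse.map(&:label)
  end
end

frames = FrameStack.new
%w[TOP EVAL METHOD:bar METHOD:foo].each { |label| frames.push(label) }
trace = frames.backtrace
frames.pop # returning from foo makes bar the current frame
p trace
p frames.current.label
```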
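  24. Increment CFP to leave

        == disasm: #<ISeq:<main>@main.rb:1 (1,0)-(1,3)> (catch: FALSE)
        0000 putnil
        0001 leave

    leave: increment the Control Frame Pointer (CFP). The control frame stack grows toward lower addresses, so popping a frame means adding one frame's size to CFP.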
  25. Using RJIT

        # Replace RJIT with JIT::Compiler
        RubyVM::RJIT::Compiler.prepend(Module.new {
          def compile(iseq, _)
            @compiler ||= JIT::Compiler.new
            @compiler.compile(iseq)
          end
        })

        # Enable JIT compilation (paused by --rjit=pause)
        RubyVM::RJIT.resume
  26. Building our own JIT

    Same setup as on slide 25. We write the JIT::Compiler class.
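The hook above relies on `Module#prepend`: a prepended module sits in front of the class in the method lookup chain, so its `compile` is called instead of the original. The sketch below shows the same mechanism with two hypothetical stand-in classes (`BuiltInCompiler`, `CustomCompiler`) so it runs on any Ruby, with or without RJIT.

```ruby
# Hypothetical stand-in for RubyVM::RJIT::Compiler.
class BuiltInCompiler
  def compile(iseq, _cfp)
    "built-in: #{iseq}"
  end
end

# Hypothetical stand-in for our JIT::Compiler.
class CustomCompiler
  def compile(iseq)
    "custom: #{iseq}"
  end
end

# Prepending an anonymous module overrides #compile without editing the class.
BuiltInCompiler.prepend(Module.new {
  def compile(iseq, _)
    @compiler ||= CustomCompiler.new
    @compiler.compile(iseq)
  end
})

output = BuiltInCompiler.new.compile("foo", nil)
p output
```

Because the module is prepended rather than included, the original `compile` stays reachable through `super` if a fallback is ever needed.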
  27. The JIT::Compiler class

        module JIT
          class Compiler
            # Utilities to call C functions and interact with the Ruby VM.
            C = RubyVM::RJIT::C

            # Metadata for each YARV instruction.
            INSNS = RubyVM::RJIT::INSNS

            # Compile a method
            def compile(iseq)
              # Write machine code to this assembler.
              asm = Assembler.new

              # Iterate over each YARV instruction.
              insn_index = 0
              while insn_index < iseq.body.iseq_size
                # Compile our instructions here
              end
            end
          end
        end
  28. The JIT::Compiler class

    Highlighted: the constants.

        # Utilities to call C functions and interact with the Ruby VM.
        C = RubyVM::RJIT::C

        # Metadata for each YARV instruction.
        INSNS = RubyVM::RJIT::INSNS
  29. The JIT::Compiler class

    Highlighted: the compile method.

        def compile(iseq)
          # Write machine code to this assembler.
          asm = Assembler.new

          # Iterate over each YARV instruction.
          insn_index = 0
          while insn_index < iseq.body.iseq_size
            # Compile our instructions here
          end
        end
  30. Compiling putnil

        # Iterate over each YARV instruction.
        insn_index = 0
        while insn_index < iseq.body.iseq_size
          insn = INSNS.fetch(
            C.rb_vm_insn_decode(iseq.body.iseq_encoded[insn_index])
          )
          case insn.name
          in :putnil
            # ...
          end
          insn_index += insn.len
        end
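The loop advances by `insn.len` because each instruction occupies one slot for the opcode plus one slot per operand. Here is a minimal sketch of that walk over a hand-written instruction array. `ToyInsn`, `TOY_INSNS` and the encoded array are hypothetical and only mimic the shape of the real metadata; the lengths are chosen to match the offsets shown on the 2 + 3 slides.

```ruby
# Hypothetical instruction metadata: name plus total length in slots.
ToyInsn = Struct.new(:name, :len)
TOY_INSNS = {
  putnil:    ToyInsn.new(:putnil, 1),
  putobject: ToyInsn.new(:putobject, 2), # opcode + 1 operand
  opt_plus:  ToyInsn.new(:opt_plus, 2),  # opcode + 1 operand (call data)
  leave:     ToyInsn.new(:leave, 1)
}

# Flat "encoded" sequence for `2 + 3`: opcodes interleaved with operands.
encoded = [:putobject, 2, :putobject, 3, :opt_plus, :call_data, :leave]

visited = []
insn_index = 0
while insn_index < encoded.size
  insn = TOY_INSNS.fetch(encoded[insn_index])
  case insn.name
  in :putobject
    visited << [insn_index, :putobject, encoded[insn_index + 1]]
  in :putnil | :opt_plus | :leave
    visited << [insn_index, insn.name]
  end
  insn_index += insn.len # skip over the operands to the next opcode
end

p visited
```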
  31. Compiling putnil

        STACK = [:r8, :r9]
        stack_size = 0

        while insn_index < iseq.body.iseq_size
          # ...
          case insn.name
          # ...
          in :putnil
            asm.mov(STACK[stack_size], C.to_value(nil))
            stack_size += 1
          end
        end

    Stack slots: :r8, :r9. After putnil, nil sits in the first slot (SP), which is the r8 register.
  32. Compiling leave

        EC = :rdi
        CFP = :rsi

        while insn_index < iseq.body.iseq_size
          case insn.name
          # ...
          in :leave
            asm.add(CFP, C.rb_control_frame_t.size)
            asm.mov([EC, C.rb_execution_context_t.offsetof(:cfp)], CFP)
          end
        end
  33. Compiling leave

    “Pop a control frame from the stack”

        asm.add(
          CFP,
          C.rb_control_frame_t.size
        )

        rb_control_frame_t [TOP]
        rb_control_frame_t [EVAL]
        rb_control_frame_t [METHOD:bar]   <- CFP

    Address labels on the diagram: 0000, 0007, 0008, 0015, 0016, 0024.
  34. Compiling leave

    “Set execution context to current control frame”

        asm.mov([EC, C.rb_execution_context_t.offsetof(:cfp)], CFP)

        rb_control_frame_t [TOP]
        rb_control_frame_t [EVAL]
        rb_control_frame_t [METHOD:bar]   <- CFP

    Each rb_control_frame_t holds: pc, sp, self, ...
  35. Compiling leave

        EC = :rdi
        CFP = :rsi

        while insn_index < iseq.body.iseq_size
          case insn.name
          # ...
          in :leave
            asm.add(CFP, C.rb_control_frame_t.size)
            asm.mov([EC, C.rb_execution_context_t.offsetof(:cfp)], CFP)
            asm.mov(:rax, STACK[stack_size - 1])
            asm.ret
          end
        end
  36. Compiling leave

        asm.mov(:rax, STACK[stack_size - 1])
        asm.ret

    All return values go into :rax.
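Putting putnil and leave together, the sketch below shows which instructions the compile loop would emit for `def foo; nil; end`. It does not generate machine code: `ToyAssembler` is a hypothetical recorder that only collects assembly-like text, and `QNIL`, `FRAME_SIZE` and `CFP_OFFSET` are symbolic placeholders for the values the real code obtains through `C.to_value(nil)`, `C.rb_control_frame_t.size` and `C.rb_execution_context_t.offsetof(:cfp)`.

```ruby
# Hypothetical recorder standing in for the real Assembler.
class ToyAssembler
  attr_reader :lines

  def initialize
    @lines = []
  end

  def mov(dst, src)
    @lines << "mov #{operand(dst)}, #{operand(src)}"
  end

  def add(dst, src)
    @lines << "add #{operand(dst)}, #{operand(src)}"
  end

  def ret
    @lines << "ret"
  end

  private

  # An Array operand [base, offset] is rendered as a memory reference.
  def operand(value)
    value.is_a?(Array) ? "[#{value[0]} + #{value[1]}]" : value.to_s
  end
end

STACK = [:r8, :r9]
EC = :rdi
CFP = :rsi
# Symbolic placeholders (assumptions), not real numeric values.
QNIL = "Qnil"
FRAME_SIZE = "sizeof(rb_control_frame_t)"
CFP_OFFSET = "offsetof(ec, cfp)"

def toy_compile(insn_names)
  asm = ToyAssembler.new
  stack_size = 0
  insn_names.each do |name|
    case name
    in :putnil
      asm.mov(STACK[stack_size], QNIL)
      stack_size += 1
    in :leave
      asm.add(CFP, FRAME_SIZE)              # pop the control frame
      asm.mov([EC, CFP_OFFSET], CFP)        # publish the new CFP
      asm.mov(:rax, STACK[stack_size - 1])  # return value goes into rax
      asm.ret
    end
  end
  asm.lines
end

listing = toy_compile([:putnil, :leave])
puts listing
```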
  37. Compiling 2 + 3

        def five
          2 + 3
        end

        0000 putobject 2
        0002 putobject 3
        0004 opt_plus
        0006 leave
  38. Compiling 2 + 3

        0000 putobject 2
        0002 putobject 3
        0004 opt_plus
        0006 leave

    PC steps through the listing while SP tracks the top of the stack. Stack: empty.
  39. Compiling 2 + 3

    Stack after putobject 2: 2
  40. Compiling 2 + 3

    Stack after putobject 3: 2, 3
  41. Compiling 2 + 3

    Stack after opt_plus: 5
  42. Compiling putobject

        while insn_index < iseq.body.iseq_size
          case insn.name
          # ...
          in :putobject
            operand = iseq.body.iseq_encoded[insn_index + 1]
            asm.mov(STACK[stack_size], operand)
            stack_size += 1
          end
        end
  43. Compiling opt_plus

        while insn_index < iseq.body.iseq_size
          case insn.name
          # ...
          in :opt_plus
            recv = STACK[stack_size - 2]
            obj = STACK[stack_size - 1]
            asm.add(recv, obj)
            stack_size -= 1
          end
        end

    Before the add: :r8 holds 2, :r9 holds 3 (SP).
  44. Compiling opt_plus

    After the add: :r8 holds 5 and is the new top of the stack (SP); :r9 still holds the stale 3.
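One detail the slide code does not show: CRuby stores small integers as tagged machine words (the value shifted left by one bit, with the lowest bit set), so the operands that putobject loads are tagged, not raw. Under that assumption, adding two tagged words directly does not produce a correctly tagged sum; a common fix is to subtract 1 from one operand before the add. The transcript does not show whether the talk's full code does this. The arithmetic can be checked in plain Ruby; `tag` and `untag` are hypothetical helpers that model the encoding.

```ruby
# Assumption: a small Integer n is stored as the tagged word 2n + 1.
def tag(n)
  (n << 1) | 1
end

def untag(word)
  word >> 1
end

naive = tag(2) + tag(3)        # raw add of two tagged words
fixed = (tag(2) - 1) + tag(3)  # clear one tag bit first, then add

puts "naive: #{naive}, low bit set? #{naive.odd?}"
puts "fixed: #{fixed}, decodes to #{untag(fixed)}"
```

The naive sum has its low bit clear, so it is no longer a valid tagged integer, while the adjusted sum decodes back to 5.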
  45. Bonus - Monkey patching at the machine level

        while insn_index < iseq.body.iseq_size
          case insn.name
          # ...
          in :opt_plus
            recv = STACK[stack_size - 2]
            obj = STACK[stack_size - 1]
            # asm.add(recv, obj)
            asm.mov(recv, C.to_value(420))
            stack_size -= 1
          end
        end