A Rails Developer’s Guide To The Ruby VM

RailsConf 2022

Maple Ong

June 07, 2022

Transcript

  1. The Outline (so you know what to expect)

    • The Ruby Execution Process
    • What’s a YARV?
    • How The Ruby VM Works
    • Why??
    • Practical Uses to Rails Apps
  2. require "ripper"

    foo_example = <<~HEREDOC
      def foo(a, b)
        puts a + b
      end

      foo(9, 8)
    HEREDOC

    # pp keeps the nested token arrays intact; puts would print one element per line
    pp Ripper.lex(foo_example)
  3. [[[1, 0], :on_kw, "def", FNAME],

     [[1, 3], :on_sp, " ", FNAME],
     [[1, 4], :on_ident, "foo", ENDFN],
     [[1, 7], :on_lparen, "(", BEG|LABEL],
     [[1, 8], :on_ident, "a", ARG],
     [[1, 9], :on_comma, ",", BEG|LABEL],
     [[1, 10], :on_sp, " ", BEG|LABEL],
     [[1, 11], :on_ident, "b", ARG],
     [[1, 12], :on_rparen, ")", ENDFN],
     [[1, 13], :on_ignored_nl, "\n", BEG],
     [[2, 0], :on_sp, "  ", BEG],
     [[2, 2], :on_ident, "puts", CMDARG],
     [[2, 6], :on_sp, " ", CMDARG],
     [[2, 7], :on_ident, "a", END|LABEL],
     [[2, 8], :on_sp, " ", END|LABEL],
     [[2, 9], :on_op, "+", BEG],
     [[2, 10], :on_sp, " ", BEG],
     [[2, 11], :on_ident, "b", END|LABEL],
     [[2, 12], :on_nl, "\n", BEG],
     [[3, 0], :on_kw, "end", END],
     [[3, 3], :on_nl, "\n", BEG],
     [[4, 0], :on_ignored_nl, "\n", BEG],
     [[5, 0], :on_ident, "foo", CMDARG],
     [[5, 3], :on_lparen, "(", BEG|LABEL],
     [[5, 4], :on_int, "9", END],
     [[5, 5], :on_comma, ",", BEG|LABEL],
     [[5, 6], :on_sp, " ", BEG|LABEL],
     [[5, 7], :on_int, "8", END],
     [[5, 8], :on_rparen, ")", ENDFN],
     [[5, 9], :on_nl, "\n", BEG]]
  4. [[[1, 0], :on_kw, "def", FNAME],

     [[1, 3], :on_sp, " ", FNAME],
     [[1, 4], :on_ident, "foo", ENDFN],
     [[1, 7], :on_lparen, "(", BEG|LABEL],
     [[1, 8], :on_ident, "a", ARG],
     [[1, 9], :on_comma, ",", BEG|LABEL],
     [[1, 10], :on_sp, " ", BEG|LABEL],
     [[1, 11], :on_ident, "b", ARG],
     [[1, 12], :on_rparen, ")", ENDFN],
     [[1, 13], :on_ignored_nl, "\n", BEG],
     …
  5. Your Ruby Code → Tokenizer → Tokens

    Tokens:

    [[[1, 0], :on_kw, "def", FNAME],
     [[1, 3], :on_sp, " ", FNAME],
     [[1, 4], :on_ident, "foo", ENDFN],
     [[1, 7], :on_lparen, "(", BEG|LABEL],
     [[1, 8], :on_ident, "a", ARG],
     …
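The token list is easier to scan once whitespace and newline tokens are filtered out. The following is a minimal sketch, assuming CRuby with the bundled `ripper` library; the `significant` variable and the output layout are illustrative choices, not part of the talk.

```ruby
require "ripper"

source = <<~RUBY
  def foo(a, b)
    puts a + b
  end

  foo(9, 8)
RUBY

# Ripper.lex returns entries of the form [[line, column], event, token, state].
tokens = Ripper.lex(source)

# Drop whitespace and newline tokens so only the meaningful ones remain.
significant = tokens.reject do |_pos, event, _token, _state|
  %i[on_sp on_nl on_ignored_nl].include?(event)
end

# Print position, token type, and token text for each remaining token.
significant.each do |(line, col), event, token, _state|
  puts format("%d:%-2d %-10s %p", line, col, event, token)
end
```

Each remaining entry corresponds to a keyword, identifier, operator, literal, or punctuation mark in the source.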
  7. require "ripper"

    foo_example = <<~HEREDOC
      def foo(a, b)
        puts a + b
      end

      foo(9, 8)
    HEREDOC

    # pp keeps the nested S-expression intact; puts would print one element per line
    pp Ripper.sexp(foo_example)
  8. Source Code: puts a + b

    Symbolic Expression (S-exp) Representation:

    [:command,
     [:@ident, "puts", [2, 2]],
     [:args_add_block,
      [[:binary,
        [:var_ref, [:@ident, "a", [2, 7]]],
        :+,
        [:var_ref, [:@ident, "b", [2, 11]]]]],
      false]]

    Abstract Syntax Tree (AST) Representation:

    command
    ├── ident “puts”
    └── args_add_block
        └── binary
            ├── var_ref
            │   └── ident “a”
            ├── +
            └── var_ref
                └── ident “b”
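To see which node types appear in the tree for the whole example, the S-expression can be walked recursively. This is a rough sketch, assuming CRuby with `ripper`; `node_types` is a hypothetical helper written for this illustration.

```ruby
require "ripper"

# Collects the node type symbols of an S-expression, depth first.
# (Hypothetical helper; Ripper itself does not provide it.)
def node_types(sexp, acc = [])
  return acc unless sexp.is_a?(Array)

  acc << sexp.first if sexp.first.is_a?(Symbol)
  sexp.each { |child| node_types(child, acc) }
  acc
end

source = <<~RUBY
  def foo(a, b)
    puts a + b
  end

  foo(9, 8)
RUBY

types = node_types(Ripper.sexp(source))
puts types.uniq.inspect
```

The `:command`, `:binary`, and `:var_ref` entries in the result correspond to the nodes drawn for `puts a + b`.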
  9. foo_example = <<~HEREDOC
      def foo(a, b)
        puts a + b
      end

      foo(9, 8)
    HEREDOC

    # pp keeps the nested array intact; puts would print one element per line
    pp RubyVM::InstructionSequence.compile(
      foo_example,
      specialized_instruction: false,
      instructions_unification: false,
      operands_unification: false
    ).to_a
  11. ["YARVInstructionSequence/SimpleDataFormat", 3, 1, 1,

     {:arg_size=>0, :local_size=>0, :stack_max=>3, :node_id=>19, :code_location=>[1, 0, 5, 9], :node_ids=>[1, 12, 13, 15, 12, -1]},
     "<compiled>", "<compiled>", "<compiled>", 1, :top, [], {}, [],
     [1, :RUBY_EVENT_LINE,
      [:definemethod, :foo,
       ["YARVInstructionSequence/SimpleDataFormat", 3, 1, 1,
        {:arg_size=>2, :local_size=>2, :stack_max=>3, :node_id=>11, :code_location=>[1, 0, 3, 3], :node_ids=>[5, 6, 7, 9, 5, -1]},
        "foo", "<compiled>", "<compiled>", 1, :method, [:a, :b], {:lead_num=>2}, [],
        [2, :RUBY_EVENT_LINE, :RUBY_EVENT_CALL,
         [:putself],
         [:getlocal, 4, 0],
         [:getlocal, 3, 0],
         [:send, {:mid=>:+, :flag=>16, :orig_argc=>1}, nil],
         [:send, {:mid=>:puts, :flag=>20, :orig_argc=>1}, nil],
         3, :RUBY_EVENT_RETURN,
         [:leave]]]],
      5, :RUBY_EVENT_LINE,
      [:putself],
      [:putobject, 9],
      [:putobject, 8],
      [:send, {:mid=>:foo, :flag=>20, :orig_argc=>2}, nil],
      [:leave]]]
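The array form mixes instructions with line numbers and event markers. A minimal sketch for pulling out only the instructions, assuming CRuby and assuming the instruction list is the last element of the `to_a` array, as it is in the output above:

```ruby
iseq = RubyVM::InstructionSequence.compile(
  "puts 9 + 8",
  specialized_instruction: false,
  instructions_unification: false,
  operands_unification: false
)

# Assumption: the instruction list is the last element of the to_a array.
body = iseq.to_a.last

# Instructions are nested arrays, while line numbers are Integers and
# events and labels are Symbols, so grep(Array) keeps only instructions.
instructions = body.grep(Array)
names = instructions.map(&:first)

instructions.each { |insn| p insn }
```

The layout of this array is an internal detail of CRuby and may differ between versions, so treat the indexing here as illustrative.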
  12. foo_example = <<~HEREDOC
      def foo(a, b)
        puts a + b
      end

      foo(9, 8)
    HEREDOC

    puts RubyVM::InstructionSequence.compile(
      foo_example,
      specialized_instruction: false,
      instructions_unification: false,
      operands_unification: false
    ).disasm
  13. == disasm: #<ISeq:<compiled>@<compiled>:1 (1,0)-(5,9)> (catch: FALSE)

    0000 definemethod   :foo, foo                                            ( 1)[Li]
    0003 putself                                                             ( 5)[Li]
    0004 putobject      9
    0006 putobject      8
    0008 send           <calldata!mid:foo, argc:2, FCALL|ARGS_SIMPLE>, nil
    0011 leave

    == disasm: #<ISeq:foo@<compiled>:1 (1,0)-(3,3)> (catch: FALSE)
    local table (size: 2, argc: 2 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
    [ 2] a@0<Arg>   [ 1] b@1<Arg>
    0000 putself                                                             ( 2)[LiCa]
    0001 getlocal       a@0, 0
    0004 getlocal       b@1, 0
    0007 send           <calldata!mid:+, argc:1, ARGS_SIMPLE>, nil
    0010 send           <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>, nil
    0013 leave                                                               ( 3)[Re]
  14. Source code: def foo(a, b) … end foo(9, 8)

    == disasm: #<ISeq:<compiled>@<compiled>:1 (1,0)-(5,9)> (catch: FALSE)
    0000 definemethod   :foo, foo                                            ( 1)[Li]
    0003 putself                                                             ( 5)[Li]
    0004 putobject      9
    0006 putobject      8
    0008 send           <calldata!mid:foo, argc:2, FCALL|ARGS_SIMPLE>, nil
    0011 leave

    Source code: … puts a + b …

    == disasm: #<ISeq:foo@<compiled>:1 (1,0)-(3,3)> (catch: FALSE)
    local table (size: 2, argc: 2 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
    [ 2] a@0<Arg>   [ 1] b@1<Arg>
    0000 putself                                                             ( 2)[LiCa]
    0001 getlocal       a@0, 0
    0004 getlocal       b@1, 0
    0007 send           <calldata!mid:+, argc:1, ARGS_SIMPLE>, nil
    0010 send           <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>, nil
    0013 leave                                                               ( 3)[Re]
  15. AST nodes → Compiler → Bytecode

    AST nodes: command, ident, args_add_block, binary, var_ref, ident, +, var_ref, ident

    Bytecode:

    == disasm: #<ISeq:<compiled>@<compiled>:1 (1,0)-(5,9)> (catch: FALSE)
    0000 definemethod   :foo, foo                                            ( 1)[Li]
    0003 putself                                                             ( 5)[Li]
    0004 putobject      9
    0006 putobject      8
    0008 send           <calldata!mid:foo, argc:2, FCALL|ARGS_SIMPLE>, nil
    0011 leave
  16. Your Ruby Code → Tokenizer → Tokens → Parser → AST nodes → Compiler → Bytecode → Virtual Machine (interpret)
  17. The Ruby VM’s two stacks

    1. Internal stack: stores values (returns, local variables, arguments)
    2. Control frame stack: tracks Ruby’s call stack (methods and blocks)
  18. puts RubyVM::InstructionSequence.compile(
      "puts 9 + 8",
      specialized_instruction: false,
      instructions_unification: false,
      operands_unification: false
    ).disasm

    Output:

    == disasm: #<ISeq:<compiled>@<compiled>:1 (1,0)-(1,10)> (catch: FALSE)
    0000 putself                                                             ( 1)[Li]
    0001 putobject      9
    0003 putobject      8
    0005 send           <calldata!mid:+, argc:1, ARGS_SIMPLE>, nil
    0008 send           <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>, nil
    0011 leave
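The three options passed to `compile` switch off compile-time optimizations so the bytecode stays easy to follow. As a hedged comparison, the sketch below compiles the same source with and without them; on the CRuby versions I am aware of, the default build replaces the `+` call with a specialized `opt_plus` instruction, but the exact instruction names can vary by Ruby version. `instruction_names` is a hypothetical helper.

```ruby
# Pulls the instruction name out of each "0000 name ..." disasm line.
# (Hypothetical helper for this illustration.)
def instruction_names(disasm)
  disasm.lines.filter_map { |line| line[/\A\d{4} (\w+)/, 1] }
end

source = "puts 9 + 8"

# Options from the slides: compile-time optimizations turned off.
plain = RubyVM::InstructionSequence.compile(
  source,
  specialized_instruction: false,
  instructions_unification: false,
  operands_unification: false
).disasm

# Default options, for comparison.
optimized = RubyVM::InstructionSequence.compile(source).disasm

puts "unoptimized: #{instruction_names(plain).join(', ')}"
puts "default:     #{instruction_names(optimized).join(', ')}"
```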
  19. YARV Instruction Sequence: putself, putobject(9), putobject(8), send(:+, argc: 1), send(:puts, argc: 1), leave

    The Program Counter points at the instruction being executed; the Stack Pointer points at the top of YARV’s Internal Stack.

    Instruction being executed: putself
    YARV’s Internal Stack: self
  20. Instruction being executed: putobject(9)

    YARV’s Internal Stack: self, 9
  21. Instruction being executed: putobject(8)

    YARV’s Internal Stack: self, 9, 8
  22. Instruction being executed: send(:+, argc: 1)

    YARV’s Internal Stack: self, 9, 8
  23. Instruction being executed: 9.send(:+, 8)

    YARV’s Internal Stack: self, 17
  24. Instruction being executed: send(:puts, argc: 1)

    YARV’s Internal Stack: self, 17
  25. Instruction being executed: self.send(:puts, 17)

    YARV’s Internal Stack: nil
  26. Instruction being executed: leave

    YARV’s Internal Stack: nil
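The walkthrough above can be mimicked with a toy stack machine. This is only a sketch: the `run` helper and the array encoding of instructions are made up for illustration and say nothing about how YARV is implemented internally.

```ruby
# Simplified bytecode for `puts 9 + 8`, encoded as [name, *operands].
instructions = [
  [:putself],
  [:putobject, 9],
  [:putobject, 8],
  [:send, :+, 1],
  [:send, :puts, 1],
  [:leave]
]

# Runs the instructions against an explicit stack and records the stack
# contents after each step. `receiver` stands in for `self`.
# Returns [return_value, trace].
def run(instructions, receiver)
  stack = []
  trace = []

  instructions.each do |name, *operands|
    case name
    when :putself
      stack.push(receiver)
    when :putobject
      stack.push(operands.first)
    when :send
      method_name, argc = operands
      args = stack.pop(argc)  # arguments sit on top of the stack
      target = stack.pop      # the receiver sits just below them
      stack.push(target.send(method_name, *args))
    when :leave
      trace << [name, stack.dup]
      return [stack.pop, trace]
    end
    trace << [name, stack.dup]
  end

  [stack.last, trace]
end

value, trace = run(instructions, self)
trace.each { |name, stack| puts "#{name.to_s.ljust(10)} #{stack.inspect}" }
```

Running it prints 17 (from the simulated `puts`) followed by the stack after each instruction, which lines up with the slides: the two operands are popped, 17 is pushed, and `puts` leaves `nil` behind.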
  28. YARV Instruction Sequence For Main Scope

    == disasm: #<ISeq:<compiled>@<compiled>:1 (1,0)-(5,9)> (catch: FALSE)
    0000 definemethod   :foo, foo                                            ( 1)[Li]
    0003 putself                                                             ( 5)[Li]
    0004 putobject      9
    0006 putobject      8
    0008 send           <calldata!mid:foo, argc:2, FCALL|ARGS_SIMPLE>, nil
    0011 leave
  29. Simplified Bytecode: Main Scope

    Source Code:

    def foo(a, b)
      …
    end

    foo(9, 8)

    Bytecode:

    definemethod(:foo)
    putself
    putobject(9)
    putobject(8)
    send(:foo, argc: 2)
    leave
  30. YARV instructions for main scope: definemethod(:foo), putself, putobject(9), putobject(8), send(:foo, argc: 2), leave

    Instruction being executed: definemethod(:foo)
    YARV’s Internal Stack: (empty)
  31. Instruction being executed: putself

    YARV’s Internal Stack: self
  32. Instruction being executed: putobject(9)

    YARV’s Internal Stack: self, 9
  33. Instruction being executed: putobject(8)

    YARV’s Internal Stack: self, 9, 8
  34. Instruction being executed: send(:foo, argc: 2)

    YARV’s Internal Stack: self, 9, 8
  35. YARV Instruction Sequence For Method #foo

    == disasm: #<ISeq:foo@<compiled>:1 (1,0)-(3,3)> (catch: FALSE)
    local table (size: 2, argc: 2 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
    [ 2] a@0<Arg>   [ 1] b@1<Arg>
    0000 putself                                                             ( 2)[LiCa]
    0001 getlocal       a@0, 0
    0004 getlocal       b@1, 0
    0007 send           <calldata!mid:+, argc:1, ARGS_SIMPLE>, nil
    0010 send           <calldata!mid:puts, argc:1, FCALL|ARGS_SIMPLE>, nil
    0013 leave                                                               ( 3)[Re]
  37. Simplified Bytecode: #foo Method Scope

    Source Code: … puts a + b …

    Local Table: a@0 <Arg>, b@1 <Arg>

    Bytecode:

    putself
    getlocal(a, index: 0)
    getlocal(b, index: 1)
    send(:+, argc: 1)
    send(:puts, argc: 1)
    leave
  38. YARV instructions for method scope: putself, getlocal(a, index: 0), getlocal(b, index: 1), send(:+, argc: 1), send(:puts, argc: 1), leave

    Instruction being executed: (none yet)
    YARV’s Internal Stack: self, 9, 8
  39. The Environment Pointer and three special values (Special 3, Special 2, Special 1) appear on the stack.

    Instruction being executed: (none yet)
    YARV’s Internal Stack: self, 9, 8, Special 3, Special 2, Special 1
  40. Instruction being executed: putself

    YARV’s Internal Stack: self, 9, 8, Special 3, Special 2, Special 1, self
  41. Instruction being executed: getlocal(a, index: 0)

    YARV’s Internal Stack: self, 9, 8, Special 3, Special 2, Special 1, self, 9
  42. Instruction being executed: getlocal(b, index: 1)

    YARV’s Internal Stack: self, 9, 8, Special 3, Special 2, Special 1, self, 9, 8
  43. Instruction being executed: send(:+, argc: 1)

    YARV’s Internal Stack: self, 9, 8, Special 3, Special 2, Special 1, self, 9, 8
  44. Instruction being executed: 9.send(:+, 8)

    YARV’s Internal Stack: self, 9, 8, Special 3, Special 2, Special 1, self, 17
  45. Instruction being executed: send(:puts, argc: 1)

    YARV’s Internal Stack: self, 9, 8, Special 3, Special 2, Special 1, self, 17
  46. Instruction being executed: self.send(:puts, 17)

    YARV’s Internal Stack: self, 9, 8, Special 3, Special 2, Special 1, nil
  47. Instruction being executed: leave

    YARV’s Internal Stack: self, 9, 8, Special 3, Special 2, Special 1, nil
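The method-scope walkthrough can be sketched in the same toy style by giving each call its own value stack and local table. Everything here is an assumption made for illustration: `run_frame`, the `methods` table, and the way `definemethod` carries parameter names are hypothetical, and Ruby's own recursion stands in for the control frame stack. Unlike YARV, this sketch keeps locals in a Hash rather than on the shared stack next to the special values.

```ruby
# Simplified bytecode for the body of #foo.
foo_body = [
  [:putself],
  [:getlocal, :a],
  [:getlocal, :b],
  [:send, :+, 1],
  [:send, :puts, 1],
  [:leave]
]

# Simplified bytecode for the main scope.
main = [
  [:definemethod, :foo, [:a, :b], foo_body],
  [:putself],
  [:putobject, 9],
  [:putobject, 8],
  [:send, :foo, 2],
  [:leave]
]

# Executes one instruction list in its own frame: a private value stack plus
# a local table. `methods` maps names to [params, body] and is shared.
def run_frame(instructions, receiver, locals, methods)
  stack = []

  instructions.each do |name, *operands|
    case name
    when :definemethod
      method_name, params, body = operands
      methods[method_name] = [params, body]
    when :putself
      stack.push(receiver)
    when :putobject
      stack.push(operands.first)
    when :getlocal
      stack.push(locals.fetch(operands.first))
    when :send
      method_name, argc = operands
      args = stack.pop(argc)
      target = stack.pop
      result =
        if methods.key?(method_name)
          # Toy-defined method: bind arguments to parameters, run a new frame.
          params, body = methods[method_name]
          run_frame(body, target, params.zip(args).to_h, methods)
        else
          target.send(method_name, *args)
        end
      stack.push(result)
    when :leave
      return stack.pop
    end
  end

  nil
end

run_frame(main, self, {}, {})
```

Calling `send(:foo, argc: 2)` pops 9 and 8 from the caller's stack and binds them to `a` and `b`, which is the job the local table does in the slides.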
  48. Why use a virtual machine?

    • Better performance, especially for long-running programs
      • e.g. loops with a large number of iterations
    • Additional optimizations performed on bytecode
    • Expandable / future-proof (e.g. JITs)
    • Cache bytecode instructions for use at a later time
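The last bullet, caching bytecode, can be tried by hand with `RubyVM::InstructionSequence#to_binary` and `.load_from_binary`. This is a minimal sketch assuming CRuby; the serialized format is tied to the Ruby build that produced it, so it is not meant to be shared across versions.

```ruby
source = "9 + 8"

iseq = RubyVM::InstructionSequence.compile(source)

# Serialize the compiled bytecode to a binary String. This could be
# written to disk and reused by a later run on the same Ruby build.
binary = iseq.to_binary

# Load the bytecode back and run it without tokenizing, parsing,
# or compiling the source again.
loaded = RubyVM::InstructionSequence.load_from_binary(binary)
result = loaded.eval

puts result # prints 17
```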
  50. Your Ruby Code → Tokenizer → Tokens → Parser → AST nodes → Compiler → Bytecode → VM (Interpret)

    JIT Compiler → Machine code → CPU
  53. Projects Written in Ruby

    • Ruby parser & formatter: ruby-syntax-tree/syntax_tree
    • YARV emulator: kddnewton/yarv
    • JIT for Ruby (educational): chrisseaton/rhizome
    • JIT for Ruby (YJIT): tenderlove/tenderjit

    Mentioned Projects

    • YJIT: shopify/yjit
    • Bootsnap: shopify/bootsnap