Parsing Ruby (RubyConf)

Kevin Newton
November 09, 2021

Since Ruby's inception, there have been many different projects that parse Ruby code. This includes everything from development tools to Ruby implementations themselves. This talk dives into the technical details and tradeoffs of how each of these tools parses and subsequently understands your applications. Afterward, we'll discuss how you can do the same in your own projects using the Ripper standard library. You'll see just how far we can take this library toward building useful development tools.

Transcript

  1. • Build a grammar for a simple language • Build

    a parser for a simple language • Look at the history of the Ruby parser
  2. • Build a grammar for a simple language • Build

    a parser for a simple language • Look at the history of the Ruby parser • Look at how Ripper works
  3. twitter.com/kddnewton Parsing Ruby program: | expression expression: | number "+"

    number | number "-" number | number number: | NUMBER
  4. twitter.com/kddnewton Parsing Ruby program: | expression expression: | expression "+"

    number | expression "-" number | number number: | NUMBER
  5. twitter.com/kddnewton Parsing Ruby 1 program: | expression expression: | expression

    "+" number | expression "-" number | number number: | NUMBER
  6. twitter.com/kddnewton Parsing Ruby 1 + 2 program: | expression expression:

    | expression "+" number | expression "-" number | number number: | NUMBER
  7. twitter.com/kddnewton Parsing Ruby 1 + 2 + 3 program: |

    expression expression: | expression "+" number | expression "-" number | number number: | NUMBER
  8. twitter.com/kddnewton Parsing Ruby 1 + 2 + 3 - 4

    + 5 - 6 program: | expression expression: | expression "+" number | expression "-" number | number number: | NUMBER
  9. twitter.com/kddnewton Parsing Ruby program: | expression expression: | expression "+"

    number | expression "-" number | number number: | NUMBER
  10. twitter.com/kddnewton Parsing Ruby program: | expression expression: | expression "+"

    number | expression "-" number | number term: | term "*" number | term "/" number | number number: | NUMBER
  11. twitter.com/kddnewton Parsing Ruby program: | expression expression: | expression "+"

    term | expression "-" term | term term: | term "*" number | term "/" number | number number: | NUMBER
  12. twitter.com/kddnewton Parsing Ruby 1 program: | expression expression: | expression

    "+" term | expression "-" term | term term: | term "*" number | term "/" number | number number: | NUMBER
  13. twitter.com/kddnewton Parsing Ruby 1 * 2 program: | expression expression:

    | expression "+" term | expression "-" term | term term: | term "*" number | term "/" number | number number: | NUMBER
  14. twitter.com/kddnewton Parsing Ruby program: | expression expression: | expression "+"

    term | expression "-" term | term term: | term "*" number | term "/" number | number number: | NUMBER
  15. twitter.com/kddnewton Parsing Ruby program: | expression expression: | expression "+"

    term | expression "-" term | term term: | term "*" number | term "/" number | number number: | NUMBER | "(" expression ")"
  16. twitter.com/kddnewton Parsing Ruby program: | expression expression: | expression "+"

    term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"
  17. twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3

    until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end
  25. twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3

    until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end NUMBER
  26. twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3

    until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end NUMBER “+”
  27. twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3

    until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end NUMBER “+” “(”
  28. twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3

    until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end NUMBER “+” “(” NUMBER
  29. twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3

    until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end NUMBER “+” “(” NUMBER “-”
  30. twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3

    until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end NUMBER “+” “(” NUMBER “-” NUMBER
  31. twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3

    until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end NUMBER “+” “(” NUMBER “-” NUMBER “)”
  32. twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3

    until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end NUMBER “+” “(” NUMBER “-” NUMBER “)” “*”
  33. twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3

    until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end NUMBER “+” “(” NUMBER “-” NUMBER “)” “*” NUMBER
  34. twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3

    NUMBER “+” “(” NUMBER “-” NUMBER “)” “*” NUMBER
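
The lexer loop shown on the slides above can be tried on its own. The sketch below is a minimal standalone version under a few assumptions: the method name `tokenize` is hypothetical (the slides show only the loop body), the patterns are anchored with `\A` instead of `^` so they match only at the start of the remaining input, and the error message uses `input[0, 10]` because `String#first` comes from ActiveSupport rather than core Ruby.

```ruby
# Hypothetical wrapper around the lexer loop from the slides.
# Yields [type, value] pairs; returns an Enumerator when no block is given.
def tokenize(input)
  return enum_for(:tokenize, input) unless block_given?

  until input.empty?
    case input
    when /\A\s+/
      # Whitespace produces no token.
    when /\A\d+/
      yield [:NUMBER, $&.to_i]
    when /\A[-+*\/()]/
      # Operators and parentheses use their own text as the token type.
      yield [$&, $&]
    else
      raise "Failed to parse: #{input[0, 10]}"
    end

    # $' holds everything after the most recent match.
    input = $'
  end
end

p tokenize("1 + (4 - 2) * 3").map(&:first)
# => [:NUMBER, "+", "(", :NUMBER, "-", :NUMBER, ")", "*", :NUMBER]
```

The printed token types match the stream built up one token at a time on the slides.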
  35. twitter.com/kddnewton Parsing Ruby NUMBER “+” “(” NUMBER “-” NUMBER “)”

    “*” NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"
  37. twitter.com/kddnewton Parsing Ruby “+” “(” NUMBER “-” NUMBER “)” “*”

    NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" NUMBER
  39. twitter.com/kddnewton Parsing Ruby “+” “(” NUMBER “-” NUMBER “)” “*”

    NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" factor
  40. twitter.com/kddnewton Parsing Ruby “(” NUMBER “-” NUMBER “)” “*” NUMBER

    program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" “+” factor
  41. twitter.com/kddnewton Parsing Ruby NUMBER “-” NUMBER “)” “*” NUMBER program:

    | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" “(” “+” factor
  42. twitter.com/kddnewton Parsing Ruby “-” NUMBER “)” “*” NUMBER program: |

    expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" NUMBER “(” “+” factor
  44. twitter.com/kddnewton Parsing Ruby “-” NUMBER “)” “*” NUMBER program: |

    expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" factor “(” “+” factor
  45. twitter.com/kddnewton Parsing Ruby NUMBER “)” “*” NUMBER program: | expression

    expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" “-” factor “(” “+” factor
  46. twitter.com/kddnewton Parsing Ruby “)” “*” NUMBER program: | expression expression:

    | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" NUMBER “-” factor “(” “+” factor
  47. twitter.com/kddnewton Parsing Ruby “)” “*” NUMBER program: | expression expression:

    | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" factor “-” factor “(” “+” factor
  48. twitter.com/kddnewton Parsing Ruby “)” “*” NUMBER program: | expression expression:

    | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" factor “+” “(” factor “-” factor
  49. twitter.com/kddnewton Parsing Ruby “)” “*” NUMBER program: | expression expression:

    | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" factor “-” factor “(” “+” factor
  51. twitter.com/kddnewton Parsing Ruby “)” “*” NUMBER term “-” term “(”

    “+” factor program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"
  53. twitter.com/kddnewton Parsing Ruby “)” “*” NUMBER expression “-” term “(”

    “+” factor program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"
  55. twitter.com/kddnewton Parsing Ruby “)” “*” NUMBER expression “(” “+” factor

    program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"
  56. twitter.com/kddnewton Parsing Ruby “*” NUMBER “)” expression “(” “+” factor

    program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"
  58. twitter.com/kddnewton Parsing Ruby “*” NUMBER factor “+” factor program: |

    expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"
  59. twitter.com/kddnewton Parsing Ruby NUMBER “*” factor “+” factor program: |

    expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"
  61. twitter.com/kddnewton Parsing Ruby factor “*” factor “+” factor program: |

    expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"
  62. twitter.com/kddnewton Parsing Ruby term “*” factor “+” factor program: |

    expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"
  64. twitter.com/kddnewton Parsing Ruby term “+” factor program: | expression expression:

    | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"
  65. twitter.com/kddnewton Parsing Ruby expression “+” term program: | expression expression:

    | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"
  67. twitter.com/kddnewton Parsing Ruby program program: | expression expression: | expression

    "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"
  68. twitter.com/kddnewton Parsing Ruby program: | expression expression: | expression "+"

    term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"
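
The token-by-token walkthrough above is a shift-reduce parse: tokens are shifted onto a stack, and the top of the stack is reduced whenever it matches the right-hand side of a rule. The sketch below hand-codes those decisions for this one grammar and computes the arithmetic result as it reduces. It is an illustration under assumptions, not how a generated parser is structured: the method name `shift_reduce` is hypothetical, tokens are assumed to be `[type, value]` pairs like the ones the lexer yields, and a generator such as racc or Bison would derive the shift/reduce decisions as tables rather than `if` branches.

```ruby
# Hand-written shift-reduce loop for the expression/term/factor grammar.
# Stack entries are [symbol, value] pairs.
def shift_reduce(tokens)
  input = tokens.dup
  stack = []

  loop do
    lookahead = input.first&.first
    symbols = stack.map(&:first)
    top = symbols.last
    top3 = symbols.last(3)

    if top == :NUMBER
      # factor: NUMBER
      stack.push([:factor, stack.pop.last])
    elsif top3 == ["(", :expression, ")"]
      # factor: "(" expression ")"
      _, (_, value), _ = stack.pop(3)
      stack.push([:factor, value])
    elsif top3[0] == :term && ["*", "/"].include?(top3[1]) && top3[2] == :factor
      # term: term "*" factor | term "/" factor
      (_, left), (op, _), (_, right) = stack.pop(3)
      stack.push([:term, op == "*" ? left * right : left / right])
    elsif top == :factor
      # term: factor
      stack.push([:term, stack.pop.last])
    elsif top == :term && ["*", "/"].include?(lookahead)
      # A higher-precedence operator follows, so shift instead of reducing.
      stack.push(input.shift)
    elsif top3[0] == :expression && ["+", "-"].include?(top3[1]) && top3[2] == :term
      # expression: expression "+" term | expression "-" term
      (_, left), (op, _), (_, right) = stack.pop(3)
      stack.push([:expression, op == "+" ? left + right : left - right])
    elsif top == :term
      # expression: term
      stack.push([:expression, stack.pop.last])
    elsif lookahead
      # Nothing to reduce yet: shift the next token.
      stack.push(input.shift)
    elsif symbols == [:expression]
      # program: expression
      return stack.first.last
    else
      raise "Syntax error with stack: #{symbols.inspect}"
    end
  end
end

tokens = [
  [:NUMBER, 1], ["+", "+"], ["(", "("], [:NUMBER, 4], ["-", "-"],
  [:NUMBER, 2], [")", ")"], ["*", "*"], [:NUMBER, 3]
]
p shift_reduce(tokens) # => 7
```

The one-token lookahead check before reducing a `term` to an `expression` is what gives `*` and `/` higher precedence than `+` and `-`.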
  69. twitter.com/kddnewton Parsing Ruby program: | expression expression: | expression "+"

    term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" class Parser prechigh left "*" "/" left "+" "-" preclow options no_result_var rule program : expression expression : expression "+" expression | expression "-" expression | expression "*" expression | expression "/" expression | "(" expression ")" | NUMBER end
  70. twitter.com/kddnewton Parsing Ruby class Parser prechigh left "*" "/" left

    "+" "-" preclow options no_result_var rule program : expression expression : expression "+" expression | expression "-" expression | expression "*" expression | expression "/" expression | "(" expression ")" | NUMBER end
  71. twitter.com/kddnewton Parsing Ruby class Parser prechigh left "*" "/" left

    "+" "-" preclow options no_result_var rule program : expression expression : expression "+" expression { val[0] + val[2] } | expression "-" expression { val[0] - val[2] } | expression "*" expression { val[0] * val[2] } | expression "/" expression { val[0] / val[2] } | "(" expression ")" { val[1] } | NUMBER end
  72. twitter.com/kddnewton Parsing Ruby class Parser prechigh left "*" "/" left

    "+" "-" preclow options no_result_var rule program : expression expression : expression "+" expression { [:add, val[0], val[2]] } | expression "-" expression { [:sub, val[0], val[2]] } | expression "*" expression { [:mul, val[0], val[2]] } | expression "/" expression { [:div, val[0], val[2]] } | "(" expression ")" { [:prn, val[1]] } | NUMBER end
  73. twitter.com/kddnewton Parsing Ruby $ racc parser.y -o parser.rb $ irb

    irb(main):001:0> require_relative "parser" => true irb(main):002:0>
  74. twitter.com/kddnewton Parsing Ruby

    $ racc parser.y -o parser.rb
    $ irb
    irb(main):001:0> require_relative "parser"
    => true
    irb(main):002:0> Parser.new.parse("1 + (4 - 2) * 3")
    => [:add, 1, [:mul, [:prn, [:sub, 4, 2]], 3]]
    irb(main):003:0>
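
Once the grammar actions build a tree instead of computing values, evaluation becomes a separate pass over that tree. The sketch below is one way to walk the nested arrays returned in the irb session above. The `evaluate` method is hypothetical and not part of the talk; it assumes leaves are Integers and that nodes use the `:add`, `:sub`, `:mul`, `:div`, and `:prn` tags from the grammar actions.

```ruby
# Recursively evaluates a tree of [:type, left, right] arrays.
def evaluate(node)
  return node if node.is_a?(Integer)

  type, left, right = node

  case type
  when :add then evaluate(left) + evaluate(right)
  when :sub then evaluate(left) - evaluate(right)
  when :mul then evaluate(left) * evaluate(right)
  when :div then evaluate(left) / evaluate(right)
  when :prn then evaluate(left) # parentheses only wrap a single child
  else raise ArgumentError, "Unknown node: #{node.inspect}"
  end
end

tree = [:add, 1, [:mul, [:prn, [:sub, 4, 2]], 3]]
p evaluate(tree) # => 7
```

Keeping the tree around is what makes other passes possible, such as formatting or linting, instead of only producing a number.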
  75. twitter.com/kddnewton Parsing Ruby class Parser prechigh left "*" "/" left

    "+" "-" preclow options no_result_var rule program : expression expression : expression "+" expression { [:add, val[0], val[2]] } | expression "-" expression { [:sub, val[0], val[2]] } | expression "*" expression { [:mul, val[0], val[2]] } | expression "/" expression { [:div, val[0], val[2]] } | "(" expression ")" { [:prn, val[1]] } | NUMBER end
  76. twitter.com/kddnewton Parsing Ruby Ruby 0.06 1994-01-07 Fri Jan 7 15:23:20

    1994 Yukihiro Matsumoto (matz at nws119) * baseline - version 0.06.
  77. twitter.com/kddnewton Parsing Ruby Ruby 0.76 1995-05-19 Thu Jul 14 11:18:07

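    1994 Yukihiro Matsumoto (matz@ix-02) * parse.y: added syntax for creating a Dict; it uses {..}. * parse.y: changed the syntax for creating arrays to [..]. This breaks compatibility with earlier Ruby scripts, but with Dict syntax being introduced, and with an eye to matching perl5, I decided that now was the only time to make the change. *BACKWARD INCOMPATIBILITY*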
  78. twitter.com/kddnewton Parsing Ruby Ruby 0.95 1995-12-21 Thu Nov 9 23:26:01

    1995 Yukihiro Matsumoto <[email protected]> * parse.y (f_arglist): method definition arguments no longer need to be wrapped in parentheses. Mon Aug 7 12:47:41 1995 Yukihiro Matsumoto <[email protected]> * parse.y: resque -> rescue. Embarrassing, but I can't very well leave a typo in place. Why didn't I notice it until now....
  79. twitter.com/kddnewton Parsing Ruby Ruby 1.0.961225 1996-12-25 Wed May 22 19:48:42

    1996 Yukihiro Matsumoto <[email protected]> * parse.y (superclass): changed the superclass specifier from `:' to `<'. Wed Mar 27 10:02:44 1996 Yukihiro Matsumoto <[email protected]> * parse.y: reserved word change: continue -> next
  80. twitter.com/kddnewton Parsing Ruby Ruby 1.0.971225 1997-12-25 Mon Apr 7 11:36:16

    1997 Yukihiro Matsumoto <[email protected]> * parse.y (primary): syntax to access singleton class. Thu Apr 3 02:12:31 1997 Yukihiro Matsumoto <[email protected]> * parse.y (parse_regx): new option //[nes] to specify character code for regexp literals. Last specified code option is valid.
  81. twitter.com/kddnewton Parsing Ruby Ruby 1.2.0 1998-12-25 • heredocs
 • =begin to =end
 • true and false
 • BEGIN and END
 • %w
 • Top-level constant access
 • ||= and &&=
  82. twitter.com/kddnewton Parsing Ruby Ruby 1.4.0 1999-08-13 • binary number literals
 • anonymous * in method definitions
 • nested string interpolation
 • multibyte character identifiers
  83. twitter.com/kddnewton Parsing Ruby Ruby 1.7.1 2001-06-01 • break and next now accept values
 • %w can escape spaces
 • rescue in singleton method bodies
  84. twitter.com/kddnewton Parsing Ruby JRuby 2001-09-10 • Java port of Ruby 1.6
 • Rewrite actions in parse.y into Java
 • Rewrite standard library in Ruby
  85. twitter.com/kddnewton Parsing Ruby class Parser prechigh left "*" "/" left

    "+" "-" preclow options no_result_var rule program : expression expression : expression "+" expression { [:add, val[0], val[2]] } | expression "-" expression { [:sub, val[0], val[2]] } | expression "*" expression { [:mul, val[0], val[2]] } | expression "/" expression { [:div, val[0], val[2]] } | "(" expression ")" { [:prn, val[1]] } | NUMBER end
  87. twitter.com/kddnewton Parsing Ruby ripper 0.0.1 2001-10-20 • Rewrite parse.y to dispatch parser events
 • “Ripper is still early-alpha version”
  88. twitter.com/kddnewton Parsing Ruby Ruby 1.8.0 2003-08-04 • %W word lists
 • Dynamic symbols
 • Nested constant assignment
  89. twitter.com/kddnewton Parsing Ruby ruby_parser 1.0.0 2007-11-14 • Rewrite parse.y in Ruby, use racc
 • dawnscanner, debride, fasterer, flay, flog, railroader, roodi
  90. twitter.com/kddnewton Parsing Ruby Ruby 1.9.0 2007-12-25 • Bison
 • YARV
 • Ripper merged
 • Lambda literals
 • Symbol hash keys
  91. twitter.com/kddnewton Parsing Ruby Ruby 1.9.1 2009-01-30 • encoding pragma
 • call shorthand
 • positional arguments after splat
  92. twitter.com/kddnewton Parsing Ruby Ruby Intermediate Language 2009-10-26 • Intermediate representation for powering semantic analysis
 • Used to implement type systems in OCaml
 • druby, rtc, rubydust
  93. twitter.com/kddnewton Parsing Ruby Ruby 2.0.0 2013-02-24 • Refinements
 • %i symbol lists
 • Keyword arguments
  94. twitter.com/kddnewton Parsing Ruby parser 2013-04-15 • Published gem, parser API
 • Well-documented
 • covered, deep-cover, erb-lint, fast, opal, packwerk, querly, rdl, reek, rubocop, rubrowser, ruby-lint, ruby-next, ruby_detective, rubycritic, seeing_is_believing, standard, steep, unparser, vernacular, yoda
  95. twitter.com/kddnewton Parsing Ruby TruffleRuby 2013-10-26 • Originally branched off JRuby
 • Graal dynamic compiler, Truffle AST interpreter
  96. twitter.com/kddnewton Parsing Ruby Ruby 2.1.0 2013-12-25 • Required keyword arguments
 • Rational and complex literals
 • Frozen string literal suffix
  97. twitter.com/kddnewton Parsing Ruby Ruby 2.3.0 2015-12-25 • Frozen string literal pragma
 • <<~ heredocs
 • &. lonely operator
  98. twitter.com/kddnewton Parsing Ruby Ruby 2.4.0 2016-12-25 • Symbol#to_proc refinements
 • Top-level return
 • Multiple assignment in conditional
  99. twitter.com/kddnewton Parsing Ruby typedruby 2017-02-26 • Type system in Rust, parser in C++
 • Grammar from Ruby 2.4, lexer from ruby_parser
 • Vendored in Sorbet
  100. twitter.com/kddnewton Parsing Ruby Ruby 2.5.0 2017-12-25 • String interpolation refinements
 • rescue and ensure at the block level
  101. twitter.com/kddnewton Parsing Ruby Ruby 2.6.0 2018-12-25 • RubyVM::AbstractSyntaxTree
 • Flip-flop deprecated
 • Endless ranges
 • Non-ASCII constant names
  102. twitter.com/kddnewton Parsing Ruby Ruby 2.7.0 2019-12-25 • Flip-flop undeprecated
 • Method reference operator
 • Keyword argument warning
 • No other keywords syntax
 • Beginless range
 • Pattern matching
 • Numbered parameters
 • Rightward assignment
 • Argument forwarding
  103. twitter.com/kddnewton Parsing Ruby Ruby 3.0.0 2020-12-25 • Keyword arguments
 • Single-line “endless” methods
 • “Find pattern” pattern matching
 • shareable_constant_value pragma
 • in keyword pattern matching
  104. twitter.com/kddnewton Parsing Ruby Ruby 3.1.0 Preview 1 2021-11-09 • Hash literal shorthand { x:, y: } == { x: x, y: y }
 • Pinned expressions
  105. twitter.com/kddnewton Parsing Ruby class Parser expression : expression "+" expression

    { [:add, val[0], val[2]] } | expression "-" expression { [:sub, val[0], val[2]] } | expression "*" expression { [:mul, val[0], val[2]] } | expression "/" expression { [:div, val[0], val[2]] } | "(" expression ")" { [:prn, val[1]] } | NUMBER def parse(input) until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end end
  108. twitter.com/kddnewton Parsing Ruby static enum yytokentype parser_yylex(struct parser_params *p) {

    ... case '#': /* it's a comment */ p->token_seen = token_seen; lex_goto_eol(p); dispatch_scan_event(p, tCOMMENT); fallthru = TRUE; /* fall through */ case '\n': p->token_seen = token_seen; ... }
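
The `dispatch_scan_event(p, tCOMMENT)` call in the C lexer above is what surfaces on the Ruby side as a scanner event. As a small check, assuming the `ripper` library that ships with CRuby, the scanner events Ripper knows about can be listed from its constants. Event names are stored without the `on_` prefix used by handler methods.

```ruby
require "ripper"

# Scanner events correspond to tokens produced by the lexer.
p Ripper::SCANNER_EVENTS.include?(:comment) # => true

# Every scanner event handler receives one argument: the token text.
p Ripper::SCANNER_EVENT_TABLE[:comment] # => 1
```

A handler named `on_comment` on a `Ripper` subclass is therefore called once per comment token, with the comment's source text as its argument.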
  110. twitter.com/kddnewton Parsing Ruby require "ripper" class Parser < Ripper attr_reader

    :comments def initialize(*) super @comments = [] end def on_comment(value) @comments << value end end p Parser.new(<<~RUBY).tap(&:parse).comments # this is a comment # this is another comment # this is a third comment RUBY
  117. twitter.com/kddnewton Parsing Ruby

    require "ripper"

    class Parser < Ripper
      attr_reader :comments

      def initialize(*)
        super
        @comments = []
      end

      def on_comment(value)
        @comments << value
      end
    end

    p Parser.new(<<~RUBY).tap(&:parse).comments
      # this is a comment
      # this is another comment
      # this is a third comment
    RUBY

    [
      "# this is a comment\n",
      "# this is another comment\n",
      "# this is a third comment\n"
    ]
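
For cases where only tokens are needed, subclassing is not required. The sketch below uses `Ripper.lex`, which returns every scanner token along with its position, and filters for comments. It is an alternative to the subclass on the slide rather than something shown in the talk; the sample source and the hash keys (`line`, `column`, `text`) are made up for illustration.

```ruby
require "ripper"

source = <<~RUBY
  # this is a comment
  x = 1 # this is a trailing comment
RUBY

# Each entry from Ripper.lex starts with [[line, column], event, token].
comments =
  Ripper.lex(source)
    .select { |token| token[1] == :on_comment }
    .map { |token| { line: token[0][0], column: token[0][1], text: token[2].chomp } }

comments.each { |comment| p comment }
```

Unlike the `on_comment` handler, this also reports where each comment starts, which is the kind of detail a formatter or linter usually needs.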
  118. twitter.com/kddnewton Parsing Ruby class Parser expression : expression "+" expression

    { [:add, val[0], val[2]] } | expression "-" expression { [:sub, val[0], val[2]] } | expression "*" expression { [:mul, val[0], val[2]] } | expression "/" expression { [:div, val[0], val[2]] } | "(" expression ")" { [:prn, val[1]] } | NUMBER def parse(input) until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end end
  119. twitter.com/kddnewton Parsing Ruby method_call : | keyword_super paren_args { /*%%%*/

    $$ = NEW_SUPER($2, &@$); /*% %*/ /*% ripper: super!($2) %*/ } | keyword_super { /*%%%*/ $$ = NEW_ZSUPER(&@$); /*% %*/ /*% ripper: zsuper! %*/ }
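
The `/*% ripper: super!($2) %*/` and `/*% ripper: zsuper! %*/` annotations in the grammar above name the parser events that Ripper dispatches, and the number of arguments in each annotation becomes the arity of the matching handler. Assuming the `ripper` library that ships with CRuby, this can be checked from its event table:

```ruby
require "ripper"

# Parser events correspond to grammar rules being reduced.
p Ripper::PARSER_EVENTS.include?(:super) # => true

# super!($2) passes one value, zsuper! passes none.
p Ripper::PARSER_EVENT_TABLE[:super]  # => 1
p Ripper::PARSER_EVENT_TABLE[:zsuper] # => 0
```

That is why a handler for `super` with arguments is written as `on_super(arguments)`, while the handler for a bare `super` is `on_zsuper` with no parameters.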
  121. twitter.com/kddnewton Parsing Ruby require "ripper" class Parser < Ripper def

    on_super(arguments) puts "super called with arguments: #{arguments}" end def on_zsuper puts "super called without arguments" end end Parser.new(<<~RUBY).parse super super(1, 2, 3) RUBY
  124. require "ripper"

       class Parser < Ripper
         def on_super(arguments)
           puts "super called with arguments: #{arguments}"
         end

         def on_zsuper
           puts "super called without arguments"
         end
       end

       Parser.new(<<~RUBY).parse
         super
         super(1, 2, 3)
       RUBY

       super called without arguments
       super called with arguments:
  125. method_call :
         | keyword_super paren_args
           {
           /*%%%*/
             $$ = NEW_SUPER($2, &@$);
           /*% %*/
           /*% ripper: super!($2) %*/
           }
         | keyword_super
           {
           /*%%%*/
             $$ = NEW_ZSUPER(&@$);
           /*% %*/
           /*% ripper: zsuper! %*/
           }
  128. require "ripper"

       class Parser < Ripper
         def on_super(arguments)
           puts "super called with arguments: #{arguments}"
         end

         def on_zsuper
           puts "super called without arguments"
         end
       end

       Parser.new(<<~RUBY).parse
         super
         super(1, 2, 3)
       RUBY
  129. require "ripper"

       class Parser < Ripper::SexpBuilderPP
         def on_super(arguments)
           puts "super called with arguments: #{arguments}"
         end

         def on_zsuper
           puts "super called without arguments"
         end
       end

       Parser.new(<<~RUBY).parse
         super
         super(1, 2, 3)
       RUBY
  130. require "ripper"

       class Parser < Ripper::SexpBuilderPP
         def on_super(arguments)
           puts "super called with arguments: #{arguments}"
         end

         def on_zsuper
           puts "super called without arguments"
         end
       end

       Parser.new(<<~RUBY).parse
         super
         super(1, 2, 3)
       RUBY

       super called without arguments
       super called with arguments: [:arg_paren,
        [:args_add_block,
         [
           [:@int, "1", [2, 6]],
           [:@int, "2", [2, 9]],
           [:@int, "3", [2, 12]]
         ],
         false]]
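For a quick look at the tree without writing a subclass, the standard library also exposes `Ripper.sexp`, which runs the source through `Ripper::SexpBuilderPP` and returns the resulting nested arrays. A small sketch using the same two `super` calls (the variable names are only for illustration; the exact shape of the argument nodes can vary between Ruby versions, so only the outer structure is commented on here):

```ruby
require "ripper"
require "pp"

# Ripper.sexp parses the source with SexpBuilderPP and returns the tree.
bare = Ripper.sexp("super")
with_arguments = Ripper.sexp("super(1, 2, 3)")

pp bare           # a :program node wrapping a single :zsuper node
pp with_arguments # a :program node wrapping a :super node and its arguments
```

The first element of every node is the event name, which is the same name used by the `on_*` handler for that event.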
  131. • Implement every method handler yourself
       • Inherit from Ripper::SexpBuilder or Ripper::SexpBuilderPP
       • Some combination of both
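The third option, combining both, could look something like the sketch below: inherit from `Ripper::SexpBuilderPP` so that every event already builds a node, then override only the handlers that need extra behaviour and call `super` to keep the default node. The class name `CommentAwareBuilder` and the sample source are hypothetical.

```ruby
require "ripper"

# Hypothetical builder: keeps SexpBuilderPP's default tree and collects
# comments on the side.
class CommentAwareBuilder < Ripper::SexpBuilderPP
  attr_reader :comments

  def initialize(*)
    super
    @comments = []
  end

  # Override a single handler; super still returns the default
  # [:@comment, text, [line, column]] node.
  def on_comment(value)
    @comments << value.chomp
    super
  end
end

builder = CommentAwareBuilder.new("# note\nsuper(1)\n")
tree = builder.parse

p builder.comments
p tree
```

Only the overridden handler changes; every other event keeps the inherited behaviour, so the returned tree has the same shape the parent class would produce.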
  132. # frozen_string_literal: true

       require 'ripper'

       class Prettier::Parser < Ripper
         # Represents a line in the source. If this class is being used, it means that
         # every character in the string is 1 byte in length, so we can just return the
         # start of the line + the index.
         class SingleByteString
           def initialize(start)
             @start = start
           end

           def [](byteindex)
             @start + byteindex
           end
         end

         # Represents a line in the source. If this class is being used, it means that
         # there are characters in the string that are multi-byte, so we will build up
         # an array of indices, such that array[byteindex] will be equal to the index
         # of the character within the string.
         class MultiByteString
           def initialize(start, line)
             @indices = []

             line
               .each_char
               .with_index(start) do |char, index|
                 char.bytesize.times { @indices << index }
               end
           end

           def [](byteindex)
             @indices[byteindex]
           end
         end

         class Location
           attr_reader :start_line, :start_char, :end_line, :end_char

           def initialize(start_line:, start_char:, end_line:, end_char:)
             @start_line = start_line
  133.       ending = find_scanner_event(:@tstring_end)

             {
               type: :xstring_literal,
               body: xstring[:body],
               loc: xstring[:loc].to(ending[:loc])
             }
           end
         end

         # yield is a parser event that represents using the yield keyword with
         # arguments. It accepts as an argument an args_add_block event that
         # contains all of the arguments being passed.
         def on_yield(args_add_block)
           event = find_scanner_event(:@kw, 'yield')

           {
             type: :yield,
             body: [args_add_block],
             loc: event[:loc].to(args_add_block[:loc])
           }
         end

         # yield0 is a parser event that represents the bare yield keyword. It has
         # no body as it accepts no arguments. This is as opposed to the yield
         # parser event, which is the version where you're yielding one or more
         # values.
         def on_yield0
           event = find_scanner_event(:@kw, 'yield')

           { type: :yield0, body: event[:body], loc: event[:loc] }
         end

         # zsuper is a parser event that represents the bare super keyword. It has
         # no body as it accepts no arguments. This is as opposed to the super
         # parser event, which is the version where you're calling super with one
         # or more values.
         def on_zsuper
           event = find_scanner_event(:@kw, 'super')

           { type: :zsuper, body: event[:body], loc: event[:loc] }
         end
       end
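The slides do not show how `find_scanner_event` is implemented. One way such a lookup could work, sketched here purely as an assumption, is to record every keyword token as the scanner emits it and have the parser event handler take the most recent matching one. The names `SuperLocator`, `find_keyword`, and `supers` are hypothetical and are not taken from the Prettier plugin.

```ruby
require "ripper"

# Hypothetical sketch: pair a parser event (zsuper) with the scanner event
# (the "super" keyword token) that carries its source location.
class SuperLocator < Ripper
  attr_reader :supers

  def initialize(*)
    super
    @keywords = []
    @supers = []
  end

  # Scanner event: remember each keyword along with where it starts.
  def on_kw(value)
    @keywords << { body: value, line: lineno, column: column }
    value
  end

  # Parser event: fires after the keyword token has been scanned, so the
  # matching token can be looked up and consumed.
  def on_zsuper
    event = find_keyword("super")
    @supers << { type: :zsuper, line: event[:line], column: event[:column] }
  end

  private

  # Removes and returns the most recently seen keyword with the given text.
  def find_keyword(body)
    index = @keywords.rindex { |event| event[:body] == body }
    raise "no #{body} keyword has been scanned yet" unless index

    @keywords.delete_at(index)
  end
end

locator = SuperLocator.new("def foo\n  super\nend\n")
locator.parse

p locator.supers
```

Consuming the token once it has been matched keeps a later parser event from picking up the same keyword again.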
  134. ruby_parser
       • Some community adoption
       • Not 100% compatible, new stuff may break
       • Doesn’t ship with/test with core
  135. parser
       • Tons of community adoption
       • Well-documented
       • Not 100% compatible, new stuff may break
       • Doesn’t ship with/test with core
  136. ripper
       • Built into the parser generator
       • Well-tested in core
       • Ships with Ruby
       • No documentation