
Parsing Ruby (RubyKaigi)

Kevin Newton
September 09, 2021

Since Ruby's inception, many different projects have parsed Ruby code, from development tools to Ruby implementations themselves. This talk dives into the technical details and tradeoffs of how each of these tools parses and subsequently understands your applications. Afterward, we'll discuss how you can do the same in your own projects using the Ripper standard library. You'll see just how far we can take this library toward building useful development tools.

Transcript

  1. Parsing Ruby
    Kevin Newton

    https://kddnewton.com/parsing-ruby


  2. The Early Days


  3. twitter.com/kddnewton
    Parsing Ruby
    Ruby 0.06 1994-01-07
    Fri Jan 7 15:23:20 1994 Yukihiro Matsumoto (matz at nws119)
    * baseline - version 0.06.


  4. Ruby 0.76 1995-05-19
    Thu Jul 14 11:18:07 1994 Yukihiro Matsumoto (matz@ix-02)
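    * parse.y: Added syntax for creating a Dict, written as {..}.
    * parse.y: Changed the syntax for creating an array to [..]. This breaks
    compatibility with earlier Ruby scripts, but with syntax for creating a
    Dict being introduced, I decided that now is the only time to make the
    change, deliberately following perl5.
    *BACKWARD INCOMPATIBILITY*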


  5. Ruby 0.95 1995-12-21
    Thu Nov 9 23:26:01 1995 Yukihiro Matsumoto
    * parse.y (f_arglist): Method definition arguments no longer need to be
    wrapped in parentheses.
    Mon Aug 7 12:47:41 1995 Yukihiro Matsumoto
    * parse.y: resque -> rescue. Embarrassing, but I can't leave a typo in
    place. Why didn't I notice it until now....


  6. Ruby 1.x


  7. Ruby 1.0.961225 1996-12-25
    Wed May 22 19:48:42 1996 Yukihiro Matsumoto
    * parse.y (superclass): Changed the superclass specifier from `:' to `<'.
    Wed Mar 27 10:02:44 1996 Yukihiro Matsumoto
    * parse.y: Reserved word change: continue -> next


  8. Ruby 1.0.971225 1997-12-25
    Mon Apr 7 11:36:16 1997 Yukihiro Matsumoto
    * parse.y (primary): syntax to access singleton class.
    Thu Apr 3 02:12:31 1997 Yukihiro Matsumoto
    * parse.y (parse_regx): new option //[nes] to specify character
    code for regexp literals. Last specified code option is valid.


  9. Ruby 1.3.0 1998-12-24
    • begin..rescue..else..end clauses
    • <<- indentable heredocs

    • :: method calls


  10. Ruby 1.2.0 1998-12-25
    • heredocs

    • =begin to =end

    • true and false

    • BEGIN and END

    • %w

    • Top-level constant access

    • ||= and &&=


  11. Ruby 1.4.0 1999-08-13
    • binary number literals

    • anonymous * in method definitions

    • nested string interpolation

    • multibyte character identifiers


  12. Ruby 1.5.0 1999-12-07
    • Compile-time string concatenation


  13. Ruby 1.6.0 2000-09-19
    • rescue modifier


  14. nodeDump 0.1.0 2000-10-01
    • C extension

    • Human-readable format


  15. ruth 0.0.1 2001-01-10
    • C extension

    • Ruby under the hood

    • Ruby::Interpreter.parse


  16. Ruby 1.7.1 2001-06-01
    • break and next now accept values

    • %w can escape spaces

    • rescue in singleton method bodies


  17. JRuby 2001-09-10
    • Java port of Ruby 1.6

    • Rewrite actions in parse.y into Java

    • Rewrite standard library in Ruby


  18. ripper 0.0.1 2001-10-20
    • Rewrite parse.y to dispatch parser events

    • “Ripper is still early-alpha version”
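
The idea of a parser that dispatches events can be sketched with the Ripper that ships in today's standard library (not the 0.0.1 release named on this slide). This is a minimal sketch that assumes a CRuby with `ripper` available; `MethodNameCollector` is a hypothetical class introduced only for illustration.

```ruby
require "ripper"

# Hypothetical example class: records the name of every method definition
# by overriding one of Ripper's parser event handlers.
class MethodNameCollector < Ripper
  attr_reader :names

  def initialize(source)
    super
    @names = []
  end

  private

  # Called each time the parser reduces a `def` rule. Ripper's default
  # scanner event handlers return the raw token text, so `name` arrives
  # here as a String.
  def on_def(name, _params, _body)
    @names << name
    name
  end
end

collector = MethodNameCollector.new("def foo; end\ndef bar(x) x end\n")
collector.parse
p collector.names
```

Running this should print the two method names in source order. Any event without an overriding handler falls through to Ripper's default, which simply passes values along.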


  19. MetaRuby 0.7.0 2002-10-09
    • Ruby implemented in Ruby

    • Schema for Ruby AST


  20. Ruby 1.8.0 2003-08-04
    • %W word lists

    • Dynamic symbols

    • Nested constant assignment


  21. ParseTree 1.0.0 2004-11-10
    • C extension

    • s-expressions from NODE structs


  22. RubyNode 0.1.0 2006-06-05
    • C extension

    • Hashes representing NODE struct fields


  23. Rubinius 2006-07-12
    • sydney

    • Rewrote parse.y in Ruby

    • Rewrote standard library in Ruby


  24. Cardinal 2006-07-16
    • Parrot VM

    • From scratch PGE grammar from Ruby EBNF


  25. IronRuby 2007-04-30
    • Microsoft .NET port of Ruby

    • Rewrote parse.y, reused parts


  26. ruby_parser 1.0.0 2007-11-14
    • Rewrite parse.y in Ruby, use racc

    • dawnscanner, debride, fasterer, flay, flog, railroader, roodi


  27. YARV


  28. Ruby 1.9.0 2007-12-25
    • Bison

    • YARV

    • Ripper merged

    • Lambda literals

    • Symbol hash keys


  29. MacRuby 2008-03-13
    • LLVM port using Objective-C

    • Mostly reuse parse.y


  30. Ruby 1.9.1 2009-01-30
    • encoding pragma

    • call shorthand

    • positional arguments after splat


  31. Ruby Intermediate Language 2009-10-26
    • Intermediate representation for powering semantic analysis

    • Used to implement type systems in OCaml

    • druby, rtc, rubydust


  32. LASER 0.0.1 2010-08-27
    • Undergraduate thesis

    • Linter, type system, documentation generation

    • Used ripper internally


  33. Ruby 1.9.3 2011-10-31
    • JIS X 3017

    • ISO/IEC 30170:2012


  34. Topaz 2012-04-07
    • Ruby implementation targeting RPython

    • Rewrote parse.y making grammar rules into decorators


  35. Ruby 2.x


  36. Ruby 2.0.0 2013-02-24
    • Refinements

    • %i symbol lists

    • Keyword arguments


  37. parser 2013-04-15
    • Published gem, parser API

    • Well-documented

    • covered, deep-cover, erb-lint, fast, opal, packwerk, querly,
    rdl, reek, rubocop, rubrowser, ruby-lint, ruby-next,
    ruby_detective, rubycritic, seeing_is_believing, standard,
    steep, unparser, vernacular, yoda


  38. TruffleRuby 2013-10-26
    • Originally forked JRuby

    • Graal dynamic compiler, Truffle AST interpreter


  39. Ruby 2.1.0 2013-12-25
    • Required keyword arguments

    • Rational and complex literals

    • Frozen string literal suffix


  40. Ruby 2.2.0 2014-12-25
    • Dynamic symbol hash keys


  41. Ruby 2.3.0 2015-12-25
    • Frozen string literal pragma

    • <<~ heredocs

    • &. lonely operator
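
Two of the Ruby 2.3.0 additions are easiest to see in use. The following is a small illustrative sketch, assuming Ruby 2.3 or later; the variable names are made up for the example. The frozen string literal pragma is left out because it only takes effect as a magic comment at the top of a file.

```ruby
# <<~ strips the common leading indentation from every line of the heredoc.
banner = <<~TEXT
  Parsing
    Ruby
TEXT

# &. calls the method only when the receiver is not nil.
missing = nil
present = "ruby"

p banner
p missing&.upcase
p present&.upcase
```

The heredoc keeps the two extra spaces before "Ruby" because only the indentation shared by all lines is removed.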


  42. Ruby 2.4.0 2016-12-25
    • Symbol#to_proc refinements

    • Top-level return

    • Multiple assignment in conditional


  43. tree-sitter 2017-02-02
    • Parser-generator library

    • vscode-ruby


  44. typedruby 2017-02-26
    • Type system in Rust, parser in C++

    • Grammar from Ruby 2.4, lexer from ruby_parser

    • Vendored in Sorbet


  45. Ruby 2.5.0 2017-12-25
    • String interpolation refinements

    • rescue and ensure at the block level


  46. Ruby 2.6.0 2018-12-25
    • RubyVM::AbstractSyntaxTree

    • Flip-flop deprecated

    • Endless ranges

    • Non-ASCII constant names


  47. Ruby 2.7.0 2019-12-25
    • Flip-flop undeprecated
    • Method reference operator
    • Keyword argument warning
    • No other keywords syntax
    • Beginless range
    • Pattern matching
    • Numbered parameters
    • Rightward assignment
    • Argument forwarding
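
Several of the Ruby 2.7.0 additions listed above can be shown in a few lines. This is a hedged sketch assuming Ruby 2.7 or later, where pattern matching may print an experimental warning; `area`, `forwarded_area`, and `release` are hypothetical names used only here. The method reference operator and rightward assignment are left out because their syntax did not stay stable after this point.

```ruby
# Numbered parameters: _1 refers to the first block argument.
doubled = [1, 2, 3].map { _1 * 2 }

# Beginless range: everything up to and including 5.
in_range = (..5).cover?(3)

# Argument forwarding: (...) passes all arguments through unchanged.
def area(width, height:)
  width * height
end

def forwarded_area(...)
  area(...)
end

# Pattern matching: destructure a hash and bind variables in one step.
release = { name: "ruby", version: [2, 7] }
summary =
  case release
  in { name: String => name, version: [major, minor] }
    "#{name} #{major}.#{minor}"
  end

p doubled
p in_range
p forwarded_area(3, height: 4)
p summary
```

Each of these is new grammar rather than a new method, so every parser listed in this talk had to change to accept it.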


  48. Ruby 3.x


  49. Ruby 3.0.0 2020-12-25
    • Keyword arguments

    • Single-line “endless” methods

    • “Find pattern” pattern matching

    • shareable_constant_value pragma

    • in keyword pattern matching
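
Three of the Ruby 3.0.0 syntax additions can be illustrated briefly. This is a sketch assuming Ruby 3.0 or later, where the find pattern and one-line `in` may print experimental warnings on early 3.x releases; the method and variable names are invented for the example.

```ruby
# Single-line "endless" method definition.
def square(x) = x * x

# Find pattern: match an element anywhere inside an array.
first_string =
  case [1, :two, "three", 4.0]
  in [*, String => found, *]
    found
  end

# `in` as a boolean test: true when the value matches the pattern.
# The parentheses matter, because assignment binds tighter than `in`.
matches = ({ status: "ok" } in { status: String })

p square(4)
p first_string
p matches
```

The parentheses around the `in` expression show how a small syntax addition interacts with existing precedence rules.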


  50. Ruby 3.1.??
    • Anonymous struct syntax


  51. Lessons


  52. Lessons
    • Very difficult to keep up with core

    • Small changes in syntax can have wide-reaching effects

    • So much subtle logic in parse.y, very difficult to extend


  53. Options


  54. parser
    • Tons of community adoption

    • Well-documented

    • Not 100% compatible, stuff may break

    • Doesn’t ship with/test with core


  55. RubyVM::AbstractSyntaxTree
    • Still too early to tell

    • Not implemented anywhere else
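
For a sense of what this option looks like in practice, here is a minimal sketch. It assumes CRuby 2.6 or later, since the slide notes that no other implementation provides the API. Node type names are an implementation detail and may differ between releases.

```ruby
# RubyVM::AbstractSyntaxTree is specific to CRuby.
root = RubyVM::AbstractSyntaxTree.parse("1 + 2")

# The root is a SCOPE node; its last child is the body of the program.
body = root.children.last

p root.type         # :SCOPE
p body.type         # :OPCALL
p body.children[1]  # :+

# Every node carries its source location.
location = [body.first_lineno, body.first_column, body.last_lineno, body.last_column]
p location
```

Unlike ripper, the nodes returned here are the interpreter's own internal node types, which is part of why their shape can shift from one release to the next.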


  56. ripper
    • Built into the parser generator

    • Well-tested in core

    • Ships with Ruby

    • No documentation
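
Since documentation is the weak point, a short example of the two most common entry points may help. This is a sketch assuming a CRuby with `ripper` in its standard library; the exact nesting of the returned arrays is an implementation detail that can change between releases.

```ruby
require "ripper"

source = "1 + 2"

# Ripper.sexp returns nested arrays of parser events, or nil on a syntax error.
tree = Ripper.sexp(source)
expression = tree[1][0]
p tree[0]        # :program
p expression[0]  # :binary

# Ripper.lex returns one entry per token: position, event name, text, lexer state.
token_types = Ripper.lex(source).map { |(_position, type, _text, _state)| type }
p token_types

# Invalid source yields nil instead of raising.
broken = Ripper.sexp("1 +")
p broken
```

`Ripper.sexp` is convenient for exploring how a piece of syntax is represented, while subclassing `Ripper` and overriding event handlers gives full control over the tree that gets built.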


  57. Language servers


  58. Language servers
    • castwide/solargraph (parser or RubyVM::AST)
    • mtsmfm/language_server-ruby (ripper)

    • rubyide/vscode-ruby (tree-sitter)
    • tomoasleep/yoda (parser)


  59. Parsing Ruby
    Kevin Newton

    https://kddnewton.com/parsing-ruby
