Prism

Kevin Newton
November 09, 2023

This deck presented the prism parser at Ruby World 2023.

Transcript

  5. Overview

    Prism is a new Ruby parser built from scratch with the following goals:
    • Compatibility - must match CRuby semantics exactly in order to merge
    • Maintainability - must be as maintainable as possible
    • Error tolerance - must be able to recover from errors and continue parsing
    • Portability - must be able to be run from any Ruby platform
    • Performance - must be as fast and memory efficient as possible
  11. Timeline

    Axis: 2022/01/01 to 2024/01/01, with ticks at 2022/07/01, 2023/01/01, and 2023/07/01. Events as listed on the slide:
    • Prototype first commit
    • First call with CRuby team
    • YARP first commit
    • TruffleRuby first commit
    • JRuby first commit
    • CRuby first commit
    • Syntax Tree first commit
    • Parser first commit
    • Rename to prism
    👶
  12. Design: Approach

    • Comparative analysis with existing trees
    • "Semantic" versus "syntactic" trees
    • Provide for all consumers
  14. Design: Implementations

    • Make compilation as simple as possible
    • Never have to check child to compile current node
    • Cover as much common functionality as possible
  16. Design: Tooling

    • Named fields and known types
    • Nodes with enough information to round-trip
    • Location information for as much as possible
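
    The tooling goals above (named fields, round-tripping, location information) can be illustrated with a minimal sketch. The `Location` and `CallNode` types below are hypothetical and invented for this example; they are not Prism's actual classes, and the offsets are written by hand rather than produced by a parser. The idea is only that when every token of a node has a recorded location, the original text can be rebuilt from the tree.

```ruby
# Hypothetical types for illustration only; these are not Prism's actual classes.
Location = Struct.new(:start_offset, :length) do
  # Return the exact source bytes this location covers.
  def slice(source)
    source.byteslice(start_offset, length)
  end
end

# Named fields: each piece of the call has its own field with a known type.
CallNode = Struct.new(:receiver_loc, :operator_loc, :name_loc, :location) do
  # Rebuild the original text from the child locations alone.
  def round_trip(source)
    [receiver_loc, operator_loc, name_loc].map { |loc| loc.slice(source) }.join
  end
end

SOURCE = "foo.bar"

# Offsets are hand-written here; a real parser would fill them in.
NODE = CallNode.new(
  Location.new(0, 3), # foo
  Location.new(3, 1), # .
  Location.new(4, 3), # bar
  Location.new(0, 7)  # whole expression
)
```

    With this shape, `NODE.name_loc.slice(SOURCE)` yields the method name and `NODE.round_trip(SOURCE)` reproduces the source text, which is the property formatters and linters tend to rely on.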
  18. Design: Maintenance

    • Template as much as possible
    • Provide as much documentation as possible
    • Build up as big of a test suite as possible
  21. Challenges

    • Escaping - required to properly find string-like terminators
    • Regular expressions - required because of named capture groups
    • Encoding support - required to properly find string-like terminators
    • Heredocs - very complicated semantics, can be almost anywhere in source
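
    Two of the challenges above are easier to see in plain Ruby. The sketch below is not taken from the deck; the method names `split_date` and `two_heredocs` are made up for illustration. It shows that a regexp literal with named capture groups on the left of `=~` defines local variables, so a parser has to look inside the regexp to know which locals exist, and that several heredocs can be opened in the middle of one expression with their bodies following on later lines.

```ruby
# Named capture groups: when a regexp literal is on the left-hand side of =~,
# Ruby assigns each named group to a local variable of the same name.
def split_date(text)
  if /(?<year>\d{4})-(?<month>\d{2})/ =~ text
    [year, month] # locals introduced by the regexp itself
  end
end

# Heredocs: both openers sit inside the array literal, and the bodies
# follow on the next lines in the order the openers appeared.
def two_heredocs
  [<<~ONE, <<~TWO]
    first body
  ONE
    second body
  TWO
end
```

    Here `split_date("2023-11")` returns the two captured strings, and `two_heredocs` returns one string per heredoc body.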
  25. Testing

    • Lexer testing - testing against ripper lex
    • Snapshot testing - protect from regressions, check parser structure
    • Brute-force testing - all escape sequences in all contexts
    • Fuzzing - random input, check for crashes
    • Grammar-based fuzzing - random input, check for crashes and parse errors
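
    As a rough illustration of the lexer-testing idea, the sketch below compares a candidate token list against the output of `Ripper.lex` from Ruby's standard library. The helpers `reference_tokens` and `first_mismatch` are hypothetical names chosen for this example, and the comparison is deliberately simplified to token type and text; it is not a description of how prism's own test suite is written.

```ruby
require "ripper"

# Reduce Ripper's token tuples ([[line, column], type, text, state])
# to simple [type, text] pairs that are easy to compare.
def reference_tokens(source)
  Ripper.lex(source).map { |(_position, type, text, _state)| [type, text] }
end

# Hypothetical helper: return details of the first token where the candidate
# lexer's output differs from Ripper's, or nil when the two lists agree.
def first_mismatch(source, candidate)
  expected = reference_tokens(source)
  longest = [expected.length, candidate.length].max
  index = (0...longest).find { |i| expected[i] != candidate[i] }
  index && { index: index, expected: expected[index], actual: candidate[index] }
end
```

    A candidate lexer would be run over a corpus of Ruby files, and any non-nil result from `first_mismatch` points at the first token where it disagrees with the reference.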
  26. Integrations

    • C - CRuby, Sorbet, Artichoke, Natalie, C/C++/Rust/Zig tooling
    • Ruby - Natalie, tooling
    • Serialization - JRuby, TruffleRuby, JavaScript tooling
  33. Fractured ecosystem: Ruby parsers

    • ruby/ruby/parse.y
    • ruby/ruby/ext/ripper/parse.y
    • lib-ruby-parser/lib-ruby-parser
    • mruby/mruby
    • typedruby/typedruby, sorbet/sorbet
    • whitequark/parser
    • seattlerb/ruby_parser
    • jruby/jruby
    • oracle/truffleruby
    • tree-sitter/tree-sitter-ruby
    • sisshiki1969/ruruby
    • natalie-lang/natalie
    • ruby/prism
  45. Fractured ecosystem: Ruby syntax trees

    • NODE, RubyVM::AbstractSyntaxTree
    • Ripper::SexpBuilder{,PP}
    • ruby-syntax-tree/syntax_tree
    • lib-ruby-parser/lib-ruby-parser
    • mruby/mruby
    • typedruby/typedruby, sorbet/sorbet
    • whitequark/parser, rubocop/rubocop-ast
    • seattlerb/sexp_processor
    • org.jruby.ast
    • Truffle
    • tree-sitter/tree-sitter
    • sisshiki1969/ruruby
    • natalie-lang/natalie
    • ruby/prism
  55. Fractured ecosystem: Ruby syntax trees

    • lib-ruby-parser/lib-ruby-parser
    • mruby/mruby
    • tree-sitter/tree-sitter
    • sisshiki1969/ruruby
    • ruby/prism
  59. Separate understandings

    • Every tool represents Ruby differently
    • Every tool requires contributors to relearn
    • None of this knowledge translates to contributing to CRuby
  61. Maintenance burden

    • Every parser has to update every time new syntax is introduced
    • Very minor changes have very major consequences
    • Helping Ruby can inadvertently hurt Ruby tooling
  66. The future of prism

    • Improved error tolerance and messaging
    • Performance enhancements (SIMD instructions, arena allocation, etc.)
    • Multi-version support
    • Further integrations into the ecosystem
    • Improved Ruby library for external tools
    • Contributor community around a single tool