Rattler - Jason Arhart

Las Vegas Ruby Group
August 01, 2012

Transcript

  1. Rattler

    Ruby Tool for Language Recognition
    Jason Arhart
    Thursday, August 1, 13
  2. What is Rattler?

    Rattler is a parser generator for Ruby
  3. What is Rattler?

    Rattler is a parser generator for Ruby
    WTF is a parser generator?
  4. Parsing

    • Informal term for “syntactic analysis”
    • Analyzing a “sentence” in terms of grammatical constituents
    • Recognizing implicit structure in a linear sequence of “words”
  5. You’re parsing right now

  6. You’re parsing right now

    “Rattler is a parser generator.”
  7. Syntax

    • A syntax is a set of rules that govern the sentence structure of a language
    • Syntax refers only to the structure of a sentence, not the meaning, or “semantics”
    • A formal description of a language’s syntax is called a “grammar”
  8. Grammar

    • A grammar is a set of rules that define the syntax of a language
    • A formal grammar can be generative or analytic
    • Generative grammars define how to form valid sentences
    • Analytic grammars define how to recognize valid sentences
  9. Arithmetic Grammar

    expression -> number
    expression -> “(” expression “)”
    expression -> expression “+” expression
    expression -> expression “-” expression
    expression -> expression “*” expression
    expression -> expression “/” expression
    number -> digit
    number -> digit number
    digit -> “0”
    digit -> “1”
    ...
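Read as a generative grammar, the rules above allow more than one derivation for the same sentence. The following is a minimal Ruby sketch of that point, not something shown in the talk: the helper names `parse_trees` and `evaluate` are hypothetical, and parentheses and multi-digit numbers are left out to keep it short. It enumerates every tree the `expression -> expression OP expression` rules permit for a flat token list.

```ruby
# All parse trees the slide's grammar allows for a flat token list such as
# [1, "-", 2, "-", 3]. Hypothetical helper, written only for illustration.
def parse_trees(tokens)
  # expression -> number
  return [tokens.first] if tokens.length == 1

  trees = []
  # expression -> expression OP expression: try every operator as the root.
  (1...tokens.length).step(2) do |i|
    op = tokens[i]
    parse_trees(tokens[0...i]).each do |left|
      parse_trees(tokens[(i + 1)..]).each do |right|
        trees << [op, left, right]
      end
    end
  end
  trees
end

# Evaluate a tree of the form [operator, left, right] or a bare number.
def evaluate(tree)
  return tree unless tree.is_a?(Array)

  op, left, right = tree
  evaluate(left).public_send(op, evaluate(right))
end

trees = parse_trees([1, "-", 2, "-", 3])
trees.each { |tree| puts "#{tree.inspect} => #{evaluate(tree)}" }
```

For `1 - 2 - 3` the sketch finds two trees, one grouping as `1 - (2 - 3)` and one as `(1 - 2) - 3`, which evaluate to 2 and -4. The grammar alone does not say which one is intended.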
  10. Parser

    • A parser is a program that recognizes grammatical structure
    • Turns a linear sequence of characters into an explicit structure
    • Output is typically a tree structure
    • Usually a component of a compiler or interpreter (or parser generator)
  11. Parser Generator

    • A parser generator is a program that generates a parser
    • The input is a formal description of the language to parse
    • That formal description is a grammar
    • The parser generator starts by parsing the grammar
  12. Ruby Parser Generators

    • Racc - LALR(1)
    • Treetop - PEG (packrat)
    • Citrus - PEG (packrat)
    • ANTLR for Ruby - LL
    • Ragel - finite state machine
  13. Why Rattler?

    • Racc is LALR, which is a PITA
    • Treetop and Citrus are pretty basic and don’t handle left-recursion
    • ANTLR for Ruby is slow and doesn’t handle left-recursion
    • Ragel is not really designed for syntactic analysis
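Left recursion is a problem for top-down parsers in general. As a rough sketch (hand-written Ruby using the standard `strscan` library, not code from any of the tools listed; the method names are hypothetical), a rule such as `expr <- expr "-" number / number` transcribed literally calls itself before consuming any input. The usual workaround rewrites the rule as a loop and folds the results to the left.

```ruby
require "strscan"

# expr <- expr "-" number / number, transcribed literally. The method calls
# itself before consuming any input, so it can never make progress.
def naive_expr(scanner)
  left = naive_expr(scanner) # recurses at the same position until the stack overflows
  return left unless scanner.scan(/-/)

  [:-, left, scanner.scan(/\d+/).to_i]
end

# Common workaround: expr <- number ("-" number)*, folded to the left so that
# "1-2-3" still groups as (1-2)-3.
def iterative_expr(scanner)
  first = scanner.scan(/\d+/)
  return nil unless first

  tree = first.to_i
  while scanner.scan(/-/)
    right = scanner.scan(/\d+/)
    return nil unless right

    tree = [:-, tree, right.to_i]
  end
  tree
end

begin
  naive_expr(StringScanner.new("1-2-3"))
rescue SystemStackError
  puts "naive_expr: stack level too deep"
end

p iterative_expr(StringScanner.new("1-2-3"))
```

The rewrite works, but the grammar no longer reads the way the language is naturally described, which is the inconvenience the slide is pointing at.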
  14. Why Rattler?

    Parsing has traditionally been an unnecessarily dense, difficult subject.
  15. Why Rattler?

    • LR parser generators are a PITA
    • Require a separate token grammar
    • Difficult to debug
    • Difficult to produce useful error messages
    • Generative grammars tend to be ambiguous
  16. Dangling “else”

    “if A then if B then C else D”
    can be read as “if A then (if B then C else D)”
    or as “if A then (if B then C) else D”
  17. Parsing Expression Grammars

    • analytic grammars (vs. generative)
    • the choice operator is ordered
    • express a recursive descent parsing algorithm explicitly
    • usually used for packrat parsing
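Packrat parsing, named in the last bullet, memoizes the result of every rule at every input position, so backtracking never re-parses the same span. The sketch below is a hand-written illustration in Ruby under that definition; the class `PackratSum` and its tiny grammar are hypothetical, and it is not how Rattler, Treetop, or Citrus implement memoization.

```ruby
require "strscan"

# Recursive descent with packrat-style memoization for the toy grammar
#   sum    <- number "+" sum / number
#   number <- [0-9]+
class PackratSum
  attr_reader :calls

  def initialize(source)
    @s = StringScanner.new(source)
    @memo = {}           # [rule, start position] => [result, end position]
    @calls = Hash.new(0) # how often each rule body actually ran
  end

  def parse
    tree = apply(:sum)
    tree if @s.eos?
  end

  private

  # Run a rule at the current position, or replay its memoized outcome.
  def apply(rule)
    key = [rule, @s.pos]
    unless @memo.key?(key)
      @calls[rule] += 1
      start = @s.pos
      result = send(rule)
      @s.pos = start unless result
      @memo[key] = [result, @s.pos]
    end
    result, end_pos = @memo[key]
    @s.pos = end_pos
    result
  end

  def sum
    start = @s.pos
    left = apply(:number)
    if left && @s.scan(/\+/) && (right = apply(:sum))
      return [:+, left, right]
    end

    @s.pos = start  # first alternative failed: rewind, try the second
    apply(:number)  # served from the memo table, not re-scanned
  end

  def number
    @s.scan(/\d+/)&.to_i
  end
end

parser = PackratSum.new("1+2")
p parser.parse
p parser.calls
```

On `"1+2"` the `number` rule body runs twice rather than three times, because the fallback alternative at position 2 is answered from the memo table.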
  18. Dangling “else” solved

    if_expr <- “if” expr “then” expr “else” expr
             / “if” expr “then” expr

    This PEG rule parses if-then-else expressions correctly and unambiguously
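To see why ordered choice removes the ambiguity, the rule above can be transcribed by hand into a backtracking recursive descent parser. This is only a sketch: the class `IfParser`, the extra rule `expr <- if_expr / identifier`, and the tree shape are assumptions made for illustration, and the code uses Ruby's standard `strscan` library rather than Rattler.

```ruby
require "strscan"

class IfParser
  def initialize(source)
    @s = StringScanner.new(source)
  end

  def parse
    tree = expr
    skip_ws
    tree if @s.eos?
  end

  private

  # expr <- if_expr / identifier   (assumed for this sketch)
  def expr
    if_expr || identifier
  end

  # if_expr <- "if" expr "then" expr "else" expr
  #          / "if" expr "then" expr
  # The alternatives are tried in order; the first one that matches wins.
  def if_expr
    attempt { if_then_else } || attempt { if_then }
  end

  def if_then_else
    keyword("if") && (c = expr) && keyword("then") && (t = expr) &&
      keyword("else") && (e = expr) && [:if, c, t, e]
  end

  def if_then
    keyword("if") && (c = expr) && keyword("then") && (t = expr) && [:if, c, t]
  end

  # Run one alternative; on failure rewind so the next one starts fresh.
  def attempt
    start = @s.pos
    result = yield
    @s.pos = start unless result
    result
  end

  def keyword(word)
    skip_ws
    @s.scan(/#{word}\b/)
  end

  def identifier
    skip_ws
    @s.scan(/(?!(?:if|then|else)\b)[A-Za-z]\w*/)
  end

  def skip_ws
    @s.skip(/\s*/)
  end
end

p IfParser.new("if A then if B then C else D").parse
```

Because the inner `if` tries the `else` alternative first and succeeds, the `else` always attaches to the nearest `if`; the outer `if` then falls back to its second alternative. There is exactly one result for any input.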
  19. Why Rattler?

    • PEG-based parser generators have their own disadvantages
    • Can’t handle left-recursive rules
    • Common parsing problems are hard
    • handling whitespace
    • matching tokens
    • delimited lists
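As a rough illustration of why whitespace and delimited lists count as chores, here is a hand-written Ruby sketch that parses a bracketed, comma-separated list of integers. The function `parse_int_list` is hypothetical and uses only the standard `strscan` library; it is not Rattler syntax, which the transcript does not show.

```ruby
require "strscan"

# Parse "[ 1 , 2,3 ]" into [1, 2, 3]; return nil on malformed input.
# Note how whitespace skipping has to be repeated around every token.
def parse_int_list(source)
  s = StringScanner.new(source)
  ws = /\s*/

  s.skip(ws)
  return nil unless s.scan(/\[/)

  items = []
  s.skip(ws)
  unless s.check(/\]/)
    loop do
      s.skip(ws)
      item = s.scan(/-?\d+/)
      return nil unless item

      items << item.to_i
      s.skip(ws)
      break unless s.scan(/,/) # a comma means another item must follow
    end
  end

  s.skip(ws)
  return nil unless s.scan(/\]/)

  s.skip(ws)
  s.eos? ? items : nil
end

p parse_int_list("[ 1 , 2,3 ]")
```

Most of the function is bookkeeping for spaces and separators rather than anything specific to the list's contents, which is the kind of repetition a grammar-level feature can absorb.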
  20. Why Rattler?

    The Bottom Line:
    Existing tools are too hard
    Parsing in Ruby should be easy
  21. Rattler

    Parsing for Ruby that’s so easy it feels like cheating
  22. Rattler

    • PEG-based, but adds many convenient grammar features
    • DRY whitespace handling
    • simplified keyword matching
    • delimited lists
    • back references
  23. Rattler

    • Supports left-recursive grammars
    • Useful error messages
    • Generates efficient pure-Ruby parsers
    • RSpec matchers for testing parsers
    • Outputs parse trees using GraphViz
  24. Ok, enough slides!

    Let’s see Rattler in action!
  25. Philosophy of Rattler

    • Parsing in Ruby should be easy and maybe even fun!
    • Experimenting should be encouraged
    • Common parsing problems should be easy to solve
    • Write expressive grammars, let Rattler optimize the parser
  26. Future

    • Separate parser generator & runtime
    • Multiple targets
    • Operator precedence parsing
    • Lazy semantic actions
    • Compiler back-end
    • More optimizations
  27. Rattler

    Ruby Tool for Language Recognition
    https://github.com/jarhart/rattler
    https://www.relishapp.com/jarhart/rattler