Slide 1

Slide 1 text

Parsing Ruby Kevin Newton

Slide 2

Slide 2 text

Kevin Newton

Slide 3

Slide 3 text

Kevin Newton

Slide 4

Slide 4 text

Kevin Newton

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

• Build a grammar for a simple language

Slide 7

Slide 7 text

• Build a grammar for a simple language
• Build a parser for a simple language

Slide 8

Slide 8 text

• Build a grammar for a simple language
• Build a parser for a simple language
• Look at the history of the Ruby parser

Slide 9

Slide 9 text

• Build a grammar for a simple language
• Build a parser for a simple language
• Look at the history of the Ruby parser
• Look at how Ripper works

Slide 10

Slide 10 text

Building a grammar

Slide 11

Slide 11 text

twitter.com/kddnewton Parsing Ruby

Slide 12

Slide 12 text

twitter.com/kddnewton Parsing Ruby program: | number number: | NUMBER

Slide 13

Slide 13 text

twitter.com/kddnewton Parsing Ruby program: | number number: | NUMBER 1

Slide 14

Slide 14 text

twitter.com/kddnewton Parsing Ruby program: | number number: | NUMBER 2

Slide 15

Slide 15 text

twitter.com/kddnewton Parsing Ruby program: | number number: | NUMBER 7

Slide 16

Slide 16 text

twitter.com/kddnewton Parsing Ruby program: | number number: | NUMBER

Slide 17

Slide 17 text

twitter.com/kddnewton Parsing Ruby program: | expression expression: | number "+" number | number number: | NUMBER

Slide 18

Slide 18 text

twitter.com/kddnewton Parsing Ruby 1 program: | expression expression: | number "+" number | number number: | NUMBER

Slide 19

Slide 19 text

twitter.com/kddnewton Parsing Ruby 1 + 2 program: | expression expression: | number "+" number | number number: | NUMBER

Slide 20

Slide 20 text

twitter.com/kddnewton Parsing Ruby program: | expression expression: | number "+" number | number number: | NUMBER

Slide 21

Slide 21 text

twitter.com/kddnewton Parsing Ruby program: | expression expression: | number "+" number | number "-" number | number number: | NUMBER

Slide 22

Slide 22 text

twitter.com/kddnewton Parsing Ruby program: | expression expression: | expression "+" number | expression "-" number | number number: | NUMBER

Slide 23

Slide 23 text

twitter.com/kddnewton Parsing Ruby 1 program: | expression expression: | expression "+" number | expression "-" number | number number: | NUMBER

Slide 24

Slide 24 text

twitter.com/kddnewton Parsing Ruby 1 + 2 program: | expression expression: | expression "+" number | expression "-" number | number number: | NUMBER

Slide 25

Slide 25 text

twitter.com/kddnewton Parsing Ruby 1 + 2 + 3 program: | expression expression: | expression "+" number | expression "-" number | number number: | NUMBER

Slide 26

Slide 26 text

twitter.com/kddnewton Parsing Ruby 1 + 2 + 3 - 4 + 5 - 6 program: | expression expression: | expression "+" number | expression "-" number | number number: | NUMBER

Slide 27

Slide 27 text

twitter.com/kddnewton Parsing Ruby program: | expression expression: | expression "+" number | expression "-" number | number number: | NUMBER

Slide 28

Slide 28 text

twitter.com/kddnewton Parsing Ruby program: | expression expression: | expression "+" number | expression "-" number | number term: | term "*" number | term "/" number | number number: | NUMBER

Slide 29

Slide 29 text

twitter.com/kddnewton Parsing Ruby program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" number | term "/" number | number number: | NUMBER

Slide 30

Slide 30 text

twitter.com/kddnewton Parsing Ruby 1 program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" number | term "/" number | number number: | NUMBER

Slide 31

Slide 31 text

twitter.com/kddnewton Parsing Ruby 1 * 2 program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" number | term "/" number | number number: | NUMBER

Slide 32

Slide 32 text

twitter.com/kddnewton Parsing Ruby program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" number | term "/" number | number number: | NUMBER

Slide 33

Slide 33 text

twitter.com/kddnewton Parsing Ruby program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" number | term "/" number | number number: | NUMBER | "(" expression ")"

Slide 34

Slide 34 text

program:
  | expression
expression:
  | expression "+" term
  | expression "-" term
  | term
term:
  | term "*" factor
  | term "/" factor
  | factor
factor:
  | NUMBER
  | "(" expression ")"

Slide 35

Slide 35 text

Building a parser

Slide 36

Slide 36 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3

Slide 37

Slide 37 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3

Slide 38

Slide 38 text

1 + (4 - 2) * 3
until input.empty?
  case input
  when /^\s+/
  when /^\d+/
    yield [:NUMBER, $&.to_i]
  when /^[-+*\/()]/
    yield [$&, $&]
  else
    raise "Failed to parse: #{input.first(10)}"
  end
  input = $'
end

Slide 39

Slide 39 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3 until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end

Slide 40

Slide 40 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3 until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end

Slide 41

Slide 41 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3 until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end

Slide 42

Slide 42 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3 until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end

Slide 43

Slide 43 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3 until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end

Slide 44

Slide 44 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3 until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end

Slide 45

Slide 45 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3 until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end

Slide 46

Slide 46 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3 until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end NUMBER

Slide 47

Slide 47 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3 until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end NUMBER “+”

Slide 48

Slide 48 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3 until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end NUMBER “+” “(”

Slide 49

Slide 49 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3 until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end NUMBER “+” “(” NUMBER

Slide 50

Slide 50 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3 until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end NUMBER “+” “(” NUMBER “-”

Slide 51

Slide 51 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3 until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end NUMBER “+” “(” NUMBER “-” NUMBER

Slide 52

Slide 52 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3 until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end NUMBER “+” “(” NUMBER “-” NUMBER “)”

Slide 53

Slide 53 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3 until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end NUMBER “+” “(” NUMBER “-” NUMBER “)” “*”

Slide 54

Slide 54 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3 until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end NUMBER “+” “(” NUMBER “-” NUMBER “)” “*” NUMBER

Slide 55

Slide 55 text

twitter.com/kddnewton Parsing Ruby 1 + (4 - 2) * 3 NUMBER “+” “(” NUMBER “-” NUMBER “)” “*” NUMBER

Slide 56

Slide 56 text

twitter.com/kddnewton Parsing Ruby NUMBER “+” “(” NUMBER “-” NUMBER “)” “*” NUMBER
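The lexer loop from the preceding slides can be tried on its own. The following is a minimal sketch, not the speaker's exact code: the `tokenize` method name is hypothetical, tokens are collected into an array instead of being yielded, the patterns are anchored with `\A` rather than `^` so they only match at the start of the remaining input, and `input[0, 10]` stands in for `input.first(10)`, since `String#first` comes from ActiveSupport rather than core Ruby.

```ruby
# Hypothetical wrapper around the slide's lexer loop.
# Returns an array of [type, value] pairs.
def tokenize(input)
  tokens = []

  until input.empty?
    case input
    when /\A\s+/
      # Whitespace is skipped; no token is produced.
    when /\A\d+/
      tokens << [:NUMBER, $&.to_i]
    when /\A[-+*\/()]/
      # Operators and parentheses use their own text as the token type.
      tokens << [$&, $&]
    else
      raise "Failed to parse: #{input[0, 10]}"
    end

    # $' is the text after the last successful match.
    input = $'
  end

  tokens
end

p tokenize("1 + (4 - 2) * 3").map(&:first)
# prints [:NUMBER, "+", "(", :NUMBER, "-", :NUMBER, ")", "*", :NUMBER]
```

The printed token types match the stream shown on the slide above.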

Slide 57

Slide 57 text

Accepting the input

Slide 58

Slide 58 text

twitter.com/kddnewton Parsing Ruby NUMBER “+” “(” NUMBER “-” NUMBER “)” “*” NUMBER

Slide 59

Slide 59 text

twitter.com/kddnewton Parsing Ruby NUMBER “+” “(” NUMBER “-” NUMBER “)” “*” NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 60

Slide 60 text

twitter.com/kddnewton Parsing Ruby NUMBER “+” “(” NUMBER “-” NUMBER “)” “*” NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 61

Slide 61 text

twitter.com/kddnewton Parsing Ruby “+” “(” NUMBER “-” NUMBER “)” “*” NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" NUMBER

Slide 62

Slide 62 text

twitter.com/kddnewton Parsing Ruby “+” “(” NUMBER “-” NUMBER “)” “*” NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" NUMBER

Slide 63

Slide 63 text

twitter.com/kddnewton Parsing Ruby “+” “(” NUMBER “-” NUMBER “)” “*” NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" factor

Slide 64

Slide 64 text

twitter.com/kddnewton Parsing Ruby “(” NUMBER “-” NUMBER “)” “*” NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" “+” factor

Slide 65

Slide 65 text

twitter.com/kddnewton Parsing Ruby NUMBER “-” NUMBER “)” “*” NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" “(” “+” factor

Slide 66

Slide 66 text

twitter.com/kddnewton Parsing Ruby “-” NUMBER “)” “*” NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" NUMBER “(” “+” factor

Slide 67

Slide 67 text

twitter.com/kddnewton Parsing Ruby “-” NUMBER “)” “*” NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" NUMBER “(” “+” factor

Slide 68

Slide 68 text

twitter.com/kddnewton Parsing Ruby “-” NUMBER “)” “*” NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" factor “(” “+” factor

Slide 69

Slide 69 text

twitter.com/kddnewton Parsing Ruby NUMBER “)” “*” NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" “-” factor “(” “+” factor

Slide 70

Slide 70 text

twitter.com/kddnewton Parsing Ruby “)” “*” NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" NUMBER “-” factor “(” “+” factor

Slide 71

Slide 71 text

twitter.com/kddnewton Parsing Ruby “)” “*” NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" factor “-” factor “(” “+” factor

Slide 72

Slide 72 text

twitter.com/kddnewton Parsing Ruby “)” “*” NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" factor “+” “(” factor “-” factor

Slide 73

Slide 73 text

twitter.com/kddnewton Parsing Ruby “)” “*” NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" factor “-” factor “(” “+” factor

Slide 74

Slide 74 text

twitter.com/kddnewton Parsing Ruby “)” “*” NUMBER program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" factor “-” factor “(” “+” factor

Slide 75

Slide 75 text

twitter.com/kddnewton Parsing Ruby “)” “*” NUMBER term “-” term “(” “+” factor program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 76

Slide 76 text

twitter.com/kddnewton Parsing Ruby “)” “*” NUMBER term “-” term “(” “+” factor program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 77

Slide 77 text

twitter.com/kddnewton Parsing Ruby “)” “*” NUMBER expression “-” term “(” “+” factor program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 78

Slide 78 text

twitter.com/kddnewton Parsing Ruby “)” “*” NUMBER expression “-” term “(” “+” factor program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 79

Slide 79 text

twitter.com/kddnewton Parsing Ruby “)” “*” NUMBER expression “(” “+” factor program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 80

Slide 80 text

twitter.com/kddnewton Parsing Ruby “*” NUMBER “)” expression “(” “+” factor program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 81

Slide 81 text

twitter.com/kddnewton Parsing Ruby “*” NUMBER “)” expression “(” “+” factor program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 82

Slide 82 text

twitter.com/kddnewton Parsing Ruby “*” NUMBER factor “+” factor program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 83

Slide 83 text

twitter.com/kddnewton Parsing Ruby NUMBER “*” factor “+” factor program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 84

Slide 84 text

twitter.com/kddnewton Parsing Ruby NUMBER “*” factor “+” factor program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 85

Slide 85 text

twitter.com/kddnewton Parsing Ruby factor “*” factor “+” factor program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 86

Slide 86 text

twitter.com/kddnewton Parsing Ruby term “*” factor “+” factor program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 87

Slide 87 text

twitter.com/kddnewton Parsing Ruby term “*” factor “+” factor program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 88

Slide 88 text

twitter.com/kddnewton Parsing Ruby term “+” factor program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 89

Slide 89 text

twitter.com/kddnewton Parsing Ruby expression “+” term program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 90

Slide 90 text

twitter.com/kddnewton Parsing Ruby expression “+” term program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 91

Slide 91 text

program
program:
  | expression
expression:
  | expression "+" term
  | expression "-" term
  | term
term:
  | term "*" factor
  | term "/" factor
  | factor
factor:
  | NUMBER
  | "(" expression ")"
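Accepting a token stream can also be checked in a few lines of plain Ruby. The sketch below is an assumption-laden illustration, not what the slides animate: it uses recursive descent instead of the shift/reduce steps shown above, and it rewrites the left-recursive `expression` and `term` rules as loops. The `Recognizer` class name is hypothetical.

```ruby
# Hypothetical recognizer for the slide grammar, written as recursive descent.
# It only answers "does this token stream match?"; it builds no tree.
class Recognizer
  def initialize(tokens)
    @tokens = tokens.dup
  end

  # program: expression, with no tokens left over
  def accept?
    expression && @tokens.empty?
  end

  private

  # expression: term (("+" | "-") term)*
  def expression
    return false unless term

    while ["+", "-"].include?(@tokens.first)
      @tokens.shift
      return false unless term
    end

    true
  end

  # term: factor (("*" | "/") factor)*
  def term
    return false unless factor

    while ["*", "/"].include?(@tokens.first)
      @tokens.shift
      return false unless factor
    end

    true
  end

  # factor: NUMBER | "(" expression ")"
  def factor
    case @tokens.shift
    when :NUMBER
      true
    when "("
      expression && @tokens.shift == ")"
    else
      false
    end
  end
end

tokens = [:NUMBER, "+", "(", :NUMBER, "-", :NUMBER, ")", "*", :NUMBER]
p Recognizer.new(tokens).accept?          # prints true
p Recognizer.new([:NUMBER, "+"]).accept?  # prints false
```

A generated LR parser reaches the same accept/reject answer by shifting tokens onto a stack and reducing them by rule, which is why the left-recursive grammar can be used there unchanged.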

Slide 92

Slide 92 text

Parser generators

Slide 93

Slide 93 text

twitter.com/kddnewton Parsing Ruby program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")"

Slide 94

Slide 94 text

twitter.com/kddnewton Parsing Ruby program: | expression expression: | expression "+" term | expression "-" term | term term: | term "*" factor | term "/" factor | factor factor: | NUMBER | "(" expression ")" class Parser prechigh left "*" "/" left "+" "-" preclow options no_result_var rule program : expression expression : expression "+" expression | expression "-" expression | expression "*" expression | expression "/" expression | "(" expression ")" | NUMBER end

Slide 95

Slide 95 text

twitter.com/kddnewton Parsing Ruby class Parser prechigh left "*" "/" left "+" "-" preclow options no_result_var rule program : expression expression : expression "+" expression | expression "-" expression | expression "*" expression | expression "/" expression | "(" expression ")" | NUMBER end

Slide 96

Slide 96 text

twitter.com/kddnewton Parsing Ruby class Parser prechigh left "*" "/" left "+" "-" preclow options no_result_var rule program : expression expression : expression "+" expression { val[0] + val[2] } | expression "-" expression { val[0] - val[2] } | expression "*" expression { val[0] * val[2] } | expression "/" expression { val[0] / val[2] } | "(" expression ")" { val[1] } | NUMBER end

Slide 97

Slide 97 text

class Parser
  prechigh
    left "*" "/"
    left "+" "-"
  preclow
  options no_result_var
rule
  program
    : expression
  expression
    : expression "+" expression { [:add, val[0], val[2]] }
    | expression "-" expression { [:sub, val[0], val[2]] }
    | expression "*" expression { [:mul, val[0], val[2]] }
    | expression "/" expression { [:div, val[0], val[2]] }
    | "(" expression ")" { [:prn, val[1]] }
    | NUMBER
end

Slide 98

Slide 98 text

twitter.com/kddnewton Parsing Ruby $

Slide 99

Slide 99 text

twitter.com/kddnewton Parsing Ruby $ racc parser.y -o parser.rb $

Slide 100

Slide 100 text

twitter.com/kddnewton Parsing Ruby $ racc parser.y -o parser.rb $ irb irb(main):001:0>

Slide 101

Slide 101 text

twitter.com/kddnewton Parsing Ruby $ racc parser.y -o parser.rb $ irb irb(main):001:0> require_relative "parser" => true irb(main):002:0>

Slide 102

Slide 102 text

$ racc parser.y -o parser.rb
$ irb
irb(main):001:0> require_relative "parser"
=> true
irb(main):002:0> Parser.new.parse("1 + (4 - 2) * 3")
=> [:add, 1, [:mul, [:prn, [:sub, 4, 2]], 3]]
irb(main):003:0>

Slide 103

Slide 103 text

twitter.com/kddnewton Parsing Ruby [:add, 1, [:mul, [:prn, [:sub, 4, 2]], 3]]

Slide 104

Slide 104 text

add
  1
  mul
    prn
      sub
        4
        2
    3

Slide 105

Slide 105 text

1 + (4 - 2) * 3
add
  1
  mul
    prn
      sub
        4
        2
    3
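Once the parser returns a tree of nested arrays, later stages only need to walk it. As a rough sketch, here is a small evaluator for that tree shape; the `evaluate` method name is hypothetical and only the five node types from the slides (`:add`, `:sub`, `:mul`, `:div`, `:prn`) are handled.

```ruby
# Hypothetical evaluator for the array-based tree produced by the grammar actions.
def evaluate(node)
  # Leaves are plain integers produced by NUMBER tokens.
  return node if node.is_a?(Integer)

  type, *children = node
  values = children.map { |child| evaluate(child) }

  case type
  when :add then values[0] + values[1]
  when :sub then values[0] - values[1]
  when :mul then values[0] * values[1]
  when :div then values[0] / values[1]
  when :prn then values[0]
  else raise ArgumentError, "Unknown node type: #{type.inspect}"
  end
end

tree = [:add, 1, [:mul, [:prn, [:sub, 4, 2]], 3]]
p evaluate(tree) # prints 7
```

Because precedence was already resolved while building the tree, the evaluator never needs to know that `*` binds tighter than `+`.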

Slide 106

Slide 106 text

twitter.com/kddnewton Parsing Ruby class Parser prechigh left "*" "/" left "+" "-" preclow options no_result_var rule program : expression expression : expression "+" expression { [:add, val[0], val[2]] } | expression "-" expression { [:sub, val[0], val[2]] } | expression "*" expression { [:mul, val[0], val[2]] } | expression "/" expression { [:div, val[0], val[2]] } | "(" expression ")" { [:prn, val[1]] } | NUMBER end

Slide 107

Slide 107 text

History of the Ruby parser

Slide 108

Slide 108 text

The Early Days

Slide 109

Slide 109 text

Ruby 0.06 1994-01-07
Fri Jan 7 15:23:20 1994 Yukihiro Matsumoto (matz at nws119)
* baseline - version 0.06.

Slide 110

Slide 110 text

Ruby 0.76 1995-05-19
Thu Jul 14 11:18:07 1994 Yukihiro Matsumoto (matz@ix-02)
* parse.y: Added syntax for creating a Dict, written as {..}.
* parse.y: Changed the syntax for creating arrays to [..]. This breaks compatibility with existing Ruby scripts, but since syntax for creating a Dict is being introduced, I decided that now is the only time to make this change, in line with (and mindful of) perl5. *BACKWARD INCOMPATIBILITY*

Slide 111

Slide 111 text

Ruby 0.95 1995-12-21
Thu Nov 9 23:26:01 1995 Yukihiro Matsumoto
* parse.y (f_arglist): Method definition arguments no longer need to be wrapped in parentheses.
Mon Aug 7 12:47:41 1995 Yukihiro Matsumoto
* parse.y: resque -> rescue. Embarrassing, but I can't leave a typo in place. Why didn't I notice it until now….

Slide 112

Slide 112 text

Ruby 1.x

Slide 113

Slide 113 text

Ruby 1.0.961225 1996-12-25
Wed May 22 19:48:42 1996 Yukihiro Matsumoto
* parse.y (superclass): Changed the superclass specifier from `:' to `<'.
Wed Mar 27 10:02:44 1996 Yukihiro Matsumoto
* parse.y: Changed reserved word: continue -> next

Slide 114

Slide 114 text

Ruby 1.0.971225 1997-12-25
Mon Apr 7 11:36:16 1997 Yukihiro Matsumoto
* parse.y (primary): syntax to access singleton class.
Thu Apr 3 02:12:31 1997 Yukihiro Matsumoto
* parse.y (parse_regx): new option //[nes] to specify character code for regexp literals. Last specified code option is valid.

Slide 115

Slide 115 text

Ruby 1.3.0 1998-12-24 • begin..rescue..else..end clauses
 • <<- indentable heredocs
 • :: method calls

Slide 116

Slide 116 text

Ruby 1.2.0 1998-12-25 • heredocs
 • =begin to =end
 • true and false
 • BEGIN and END
 • %w
 • Top-level constant access
 • ||= and &&=

Slide 117

Slide 117 text

Ruby 1.4.0 1999-08-13 • binary number literals
 • anonymous * in method definitions
 • nested string interpolation
 • multibyte character identifiers

Slide 118

Slide 118 text

Ruby 1.5.0 1999-12-07 • Compile-time string concatenation

Slide 119

Slide 119 text

Ruby 1.6.0 2000-09-19 • rescue modifier

Slide 120

Slide 120 text

nodeDump 0.1.0 2000-10-01 • C extension
 • Human-readable format

Slide 121

Slide 121 text

Ruby 1.7.1 2001-06-01 • break and next now accept values
 • %w can escape spaces
 • rescue in singleton method bodies

Slide 122

Slide 122 text

JRuby 2001-09-10 • Java port of Ruby 1.6
 • Rewrite actions in parse.y into Java
 • Rewrite standard library in Ruby

Slide 123

Slide 123 text

twitter.com/kddnewton Parsing Ruby class Parser prechigh left "*" "/" left "+" "-" preclow options no_result_var rule program : expression expression : expression "+" expression { [:add, val[0], val[2]] } | expression "-" expression { [:sub, val[0], val[2]] } | expression "*" expression { [:mul, val[0], val[2]] } | expression "/" expression { [:div, val[0], val[2]] } | "(" expression ")" { [:prn, val[1]] } | NUMBER end

Slide 124

Slide 124 text

twitter.com/kddnewton Parsing Ruby class Parser prechigh left "*" "/" left "+" "-" preclow options no_result_var rule program : expression expression : expression "+" expression { [:add, val[0], val[2]] } | expression "-" expression { [:sub, val[0], val[2]] } | expression "*" expression { [:mul, val[0], val[2]] } | expression "/" expression { [:div, val[0], val[2]] } | "(" expression ")" { [:prn, val[1]] } | NUMBER end

Slide 125

Slide 125 text

ripper 0.0.1 2001-10-20 • Rewrite parse.y to dispatch parser events
 • “Ripper is still early-alpha version”

Slide 126

Slide 126 text

Ruby 1.8.0 2003-08-04 • %W word lists
 • Dynamic symbols
 • Nested constant assignment

Slide 127

Slide 127 text

ParseTree 1.0.0 2004-11-10 • C extension
 • s-expressions from NODE structs

Slide 128

Slide 128 text

Rubinius 2006-07-12 • sydney
 • Rewrote parse.y in Ruby
 • Rewrote standard library in Ruby

Slide 129

Slide 129 text

Cardinal 2006-07-16 • Parrot VM
 • From scratch PGE grammar from Ruby EBNF

Slide 130

Slide 130 text

IronRuby 2007-04-30 • Microsoft .NET port of Ruby
 • Rewrote parse.y, reused parts

Slide 131

Slide 131 text

ruby_parser 1.0.0 2007-11-14 • Rewrite parse.y in Ruby, use racc
 • dawnscanner, debride, fasterer, flay, flog, railroader, roodi

Slide 132

Slide 132 text

YARV

Slide 133

Slide 133 text

Ruby 1.9.0 2007-12-25 • Bison
 • YARV
 • Ripper merged
 • Lambda literals
 • Symbol hash keys

Slide 134

Slide 134 text

Ruby 1.9.1 2009-01-30 • encoding pragma
 • call shorthand
 • positional arguments after splat

Slide 135

Slide 135 text

Ruby Intermediate Language 2009-10-26 • Intermediate representation for powering semantic analysis
 • Used to implement type systems in OCaml
 • druby, rtc, rubydust

Slide 136

Slide 136 text

Ruby 1.9.3 2011-10-31 • JIS X 3017
 • ISO/IEC 30170:2012

Slide 137

Slide 137 text

Ruby 2.x

Slide 138

Slide 138 text

Ruby 2.0.0 2013-02-24 • Refinements
 • %i symbol lists
 • Keyword arguments

Slide 139

Slide 139 text

parser 2013-04-15 • Published gem, parser API
 • Well-documented
 • covered, deep-cover, erb-lint, fast, opal, packwerk, querly, rdl, reek, rubocop, rubrowser, ruby-lint, ruby-next, ruby_detective, rubycritic, seeing_is_believing, standard, steep, unparser, vernacular, yoda

Slide 140

Slide 140 text

TruffleRuby 2013-10-26 • Originally branched off JRuby
 • Graal dynamic compiler, Truffle AST interpreter

Slide 141

Slide 141 text

Ruby 2.1.0 2013-12-25 • Required keyword arguments
 • Rational and complex literals
 • Frozen string literal suffix

Slide 142

Slide 142 text

Ruby 2.2.0 2014-12-25 • Dynamic symbol hash keys

Slide 143

Slide 143 text

Ruby 2.3.0 2015-12-25 • Frozen string literal pragma
 • <<~ heredocs
 • &. lonely operator

Slide 144

Slide 144 text

Ruby 2.4.0 2016-12-25 • Symbol#to_proc refinements
 • Top-level return
 • Multiple assignment in conditional

Slide 145

Slide 145 text

tree-sitter 2017-02-02 • Parser-generator library
 • vscode-ruby

Slide 146

Slide 146 text

typedruby 2017-02-26 • Type system in Rust, parser in C++
 • Grammar from Ruby 2.4, lexer from ruby_parser
 • Vendored in Sorbet

Slide 147

Slide 147 text

Ruby 2.5.0 2017-12-25 • String interpolation refinements
 • rescue and ensure at the block level

Slide 148

Slide 148 text

Ruby 2.6.0 2018-12-25 • RubyVM::AbstractSyntaxTree
 • Flip-flop deprecated
 • Endless ranges
 • Non-ASCII constant names

Slide 149

Slide 149 text

Ruby 2.7.0 2019-12-25 • Flip-flop undeprecated
 • Method reference operator
 • Keyword argument warning
 • No other keywords syntax
 • Beginless range
 • Pattern matching
 • Numbered parameters
 • Rightward assignment
 • Argument forwarding

Slide 150

Slide 150 text

Ruby 3.x

Slide 151

Slide 151 text

Ruby 3.0.0 2020-12-25 • Keyword arguments
 • Single-line “endless” methods
 • “Find pattern” pattern matching
 • shareable_constant_value pragma
 • in keyword pattern matching

Slide 152

Slide 152 text

Ruby 3.1.0 Preview 1 2021-11-09 • Hash literal shorthand { x:, y: } == { x: x, y: y }
 • Pinned expressions

Slide 153

Slide 153 text

How Ripper works

Slide 154

Slide 154 text

twitter.com/kddnewton Parsing Ruby class Parser expression : expression "+" expression { [:add, val[0], val[2]] } | expression "-" expression { [:sub, val[0], val[2]] } | expression "*" expression { [:mul, val[0], val[2]] } | expression "/" expression { [:div, val[0], val[2]] } | "(" expression ")" { [:prn, val[1]] } | NUMBER def parse(input) until input.empty? case input when /^\s+/ when /^\d+/ yield [:NUMBER, $&.to_i] when /^[-+*\/()]/ yield [$&, $&] else raise "Failed to parse: #{input.first(10)}" end input = $' end end

Slide 157

Slide 157 text

static enum yytokentype
parser_yylex(struct parser_params *p)
{
    ...
      case '#': /* it's a comment */
        p->token_seen = token_seen;
        lex_goto_eol(p);
        dispatch_scan_event(p, tCOMMENT);
        fallthru = TRUE;
        /* fall through */
      case '\n':
        p->token_seen = token_seen;
        ...
}

Slide 159

Slide 159 text

require "ripper"

class Parser < Ripper
  attr_reader :comments

  def initialize(*)
    super
    @comments = []
  end

  def on_comment(value)
    @comments << value
  end
end

p Parser.new(<<~RUBY).tap(&:parse).comments
  # this is a comment
  # this is another comment
  # this is a third comment
RUBY

Slide 166

Slide 166 text

require "ripper"

class Parser < Ripper
  attr_reader :comments

  def initialize(*)
    super
    @comments = []
  end

  def on_comment(value)
    @comments << value
  end
end

p Parser.new(<<~RUBY).tap(&:parse).comments
  # this is a comment
  # this is another comment
  # this is a third comment
RUBY

[
  "# this is a comment\n",
  "# this is another comment\n",
  "# this is a third comment\n"
]
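A scanner event handler can also ask the parser where the token it just received starts. The sketch below extends the comment example with Ripper's `lineno` and `column` methods. `CommentLocator` is a hypothetical class name used only for illustration.

```ruby
require "ripper"

# Hypothetical variation on the slide's Parser: stores where each comment
# starts alongside its text.
class CommentLocator < Ripper
  attr_reader :comments

  def initialize(*)
    super
    @comments = []
  end

  # lineno is 1-based, column is a 0-based byte offset within the line.
  def on_comment(value)
    @comments << [lineno, column, value]
  end
end

locator = CommentLocator.new("# first\nx = 1 # second\n")
locator.parse
p locator.comments
```

Each entry holds the line, the column, and the comment text including its trailing newline.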

Slide 167

Slide 167 text

class Parser

expression
  : expression "+" expression { [:add, val[0], val[2]] }
  | expression "-" expression { [:sub, val[0], val[2]] }
  | expression "*" expression { [:mul, val[0], val[2]] }
  | expression "/" expression { [:div, val[0], val[2]] }
  | "(" expression ")" { [:prn, val[1]] }
  | NUMBER

def parse(input)
  until input.empty?
    case input
    when /\A\s+/
      # skip whitespace; \A anchors the match to the start of the remaining input
    when /\A\d+/
      yield [:NUMBER, $&.to_i]
    when /\A[-+*\/()]/
      yield [$&, $&]
    else
      raise "Failed to parse: #{input[0, 10]}"
    end

    # continue with everything after the last match
    input = $'
  end
end

Slide 168

Slide 168 text

method_call :
  | keyword_super paren_args
    {
      /*%%%*/
      $$ = NEW_SUPER($2, &@$);
      /*% %*/
      /*% ripper: super!($2) %*/
    }
  | keyword_super
    {
      /*%%%*/
      $$ = NEW_ZSUPER(&@$);
      /*% %*/
      /*% ripper: zsuper! %*/
    }

Slide 170

Slide 170 text

require "ripper"

class Parser < Ripper
  def on_super(arguments)
    puts "super called with arguments: #{arguments}"
  end

  def on_zsuper
    puts "super called without arguments"
  end
end

Parser.new(<<~RUBY).parse
  super
  super(1, 2, 3)
RUBY

Slide 173

Slide 173 text

require "ripper"

class Parser < Ripper
  def on_super(arguments)
    puts "super called with arguments: #{arguments}"
  end

  def on_zsuper
    puts "super called without arguments"
  end
end

Parser.new(<<~RUBY).parse
  super
  super(1, 2, 3)
RUBY

super called without arguments
super called with arguments:

Slide 174

Slide 174 text

method_call :
  | keyword_super paren_args
    {
      /*%%%*/
      $$ = NEW_SUPER($2, &@$);
      /*% %*/
      /*% ripper: super!($2) %*/
    }
  | keyword_super
    {
      /*%%%*/
      $$ = NEW_ZSUPER(&@$);
      /*% %*/
      /*% ripper: zsuper! %*/
    }
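The `ripper:` annotations in the grammar above name parser events, and each one ends up as an `on_` handler whose argument count matches the annotation. Ripper exposes those names and arities at runtime, which helps when working out a handler's signature. A small sketch, assuming only the `Ripper::PARSER_EVENT_TABLE` and `Ripper::SCANNER_EVENTS` constants from the standard library (the `handlers` variable is just for illustration):

```ruby
require "ripper"

# PARSER_EVENT_TABLE maps each parser event name to the number of
# arguments its on_<event> handler receives.
handlers = %i[super zsuper].to_h do |event|
  ["on_#{event}", Ripper::PARSER_EVENT_TABLE.fetch(event)]
end
p handlers

# Scanner events such as comments are listed separately.
p Ripper::SCANNER_EVENTS.include?(:comment)
```

Here `on_super` takes one argument (the `$2` passed to `super!`) and `on_zsuper` takes none, matching the two grammar actions.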

Slide 177

Slide 177 text

require "ripper"

class Parser < Ripper
  def on_super(arguments)
    puts "super called with arguments: #{arguments}"
  end

  def on_zsuper
    puts "super called without arguments"
  end
end

Parser.new(<<~RUBY).parse
  super
  super(1, 2, 3)
RUBY

Slide 178

Slide 178 text

require "ripper"

class Parser < Ripper::SexpBuilderPP
  def on_super(arguments)
    puts "super called with arguments: #{arguments}"
  end

  def on_zsuper
    puts "super called without arguments"
  end
end

Parser.new(<<~RUBY).parse
  super
  super(1, 2, 3)
RUBY

Slide 179

Slide 179 text

require "ripper"

class Parser < Ripper::SexpBuilderPP
  def on_super(arguments)
    puts "super called with arguments: #{arguments}"
  end

  def on_zsuper
    puts "super called without arguments"
  end
end

Parser.new(<<~RUBY).parse
  super
  super(1, 2, 3)
RUBY

super called without arguments
super called with arguments:
  [:arg_paren,
    [:args_add_block, [
      [:@int, "1", [2, 6]],
      [:@int, "2", [2, 9]],
      [:@int, "3", [2, 12]]
    ], false]]
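Nested arrays like the output above are also what `Ripper.sexp` returns, so they can be walked with ordinary array code. The sketch below pulls the integer literals out of a tree. `find_tokens` is a hypothetical helper, not part of Ripper.

```ruby
require "ripper"

# Hypothetical helper: depth-first search for scanner-event nodes of one
# type (for example :@int), returning their source text.
def find_tokens(node, type, found = [])
  return found unless node.is_a?(Array)

  if node.first == type
    found << node[1]
  else
    node.each { |child| find_tokens(child, type, found) }
  end

  found
end

tree = Ripper.sexp("super(1, 2, 3)")
p find_tokens(tree, :@int)
```

The search does not depend on the exact shape of the surrounding nodes, so it should be less sensitive to changes in the tree layout between Ruby versions.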

Slide 180

Slide 180 text

No content

Slide 181

Slide 181 text

• Implement every method handler yourself

Slide 182

Slide 182 text

• Implement every method handler yourself
• Inherit from Ripper::SexpBuilder or Ripper::SexpBuilderPP

Slide 183

Slide 183 text

• Implement every method handler yourself
• Inherit from Ripper::SexpBuilder or Ripper::SexpBuilderPP
• Some combination of both
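As one possible reading of the "combination" option, a class can inherit the s-expression handlers and override only the events it cares about. This is a minimal sketch. `CommentCollector` is a hypothetical name, and calling `super` inside the handler is one design choice among several.

```ruby
require "ripper"

# Hypothetical example: keep the inherited s-expression tree, but also
# collect comments through a custom handler.
class CommentCollector < Ripper::SexpBuilderPP
  attr_reader :comments

  def initialize(*)
    super
    @comments = []
  end

  # Record the comment, then defer to the inherited handler so the return
  # value is still the usual [:@comment, ...] node.
  def on_comment(value)
    @comments << value
    super
  end
end

collector = CommentCollector.new("# note\nfoo(1)\n")
tree = collector.parse
p collector.comments
p tree.first
```

The tree still starts with `:program`, and the comment text is available separately through `comments`.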

Slide 184

Slide 184 text

# frozen_string_literal: true

require 'ripper'

class Prettier::Parser < Ripper
  # Represents a line in the source. If this class is being used, it means that
  # every character in the string is 1 byte in length, so we can just return the
  # start of the line + the index.
  class SingleByteString
    def initialize(start)
      @start = start
    end

    def [](byteindex)
      @start + byteindex
    end
  end

  # Represents a line in the source. If this class is being used, it means that
  # there are characters in the string that are multi-byte, so we will build up
  # an array of indices, such that array[byteindex] will be equal to the index
  # of the character within the string.
  class MultiByteString
    def initialize(start, line)
      @indices = []

      line
        .each_char
        .with_index(start) do |char, index|
          char.bytesize.times { @indices << index }
        end
    end

    def [](byteindex)
      @indices[byteindex]
    end
  end

  class Location
    attr_reader :start_line, :start_char, :end_line, :end_char

    def initialize(start_line:, start_char:, end_line:, end_char:)
      @start_line = start_line

Slide 185

Slide 185 text

      ending = find_scanner_event(:@tstring_end)

      {
        type: :xstring_literal,
        body: xstring[:body],
        loc: xstring[:loc].to(ending[:loc])
      }
    end
  end

  # yield is a parser event that represents using the yield keyword with
  # arguments. It accepts as an argument an args_add_block event that
  # contains all of the arguments being passed.
  def on_yield(args_add_block)
    event = find_scanner_event(:@kw, 'yield')

    {
      type: :yield,
      body: [args_add_block],
      loc: event[:loc].to(args_add_block[:loc])
    }
  end

  # yield0 is a parser event that represents the bare yield keyword. It has
  # no body as it accepts no arguments. This is as opposed to the yield
  # parser event, which is the version where you're yielding one or more
  # values.
  def on_yield0
    event = find_scanner_event(:@kw, 'yield')

    { type: :yield0, body: event[:body], loc: event[:loc] }
  end

  # zsuper is a parser event that represents the bare super keyword. It has
  # no body as it accepts no arguments. This is as opposed to the super
  # parser event, which is the version where you're calling super with one
  # or more values.
  def on_zsuper
    event = find_scanner_event(:@kw, 'super')

    { type: :zsuper, body: event[:body], loc: event[:loc] }
  end
end
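The handlers above rely on a `find_scanner_event` helper whose body is not shown on the slides. The sketch below illustrates the general idea only and is not Prettier's actual implementation: scanner events are recorded with their position as they arrive, and a parser event handler later looks up the most recent matching one. `KeywordLocator`, the hash layout, and the helper body are all assumptions.

```ruby
require "ripper"

# Hypothetical sketch: remember keyword tokens with their positions so that
# parser events, which fire later, can recover where they started.
class KeywordLocator < Ripper
  attr_reader :nodes

  def initialize(*)
    super
    @scanner_events = []
    @nodes = []
  end

  # Scanner event: fired for every keyword token as it is lexed.
  def on_kw(value)
    event = { type: :@kw, body: value, line: lineno, column: column }
    @scanner_events << event
    event
  end

  # Parser event: by the time this fires the lexer may already have moved
  # on, so the position comes from the recorded keyword instead.
  def on_zsuper
    event = find_scanner_event(:@kw, "super")
    node = { type: :zsuper, line: event[:line], column: event[:column] }
    @nodes << node
    node
  end

  private

  # Assumed behaviour: remove and return the most recent matching event.
  def find_scanner_event(type, body)
    index = @scanner_events.rindex do |event|
      event[:type] == type && event[:body] == body
    end
    raise "No #{type} scanner event found for #{body.inspect}" unless index

    @scanner_events.delete_at(index)
  end
end

locator = KeywordLocator.new("def foo\n  super\nend\n")
locator.parse
p locator.nodes
```

For this input the recorded node points at line 2, column 2, where the `super` keyword begins.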

Slide 186

Slide 186 text

Syntax tree options

Slide 187

Slide 187 text

ruby_parser
 • Some community adoption
 • Not 100% compatible, new stuff may break
 • Doesn’t ship with/test with core

Slide 188

Slide 188 text

parser
 • Tons of community adoption
 • Well-documented
 • Not 100% compatible, new stuff may break
 • Doesn’t ship with/test with core

Slide 189

Slide 189 text

RubyVM::AbstractSyntaxTree
 • Still too early to tell
 • Not implemented anywhere else

Slide 190

Slide 190 text

ripper
 • Built into the parser generator
 • Well-tested in core
 • Ships with Ruby
 • No documentation

Slide 192

Slide 192 text

No content

Slide 193

Slide 193 text

Parsing Ruby Kevin Newton https://kddnewton.com/parsing-ruby https://kddnewton.com/ripper-docs