Let's Write a Parser
Ionuț G. Stan — I T.A.K.E. — May 2016
About Me
• Software Developer at Eloquentix
• I work mostly with Scala
• I like FP, programming languages, compilers
• I started the Bucharest FP meet-up group
• I occasionally blog on igstan.ro
Plan
• Vehicle Language: µML
• Compilers Overview
• Parsing: Intuitions and Live Coding
Vehicle Language: µML
1. Integers: 1, 23, 456, etc.
2. Identifiers (only letters): inc, cond, a, etc.
3. Booleans: true and false
4. Single-argument anonymous functions: fn a => a
5. Function application: inc 42
6. If expressions: if cond then t else f
7. Addition and subtraction: a + b, a - b
8. Parenthesized expressions: (a + b)
9. Let blocks/expressions:
let
  val name = ...
in
  name
end
Small Example
let
  val inc =
    fn a => a + 1
in
  inc 42
end
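The nine constructs listed above could be modeled as one AST node type each. This is only a sketch: the class and field names are assumptions chosen for illustration, and parenthesized expressions get no node of their own because they only group.

```python
from dataclasses import dataclass
from typing import Union

# All class and field names below are hypothetical, not taken from the talk.

@dataclass(frozen=True)
class Int:
    value: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Bool:
    value: bool

@dataclass(frozen=True)
class Fun:            # fn param => body
    param: str
    body: "Expr"

@dataclass(frozen=True)
class App:            # fn arg
    fn: "Expr"
    arg: "Expr"

@dataclass(frozen=True)
class If:             # if cond then then_branch else else_branch
    cond: "Expr"
    then_branch: "Expr"
    else_branch: "Expr"

@dataclass(frozen=True)
class Add:            # left + right
    left: "Expr"
    right: "Expr"

@dataclass(frozen=True)
class Sub:            # left - right
    left: "Expr"
    right: "Expr"

@dataclass(frozen=True)
class Let:            # let val name = value in body end
    name: str
    value: "Expr"
    body: "Expr"

Expr = Union[Int, Var, Bool, Fun, App, If, Add, Sub, Let]

# The small example, built by hand instead of by a parser.
small_example = Let(
    name="inc",
    value=Fun("a", Add(Var("a"), Int(1))),
    body=App(Var("inc"), Int(42)),
)
```

Producing a value like small_example from source text is the job of the parser.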
Compilers Overview
[Diagram: a Compiler translates a Source Language into a Target Language, e.g. (fn a => a) 2 into (function(a){return a})(2)]
Parsing
[Diagram: inside the Compiler, a Parser phase reads the source text (fn a => a) 2]
Abstract Syntax Tree
[Diagram: the Parser produces an Abstract Syntax Tree (AST): APP(FUN(a, VAR a), INT 2)]
Code Generation
[Diagram: Parser → AST → CodeGen, which emits the target code (function(a){return a})(2)]
Many Intermediate Phases
[Diagram: Parser → AST → ... → CodeGen]
Type Checking
[Diagram: Parser → AST → Type Checker → Typed AST → ... → CodeGen]
Last Year's Talk
[Diagram: the same pipeline, with a "Last Year" label]
Today's Talk
[Diagram: the same pipeline, with a "Today" label]
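To make the last step of the pipeline concrete, here is a rough sketch of a code generator for the running example. It assumes the AST is encoded as nested tuples tagged "APP", "FUN", "VAR" and "INT"; that encoding and the function name to_js are illustrative assumptions, and only the four node kinds from the diagram are handled.

```python
from typing import Tuple

def to_js(node: Tuple) -> str:
    """Turn a tuple-encoded AST node into JavaScript source text."""
    kind = node[0]
    if kind == "INT":
        return str(node[1])
    if kind == "VAR":
        return node[1]
    if kind == "FUN":
        _, param, body = node
        return "(function(" + param + "){return " + to_js(body) + "})"
    if kind == "APP":
        _, fn, arg = node
        return to_js(fn) + "(" + to_js(arg) + ")"
    raise ValueError("unsupported node: " + repr(node))

# AST for (fn a => a) 2, written by hand.
ast = ("APP", ("FUN", "a", ("VAR", "a")), ("INT", 2))
print(to_js(ast))  # (function(a){return a})(2)
```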
Parsing
[Diagram: the Parser phase of the Compiler]
Lexing + Parsing
[Diagram: the Parser phase is split in two: a Lexer turns the source text into Tokens ( fn a => a ) 2, and a Parser turns the Tokens into the AST APP(FUN(a, VAR a), INT 2)]
Lexing
• Expects a stream of characters or bytes
• Groups them into semantically atomic units: tokens!
• These are the words of the language!
• What are the rules for grouping them, though?
• Grouping can be thought of as "split by space"
• Why not exactly that, though? Consider:
val sum = 1 + 2
val sum=1+2
val str = "spaces matter here"
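The three inputs above can be run through a plain whitespace split to see where the intuition breaks down. This uses only Python's built-in str.split; the variable names are arbitrary.

```python
# The three inputs from the slide, split on whitespace only.
spaced = "val sum = 1 + 2".split()
compact = "val sum=1+2".split()
string = 'val str = "spaces matter here"'.split()

print(spaced)   # ['val', 'sum', '=', '1', '+', '2']
print(compact)  # ['val', 'sum=1+2']
print(string)   # ['val', 'str', '=', '"spaces', 'matter', 'here"']
```

The first input happens to work, the second glues five tokens into one, and the third cuts a single string literal into three pieces.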
• We need rules for grouping characters into tokens
• These rules form the lexical grammar
• Can be defined using regular expressions
• Conducive to easy and efficient implementations
• Using a RegExp library
• By hand isn't hard either, just a little cumbersome
• Lexer generators: Lex, Flex, Alex, ANTLR, etc.
• Lexing is what you need for syntax definition files
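A lexical grammar for µML built on a RegExp library might look like the sketch below. The token kinds, the KEYWORDS set and the function name lex are assumptions made for illustration, not the code from the talk. String literals are left out because µML, as listed earlier, has none.

```python
import re
from typing import List, NamedTuple

class Token(NamedTuple):
    kind: str
    text: str

# Hypothetical lexical grammar: one named regex group per token kind.
# Order matters: "=>" must be tried before the single-character "=".
TOKEN_SPEC = [
    ("INT",    r"[0-9]+"),
    ("IDENT",  r"[A-Za-z]+"),     # letters only; keywords are split off below
    ("ARROW",  r"=>"),
    ("SYMBOL", r"[()+\-=]"),
    ("SKIP",   r"\s+"),           # whitespace separates tokens but is not one
]
KEYWORDS = {"fn", "if", "then", "else", "let", "val", "in", "end", "true", "false"}
MASTER = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))

def lex(source: str) -> List[Token]:
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"
        tokens.append(Token(kind, text))
    return tokens
```

With these rules, lex("val sum=1+2") and lex("val sum = 1 + 2") produce the same token list, which is exactly what "split by space" could not do.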
Parsing
• The lexer recognizes valid words in the language
• Not all combinations of valid words form valid phrases in a language
• Syntactically correct: val a = 1
• Syntactically incorrect: val val val
• We must define the structure of phrases
• A syntactical grammar achieves that
• Regular expressions are not powerful enough
• REs can't recognize nested structures
• Because they use a finite amount of memory
• Nesting needs a stack to remember the upper structures you're traversing
• Syntactical grammars express nesting using recursion
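The role of the stack can be shown with a small nesting checker. This is a sketch under assumptions: it works on a list of already-split tokens, it only tracks two kinds of nesting (parentheses and let ... end), and the names CLOSER_FOR and is_well_nested are made up for the example.

```python
from typing import List

# Hypothetical opener/closer pairs: parentheses plus µML's let ... end blocks.
CLOSER_FOR = {"(": ")", "let": "end"}
CLOSERS = set(CLOSER_FOR.values())

def is_well_nested(tokens: List[str]) -> bool:
    stack: List[str] = []  # closers we still expect, innermost last
    for tok in tokens:
        if tok in CLOSER_FOR:
            # Entering a structure: remember which closer must end it.
            stack.append(CLOSER_FOR[tok])
        elif tok in CLOSERS:
            # Leaving a structure: it must be the innermost open one.
            if not stack or stack.pop() != tok:
                return False
    return not stack  # everything opened must have been closed
```

The stack grows with the nesting depth of the input, so no fixed amount of memory is enough for every program.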
It's not weird-looking Unicode characters that make regexes unsuitable for parsing.
Syntactical Grammar
µML — Syntactical Grammar
expr = int
     | var
     | bool
     | ( expr )
     | fn var => expr
     | if expr then expr else expr
     | let val var = expr in expr end
     | expr oper expr
     | expr expr
oper = + | -
bool = true | false
Here, blue symbols represent tokens coming from the lexer, not keywords.
Introducing Precedence
• Function application has higher precedence than infix expressions in ML
• double 1 + 2 = (double 1) + 2
• double 1 + 2 ≠ double (1 + 2)
• A rule's alternatives don't encode precedence
• Grammars convey this by chaining rules in order of precedence
• Doesn't scale with many infix operators
• Use a special parser for that, e.g., the Shunting Yard algorithm
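For the last bullet, a stripped-down version of the Shunting Yard idea could look like this. It is a sketch under several assumptions: operands are single tokens, all operators are left-associative, parentheses are not handled, and the input is well formed. The "*" entry is not part of µML; it is there only to show that a new precedence level is one more table row rather than one more grammar rule.

```python
from typing import Dict, List, Tuple, Union

Tree = Union[str, Tuple[str, "Tree", "Tree"]]

# Higher number binds tighter. "*" is NOT in µML, it is an illustrative extra.
PRECEDENCE: Dict[str, int] = {"+": 1, "-": 1, "*": 2}

def parse_infix(tokens: List[str]) -> Tree:
    operands: List[Tree] = []
    operators: List[str] = []

    def reduce_top() -> None:
        # Combine the top operator with the two most recent operands.
        op = operators.pop()
        right = operands.pop()
        left = operands.pop()
        operands.append((op, left, right))

    for tok in tokens:
        if tok in PRECEDENCE:
            # Left-associative: first reduce every stacked operator that
            # binds at least as tightly as the incoming one.
            while operators and PRECEDENCE[operators[-1]] >= PRECEDENCE[tok]:
                reduce_top()
            operators.append(tok)
        else:
            operands.append(tok)
    while operators:
        reduce_top()
    return operands[0]
```

For example, parse_infix("a - b + c".split()) groups to the left, while in "a + b * c" the hypothetical "*" binds first.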
expr = infix
     | fn var => expr
     | if expr then expr else expr

infix = app
      | infix oper infix

app = atomic
    | app atomic

atomic = int
       | var
       | bool
       | ( expr )
       | let val var = expr in expr end
bool = true | false
oper = + | -
Parsing Strategies
• Two styles:
• Top-down parsing: builds tree from the root
• Bottom-up parsing: builds tree from the leaves
• Top-down is easy to write by hand
• Bottom-up is not, but it's used by many generators
• Parser generators: YACC and Bison (bottom-up), ANTLR (top-down), etc.
Recursive Descent Parser
• The simplest known parsing strategy; amenable to hand-coding
• Builds the tree top to bottom, from root to leaves, hence Descent
• Parallels the structure of the grammar
• Main idea: each grammar production becomes a function
• Recursion in the grammar translates to recursion in the code, hence Recursive
• Recursion is the main difference compared to regexes; it needs a stack
• Very popular, e.g., Clang uses it for C/C++/Obj-C
• Parser combinators are an abstraction over this idea
Code
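As a rough illustration of "each grammar production becomes a function", here is a sketch that covers only part of the grammar. It deliberately leaves out the infix and app rules and the let form, so expr falls straight through to atomic. The Parser class, its method names and the tuple-encoded tree are assumptions made for this example, and the input is a list of pre-split tokens rather than real lexer output.

```python
from typing import List, Optional, Tuple

KEYWORDS = {"fn", "if", "then", "else", "let", "val", "in", "end", "true", "false"}

class Parser:
    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Optional[str]:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self) -> str:
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    def expect(self, text: str) -> None:
        found = self.advance()
        if found != text:
            raise SyntaxError(f"expected {text!r}, found {found!r}")

    def var(self) -> str:
        tok = self.advance()
        if not (tok.isascii() and tok.isalpha()) or tok in KEYWORDS:
            raise SyntaxError(f"expected an identifier, found {tok!r}")
        return tok

    # expr = fn var => expr | if expr then expr else expr | atomic
    def expr(self) -> Tuple:
        if self.peek() == "fn":
            self.expect("fn")
            param = self.var()
            self.expect("=>")
            return ("FUN", param, self.expr())
        if self.peek() == "if":
            self.expect("if")
            cond = self.expr()
            self.expect("then")
            then = self.expr()
            self.expect("else")
            otherwise = self.expr()
            return ("IF", cond, then, otherwise)
        return self.atomic()

    # atomic = int | bool | var | ( expr )
    def atomic(self) -> Tuple:
        tok = self.peek()
        if tok == "(":
            self.expect("(")
            inner = self.expr()  # recursion in the grammar, recursion in the code
            self.expect(")")
            return inner
        if tok in ("true", "false"):
            self.advance()
            return ("BOOL", tok == "true")
        if tok is not None and tok.isascii() and tok.isdigit():
            self.advance()
            return ("INT", int(tok))
        return ("VAR", self.var())
```

Calling Parser("fn a => if a then ( 1 ) else b".split()).expr() walks the rules in the same order the grammar reads and returns a nested tuple.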
Removing Left-Recursion
• The current grammar has a problem
• But, it's only a problem for our current parsing strategy; others can easily cope with it
• The problem is that some rules are left-recursive, i.e., the rule itself appears as the first symbol of one of its own alternatives
• This is problematic for a recursive descent parser because the structure of function calls follows the structure of rule definitions
• That means infinite recursion in the parser, which isn't good
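The infinite recursion is easy to reproduce. The sketch below translates the left-recursive alternative app = app atomic literally into a function; the function names and the token-list input are assumptions for the example.

```python
from typing import List, Tuple

def app(tokens: List[str], pos: int = 0) -> Tuple:
    # app = app atomic
    # The first symbol of the alternative is app itself, so the function
    # calls itself before it has consumed a single token.
    left, pos = app(tokens, pos)
    return ("APP", left, tokens[pos]), pos + 1

def overflows(tokens: List[str]) -> bool:
    """Report whether the naive parser blows the call stack on this input."""
    try:
        app(tokens)
    except RecursionError:
        return True
    return False

print(overflows(["inc", "42"]))  # True
```

The input never matters here: pos is still 0 at every recursive call, so no amount of tokens can make the recursion stop.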
Left-Recursive Grammar
expr = infix
     | fn var => expr
     | if expr then expr else expr

infix = app
      | infix oper infix

app = atomic
    | app atomic

atomic = int
       | var
       | bool
       | ( expr )
       | let val var = expr in expr end
bool = true | false
oper = + | -
Removing Left-Recursion
expr = infix
     | fn var => expr
     | if expr then expr else expr

infix = app { oper infix }

app = atomic { app }

12 14 13 = (12 14) 13

atomic = int
       | var
       | bool
       | ( expr )
       | let val var = expr in expr end
bool = true | false
oper = + | -
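One possible way to code the two rewritten rules is with loops, sketched below. It rests on an assumption about how to read the braces: they are taken to mean "zero or more", and each loop repeats the tighter-binding rule, i.e. app { oper app } and atomic { atomic }, so that the resulting trees nest to the left as in (12 14) 13. The sketch is not the code from the talk; it omits bool, let, fn and if, parses infix inside parentheses where the grammar says expr, and all names are illustrative.

```python
from typing import List, Optional, Tuple

KEYWORDS = {"fn", "if", "then", "else", "let", "val", "in", "end", "true", "false"}

def is_int(tok: str) -> bool:
    return tok.isascii() and tok.isdigit()

def is_var(tok: str) -> bool:
    return tok.isascii() and tok.isalpha() and tok not in KEYWORDS

class Parser:
    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Optional[str]:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self) -> str:
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    # infix, read here as: app { oper app }
    def infix(self) -> Tuple:
        left = self.app()
        while self.peek() in ("+", "-"):
            op = self.advance()
            right = self.app()
            left = ("ADD" if op == "+" else "SUB", left, right)  # nest to the left
        return left

    # app, read here as: atomic { atomic }
    def app(self) -> Tuple:
        fn = self.atomic()
        while self.starts_atomic():
            fn = ("APP", fn, self.atomic())  # (f a) b, not f (a b)
        return fn

    def starts_atomic(self) -> bool:
        tok = self.peek()
        return tok is not None and (tok == "(" or is_int(tok) or is_var(tok))

    # atomic = int | var | ( ... ), a subset of the rule in the grammar
    def atomic(self) -> Tuple:
        tok = self.advance()
        if tok == "(":
            inner = self.infix()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return inner
        if is_int(tok):
            return ("INT", int(tok))
        if is_var(tok):
            return ("VAR", tok)
        raise SyntaxError(f"unexpected token {tok!r}")

def parse(source: str) -> Tuple:
    parser = Parser(source.split())
    tree = parser.infix()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

Because app is called from inside infix, application binds tighter than + and -, so parse("double 1 + 2") yields the tree for (double 1) + 2.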
github.com/igstan/itake-2016
Homework
• Write a lexer for JSON
• Write a recursive descent parser for JSON
• It's way easier than today's vehicle language
• I promise!
• Specification: json.org