Slide 1

Slide 1 text

Let's Write a Parser Ionuț G. Stan — I T.A.K.E. — May 2016

Slide 2

Slide 2 text

About Me

Slide 7

Slide 7 text

About Me
• Software Developer at Eloquentix
• I work mostly with Scala
• I like FP, programming languages, compilers
• I started the Bucharest FP meet-up group
• I occasionally blog on igstan.ro

Slide 8

Slide 8 text

Plan

Slide 11

Slide 11 text

Plan
• Vehicle Language: µML
• Compilers Overview
• Parsing: Intuitions and Live Coding

Slide 12

Slide 12 text

Vehicle Language: µML

Slide 20

Slide 20 text

Vehicle Language: µML
1. Integers: 1, 23, 456, etc.
2. Identifiers (only letters): inc, cond, a, etc.
3. Booleans: true and false
4. Single-argument anonymous functions: fn a => a
5. Function application: inc 42
6. If expressions: if cond then t else f
7. Addition and subtraction: a + b, a - b
8. Parenthesized expressions: (a + b)

Slide 21

Slide 21 text

Vehicle Language: µML
9. Let blocks/expressions:

   let
     val name = ...
   in
     name
   end

Slide 22

Slide 22 text

Small Example

   let
     val inc = fn a => a + 1
   in
     inc 42
   end

Slide 23

Slide 23 text

Compilers Overview

Slide 28

Slide 28 text

Compilers Overview
[Diagram: Source Language -> Compiler -> Target Language. Source example: (fn a => a) 2. Target example: (function(a){return a})(2)]

Slide 29

Slide 29 text

Compilers Overview
[Diagram: the Compiler box, between the source program (fn a => a) 2 and the target program (function(a){return a})(2)]

Slide 30

Slide 30 text

Parsing
[Diagram: the Parser stage inside the Compiler]

Slide 31

Slide 31 text

Abstract Syntax Tree
[Diagram: inside the Compiler, the Parser produces an Abstract Syntax Tree (AST) for (fn a => a) 2: APP(FUN a (VAR a), INT 2)]
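A minimal sketch of how the tree on this slide could be represented in code, assuming Python dataclasses. The class names mirror the node labels in the diagram (INT, VAR, FUN, APP); the field names are made up for illustration and are not taken from the talk's repository.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Int:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Fun:
    param: str      # the single parameter name
    body: "Expr"    # the function body


@dataclass(frozen=True)
class App:
    fn: "Expr"      # the expression being applied
    arg: "Expr"     # the argument expression


Expr = Union[Int, Var, Fun, App]

# The AST for the source text `(fn a => a) 2`.
identity_applied = App(Fun("a", Var("a")), Int(2))
```

Note that the parentheses of the source text leave no trace in the tree: grouping is encoded by the nesting of the nodes themselves.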

Slide 32

Slide 32 text

Code Generation
[Diagram: inside the Compiler, Parser -> AST -> CodeGen; the AST is APP(FUN a (VAR a), INT 2)]

Slide 33

Slide 33 text

Many Intermediate Phases
[Diagram: inside the Compiler, Parser -> AST -> ... -> CodeGen]

Slide 34

Slide 34 text

Type Checking
[Diagram: inside the Compiler, Parser -> AST -> Type Checker -> Typed AST -> ... -> CodeGen]

Slide 35

Slide 35 text

Last Year's Talk
[Diagram: the same pipeline (Parser, AST, Type Checker, Typed AST, ..., CodeGen), with one part labeled "Last Year"]

Slide 36

Slide 36 text

Today's Talk
[Diagram: the same pipeline (Parser, AST, Type Checker, Typed AST, ..., CodeGen), with one part labeled "Today"]

Slide 37

Slide 37 text

Parsing
[Diagram: the Parser stage inside the Compiler]

Slide 38

Slide 38 text

Lexing + Parsing
[Diagram: the Parser stage splits into Lexer -> Tokens -> Parser; the tokens are ( fn a => a ) 2]

Slide 39

Slide 39 text

Lexing
[Diagram: Lexer -> Tokens -> Parser; the tokens are ( fn a => a ) 2]

Slide 46

Slide 46 text

Lexing
[Diagram: Lexer -> Tokens -> Parser -> AST; the tokens are ( fn a => a ) 2 and the AST is APP(FUN a (VAR a), INT 2)]

Slide 47

Slide 47 text

Parsing
[Diagram: Lexer -> Tokens -> Parser -> AST; the tokens are ( fn a => a ) 2 and the AST is APP(FUN a (VAR a), INT 2)]

Slide 52

Slide 52 text

Lexing
[Diagram: Lexer -> Tokens -> Parser -> AST; the tokens are ( fn a => a ) 2 and the AST is APP(FUN a (VAR a), INT 2)]

Slide 53

Slide 53 text

Lexing

Slide 54

Slide 54 text

Lexing
[Diagram: Lexer -> Tokens -> Parser; the tokens are ( fn a => a ) 2]
• Expects a stream of characters or bytes
• Groups them into semantically atomic units: tokens!
• These are the words of the language!
• What are the rules for grouping them, though?

Slide 57

Slide 57 text

Lexing
• Grouping can be thought of as "split by space"
• Why not exactly that, though? Consider:

   val sum = 1 + 2
   val sum=1+2
   val str = "spaces matter here"
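To make the point concrete, here is a small sketch that applies a literal "split by space" to the three lines above, using Python's built-in str.split. The variable names are illustrative only.

```python
samples = [
    'val sum = 1 + 2',
    'val sum=1+2',
    'val str = "spaces matter here"',
]

# Naive lexing: cut each line at runs of whitespace.
naive_tokens = [line.split() for line in samples]

for line, tokens in zip(samples, naive_tokens):
    print(f"{line!r} -> {tokens}")
```

Only the first line comes out as intended. The second keeps sum=1+2 glued together as one chunk, and the third cuts the string literal into three pieces.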

Slide 65

Slide 65 text

Lexing
• We need rules for grouping characters into tokens
• These rules form the lexical grammar
• Can be defined using regular expressions
• Conducive to easy and efficient implementations
• Using a RegExp library
• By hand isn't hard either, just a little cumbersome
• Lexer generators: Lex, Flex, Alex, ANTLR, etc.
• Lexing is what you need for syntax definition files

Slide 66

Slide 66 text

µML — Lexical Grammar

   integers     0|[1-9][0-9]*
   identifiers  [a-zA-Z]+
   symbols      (, ), +, -, =, =>
   keywords     if, then, else, let, val, in, end, fn, true, false
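The lexical grammar above translates almost directly into a regex-based lexer. What follows is a minimal sketch using Python's re module, not the code written during the talk. The token kind names ("int", "ident", "symbol", "keyword") and the function name lex are assumptions made for this example. Keywords are matched as identifiers first and reclassified afterwards, which is one common way to handle them.

```python
import re
from typing import List, Tuple

Token = Tuple[str, str]  # (kind, text)

KEYWORDS = {"if", "then", "else", "let", "val", "in", "end", "fn", "true", "false"}

# One named group per rule of the lexical grammar, plus whitespace to skip.
# "=>" is listed before "=" so the longer symbol wins.
TOKEN_RE = re.compile(
    r"""
    (?P<space>\s+)
  | (?P<int>0|[1-9][0-9]*)
  | (?P<ident>[a-zA-Z]+)
  | (?P<symbol>=>|[()+\-=])
    """,
    re.VERBOSE,
)


def lex(source: str) -> List[Token]:
    tokens: List[Token] = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        kind, text = match.lastgroup, match.group()
        if kind == "space":
            continue  # whitespace separates tokens but is not a token itself
        if kind == "ident" and text in KEYWORDS:
            kind = "keyword"
        tokens.append((kind, text))
    return tokens
```

For example, lex("val sum=1+2") yields the same six tokens as lex("val sum = 1 + 2"), which is exactly what plain splitting on spaces could not do.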

Slide 71

Slide 71 text

Code

Slide 72

Slide 72 text

Parsing

Slide 73

Slide 73 text

Parsing
[Diagram: Lexer -> Tokens -> Parser -> AST; the tokens are ( fn a => a ) 2 and the AST is APP(FUN a (VAR a), INT 2)]

Slide 79

Slide 79 text

Parsing
• The lexer recognizes valid words in the language
• Not all combinations of valid words form valid phrases in a language
• Syntactically correct: val a = 1
• Syntactically incorrect: val val val
• We must define the structure of phrases
• A syntactical grammar achieves that

Slide 84

Slide 84 text

Parsing
• Regular expressions are not powerful enough
• REs can't recognize nested structures
• Because they use a finite amount of memory
• Nesting needs a stack to remember the upper structures you're traversing
• Syntactical grammars express nesting using recursion
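The role of the stack can be illustrated with the smallest nested structure there is: balanced parentheses. The sketch below is an illustrative Python function (the name balanced is made up), and the claim about regular expressions refers to classical ones; some regex engines add extensions that go beyond them.

```python
def balanced(text: str) -> bool:
    """Check that every '(' has a matching ')' and vice versa."""
    stack = []  # one entry per enclosing '(' we are currently inside of
    for ch in text:
        if ch == "(":
            stack.append(ch)
        elif ch == ")":
            if not stack:
                return False  # a ')' with nothing left to close
            stack.pop()
    return not stack  # anything left over was never closed
```

The stack grows with the nesting depth of the input, so no fixed amount of memory is enough for all inputs. That unbounded memory is what a finite-state matcher lacks.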

Slide 86

Slide 86 text

It's not weird-looking Unicode characters that make regexes unsuitable for parsing.

Slide 87

Slide 87 text

Syntactical Grammar

Slide 89

Slide 89 text

µML — Syntactical Grammar

   expr = int
        | var
        | bool
        | ( expr )
        | fn var => expr
        | if expr then expr else expr
        | let val var = expr in expr end
        | expr oper expr
        | expr expr

   oper = + | -
   bool = true | false

Here, blue symbols represent tokens coming from the lexer, not keywords.

Slide 106

Slide 106 text

Introducing Precedence
• Function application has higher precedence than infix operators in ML
• double 1 + 2 = (double 1) + 2
• double 1 + 2 ≠ double (1 + 2)
• A rule's alternatives don't encode precedence
• Grammars convey this by chaining rules in order of precedence
• Doesn't scale with many infix operators
• Use a special parser for that, e.g., the Shunting Yard algorithm

Slide 107

Slide 107 text

Introducing Precedence

   expr = int
        | var
        | bool
        | ( expr )
        | fn var => expr
        | if expr then expr else expr
        | let val var = expr in expr end
        | expr oper expr
        | expr expr

   bool = true | false
   oper = + | -

Slide 108

Slide 108 text

Introducing Precedence

   expr   = infix
          | fn var => expr
          | if expr then expr else expr

   infix  = app
          | infix oper infix

   app    = atomic
          | app atomic

   atomic = int
          | var
          | bool
          | ( expr )
          | let val var = expr in expr end

   bool   = true | false
   oper   = + | -

Slide 112

Slide 112 text

Parsing Strategies

Slide 118

Slide 118 text

Parsing Strategies
• Two styles:
  • Top-down parsing: builds tree from the root
  • Bottom-up parsing: builds tree from the leaves
• Top-down is easy to write by hand
• Bottom-up is not, but it's used by generators
• Parser generators: YACC, ANTLR, Bison, etc.

Slide 126

Slide 126 text

Recursive Descent Parser
• The simplest known parsing strategy; amenable to hand-coding
• Builds the tree top to bottom, from root to leaves, hence Descent
• Parallels the structure of the grammar
• Main idea: each grammar production becomes a function
• Recursion in the grammar translates to recursion in the code, hence Recursive
• Recursion is the main difference compared to regexes; it needs a stack
• Very popular, e.g., Clang uses it for C/C++/Obj-C
• Parser combinators are an abstraction over this idea
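The "production becomes a function" idea can be shown on a toy grammar that is deliberately smaller than µML. The rule below and the function name parse_nested are invented for this illustration; tokens are plain strings to keep the sketch short.

   nested = int | ( nested )

```python
from typing import List, Tuple


def parse_nested(tokens: List[str], pos: int = 0) -> Tuple[int, int]:
    """Parse `nested` starting at tokens[pos]; return (value, next position)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        # The rule mentions itself, so the function calls itself.
        value, pos = parse_nested(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return value, pos + 1
    if tok.isdigit():
        return int(tok), pos + 1
    raise SyntaxError(f"unexpected token {tok!r}")
```

Each pending call to parse_nested remembers one open parenthesis, so the language's own call stack plays the role of the stack that nesting requires.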

Slide 127

Slide 127 text

Code

Slide 132

Slide 132 text

Removing Left-Recursion
• The current grammar has a problem
• But, it's only a problem for our current parsing strategy; others can easily cope with it
• The problem is that some rules are left-recursive, i.e., the rule itself appears as the first symbol on the left
• This is problematic for a recursive descent parser because the structure of function calls follows the structure of rule definitions
• That means infinite recursion in the parser, which isn't good
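A direct transcription of the left-recursive alternative app = app atomic shows the failure. This is an illustrative Python sketch with made-up function names, and parse_atomic is reduced to a stub because it is never reached.

```python
def parse_atomic(tokens, pos):
    # Stub: treat any single token as an atomic expression.
    return ("VAR", tokens[pos]), pos + 1


def parse_app(tokens, pos=0):
    # Alternative `app atomic`: parse the leading `app` first. Nothing has been
    # consumed yet, so this call sees the same tokens at the same position.
    func, pos = parse_app(tokens, pos)
    arg, pos = parse_atomic(tokens, pos)
    return ("APP", func, arg), pos


def recursion_blows_up(tokens):
    """Return True if parsing runs into Python's recursion limit."""
    try:
        parse_app(tokens)
    except RecursionError:
        return True
    return False
```

The parser never gets as far as looking at a token: every call immediately makes the same call again until the interpreter gives up.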

Slide 133

Slide 133 text

Left-Recursive Grammar

   expr   = infix
          | fn var => expr
          | if expr then expr else expr

   infix  = app
          | infix oper infix

   app    = atomic
          | app atomic

   atomic = int
          | var
          | bool
          | ( expr )
          | let val var = expr in expr end

   bool   = true | false
   oper   = + | -

Slide 135

Slide 135 text

Left-Recursive Grammar

   expr  = infix
         | fn var => expr
         | if expr then expr else expr

   infix = app
         | infix oper infix

   app   = atomic
         | app atomic

Slide 136

Slide 136 text

Left-Recursive Grammar

   expr  = infix
         | fn var => expr
         | if expr then expr else expr

   infix = app
         | infix oper infix

   app   = atomic
         | atomic atomic
         | atomic atomic atomic
         | atomic atomic atomic atomic
         ...

Slide 137

Slide 137 text

Left-Recursive Grammar

   expr  = infix
         | fn var => expr
         | if expr then expr else expr

   infix = app
         | infix oper infix

   app   = atomic
         | atomic atomic
         | atomic (atomic atomic)
         | atomic (atomic (atomic atomic))
         ...

Slide 138

Slide 138 text

Left-Recursive Grammar

   expr  = infix
         | fn var => expr
         | if expr then expr else expr

   infix = app
         | infix oper infix

   app   = atomic { app }

Slide 139

Slide 139 text

Removing Left-Recursion

   expr   = infix
          | fn var => expr
          | if expr then expr else expr

   infix  = app
          | infix oper infix

   app    = atomic { app }

   atomic = int
          | var
          | bool
          | ( expr )
          | let val var = expr in expr end

   bool   = true | false
   oper   = + | -

Slide 140

Slide 140 text

Removing Left-Recursion

   expr  = infix
         | fn var => expr
         | if expr then expr else expr

   infix = app
         | infix oper infix

Slide 141

Slide 141 text

Removing Left-Recursion

   expr  = infix
         | fn var => expr
         | if expr then expr else expr

   infix = app
         | app oper infix

Slide 144

Slide 144 text

Removing Left-Recursion

   expr  = infix
         | fn var => expr
         | if expr then expr else expr

   infix = app
         | app oper infix
         | app oper app oper infix
         | app oper app oper app oper infix
         ...

Slide 145

Slide 145 text

Removing Left-Recursion

   expr  = infix
         | fn var => expr
         | if expr then expr else expr

   infix = app
         | app (oper infix)
         | app (oper app (oper infix))
         | app (oper app (oper app (oper infix)))
         ...

Slide 146

Slide 146 text

Removing Left-Recursion

   expr  = infix
         | fn var => expr
         | if expr then expr else expr

   infix = app { oper infix }

Slide 147

Slide 147 text

Removing Left-Recursion

   expr   = infix
          | fn var => expr
          | if expr then expr else expr

   infix  = app { oper infix }

   app    = atomic { app }

            12 14 13
            (12 14) 13

   atomic = int
          | var
          | bool
          | ( expr )
          | let val var = expr in expr end

   bool   = true | false
   oper   = + | -
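Putting the pieces together, the grammar above can be turned into a small recursive descent parser. The following is a sketch in Python, not the code from the talk's repository, and it rests on a few assumptions. The braces are read as "zero or more repetitions" and implemented as loops, concretely app { oper app } and atomic { atomic }. Both loops fold to the left, so that 12 14 13 groups as (12 14) 13. The tuple-shaped nodes reuse the labels APP, FUN, VAR and INT from the earlier slides, while ADD, SUB, IF, LET and BOOL are names invented here.

```python
import re

KEYWORDS = {"if", "then", "else", "let", "val", "in", "end", "fn", "true", "false"}
TOKEN_RE = re.compile(r"\s*(?:(0|[1-9][0-9]*)|([a-zA-Z]+)|(=>|[()+\-=]))")
ATOM_STARTS = (("sym", "("), ("kw", "let"), ("kw", "true"), ("kw", "false"))


def lex(src):
    """Turn source text into (kind, text) pairs."""
    src, tokens, pos = src.rstrip(), [], 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise SyntaxError(f"bad character at offset {pos}")
        pos = m.end()
        number, word, symbol = m.groups()
        if number is not None:
            tokens.append(("int", number))
        elif word is not None:
            tokens.append(("kw" if word in KEYWORDS else "var", word))
        else:
            tokens.append(("sym", symbol))
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("eof", "")

    def eat(self, kind, text=None):
        tok = self.peek()
        if tok[0] != kind or (text is not None and tok[1] != text):
            raise SyntaxError(f"expected {text or kind}, got {tok[1] or 'end of input'}")
        self.pos += 1
        return tok[1]

    def expr(self):  # expr = infix | fn var => expr | if expr then expr else expr
        if self.peek() == ("kw", "fn"):
            self.eat("kw", "fn")
            param = self.eat("var")
            self.eat("sym", "=>")
            return ("FUN", param, self.expr())
        if self.peek() == ("kw", "if"):
            self.eat("kw", "if")
            cond = self.expr()
            self.eat("kw", "then")
            yes = self.expr()
            self.eat("kw", "else")
            return ("IF", cond, yes, self.expr())
        return self.infix()

    def infix(self):  # read here as: app { oper app }, folded to the left
        left = self.app()
        while self.peek() in (("sym", "+"), ("sym", "-")):
            op = self.eat("sym")
            left = ("ADD" if op == "+" else "SUB", left, self.app())
        return left

    def app(self):  # read here as: atomic { atomic }, folded to the left
        func = self.atomic()
        while self.peek()[0] in ("int", "var") or self.peek() in ATOM_STARTS:
            func = ("APP", func, self.atomic())
        return func

    def atomic(self):  # int | var | bool | ( expr ) | let val var = expr in expr end
        kind, text = self.peek()
        if kind == "int":
            return ("INT", int(self.eat("int")))
        if kind == "var":
            return ("VAR", self.eat("var"))
        if (kind, text) in (("kw", "true"), ("kw", "false")):
            self.eat("kw")
            return ("BOOL", text == "true")
        if (kind, text) == ("sym", "("):
            self.eat("sym", "(")
            inner = self.expr()
            self.eat("sym", ")")
            return inner
        if (kind, text) == ("kw", "let"):
            self.eat("kw", "let")
            self.eat("kw", "val")
            name = self.eat("var")
            self.eat("sym", "=")
            value = self.expr()
            self.eat("kw", "in")
            body = self.expr()
            self.eat("kw", "end")
            return ("LET", name, value, body)
        raise SyntaxError(f"unexpected {text or 'end of input'}")


def parse(src):
    parser = Parser(lex(src))
    tree = parser.expr()
    parser.eat("eof")  # reject trailing tokens
    return tree
```

Each method corresponds to one rule of the grammar, and the chain expr -> infix -> app -> atomic is what gives function application a higher precedence than + and -: parse("double 1 + 2") builds the application first and the addition around it.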

Slide 148

Slide 148 text

github.com/igstan/itake-2016

Slide 149

Slide 149 text

Homework
• Write a lexer for JSON
• Write a recursive descent parser for JSON
• It's way easier than today's vehicle language
• I promise!
• Specification: json.org

Slide 150

Slide 150 text

Thank You!

Slide 151

Slide 151 text

Questions!