Slide 1

Slide 1 text

One Parser to Rule Them All
Ali Afroozeh, Anastasia Izmaylova
afruze, IAnastassija

Slide 2

Slide 2 text

Isn’t parsing a solved problem?

Slide 3

Slide 3 text

Syntax Definition

%token Num
E : E '*' E
  | E '+' E
  | Num
  ;

Slide 4

Slide 4 text

Syntax Definition

%token Num
E : E '*' E
  | E '+' E
  | Num
  ;

State 0
  0 $accept: . E $end
State 1
  3 E: Num .
State 2
  0 $accept: E . $end
  1 E: E . '*' E
  2  | E . '+' E
State 3
  0 $accept: E $end .
State 4
  1 E: E '*' . E
State 5
  2 E: E '+' . E
State 6
  1 E: E . '*' E
  1  | E '*' E .
  2  | E . '+' E
State 7
  1 E: E . '*' E
  2  | E . '+' E
  2  | E '+' E .

(Diagram: LR automaton over these states, with transitions labelled Num, E, '*', '+', $end, reduce actions R1, R2, R3, and the accept action Acc.)

Slide 5

Slide 5 text

Syntax Definition

E ::= E '*' E
    | E '+' E
    | Num

Slide 6

Slide 6 text

Syntax Definition

+ left
* left

E ::= E '*' E
    | E '+' E
    | Num

Slide 7

Slide 7 text

Syntax Definition

YACC:
  %left '+'
  %left '*'
  E : E '*' E
    | E '+' E
    | Num
    ;

ANTLR:
  E : E '*' E
    | E '+' E
    | Num
    ;

GLR/GLL:
  E ::= E '*' E left
      > E '+' E left
      | Num
      ;

+ left
* left

E ::= E '*' E
    | E '+' E
    | Num
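The "+ left, * left" declarations say that both operators are left-associative and that '*' binds tighter than '+'. One common way to realize such a table in a hand-written parser is precedence climbing. The following is a minimal sketch of that technique, not the mechanism used by YACC, ANTLR, or GLR/GLL parsers; the names `BINARY`, `tokenize`, `parse_expr`, and `parse` are hypothetical.

```python
import re

# Hypothetical table mirroring "+ left" and "* left", with '*' binding tighter.
BINARY = {"+": (1, "left"), "*": (2, "left")}

def tokenize(text):
    # Num tokens and the two operators; everything else is skipped.
    return re.findall(r"\d+|[+*]", text)

def parse_expr(tokens, pos=0, min_prec=1):
    """Precedence climbing: returns (tree, next position)."""
    left = int(tokens[pos])              # E ::= Num
    pos += 1
    while pos < len(tokens) and tokens[pos] in BINARY:
        op = tokens[pos]
        prec, assoc = BINARY[op]
        if prec < min_prec:
            break
        # For a left-associative operator the right operand must bind tighter.
        next_min = prec + 1 if assoc == "left" else prec
        right, pos = parse_expr(tokens, pos + 1, next_min)
        left = (op, left, right)
    return left, pos

def parse(text):
    tree, _ = parse_expr(tokenize(text))
    return tree
```

With this table, `parse("1+2*3")` groups the multiplication first, and `parse("1+2+3")` groups to the left.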

Slide 8

Slide 8 text

Context sensitivity

Slide 9

Slide 9 text

Layout sensitivity

g x = case x of
        0 -> 1
        _ -> x + 2
               + 3

Slide 10

Slide 10 text

Layout sensitivity

g x = case x of
        0 -> 1
        _ -> x + 2
               + 3

Slide 11

Slide 11 text

Layout sensitivity

g x = case x of
        0 -> 1
        _ -> x + 2
               + 3

Slide 12

Slide 12 text

Layout sensitivity

align:

g x = case x of
        0 -> 1
        _ -> x + 2
               + 3

Slide 13

Slide 13 text

Layout sensitivity

offside:

g x = case x of
        0 -> 1
        _ -> x + 2
               + 3

Slide 14

Slide 14 text

Layout sensitivity

g x = case x of
        0 -> 1
        _ -> x + 2
               + 3

g 0 => 1

Slide 15

Slide 15 text

Layout sensitivity

g x = case x of
        0 -> 1
        _ -> x + 2
    + 3

g 0 => 4

Slide 16

Slide 16 text

Layout sensitivity

g x = case x of
        0 -> 1
        _ -> x + 2
    + 3

g 0 => 4

Slide 17

Slide 17 text

Typedefs in C

typedef int T;
main() {
  int n = (T)+1;
}

Slide 18

Slide 18 text

Typedefs in C

typedef int T;
main() {
  int n = (T)+1;
}

n => 1

Slide 19

Slide 19 text

Typedefs in C

typedef int T;
main() {
  int T = 1;
  int n = (T)+1;
}

n => 2
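The two results come from the same token sequence `(T)+1`: when T names a type it is a cast applied to unary plus, and when T names a variable it is a binary addition. A minimal sketch of that decision, assuming a hypothetical symbol table that maps each identifier to either a type entry or a variable with a value:

```python
def eval_paren_plus_one(name, scope):
    """Evaluate the C expression `(name)+1` under a symbol table.

    `scope` is a hypothetical mapping: identifier -> ("type", None) or ("var", value).
    """
    kind, value = scope[name]
    if kind == "type":
        # (T)+1 reads as a cast of unary plus: (T)(+1)
        return +1
    # (T)+1 reads as a binary addition: T + 1
    return value + 1

# typedef int T;
outer = {"T": ("type", None)}
# int T = 1; inside main() shadows the typedef
inner = dict(outer, T=("var", 1))
```

Here `eval_paren_plus_one("T", outer)` gives 1 and `eval_paren_plus_one("T", inner)` gives 2, matching the two slides. A real C parser has to consult this kind of scope information while parsing, which is what makes the grammar context-sensitive.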

Slide 20

Slide 20 text

Conditional directives

void test() {
#if Debug
  System.Console.WriteLine("Debug");
}
#else
}
#endif

Slide 22

Slide 22 text

Operator Precedence

Slide 23

Slide 23 text

Operator Precedence
Syntax1

Slide 24

Slide 24 text

Operator Precedence
Syntax1
Impl1

Slide 25

Slide 25 text

Operator Precedence, Indentation rules
Syntax1
Impl1

Slide 26

Slide 26 text

Operator Precedence, Indentation rules
Syntax1, Syntax2
Impl1

Slide 27

Slide 27 text

Operator Precedence, Indentation rules
Syntax1, Syntax2
Impl1, Impl2

Slide 28

Slide 28 text

Operator Precedence, Indentation rules, Conditional directives, Lexical filters
Syntax1, Syntax2, Syntax3
Impl1, Impl2, Impl3, Impl4

Slide 29

Slide 29 text

Operator Precedence, Indentation rules, Conditional directives, Lexical filters
Syntax1, Syntax2, Syntax3
Impl1, Impl2, Impl3, Impl4
Parse table modification, GSS modification, custom lexer, custom preprocessor, constraint solving

Slide 30

Slide 30 text

Data-dependent grammars

Slide 31

Slide 31 text

Data-dependent grammars
User-defined, declarative meta constructs

Slide 32

Slide 32 text

Data-dependent grammars
User-defined, declarative meta constructs: Lexical filters

Slide 33

Slide 33 text

Data-dependent grammars
User-defined, declarative meta constructs: Operator Precedence, Lexical filters

Slide 34

Slide 34 text

Data-dependent grammars
User-defined, declarative meta constructs: Operator Precedence, Indentation rules, Lexical filters

Slide 35

Slide 35 text

Data-dependent grammars
User-defined, declarative meta constructs: Operator Precedence, Indentation rules, Conditional directives, Lexical filters

Slide 36

Slide 36 text

Data-dependent grammars
User-defined, declarative meta constructs: Operator Precedence, Indentation rules, Conditional directives, Lexical filters

Slide 37

Slide 37 text

Data-dependent grammars
Modified GLL parsing
User-defined, declarative meta constructs: Operator Precedence, Indentation rules, Conditional directives, Lexical filters

Slide 38

Slide 38 text

Data-dependent grammars
Modified GLL parsing
User-defined, declarative meta constructs: Operator Precedence, Indentation rules, Conditional directives, Lexical filters

Slide 39

Slide 39 text

Data-dependent Grammars

Slide 40

Slide 40 text

Data-dependent Grammars

List of octets:
  Octets ::= Octets Octet
           | ϵ

Slide 41

Slide 41 text

Data-dependent Grammars

List of octets:
  Octets ::= Octets Octet
           | ϵ

Fixed list of octets:
  Octets(n) ::= [n > 0] Octets(n - 1) Octet
              | [n == 0] ϵ

Slide 42

Slide 42 text

Data-dependent Grammars

List of octets:
  Octets ::= Octets Octet
           | ϵ

Fixed list of octets:
  Octets(n) ::= [n > 0] Octets(n - 1) Octet
              | [n == 0] ϵ

Example input: ~{6}aaaaaa

Slide 43

Slide 43 text

Data-dependent Grammars

List of octets:
  Octets ::= Octets Octet
           | ϵ

Fixed list of octets:
  Octets(n) ::= [n > 0] Octets(n - 1) Octet
              | [n == 0] ϵ

  L8 ::= '~{' nm:Number {n=toInt(nm.yield)} '}' Octets(n)

Example input: ~{6}aaaaaa

Slide 44

Slide 44 text

Data-dependent Grammars

List of octets:
  Octets ::= Octets Octet
           | ϵ

Fixed list of octets:
  Octets(n) ::= [n > 0] Octets(n - 1) Octet
              | [n == 0] ϵ

  L8 ::= '~{' nm:Number {n=toInt(nm.yield)} '}' Octets(n)

Example input: ~{6}aaaaaa

Slide 45

Slide 45 text

Data-dependent Grammars

List of octets:
  Octets ::= Octets Octet
           | ϵ

Fixed list of octets:
  Octets(n) ::= [n > 0] Octets(n - 1) Octet
              | [n == 0] ϵ

  L8 ::= '~{' nm:Number {n=toInt(nm.yield)} '}' Octets(n)

Example input: ~{6}aaaaaa

Slide 46

Slide 46 text

Data-dependent Grammars

List of octets:
  Octets ::= Octets Octet
           | ϵ

Fixed list of octets:
  Octets(n) ::= [n > 0] Octets(n - 1) Octet
              | [n == 0] ϵ

  L8 ::= '~{' nm:Number {n=toInt(nm.yield)} '}' Octets(n)

Example input: ~{6}aaaaaa
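The L8 rule reads a number, binds it to n, and then passes n to Octets(n), whose conditions [n > 0] and [n == 0] force exactly n octets to be consumed. A minimal hand-written sketch of that behaviour follows; it is not generated from the grammar, the function names are hypothetical, and it assumes each character of the input string stands for one octet.

```python
import re

def octets(text, pos, n):
    """Octets(n): returns the end position after exactly n octets, or None."""
    if n == 0:                           # [n == 0] ϵ
        return pos
    end = octets(text, pos, n - 1)       # [n > 0] Octets(n - 1) ...
    if end is None or end >= len(text):
        return None                      # no octet left to consume
    return end + 1                       # ... Octet

def parse_l8(text):
    """L8 ::= '~{' nm:Number {n=toInt(nm.yield)} '}' Octets(n)"""
    m = re.match(r"~\{(\d+)\}", text)
    if m is None:
        return None
    n = int(m.group(1))                  # n = toInt(nm.yield)
    end = octets(text, m.end(), n)
    if end != len(text):                 # require the whole input to be consumed
        return None
    return text[m.end():end]
```

For the slide's input, `parse_l8("~{6}aaaaaa")` returns the six octets, while an input with five or seven trailing characters is rejected.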

Slide 47

Slide 47 text

Operator precedence

E ::= '-' E
    > E '*' E left
    > E '+' E left
    | 'if' E 'then' E 'else' E
    | 'a'

Slide 48

Slide 48 text

Operator precedence

E ::= '-' E
    > E '*' E left
    > E '+' E left
    | 'if' E 'then' E 'else' E
    | 'a'

E(l,r) ::= [4 >= l] '-' E(l,4)
         | [3 >= r, 3 >= l] E(3,3) '*' E(l,4)
         | [2 >= r, 2 >= l] E(2,2) '+' E(l,3)
         | [1 >= l] 'if' E(0,0) 'then' E(0,0) 'else' E(0,0)
         | 'a'

Slide 49

Slide 49 text

Haskell indentation rules

Decls ::= align (offside Decl)*
        | ignore ('{' Decl (';' Decl)* '}')
Decl  ::= FunLHS RHS
RHS   ::= '=' Exp 'where' Decls

Slide 50

Slide 50 text

Haskell indentation rules

Decls ::= align (offside Decl)*
        | ignore ('{' Decl (';' Decl)* '}')
Decl  ::= FunLHS RHS
RHS   ::= '=' Exp 'where' Decls

Decls    ::= a0:Star1(a0.l)
           | ignore ('{' Decl Star2 '}')
Star1(v) ::= Plus1(v)
           | ϵ
Plus1(v) ::= offside a1:Decl [col(a1.l) == col(v)]
           | Plus1(v) offside a1:Decl [col(a1.l) == col(v)]
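The condition [col(a1.l) == col(v)] requires every declaration to start in the same column as the first one (align), while offside requires the rest of a declaration to sit strictly to the right of its starting column. A minimal sketch of these two column checks on plain lines of text follows. It is an illustration of the constraints, not the desugared grammar or the parser's mechanism; `indent` and `split_block` are hypothetical names, and indentation is assumed to use spaces only.

```python
def indent(line):
    # Column of the first non-space character.
    return len(line) - len(line.lstrip(" "))

def split_block(lines):
    """Group lines into declarations using the align/offside column checks.

    Returns (declarations, remaining lines after the block ends).
    """
    lines = [l for l in lines if l.strip()]
    if not lines:
        return [], []
    col = indent(lines[0])               # column every declaration aligns to
    decls = []
    for i, line in enumerate(lines):
        c = indent(line)
        if c == col:                     # align: a new declaration starts
            decls.append([line])
        elif c > col:                    # offside: continues the current one
            decls[-1].append(line)
        else:                            # shallower: the block ends here
            return decls, lines[i:]
    return decls, []
```

Applied to the case alternatives of the layout-sensitivity example, a deeply indented `+ 3` stays inside the last alternative, while a `+ 3` indented less than the alternatives ends the block and is returned as remaining input.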

Slide 51

Slide 51 text

C# conditional directives

Skipped ::= Part+
Part    ::= PpCond | ...
PpCond  ::= PpIf PpElif* PpElse? PpEndif
PpIf    ::= '#' 'if' PpExp PpNL Skipped?
PpElse  ::= '#' 'else' PpNL Skipped?
PpEndif ::= '#' 'endif' PpNL

Slide 52

Slide 52 text

C# conditional directives

global ds = {}

LAYOUT  ::= (Whitespace | Comment | Decl | If | Gbg)*
Decl    ::= '#' 'define' id:Id {ds=put(ds,id.yield,true)} PpNL
          | '#' 'undef' id:Id {ds=put(ds,id.yield,false)} PpNL
If      ::= '#' 'if' v=Exp(ds) [v] ? LAYOUT : (Skipped (Elif|Else|PpEndif))
Else    ::= '#' 'else' LAYOUT
Gbg     ::= GbgElif* GbgElse? '#' 'endif'
GbgElse ::= '#' 'else' Skipped
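The grammar threads a global table ds of defined symbols through the layout: '#define' and '#undef' update it, and '#if' evaluates its condition against it to decide which branch is parsed and which is skipped. A minimal line-based sketch of that idea follows. It is a simplification rather than the grammar's semantics: it assumes each directive is written as a single word such as `#if`, that the condition is one symbol, and that there is no `#elif`; `active_lines` is a hypothetical name.

```python
def active_lines(source):
    """Return the source lines that survive conditional compilation."""
    ds = {}          # plays the role of `global ds = {}`
    stack = []       # (enclosing_active, branch_taken) for each open #if
    active = True
    kept = []
    for line in source.splitlines():
        words = line.split()
        head = words[0] if words else ""
        if head == "#define" and active:
            ds[words[1]] = True
        elif head == "#undef" and active:
            ds[words[1]] = False
        elif head == "#if":
            taken = active and ds.get(words[1], False)
            stack.append((active, taken))
            active = taken
        elif head == "#else":
            enclosing, taken = stack[-1]
            active = enclosing and not taken
        elif head == "#endif":
            active, _ = stack.pop()
        elif active:
            kept.append(line)
    return kept
```

On the earlier `void test()` example, either branch leaves exactly one closing brace, so the surviving text is well-formed whether or not `Debug` is defined.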

Slide 53

Slide 53 text

Current results

Language  Files  Success
Java      8067   100%
C#        5839   99%
Haskell   6457   72%

Slide 54

Slide 54 text

Current results

Plots (one panel per language): CPU time (milliseconds) in log10 against size (#characters) in log10, each with a regression line.
  C#:      y = 1.098 x − 3,     R2 = 0.9821
  Haskell: y = 0.95 x − 1.616,  R2 = 0.95
  Java:    y = 1.212 x − 3.181, R2 = 0.9395

Language  Files  Success
Java      8067   100%
C#        5839   99%
Haskell   6457   72%
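Because both axes are logarithmic, a regression line y = a x + b corresponds to time = 10^b · size^a, so a slope close to 1 indicates running time that grows close to linearly with input size. The sketch below makes that conversion explicit. It assumes y is log10 of CPU time in milliseconds and x is log10 of the size in characters, as the axis labels suggest; `FITS` and `predicted_ms` are hypothetical names, and the numbers are only the fitted lines from the slide, not measurements.

```python
import math

# (slope, intercept) of the regression lines as printed on the slide.
FITS = {
    "C#": (1.098, -3.0),
    "Haskell": (0.95, -1.616),
    "Java": (1.212, -3.181),
}

def predicted_ms(language, size_chars):
    """CPU time in milliseconds predicted by the fitted line for a file size."""
    slope, intercept = FITS[language]
    return 10 ** (slope * math.log10(size_chars) + intercept)
```

Under these fits, doubling the input size multiplies the predicted time by 2 raised to the slope, for example by about 2^1.212 for Java.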

Slide 55

Slide 55 text

https://github.com/iguana-parser/iguana