
One Parser to Rule Them All

Anastasia Izmaylova

October 29, 2015

Transcript

  1. Syntax Definition

    %token Num
    E : E '*' E
      | E '+' E
      | Num
      ;

    State 0    0 $accept: . E $end
    State 1    3 E: Num .
    State 2    0 $accept: E . $end
               1 E: E . '*' E
               2  | E . '+' E
    State 3    0 $accept: E $end .
    State 4    1 E: E '*' . E
    State 5    2 E: E '+' . E
    State 6    1 E: E . '*' E
               1  | E '*' E .
               2  | E . '+' E
    State 7    1 E: E . '*' E
               2  | E . '+' E
               2  | E '+' E .

    (Automaton diagram: transitions labelled Num, E, '*', '+', $end; actions R1, R2, R3, Acc.)
  2. Syntax Definition

    YACC:
      %left '+'
      %left '*'
      E : E '*' E
        | E '+' E
        | Num
        ;

    ANTLR:
      E : E '*' E
        | E '+' E
        | Num
        ;

    GLR/GLL:
      E ::= E '*' E  left
          > E '+' E  left
          | Num
          ;

      + left
      * left
      E ::= E '*' E
          | E '+' E
          | Num
  3. Layout sensitivity

    g x = case x of
            0 -> 1
            _ -> x + 2
                   + 3

    g 0 => 1
  4. Layout sensitivity

    g x = case x of
            0 -> 1
            _ -> x + 2
          + 3

    g 0 => 4
  6. Typedefs in C

    typedef int T;
    main() {
      int T = 1;
      int n = (T)+1;
    }

    n => 2
  7. (Diagram)

    Features: Operator precedence, Indentation rules, Conditional directives, Lexical filters
    Syntax: Syntax1, Syntax2, Syntax3
    Implementations: Impl1, Impl2, Impl3, Impl4
    Techniques: parse table modification, GSS modification, custom lexer, custom preprocessor, constraint solving
  8. (Diagram)

    Features: Operator precedence, Indentation rules, Conditional directives, Lexical filters
    User-defined, declarative meta constructs
    Data-dependent grammars
    Modified GLL parsing
  10. Data-dependent Grammars

    List of octets:
      Octets ::= Octets Octet
               | ϵ

    Fixed list of octets:
      Octets(n) ::= [n > 0] Octets(n - 1) Octet
                  | [n == 0] ϵ
  11. Data-dependent Grammars

    List of octets:
      Octets ::= Octets Octet
               | ϵ

    Fixed list of octets:
      Octets(n) ::= [n > 0] Octets(n - 1) Octet
                  | [n == 0] ϵ

    Example input: ~{6}aaaaaa
  12. Data-dependent Grammars

    List of octets:
      Octets ::= Octets Octet
               | ϵ

    Fixed list of octets:
      Octets(n) ::= [n > 0] Octets(n - 1) Octet
                  | [n == 0] ϵ

    L8 ::= '~{' nm:Number {n=toInt(nm.yield)} '}' Octets(n)

    Example input: ~{6}aaaaaa
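The length-prefixed rule on the data-dependent grammar slides can be imitated by hand to see what the parameter `n` buys. The following is a minimal sketch, not the GLL-based implementation from the talk: the function names `octets` and `literal` are hypothetical, an Octet is assumed to be a single character, and the recursion mirrors the two guarded alternatives of `Octets(n)`.

```python
import re

# Assumption: the length prefix has the shape '~{' Number '}' as in the L8 rule.
PREFIX = re.compile(r"~\{(\d+)\}")

def octets(text, pos, n):
    """Octets(n): return the position after exactly n octets, or None on failure."""
    if n == 0:                      # [n == 0] epsilon
        return pos
    if n > 0:                       # [n > 0] Octets(n - 1) Octet
        end = octets(text, pos, n - 1)
        if end is not None and end < len(text):
            return end + 1          # consume one Octet (assumed: one character)
    return None

def literal(text, pos=0):
    """L8: parse the prefix, bind n from its yield, then parse Octets(n)."""
    match = PREFIX.match(text, pos)
    if match is None:
        return None
    n = int(match.group(1))         # plays the role of {n=toInt(nm.yield)}
    start = match.end()
    end = octets(text, start, n)
    if end is None:
        return None
    return n, text[start:end], end
```

For the example input, `literal("~{6}aaaaaa")` yields the length, the six octets, and the end position; an input with too few octets after the prefix is rejected. The recursive form is kept only to stay close to the grammar; a loop would avoid Python's recursion limit for large `n`.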
  16. Operator precedence

    E ::= '-' E
        > E '*' E  left
        > E '+' E  left
        | 'if' E 'then' E 'else' E
        | 'a'
  17. Operator precedence

    E ::= '-' E
        > E '*' E  left
        > E '+' E  left
        | 'if' E 'then' E 'else' E
        | 'a'

    E(l,r) ::= [4 >= l] '-' E(l,4)
             | [3 >= r, 3 >= l] E(3,3) '*' E(l,4)
             | [2 >= r, 2 >= l] E(2,2) '+' E(l,3)
             | [1 >= l] 'if' E(0,0) 'then' E(0,0) 'else' E(0,0)
             | 'a'
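To get a feel for how numeric levels passed as arguments resolve precedence, here is a conventional precedence-climbing parser. It is a swapped-in technique, not the data-dependent GLL translation shown on the slide: it threads a single minimum level instead of the pair `(l, r)`. The levels 4, 3 and 2 are taken from the guards above; the function name `parse`, the whitespace-separated token format and the tuple-shaped trees are assumptions made for illustration.

```python
BINARY = {"*": 3, "+": 2}   # levels taken from the guards on the slide
UNARY_MINUS = 4             # '-' binds tighter than both binary operators

def parse(source):
    """Parse a whitespace-separated expression into nested tuples."""
    tokens = source.split()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r} at token {pos}, found {peek()!r}")
        pos += 1

    def expr(min_level):
        nonlocal pos
        tok = peek()
        if tok == "a":
            pos += 1
            left = "a"
        elif tok == "-":
            pos += 1
            left = ("-", expr(UNARY_MINUS))
        elif tok == "if":
            pos += 1
            cond = expr(0)
            expect("then")
            then_branch = expr(0)
            expect("else")
            # The else branch extends as far to the right as possible.
            left = ("if", cond, then_branch, expr(0))
        else:
            raise SyntaxError(f"unexpected token {tok!r} at position {pos}")
        # Only operators at or above the requested level may extend `left`.
        while peek() in BINARY and BINARY[peek()] >= min_level:
            op = tokens[pos]
            pos += 1
            right = expr(BINARY[op] + 1)   # level + 1 gives left associativity
            left = (op, left, right)
        return left

    tree = expr(0)
    if pos != len(tokens):
        raise SyntaxError(f"trailing input at token {pos}")
    return tree
```

With this sketch, `parse("a + a * a")` groups the multiplication first, `parse("a + a + a")` associates to the left, and `parse("- a * a")` applies the unary minus before the multiplication.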
  18. Haskell indentation rules

    Decls ::= align (offside Decl)*
            | ignore ('{' Decl (';' Decl)* '}')
    Decl  ::= FunLHS RHS
    RHS   ::= '=' Exp 'where' Decls
  19. Haskell indentation rules

    Decls ::= align (offside Decl)*
            | ignore ('{' Decl (';' Decl)* '}')
    Decl  ::= FunLHS RHS
    RHS   ::= '=' Exp 'where' Decls

    Decls    ::= a0:Star1(a0.l)
               | ignore ('{' Decl Star2 '}')
    Star1(v) ::= Plus1(v)
               | ϵ
    Plus1(v) ::= offside a1:Decl [col(a1.l) == col(v)]
               | Plus1(v) offside a1:Decl [col(a1.l) == col(v)]
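The `align` and `offside` constraints can be approximated on plain lines of text to show what they enforce. The sketch below is a line-based approximation under stated assumptions, not the desugaring from the slide: `indent_of` and `split_decls` are hypothetical helpers, indentation is assumed to use spaces only, and the column of the first non-blank line plays the role of `col(v)`.

```python
def indent_of(line):
    """Column of the first non-space character (spaces only, no tabs)."""
    return len(line) - len(line.lstrip(" "))

def split_decls(lines):
    """Group lines into declarations; return (decls, remaining_lines).

    align:   every declaration starts in the column of the first one.
    offside: a line indented further than that column continues the
             current declaration.
    A line indented less than the alignment column ends the block.
    """
    decls = []
    align_col = None
    for index, line in enumerate(lines):
        if not line.strip():
            continue                       # blank lines carry no layout information
        col = indent_of(line)
        if align_col is None:
            align_col = col                # first declaration fixes the column
        if col == align_col:
            decls.append([line.strip()])   # aligned: a new declaration starts
        elif col > align_col:
            decls[-1].append(line.strip()) # offside: continuation line
        else:
            return decls, lines[index:]    # dedent closes the block
    return decls, []
```

For example, given the lines `"  f x = y"`, `"    where y = x"`, `"  g = 1"`, `"h = 2"`, the first three lines form two aligned declarations and the last line is left for the enclosing block.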
  20. C# conditional directives

    Skipped ::= Part+
    Part    ::= PpCond
              | ...
    PpCond  ::= PpIf PpElif* PpElse? PpEndif
    PpIf    ::= '#' 'if' PpExp PpNL Skipped?
    PpElse  ::= '#' 'else' PpNL Skipped?
    PpEndif ::= '#' 'endif' PpNL
  21. C# conditional directives

    global ds = {}

    LAYOUT  ::= (Whitespace | Comment | Decl | If | Gbg)*
    Decl    ::= '#' 'define' id:Id {ds=put(ds,id.yield,true)} PpNL
              | '#' 'undef' id:Id {ds=put(ds,id.yield,false)} PpNL
    If      ::= '#' 'if' v=Exp(ds) [v] ? LAYOUT : (Skipped (Elif|Else|PpEndif))
    Else    ::= '#' 'else' LAYOUT
    Gbg     ::= GbgElif* GbgElse? '#' 'endif'
    GbgElse ::= '#' 'else' Skipped
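The role of the global table `ds` can be illustrated with a small line-based filter that keeps only the source lines in active branches. This is a sketch under simplifying assumptions, not the grammar-level treatment from the slide: the function name `active_lines` is hypothetical, each directive sits on its own line, and a `#if` or `#elif` condition is assumed to be a single symbol rather than a full expression.

```python
def active_lines(source):
    """Return the non-directive lines that lie in active conditional branches."""
    ds = {}          # plays the role of the global `ds` table
    stack = []       # one (enclosing_active, branch_taken) pair per open #if
    active = True
    kept = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped.startswith("#"):
            if active:
                kept.append(line)
            continue
        parts = stripped[1:].split()
        keyword = parts[0] if parts else ""
        symbol = parts[1] if len(parts) > 1 else None
        if keyword == "define" and active:
            ds[symbol] = True
        elif keyword == "undef" and active:
            ds[symbol] = False
        elif keyword == "if":
            taken = active and ds.get(symbol, False)
            stack.append((active, taken))
            active = taken
        elif keyword == "elif":
            enclosing, taken = stack[-1]
            active = enclosing and not taken and ds.get(symbol, False)
            stack[-1] = (enclosing, taken or active)
        elif keyword == "else":
            enclosing, taken = stack[-1]
            active = enclosing and not taken
            stack[-1] = (enclosing, True)
        elif keyword == "endif":
            enclosing, _ = stack.pop()
            active = enclosing
    return kept
```

Directives inside a skipped branch do not update `ds`, which corresponds to treating that region as `Skipped` instead of `LAYOUT`.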
  22. Current results

    Plots (one panel per language): CPU time (milliseconds) in log10 against size (#characters) in log10, each with a regression line.

      C#:       y = 1.098 x − 3      R² = 0.9821
      Haskell:  y = 0.95 x − 1.616   R² = 0.95
      Java:     y = 1.212 x − 3.181  R² = 0.9395

    Language   Files   Success
    Java       8067    100%
    C#         5839    99%
    Haskell    6457    72%
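Because both axes are logarithmic, each regression line corresponds to a power law in the original units. As a worked reading, using the first regression line in the transcript (y = 1.098 x − 3) and writing s for size in characters and t for CPU time in milliseconds (symbols introduced here, not on the slide):

```latex
\begin{align*}
\log_{10} t &= a \,\log_{10} s + b \\
t &= 10^{b}\, s^{a} \\
\text{with } a = 1.098,\ b = -3,\ s = 10^{5}:\quad
t &= 10^{1.098 \cdot 5 - 3} = 10^{2.49} \approx 3.1 \times 10^{2}\ \text{ms}
\end{align*}
```

A slope a close to 1 therefore suggests running time that grows roughly linearly with input size on these files; the reported slopes lie between 0.95 and 1.212.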