Safe Specification of Operator Precedence Rules

Ali Afroozeh
October 27, 2013

Presented at Software Language Engineering (SLE) 2013 in Indianapolis, US.

Event link: http://splashcon.org/2013/program/932

For more information see the Iguana Parsing Framework: http://iguana-parser.github.io

Transcript

  1. Safe specification of operator precedence rules
    Ali Afroozeh, Mark van den Brand, Adrian Johnstone, Elizabeth Scott, Jurgen Vinju
  2. Parsing technology: compiler construction, DSLs, reverse engineering; correctness, speed, natural grammars, modularity
  3. Parsing technology: compiler construction, DSLs, reverse engineering; correctness, speed, natural grammars, modularity
  4. Parsing technology: compiler construction, DSLs, reverse engineering; correctness, speed, natural grammars, modularity. General parsing technology
  5. Declarative parser generation: EBNF context-free syntax, disambiguation rules, lexical definitions; correct and efficient parser
  6. From reference manuals to parsers

  7. E ::= T E1
    E1 ::= + T E1 | ε
    T ::= F T1
    T1 ::= * F T1 | ε
    F ::= ( E ) | a

    E ::= E * E | E + E | ( E ) | a
    * left, + left, * > +
  8. 50 alternatives 18 precedence levels

  9. Contributions

  10. Contributions •Safe

  11. Contributions •Safe •Correct

  12. Contributions •Safe •Correct •Technology Independent

  13. Contributions •Safe •Correct •Technology Independent operator precedence rules from reference manuals.
  14. Terms

  15. Terms •General(ized) parsing

  16. Terms •General(ized) parsing •GLL

  17. Terms •General(ized) parsing •GLL •Yacc

  18. Terms •General(ized) parsing •GLL •Yacc •SDF

  19. Terms •General(ized) parsing •GLL •Yacc •SDF •OCaml

  20. Terms Technologies •General(ized) parsing •GLL •Yacc •SDF •OCaml

  21. Terms Case study •General(ized) parsing •GLL •Yacc •SDF •OCaml

  22. Problem Description

  23. A simple expression grammar
    E ::= E + E | - E | a
  24. A simple expression grammar
    E ::= E + E | - E | a
    Where + is left associative and + > -
    a + a + a: ((a + a) + a), (a + (a + a))
    - a + a: (-(a + a)), ((-a) + a)
  25. A simple expression grammar
    E ::= E + E | - E | a
    Where + is left associative and + > -
    a + a + a: ((a + a) + a), (a + (a + a))
    - a + a: (-(a + a)), ((-a) + a)
  26. None
  27. None
  28. A ::= β > B ::= α
    No B in β should derive α.
    E ::= E + E > E ::= - E
    - a + a
  29. A ::= β > B ::= α
    No B in β should derive α.
    E ::= E + E > E ::= - E
    - a + a: ((-E) + E), (-(E + E))
  30. A ::= β > B ::= α
    No B in β should derive α.
    E ::= E + E > E ::= - E
    - a + a: ((-E) + E), (-(E + E))
  31. A ::= β > B ::= α
    No B in β should derive α.
  32. A ::= β > B ::= α
    No B in β should derive α.
    E ::= E + E > E ::= - E
    a + - a
  33. A ::= β > B ::= α
    No B in β should derive α.
    E ::= E + E > E ::= - E
    a + - a: (E + (-E))
  34. Safety: a disambiguation mechanism is safe if it does not change the underlying language.
  35. a + - a + a: ((E + (-E)) + E), (E + (-(E + E))), (E + ((-E) + E))
    One level filtering
  36. a + - a + a: ((E + (-E)) + E), (E + (-(E + E))), (E + ((-E) + E))
    One level filtering
  37. ((E + (-E)) + E)
  38. Problems with current solutions •It is not safe •It is applied at one level only
  39. Real world example
    1 + if true then 1 + 3 else 4 + 5
  40. Real world example
    1 + (if true then 1 + 3 else (4 + 5))
    1 + (if true then 1 + 3 else 4) + 5
  41. Real world example
    1 + (if true then 1 + 3 else (4 + 5))
    1 + (if true then 1 + 3 else 4) + 5
  42. Our solution

  43. Operator style ambiguity
    E ::= E α | β E
    E ⇒ Eα ⇒ βEα: (βE)α
    E ⇒ βE ⇒ βEα: β(Eα)
  44. Operator style ambiguity
    E ::= E + E | - E | a
    We only filter left and right recursive ends
  45. Operator style ambiguity
    E ::= E + E | - E | a
    We only filter left and right recursive ends
  46. Support for deeper levels
    E ::= E α | β E | ...
    E ⇒* γE
    E ⇒ Eα ⇒* γEα ⇒ γβEα: (γ(βE))α
    E ⇒* γE ⇒ γβE ⇒ γβEα: γ(β(Eα))
  47. Support for deeper levels
    E ::= E + E | - E | a
    E ⇒ E + E ⇒ E + E + E ⇒ E + - E + E: (E + ((-E) + E))
    E ⇒ E + E ⇒ E + - E ⇒ E + - E + E: (E + (-(E + E)))
  48. Our operator precedence semantics
    E ::= E α | β E
    If Eα > βE, then the marked E in ·Eα should not derive βE.
    Also, no E derivable on the rightmost end, i.e., E ⇒* γE, should derive βE.
  49. Support for indirect recursion
    expr ::= expr "+" expr > expr ::= "try" expr "with" pattern-matching
    pattern-matching ::= "|"? pattern ("when" expr)? -> expr
  50. An Example
    E ::= E + E (left) > - E | a
  51. An Example
    E ::= E + E (left) > - E | a
    (E ::= E + ·E, E ::= E + E)
    (E ::= ·E + E, E ::= - E)
  52. An Example
    E ::= E + E (left) > - E | a
    E1: (E ::= E + ·E, E ::= E + E)
    E2: (E ::= ·E + E, E ::= - E)
  53. An Example
    E ::= E + E (left) > - E | a
    E1: (E ::= E + ·E, E ::= E + E)
    E2: (E ::= ·E + E, E ::= - E)
    E ::= E2 + E1 | - E | a
  54. An Example
    E ::= E2 + E1 | - E | a
  55. An Example
    E ::= E2 + E1 | - E | a
    E1 ::= - E | a
    (E ::= E + ·E, E ::= E + E)
  56. An Example
    E ::= E2 + E1 | - E | a
    E1 ::= - E | a
    E2 ::= E2 + E1 | a
    (E ::= E + ·E, E ::= E + E)
    (E ::= ·E + E, E ::= - E)
  57. An Example
    E ::= E2 + E1 | - E | a
    E1 ::= - E | a
    E2 ::= E2 + E1 | a
  58. An Example
    E ::= E2 + E1 | - E | a
    E1 ::= - E | a
    E2 ::= E2 + E1 | a
  59. An Example
    E ::= E2 + E1 | - E | a
    E1 ::= - E | a
    E2 ::= E2 + E1 | a
    E3 ::= a
  60. An Example
    E ::= E2 + E1 | - E | a
    E1 ::= - E | a
    E2 ::= E2 + E3 | a
    E3 ::= a
  61. Evaluation •Highly ambiguous reference grammar of OCaml •OCaml files from the OCaml test suite
  62. None
  63. Results •229 files in total •215 parse and disambiguate correctly •182 files produce exact ASTs •The failing cases are related to semicolon rules and AST transformations of the OCaml compiler
  64. Conclusions •Safe disambiguation for operator precedence •Supporting arbitrarily deep ambiguity patterns •Implementation by grammar rewriting •Evaluation by parsing OCaml examples against its highly ambiguous reference manual
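
The difference between unsafe one-level filtering (slides 32 to 38) and filtering only at recursive ends, at any depth (slides 44 to 48), can be sketched in Python. This is an illustration under assumptions, not the authors' implementation: the talk rewrites the grammar, whereas this sketch filters already-built parse trees. The tuple encoding of trees and the names `all_trees`, `one_level_ok`, and `safe_ok` are hypothetical, and `safe_ok` is one reading of the slides for the grammar E ::= E + E | - E | a with + left associative and + > -.

```python
from functools import lru_cache


def all_trees(tokens):
    """Enumerate every parse tree of tokens under E ::= E + E | - E | a."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def parse(i, j):
        trees = []
        if j - i == 1 and tokens[i] == "a":
            trees.append("a")                      # E ::= a
        if j - i >= 2 and tokens[i] == "-":
            trees += [("-", t) for t in parse(i + 1, j)]   # E ::= - E
        for k in range(i + 1, j - 1):              # E ::= E + E, split at each +
            if tokens[k] == "+":
                trees += [("+", l, r)
                          for l in parse(i, k) for r in parse(k + 1, j)]
        return tuple(trees)

    return list(parse(0, len(tokens)))


def subtrees(tree):
    """Yield the tree and all of its descendants."""
    yield tree
    if isinstance(tree, tuple):
        for child in tree[1:]:
            yield from subtrees(child)


def is_op(tree, op):
    return isinstance(tree, tuple) and tree[0] == op


def one_level_ok(tree):
    """Unsafe variant: no direct child of + may be - E, in any position."""
    for node in subtrees(tree):
        if is_op(node, "+"):
            left, right = node[1], node[2]
            if is_op(right, "+") or is_op(left, "-") or is_op(right, "-"):
                return False
    return True


def ends_in_minus(tree):
    """True if - E is reachable along the rightmost end of the tree."""
    while is_op(tree, "+"):
        tree = tree[2]
    return is_op(tree, "-")


def safe_ok(tree):
    """Assumed safe variant: filter only the recursive ends, at any depth."""
    for node in subtrees(tree):
        if is_op(node, "+"):
            # Left associativity: the right operand must not be E + E.
            # Priority: the left operand must not end in - E.
            if is_op(node[2], "+") or ends_in_minus(node[1]):
                return False
    return True
```

Under these assumptions, `one_level_ok` rejects the only parse tree of `a + - a`, so the sentence drops out of the language, while `safe_ok` keeps exactly one tree for each of `a + - a`, `- a + a`, `a + a + a`, and `a + - a + a`.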