Monadic Parsing in Python

Oleksii Kachaiev

June 06, 2014

Transcript

  1. Monadic Parsing in Python Alexey Kachayev, 2014

  2. About me • CTO at Attendify.com • Erlang, Clojure, Go, Haskell • Fn.py library author • CPython & Storm contributor
  3. Find me •@kachayev •github.com/kachayev •kachayev <$> gmail.com

  4. Topic

  5. Will talk •What is "parsing(ers)" •Approaches •Monadic parsing from scratch •More…
  6. Will talk •Less about theory •Much more about practice

  7. Won’t talk •What "monad" is •Why FP is cool (*) * you’ll understand it by yourself
  8. Parsing

  9. Definition •Takes grammar •Takes input string (?) •Returns tree (??) or an error
  10. None
  11. For PL creators only?

  12. Tasks • Processing information from logs • Source code analysis • DSLs • Protocols & data formats • … and more
  13. Approaches

  14. Production rule S → SS|(S)|()

  15. Grammar
    block = ["const" ident "=" number {"," ident "=" number} ";"]
            ["var" ident {"," ident} ";"]
            {"procedure" ident ";" block ";"} statement
    expression = ["+"|"-"] term {("+"|"-") term}
    term = factor {("*"|"/") factor}
    factor = ident | number | "(" expression ")"
    . . . .
  16. In theory •Top-down / bottom-up •Predictive / Backtracking •LL(k), LALR, LR, CYK and others
  17. Manually!

  18. @ wikipedia

  19. Manually •Simple to understand •Hard to maintain •Really boring
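
To make the "manually" option concrete, here is a minimal hand-written recursive-descent recognizer for the production rule from slide 14, S → SS|(S)|(). It is a sketch, not code from the talk: the function names `parse_group`, `parse_sequence` and `is_balanced` are made up for illustration, and the grammar is rewritten as "one or more adjacent groups" to avoid left recursion.

```python
def parse_group(s, pos):
    """Match one "(" ... ")" group starting at pos.

    Returns the index just after the group, or None if there is no group.
    """
    if pos >= len(s) or s[pos] != "(":
        return None
    pos += 1
    # A group may contain a nested sequence of groups, or nothing at all.
    inner = parse_sequence(s, pos)
    if inner is not None:
        pos = inner
    if pos < len(s) and s[pos] == ")":
        return pos + 1
    return None


def parse_sequence(s, pos):
    """Match one or more adjacent groups (the S -> SS case)."""
    end = parse_group(s, pos)
    if end is None:
        return None
    while True:
        following = parse_group(s, end)
        if following is None:
            return end
        end = following


def is_balanced(s):
    """True if the whole string is derivable from S."""
    return parse_sequence(s, 0) == len(s)
```

Even for a one-line grammar the index bookkeeping is easy to get wrong, which is the maintenance cost this slide points at.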

  20. Can we do better?

  21. What we have •Context-free grammars •Formal theory •Well-defined algorithms •Standard grammar notation(s)
  22. So…

  23. Parser generator •1. Parse DSL notation •2. Generate parser code •("any" language)
  24. Parser generator •*PEG* •*Yacc* •ANTLR •… and tens more

  25. Parser generator •Pros •many targeted languages •formalism •performance & optimisations

  26. Parser generator •Cons •another language •limited in features •"compile-time" mostly

  27. Can we do better?

  28. Monadic parsers & combinators

  29. Functional Pearls Monadic Parsing in Haskell @Graham Hutton, @Erik Meijer

  30. Parsec MPC library for Haskell

  31. Parsec •Monadic parser combinator(s) •Works even with context-sensitive, infinite lookahead (LA) grammars •Tens of ports to other langs
  32. None
  33. The Big Idea

  34. Simple type Parser = String → Tree

  35. Compose? type Parser = String → (Tree, String)

  36. Generalize? type Parser a = String → (a, String)

  37. Errors? type Parser a = String → Maybe (a, String)

  38. Or better… type Parser a = String → [(a, String)]
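
In Python the final type from slides 34 to 38 can be read as "a function from a string to a list of (value, remaining input) pairs". The sketch below is an illustration under that reading, not the talk's snippet; the alias `Parser` and the example parser `leading_int` are names chosen here.

```python
from typing import Callable, List, Tuple, TypeVar

A = TypeVar("A")

# type Parser a = String -> [(a, String)]
Parser = Callable[[str], List[Tuple[A, str]]]


def leading_int(s: str) -> List[Tuple[int, str]]:
    """Parse a non-empty run of ASCII digits at the start of s."""
    i = 0
    while i < len(s) and s[i] in "0123456789":
        i += 1
    # An empty list signals failure; a one-element list is a single result.
    return [(int(s[:i]), s[i:])] if i > 0 else []
```

Returning the unconsumed rest of the input is what lets a second parser continue where the first one stopped.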

  39. Let’s try…

  40. Snippets: http://goo.gl/leQIEE

  41. None
  42. None
  43. None
  44. None
  45. … and so?

  46. Expressiveness •[] for error •[s1] for single (predictive) •[s1..sN] for backtracking
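
The three cases can be seen with two tiny helpers. This is a sketch with hypothetical names (`literal`, `either`), assuming the list-of-results representation from the earlier slides.

```python
def literal(prefix):
    """Parser that succeeds if the input starts with the given prefix."""
    def parse(s):
        return [(prefix, s[len(prefix):])] if s.startswith(prefix) else []
    return parse


def either(p, q):
    """Keep every result of both alternatives, in order."""
    return lambda s: p(s) + q(s)


no_result = literal("x")("abc")                       # [] means error
one_result = literal("a")("abc")                      # exactly one way to parse
many_results = either(literal("a"), literal("ab"))("abc")  # two candidates to try
```

A caller that fails on the first candidate in `many_results` can fall back to the second one, which is all that backtracking means in this representation.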
  47. First-class citizen

  48. None
  49. Skip anything…

  50. Recognise digit
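
One possible reading of the last two slides, following the primitives in Hutton and Meijer's paper: a parser that consumes any single character, and a filtered version that accepts only digits. The names `item`, `sat` and `digit` come from that paper's conventions and are assumptions about what the slides showed.

```python
def item(s):
    """Consume any one character; fail on empty input."""
    return [(s[0], s[1:])] if s else []


def sat(predicate):
    """Consume one character only if it satisfies the predicate."""
    def parse(s):
        return [(c, rest) for c, rest in item(s) if predicate(c)]
    return parse


# A digit is just "any character" restricted by a test.
digit = sat(lambda c: c in "0123456789")
```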

  51. Combinators

  52. RegExp •and: "abc" •or: "a | b | c" •Kleene star: "a*"
  53. Derives •a? = "" | a •a+ = aa* •a{2,3} = aa | aaa
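
The three basic operations and the derived ones translate almost literally into functions over parsers. A minimal eager sketch, assuming parsers return lists of (matched text, rest) pairs; all names (`seq`, `alt`, `many`, `optional`, `many1`, `two_or_three`) are chosen here for illustration.

```python
def char(c):
    """Match exactly the character c."""
    return lambda s: [(c, s[1:])] if s[:1] == c else []


def empty(s):
    """Match the empty string without consuming input."""
    return [("", s)]


def seq(p, q):
    """and: run p, then q on what p left over."""
    return lambda s: [(a + b, r2) for a, r1 in p(s) for b, r2 in q(r1)]


def alt(p, q):
    """or: collect the results of both alternatives."""
    return lambda s: p(s) + q(s)


def many(p):
    """Kleene star: p* = p p* | "". Longest match comes first.

    Note: loops forever if p can succeed without consuming input.
    """
    def parse(s):
        return seq(p, many(p))(s) + empty(s)
    return parse


def optional(p):
    return alt(p, empty)            # a? = "" | a (a is tried first here)


def many1(p):
    return seq(p, many(p))          # a+ = aa*


def two_or_three(p):
    return alt(seq(p, seq(p, p)), seq(p, p))   # a{2,3} = aa | aaa
```

Because `many` returns every shorter match as well, a following parser that rejects the longest match still has alternatives to continue from.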
  54. None
  55. None
  56. laziness is cool for this • do you need backtracking?

  57. How to use it?

  58. None
  59. None
  60. Cool! but..

  61. ugly ugly not readable

  62. Enhancements •use generators for "laziness" •"combine" function •Scala-style methods •"delay" method
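
A sketch of the first enhancement only, using plain Python generators instead of lists so that alternatives are computed on demand. It does not use fn.py, and the helper names (`first`, `seq`, `alt`, `many`) are illustrative assumptions rather than the talk's API.

```python
from itertools import chain


def char(c):
    """Lazily yield the single match of character c, if any."""
    def parse(s):
        if s[:1] == c:
            yield c, s[1:]
    return parse


def alt(p, q):
    """or: results of q are only computed if the consumer asks for them."""
    return lambda s: chain(p(s), q(s))


def seq(p, q):
    """and: run q on each leftover of p, one result at a time."""
    def parse(s):
        for a, rest1 in p(s):
            for b, rest2 in q(rest1):
                yield a + b, rest2
    return parse


def many(p):
    """Kleene star, longest match first; shorter matches stay unevaluated."""
    def parse(s):
        yield from seq(p, many(p))(s)
        yield "", s
    return parse


def first(results):
    """Take only the first result, or None if the parser failed."""
    return next(iter(results), None)
```

With `first`, a caller that does not need backtracking pays only for the first successful parse; the remaining alternatives are never built.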
  63. fn.py Stream

  64. None
  65. [1,2,3,4,5] expr →"[" digit (","digit)* "]"
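
The rule on this slide can be assembled from a handful of primitives. The following is a self-contained sketch with list-based parsers, not the code from the talk's slides; `unit`, `bind`, `many`, `comma_digit` and `digits_list` are names assumed for the example.

```python
def unit(a):
    """Succeed with value a without consuming input."""
    return lambda s: [(a, s)]


def bind(p, f):
    """Run p, feed each result into f to get the next parser, run it."""
    return lambda s: [r for a, rest in p(s) for r in f(a)(rest)]


def char(c):
    return lambda s: [(c, s[1:])] if s[:1] == c else []


def digit(s):
    """Parse one ASCII digit into an int."""
    return [(int(s[0]), s[1:])] if s and s[0] in "0123456789" else []


def many(p):
    """Zero or more repetitions of p, collected into a list."""
    def parse(s):
        longer = bind(p, lambda x: bind(many(p), lambda xs: unit([x] + xs)))(s)
        return longer + [([], s)]
    return parse


# ("," digit): drop the comma, keep the digit
comma_digit = bind(char(","), lambda _: digit)

# expr -> "[" digit ("," digit)* "]"
digits_list = bind(char("["), lambda _:
              bind(digit, lambda d:
              bind(many(comma_digit), lambda ds:
              bind(char("]"), lambda _:
              unit([d] + ds)))))
```

Calling `digits_list("[1,2,3,4,5]")` yields a single result whose value is the list of integers and whose remaining input is empty. The nested lambdas work but are hard to read, which motivates nicer syntax.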

  66. None
  67. Interesting! but..

  68. Is it enough?

  69. In Haskell

  70. Can I do this in Python?

  71. … hm

  72. Challenge accepted!

  73. In Python

  74. How?

  75. Desugaring…

  76. What?

  77. WAT??? even more like

  78. unit a → Parser a

  79. bind Parser a → (a → Parser b) → Parser b
  80. lift (a → b) → (a → Parser b)

  81. lifted Parser a → (a → b) → Parser b
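
The four signatures on slides 78 to 81 map onto four short Python functions. This is a sketch assuming the list-of-results parser type from earlier slides; the bodies are one straightforward way to satisfy the signatures, not necessarily the talk's.

```python
def unit(a):
    """unit: a -> Parser a. Succeed with a, consume nothing."""
    return lambda s: [(a, s)]


def bind(p, f):
    """bind: Parser a -> (a -> Parser b) -> Parser b."""
    return lambda s: [r for a, rest in p(s) for r in f(a)(rest)]


def lift(f):
    """lift: (a -> b) -> (a -> Parser b). Wrap a plain function's result."""
    return lambda a: unit(f(a))


def lifted(p, f):
    """lifted: Parser a -> (a -> b) -> Parser b. Map f over p's results."""
    return bind(p, lift(f))


def digit(s):
    """Example parser: one ASCII digit as a string."""
    return [(s[0], s[1:])] if s and s[0] in "0123456789" else []


# Two digits concatenated, then converted to an int.
two_digit_number = lifted(
    bind(digit, lambda a: bind(digit, lambda b: unit(a + b))),
    int,
)
```

`lifted` is what other libraries call `map` or `fmap`: it changes the parsed value without touching how input is consumed.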

  82. WAT??? ok, looks cool, but

  83. None
  84. None
  85. How to use

  86. And even more..

  87. Haskell-style

  88. Do-notation

  89. None
  90. None
  91. (define R 2) (define diameter (lambda (r) (* 2 r)))

  92. None
  93. None
  94. Looks nice!

  95. Mutability kills backtracking :(
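
A common way to emulate do-notation in Python is a decorator that drives a generator: each `yield` hands a parser to the driver, which sends the parsed value back. The sketch below assumes that technique (the decorator name `do` and the example `two_digits` are hypothetical) and shows where the problem on this slide comes from: a running generator is a mutable object that cannot be copied or rewound, so this driver can follow only the first result of each step.

```python
import functools


def item(s):
    return [(s[0], s[1:])] if s else []


def sat(predicate):
    return lambda s: [(c, r) for c, r in item(s) if predicate(c)]


digit = sat(lambda c: c in "0123456789")


def do(gen_fn):
    """Turn a generator function that yields parsers into a parser.

    Limitation: only the FIRST result of each yielded parser is followed,
    because the generator's state cannot be forked to try the others.
    """
    @functools.wraps(gen_fn)
    def parser(s):
        gen = gen_fn()
        try:
            p = next(gen)
            while True:
                results = p(s)
                if not results:
                    return []               # a step failed: whole block fails
                value, s = results[0]       # alternatives are dropped here
                p = gen.send(value)
        except StopIteration as stop:
            return [(stop.value, s)]        # the generator's return value
    return parser


@do
def two_digits():
    a = yield digit
    b = yield digit
    return int(a + b)
```

Here `two_digits("42!")` gives `[(42, "!")]`. Supporting full backtracking would require re-running the generator from the start for each alternative, replaying the earlier values.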

  96. And more •error handling •backtracking control •performance

  97. Links • "funcparserlib" http://goo.gl/daidQY • "Monadic parsing in Haskell" http://goo.gl/gygNlM • "Higher-Order functions for Parsing" http://goo.gl/c8VOIZ • "Parsec" http://goo.gl/bdnDZQ • "Parcon" http://goo.gl/CT06S5 • "Pyparsing" http://goo.gl/gmr2lQ • "You Could Have Invented Monadic Parsing" http://goo.gl/h0rnOQ
  98. Learn Haskell For Great Good

  99. Q/A thanks for your attention,