Monadic Parsing in Python

Oleksii Kachaiev

June 06, 2014

Transcript

  1. Monadic Parsing
    in Python
    Alexey Kachayev, 2014

  2. About me
    • CTO at Attendify.com
    • Erlang, Clojure, Go, Haskell
    • Fn.py library author
    • CPython & Storm contributor

  3. Find me
    •@kachayev
    •github.com/kachayev
    •kachayev <$> gmail.com

  4. Topic

  5. Will talk
    •What is "parsing(ers)"
    •Approaches
    •Monadic parsing from scratch
    •More…

  6. Will talk
    •Less about theory
    •Much more about practice

  7. Won’t talk
    •What "monad" is
    •Why FP is cool (*)
    * you’ll see that for yourself

  8. Parsing

  9. Definition
    •Takes grammar
    •Takes input string (?)
    •Returns tree (??) or an error

  11. For PL
    creators only?

  12. Tasks
    • Processing information from logs
    • Source code analysis
    • DSLs
    • Protocols & data formats
    • … and more

  13. Approaches

  14. Production rule
    S → SS|(S)|()

  15. Grammar
    block =
        ["const" ident "=" number
        {"," ident "=" number} ";"]
        ["var" ident {"," ident} ";"]
        {"procedure" ident ";" block ";"} statement
    expression = ["+"|"-"] term {("+"|"-") term}
    term = factor {("*"|"/") factor}
    factor = ident | number | "(" expression ")"
    . . . .

  16. In theory
    •Top-down / bottom-up
    •Predictive / Backtracking
    •LL(k), LALR, LR, CYK and others

  17. Manually!

  18. @ wikipedia

  19. Manually
    •Simple to understand
    •Hard to maintain
    •Really boring
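For comparison, a hand-written recursive-descent parser for the expression / term / factor rules of the grammar on slide 15 might look like the sketch below. It is an illustration rather than the code shown in the talk: `factor` is narrowed to numbers and parenthesised expressions (no `ident`), whitespace is not handled, `/` is assumed to be integer division, and the names `ManualParser` and `evaluate` are hypothetical.

```python
class ManualParser:
    """Hand-written recursive descent: one method per grammar rule."""

    def __init__(self, text):
        self.text = text
        self.pos = 0  # mutable cursor shared by every rule

    def peek(self):
        # Next character, or "" at the end of the input.
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def expect(self, char):
        if self.peek() != char:
            raise SyntaxError("expected %r at position %d" % (char, self.pos))
        self.pos += 1

    def expression(self):
        # expression = ["+"|"-"] term {("+"|"-") term}
        sign = 1
        if self.peek() in ("+", "-"):
            sign = -1 if self.peek() == "-" else 1
            self.pos += 1
        value = sign * self.term()
        while self.peek() in ("+", "-"):
            op = self.peek()
            self.pos += 1
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term = factor {("*"|"/") factor}
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.peek()
            self.pos += 1
            rhs = self.factor()
            value = value * rhs if op == "*" else value // rhs
        return value

    def factor(self):
        # factor = number | "(" expression ")"   (ident omitted in this sketch)
        if self.peek() == "(":
            self.pos += 1
            value = self.expression()
            self.expect(")")
            return value
        start = self.pos
        while self.peek().isdigit():
            self.pos += 1
        if start == self.pos:
            raise SyntaxError("expected a number at position %d" % self.pos)
        return int(self.text[start:self.pos])


def evaluate(text):
    parser = ManualParser(text)
    value = parser.expression()
    if parser.pos != len(text):
        raise SyntaxError("unexpected input at position %d" % parser.pos)
    return value
```

Every rule becomes a method and the grammar only survives in comments, which is roughly why this style is easy to read at first and tedious to keep in sync with the grammar.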

  20. Can we do better?

  21. What we have
    •Context-free grammars
    •Formal theory
    •Well-defined algorithms
    •Standard grammar notation(s)

  22. So…

  23. Parser generator
    •1. Parse DSL notation
    •2. Generate parser code
    •("any" language)

  24. Parser generator
    •*PEG*
    •*Yacc*
    •ANTLR
    •… and dozens more

  25. Parser generator
    •Pros
    •many target languages
    •formalism
    •performance & optimisations

  26. Parser generator
    •Cons
    •another language to learn
    •limited in features
    •mostly "compile-time"

  27. Can we do better?

  28. Monadic parsers
    & combinators

  29. Functional Pearls
    Monadic Parsing in Haskell
    @Graham Hutton, @Erik Meijer

  30. Parsec
    MPC library for Haskell

  31. Parsec
    •Monadic parser combinator(s)
    •Works even with context-sensitive, infinite-lookahead grammars
    •Dozens of ports to other languages

  33. The Big Idea

  34. Simple
    type Parser = String → Tree

  35. Compose?
    type Parser = String → (Tree, String)

  36. Generalize?
    type Parser a = String → (a, String)

  37. Errors?
    type Parser a = String → Maybe (a, String)

  38. Or better…
    type Parser a = String → [(a, String)]
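In Python, this last signature could be modelled as a plain function from a string to a list of `(value, remaining input)` pairs. The sketch below is one possible encoding, not necessarily the one used in the talk; the alias `Parser` and the primitive `item` (a name borrowed from the Hutton and Meijer paper) are assumptions.

```python
from typing import Callable, List, Tuple, TypeVar

A = TypeVar("A")

# type Parser a = String -> [(a, String)]
Parser = Callable[[str], List[Tuple[A, str]]]


def item(text: str) -> List[Tuple[str, str]]:
    """Consume one character; an empty list signals failure."""
    if not text:
        return []
    return [(text[0], text[1:])]
```

Calling `item("abc")` yields one result holding the character `"a"` and the unconsumed rest `"bc"`, while `item("")` yields no results at all.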

  39. Let’s try…

  40. Snippets:
    http://goo.gl/leQIEE

  45. … and so?

  46. Expressiveness
    •[] for error
    •[s1] for single (predictive)
    •[s1..sN] for backtracking
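The three cases can be seen with a deliberately ambiguous parser. This is a minimal sketch assuming the list-of-results encoding; `literal` and `either` are hypothetical helper names.

```python
def literal(expected):
    """Parser that matches a fixed string."""
    def parser(text):
        if text.startswith(expected):
            return [(expected, text[len(expected):])]
        return []
    return parser


def either(first, second):
    """Keep the results of both alternatives so a caller can backtrack."""
    def parser(text):
        return first(text) + second(text)
    return parser


keyword = either(literal("in"), literal("int"))

no_match = keyword("x")         # [] : error
single = keyword("in x")        # one result: predictive
ambiguous = keyword("int x")    # two results: candidates for backtracking
```

A later parser that fails on the first candidate of `ambiguous` can simply move on to the second one, which is all that backtracking amounts to in this encoding.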

  47. First-class citizen

  49. Skip anything…

  50. Recognise digit
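The slide's code is not part of this transcript, so the following is only a sketch of how "skip anything" and "recognise digit" might be written with list-of-results parsers. The names `item`, `sat` and `digit` follow the Hutton and Meijer paper and are assumptions here.

```python
def item(text):
    """Accept any single character ("skip anything")."""
    return [(text[0], text[1:])] if text else []


def sat(predicate):
    """Accept one character only if it satisfies the predicate."""
    def parser(text):
        return [(value, rest) for value, rest in item(text) if predicate(value)]
    return parser


# A digit is any single character between "0" and "9".
digit = sat(lambda char: "0" <= char <= "9")
```

`digit` is built from `item` without touching the input directly, which is the point of treating parsers as first-class values.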

  51. Combinators

  52. RegExp
    •and: "abc"
    •or: "a | b | c"
    •Kleene star: "a*"
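The three regular-expression operations map onto three combinators. The sketch below assumes list-of-results parsers whose values are lists of matched characters; `char`, `empty`, `seq`, `alt` and `many` are hypothetical names, and `many` assumes its argument always consumes input when it succeeds.

```python
def char(expected):
    """Match one specific character; the value is a one-element list."""
    def parser(text):
        return [([expected], text[1:])] if text[:1] == expected else []
    return parser


def empty(text):
    """Match the empty string without consuming anything."""
    return [([], text)]


def seq(first, second):
    """and: run `first`, then `second` on the rest, concatenating values."""
    def parser(text):
        return [(a + b, rest2)
                for a, rest1 in first(text)
                for b, rest2 in second(rest1)]
    return parser


def alt(first, second):
    """or: collect the results of both alternatives."""
    def parser(text):
        return first(text) + second(text)
    return parser


def many(p):
    """Kleene star: p* = p p* | "", longest match first."""
    def parser(text):
        return alt(seq(p, parser), empty)(text)
    return parser


abc = seq(char("a"), seq(char("b"), char("c")))            # "abc"
a_or_b_or_c = alt(char("a"), alt(char("b"), char("c")))    # "a | b | c"
a_star = many(char("a"))                                   # "a*"
```

Note that `a_star` returns every possible prefix, not just the longest one, so a following parser can pick whichever split lets the whole input match.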

  53. Derives
    •a? = "" | a
    •a+ = aa*
    •a{2,3} = aa | aaa
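Each derived form can be written as a one-liner over the primitive combinators. This is a sketch under the same assumptions as before (list-of-results parsers with list values); the primitives are repeated so the block runs on its own, and `optional`, `many1` and `two_or_three` are hypothetical names.

```python
def char(expected):
    def parser(text):
        return [([expected], text[1:])] if text[:1] == expected else []
    return parser


def empty(text):
    return [([], text)]


def seq(first, second):
    def parser(text):
        return [(a + b, rest2)
                for a, rest1 in first(text)
                for b, rest2 in second(rest1)]
    return parser


def alt(first, second):
    return lambda text: first(text) + second(text)


def many(p):
    def parser(text):
        return alt(seq(p, parser), empty)(text)
    return parser


def optional(p):
    """a? = "" | a"""
    return alt(empty, p)


def many1(p):
    """a+ = a a*"""
    return seq(p, many(p))


def two_or_three(p):
    """a{2,3} = aa | aaa"""
    return alt(seq(p, p), seq(p, seq(p, p)))


a = char("a")
```

The order of alternatives follows the slide, so `optional` lists the empty match first; swapping the arguments of `alt` would make it prefer the longer match.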

  56. laziness is cool for this
    do you need backtracking?

  57. How to use it?

  60. Cool! but..

  61. ugly
    ugly
    not readable

  62. Enhancements
    •use generators for "laziness"
    •"combine" function
    •Scala-style methods
    •"delay" method
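Three of these enhancements can be sketched together: generators so that alternatives are only computed on demand, operator methods in the spirit of Scala's parser combinators, and a `delay` helper for recursive grammars. The class and method names below (`Parser`, `map`, `first`, `delay`, `unit`, `char`) are assumptions for illustration, not the API shown in the talk or in fn.py.

```python
class Parser:
    """Wraps a function from text to a lazy iterator of (value, rest)."""

    def __init__(self, run):
        self.run = run

    def __call__(self, text):
        return self.run(text)

    def __or__(self, other):
        # p | q: results of q are only computed if the caller asks for them.
        def run(text):
            yield from self(text)
            yield from other(text)
        return Parser(run)

    def __add__(self, other):
        # p + q: sequence, pairing up both values.
        def run(text):
            for a, rest1 in self(text):
                for b, rest2 in other(rest1):
                    yield (a, b), rest2
        return Parser(run)

    def map(self, func):
        return Parser(lambda text: ((func(v), rest) for v, rest in self(text)))

    def first(self, text):
        """Return the first result only, or None on failure."""
        return next(iter(self(text)), None)


def unit(value):
    def run(text):
        yield value, text
    return Parser(run)


def char(expected):
    def run(text):
        if text[:1] == expected:
            yield expected, text[1:]
    return Parser(run)


def delay(thunk):
    """Look the parser up only when it runs, so a rule can refer to itself."""
    def run(text):
        yield from thunk()(text)
    return Parser(run)


# nesting = "(" nesting ")" | ""   with the nesting depth as the value
nesting = (
    (char("(") + delay(lambda: nesting) + char(")")).map(lambda v: v[0][1] + 1)
    | unit(0)
)
```

`nesting.first("(())")` stops after the first successful parse, so the alternative branches are never explored unless something iterates further.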

  63. fn.py Stream

  65. [1,2,3,4,5]
    expr →"[" digit (","digit)* "]"
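A parser for this rule could be assembled from sequencing, repetition and a mapping step. The sketch below assumes list-of-results parsers; `char`, `digit`, `seq`, `many` and `fmap` are hypothetical helpers, and the nested indexing in the last lambda is exactly the kind of noise that the later slides set out to remove.

```python
def char(expected):
    def parser(text):
        return [(expected, text[1:])] if text[:1] == expected else []
    return parser


def digit(text):
    """One ASCII digit, returned as an int."""
    return [(int(text[0]), text[1:])] if "0" <= text[:1] <= "9" else []


def seq(first, second):
    """Run both parsers in order and pair up their values."""
    def parser(text):
        return [((a, b), rest2)
                for a, rest1 in first(text)
                for b, rest2 in second(rest1)]
    return parser


def many(p):
    """Zero or more repetitions, longest match first."""
    def parser(text):
        longer = [([v] + vs, rest2)
                  for v, rest1 in p(text)
                  for vs, rest2 in parser(rest1)]
        return longer + [([], text)]
    return parser


def fmap(p, func):
    """Apply func to every value the parser produces."""
    return lambda text: [(func(v), rest) for v, rest in p(text)]


# ("," digit), keeping only the digit
comma_digit = fmap(seq(char(","), digit), lambda pair: pair[1])

# expr -> "[" digit ("," digit)* "]"
expr = fmap(
    seq(seq(seq(char("["), digit), many(comma_digit)), char("]")),
    lambda v: [v[0][0][1]] + v[0][1],
)
```

Only the longest repetition leaves `"]"` as the next character, so `expr("[1,2,3,4,5]")` ends up with a single result even though `many` proposes several.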

  67. Interesting! but..

  68. Is it enough?

  69. In Haskell

  70. Can I do this in
    Python?

  71. … hm

  72. Challenge
    accepted!

  73. In Python

  74. How?

  75. Desugaring…

  76. What?

  77. Even more like
    WAT???

  78. unit
    a → Parser a

  79. bind
    Parser a → (a → Parser b) → Parser b
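With the list-of-results encoding, `unit` and `bind` each fit in a few lines. This is a minimal sketch of the two signatures above, assuming parsers are plain functions from a string to a list of `(value, rest)` pairs; `item` and `two_chars` are hypothetical names used only for the demonstration.

```python
def unit(value):
    """a -> Parser a: succeed with `value` without consuming input."""
    return lambda text: [(value, text)]


def bind(parser, func):
    """Parser a -> (a -> Parser b) -> Parser b.

    Run `parser`, feed each value to `func` to get the next parser,
    and run that parser on the corresponding remaining input.
    """
    def bound(text):
        return [result
                for value, rest in parser(text)
                for result in func(value)(rest)]
    return bound


def item(text):
    """Consume any single character."""
    return [(text[0], text[1:])] if text else []


# Read two characters and glue them together.
two_chars = bind(item, lambda a: bind(item, lambda b: unit(a + b)))
```

Failure needs no special handling: if either `item` returns an empty list, the comprehension in `bind` produces an empty list as well.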

  80. lift
    (a → b) → (a → Parser b)

  81. lifted
    Parser a → (a → b) → Parser b
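Both helpers fall out of `unit` and `bind`. The sketch below reads the two signatures literally, assuming list-of-results parsers; `unit`, `bind` and `item` are repeated so the block runs on its own, and `code_point` is a hypothetical example.

```python
def unit(value):
    return lambda text: [(value, text)]


def bind(parser, func):
    def bound(text):
        return [result
                for value, rest in parser(text)
                for result in func(value)(rest)]
    return bound


def item(text):
    return [(text[0], text[1:])] if text else []


def lift(func):
    """(a -> b) -> (a -> Parser b): wrap a plain function's result in unit."""
    return lambda value: unit(func(value))


def lifted(parser, func):
    """Parser a -> (a -> b) -> Parser b: apply func to the parsed value."""
    return bind(parser, lift(func))


# Parse one character and return its code point instead of the character.
code_point = lifted(item, ord)
```

`lifted` behaves like a map over the parser's result: the consumed input is unchanged and only the value is transformed.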

  82. OK, looks cool, but
    WAT???

  85. How to use

  86. And even more..

  87. Haskell-style

  88. Do-notation
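One way to imitate do-notation in Python is a decorator that drives a generator: every `yield p` runs parser `p` on the remaining input and sends the parsed value back in. The slide's own code is not in this transcript, so the decorator `do` below is an assumption about how this could work, not the talk's implementation. It needs Python 3.3+ for `return` inside a generator, and it only ever follows the first result of each step.

```python
import functools


def do(gen_func):
    """Turn a generator function into a parser.

    `value = yield p` runs parser p; `return v` is the block's result.
    """
    @functools.wraps(gen_func)
    def parser(text):
        gen = gen_func()
        rest = text
        try:
            step = next(gen)
            while True:
                results = step(rest)
                if not results:
                    return []              # one step failed: the block fails
                value, rest = results[0]   # first alternative only
                step = gen.send(value)
        except StopIteration as stop:
            return [(stop.value, rest)]
    return parser


def item(text):
    return [(text[0], text[1:])] if text else []


def char(expected):
    def parser(text):
        return [(expected, text[1:])] if text[:1] == expected else []
    return parser


@do
def pair():
    yield char("(")
    left = yield item
    yield char(",")
    right = yield item
    yield char(")")
    return (left, right)
```

The body of `pair` reads top to bottom like the grammar rule it implements, with no nested lambdas.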

  91. (define R 2)
    (define diameter (lambda (r) (* 2 r)))

  94. Looks nice!

  95. Mutability kills
    backtracking :(
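The problem can be reproduced with a small experiment. A running generator is a mutable object that cannot be rewound or cloned, so a simple generator-driven runner has to commit to the first result of every step, while plain `bind` explores all of them. The sketch below assumes such a runner (`do`) and list-of-results parsers; all names are hypothetical.

```python
def unit(value):
    return lambda text: [(value, text)]


def bind(parser, func):
    def bound(text):
        return [result
                for value, rest in parser(text)
                for result in func(value)(rest)]
    return bound


def literal(expected):
    def parser(text):
        if text.startswith(expected):
            return [(expected, text[len(expected):])]
        return []
    return parser


def run_of_a(text):
    """a*: every possible run of leading "a"s, longest first."""
    count = len(text) - len(text.lstrip("a"))
    return [(text[:n], text[n:]) for n in range(count, -1, -1)]


def do(gen_func):
    """Generator-driven runner that follows only the first result."""
    def parser(text):
        gen = gen_func()
        rest = text
        try:
            step = next(gen)
            while True:
                results = step(rest)
                if not results:
                    return []
                # The generator's state advances here and cannot be restored,
                # so results[1:] are never tried.
                value, rest = results[0]
                step = gen.send(value)
        except StopIteration as stop:
            return [(stop.value, rest)]
    return parser


# Grammar: a* followed by "ab". On "aab" the greedy a* must give one "a" back.
with_bind = bind(run_of_a, lambda run: bind(literal("ab"), lambda _: unit(run)))


@do
def with_do():
    run = yield run_of_a
    yield literal("ab")
    return run
```

On the input `"aab"`, `with_bind` finds the parse where `a*` matches a single `"a"`, whereas `with_do` fails because it never revisits the shorter run.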

  96. And more
    •error handling
    •backtracking control
    •performance

  97. Links
    • "funcparserlib" http://goo.gl/daidQY
    • "Monadic Parsing in Haskell" http://goo.gl/gygNlM
    • "Higher-Order Functions for Parsing" http://goo.gl/c8VOIZ
    • "Parsec" http://goo.gl/bdnDZQ
    • "Parcon" http://goo.gl/CT06S5
    • "Pyparsing" http://goo.gl/gmr2lQ
    • "You Could Have Invented Monadic Parsing" http://goo.gl/h0rnOQ

  98. Learn Haskell
    For Great Good

  99. Thanks for your attention
    Q/A
