Slide 1

Slide 1 text

Monadic Parsing in Python Alexey Kachayev, 2014

Slide 2

Slide 2 text

About me • CTO at Attendify.com • Erlang, Clojure, Go, Haskell • Fn.py library author • CPython & Storm contributor

Slide 3

Slide 3 text

Find me •@kachayev •github.com/kachayev •kachayev <$> gmail.com

Slide 4

Slide 4 text

Topic

Slide 5

Slide 5 text

Will talk •What is "parsing(ers)" •Approaches •Monadic parsing from scratch •More…

Slide 6

Slide 6 text

Will talk •Less about theory •Much more about practice

Slide 7

Slide 7 text

Won’t talk •What "monad" is •Why FP is cool (*) * you’ll understand it by yourself

Slide 8

Slide 8 text

Parsing

Slide 9

Slide 9 text

Definition •Takes grammar •Takes input string (?) •Returns tree (??) or an error

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

For PL creators only?

Slide 12

Slide 12 text

Tasks • Processing information from logs • Source code analysis • DSLs • Protocols & data formats • … and more

Slide 13

Slide 13 text

Approaches

Slide 14

Slide 14 text

Production rule S → SS|(S)|()
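As a rough illustration of what this production rule describes: the language generated by S → SS|(S)|() is the set of non-empty balanced parenthesis strings. The sketch below recognises that language with a simple depth counter; the function name `balanced` is made up for this example and is not from the talk.

```python
def balanced(s):
    """Recognise the language of S -> SS | (S) | ():
    non-empty strings of balanced parentheses."""
    depth = 0
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:          # a ")" closed something never opened
                return False
        else:                      # any other character is outside the alphabet
            return False
    # Every "(" must be closed, and the grammar cannot derive the empty string.
    return depth == 0 and len(s) > 0
```

A counter is enough here only because the grammar has a single bracket type; it answers yes or no and builds no tree.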

Slide 15

Slide 15 text

Grammar
block = ["const" ident "=" number {"," ident "=" number} ";"]
        ["var" ident {"," ident} ";"]
        {"procedure" ident ";" block ";"} statement
expression = ["+"|"-"] term {("+"|"-") term}
term = factor {("*"|"/") factor}
factor = ident | number | "(" expression ")"
. . . .

Slide 16

Slide 16 text

In theory •Top-down / bottom-up •Predictive / Backtracking •LL(k), LALR, LR, CYK and others

Slide 17

Slide 17 text

Manually!

Slide 18

Slide 18 text

@ wikipedia

Slide 19

Slide 19 text

Manually •Simple to understand •Hard to maintain •Really boring
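To make "manually" concrete, here is a minimal hand-written recursive-descent sketch for the expression / term / factor rules of the grammar slide. It is an assumption-laden illustration, not the code shown in the talk: `ident` is left out, the regex tokenizer silently drops unknown characters, and all function names are invented for this example.

```python
import re

def tokenize(text):
    # Numbers and single-character operators; anything else is ignored.
    return re.findall(r"\d+|[-+*/()]", text)

def parse_factor(tokens, pos):
    """factor = number | "(" expression ")"   (ident omitted in this sketch)"""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok.isdigit():
        return int(tok), pos + 1
    if tok == "(":
        value, pos = parse_expression(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return value, pos + 1
    raise SyntaxError("unexpected token %r" % tok)

def parse_term(tokens, pos):
    """term = factor {("*"|"/") factor}"""
    value, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("*", "/"):
        op = tokens[pos]
        rhs, pos = parse_factor(tokens, pos + 1)
        value = value * rhs if op == "*" else value / rhs
    return value, pos

def parse_expression(tokens, pos=0):
    """expression = ["+"|"-"] term {("+"|"-") term}"""
    sign = 1
    if pos < len(tokens) and tokens[pos] in ("+", "-"):
        sign = -1 if tokens[pos] == "-" else 1
        pos += 1
    value, pos = parse_term(tokens, pos)
    value *= sign
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        rhs, pos = parse_term(tokens, pos + 1)
        value = value + rhs if op == "+" else value - rhs
    return value, pos

def evaluate(text):
    tokens = tokenize(text)
    value, pos = parse_expression(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return value
```

Each grammar rule becomes one function, which is why the approach is simple to follow; every grammar change means editing that code by hand, which is where the maintenance cost comes from.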

Slide 20

Slide 20 text

Can we do better?

Slide 21

Slide 21 text

What we have •Context-free grammars •Formal theory •Well-defined algorithms •Standard grammar notation(s)

Slide 22

Slide 22 text

So…

Slide 23

Slide 23 text

Parser generator •1. Parse DSL notation •2. Generate parser code •("any" language)

Slide 24

Slide 24 text

Parser generator •*PEG* •*Yacc* •ANTLR •… and tens more

Slide 25

Slide 25 text

Parser generator •Pros •many target languages •formalism •performance & optimisations

Slide 26

Slide 26 text

Parser generator •Cons •another language •bounded in features •"compile-time" mostly

Slide 27

Slide 27 text

Can we do better?

Slide 28

Slide 28 text

Monadic parsers & combinators

Slide 29

Slide 29 text

Functional Pearls Monadic Parsing in Haskell @Graham Hutton, @Erik Meijer

Slide 30

Slide 30 text

Parsec MPC library for Haskell

Slide 31

Slide 31 text

Parsec •Monadic parser combinator(s) •Works even with context-sensitive, infinite-lookahead grammars •Tens of ports to other langs

Slide 32

Slide 32 text

No content

Slide 33

Slide 33 text

The Big Idea

Slide 34

Slide 34 text

Simple type Parser = String → Tree

Slide 35

Slide 35 text

Compose? type Parser = String → (Tree, String)

Slide 36

Slide 36 text

Generalize? type Parser a = String → (a, String)

Slide 37

Slide 37 text

Errors? type Parser a = String → Maybe (a, String)

Slide 38

Slide 38 text

Or better… type Parser a = String → [(a, String)]
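In Python, the last type can be read as "a parser is a function from a string to a list of (value, remaining input) pairs". A minimal sketch under that reading follows; the names `unit`, `zero` and `item` follow the Hutton and Meijer paper's vocabulary and are assumptions here, not necessarily the names used in the talk's snippets.

```python
def unit(value):
    """Parser that consumes nothing and always succeeds with `value`."""
    return lambda s: [(value, s)]

def zero(s):
    """Parser that always fails: an empty list of results."""
    return []

def item(s):
    """Parser that consumes exactly one character, failing on empty input."""
    return [(s[0], s[1:])] if s else []
```

Returning the unconsumed input alongside the value is what makes parsers composable: the next parser simply continues from the leftover string.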

Slide 39

Slide 39 text

Let’s try…

Slide 40

Slide 40 text

Snippets: http://goo.gl/leQIEE

Slide 41

Slide 41 text

No content

Slide 42

Slide 42 text

No content

Slide 43

Slide 43 text

No content

Slide 44

Slide 44 text

No content

Slide 45

Slide 45 text

… and so?

Slide 46

Slide 46 text

Expressiveness •[] for error •[s1] for single (predictive) •[s1..sN] for backtracking
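The three cases can be seen with two tiny parsers. This is a sketch assuming the list-of-results representation above; `literal` and `either` are hypothetical helper names chosen for this example.

```python
def literal(prefix):
    """Parser that matches an exact prefix of the input."""
    def parse(s):
        return [(prefix, s[len(prefix):])] if s.startswith(prefix) else []
    return parse

def either(p, q):
    """Try both parsers on the same input and keep every result."""
    return lambda s: p(s) + q(s)

a_or_ab = either(literal("a"), literal("ab"))

no_match = a_or_ab("xyz")           # []: error
one_match = literal("a")("abc")     # one result: predictive
all_matches = a_or_ab("abc")        # several results: backtracking candidates
```

A caller that needs only the first successful parse takes the first element; a caller that needs to backtrack keeps the whole list and tries the alternatives in order.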

Slide 47

Slide 47 text

First-class citizen

Slide 48

Slide 48 text

No content

Slide 49

Slide 49 text

Skip anything…

Slide 50

Slide 50 text

Recognise digit
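One plausible way to recognise a digit, assuming the list-of-results representation, is to build it from a generic "satisfies a predicate" parser. The name `sat` is borrowed from the Hutton and Meijer paper; treat it as an assumption about naming.

```python
def sat(predicate):
    """Parser that consumes one character if it satisfies `predicate`."""
    def parse(s):
        if s and predicate(s[0]):
            return [(s[0], s[1:])]
        return []
    return parse

# Note: str.isdigit also accepts some non-ASCII digit characters.
digit = sat(str.isdigit)
letter = sat(str.isalpha)
```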

Slide 51

Slide 51 text

Combinators

Slide 52

Slide 52 text

RegExp •and: "abc" •or: "a | b | c" •Kleene star: "a*"

Slide 53

Slide 53 text

Derives •a? = "" | a •a+ = aa* •a{2,3} = aa | aaa
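The three regular-expression operations map onto three combinators, and the derived forms fall out of them. The sketch below assumes the list-of-results representation; the names `seq`, `alt`, `many`, `many1` and `optional` are illustrative choices rather than the talk's exact API.

```python
def char(c):
    return lambda s: [(c, s[1:])] if s[:1] == c else []

def seq(p, q):
    """and: run p, then q on each leftover, pairing the values."""
    return lambda s: [((a, b), r2) for a, r1 in p(s) for b, r2 in q(r1)]

def alt(p, q):
    """or: collect the results of both parsers."""
    return lambda s: p(s) + q(s)

def many(p):
    """Kleene star: zero or more p, longest match listed first."""
    def parse(s):
        results = []
        for first, rest in p(s):
            # Skip results that consumed nothing, so a parser that can
            # succeed on empty input cannot loop forever.
            if rest != s:
                for others, rest2 in parse(rest):
                    results.append(([first] + others, rest2))
        results.append(([], s))     # the "zero occurrences" case
        return results
    return parse

def many1(p):
    """a+ = a a*"""
    return lambda s: [([a] + more, r2) for a, r1 in p(s) for more, r2 in many(p)(r1)]

def optional(p):
    """a? = a | "" (the non-empty alternative is tried first here)"""
    return lambda s: [([a], r) for a, r in p(s)] + [([], s)]
```

Because `many` returns every possible split, a following parser can fall back to a shorter match; this is the backtracking behaviour, and on an eager list it is computed whether or not anyone needs it.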

Slide 54

Slide 54 text

No content

Slide 55

Slide 55 text

No content

Slide 56

Slide 56 text

laziness is cool for this • do you need backtracking?

Slide 57

Slide 57 text

How to use it?

Slide 58

Slide 58 text

No content

Slide 59

Slide 59 text

No content

Slide 60

Slide 60 text

Cool! but..

Slide 61

Slide 61 text

ugly • ugly • not readable

Slide 62

Slide 62 text

Enhancements •use generators for "laziness" •"combine" function •Scala-style methods •"delay" method
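A sketch of how these enhancements might fit together, using plain Python generators for laziness instead of fn.py's Stream. The `Parser` class, its method names (`then`, `or_else`, `map`) and the `delay` helper are assumptions made for this example and may differ from the talk's snippets.

```python
class Parser:
    def __init__(self, run):
        self.run = run              # run: str -> iterator of (value, rest)

    def __call__(self, s):
        return self.run(s)

    def then(self, other):
        """Sequence: self followed by other, yielding paired values lazily."""
        def run(s):
            for a, rest1 in self(s):
                for b, rest2 in other(rest1):
                    yield (a, b), rest2
        return Parser(run)

    def or_else(self, other):
        """Choice: results of self first, then results of other."""
        def run(s):
            yield from self(s)
            yield from other(s)
        return Parser(run)

    def map(self, f):
        """Transform each parsed value with f."""
        return Parser(lambda s: ((f(a), rest) for a, rest in self(s)))

def delay(thunk):
    """Look the parser up only when it runs, so grammars can be recursive."""
    def run(s):
        yield from thunk()(s)
    return Parser(run)

def char(c):
    def run(s):
        if s[:1] == c:
            yield c, s[1:]
    return Parser(run)

empty = Parser(lambda s: iter([(0, s)]))

# Nesting depth of a run of parentheses: depth -> "(" depth ")" | ""
depth = (char("(").then(delay(lambda: depth)).then(char(")"))
         .map(lambda r: r[0][1] + 1)
         .or_else(empty))
```

Calling `next(depth("(())x"))` computes only the first parse; the remaining alternatives are produced only if the caller keeps iterating.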

Slide 63

Slide 63 text

fn.py Stream

Slide 64

Slide 64 text

No content

Slide 65

Slide 65 text

[1,2,3,4,5] expr →"[" digit (","digit)* "]"
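One way to express this rule with basic combinators, assuming the list-of-results representation; all helper names (`char`, `digit`, `fmap`, `seq`, `many`) are illustrative. The tuple unpacking done with `fmap` is deliberately left visible.

```python
def char(c):
    return lambda s: [(c, s[1:])] if s[:1] == c else []

def digit(s):
    return [(int(s[0]), s[1:])] if s and s[0] in "0123456789" else []

def fmap(f, p):
    """Apply f to every value produced by p."""
    return lambda s: [(f(a), rest) for a, rest in p(s)]

def seq(p, q):
    return lambda s: [((a, b), r2) for a, r1 in p(s) for b, r2 in q(r1)]

def many(p):
    """Zero or more p, longest first (p is assumed to consume input)."""
    def parse(s):
        longer = [([first] + others, rest2)
                  for first, rest in p(s)
                  for others, rest2 in parse(rest)]
        return longer + [([], s)]
    return parse

# expr -> "[" digit ("," digit)* "]"
comma_digit = fmap(lambda pair: pair[1], seq(char(","), digit))
body = fmap(lambda pair: [pair[0]] + pair[1], seq(digit, many(comma_digit)))
expr = fmap(lambda pair: pair[0][1], seq(seq(char("["), body), char("]")))
```

The grammar is all there, but the `pair[0][1]` bookkeeping needed to discard brackets and commas hides it.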

Slide 66

Slide 66 text

No content

Slide 67

Slide 67 text

Interesting! but..

Slide 68

Slide 68 text

Is it enough?

Slide 69

Slide 69 text

In Haskell

Slide 70

Slide 70 text

Can I do this in Python?

Slide 71

Slide 71 text

… hm

Slide 72

Slide 72 text

Challenge accepted!

Slide 73

Slide 73 text

In Python

Slide 74

Slide 74 text

How?

Slide 75

Slide 75 text

Desugaring…

Slide 76

Slide 76 text

What?

Slide 77

Slide 77 text

even more like WAT???

Slide 78

Slide 78 text

unit a → Parser a

Slide 79

Slide 79 text

bind Parser a → (a → Parser b) → Parser b

Slide 80

Slide 80 text

lift (a → b) → (a → Parser b)

Slide 81

Slide 81 text

lifted Parser a → (a → b) → Parser b
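The four signatures above translate almost directly into Python. This is a minimal sketch assuming the list-of-results representation; it is one straightforward reading of the types, not necessarily the talk's implementation.

```python
def unit(value):
    """a -> Parser a"""
    return lambda s: [(value, s)]

def bind(parser, f):
    """Parser a -> (a -> Parser b) -> Parser b"""
    def parse(s):
        return [result
                for value, rest in parser(s)    # run the first parser
                for result in f(value)(rest)]   # feed each value to f, run on the rest
    return parse

def lift(f):
    """(a -> b) -> (a -> Parser b)"""
    return lambda value: unit(f(value))

def lifted(parser, f):
    """Parser a -> (a -> b) -> Parser b"""
    return bind(parser, lift(f))

def item(s):
    return [(s[0], s[1:])] if s else []

# Two characters glued together, with no manual threading of the leftover input.
two_chars = bind(item, lambda a: bind(item, lambda b: unit(a + b)))
```

`bind` is the piece that hides the plumbing: the second parser can depend on the first parser's value, and the remaining input is passed along implicitly.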

Slide 82

Slide 82 text

ok, looks cool, but WAT???

Slide 83

Slide 83 text

No content

Slide 84

Slide 84 text

No content

Slide 85

Slide 85 text

How to use

Slide 86

Slide 86 text

And even more..

Slide 87

Slide 87 text

Haskell-style

Slide 88

Slide 88 text

Do-notation
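Python has no do-notation, but a generator can approximate it: each `yield` hands a parser to a driver, which runs it and sends the parsed value back. The sketch below assumes Python 3.3+ (generators that `return` a value) and a simplified single-result parser type, `str -> (value, rest) or None`, because a running generator cannot be rewound to retry an alternative. The decorator name `do` is an assumption made for this example.

```python
def do(gen_fn):
    """Turn a generator function into a parser. Each `yield p` runs parser p
    on the remaining input and sends its value back into the generator."""
    def parse(s):
        gen = gen_fn()
        try:
            parser = next(gen)
            while True:
                result = parser(s)
                if result is None:          # any failing step fails the whole parser
                    return None
                value, s = result
                parser = gen.send(value)
        except StopIteration as stop:
            return (stop.value, s)          # the generator's return value is the result
    return parse

def char(c):
    return lambda s: (c, s[1:]) if s[:1] == c else None

def digit(s):
    return (int(s[0]), s[1:]) if s and s[0] in "0123456789" else None

@do
def pair():
    yield char("(")
    x = yield digit
    yield char(",")
    y = yield digit
    yield char(")")
    return (x, y)
```

Compared with nested `bind` calls and lambdas, the body of `pair` reads top to bottom like the grammar it implements.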

Slide 89

Slide 89 text

No content

Slide 90

Slide 90 text

No content

Slide 91

Slide 91 text

(define R 2) (define diameter (lambda (r) (* 2 r)))
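As an illustration of parsing input like this with small combinators, here is a self-contained sketch that turns the snippet into nested Python lists. It uses the simplified single-result parser type (`str -> (value, rest) or None`); every helper name is hypothetical, and it is not the parser shown in the talk.

```python
import re

def regex(pattern):
    compiled = re.compile(pattern)
    def parse(s):
        m = compiled.match(s)
        return (m.group(0), s[m.end():]) if m else None
    return parse

def fmap(f, p):
    def parse(s):
        r = p(s)
        return None if r is None else (f(r[0]), r[1])
    return parse

def alt(*parsers):
    """First parser that succeeds wins (no backtracking into later ones)."""
    def parse(s):
        for p in parsers:
            r = p(s)
            if r is not None:
                return r
        return None
    return parse

def many(p):
    """Zero or more p; stops on failure or when p consumes nothing."""
    def parse(s):
        values = []
        while True:
            r = p(s)
            if r is None or r[1] == s:
                return values, s
            values.append(r[0])
            s = r[1]
    return parse

def token(p):
    """Run p, then skip trailing whitespace."""
    def parse(s):
        r = p(s)
        return None if r is None else (r[0], r[1].lstrip())
    return parse

number = token(fmap(int, regex(r"-?\d+")))
symbol = token(regex(r"[^\s()]+"))
lparen = token(regex(r"\("))
rparen = token(regex(r"\)"))

def sexp(s):
    # Defined as a function so that `form` can refer back to it recursively.
    return alt(number, symbol, form)(s)

def form(s):
    """form -> "(" sexp* ")" """
    r = lparen(s)
    if r is None:
        return None
    items, rest = many(sexp)(r[1])
    r = rparen(rest)
    return None if r is None else (items, r[1])

program = many(sexp)
```

Running `program` on the slide's snippet yields one nested list per top-level form, with integers already converted and symbols left as strings.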

Slide 92

Slide 92 text

No content

Slide 93

Slide 93 text

No content

Slide 94

Slide 94 text

Looks nice!

Slide 95

Slide 95 text

Mutability kills backtracking :(

Slide 96

Slide 96 text

And more •error handling •backtracking control •performance

Slide 97

Slide 97 text

Links • "funcparserlib" http://goo.gl/daidQY • "Monadic parsing in Haskell" http://goo.gl/gygNlM • "Higher-Order functions for Parsing" http://goo.gl/c8VOIZ • "Parsec" http://goo.gl/bdnDZQ • "Parcon" http://goo.gl/CT06S5 • "Pyparsing" http://goo.gl/gmr2lQ • "You Could Have Invented Monadic Parsing" http://goo.gl/h0rnOQ

Slide 98

Slide 98 text

Learn Haskell For Great Good

Slide 99

Slide 99 text

Q/A thanks for your attention