Parsing with
Derivatives
David Nolen
Papers We Love NYC, August 2016
Slide 3
Slide 3 text
Parsing with
Derivatives
David Nolen
Papers I Should Read NYC, August 2016
Slide 16
Slide 16 text
Overview
• Preliminaries
• Brzozowski’s derivative
• Derivatives of context-free languages
• Parsers & parser combinators
• Derivatives of parser combinators
Slide 17
Slide 17 text
• Performance and complexity
• Compaction
Slide 18
Slide 18 text
Preliminaries
Slide 19
Slide 19 text
A language L is a set of strings
Slide 20
Slide 20 text
{foo, bar}
{cat, dog}
{papers, we, love}
Slide 21
Slide 21 text
A string w is a sequence of characters
from an alphabet A
Slide 22
Slide 22 text
2 typical atomic languages
• The empty language, ∅, contains no strings
  • ∅ = {}
• The null language, ε, contains only the length-zero “null” string
  • ε = {w} where length(w) = 0
Slide 23
Slide 23 text
Given an alphabet A there is a
singleton language for every
character c in the alphabet
c ≡ {c}
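A minimal sketch of the atomic languages above, assuming languages are modeled as finite Python sets of strings. The names EMPTY, NULL, singleton, and ALPHABET are illustrative and do not come from the talk.

```python
# Languages modeled as (finite) sets of strings. Assumption for illustration only.
EMPTY = frozenset()        # the empty language: contains no strings
NULL = frozenset({""})     # the null language: contains only the length-zero string


def singleton(c: str) -> frozenset:
    """Return the singleton language {c} for a single character c."""
    if len(c) != 1:
        raise ValueError("expected exactly one character")
    return frozenset({c})


ALPHABET = "ab"  # hypothetical two-character alphabet
SINGLETONS = {c: singleton(c) for c in ALPHABET}
```

Note that EMPTY and NULL are different languages: the first has no members, the second has exactly one member, the empty string.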
Slide 25
Slide 25 text
union → alt
concatenation → cat
Kleene star → rep
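The three operations can be sketched on finite sets of strings, assuming the usual set-theoretic definitions of union, concatenation, and Kleene star. The function names follow the slide (alt, cat, rep). The max_reps bound is an assumption added here because the true Kleene star is infinite whenever the language contains a non-empty string.

```python
def alt(l1: set, l2: set) -> set:
    """Union: every string that is in l1 or in l2."""
    return l1 | l2


def cat(l1: set, l2: set) -> set:
    """Concatenation: every string u + v with u in l1 and v in l2."""
    return {u + v for u in l1 for v in l2}


def rep(l: set, max_reps: int) -> set:
    """Bounded approximation of Kleene star: 0..max_reps repetitions of l."""
    result = {""}    # zero repetitions always yields the null string
    current = {""}
    for _ in range(max_reps):
        current = cat(current, l)
        result |= current
    return result
```

For example, alt({"foo"}, {"bar"}) gives {"foo", "bar"}, and rep({"ab"}, 2) gives {"", "ab", "abab"}.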
Slide 28
Slide 28 text
Brzozowski’s derivative
Slide 29
Slide 29 text
The derivative of a language L with
respect to character c is a new language
that has been “filtered” and “chopped”
D_c(L)
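For a finite language stored as a set of strings, "filter and chop" can be sketched directly: keep the strings that start with c, then drop that first character. The function name derive is illustrative and assumes c is a single character.

```python
def derive(language: set, c: str) -> set:
    """Derivative of a finite language with respect to the character c.

    Filter: keep only strings that begin with c.
    Chop: remove that leading character from each kept string.
    """
    return {w[1:] for w in language if w.startswith(c)}
```

For example, derive({"foo", "frak", "bar"}, "f") gives {"oo", "rak"}, and derive({"f"}, "f") gives {""}, the null language.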
Slide 31
Slide 31 text
To determine membership of a string, derive the
language with respect to each character of the string
in turn, and check if the final language contains
the null string: if yes, the original string
was in the language; if not, it wasn’t.
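The membership procedure can be sketched for regular languages using the standard Brzozowski derivative rules. The class names (Empty, Eps, Char, Alt, Cat, Rep) and the helpers nullable, derive, and matches are choices made for this sketch, not code from the talk. The sketch does no simplification of intermediate languages, so the terms it builds can grow with the length of the input.

```python
from dataclasses import dataclass


class Lang:
    """Base class for language descriptions."""


@dataclass(frozen=True)
class Empty(Lang):      # the empty language
    pass


@dataclass(frozen=True)
class Eps(Lang):        # the null language
    pass


@dataclass(frozen=True)
class Char(Lang):       # singleton language {c}
    c: str


@dataclass(frozen=True)
class Alt(Lang):        # union
    left: Lang
    right: Lang


@dataclass(frozen=True)
class Cat(Lang):        # concatenation
    left: Lang
    right: Lang


@dataclass(frozen=True)
class Rep(Lang):        # Kleene star
    lang: Lang


def nullable(l: Lang) -> bool:
    """True if the language contains the null string."""
    if isinstance(l, (Eps, Rep)):
        return True
    if isinstance(l, (Empty, Char)):
        return False
    if isinstance(l, Alt):
        return nullable(l.left) or nullable(l.right)
    if isinstance(l, Cat):
        return nullable(l.left) and nullable(l.right)
    raise TypeError(f"unknown language: {l!r}")


def derive(l: Lang, c: str) -> Lang:
    """Derivative of l with respect to the character c."""
    if isinstance(l, (Empty, Eps)):
        return Empty()
    if isinstance(l, Char):
        return Eps() if l.c == c else Empty()
    if isinstance(l, Alt):
        return Alt(derive(l.left, c), derive(l.right, c))
    if isinstance(l, Cat):
        first = Cat(derive(l.left, c), l.right)
        # If the left side can match the null string, c may start the right side.
        return Alt(first, derive(l.right, c)) if nullable(l.left) else first
    if isinstance(l, Rep):
        return Cat(derive(l.lang, c), l)
    raise TypeError(f"unknown language: {l!r}")


def matches(l: Lang, w: str) -> bool:
    """Derive by each character of w, then check for the null string."""
    for c in w:
        l = derive(l, c)
    return nullable(l)
```

As a usage example, with ab_star = Rep(Cat(Char("a"), Char("b"))), the call matches(ab_star, "abab") is True and matches(ab_star, "aba") is False.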