Let’s write a parser! [RUG::B edition]
Denis Defreyne
May 12, 2016
Transcript
Let’s write a parser!
Denis Defreyne / RUG::B / May 12th, 2016

1. Language

I am Denis.

But how do you know that I am Denis?
I told you. I wrote it down. Tobi introduced me. You might have seen me before. Etc.
You understand English.

Computers are stupid.

$ git commit --message="Fix bugs"

def greet(name)
  puts "Hello, #{name}"
end

Text forms a language, but computers don’t know that.

2. Parsing

Basic idea: parser objects that are small, composable, and purely functional.

def read(input, pos)

def read(input, pos)
  Success.new(pos + 1)
end

def read(input, pos)
  Failure.new(pos)
end
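
The read contract above can be sketched in a few lines of plain Ruby. This is a minimal illustration and not d-parse's actual code: the Parser wrapper class, the block-based constructor, and the names advance and refuse are assumptions made for this example; apply is assumed to start reading at position 0, as the later slides suggest.

```ruby
# Sketch only: hypothetical code, not d-parse's actual implementation.
# Result objects record where parsing ended up and whether it worked.
Success = Struct.new(:pos)
Failure = Struct.new(:pos)

# Hypothetical wrapper: a parser is a block implementing read(input, pos).
class Parser
  def initialize(&block)
    @block = block
  end

  # Depends only on its arguments; nothing is mutated.
  def read(input, pos)
    @block.call(input, pos)
  end

  # Convenience entry point: start reading at position 0.
  def apply(input)
    read(input, 0)
  end
end

# Always consumes one character, like the first read body above.
advance = Parser.new { |_input, pos| Success.new(pos + 1) }
# Never consumes anything, like the second read body above.
refuse = Parser.new { |_input, pos| Failure.new(pos) }

p advance.apply("Hello") # prints #<struct Success pos=1>
p refuse.apply("Hello")  # prints #<struct Failure pos=0>
```

Because read returns a new result instead of changing any state, the same parser object can be reused at any position and inside any other parser.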
char("H")
Succeeds if the next character is the given one.

char("H").apply("Hello")
H e l l o
0 1 2 3 4
Success(position = 1)

char("H").apply("Adiós")
A d i ó s
0 1 2 3 4
Failure(position = 0)

if input[pos] == @char
  Success.new(pos + 1)
else
  Failure.new(pos)
end
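
To make the char parser concrete, here is a small runnable sketch. It assumes a hypothetical block-based Parser class and keeps the expected character in a closure rather than in the @char instance variable shown above; it is not d-parse's actual code.

```ruby
# Sketch only: hypothetical code, not d-parse's actual implementation.
Success = Struct.new(:pos)
Failure = Struct.new(:pos)

class Parser
  def initialize(&block)
    @block = block
  end

  def read(input, pos)
    @block.call(input, pos)
  end

  def apply(input)
    read(input, 0)
  end
end

# Succeeds, advancing by one, if the character at pos is the expected one.
def char(expected)
  Parser.new do |input, pos|
    # input[pos] is nil past the end of the input, so that case fails too.
    if input[pos] == expected
      Success.new(pos + 1)
    else
      Failure.new(pos)
    end
  end
end

p char("H").apply("Hello") # prints #<struct Success pos=1>
p char("H").apply("Adiós") # prints #<struct Failure pos=0>
```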
seq(a, b)
Succeeds if both given parsers succeed in sequence.

seq(char("H"), char("e")).apply("Hello")
H e l l o
0 1 2 3 4
Success(position = 2)

seq(
  char("H"),
  char("e"),
  char("l"),
  char("l"),
  char("o"),
)
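
A possible way to implement seq is to thread the position from each parser into the next and stop at the first failure. The sketch below assumes a hypothetical block-based Parser class and a variadic seq (the slide above passes five parsers); it is not d-parse's actual code.

```ruby
# Sketch only: hypothetical code, not d-parse's actual implementation.
Success = Struct.new(:pos)
Failure = Struct.new(:pos)

class Parser
  def initialize(&block)
    @block = block
  end

  def read(input, pos)
    @block.call(input, pos)
  end

  def apply(input)
    read(input, 0)
  end
end

def char(expected)
  Parser.new { |input, pos| input[pos] == expected ? Success.new(pos + 1) : Failure.new(pos) }
end

# Runs each parser where the previous one stopped; the first failure wins.
def seq(*parsers)
  Parser.new do |input, pos|
    result = Success.new(pos)
    parsers.each do |parser|
      result = parser.read(input, result.pos)
      break if result.is_a?(Failure)
    end
    result
  end
end

p seq(char("H"), char("e")).apply("Hello") # prints #<struct Success pos=2>
p seq(char("H"), char("a")).apply("Hello") # prints #<struct Failure pos=1>
```

In this sketch a failed sequence reports the position at which the failing parser was tried, which is useful for error messages.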
string(s)
Succeeds if all characters in the given string can be read in sequence.

string("Hello").apply("Hello")
H e l l o
0 1 2 3 4
Success(position = 5)

eof()
Succeeds at the end of input; fails otherwise.

seq(string("Hello"), eof).apply("Hello")
H e l l o
0 1 2 3 4
Success(position = 5)

seq(string("Hello"), eof).apply("Hello!")
H e l l o !
0 1 2 3 4 5
Failure(position = 5)
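
The string and eof parsers can be sketched on top of char and seq. In this illustration string is simply a seq of one char parser per character, and eof compares the position with the input length; both choices, and the block-based Parser class, are assumptions for the example rather than d-parse's actual code.

```ruby
# Sketch only: hypothetical code, not d-parse's actual implementation.
Success = Struct.new(:pos)
Failure = Struct.new(:pos)

class Parser
  def initialize(&block)
    @block = block
  end

  def read(input, pos)
    @block.call(input, pos)
  end

  def apply(input)
    read(input, 0)
  end
end

def char(expected)
  Parser.new { |input, pos| input[pos] == expected ? Success.new(pos + 1) : Failure.new(pos) }
end

def seq(*parsers)
  Parser.new do |input, pos|
    result = Success.new(pos)
    parsers.each do |parser|
      result = parser.read(input, result.pos)
      break if result.is_a?(Failure)
    end
    result
  end
end

# A string is a sequence of its characters.
def string(text)
  seq(*text.chars.map { |c| char(c) })
end

# Succeeds without advancing, but only when nothing is left to read.
def eof
  Parser.new { |input, pos| pos == input.length ? Success.new(pos) : Failure.new(pos) }
end

p seq(string("Hello"), eof).apply("Hello")  # prints #<struct Success pos=5>
p seq(string("Hello"), eof).apply("Hello!") # prints #<struct Failure pos=5>
```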
alt(a, b)
Succeeds if either of the given parsers succeeds.

alt(char("H"), char("A")).apply("Adiós")
A d i ó s
0 1 2 3 4
Success(position = 1)

whitespace_char = alt(char(" "), char("\t"))
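
One way to sketch alt is to try every alternative from the same starting position and return the first success. The block-based Parser class and the variadic signature are assumptions for this example, and returning the last failure when nothing matches is a choice made here, not something the slides specify.

```ruby
# Sketch only: hypothetical code, not d-parse's actual implementation.
Success = Struct.new(:pos)
Failure = Struct.new(:pos)

class Parser
  def initialize(&block)
    @block = block
  end

  def read(input, pos)
    @block.call(input, pos)
  end

  def apply(input)
    read(input, 0)
  end
end

def char(expected)
  Parser.new { |input, pos| input[pos] == expected ? Success.new(pos + 1) : Failure.new(pos) }
end

# Tries each parser at the same position; the first success wins.
def alt(*parsers)
  Parser.new do |input, pos|
    result = Failure.new(pos)
    parsers.each do |parser|
      result = parser.read(input, pos)
      break if result.is_a?(Success)
    end
    result
  end
end

whitespace_char = alt(char(" "), char("\t"))

p alt(char("H"), char("A")).apply("Adiós") # prints #<struct Success pos=1>
p whitespace_char.apply("\tx")             # prints #<struct Success pos=1>
p whitespace_char.apply("x")               # prints #<struct Failure pos=0>
```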
optional(p)
Succeeds always, but only advances if p succeeds.

repeat(p)
Succeeds always, and attempts to apply p as often as possible.

repeat(whitespace_char)
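
Both optional and repeat turn a failure of p into a success that does not advance. A minimal sketch follows, assuming a hypothetical block-based Parser class; the guard against a parser that succeeds without consuming input is an addition of this example, and none of this is d-parse's actual code.

```ruby
# Sketch only: hypothetical code, not d-parse's actual implementation.
Success = Struct.new(:pos)
Failure = Struct.new(:pos)

class Parser
  def initialize(&block)
    @block = block
  end

  def read(input, pos)
    @block.call(input, pos)
  end

  def apply(input)
    read(input, 0)
  end
end

def char(expected)
  Parser.new { |input, pos| input[pos] == expected ? Success.new(pos + 1) : Failure.new(pos) }
end

# If the parser fails, succeed anyway at the original position.
def optional(parser)
  Parser.new do |input, pos|
    result = parser.read(input, pos)
    result.is_a?(Success) ? result : Success.new(pos)
  end
end

# Applies the parser until it fails, then succeeds where it stopped.
def repeat(parser)
  Parser.new do |input, pos|
    loop do
      result = parser.read(input, pos)
      # Also stop if nothing was consumed, to avoid looping forever.
      break if result.is_a?(Failure) || result.pos == pos
      pos = result.pos
    end
    Success.new(pos)
  end
end

p optional(char("-")).apply("-42") # prints #<struct Success pos=1>
p optional(char("-")).apply("42")  # prints #<struct Success pos=0>
p repeat(char(" ")).apply("   x")  # prints #<struct Success pos=3>
```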
intersperse(a, b)
Alternates between a and b, always ending with a.

intersperse(char("a"), char(",")).apply("a,a,b")
a , a , b
0 1 2 3 4
Success(position = 3)

etc.
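
The result at position 3 in the example above follows if intersperse(a, b) is read as "a, then any number of (b followed by a)": the trailing "," is only consumed when another a comes after it. The sketch below expresses that reading as seq(a, repeat(seq(b, a))). That decomposition and the block-based Parser class are assumptions for this example, not d-parse's actual code.

```ruby
# Sketch only: hypothetical code, not d-parse's actual implementation.
Success = Struct.new(:pos)
Failure = Struct.new(:pos)

class Parser
  def initialize(&block)
    @block = block
  end

  def read(input, pos)
    @block.call(input, pos)
  end

  def apply(input)
    read(input, 0)
  end
end

def char(expected)
  Parser.new { |input, pos| input[pos] == expected ? Success.new(pos + 1) : Failure.new(pos) }
end

def seq(*parsers)
  Parser.new do |input, pos|
    result = Success.new(pos)
    parsers.each do |parser|
      result = parser.read(input, result.pos)
      break if result.is_a?(Failure)
    end
    result
  end
end

def repeat(parser)
  Parser.new do |input, pos|
    loop do
      result = parser.read(input, pos)
      break if result.is_a?(Failure) || result.pos == pos
      pos = result.pos
    end
    Success.new(pos)
  end
end

# One item, then as many (separator, item) pairs as possible.
def intersperse(item, separator)
  seq(item, repeat(seq(separator, item)))
end

p intersperse(char("a"), char(",")).apply("a,a,b") # prints #<struct Success pos=3>
```

The separator before "b" is read and then given back, because the failed seq(separator, item) leaves the position of repeat unchanged.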
720 6 29530

digit = alt(
  *('0'..'9')
    .map { |c| char(c) }
)

digit = char_in('0'..'9')

nat_number = seq(digit, repeat(digit))

nat_number = repeat1(digit)

nat_number = repeat1(digit)
  .capture
Success(position = 3, data = "123")

nat_number = repeat1(digit)
  .capture
  .map(&:to_i)
Success(position = 3, data = 123)
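
The capture and map steps can be sketched as methods that wrap a parser in a new parser, leaving the original untouched. In this illustration Success gains a data field, capture stores the consumed substring, and map transforms whatever data is present. The class layout and the bodies of char_in and repeat1 are assumptions for the example, not d-parse's actual code.

```ruby
# Sketch only: hypothetical code, not d-parse's actual implementation.
Success = Struct.new(:pos, :data)
Failure = Struct.new(:pos)

class Parser
  def initialize(&block)
    @block = block
  end

  def read(input, pos)
    @block.call(input, pos)
  end

  def apply(input)
    read(input, 0)
  end

  # New parser whose data is the substring this parser consumed.
  def capture
    Parser.new do |input, pos|
      result = read(input, pos)
      result.is_a?(Success) ? Success.new(result.pos, input[pos...result.pos]) : result
    end
  end

  # New parser whose data is this parser's data passed through the block.
  def map(&transform)
    Parser.new do |input, pos|
      result = read(input, pos)
      result.is_a?(Success) ? Success.new(result.pos, transform.call(result.data)) : result
    end
  end
end

# Succeeds if the character at pos lies in the given range.
def char_in(range)
  Parser.new do |input, pos|
    c = input[pos]
    c && range.include?(c) ? Success.new(pos + 1) : Failure.new(pos)
  end
end

# Like repeat, but fails unless the parser matched at least once.
def repeat1(parser)
  Parser.new do |input, pos|
    start = pos
    loop do
      result = parser.read(input, pos)
      break if result.is_a?(Failure) || result.pos == pos
      pos = result.pos
    end
    pos > start ? Success.new(pos) : Failure.new(start)
  end
end

def nat_number
  repeat1(char_in("0".."9")).capture.map(&:to_i)
end

p nat_number.apply("123") # prints #<struct Success pos=3, data=123>
```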

def read(input, pos)

def read(input, pos)
  Success.new(pos + 1)
end

def read(input, pos)
  Success.new(pos + 1, "blahblah")
end

first,last,age
Denis,Defreyne,29

field = repeat(char_not(',', "\n")).capture
line = field.intersperse(char(','))
file = seq(
  line.intersperse(char("\n")),
  eof,
)

[
  ["first", "last", "age"],
  ["Denis", "Defreyne", "29"],
]
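
For the CSV grammar to produce nested arrays, the combinators have to collect data and not only positions. The sketch below is one hypothetical way to do that: intersperse gathers the data of every item, seq gathers the data of each part, and a final map(&:first) drops the empty end-of-input result. Those data conventions, the trailing map, and the use of eof as the end-of-input check are assumptions for this example, not d-parse's actual behaviour.

```ruby
# Sketch only: hypothetical code, not d-parse's actual implementation.
Success = Struct.new(:pos, :data)
Failure = Struct.new(:pos)

class Parser
  def initialize(&block)
    @block = block
  end

  def read(input, pos)
    @block.call(input, pos)
  end

  def apply(input)
    read(input, 0)
  end

  def capture
    Parser.new do |input, pos|
      result = read(input, pos)
      result.is_a?(Success) ? Success.new(result.pos, input[pos...result.pos]) : result
    end
  end

  def map(&transform)
    Parser.new do |input, pos|
      result = read(input, pos)
      result.is_a?(Success) ? Success.new(result.pos, transform.call(result.data)) : result
    end
  end

  # self, then (separator, self) pairs; data is the list of item data.
  def intersperse(separator)
    Parser.new do |input, pos|
      result = read(input, pos)
      items = []
      while result.is_a?(Success)
        items << result.data
        pos = result.pos
        sep = separator.read(input, pos)
        result = sep.is_a?(Success) ? read(input, sep.pos) : sep
      end
      items.empty? ? Failure.new(pos) : Success.new(pos, items)
    end
  end
end

def char(expected)
  Parser.new { |input, pos| input[pos] == expected ? Success.new(pos + 1) : Failure.new(pos) }
end

# Succeeds on any character that is not one of the excluded ones.
def char_not(*excluded)
  Parser.new do |input, pos|
    c = input[pos]
    c.nil? || excluded.include?(c) ? Failure.new(pos) : Success.new(pos + 1)
  end
end

def repeat(parser)
  Parser.new do |input, pos|
    loop do
      result = parser.read(input, pos)
      break if result.is_a?(Failure) || result.pos == pos
      pos = result.pos
    end
    Success.new(pos)
  end
end

def eof
  Parser.new { |input, pos| pos == input.length ? Success.new(pos) : Failure.new(pos) }
end

# Runs parsers in order; data is the array of each parser's data.
def seq(*parsers)
  Parser.new do |input, pos|
    data = []
    failure = nil
    parsers.each do |parser|
      result = parser.read(input, pos)
      failure = result if result.is_a?(Failure)
      break if failure
      data << result.data
      pos = result.pos
    end
    failure || Success.new(pos, data)
  end
end

field = repeat(char_not(",", "\n")).capture
line = field.intersperse(char(","))
file = seq(line.intersperse(char("\n")), eof).map(&:first)

p file.apply("first,last,age\nDenis,Defreyne,29").data
# prints [["first", "last", "age"], ["Denis", "Defreyne", "29"]]
```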
add(1, mul(2, 3))
mul(2, 3)

lparen = char('(')
rparen = char(')')
comma = char(',')

expr = alt(lazy { funcall }, nat_number)

funcall = seq(
  identifier,
  lparen,
  arglist,
  rparen,
)

letter = char_in('a'..'z')
identifier = repeat1(letter).capture

arglist = seq(expr, arglist_tail)
arglist_tail = repeat(seq(comma, whitespace, expr))

expr_list = expr.intersperse(char("\n"))
program = seq(expr_list, eof)

[
  ["add", 1, ["mul", 2, 3]],
  ["mul", 2, 3],
]
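
The grammar above is recursive: expr refers to funcall, and funcall (through arglist) refers back to expr. The lazy wrapper breaks that cycle by looking up the wrapped parser only when input is actually read. The sketch below is a recognizer for a single expression: it tracks positions only and does not build the nested arrays shown above. The program_parser helper, the block-based Parser class, and the choice to declare funcall as nil before the lazy block (so that the block sees a local variable in a plain Ruby script) are assumptions for this example, not d-parse's actual code.

```ruby
# Sketch only: a recognizer (positions, no data); not d-parse's actual code.
Success = Struct.new(:pos)
Failure = Struct.new(:pos)

class Parser
  def initialize(&block)
    @block = block
  end

  def read(input, pos)
    @block.call(input, pos)
  end

  def apply(input)
    read(input, 0)
  end
end

def char(expected)
  Parser.new { |input, pos| input[pos] == expected ? Success.new(pos + 1) : Failure.new(pos) }
end

def char_in(range)
  Parser.new do |input, pos|
    c = input[pos]
    c && range.include?(c) ? Success.new(pos + 1) : Failure.new(pos)
  end
end

def eof
  Parser.new { |input, pos| pos == input.length ? Success.new(pos) : Failure.new(pos) }
end

def seq(*parsers)
  Parser.new do |input, pos|
    result = Success.new(pos)
    parsers.each do |parser|
      result = parser.read(input, result.pos)
      break if result.is_a?(Failure)
    end
    result
  end
end

def alt(*parsers)
  Parser.new do |input, pos|
    result = Failure.new(pos)
    parsers.each do |parser|
      result = parser.read(input, pos)
      break if result.is_a?(Success)
    end
    result
  end
end

def repeat(parser)
  Parser.new do |input, pos|
    loop do
      result = parser.read(input, pos)
      break if result.is_a?(Failure) || result.pos == pos
      pos = result.pos
    end
    Success.new(pos)
  end
end

# Looks up the wrapped parser at read time, which permits recursion.
def lazy(&block)
  Parser.new { |input, pos| block.call.read(input, pos) }
end

# Hypothetical helper that wires the grammar together.
def program_parser
  digit = char_in("0".."9")
  nat_number = seq(digit, repeat(digit))
  letter = char_in("a".."z")
  identifier = seq(letter, repeat(letter))
  whitespace = repeat(char(" "))

  funcall = nil # declared first so the lazy block captures this variable
  expr = alt(lazy { funcall }, nat_number)
  arglist = seq(expr, repeat(seq(char(","), whitespace, expr)))
  funcall = seq(identifier, char("("), arglist, char(")"))
  seq(expr, eof)
end

p program_parser.apply("add(1, mul(2, 3))") # prints #<struct Success pos=17>
```

Without lazy, building expr would need funcall to exist already, and building funcall would need expr, so neither could be constructed first.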

And that is how you can write a parser.

github.com/ddfreyne/d-parse

My name is Denis Defreyne. Ready to parse your questions. Find me at [email protected], or @ddfreyne on Twitter.