Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Parsing for Humans
Search
Yorick Peterse
May 20, 2014
Programming
2
88
Parsing for Humans
Presentation for the Amsterdam Ruby user group.
Yorick Peterse
May 20, 2014
Tweet
Share
More Decks by Yorick Peterse
See All by Yorick Peterse
Garbage Collection Crash Course
yorickpeterse
1
360
Making GitLab Faster
yorickpeterse
2
450
Rubinius & The Eternal Yak
yorickpeterse
1
250
Oga
yorickpeterse
3
190
Other Decks in Programming
See All in Programming
色々なIaCツールを実際に触って比較してみる
iriikeita
0
330
よくできたテンプレート言語として TypeScript + JSX を利用する試み / Using TypeScript + JSX outside of Web Frontend #TSKaigiKansai
izumin5210
6
1.7k
카카오페이는 어떻게 수천만 결제를 처리할까? 우아한 결제 분산락 노하우
kakao
PRO
0
110
NSOutlineView何もわからん:( 前編 / I Don't Understand About NSOutlineView :( Pt. 1
usagimaru
0
340
聞き手から登壇者へ: RubyKaigi2024 LTでの初挑戦が 教えてくれた、可能性の星
mikik0
1
130
弊社の「意識チョット低いアーキテクチャ」10選
texmeijin
5
24k
Flutterを言い訳にしない!アプリの使い心地改善テクニック5選🔥
kno3a87
1
190
TypeScript Graph でコードレビューの心理的障壁を乗り越える
ysk8hori
2
1.1k
Pinia Colada が実現するスマートな非同期処理
naokihaba
4
230
Tauriでネイティブアプリを作りたい
tsucchinoko
0
370
Make Impossible States Impossibleを 意識してReactのPropsを設計しよう
ikumatadokoro
0
170
CSC509 Lecture 12
javiergs
PRO
0
160
Featured
See All Featured
A better future with KSS
kneath
238
17k
YesSQL, Process and Tooling at Scale
rocio
169
14k
Music & Morning Musume
bryan
46
6.2k
Keith and Marios Guide to Fast Websites
keithpitt
409
22k
Why Our Code Smells
bkeepers
PRO
334
57k
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
109
49k
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
PRO
10
720
The Invisible Side of Design
smashingmag
298
50k
Templates, Plugins, & Blocks: Oh My! Creating the theme that thinks of everything
marktimemedia
26
2.1k
GitHub's CSS Performance
jonrohan
1030
460k
Side Projects
sachag
452
42k
jQuery: Nuts, Bolts and Bling
dougneiner
61
7.5k
Transcript
Parsing For Humans
@yorickpeterse (Github, Twitter, Gmail, etc)
Olery http://olery.com/
Parsing “Analysing a string of symbols according to the rules
of a grammar.”
10 + 10
number = [0-9]+ operator = “+”
Vocabulary Token, AST, Lexer & Parser
Token A label for a specific part of the input.
10 [:INTEGER, “10”]
Abstract Syntax Tree A tree structure representing the input.
Markdown Lists * Item * Nested item
(unordered-list (list “Item” (unordered-list (list “Nested item”))))
Lexer Takes raw input and returns a sequence of tokens.
Parser Takes tokens as input and returns an AST.
Available Tools
ANTLR, Bison, Coco/R, Flex, Happy, Lemon, jQuery, Parslet, Racc, Ragel,
Rexical, Treetop, Yacc
Ragel & Racc
Ragel “A finite state machine compiler.”
“<!--” any* “-->” => { … };
$ ragel -R lexer.rl -o lexer.rb
Racc “A LALR(1) parser generator.”
expression : NUMBER OPERATOR NUMBER { … } ;
$ racc -o parser.rb parser.y
Oga Parsing XML/HTML in Ruby https://github.com/yorickpeterse/oga
Questions?