Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Parsing for Humans
Search
Sponsored
·
Ship Features Fearlessly
Turn features on and off without deploys. Used by thousands of Ruby developers.
→
Yorick Peterse
May 20, 2014
Programming
2
110
Parsing for Humans
Presentation for the Amsterdam Ruby user group.
Yorick Peterse
May 20, 2014
Tweet
Share
More Decks by Yorick Peterse
See All by Yorick Peterse
Garbage Collection Crash Course
yorickpeterse
1
390
Making GitLab Faster
yorickpeterse
2
480
Rubinius & The Eternal Yak
yorickpeterse
1
280
Oga
yorickpeterse
3
210
Other Decks in Programming
See All in Programming
Kubernetesでセルフホストが簡単なNewSQLを求めて / Seeking a NewSQL Database That's Simple to Self-Host on Kubernetes
nnaka2992
0
160
最初からAWS CDKで技術検証してもいいんじゃない?
akihisaikeda
4
160
車輪の再発明をしよう!PHP で実装して学ぶ、Web サーバーの仕組みと HTTP の正体
h1r0
1
160
20260313 - Grafana & Friends Taipei #1 - Kubernetes v1.36 的開發雜記:那些困在 Alpha 加護病房太久的 Metrics
tico88612
0
220
モックわからないマン卒業記 ~振る舞いを起点に見直した、フロントエンドテストにおけるモックの使いどころ~
tasukuwatanabe
3
400
How to stabilize UI tests using XCTest
akkeylab
0
130
クライアントワークでSREをするということ。あるいは事業会社におけるSREと同じこと・違うこと
nnaka2992
1
350
エラーログのマスキングの仕組みづくりに役立ったASTの話
kumoichi
0
260
ふつうの Rubyist、ちいさなデバイス、大きな一年
bash0c7
0
1.1k
Go Conference mini in Sendai 2026 : Goに新機能を提案し実装されるまでのフロー徹底解説
yamatoya
0
620
AWS Infrastructure as Code の新機能 2025 総まとめ 〜SA 4人による怒涛のデモ祭り〜
konokenj
10
3.4k
AI時代のシステム設計:ドメインモデルで変更しやすさを守る設計戦略
masuda220
PRO
6
1.1k
Featured
See All Featured
State of Search Keynote: SEO is Dead Long Live SEO
ryanjones
0
160
SEOcharity - Dark patterns in SEO and UX: How to avoid them and build a more ethical web
sarafernandez
0
150
WENDY [Excerpt]
tessaabrams
9
36k
Making the Leap to Tech Lead
cromwellryan
135
9.8k
DBのスキルで生き残る技術 - AI時代におけるテーブル設計の勘所
soudai
PRO
64
52k
Redefining SEO in the New Era of Traffic Generation
szymonslowik
1
250
Highjacked: Video Game Concept Design
rkendrick25
PRO
1
320
Scaling GitHub
holman
464
140k
Accessibility Awareness
sabderemane
0
82
A Guide to Academic Writing Using Generative AI - A Workshop
ks91
PRO
0
240
Game over? The fight for quality and originality in the time of robots
wayneb77
1
140
Optimizing for Happiness
mojombo
378
71k
Transcript
Parsing For Humans
@yorickpeterse (Github, Twitter, Gmail, etc)
Olery http://olery.com/
Parsing “Analysing a string of symbols according to the rules
of a grammar.”
10 + 10
number = [0-9]+ operator = “+”
Vocabulary Token, AST, Lexer & Parser
Token A label for a specific part of the input.
10 [:INTEGER, “10”]
Abstract Syntax Tree A tree structure representing the input.
Markdown Lists * Item * Nested item
(unordered-list (list “Item” (unordered-list (list “Nested item”))))
Lexer Takes raw input and returns a sequence of tokens.
Parser Takes tokens as input and returns an AST.
Available Tools
ANTLR, Bison, Coco/R, Flex, Happy, Lemon, jQuery, Parslet, Racc, Ragel,
Rexical, Treetop, Yacc
Ragel & Racc
Ragel “A finite state machine compiler.”
“<!--” any* “-->” => { … };
$ ragel -R lexer.rl -o lexer.rb
Racc “A LALR(1) parser generator.”
expression : NUMBER OPERATOR NUMBER { … } ;
$ racc -o parser.rb parser.y
Oga Parsing XML/HTML in Ruby https://github.com/yorickpeterse/oga
Questions?