Parsing for Humans
Yorick Peterse
May 20, 2014
Presentation for the Amsterdam Ruby user group.
Transcript
Parsing For Humans
@yorickpeterse (GitHub, Twitter, Gmail, etc.)
Olery http://olery.com/
Parsing: “Analysing a string of symbols according to the rules of a grammar.”
10 + 10
number = [0-9]+
operator = "+"
Vocabulary: Token, AST, Lexer & Parser
Token: A label for a specific part of the input.
10 → [:INTEGER, "10"]
Abstract Syntax Tree: A tree structure representing the input.
Markdown Lists:
* Item
  * Nested item
(unordered-list (list "Item" (unordered-list (list "Nested item"))))
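One way to picture the tree above in Ruby is as nested arrays, where each node is its type followed by its children. This is only a sketch: the array encoding and the `to_sexp` helper are assumptions made for illustration and do not come from the slides.

```ruby
# Hypothetical encoding of the AST for the Markdown list above:
# each node is [node_type, *children], and leaves are plain strings.
ast = [:"unordered-list",
       [:list, "Item",
        [:"unordered-list",
         [:list, "Nested item"]]]]

# Render a node as an S-expression string (illustrative helper).
def to_sexp(node)
  # Leaves are strings; inspect adds the surrounding double quotes.
  return node.inspect if node.is_a?(String)

  type, *children = node
  "(" + [type, *children.map { |child| to_sexp(child) }].join(" ") + ")"
end

puts to_sexp(ast)
# prints: (unordered-list (list "Item" (unordered-list (list "Nested item"))))
```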
Lexer: Takes raw input and returns a sequence of tokens.
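A minimal hand-written lexer for the `10 + 10` example might look like the following. It uses Ruby's standard `StringScanner` rather than Ragel, and the `lex` method and the `:OPERATOR` token name are assumptions for illustration (the slides only show `:INTEGER`).

```ruby
require "strscan"

# Turn raw input such as "10 + 10" into a sequence of [type, value] tokens.
# Hand-rolled sketch; a Ragel-generated lexer would be structured differently.
def lex(input)
  scanner = StringScanner.new(input)
  tokens  = []

  until scanner.eos?
    if scanner.skip(/\s+/)
      next # whitespace separates tokens but is not a token itself
    elsif (text = scanner.scan(/[0-9]+/))
      tokens << [:INTEGER, text]
    elsif (text = scanner.scan(/\+/))
      tokens << [:OPERATOR, text]
    else
      raise ArgumentError,
            "unexpected input at position #{scanner.pos}: #{scanner.peek(1).inspect}"
    end
  end

  tokens
end

p lex("10 + 10")
# prints: [[:INTEGER, "10"], [:OPERATOR, "+"], [:INTEGER, "10"]]
```

Anything that matches neither rule raises an error instead of being skipped silently, which keeps mistakes in the input visible.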
Parser: Takes tokens as input and returns an AST.
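To make the token-to-AST step concrete, here is a small hand-written parser that accepts exactly one integer, one operator and one integer. It is a sketch, not what a parser generator would emit: the `parse` and `expect` methods, the `ParseError` class and the node layout are all hypothetical.

```ruby
# Raised when the token sequence does not match the expected shape.
class ParseError < StandardError; end

# Consume the next token and return its value, or fail if the type differs.
def expect(tokens, type)
  actual_type, value = tokens.shift
  raise ParseError, "expected #{type}, got #{actual_type.inspect}" unless actual_type == type

  value
end

# Parse INTEGER OPERATOR INTEGER into a nested-array AST:
# [operator, [:integer, left], [:integer, right]]
def parse(tokens)
  tokens   = tokens.dup # avoid mutating the caller's array
  left     = expect(tokens, :INTEGER)
  operator = expect(tokens, :OPERATOR)
  right    = expect(tokens, :INTEGER)
  raise ParseError, "unexpected trailing token #{tokens.first.inspect}" unless tokens.empty?

  [operator.to_sym, [:integer, left.to_i], [:integer, right.to_i]]
end

p parse([[:INTEGER, "10"], [:OPERATOR, "+"], [:INTEGER, "10"]])
# prints: [:+, [:integer, 10], [:integer, 10]]
```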
Available Tools
ANTLR, Bison, Coco/R, Flex, Happy, Lemon, jQuery, Parslet, Racc, Ragel,
Rexical, Treetop, Yacc
Ragel & Racc
Ragel: “A finite state machine compiler.”
"<!--" any* "-->" => { … };
$ ragel -R lexer.rl -o lexer.rb
Racc: “A LALR(1) parser generator.”
expression : NUMBER OPERATOR NUMBER { … } ;
$ racc -o parser.rb parser.y
Oga: Parsing XML/HTML in Ruby. https://github.com/yorickpeterse/oga
Questions?