Writing a Parser in Ruby (Roll Your Own or Use a Library, That Is the Question)

Session slides from 福岡Ruby会議02 (Fukuoka RubyKaigi 02), November 2017. A talk about how much fun it is to write a parser.

やきとりい

November 25, 2017


Transcript

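  1. Yuki Torii (鳥井雪): Self-introduction
     • Works at 株式会社万葉 (Everyleaf Corporation)
     • Rails application engineer
     • Rails Girls Tokyo organizer
     • Translated books:
       • "Programming Elixir" by Dave Thomas, Ohmsha (co-translated with Koichi Sasada)
       • The "Hello Ruby" (ルビィのぼうけん) series by Linda Liukas, Shoeisha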
  2. Writing a PARSER in RUBY: Contents
     • Why I ended up writing a parser
     • Design
     • Parsers are hard!
     • A gem called Treetop
     • Hand-written parsers are fun anyway (though you probably shouldn't)
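  3. Design (good enough if it works; performance and the like are not considered)
     Ruby itself: Ruby script → Parse → Compile → Ruby byte code
     PatternMatching (a Ruby class): %p([a, 'bc']) =~ [3, 'bc']
     Diagram labels:
     • Pattern string "[a, 'bc']"
     • Variable list ["a"]
     • Variable definition
     • Evaluator
     • pattern_match obj
     • Parse pattern
     • Some fiddling to get hold of the Binding
     • Build the AST
     • Check whether it matches
     • Assign the variables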
  4. Design (good enough if it works; performance and the like are not considered)
     The same diagram as on the previous slide (Ruby script → Parse → Compile → Ruby byte code, next to the PatternMatching class handling %p([a, 'bc']) =~ [3, 'bc']), now with a callout: "Today's talk is about this part!!"
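  5. What the PARSER (of the PatternMatching class) does
     For example, when the pattern `%p([a, 'bc'])` is given, the parser receives the string "[a, 'bc']" and works out that:
     • the kind is "array"
     • the first element is the variable a
     • the second element is the string 'bc'
     • the list of required pattern variables is [a]
     It analyzes these facts and turns them into a structure.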
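
The structure described on this slide could look roughly like the following. This is only a sketch: the node class names and the `ParseResult` container are hypothetical and are not taken from the talk.

```ruby
# Hypothetical node classes; the names are assumptions for illustration.
ArrayNode    = Struct.new(:elements)
VariableNode = Struct.new(:name)
StringNode   = Struct.new(:value)

# Hypothetical container for what the parser hands back:
# the AST plus the list of pattern variables it found.
ParseResult = Struct.new(:ast, :variables)

# What parsing "[a, 'bc']" should produce, written out by hand.
result = ParseResult.new(
  ArrayNode.new([VariableNode.new("a"), StringNode.new("bc")]),
  ["a"]
)

puts result.ast.class             # the kind: an array node
puts result.ast.elements[0].name  # first element: the variable a
puts result.ast.elements[1].value # second element: the string bc
p result.variables                # the required pattern variables
```

Keeping the variable list next to the AST means the caller can define the variables without walking the tree a second time.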
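  6. Flow of parsing: %p([a, 'bc']) =~ [3, 'bc']
     String "[a, 'bc']" → Tokenize → Tokens → build AST → AST
     Tokens: "[", "a", ",", "'bc'", "]"
     AST:
       Array Node
         Variable Node (a)
         String Node ('bc')
     Note 1: in this implementation, the list of pattern variables is also built while the AST is constructed.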
  7. Flow of parsing: %p({status: 200, users: [a, b] }) =~ {status: 200, users: [1, 3] }
     String "{status: 200, users: [a, b] }" → Tokenize → Tokens → build AST → AST
     Tokens: "{", "status:", "200", ",", "users:", "[", "a", ",", "b", "]", "}"
     AST:
       Hash Node
         key: Symbol Node (:status)
         val: Integer Node (200)
         key: Symbol Node (:users)
         val: Array Node
           Variable Node (a)
           Variable Node (b)
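  8. What we ultimately want is the AST
     %p({status: 200, users: [a, b] }) =~ {status: 200, users: [1, 3] }
     AST for the pattern {status: 200, users: [a, b] }:
       Hash Node
         key: Symbol Node (:status)
         val: Integer Node (200)
         key: Symbol Node (:users)
         val: Array Node
           Variable Node (a)
           Variable Node (b)
     Walk the AST and check whether the value (the match target) matches the pattern:
     • Is it a hash in the first place?
     • Is the val for key status: equal to 200?
     • Is the val for key users: an array?
     • Does the array have 2 elements?
     • Store the array's 1st element in variable a
     • Store the array's 2nd element in variable b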
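
The walk over the AST listed on this slide can be sketched as a small recursive function. Everything here is an assumption made for illustration: the node classes, the `match_node` name, and the choice to let extra keys in the target hash pass are not taken from the talk.

```ruby
# Hypothetical node classes; the names are assumptions for illustration.
HashNode     = Struct.new(:pairs)    # pairs: array of [key_node, value_node]
ArrayNode    = Struct.new(:elements)
SymbolNode   = Struct.new(:value)
IntegerNode  = Struct.new(:value)
VariableNode = Struct.new(:name)

# Returns true when target matches the pattern node.
# Captured variables are written into bindings along the way,
# so bindings may be partially filled when the match fails.
def match_node(node, target, bindings)
  case node
  when HashNode
    return false unless target.is_a?(Hash) # is it a hash in the first place?
    node.pairs.all? do |key_node, value_node|
      key = key_node.value
      target.key?(key) && match_node(value_node, target[key], bindings)
    end
  when ArrayNode
    # is it an array, and does it have the expected number of elements?
    return false unless target.is_a?(Array) && target.size == node.elements.size
    node.elements.zip(target).all? { |child, item| match_node(child, item, bindings) }
  when VariableNode
    bindings[node.name] = target # a variable matches anything and captures it
    true
  else
    node.value == target # literal nodes (symbol, integer) compare by value
  end
end

# AST for the pattern {status: 200, users: [a, b]}, built by hand.
ast = HashNode.new([
  [SymbolNode.new(:status), IntegerNode.new(200)],
  [SymbolNode.new(:users),
   ArrayNode.new([VariableNode.new("a"), VariableNode.new("b")])]
])

bindings = {}
matched = match_node(ast, { status: 200, users: [1, 3] }, bindings)
p matched
p bindings
```

With the target from the slide, `matched` is true and `bindings` maps "a" to 1 and "b" to 3. Changing `status` to another value or giving `users` three elements makes the match fail.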
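  9. Tokenize
     String "[a, 'bc']" → Tokenize → Tokens: "[", "a", ",", "'bc'", "]"
     Look at each token's type and build up the AST along the lines of "Oh, an array opening symbol has arrived, so from here until the array closes we are inside the array".
     Value   Type
     [       array opening symbol
     a       variable
     ,       comma
     'bc'    string
     ]       array closing symbol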
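
The type-driven AST building described on this slide might be sketched as below, for the array-only case. The `AstBuilder` class, the node classes, and the `[type, value]` token shape are all hypothetical; the talk does not show its actual implementation here.

```ruby
# Hypothetical node classes; the names are assumptions for illustration.
ArrayNode    = Struct.new(:elements)
VariableNode = Struct.new(:name)
StringNode   = Struct.new(:value)

# Builds an AST from [type, value] tokens and collects the
# pattern variable names while doing so.
class AstBuilder
  attr_reader :variables

  def initialize(tokens)
    @tokens = tokens.dup
    @variables = []
  end

  def build
    node = parse_value
    raise ArgumentError, "unexpected trailing tokens" unless @tokens.empty?
    node
  end

  private

  def parse_value
    type, value = @tokens.shift
    case type
    when :array_open
      parse_array
    when :variable
      @variables << value
      VariableNode.new(value)
    when :string
      StringNode.new(value[1..-2]) # drop the surrounding quotes
    else
      raise ArgumentError, "unexpected token #{value.inspect}"
    end
  end

  # An array opener has arrived: everything up to the closer is array content.
  # Simplification: commas are skipped rather than strictly required.
  def parse_array
    elements = []
    until next_type == :array_close
      elements << parse_value
      @tokens.shift if next_type == :comma
    end
    @tokens.shift # consume the closing bracket
    ArrayNode.new(elements)
  end

  def next_type
    raise ArgumentError, "unexpected end of input" if @tokens.empty?
    @tokens.first[0]
  end
end

tokens = [
  [:array_open, "["], [:variable, "a"], [:comma, ","],
  [:string, "'bc'"], [:array_close, "]"]
]
builder = AstBuilder.new(tokens)
ast = builder.build
p ast
p builder.variables
```

Because `parse_value` calls `parse_array` and `parse_array` calls `parse_value` again, nested arrays come for free. Supporting hashes would mean adding another branch of the same shape.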
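  10. Tokenize: what I used was StringScanner#scan
     • StringScanner#scan
     • Scans the string from the head; when the regular expression matches, it returns the matched part and advances the index to just past it
     String "[a, 'bc']" → Scan → Tokens: "[", "a", ",", "'bc'", "]"
     Regular expression      Type
     /\[/                    array opening symbol
     /[a-z_][a-z0-9_]*/      variable
     /,/                     comma
     /'.*?'/                 string
     /\]/                    array closing symbol
     Remaining string as scanning proceeds: "[a, 'bc']" → "a, 'bc']" → "'bc']" → "]"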
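
A minimal tokenizer built on `StringScanner#scan` with the regular expressions from this slide could look like this. The token type names, the `[type, value]` token shape, and the whitespace skipping are assumptions added for illustration.

```ruby
require "strscan"

# The regular expressions from the slide, paired with token types.
# The type names are assumptions for illustration.
TOKEN_PATTERNS = [
  [:array_open,  /\[/],
  [:variable,    /[a-z_][a-z0-9_]*/],
  [:comma,       /,/],
  [:string,      /'.*?'/],
  [:array_close, /\]/]
].freeze

def tokenize(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    next if scanner.skip(/\s+/) # ignore whitespace between tokens

    # Find the first pattern that matches at the current position.
    type, regex = TOKEN_PATTERNS.find { |_type, candidate| scanner.match?(candidate) }
    raise ArgumentError, "cannot tokenize #{scanner.rest.inspect}" unless type

    # scan returns the matched text and moves the scan pointer past it.
    tokens << [type, scanner.scan(regex)]
  end
  tokens
end

tokenize("[a, 'bc']").each { |type, value| puts "#{value}\t#{type}" }
```

Each call to `scan` consumes one token from the head of the string, which is why the remaining string gets shorter step by step until `eos?` is true.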
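  11. • I got as far as being able to parse something like
       %p( [x, :y, { "array" => [5, v] }] )
     • Writing a parser from scratch on your own is quite a tightrope walk
     • Extensibility also hits a limit (for me, at least)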
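  12. When you try writing a PARSER...
     • Things you have always read and written quite naturally, such as `[1, 2, 3]` or `{status: 200, users: [1, 2] }`, suddenly appear in front of you as "strings that are about to be interpreted (and do not carry any meaning yet)"
     • Spaces, commas: everything has a meaning
     • Ruby's own parser is amazing
     • The human eye is amazing too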