
Writing a Parser in Ruby (By Hand or with a Library, That Is the Question)

Slides from my session at 福岡Ruby会議02 (Fukuoka RubyKaigi 02), November 2017. A talk about how much fun it is to write a parser.

やきとりい

November 25, 2017


Transcript

  1. 鳥井雪 (Yuki Torii): self-introduction
     • Works at 株式会社万葉
     • Rails application engineer
     • Rails Girls Tokyo organizer
     • Translated books:
       • 『プログラミングElixir』 by Dave Thomas (オーム社), co-translated with 笹田耕一
       • The 『ルビィのぼうけん』 series by Linda Liukas (翔泳社)
  2. Writing a PARSER in RUBY: contents
     • Why I ended up writing a parser
     • Design
     • Parsers are hard!
     • A gem called Treetop
     • A homemade parser is fun all the same (though you probably shouldn't)
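  3. Design (if it works, that's good enough; performance and the like are out of scope)
     Ruby side: Ruby script → Parse → Compile → Ruby byte code → Evaluator
     PatternMatching (a Ruby class): %p([a, 'bc']) =~ [3, 'bc']
     • The pattern arrives as the string "[a, 'bc']"
     • Parse pattern: build the AST and the variable list ["a"]
     • Variable definition
     • pattern_match obj: check whether the object matches
     • Variable assignment (with some fiddling to get hold of the Binding)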
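  4. Design (if it works, that's good enough; performance and the like are out of scope)
     Ruby side: Ruby script → Parse → Compile → Ruby byte code → Evaluator
     PatternMatching (a Ruby class): %p([a, 'bc']) =~ [3, 'bc']
     • The pattern arrives as the string "[a, 'bc']"
     • Parse pattern: build the AST and the variable list ["a"]
     • Variable definition
     • pattern_match obj: check whether the object matches
     • Variable assignment (with some fiddling to get hold of the Binding)
     Callout: "Today's talk is about this part!!"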
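The design slides mention some fiddling to get hold of a Binding so that matched values can be assigned to variables, but they do not show how it is done. As a minimal sketch of one mechanism Ruby itself provides, and not necessarily what the talk's implementation does, `Binding#local_variable_set` can write to a local variable that already exists in the caller's scope. The helper name `assign_matches` is hypothetical.

```ruby
# Hypothetical helper: writes each matched value into the given Binding.
def assign_matches(target_binding, matches)
  matches.each do |name, value|
    target_binding.local_variable_set(name, value)
  end
end

a = nil                           # defined beforehand so the bare name `a` is a local variable
assign_matches(binding, { a: 3 }) # pretend the pattern [a, 'bc'] matched [3, 'bc']
puts a                            # prints 3
```

The variable has to be known before the script is parsed for the bare name to resolve, which is presumably why the parser also produces the list of pattern variables.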
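  5. What the PARSER (of the PatternMatching class) does
     For example, when the pattern `%p([a, 'bc'])` is given, the parser receives the string "[a, 'bc']" and works out that:
     • Kind: it is an array
     • The first element is the variable a
     • The second element is the string 'bc'
     • The list of required pattern variables is [a]
     It then turns the result of this analysis into a structure.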
  6. The PARSE flow
     %p([a, 'bc']) =~ [3, 'bc']
     String "[a, 'bc']" → Tokenize → Tokens: [ and a and , and 'bc' and ] → build AST → AST
     AST:
       Array Node
         Variable Node (a)
         String Node ('bc')
     Note 1: this time, the pattern variable list is also built while the AST is being built.
  7. The PARSE flow
     %p({status: 200, users: [a, b] }) =~ {status: 200, users: [1, 3] }
     String "{status: 200, users: [a, b] }" → Tokenize → Tokens: { and status: and 200 and , and users: and [ and a and , and b and ] and } → build AST → AST
     AST:
       Hash Node
         key: Symbol Node (:status)
         val: Integer Node (200)
         key: Symbol Node (:users)
         val: Array Node
           Variable Node (a)
           Variable Node (b)
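  8. What we ultimately want is the AST
     %p({status: 200, users: [a, b] }) =~ {status: 200, users: [1, 3] }
     AST:
       Hash Node
         key: Symbol Node (:status)
         val: Integer Node (200)
         key: Symbol Node (:users)
         val: Array Node
           Variable Node (a)
           Variable Node (b)
     Pattern: {status: 200, users: [a, b] }
     Walk the AST and check whether the value matches the pattern. Checks against the match target:
     • Is it a hash in the first place?
     • Is the val for key status: 200?
     • Is the val for key users: an array?
     • Does the array have 2 elements?
     • Store the array's 1st element in variable a
     • Store the array's 2nd element in variable b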
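The walk described on this slide can be sketched as a small recursive function. This is an illustration under assumptions rather than the talk's code: the node classes are hypothetical Structs named after the diagram labels, `LiteralNode` stands in for the Integer, String and Symbol nodes, hash keys are stored as plain Ruby values instead of Symbol nodes, and extra keys in the matched hash are tolerated.

```ruby
# Hypothetical node classes named after the labels in the AST diagram.
HashNode     = Struct.new(:pairs)    # pairs: { key => value node }
ArrayNode    = Struct.new(:elements)
VariableNode = Struct.new(:name)
LiteralNode  = Struct.new(:value)    # stands in for Integer/String/Symbol nodes

# Returns a Hash of variable bindings when value matches the pattern node,
# or nil when it does not.
def match(node, value, bindings = {})
  case node
  when VariableNode
    bindings[node.name] = value      # a variable matches anything and captures it
    bindings
  when LiteralNode
    node.value == value ? bindings : nil
  when ArrayNode
    return nil unless value.is_a?(Array) && value.size == node.elements.size

    matched = node.elements.zip(value).all? { |child, item| match(child, item, bindings) }
    matched ? bindings : nil
  when HashNode
    return nil unless value.is_a?(Hash)

    matched = node.pairs.all? do |key, child|
      value.key?(key) && match(child, value[key], bindings)
    end
    matched ? bindings : nil
  end
end

# AST for the pattern {status: 200, users: [a, b] }, written out by hand.
pattern = HashNode.new(
  { status: LiteralNode.new(200),
    users: ArrayNode.new([VariableNode.new(:a), VariableNode.new(:b)]) }
)

result = match(pattern, { status: 200, users: [1, 3] })
# result binds a to 1 and b to 3; a non-matching value returns nil instead.
```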
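  9. Tokenize
     String "[a, 'bc']" → Tokenize → Tokens: [ and a and , and 'bc' and ]
     Tokens:
       Value   Type
       [       array opening symbol
       a       variable
       ,       comma
       'bc'    string
       ]       array closing symbol
     Look at each Token's type and build up the AST along the lines of: "Oh, an array opening symbol has arrived, so everything from here until the array closes is the array's contents."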
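The idea of reading token types and treating everything between the opening and closing symbols as array contents can be sketched as a tiny recursive-descent parser. The class name `TokenParser`, the node Structs and the `[type, value]` token format are assumptions made for illustration; the sketch only covers arrays, variables and strings, and it is lenient about missing commas.

```ruby
# Hypothetical node classes; names follow the labels in the slide diagrams.
ArrayNode    = Struct.new(:elements)
VariableNode = Struct.new(:name)
StringNode   = Struct.new(:value)

# Builds an AST from [type, value] token pairs and collects pattern variables.
class TokenParser
  attr_reader :variables

  def initialize(tokens)
    @tokens = tokens.dup
    @variables = []
  end

  def parse
    node = parse_value
    raise ArgumentError, 'unexpected trailing tokens' unless @tokens.empty?

    node
  end

  private

  def parse_value
    type, value = @tokens.shift
    case type
    when :array_open then parse_array
    when :variable
      @variables << value            # the variable list is built alongside the AST
      VariableNode.new(value)
    when :string then StringNode.new(value[1..-2]) # strip the surrounding quotes
    else raise ArgumentError, "unexpected token: #{value.inspect}"
    end
  end

  # Called right after an opening symbol: everything up to the matching
  # closing symbol belongs to this array.
  def parse_array
    elements = []
    until next_type == :array_close
      elements << parse_value
      @tokens.shift if next_type == :comma
    end
    @tokens.shift                    # consume the closing symbol
    ArrayNode.new(elements)
  end

  def next_type
    raise ArgumentError, 'unclosed array' if @tokens.empty?

    @tokens.first.first
  end
end

tokens = [[:array_open, '['], [:variable, 'a'], [:comma, ','],
          [:string, "'bc'"], [:array_close, ']']]
parser = TokenParser.new(tokens)
ast = parser.parse      # ArrayNode containing VariableNode("a") and StringNode("bc")
parser.variables        # => ["a"]
```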
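  10. What I used: StringScanner#scan
      Tokenize
      • StringScanner#scan
      • Scans the string from the head; when the regular expression matches, it returns the matched part and advances the index to just past it
      Regular expression     Type
      /\[/                   array opening symbol
      /[a-z_][a-z0-9_]*/     variable
      /,/                    comma
      /'.*?'/                string
      /\]/                   array closing symbol
      Scan (remaining string after each step): "[a, 'bc']" → "a, 'bc']" → "'bc']" → "]"
      String "[a, 'bc']" → Tokens: [ and a and , and 'bc' and ]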
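A minimal tokenizer along these lines, using the regular expressions from the table, might look as follows. The `Token` struct, the `TOKEN_PATTERNS` constant, the type names and the whitespace handling are assumptions made for illustration, not the talk's code.

```ruby
require 'strscan'

# Hypothetical token representation; the regexes are the ones from the table.
Token = Struct.new(:type, :value)

TOKEN_PATTERNS = [
  [:array_open,  /\[/],
  [:variable,    /[a-z_][a-z0-9_]*/],
  [:comma,       /,/],
  [:string,      /'.*?'/],
  [:array_close, /\]/]
].freeze

def tokenize(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    next if scanner.skip(/\s+/)      # assumption: whitespace between tokens is dropped

    # scan returns the matched text and advances the scan pointer, or returns nil.
    matched = TOKEN_PATTERNS.any? do |type, regex|
      text = scanner.scan(regex)
      tokens << Token.new(type, text) if text
    end
    raise ArgumentError, "unexpected input at position #{scanner.pos}" unless matched
  end
  tokens
end

tokenize("[a, 'bc']").map(&:type)
# => [:array_open, :variable, :comma, :string, :array_close]
```

Because `scan` only matches at the current scan pointer, trying the patterns in order at each position is enough; no pattern here can match the empty string, so the loop always makes progress or raises.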
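  11. • I got as far as being able to parse something like
        %p( [x, :y, { "array" => [5, v] }] )
      • Writing a parser from scratch by hand is quite a tightrope walk
      • There are also limits to how far it can be extended (for me, at least)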
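  12. When you try writing a PARSER…
      • Things you had always read and written without a second thought, such as `[1, 2, 3]` and `{status: 200, users: [1, 2] }`, suddenly appear before you as "a string that is about to be interpreted (and does not yet have any meaning)"
      • Spaces, commas: everything has meaning
      • The parser in Ruby itself is amazing
      • So is the human eye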