
Writing a Parser in Ruby (By Hand or With a Library, That Is the Question)

Session slides from Fukuoka RubyKaigi 02 (福岡Ruby会議02), November 2017. A talk about how much fun it is to write a parser.

やきとりい

November 25, 2017

Transcript

  1. Writing a PARSER in RUBY

    (By hand or with a library, that is the question)
    2017.Nov Fukuoka RubyKaigi 02, Yuki Torii (鳥井雪)

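  2. Yuki Torii
    Self-introduction
    • Works at Everyleaf Corporation (株式会社万葉)
    • Rails application engineer
    • Rails Girls Tokyo 2nd organizer
    • Translated books:
    • "Programming Elixir" (『プログラミングElixir』), by Dave Thomas, Ohmsha; co-translated with Koichi Sasada
    • The "Hello Ruby" series (『ルビィのぼうけん』), by Linda Liukas, Shoeisha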

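  3. Yuki Torii
    Self-introduction
    Fukuoka period:
    Born in Fukuoka
    Hakozaki Elementary School, Higashi Ward
    Chikushi Jogakuen Junior High School
    Chikushi Jogakuen High School
    => Moved to Tokyo for university
    Became a programmer before I knew it
    2012: LT at Fukuoka RubyKaigi 01
    2015: Rails Girls Fukuoka coach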

  5. Writing a PARSER in RUBY
    Contents
    • Why I ended up writing a Parser
    • Design
    • Parsers are hard!
    • A gem called Treetop
    • A hand-written Parser is still fun (though you probably shouldn't)

  6. Why I ended up writing a PARSER
    It started as a passing whim

  7. I want ELIXIR-style (note 1) pattern matching (note 2) in RUBY…
    Note 1 Elixir: a programming language that runs on the Erlang VM
    Note 2 Pattern matching: a cool feature that Elixir has

  8. The pattern match I want to write in RUBY
    Match with =~ and then use the pattern variables as ordinary variables

  9. The pattern match I want to write in RUBY
    Match with === as well (I want to write CASE statements)
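
In plain Ruby both operators are ordinary methods, so an object can opt in to `=~` and to `case`/`when` (which calls `===` on the `when` operand). A minimal sketch of that mechanism only: `TinyPattern` is a hypothetical stand-in with its pattern hard-coded, not the implementation from the talk, and it exposes the captured value through an accessor because a plain object cannot create local variables in its caller.

```ruby
# Hypothetical stand-in for a pattern object. It matches any two-element
# array whose second element is "bc" and remembers the first element.
class TinyPattern
  attr_reader :captured

  def =~(other)
    matched = other.is_a?(Array) && other.size == 2 && other[1] == "bc"
    @captured = matched ? other[0] : nil
    matched
  end

  # `case`/`when` calls === on the `when` operand, so delegate to =~.
  def ===(other)
    self =~ other
  end
end

pattern = TinyPattern.new
result =
  case [3, "bc"]
  when pattern then "matched, a = #{pattern.captured}"
  else "no match"
  end
```

Here `result` is `"matched, a = 3"`. Turning `a` into a real local variable is the part that needs help from the interpreter.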

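  10. If this (note 1) actually ran, wouldn't it be persuasive…? (note 2)
    Note 1 At first I was only dreaming about the spec
    Note 2 Whether it was persuasive in the end remains unknown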

  11. I tried it

    (RUBY KAIGI YOUTUBE

    HTTPS://WWW.YOUTUBE.COM/WATCH?
    V=1M4IPJH0K0E&INDEX=19&T=6S&LIST=PL

  12. I tried it
    This is the entire diff to the C code of the RUBY interpreter
    compile.c
    parse.y

  13. Design
    (If it works, that's good enough; performance and the like are out of scope)
    Ruby script
    Parse
    Compile
    Ruby byte code
    Evaluator

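  14. Design
    (If it works, that's good enough; performance and the like are out of scope)
    Ruby script
    Parse
    Compile
    Ruby byte code
    PatternMatching
    %p([a, 'bc']) =~ [3, 'bc']
    "[a, 'bc']"
    Variable list ["a"]
    Variable definitions
    Evaluator
    pattern_match obj
    Parse pattern
    Some fiddling to get hold of the Binding
    Build the AST
    Check whether it matches
    Assign the variables
    A Ruby class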
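
The "variable definitions" and "Binding" boxes can be illustrated with plain Ruby. A minimal sketch, assuming only the standard `Binding#local_variable_set`; `assign_pattern_variables` is a hypothetical helper, not code from the talk. A Ruby class can write into a caller's local variable through a Binding, but the write is only visible as `a` if `a` already exists when the caller's code is parsed, which is presumably why the variable list is produced on the Parse/Compile side (the talk does not spell this out).

```ruby
# Hypothetical helper: copies match results into the caller's Binding.
def assign_pattern_variables(bindings, target)
  bindings.each do |name, value|
    target.local_variable_set(name.to_sym, value)
  end
end

# The local must already be defined here; a variable first created through
# the Binding would live only inside that Binding object.
a = nil
assign_pattern_variables({ "a" => 3 }, binding)
```

After the call, `a` holds 3.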

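  15. Design
    (If it works, that's good enough; performance and the like are out of scope)
    Ruby script
    Parse
    Compile
    Ruby byte code
    PatternMatching
    %p([a, 'bc']) =~ [3, 'bc']
    "[a, 'bc']"
    Variable list ["a"]
    Variable definitions
    Evaluator
    pattern_match obj
    Parse pattern
    Some fiddling to get hold of the Binding
    A Ruby class
    Today's talk is about this part!!
    Build the AST
    Check whether it matches
    Assign the variables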

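  16. What the PARSER (of the PATTERN MATCHING class) does
    For example, when the pattern `%p([a, 'bc'])` is given,

    it receives the string "[a, 'bc']" and…
    • Kind: it is an "array"
    • The first element is the variable a
    • The second element is the string 'bc'
    • The list of required pattern variables is [a]
    …analyzes all of this and turns it into a structure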

  17. The PARSE flow
    %p([a, 'bc']) =~ [3, 'bc']
    String: "[a, 'bc']"
    -> Tokenize ->
    Tokens: [ and a and , and 'bc' and ]
    -> Build AST ->
    AST:
    Array Node
      Variable Node (a)
      String Node ('bc')
    Note 1 This time, the pattern-variable list is also built while the AST is being built
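
The step from Tokens to AST can be sketched as a small recursive builder. This is an illustration under assumptions, not the code from the talk: `Token`, `Node`, and `AstBuilder` are hypothetical names, only arrays, variables, and strings are handled, and the token list is written out by hand.

```ruby
# Hypothetical token and node shapes; the slides do not show the real ones.
Token = Struct.new(:type, :text)
Node  = Struct.new(:type, :value)

# Builds an AST from a flat token list and records pattern variables.
class AstBuilder
  attr_reader :variables

  def initialize(tokens)
    @tokens = tokens.dup
    @variables = []
  end

  def build
    node = parse_value
    raise ArgumentError, "unexpected trailing tokens" unless @tokens.empty?
    node
  end

  private

  def parse_value
    token = @tokens.shift
    raise ArgumentError, "unexpected end of pattern" if token.nil?

    case token.type
    when :array_open then parse_array
    when :variable
      @variables << token.text
      Node.new(:variable, token.text)
    when :string
      Node.new(:string, token.text[1..-2]) # drop the surrounding quotes
    else
      raise ArgumentError, "unexpected token #{token.text.inspect}"
    end
  end

  # Called after '[' has been consumed; reads elements until ']'.
  def parse_array
    elements = []
    until @tokens.first&.type == :array_close
      raise ArgumentError, "unclosed array" if @tokens.empty?
      elements << parse_value
      @tokens.shift if @tokens.first&.type == :comma
    end
    @tokens.shift # consume ']'
    Node.new(:array, elements)
  end
end

tokens = [
  Token.new(:array_open, "["),
  Token.new(:variable, "a"),
  Token.new(:comma, ","),
  Token.new(:string, "'bc'"),
  Token.new(:array_close, "]"),
]
builder = AstBuilder.new(tokens)
ast = builder.build
```

`ast` is an `:array` node holding a `:variable` node and a `:string` node, and `builder.variables` is `["a"]`. The sketch is lenient about commas; a stricter parser would require one between elements.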

  18. The PARSE flow
    %p({status: 200, users: [a, b] }) =~ {status: 200, users: [1, 3] }
    String: "{status: 200, users: [a, b] }"
    -> Tokenize ->
    Tokens: { and status: and 200 and , and users: and [ and a and , and b and ] and }
    -> Build AST ->
    AST:
    Hash Node
      key: Symbol Node (:status)
      val: Integer Node (200)
      key: Symbol Node (:users)
      val: Array Node
        Variable Node (a)
        Variable Node (b)

  19. What we ultimately want is the AST
    %p({status: 200, users: [a, b] }) =~ {status: 200, users: [1, 3] }
    AST:
    Hash Node
      key: Symbol Node (:status)
      val: Integer Node (200)
      key: Symbol Node (:users)
      val: Array Node
        Variable Node (a)
        Variable Node (b)
    {status: 200, users: [a, b] }
    Walk the AST and check whether the value matches the pattern
    Match target
    Is it a hash in the first place?
    Is the val for key status: 200?
    Is the val for key users: an array?
    Does the array have 2 elements?
    Store the 1st element of the array in variable a
    Store the 2nd element of the array in variable b
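
The walk described above can be sketched as a recursive function. A minimal sketch under assumptions: `Node` and `match_pattern` are hypothetical names, hash keys are kept as plain Symbols rather than Symbol nodes to keep the example short, and bindings are returned as a Hash instead of being assigned to local variables.

```ruby
# Hypothetical node shape; the slides do not show the real one.
Node = Struct.new(:type, :value)

# Returns a Hash of variable bindings when `value` matches, nil otherwise.
def match_pattern(node, value, bindings = {})
  case node.type
  when :variable
    bindings[node.value] = value # a variable matches anything
    bindings
  when :integer, :string, :symbol
    node.value == value ? bindings : nil
  when :array
    return nil unless value.is_a?(Array) && value.size == node.value.size
    node.value.zip(value).each do |child, item|
      return nil unless match_pattern(child, item, bindings)
    end
    bindings
  when :hash
    return nil unless value.is_a?(Hash)
    node.value.each do |key, child|
      return nil unless value.key?(key)
      return nil unless match_pattern(child, value[key], bindings)
    end
    bindings
  end
end

pattern = Node.new(:hash, {
  status: Node.new(:integer, 200),
  users: Node.new(:array, [
    Node.new(:variable, "a"),
    Node.new(:variable, "b"),
  ]),
})

result = match_pattern(pattern, { status: 200, users: [1, 3] })
miss   = match_pattern(pattern, { status: 404, users: [1, 3] })
```

`result` is `{"a" => 1, "b" => 3}` and `miss` is `nil`, following the same checks as the slide: is it a hash, does `status:` equal 200, is `users:` a two-element array.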

  20. It was hard

    (especially TOKENIZE)

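  21. Tokenize
    String: "[a, 'bc']"
    -> Tokenize ->
    Tokens: [ and a and , and 'bc' and ]
    Look at each Token's type and build the AST along the lines of "Oh, an array-open symbol has arrived, so everything from here until the array closes is the array's contents"
    Tokens
    Value   Type
    [       array-open symbol
    a       variable
    ,       comma
    'bc'    string
    ]       array-close symbol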

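  22. Tokenize
    What I used: StringScanner#scan
    • StringScanner#scan
    • Scans the string from the head; when the regular expression matches, it returns the matched part and advances the index to just past it
    Regular expression    Type
    /\[/                  array-open symbol
    /[a-z_][a-z0-9_]*/    variable
    /,/                   comma
    /'.*?'/               string
    /\]/                  array-close symbol
    Scan, String -> Tokens
    String: "[a, 'bc']" -> "a, 'bc']" -> "'bc']" -> "]"
    Tokens: [ a , 'bc' ]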
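
The regex table above can be turned into a tokenizer loop along these lines. A minimal sketch, not the code from the talk: `Token`, `TOKEN_RULES`, and `tokenize` are hypothetical names, and skipping whitespace with `/\s+/` is an assumption, since the table has no whitespace rule.

```ruby
require "strscan"

Token = Struct.new(:type, :text)

# The regex table from the slide, tried in order at the scan position.
TOKEN_RULES = [
  [:array_open,  /\[/],
  [:variable,    /[a-z_][a-z0-9_]*/],
  [:comma,       /,/],
  [:string,      /'.*?'/],
  [:array_close, /\]/],
].freeze

def tokenize(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    # Assumption: whitespace between tokens is skipped, however long it is.
    next if scanner.skip(/\s+/)

    # scan returns the matched text and advances, or returns nil and stays.
    matched = TOKEN_RULES.any? do |type, regex|
      text = scanner.scan(regex)
      tokens << Token.new(type, text) if text
      text
    end
    raise ArgumentError, "cannot tokenize #{scanner.rest.inspect}" unless matched
  end
  tokens
end

tokens = tokenize("[a, 'bc']")
```

`tokens.map(&:text)` gives `["[", "a", ",", "'bc'", "]"]`. Because every rule goes through the same loop, a token type that needs its own scanning logic does not fit without extra code.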

  23. Example: a bug where it fails once there are two or more spaces
    Note the commit timestamps
    "[a, 'bc']" "[a,  'bc']"
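
The slide does not show the cause of this bug. One plausible way this class of bug arises with StringScanner, shown purely as an illustration, is a rule that bakes a single optional space into a token instead of skipping whitespace separately. Both regexes below are hypothetical.

```ruby
require "strscan"

# Fragile: the comma rule swallows at most one following space.
fragile = StringScanner.new(",  'bc'")
fragile.scan(/, ?/)
fragile_next = fragile.check(/'.*?'/) # nil: a second space is still in the way

# Robust: consume the comma, then skip any amount of whitespace.
robust = StringScanner.new(",  'bc'")
robust.scan(/,/)
robust.skip(/\s*/)
robust_next = robust.check(/'.*?'/)   # "'bc'"
```

`check` matches only at the current scan position, so a single leftover space is enough to make the next rule fail.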

  24. Example: a hash that breaks when the {} are omitted, as one naturally would
    Not fixed yet (I'll do my best)
    %p({ user: 1, from: 'Fukuoka'})
    %p( user: 1, from: 'Fukuoka' )

  25. Example: tokenizing rules specific to one type of token cannot be handled
    TOKENIZE can only apply one uniform set of regular expressions
    "Name is #{user.name}"

  26. • It can now parse patterns up to about %p( [x, :y, { "array" => [5, v] }] )
    • Writing a Parser from scratch by yourself is quite a tightrope walk
    • Extensibility also has its limits (for me, at least)

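  27. When you try writing a PARSER…
    • Things you have always read and written without a second thought, such as `[1, 2, 3]` or `{status: 200, users: [1, 2] }`, suddenly appear in front of you as "a string that is about to be interpreted (and has no meaning yet)"
    • Spaces, commas, everything has meaning
    • The parser in Ruby itself is amazing
    • Human eyes are amazing too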

  28. Truly an experience of "meeting RUBY once more"

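  29. By the way, there is a gem called Treetop
    https://github.com/cjheath/treetop
    • You write a .treetop file that defines grammar rules, using regular expressions and the like, in its own PEG-based notation
    • Pass that file to the tt command and it generates a Ruby parser file from it
    • require the generated Ruby file and you can use a Parser that builds syntax nodes, in other words an AST
    • Nested rules are also easy to write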

  30. Shouldn't I have just used TREETOP from the start…?

  31. Comparison table: hand-written PARSER vs. TREETOP
    Columns: Hand-written, Treetop
    Refinement of the notation
    Nesting of rules
    Resistance to bugs
    Getting to meet Ruby

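  32. A hand-written PARSER is still fun
    (and that's not just sour grapes)
    • When I started building it, I only vaguely understood what a "Parser" even was
    • Had I used Treetop at that stage, I probably would have been defeated by the level of abstraction and unable to make good use of it
    • Now, when I can't figure out how to use Treetop, reading the Ruby parser it generated lets me see what it is trying to do
    • The experience of seeing all of your own code as strings: priceless
    • Reinventing the wheel is fine; what matters is that the wheel gets assembled in your own mind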

  33. * Once you try to do anything beyond a certain level of complexity you get stuck, and that is a good time to switch over *

  34. Let's keep meeting RUBY, again and again.

    Thank you very much.
