Ruby Parser progress report 2024

yui-knk
August 31, 2024

RubyKaigi 2024 follow up
https://rhc.connpass.com/event/320709/

Links:
Ruby Parser Development Diary (19): Designing the Best Syntax Tree, 2024 Edition
https://yui-knk.hatenablog.com/entry/2024/08/23/113543

Transcript

  1. Ruby Parser progress report 2024

    August 31, 2024, at RubyKaigi 2024 follow up
    @yui-knk, Yuichiro Kaneko
  2. About me

    Yuichiro Kaneko
    yui-knk (GitHub) / spikeolaf (Twitter)
    Treasure Data, Engineering Manager of Applications Backend
    CRuby committer, mainly developing the parser generator and the parser
    Lrama, an LALR(1) parser generator (2023, Ruby 3.3)
    Love LR parser
  3. May 15th - 17th, 2024, NAHA CULTURAL ARTS THEATER NAHArt, Okinawa, Japan

    The Bison Slayer
    The parser monster
    Lord of Dawn of the parser world
    (two of these titles are marked NEW !!!)
    "The world is now in the great age of parsers. People are setting sail into the vast sea of parsers."
    - RubyKaigi 2023 LT - Yuichiro Kaneko
    https://twitter.com/kakutani/status/1657762294431105025/
  4. The grand strategy of Ruby Parser

    Long term goals
      Provide a platform for LSP and other tools
      Provide the Universal parser
      Keep both the Ruby grammar and the parser maintainable
    Solution
      LR parser and parser generator are the best approach for Ruby
  5. Diagram of work items, labeled "Parser Generator (Lrama)" and "Parser"

    Universal Parser
    Decouple AST from imemo
    Remove Object from Node
    Refactoring Ripper
    LSP
    Optimize Node memory management
    Delete parser level optimization
    Union to Struct (Node)
    User friendly node structure
    parse.y for undergraduates
    More declarative parser
    Efficient data structure (Cactuses)
    Delete operation support
    Integration to parse.y
    More accurate recovery
    Parameterizing rules
    Replace hand written parser with Racc
    User defined stack
    Scanner state update syntax
    Scannerless parser
    IELR
    RBS
    Error tolerance
    (the diagram carries seven ✅ marks and seven 💪 marks)
  6. Same diagram of work items as slide 5
  7. Syntax tree use cases and design, explained in 10 minutes
  8. Rewriting code with RuboCop

    Starting code:

        if condition_a
          action_a
        else
          if condition_b
            action_b
          else
            action_c
          end
        end

    1. Change else to elsif:

        if condition_a
          action_a
        elsif condition_b
          if condition_b
            action_b
          else
            action_c
          end
        end

    2. Delete if condition_b:

        if condition_a
          action_a
        elsif condition_b
          action_b
          action_b
        else
          action_c
        end
        end

    3. Delete the extra end:

        if condition_a
          action_a
        elsif condition_b
          action_b
          action_b
        else
          action_c
        end

    4. Delete the duplicated action_b:

        if condition_a
          action_a
        elsif condition_b
          action_b
        else
          action_c
        end
  9. Problems with TreeRewriter #1

    The implementation is complex
    TreeRewriter does not rewrite the string directly
    It creates TreeRewriter::Action instances and applies all the changes in one go at the end
      Action :replace (2, 0)-(2, 4) "elsif condition_b"
      Action :replace (3, 2)-(3, 16) "action_b"
      Action :replace (7, 0)-(7, 6) ""
      Action :replace (4, 0)-(4, 13) ""
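The deferred-action idea on slide 9 can be sketched in a few lines of Ruby. This is only an illustration, not the parser gem's TreeRewriter: `Action`, `offset_of` and `apply_actions` are hypothetical names, and the sketch assumes the slide's ranges use 0-based lines and columns with the newline counted as a column. Edits are recorded first and then applied from the last range to the first, so the offsets of ranges not yet applied stay valid.

```ruby
# Hypothetical names throughout; this is not the parser gem's TreeRewriter.
Action = Struct.new(:begin_pos, :end_pos, :replacement)

SOURCE = <<~RUBY
  if condition_a
    action_a
  else
    if condition_b
      action_b
    else
      action_c
    end
  end
RUBY

# Convert a 0-based (line, column) pair into a character offset.
def offset_of(source, line, column)
  source.lines.take(line).sum(&:length) + column
end

# Apply every recorded action in one pass, last range first, so that the
# offsets of the ranges not yet applied remain valid.
def apply_actions(source, actions)
  result = source.dup
  actions.sort_by(&:begin_pos).reverse_each do |action|
    result[action.begin_pos...action.end_pos] = action.replacement
  end
  result
end

# The four replace actions listed on the slide.
actions = [
  [[2, 0], [2, 4],  "elsif condition_b"],
  [[3, 2], [3, 16], "action_b"],
  [[7, 0], [7, 6],  ""],
  [[4, 0], [4, 13], ""],
].map do |(begin_line, begin_col), (end_line, end_col), text|
  Action.new(offset_of(SOURCE, begin_line, begin_col),
             offset_of(SOURCE, end_line, end_col),
             text)
end

puts apply_actions(SOURCE, actions)
```

Under these assumptions the four actions only touch the listed ranges, so the surviving `else` and `action_c` lines keep their original indentation; fixing layout would be a separate step.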
  10. Why Actions are used #1

    Rewriting the string every time is costly
    In both cases below, the string after else has to be moved (copied)

    Deleting else:

        if condition_a
          action_a
        else
          action_b
        end

      becomes

        if condition_a
          action_a
          action_b
        end

    Replacing else with elsif:

        if condition_a
          action_a
        else
          action_b
        end

      becomes

        if condition_a
          action_a
        elsif
          action_b
        end
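  11. Why Actions are used #2

    Rewriting the string directly affects other nodes

    Parser::Source::Buffer before the change:

        if condition_a
          action_a
        else
          action_b
        end

    Parser::Source::Buffer after deleting else:

        if condition_a
          action_a
          action_b
        end

    NODE_VCALL action_b holds Range (3, 2)-(3, 10), which is its position before else is deleted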
  12. Problems with TreeRewriter #2

    The operations needed for a rewrite are cumbersome
    At each operation you have to understand what state the code is in now
    The slide repeats the four-step rewrite from slide 8: 1. change else to elsif, 2. delete if condition_b, 3. delete the extra end, 4. delete the duplicated action_b
  13. Did you think it was easy?
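  14. Question 1. Provide the location information

    A Range has to be passed when a Node instance is created
    Compute the location information of the new code shown in gray

    Before:

        if condition_a
          action_a
        else
          if condition_b
            action_b
          else
            action_c
          end
        end

    After:

        if condition_a
          action_a
        elsif condition_b
          action_b
        else
          action_c
        end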
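  15. Question 1. Provide the location information

    It is cumbersome, because you have to work out the location information while picturing the code that will be generated
    And this after going to the trouble of rewriting via the syntax tree!
    (the slide shows the same before and after code as slide 14)
    The start position is the position of else
    The end position (line) is "line of action_c - 1"
    The end position (column) is "end of action_c - 2"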
  16. Design and implementation of the syntax tree
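  17. Concrete syntax tree: restoring the code

    Concrete Syntax Tree (CST)
      A syntax tree that also keeps information an AST loses, such as parentheses
      Whereas an AST focuses on meaning (semantics), a CST focuses on syntax
    Characteristics
      Introduce a data structure that represents tokens
      Attach to tokens the information that the lexer would otherwise drop
      Make nodes hold tokens
      Make it possible to restore the original code completely from the syntax tree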
  18. Concrete syntax tree: restoring the code

    Same content as slide 17, with one line added:
    a syntax tree that holds the complete information of the source code
  19. Node, Token and Trivia

    A token has Trivia before and after it
    A node has nodes / tokens

    Diagram (legend: Token, NODE, Trivia)
      NODE_IF
        IF
        cond
        action_a
        END
      Trivia: space (1), NL (1) + space (2), NL (1)
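A minimal Ruby sketch of the node / token / trivia relationship on slide 19. All class and helper names here are hypothetical, and the choice of which trivia counts as leading or trailing is an assumption made for the example. The point is that once every token carries its surrounding trivia, concatenating the tree reproduces the source text exactly.

```ruby
# Hypothetical data structures for illustration only.
Trivia = Struct.new(:kind, :text)

# A token prints its leading trivia, its own text, then its trailing trivia.
Token = Struct.new(:type, :text, :leading, :trailing) do
  def to_source
    leading.map(&:text).join + text + trailing.map(&:text).join
  end
end

# A node holds nodes and/or tokens and prints them in order.
Node = Struct.new(:type, :children) do
  def to_source
    children.map(&:to_source).join
  end
end

def space(count)
  Trivia.new("space", " " * count)
end

def newline
  Trivia.new("NL", "\n")
end

# Assumed attachment: space (1) trails IF, NL (1) trails cond,
# space (2) leads action_a, NL (1) trails action_a.
tree = Node.new("NODE_IF", [
  Token.new("IF", "if", [], [space(1)]),
  Node.new("NODE_VCALL", [Token.new("IDENT", "cond", [], [newline])]),
  Node.new("NODE_VCALL", [Token.new("IDENT", "action_a", [space(2)], [newline])]),
  Token.new("END", "end", [], []),
])

puts tree.to_source
```

Because whitespace and newlines live in the tree rather than being dropped by the lexer, no separate copy of the source buffer is needed to print the code back.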
  20. Red Green Tree: a tree that is easy to edit

    Invented for C# (Roslyn)
    Also used in Swift (SwiftSyntax) and rust-analyzer (LSP)
    The syntax tree is represented with two data structures, Red Node and Green Node
    Let's read swift-syntax! https://github.com/swiftlang/swift-syntax
  21. Red Green Tree

    Green Node
      Holds references to its children
      Holds its own width
    Red Node
      Holds a reference to its parent
      Holds location information (offset)

    Diagram (legend: Token, Green NODE, Red NODE)
      Green tree:
        NODE_IF width: 90
          IF width: 3
          condition_a width: 11
          action_a width: 11
          NODE_ELSE width: 61
            ELSE width: 5
            NODE_IF width: 56
          END width: 4
      Red nodes:
        NODE_IF offset: 0
          NODE_ELSE offset: 25
            NODE_IF offset: 30
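The split between green and red nodes on slide 21 might look roughly like this in Ruby. `GreenToken`, `GreenNode` and `RedNode` are hypothetical names, and computing a green node's width from its children rather than storing it is a simplification chosen for the sketch. The widths are the ones shown in the deck's diagrams; red nodes are created on demand and derive their offsets by summing the widths of preceding siblings.

```ruby
# Green side: knows only its children and its width, never its position.
GreenToken = Struct.new(:type, :width)
GreenNode = Struct.new(:type, :children) do
  def width
    children.sum(&:width)
  end
end

# Red side: a view built on demand that knows its parent and absolute offset.
class RedNode
  attr_reader :green, :parent, :offset

  def initialize(green, parent = nil, offset = 0)
    @green = green
    @parent = parent
    @offset = offset
  end

  # Wrap each green child, accumulating widths to obtain its offset.
  def children
    return [] unless green.respond_to?(:children)

    position = offset
    green.children.map do |child|
      red = RedNode.new(child, self, position)
      position += child.width
      red
    end
  end
end

inner_else = GreenNode.new("NODE_ELSE", [GreenToken.new("ELSE", 7), GreenToken.new("action_c", 13)])
inner_if = GreenNode.new("NODE_IF", [
  GreenToken.new("IF", 5), GreenToken.new("condition_b", 11),
  GreenToken.new("action_b", 13), inner_else, GreenToken.new("END", 7),
])
outer_else = GreenNode.new("NODE_ELSE", [GreenToken.new("ELSE", 5), inner_if])
root = GreenNode.new("NODE_IF", [
  GreenToken.new("IF", 3), GreenToken.new("condition_a", 11),
  GreenToken.new("action_a", 11), outer_else, GreenToken.new("END", 4),
])

red_root = RedNode.new(root)
red_else = red_root.children[3]
red_inner_if = red_else.children[1]
puts [root.width, red_else.offset, red_inner_if.offset].inspect
```

With these widths the derived offsets agree with the red nodes drawn on the slide (0, 25 and 30), which is a quick sanity check that offsets never need to be stored in the green tree.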
  22. When elements hold an offset

    A change affects every element that follows the changed node / token
    The slide highlights the affected elements and marks action_b as "Updated!!"

      NODE_IF offset: 0
        IF offset: 0
        condition_a offset: 3
        action_a offset: 14
        NODE_ELSE offset: 25
          ELSE offset: 25
          NODE_IF offset: 30
            IF offset: 30
            condition_b offset: 35
            action_b offset: 46 -> 47
            NODE_ELSE offset: 59
              ELSE offset: 59
              action_c offset: 66
            END offset: 79
        END offset: 86
  23. When elements hold a width

    The effect is limited to the parent elements of the changed node / token
    Widths can be computed from the child elements
    The slide highlights the affected elements and marks action_b as "Updated!!"

      NODE_IF width: 90 -> 91
        IF width: 3
        condition_a width: 11
        action_a width: 11
        NODE_ELSE width: 61 -> 62
          ELSE width: 5
          NODE_IF width: 56 -> 57
            IF width: 5
            condition_b width: 11
            action_b width: 13 -> 14
            NODE_ELSE width: 20
              ELSE width: 7
              action_c width: 13
            END width: 7
        END width: 4
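The claim on slide 23, that a width change only touches ancestors, can be checked with a small Ruby sketch. The names (`GreenToken`, `GreenNode`, `with_replaced`) are hypothetical, and rebuilding only the nodes on the path while reusing every other subtree is an assumption about how such an update could be done, not something the slide spells out. Widths are the ones shown in the deck.

```ruby
GreenToken = Struct.new(:type, :width)
GreenNode = Struct.new(:type, :children) do
  def width
    children.sum(&:width)
  end

  # Return a copy in which the element reached by `path` (a list of child
  # indexes) is replaced. Only nodes on the path are copied; every other
  # child is reused as the same object.
  def with_replaced(path, new_element)
    index, *rest = path
    new_children = children.dup
    new_children[index] =
      rest.empty? ? new_element : children[index].with_replaced(rest, new_element)
    GreenNode.new(type, new_children)
  end
end

inner_else = GreenNode.new("NODE_ELSE", [GreenToken.new("ELSE", 7), GreenToken.new("action_c", 13)])
inner_if = GreenNode.new("NODE_IF", [
  GreenToken.new("IF", 5), GreenToken.new("condition_b", 11),
  GreenToken.new("action_b", 13), inner_else, GreenToken.new("END", 7),
])
outer_else = GreenNode.new("NODE_ELSE", [GreenToken.new("ELSE", 5), inner_if])
root = GreenNode.new("NODE_IF", [
  GreenToken.new("IF", 3), GreenToken.new("condition_a", 11),
  GreenToken.new("action_a", 11), outer_else, GreenToken.new("END", 4),
])

# Replace action_b (root -> child 3 -> child 1 -> child 2) with a wider token.
new_root = root.with_replaced([3, 1, 2], GreenToken.new("action_b", 14))
new_inner_if = new_root.children[3].children[1]

puts [new_root.width, new_root.children[3].width, new_inner_if.width].inspect
puts new_inner_if.children[3].equal?(inner_else)
```

Only the three ancestors get new widths (91, 62 and 57), while the sibling NODE_ELSE under the inner NODE_IF is the very same object as before, and the original tree is left untouched.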
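  24. Summary

    Want to execute code: compile.c
    Want to analyze code: LSP, Linter & Code Formatter
    Text-based code rewriting is difficult
      Move to syntax-tree-based code rewriting
    Want to restore code from the syntax tree
      Concrete syntax tree !!
    Want to narrow the scope affected by a syntax tree rewrite
      Red Green Tree !!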
  25. As a result of making progress, there is now more to do!!