
Designing the Best Syntax Tree, 2024 Edition (最高の構文木の設計 2024年版)

yui-knk
August 23, 2024


Transcript

  1. About me
    • Yuichiro Kaneko
    • yui-knk (GitHub) / spikeolaf (Twitter)
    • Treasure Data
    • Engineering Manager of Applications Backend
  2. TD and Ruby committers
    • twitter: @nalsh / GitHub: @nurse
    • twitter: @k_tsj / GitHub: @k-tsj
    • twitter: @spikeolaf / GitHub: @yui-knk
    • twitter: @mineroaoki / GitHub: @aamine
    • twitter: @nahi / GitHub: @nahi
    • Applications Backend
  3. About me
    • CRuby committer, mainly developing the parser generator and the parser
    • Lrama, an LALR(1) parser generator (2023, Ruby 3.3)
      • The Bison Slayer
      • The parser monster
      • "Parser界の黎明卿" (the Lord of Dawn of the parser world)
    • Ripper Rearchitecture (2024, Ruby 3.4)
    • Code positions to RNode (2018, Ruby 2.6)
    • RubyVM::AbstractSyntaxTree (2018, Ruby 2.6)
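  4. Executing `if`
    • Consider executing these three programs
    • Evaluate the condition (`a == 1`)
    • If the result is truthy, print :t
    • If the result is falsy, print :f
    • Note: a modifier (postfix) if has no else clause
    • How the code is executed does not depend on how it is written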
  5. The inverse of if: unless
    • By swapping body and else, unless can be represented with the same tree structure
    [Diagram: NODE_IF with children cond (NODE_OPCALL "a == 1"), body, and else (NODE_FCALL "p :t" and NODE_FCALL "p :f"); NODE_UNLESS with the same cond and with body and else swapped]
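  6. How to walk an abstract syntax tree
    • When executing a syntax tree, traverse it from the top (parent) to the bottom (child)
    [Diagram: NODE_CALL (mid: :m) with nd_recv pointing to NODE_VCALL (nd_mid: :obj) and nd_args pointing to NODE_LIST holding NODE_INTEGER (val: 1) and NODE_INTEGER (val: 2)]
    (1) Compile the receiver
    (2) Compile the arguments
    (3) Call method m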
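The parent-to-child order on this slide can be sketched with a tiny recursive walker. The `Node` struct, the `compile` method, and the pseudo-instruction strings below are hypothetical illustrations, not CRuby's actual node layout or compiler; the sketch only records the order in which the nodes for `obj.m(1, 2)` are visited.

```ruby
# Hypothetical, simplified node; CRuby's real RNode layout differs.
Node = Struct.new(:type, :attrs, :children)

# Walk from parent to child and collect pseudo-instructions in visit order.
def compile(node, out = [])
  case node.type
  when :CALL
    compile(node.children[:recv], out)   # (1) compile the receiver
    compile(node.children[:args], out)   # (2) compile the arguments
    out << "send #{node.attrs[:mid]}"    # (3) call the method
  when :LIST
    node.children[:items].each { |item| compile(item, out) }
  when :VCALL
    out << "vcall #{node.attrs[:mid]}"
  when :INTEGER
    out << "push #{node.attrs[:val]}"
  end
  out
end

# Tree for `obj.m(1, 2)`, mirroring the diagram.
CALL_TREE = Node.new(:CALL, { mid: :m }, {
  recv: Node.new(:VCALL, { mid: :obj }, {}),
  args: Node.new(:LIST, {}, {
    items: [
      Node.new(:INTEGER, { val: 1 }, {}),
      Node.new(:INTEGER, { val: 2 }, {})
    ]
  })
})

p compile(CALL_TREE)
```

The walker never needs a reference from a child back to its parent, which is why an execution-oriented tree can get away with parent-to-child links only.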
  7. Concrete examples of code analysis
    • LSP
      • ruby-lsp gem (Shopify/ruby-lsp)
    • Linter
      • RuboCop gem (rubocop/rubocop)
    • Code Formatter
      • RuboCop gem (rubocop/rubocop)
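  8. Analyzing tokens
    • Semantic Tokens
      • A feature that colors def, foo, +, and so on
      • The server returns the position information and type of each token
      • More precisely, it returns that information in an encoded form
    • We want to obtain the position information and kind of each token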
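To make "encoded information" concrete: LSP semantic tokens are sent as a flat integer array, five integers per token, with positions relative to the previous token. The sketch below assumes that layout; the hash keys and the legend indices (0 = keyword, 1 = method, 2 = operator) are hypothetical choices for illustration, since a real server negotiates its own legend.

```ruby
# Encode tokens as groups of five integers:
# [deltaLine, deltaStartChar, length, tokenType, tokenModifiers].
# deltaStartChar is relative to the previous token only on the same line.
def encode_semantic_tokens(tokens)
  prev_line = 0
  prev_char = 0
  tokens.sort_by { |t| [t[:line], t[:char]] }.flat_map do |t|
    delta_line = t[:line] - prev_line
    delta_char = delta_line.zero? ? t[:char] - prev_char : t[:char]
    prev_line = t[:line]
    prev_char = t[:char]
    [delta_line, delta_char, t[:length], t[:type], t[:modifiers]]
  end
end

# Hypothetical legend indices: 0 = keyword, 1 = method, 2 = operator.
SAMPLE_TOKENS = [
  { line: 0, char: 0, length: 3, type: 0, modifiers: 0 }, # def
  { line: 0, char: 4, length: 3, type: 1, modifiers: 0 }, # foo
  { line: 1, char: 4, length: 1, type: 2, modifiers: 0 }  # +
].freeze

p encode_semantic_tokens(SAMPLE_TOKENS)
```

Whatever the encoding, the server first needs an exact position and kind for every token, which is the requirement this slide puts on the syntax tree.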
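  9. The difficulty of rewriting
    • Consider an algorithm for "Extract Variable"
    • Add the code "var = <selected range>" on the line above
    • Replace the selected range with "var"
    Before:
      1 + 1
    After (add `var = 1 + 1` one line above, then replace `1 + 1` with `var`):
      var = 1 + 1
      var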
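  10. The difficulty of rewriting
    • When refactoring inside a one-liner block, we do not want the variable defined on the line above
    • We want the variable defined right before the expression "i + 1"
    • The node type and position information are required
    Before:
      10.times {|i| i + 1 }
    ❌ Replace `i + 1` with `var` and add `var = i + 1` one line above:
      var = i + 1
      10.times {|i| var }
    ⭕ Replace `i + 1` with `var` and add `var = i + 1` immediately before it:
      10.times {|i| var = i + 1; var }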
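The line-based algorithm from the previous two slides can be sketched in a few lines, which makes the failure easy to reproduce. `naive_extract_variable` is a hypothetical helper written for this illustration; it works purely on text and consults no syntax tree.

```ruby
# Naive, line-based "Extract Variable" (illustration only).
# Inserts "<name> = <expr>" on the line above the first line containing
# <expr>, then replaces the first occurrence of <expr> with <name>.
def naive_extract_variable(source, expr, name = "var")
  lines = source.lines
  index = lines.index { |line| line.include?(expr) }
  return source if index.nil?

  indent = lines[index][/\A\s*/]
  lines[index] = lines[index].sub(expr, name)
  lines.insert(index, "#{indent}#{name} = #{expr}\n")
  lines.join
end

puts naive_extract_variable("1 + 1\n", "1 + 1")
puts naive_extract_variable("10.times {|i| i + 1 }\n", "i + 1")
```

The first call produces the intended result. The second hoists `var = i + 1` out of the block, where `i` is not defined, which is the ❌ case: choosing the right insertion point requires knowing that the expression sits inside a block node.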
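  11. Aside: how do we want to handle comments?
    • Which expression does a comment written above several expressions apply to?
    • There seem to be cases where it applies to one expression and cases where it applies to several
    • Should the extraction change depending on the selected range?
    Case 1, before:
      def count
        # @obj can be nil until
        # #compute is called
        @obj&.count || 0
      end
    Case 1, after:
      def count
        # @obj can be nil until
        # #compute is called
        var = @obj&.count
        var || 0
      end
    Case 2, before:
      def count
        # Default value is 0
        @obj&.count || 0
      end
    Case 2, after:
      def count
        var = @obj&.count
        # Default value is 0
        var || 0
      end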
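  12. Implementation of Style::IfInsideElse
    • With the parser gem, unless, the ternary operator, and modifier if all become if nodes
    • Search for "if" nodes, then check each one in detail and reject the nodes that are not wanted
    • #3. If AllowIfModifier is true, reject the case of a nested modifier if
    1. Search for "if" nodes
    2. Reject ternary operators and unless
    3. Reject modifier if
    https://github.com/rubocop/rubocop/blob/v1.65.1/lib/rubocop/cop/style/if_inside_else.rb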
  13. Cases for Style::IfInsideElse
    • The if node we want to find
    • Ordinary if
    • Ternary operator
    • unless
    • Modifier if (the outer node)
    • `else_branch` is nil
    • Would splitting the if node into finer kinds make searching easier?
  14. The Code Formatter case
    • Look at #autocorrect of Style::IfInsideElse
    • It operates on RuboCop::Cop::Corrector < Parser::Source::TreeRewriter
    • You specify the range to rewrite (range) and the new text (content)
      • #replace(range, content)
      • #remove(range)
      • #insert_before(range, content)
      • #insert_after(range, content)
  15. How the code is rewritten
    Initial code:
      if condition_a
        action_a
      else
        if condition_b
          action_b
        else
          action_c
        end
      end
    1. Change else to elsif:
      if condition_a
        action_a
      elsif condition_b
        if condition_b
          action_b
        else
          action_c
        end
      end
    2. Delete `if condition_b`:
      if condition_a
        action_a
      elsif condition_b
        action_b
          action_b
        else
          action_c
        end
      end
    3. Delete the extra end:
      if condition_a
        action_a
      elsif condition_b
        action_b
          action_b
        else
          action_c
      end
    4. Delete the duplicated action_b:
      if condition_a
        action_a
      elsif condition_b
        action_b
      else
        action_c
      end
  16. Problems with TreeRewriter #1
    • The implementation is complex
    • TreeRewriter does not rewrite the string directly
    • It creates TreeRewriter::Action instances and applies all the changes at once at the end
    Actions:
      Action :replace (2, 0)-(2, 4) "elsif condition_b"
      Action :replace (3, 2)-(3, 16) "action_b"
      Action :replace (7, 0)-(7, 6) ""
      Action :replace (4, 0)-(4, 13) ""
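The "collect actions, then apply them all at once" idea can be sketched as follows. This is not the parser gem's implementation: the `Action` struct, `offset_of`, and `apply_actions` are hypothetical stand-ins, and the sketch assumes the (line, column) pairs on the slide are 0-based. Applying the ranges from the end of the buffer backwards keeps every not-yet-applied range valid against the original text.

```ruby
# Hypothetical stand-in for a deferred edit; not TreeRewriter::Action itself.
Action = Struct.new(:begin_pos, :end_pos, :replacement)

# Convert a 0-based (line, column) pair into an offset into +source+.
def offset_of(source, line, column)
  source.lines.first(line).sum(&:length) + column
end

# Apply every action in one pass, last range first, so the offsets of the
# remaining actions still refer to positions in the original buffer.
def apply_actions(source, actions)
  result = source.dup
  actions.sort_by(&:begin_pos).reverse_each do |action|
    result[action.begin_pos...action.end_pos] = action.replacement
  end
  result
end

SOURCE = <<~RUBY
  if condition_a
    action_a
  else
    if condition_b
      action_b
    else
      action_c
    end
  end
RUBY

# The four actions listed on the slide.
ACTIONS = [
  [[2, 0], [2, 4],  "elsif condition_b"],
  [[3, 2], [3, 16], "action_b"],
  [[7, 0], [7, 6],  ""],
  [[4, 0], [4, 13], ""]
].map do |(from, to, text)|
  Action.new(offset_of(SOURCE, *from), offset_of(SOURCE, *to), text)
end

puts apply_actions(SOURCE, ACTIONS)
```

In this sketch the four actions turn `else` plus the nested `if` into an `elsif`, but they leave the indentation of the inner `else` branch untouched; the text-based approach needs further actions for that.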
  17. Why Actions are used #1
    • Because rewriting the string on every operation is costly
    • In both cases below, the text after else has to be moved (copied)
    Original:
      if condition_a
        action_a
      else
        action_b
      end
    Deleting else:
      if condition_a
        action_a
        action_b
      end
    Replacing else with elsif:
      if condition_a
        action_a
      elsif
        action_b
      end
  18. Why Actions are used #2
    • Because rewriting the string directly affects other nodes
    [Diagram: a Parser::Source::Buffer holding the if/else code and a second Parser::Source::Buffer after else is deleted; NODE_VCALL action_b still holds Range (3, 2)-(3, 10)]
  19. Problems with TreeRewriter #2
    • The rewriting operations are cumbersome
    • At every operation you have to understand what state the code is currently in
    [Diagram: the same five code snapshots as slide 15, with the steps 1. change else to elsif, 2. delete `if condition_b`, 3. delete the extra end, 4. delete the duplicated action_b]
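  20. If we rewrite the syntax tree instead
    • Express #autocorrect as a rewrite of the syntax tree
    • We want to take advantage of the structure of the syntax tree
    • Turn NODE_IF into NODE_ELSIF and delete NODE_ELSE
    Before:
      NODE_IF
        condition_a
        action_a
        NODE_ELSE
          NODE_IF
            condition_b
            action_b
            NODE_ELSE
              action_c
    After:
      NODE_IF
        condition_a
        action_a
        NODE_ELSIF
          condition_b
          action_b
          action_c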
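  21. Extract Variable revisited
    • The algorithm in the syntax-tree-based case
    • Add the code "var = <selected range>" as the expression immediately before
    • Replace the selected range with "var"
    [Diagram: NODE_BLOCK containing NODE_OPCALL `i + 1` becomes NODE_BLOCK containing NODE_LASGN `new_variable = i + 1` and NODE_LVAR `new_variable`; the slide marks the candidate insertion points with ❌ and ⭕]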
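  22. Summary: various use cases
    • Traverse in both directions: from parent node to child node and from child node to parent node
    • Get the position information of tokens
    • Get the content and position information of comments
    • Give each node two kinds of information: its node type as seen from syntax and its node type as seen from semantics
    • Rewrite code by rewriting nodes
    • Fully restore the original code using only the information held by the nodes
    • Avoid situations where rewriting one node requires updating the position information of other nodes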
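  23. What is a concrete syntax tree?
    • Concrete Syntax Tree (CST)
    • A syntax tree that also keeps information an AST loses, such as parentheses
    • Whereas an AST focuses on meaning (semantics), a CST focuses on syntax
    • Characteristics
      1. Introduce a data structure that represents tokens
      2. Attach to tokens the information that the lexer would otherwise discard
      3. Have nodes hold tokens
    • Make it possible to fully restore the original code from the syntax tree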
  24. Node, Token and Trivia
    • A node holds nodes and tokens
    • A token holds trivia before and after itself
    [Diagram: NODE_IF holding the tokens IF, cond, action_a, END, with trivia space (1), NL (1) + space (2), and NL (1). Legend: Token, NODE, Trivia]
  25. Error Recovery
    • What the parser does when it performs error recovery
      • Ignore a token
      • Add a token
    • Added tokens can be represented in the concrete syntax tree
    [Diagram: NODE_IF holding IF, true, (a + b), END. Legend: Token, NODE, Missing Token]
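  26. Rules for trivia
    • Rule 1: a token holds the trivia that follows it, not including newline characters
    • Rule 2: trivia whose owner is not decided by rule 1 belongs to the token immediately after it
    • The space after `if` belongs to `if` (rule 1)
    • action_a holds the preceding newline and two spaces (rules 1 & 2)
    • end holds the preceding newline (rules 1 & 2)
    [Example shown three times with different highlighting: `if cond\n  action_a\nend`]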
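The two rules can be tried out with a small whitespace-only sketch. The `Token` struct and `attach_trivia` below are hypothetical; the sketch splits on whitespace instead of running a real Ruby lexer and treats only whitespace as trivia, but it follows the same ownership rules and shows that the source text can be restored from the tokens alone.

```ruby
Token = Struct.new(:leading, :text, :trailing) do
  # A token's full text is its trivia plus its own characters.
  def to_source
    "#{leading}#{text}#{trailing}"
  end
end

# Split +source+ into word tokens and attach whitespace by the two rules.
def attach_trivia(source)
  tokens = []
  pending = +""                        # trivia waiting for the next token
  source.scan(/\s+|\S+/) do |chunk|
    if chunk.match?(/\A\s/)
      before, newline, after = chunk.partition("\n")
      if tokens.empty?
        pending << chunk
      else
        tokens.last.trailing << before # rule 1: up to, not including, a newline
        pending << newline << after    # rule 2: the rest goes to the next token
      end
    else
      tokens << Token.new(pending, chunk, +"")
      pending = +""
    end
  end
  tokens << Token.new(pending, "", +"") # end-of-file token keeps leftover trivia
end

tokens = attach_trivia("if cond\n  action_a\nend")
tokens.each { |t| p [t.leading, t.text, t.trailing] }
puts tokens.map(&:to_source).join
```

With these rules `if ` has width 3, `\n  action_a` has width 11, and `\nend` has width 4, which matches the token widths used in the Green Node slides.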
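  27. Summary: concrete syntax tree
    • Traverse in both directions: from parent node to child node and from child node to parent node
    • Get the position information of tokens
    • Get the content and position information of comments
    • Give each node two kinds of information: its node type as seen from syntax and its node type as seen from semantics
    • Rewrite code by rewriting nodes
    • Fully restore the original code using only the information held by the nodes
    • Avoid situations where rewriting one node requires updating the position information of other nodes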
  28. What is a Red Green Tree?
    • An invention of C# (Roslyn)
    • Also used in Swift (SwiftSyntax) and rust-analyzer (LSP)
    • The syntax tree is represented with two data structures, Red Nodes and Green Nodes
    • This talk mainly refers to the swift-syntax implementation
      • https://github.com/swiftlang/swift-syntax
  29. Why Red and Green?
    • https://learn.microsoft.com/en-us/archive/blogs/ericlippert/persistence-facades-and-roslyns-red-green-trees
      "Incidentally, these are called "red/green trees" because those were the whiteboard marker colours we used to draw the data structure in the design meeting. There's no other meaning to the colours."
    • In short: the markers in use while the data structure was being discussed happened to be red and green, and there is no further reason
  30. Structure of Green Nodes
    NODE_IF width: 90
      IF width: 3
      condition_a width: 11
      action_a width: 11
      NODE_ELSE width: 61
        ELSE width: 5
        NODE_IF width: 56
          IF width: 5
          condition_b width: 11
          action_b width: 13
          NODE_ELSE width: 20
            ELSE width: 7
            action_c width: 13
          END width: 7
      END width: 4
  31. Characteristics of Green Nodes (1)
    • A Green Node holds references to its child nodes (Green Nodes) and tokens
    [Diagram: the green tree of slide 30. Legend: Token, NODE]
  32. Characteristics of Green Nodes (2)
    • A Green Node holds its own width
    • The width of the IF token is not 2 because it includes trivia
    [Diagram: the green tree of slide 30 next to an ordinary node implementation that stores offsets (NODE_IF offset: 0, NODE_IF offset: 30). Legend: Green Node, ordinary Node implementation, Token]
  33. When nodes hold offsets
    • A change affects every element that follows the changed node or token
    • Example: whitespace is added at the end of action_b
    [Diagram: the tree of slide 30 annotated with offsets instead of widths, with the affected elements highlighted: NODE_IF offset: 0, IF offset: 0, condition_a offset: 3, action_a offset: 14, NODE_ELSE offset: 25, ELSE offset: 25, NODE_IF offset: 30, IF offset: 30, condition_b offset: 35, action_b offset: 46 -> 47, NODE_ELSE offset: 59, ELSE offset: 59, action_c offset: 66, END offset: 79, END offset: 86]
  34. When nodes hold widths
    • The effect is limited to the ancestors of the changed node or token
    • The cost is bounded by the height of the tree (log n)
    [Diagram: the green tree of slide 30 with the affected elements highlighted: action_b width: 13 -> 14, inner NODE_IF width: 56 -> 57, NODE_ELSE width: 61 -> 62, root NODE_IF width: 90 -> 91; all other widths are unchanged]
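The difference between storing offsets and storing widths can be checked with a short calculation. The nested-array encoding below (`[kind, children]` for nodes, `[kind, width]` for tokens) and the helper names are hypothetical; the widths are the ones from the slides. The sketch lays out the tree twice, once with action_b at width 13 and once at width 14, and counts how many elements see a different offset versus a different width.

```ruby
# A node is [kind, children]; a token is [kind, width] (hypothetical encoding).
def width(element)
  _kind, payload = element
  payload.is_a?(Array) ? payload.sum { |child| width(child) } : payload
end

# Preorder list of [kind, absolute_offset, width] for every element.
def layout(element, offset = 0, out = [])
  kind, payload = element
  out << [kind, offset, width(element)]
  if payload.is_a?(Array)
    payload.each do |child|
      layout(child, offset, out)
      offset += width(child)
    end
  end
  out
end

# The tree from the slides, with the width of action_b as a parameter.
def sample_tree(action_b_width)
  [:NODE_IF, [
    [:IF, 3], [:condition_a, 11], [:action_a, 11],
    [:NODE_ELSE, [
      [:ELSE, 5],
      [:NODE_IF, [
        [:IF, 5], [:condition_b, 11], [:action_b, action_b_width],
        [:NODE_ELSE, [[:ELSE, 7], [:action_c, 13]]],
        [:END, 7]
      ]]
    ]],
    [:END, 4]
  ]]
end

before = layout(sample_tree(13))
after  = layout(sample_tree(14))
changed_offsets = before.zip(after).count { |b, a| b[1] != a[1] }
changed_widths  = before.zip(after).count { |b, a| b[2] != a[2] }
puts "elements whose offset changes: #{changed_offsets}"
puts "elements whose width changes:  #{changed_widths}"
```

In this small tree the two counts are close, but the offset count grows with everything that follows the edit, while the width count is bounded by the depth of the edited token.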
  35. Structure of Red Nodes
    [Diagram: green nodes NODE_IF width: 90 (IF width: 3, condition_a width: 11, action_a width: 11, NODE_ELSE width: 61, END width: 4) and NODE_IF width: 56 under NODE_ELSE (ELSE width: 5); red nodes NODE_IF offset: 0, NODE_ELSE offset: 25, NODE_IF offset: 30]
  36. Characteristics of Red Nodes (1)
    • A Red Node holds references to a child node (Green Node) and to its parent node (Red Node)
    • Access from child to parent is easy
    [Diagram: same as slide 35. Legend: Token, Green NODE, Red NODE]
  37. Characteristics of Red Nodes (2)
    • A Red Node holds its position from the beginning of the file (offset)
    [Diagram: same as slide 35. Legend: Token, Green NODE, Red NODE]
  38. How offsets are computed
    • An offset can be computed from the offset of the parent Red Node and the widths of the child Green Nodes
    [Diagram: same as slide 35, annotated with NODE_IF offset: 0; NODE_ELSE offset: 25 (3 + 11 + 11); NODE_IF offset: 30 (25 + 5)]
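A minimal sketch of this calculation follows. The `GreenNode` and `RedNode` classes are hypothetical and far simpler than the swift-syntax or Roslyn types: a green node stores only kind, children, and width, while a red node wraps a green node together with its parent and an absolute offset, and creates the red wrappers for its children on demand.

```ruby
# Hypothetical minimal classes; real implementations are much richer.
class GreenNode
  attr_reader :kind, :children, :width

  # Tokens pass an explicit width; nodes derive it from their children.
  def initialize(kind, children = [], width = nil)
    @kind = kind
    @children = children.freeze
    @width = width || children.sum(&:width)
    freeze
  end
end

class RedNode
  attr_reader :green, :parent, :offset

  def initialize(green, parent = nil, offset = 0)
    @green = green
    @parent = parent
    @offset = offset
  end

  # Wrap each green child on demand. A child's offset is this node's offset
  # plus the widths of the siblings that precede it.
  def children
    position = offset
    green.children.map do |child|
      red = RedNode.new(child, self, position)
      position += child.width
      red
    end
  end
end

def token(kind, width)
  GreenNode.new(kind, [], width)
end

GREEN_ROOT = GreenNode.new(:NODE_IF, [
  token(:IF, 3), token(:condition_a, 11), token(:action_a, 11),
  GreenNode.new(:NODE_ELSE, [
    token(:ELSE, 5),
    GreenNode.new(:NODE_IF, [
      token(:IF, 5), token(:condition_b, 11), token(:action_b, 13),
      GreenNode.new(:NODE_ELSE, [token(:ELSE, 7), token(:action_c, 13)]),
      token(:END, 7)
    ])
  ]),
  token(:END, 4)
])

root = RedNode.new(GREEN_ROOT)
outer_else = root.children[3]
inner_if = outer_else.children[1]
puts outer_else.offset               # 0 + 3 + 11 + 11
puts inner_if.offset                 # 25 + 5
puts inner_if.parent.green.kind      # child-to-parent access
```

Because offsets live only in the red wrappers, the green nodes stay position-independent and can be shared between trees.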
  39. Red Node & Green Node
    • Both Red Nodes and Green Nodes are immutable, persistent data structures
    • Only Red Nodes are exposed to users of the syntax tree; Green Nodes are an internal data structure
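  40. Immutable
    • The attributes of a node are not, and cannot be, updated after the node is created
    • To make a change, create a copy and rewrite some of its attributes
    • Most nodes and tokens can be shared
    [Diagram: adding a space after `if`. NODE_IF width: 90 with IF width: 3 and END width: 4 becomes a new NODE_IF width: 91 with a new IF width: 4; END width: 4 is shared]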
  41. Persistent data structure
    • Both versions of the syntax tree, before and after a change, remain accessible
    [Diagram: NODE_IF (ver.1) width: 90 with IF width: 3 and END width: 4; NODE_IF (ver.2) width: 91 with IF width: 4, sharing END width: 4]
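The copy-on-change behaviour from the last two slides can be sketched with frozen objects. `GreenNode#with_child` is a hypothetical helper name; the example reproduces the "add a space after `if`" change and checks that version 1 is still intact while version 2 reuses the untouched children.

```ruby
# Hypothetical immutable green node.
class GreenNode
  attr_reader :kind, :children, :width

  def initialize(kind, children = [], width = nil)
    @kind = kind
    @children = children.freeze
    @width = width || children.sum(&:width)
    freeze
  end

  # Return a new node with one child swapped; the receiver is left as is.
  def with_child(index, new_child)
    copy = children.dup
    copy[index] = new_child
    GreenNode.new(kind, copy)
  end
end

def token(kind, width)
  GreenNode.new(kind, [], width)
end

# NODE_ELSE is kept as an opaque leaf of width 61 to keep the sketch short.
VERSION_1 = GreenNode.new(:NODE_IF, [
  token(:IF, 3), token(:condition_a, 11), token(:action_a, 11),
  token(:NODE_ELSE, 61), token(:END, 4)
])

# Adding a space after `if` widens the IF token from 3 to 4.
VERSION_2 = VERSION_1.with_child(0, token(:IF, 4))

puts VERSION_1.width                                        # old version is still available
puts VERSION_2.width
puts VERSION_1.children[4].equal?(VERSION_2.children[4])   # END token is shared
```

Sharing is safe precisely because nothing can mutate a node after construction.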
  42. The parse result is a Red Node
    • Parsing source code returns the root Red Node
    [Diagram: parse returns the red NODE_IF offset: 0, which wraps the green tree (NODE_IF width: 90, IF width: 3, condition_a width: 11, action_a width: 11, NODE_ELSE width: 61, END width: 4, ELSE width: 5, NODE_IF width: 56)]
  43. Child elements are Red Nodes too
    • Accessing a child node or a token returns the element wrapped in a Red Node
    [Diagram: on the red NODE_IF offset: 0, node#if_token returns TOKEN_IF offset: 0 and node#else_node returns NODE_ELSE offset: 25]
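  44. Newly created nodes are Red Nodes too
    • When a node is newly created, a Red Node is created
    • The elsif and else tokens are newly created, and these are also Red Nodes (Red Tokens)
    • A newly created Red Node takes the Green Nodes out of the Red Nodes passed as arguments and attaches them to its own Green Node
    Target code:
      # good
      if condition_a
        action_a
      elsif condition_b
        action_b
      else
        action_c
      end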
  45. Behind the scenes of node creation
    [Diagram: green NODE_ELSE width: 61 containing NODE_IF width: 56 (IF width: 5, condition_b width: 11, action_b width: 13, NODE_ELSE width: 20 (ELSE width: 7, action_c width: 13), END width: 7); red nodes NODE_IF offset: 25, condition_b offset: 30, action_b offset: 41, action_c offset: 47]
  46. Behind the scenes of node creation
    • Token creation
    [Diagram: as on slide 45, plus the new green tokens ELSIF width: 7 and ELSE width: 5 with their red tokens ELSIF offset: 0 and ELSE offset: 0]
  47. Behind the scenes of node creation
    [Diagram: as on slide 46, plus the new red node NODE_ELSIF offset: 0. Legend: arguments of NODE_ELSIF, new node]
  48. Behind the scenes of node creation
    [Diagram: as on slide 47, plus the new green node NODE_ELSIF width: 49. Legend: arguments of NODE_ELSIF, new node]
    width: 11 + 13 + 13 + 7 + 5 = 49
  49. Behind the scenes of rewriting
    • The state before rewriting
    [Diagram: green NODE_IF width: 90 (IF width: 3, condition_a width: 11, action_a width: 11, NODE_ELSE width: 61, END width: 4); red nodes NODE_IF offset: 0 and NODE_ELSE offset: 25]
  50. Behind the scenes of rewriting
    • Bring in the NODE_ELSIF created earlier
    [Diagram: as on slide 49, plus green NODE_ELSIF width: 49 and red NODE_ELSIF offset: 0]
  51. Behind the scenes of rewriting
    • Create a new NODE_IF (Red Node & Green Node)
    [Diagram: as on slide 50, plus the new green NODE_IF width: 78 and the new red NODE_IF offset: 0]
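The widths on slides 48 and 51 follow directly from summing the children. The sketch below rebuilds the new nodes from existing green elements; the `Green` struct and the `token` and `node` helpers are hypothetical, and the child order inside NODE_ELSIF is an assumption (the slide only gives the sum).

```ruby
# Hypothetical immutable green element: tokens have no children.
Green = Struct.new(:kind, :children, :width)

def token(kind, width)
  Green.new(kind, [].freeze, width).freeze
end

# A node's width is the sum of its children's widths.
def node(kind, *children)
  Green.new(kind, children.freeze, children.sum(&:width)).freeze
end

# Existing green elements taken out of the old tree.
CONDITION_B = token(:condition_b, 11)
ACTION_B    = token(:action_b, 13)
ACTION_C    = token(:action_c, 13)

# New tokens (ELSIF, ELSE) plus the reused elements form NODE_ELSIF.
ELSIF_NODE = node(:NODE_ELSIF,
                  token(:ELSIF, 7), CONDITION_B, ACTION_B,
                  token(:ELSE, 5), ACTION_C)

# The new outer NODE_IF replaces NODE_ELSE (width 61) with NODE_ELSIF.
NEW_IF = node(:NODE_IF,
              token(:IF, 3), token(:condition_a, 11), token(:action_a, 11),
              ELSIF_NODE, token(:END, 4))

puts ELSIF_NODE.width   # 7 + 11 + 13 + 5 + 13
puts NEW_IF.width       # 3 + 11 + 11 + 49 + 4
```

No offsets appear anywhere in this construction, so the reused elements need no adjustment even though they end up at different positions in the new source.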
  52. Rewriting at a deep location
    • Example: the trivia of action_c changes
    [Diagram: the green tree of slide 30 and the source code `if condition_a action_a else if condition_b action_b else action_c ; end end`, with action_c marked as updated]
  53. Rewriting at a deep location
    • The syntax tree before the update
    [Diagram: the green tree of slide 30 with action_c marked as updated, and the chain of red nodes action_c, NODE_ELSE, NODE_IF, NODE_ELSE, NODE_IF]
  54. Rewriting at a deep location
    • Create a new node for action_c
    [Diagram: as on slide 53, plus the new green action_c width: 15]
  55. Rewriting at a deep location
    • Create the Red Node for action_c
    [Diagram: as on slide 54, plus the new red action_c]
  56. Rewriting at a deep location
    • Both the child element and the width change, so create a new Green Node
    [Diagram: as on slide 55, plus the new green NODE_ELSE width: 22]
  57. Rewriting at a deep location
    • Create the Red Node for NODE_ELSE
    [Diagram: as on slide 56, plus the new red NODE_ELSE]
  58. Rewriting at a deep location
    • In the same way, every Red Node up to the root has to be re-created
    [Diagram: the green tree of slide 30 plus the new green nodes action_c width: 15, NODE_ELSE width: 22, NODE_IF width: 58, NODE_ELSE width: 63, NODE_IF width: 92, and a second chain of red nodes action_c, NODE_ELSE, NODE_IF, NODE_ELSE, NODE_IF]
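The chain of re-created nodes is an instance of path copying. The sketch below covers the green side only: `replace_at`, the `Green` struct, and the index-path convention are hypothetical. It replaces action_c (width 13) with a wider version (width 15) and rebuilds just the nodes on the path to the root; everything else is reused.

```ruby
Green = Struct.new(:kind, :children, :width)

def token(kind, width)
  Green.new(kind, [].freeze, width).freeze
end

def node(kind, children)
  Green.new(kind, children.freeze, children.sum(&:width)).freeze
end

# Rebuild only the nodes on the path from the root to the replaced element.
# +path+ is a list of child indexes, starting at the root.
def replace_at(green, path, replacement)
  return replacement if path.empty?

  index, *rest = path
  children = green.children.dup
  children[index] = replace_at(children[index], rest, replacement)
  node(green.kind, children)
end

OLD_ROOT = node(:NODE_IF, [
  token(:IF, 3), token(:condition_a, 11), token(:action_a, 11),
  node(:NODE_ELSE, [
    token(:ELSE, 5),
    node(:NODE_IF, [
      token(:IF, 5), token(:condition_b, 11), token(:action_b, 13),
      node(:NODE_ELSE, [token(:ELSE, 7), token(:action_c, 13)]),
      token(:END, 7)
    ])
  ]),
  token(:END, 4)
])

# Path to action_c: outer NODE_ELSE, inner NODE_IF, inner NODE_ELSE, action_c.
NEW_ROOT = replace_at(OLD_ROOT, [3, 1, 3, 1], token(:action_c, 15))

puts NEW_ROOT.width   # 92
puts OLD_ROOT.width   # 90, the old tree is untouched
```

Four green nodes are rebuilt for a tree of fifteen elements; siblings such as condition_a and action_b are the same objects in both versions.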
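  59. How the Rewriter works
    • Traverse the nodes depth-first and, if anything changed, rebuild the nodes on the way back
    [Diagram: before, NODE_CLASS ver. 1, NODE_DEF ver. 1, NODE_DEF ver. 1, NODE_IF ver. 1, NODE_CALL ver. 1, NODE_CASE ver. 1 (marked as updated); after, NODE_CLASS ver. 1, NODE_DEF ver. 2 (new), NODE_DEF ver. 1, NODE_IF ver. 2 (new), NODE_CALL ver. 1, NODE_CASE ver. 2 (updated). Note on the slide: the traversal moves on to the right-hand DEF before NODE_CLASS is rebuilt]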
  60. How the Rewriter works
    • For each node, at most one pair of nodes has to be created
    • Each node is created as a root node, so its parent does not have to be updated
    [Diagram: NODE_CALL is updated next; NODE_DEF ver. 2 and NODE_CALL ver. 2 are created on the right-hand side, and finally NODE_CLASS ver. 2 is created (new), giving NODE_CLASS ver. 2, NODE_DEF ver. 2, NODE_DEF ver. 2, NODE_IF ver. 2, NODE_CALL ver. 2, NODE_CASE ver. 2]
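A depth-first rewriter of this kind can be sketched as a single recursive method. The `Node` struct and `rewrite` method are hypothetical, in the spirit of Roslyn's CSharpSyntaxRewriter and SwiftSyntax's SyntaxRewriter rather than a copy of either, and the tree shape (CLASS containing two DEFs, one holding IF and CASE, the other holding CALL) is an assumption read off the diagram.

```ruby
Node = Struct.new(:kind, :children)

# Depth-first rewrite. The block may return a replacement for a node.
# A parent is rebuilt on the way back only if one of its children changed;
# otherwise the very same object is returned and can be shared.
def rewrite(node, &visit)
  rewritten = node.children.map { |child| rewrite(child, &visit) }
  changed = rewritten.zip(node.children).any? { |fresh, old| !fresh.equal?(old) }
  visit.call(changed ? Node.new(node.kind, rewritten) : node)
end

TREE = Node.new(:CLASS, [
  Node.new(:DEF, [Node.new(:IF, [Node.new(:CASE, [])])]),
  Node.new(:DEF, [Node.new(:CALL, [])])
])

# Replace the CASE node; every other node is passed through unchanged.
result = rewrite(TREE) { |n| n.kind == :CASE ? Node.new(:CASE_V2, []) : n }

puts result.equal?(TREE)                          # a new root was built
puts result.children[1].equal?(TREE.children[1])  # the second DEF is shared
```

Each visited node yields at most one new node, and the old tree stays valid, so a rule can keep both versions around.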
  61. Rewriter: implementations in other languages
    • A Rewriter is a general-purpose implementation, so it is probably best provided close to the core
    • Roslyn: CSharpSyntaxRewriter
      • https://github.com/dotnet/roslyn/blob/v4.2.0/src/Compilers/CSharp/Portable/Syntax/CSharpSyntaxRewriter.cs
    • SwiftSyntax: SyntaxRewriter
      • https://github.com/swiftlang/swift-syntax/blob/600.0.0-prerelease-2024-07-30/Sources/SwiftSyntax/generated/SyntaxRewriter.swift
  62. Convenience of persistent data structures
    • A Code Formatter rule (Cop) receives a syntax tree, generates a new subtree, and updates the whole syntax tree
    • The rule can hold the trees from before and after the update
    • The code before and after the change can be displayed
    [Diagram: IfInsideElse takes the tree NODE_IF width: 90 / NODE_ELSE width: 61 (red NODE_IF offset: 0, NODE_ELSE offset: 25) for the nested if/else code and produces the tree NODE_IF width: 78 / NODE_ELSE width: 49 (red NODE_IF offset: 0, NODE_ELSIF offset: 0) for the if/elsif/else code]
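  63. Summary: Red Green Tree
    • The syntax tree is represented with two data structures, the Red Tree/Node and the Green Tree/Node
    • A Green Node holds information that is confined to itself and its children
    • A Red Node holds all other information
    • Both Red Nodes and Green Nodes are immutable, persistent data structures
    • Only Red Nodes are exposed to users of the syntax tree; Green Nodes are an internal data structure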
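  64. Summary: Red Green Tree
    • Traverse in both directions: from parent node to child node and from child node to parent node
    • Get the position information of tokens
    • Get the content and position information of comments
    • Give each node two kinds of information: its node type as seen from syntax and its node type as seen from semantics
    • Rewrite code by rewriting nodes
    • Fully restore the original code using only the information held by the nodes
    • Avoid situations where rewriting one node requires updating the position information of other nodes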
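  65. Summary: various use cases
    • Traverse in both directions: from parent node to child node and from child node to parent node
    • Get the position information of tokens
    • Get the content and position information of comments
    • Give each node two kinds of information: its node type as seen from syntax and its node type as seen from semantics
    • Rewrite code by rewriting nodes
    • Fully restore the original code using only the information held by the nodes
    • Avoid situations where rewriting one node requires updating the position information of other nodes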
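  66. What counts as trivia?
    • In Ruby, a newline is treated as a token in some cases and dropped by the lexer in others (tIGNORED_NL)
      • Option 1: newlines that became tokens stay tokens, and the others become trivia
      • Option 2: treat every newline as trivia
    • Might semicolons be easier to handle as trivia?
    • Should () be a node?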
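  67. Heredocs
    • Several heredocs can be written on a single line
    • Even in such cases, the nodes must allow the original code to be restored
    • It is probably best to build the Green Tree naively and handle the special cases when computing Red Tree offsets or converting back to text
    Example:
      puts <<~STR1 + <<~STR2
        111
      STR1
        aaa
      STR2
      # =>
      # 111
      # aaa
    [Diagram: NODE_HEREDOC width: 90, <<~ width: 3, 111 width: 4, STR1 width: 4, STR1 width: 5]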
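  68. Requirements for the syntax tree
    • Traverse in both directions: from parent node to child node and from child node to parent node
    • Get the position information of tokens
    • Get the content and position information of comments
    • Give each node two kinds of information: its node type as seen from syntax and its node type as seen from semantics
    • Rewrite code by rewriting nodes
    • Fully restore the original code using only the information held by the nodes
    • Avoid situations where rewriting one node requires updating the position information of other nodes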
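  69. Design and implementation of the syntax tree
    • Design of the syntax tree:
      • Concrete syntax tree
    • Implementation of the syntax tree:
      • Red Green Tree
    • A concrete syntax tree that carries abstract-syntax information
    • The requirements for the syntax tree have been met!
    • Future work
      • Solve the problems specific to Ruby
      • Implement it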