Slide 1
Slide 1 text
Writing a Parser in Go
kamakura.go #6
@karupanerura
Slide 2
Slide 2 text
@karupanerura
• Perl / Go / Java / TypeScript / etc.
• Software Engineer @ DeNA Co., Ltd.
• Representative Director @ Japan Perl Association
• By the time I noticed, the GoCon CfP had already closed
Slide 3
Slide 3 text
Parser history
• I have written a fair number of parsers
  • Perl (JSON5, TOML::Parser, Text::MustacheTemplate)
  • C (c-geohex3, MySQL::Dump::Parser::XS, etc.)
  • Go (gqlparser, google-cloud-workflow-emulator)
• But I have never studied the subject systematically
• If I say anything odd, please call it out
Slide 4
Slide 4 text
jpa.Promotion()
Slide 5
Slide 5 text
No content
Slide 6
Slide 6 text
2024/10/05 (Sat) Future University Hakodate
Slide 7
Slide 7 text
Gophers welcome
Slide 8
Slide 8 text
Main topic
Slide 9
Slide 9 text
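What I will cover
• The basics of parsers that target text
• A brief introduction to known algorithms and tools
• How I implemented a parser that I actually wrote in Go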
Slide 10
Slide 10 text
What I will not cover
• Parsing anything other than text (binary formats, etc.)
• The details of known algorithms
• How to use existing libraries
• ChatGPT-4o
Slide 11
Slide 11 text
Parser
Slide 12
Slide 12 text
The thing that interprets a particular syntax
Slide 13
Slide 13 text
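What is a parser?
• The process of interpreting text that follows a syntax and turning it into a structure a program can work with
• Also called syntax analysis, or parsing
• For Gophers, the most familiar example is probably the parser for Go code itself?
  • go/parser
  • go/ast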
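As a rough illustration of the go/parser and go/ast packages mentioned above, the sketch below parses a Go expression with `parser.ParseExpr` and renders the resulting tree as a string. The helper names `describe` and `render` are made up for this example, and only the node types needed for simple arithmetic are handled.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
)

// describe parses a Go expression with go/parser and renders the
// resulting go/ast tree as a compact string. (Hypothetical helper.)
func describe(src string) (string, error) {
	expr, err := parser.ParseExpr(src)
	if err != nil {
		return "", err
	}
	return render(expr), nil
}

// render handles only the node types that simple arithmetic produces.
func render(e ast.Expr) string {
	switch n := e.(type) {
	case *ast.BinaryExpr:
		return fmt.Sprintf("Binary{OP: %s, L: %s, R: %s}", n.Op, render(n.X), render(n.Y))
	case *ast.ParenExpr:
		return render(n.X)
	case *ast.BasicLit:
		return fmt.Sprintf("%s(%s)", n.Kind, n.Value)
	default:
		return fmt.Sprintf("%T", e)
	}
}

func main() {
	s, err := describe("1+2*3")
	if err != nil {
		panic(err)
	}
	// Binary{OP: +, L: INT(1), R: Binary{OP: *, L: INT(2), R: INT(3)}}
	fmt.Println(s)
}
```

Because `*` binds tighter than `+` in Go, the multiplication ends up as the right child of the addition.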
Slide 14
Slide 14 text
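The usual steps for analyzing a syntax
• (Tokenize) Group the text into meaningful units
  • Example: "123+456" -> "123", "+", "456"
  • The thing that does this is called a lexer, tokenizer, scanner, and so on
• (Parse) Based on those units, convert them into a structure that expresses their meaning
  • Example: "123", "+", "456" -> Plus{Left: 123, Right: 456}
  • "Parser" in the narrow sense sometimes refers to this step alone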
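The tokenize step can be sketched in a few lines of Go. This is a minimal illustration rather than anything from the talk: the function name `tokenize` is hypothetical, tokens are plain strings, and only ASCII digits and a handful of single-character symbols are recognized.

```go
package main

import (
	"fmt"
	"unicode"
)

// tokenize splits an expression made of ASCII digits and single-character
// symbols into tokens. Whitespace is skipped; anything else is an error.
func tokenize(src string) ([]string, error) {
	var tokens []string
	runes := []rune(src)
	for i := 0; i < len(runes); {
		r := runes[i]
		switch {
		case unicode.IsSpace(r):
			i++
		case r >= '0' && r <= '9':
			// Consume the whole run of digits as one token.
			start := i
			for i < len(runes) && runes[i] >= '0' && runes[i] <= '9' {
				i++
			}
			tokens = append(tokens, string(runes[start:i]))
		case r == '+' || r == '-' || r == '*' || r == '/' || r == '(' || r == ')':
			tokens = append(tokens, string(r))
			i++
		default:
			return nil, fmt.Errorf("unexpected character %q at rune offset %d", r, i)
		}
	}
	return tokens, nil
}

func main() {
	tokens, err := tokenize("123+456")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", tokens) // ["123" "+" "456"]
}
```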
Slide 15
Slide 15 text
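Other frequently used terms
• AST (Abstract Syntax Tree)
  • Represents, as a structure, the meaning of a text when it is interpreted under a given syntax
  • This is usually what the parser described earlier produces
• BNF
  • A syntax for describing the structure of a syntax
  • Derived forms such as ABNF also exist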
Slide 16
Slide 16 text
The case of an arithmetic expression
Slide 17
Slide 17 text
1+2
Slide 18
Slide 18 text
Integer(1) Symbol('+') Integer(2)
Slide 19
Slide 19 text
Binary{ OP: '+', L: Integer(1), R: Integer(2), }
Slide 20
Slide 20 text
123+456
Slide 21
Slide 21 text
Integer(123) Symbol('+') Integer(456)
Slide 22
Slide 22 text
Binary{ OP: '+', L: Integer(123), R: Integer(456), }
Slide 23
Slide 23 text
1+(2-1)*2+3
Slide 24
Slide 24 text
Integer(1) Symbol('+') Symbol('(') Integer(2) Symbol('-') Integer(1) Symbol(')') Symbol('*') Integer(2) Symbol('+') Integer(3)
Slide 25
Slide 25 text
Binary{
  OP: '+',
  L: Binary{
    OP: '+',
    L: Integer(1),
    R: Binary{
      OP: '*',
      L: Binary{OP: '-', L: Integer(2), R: Integer(1)},
      R: Integer(2),
    },
  },
  R: Integer(3),
}
Slide 26
Slide 26 text
This is the kind of thing a parser does
Slide 27
Slide 27 text
Things get a little tricky once operator precedence is involved.
You need something like a Pratt parser.
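To make the precedence problem concrete, here is a minimal Pratt-style (binding power) parser sketch for the arithmetic examples above. Everything in it is an assumption made for illustration: the `Node`, `Integer`, and `Binary` types mirror the notation on the slides, the binding power values are arbitrary, and the input must be free of whitespace.

```go
package main

import (
	"fmt"
	"strconv"
)

// Node is a minimal AST node: an integer leaf or a binary operation.
type Node interface{ String() string }

type Integer int

func (n Integer) String() string { return fmt.Sprintf("Integer(%d)", int(n)) }

type Binary struct {
	Op   byte
	L, R Node
}

func (b Binary) String() string {
	return fmt.Sprintf("Binary{OP: '%c', L: %s, R: %s}", b.Op, b.L, b.R)
}

// bindingPower maps each infix operator to its binding power;
// a higher value binds tighter. The numbers themselves are arbitrary.
var bindingPower = map[byte]int{'+': 10, '-': 10, '*': 20, '/': 20}

type parser struct {
	src string
	pos int
}

// parseExpr consumes operators whose binding power is greater than minBP.
func (p *parser) parseExpr(minBP int) (Node, error) {
	left, err := p.parsePrimary()
	if err != nil {
		return nil, err
	}
	for p.pos < len(p.src) {
		op := p.src[p.pos]
		bp, ok := bindingPower[op]
		if !ok || bp <= minBP {
			break
		}
		p.pos++
		// Passing bp itself makes operators of equal power left-associative.
		right, err := p.parseExpr(bp)
		if err != nil {
			return nil, err
		}
		left = Binary{Op: op, L: left, R: right}
	}
	return left, nil
}

// parsePrimary parses an integer literal or a parenthesized expression.
func (p *parser) parsePrimary() (Node, error) {
	if p.pos < len(p.src) && p.src[p.pos] == '(' {
		p.pos++
		n, err := p.parseExpr(0)
		if err != nil {
			return nil, err
		}
		if p.pos >= len(p.src) || p.src[p.pos] != ')' {
			return nil, fmt.Errorf("expected ')' at offset %d", p.pos)
		}
		p.pos++
		return n, nil
	}
	start := p.pos
	for p.pos < len(p.src) && p.src[p.pos] >= '0' && p.src[p.pos] <= '9' {
		p.pos++
	}
	v, err := strconv.Atoi(p.src[start:p.pos])
	if err != nil {
		return nil, fmt.Errorf("expected integer at offset %d", start)
	}
	return Integer(v), nil
}

// Parse turns a whitespace-free arithmetic expression into an AST.
func Parse(src string) (Node, error) {
	p := &parser{src: src}
	n, err := p.parseExpr(0)
	if err != nil {
		return nil, err
	}
	if p.pos != len(src) {
		return nil, fmt.Errorf("unexpected %q at offset %d", src[p.pos], p.pos)
	}
	return n, nil
}

func main() {
	n, err := Parse("1+(2-1)*2+3")
	if err != nil {
		panic(err)
	}
	fmt.Println(n)
}
```

For `1+(2-1)*2+3` this prints a tree with the same shape as the AST on Slide 25: the multiplication nests under the first addition, and the trailing `+3` becomes the root.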
Slide 28
Slide 28 text
Implementing this is clearly going to be a pain
Slide 29
Slide 29 text
Good news
Slide 30
Slide 30 text
It can be generated automatically
Slide 31
Slide 31 text
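goyacc
• The Go version of yacc
  • yacc is a tool that has been around for a long time
• You describe the structure of the syntax in BNF, along with the action to run for each rule
• It generates only the parser in the narrow sense
  • It does not build a lexer for you
• Surprisingly, it is provided by the Go project itself (it shipped with Go as `go tool yacc` until Go 1.8 and now lives at golang.org/x/tools/cmd/goyacc)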
Slide 32
Slide 32 text
goyacc examples
• Plenty turn up if you search the web
Slide 33
Slide 33 text
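There are similar tools besides goyacc
• Tools like these are generally called parser generators
• github.com/alecthomas/participle
  • Expresses the syntax with Go structs, struct tags, and so on
• ANTLR
  • Can generate parsers for many languages
• And so on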
Slide 34
Slide 34 text
Isn't something like this good enough?
Slide 35
Slide 35 text
The End
Slide 36
Slide 36 text
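It does not always work out that way
• Using goyacc and the like is usually quite a chore (just my personal opinion)
• Writing BNF-style grammars is not much fun (just my personal opinion)
• You have to properly learn the tool and its grammar notation
  • It is not simple enough for anyone to pick up casually
• Fine-grained control is hard to get
• Performance tuning can be difficult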
Slide 37
Slide 37 text
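Note: in most cases you are better off using a parser generator
• Writing a parser is hard work
• The more complex the syntax, the bigger the benefit
• The lack of fine-grained control is often something you can live with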
Slide 38
Slide 38 text
Let's write one by hand
Slide 39
Slide 39 text
Subject: a GQL parser
• Not GraphQL, but a SQL-like language specific to Google Cloud
• The query language of Cloud Firestore in Datastore mode (Cloud Datastore)
  • https://cloud.google.com/datastore/docs/reference/gql_reference
• An implementation based on goyacc already exists
  • https://github.com/nshmura/dsio/blob/master/gql/parser.go.y
Slide 40
Slide 40 text
GQL examples
• SELECT * FROM foo
• SELECT __key__ FROM foo
• SELECT DISTINCT f1, f2 FROM foo
• SELECT DISTINCT ON (f1, f2) f1, f2, f3 FROM foo
• SELECT * FROM foo WHERE f1 = 1 AND (f2 = 2 OR f3 = 3)
Slide 41
Slide 41 text
github.com/karupanerura/gqlparser
• Born while I was building github.com/karupanerura/datastore-cli
• github.com/nshmura/dsio already existed, but I did not know about it
  • Its interface would not have suited my use case anyway
• Parses GQL and builds an AST for you
Slide 42
Slide 42 text
Note: from here on, a walkthrough of the gqlparser implementation
Slide 43
Slide 43 text
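Hand-writing the lexer
• A rough-and-ready approach gets you surprisingly far
• Watch out for keywords that partially match one another
  • Decide on a rule, such as always taking the longest match
• Returning something io.Reader-like is more convenient than returning a slice
  • Sometimes you want to do something like lookahead
• https://github.com/karupanerura/gqlparser/blob/main/lexer.go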
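The two lexer points above (longest match, and a reader-like interface that allows lookahead) might look roughly like the following. This is a sketch under assumptions, not the code in gqlparser's lexer.go: the `Token` and `Lexer` types, the `Peek` method, and the symbol list are all hypothetical, and only ASCII input is handled.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// Token is a hypothetical token: a kind name plus the matched text.
type Token struct {
	Kind string // "word" or "symbol"
	Text string
}

// symbols is ordered longest first so that "<=" wins over "<".
var symbols = []string{"<=", ">=", "!=", "<", ">", "=", "(", ")", ",", "*"}

// Lexer hands out tokens on demand instead of returning a slice.
type Lexer struct {
	src string
	pos int
}

func NewLexer(src string) *Lexer { return &Lexer{src: src} }

func isWordByte(b byte) bool {
	return b == '_' || (b >= '0' && b <= '9') || (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')
}

// skipSpace advances past ASCII whitespace.
func (l *Lexer) skipSpace() {
	for l.pos < len(l.src) && strings.ContainsRune(" \t\r\n", rune(l.src[l.pos])) {
		l.pos++
	}
}

// Next reports whether any input other than whitespace remains.
func (l *Lexer) Next() bool {
	l.skipSpace()
	return l.pos < len(l.src)
}

// scan returns the token at the current position and the offset just past it.
func (l *Lexer) scan() (Token, int, error) {
	l.skipSpace()
	if l.pos >= len(l.src) {
		return Token{}, l.pos, io.EOF
	}
	rest := l.src[l.pos:]
	for _, s := range symbols {
		if strings.HasPrefix(rest, s) {
			return Token{Kind: "symbol", Text: s}, l.pos + len(s), nil
		}
	}
	end := 0
	for end < len(rest) && isWordByte(rest[end]) {
		end++
	}
	if end == 0 {
		return Token{}, l.pos, fmt.Errorf("unexpected byte %q at offset %d", rest[0], l.pos)
	}
	return Token{Kind: "word", Text: rest[:end]}, l.pos + end, nil
}

// Peek returns the next token without consuming it (lookahead).
func (l *Lexer) Peek() (Token, error) {
	tok, _, err := l.scan()
	return tok, err
}

// Read returns the next token and advances past it.
func (l *Lexer) Read() (Token, error) {
	tok, next, err := l.scan()
	if err == nil {
		l.pos = next
	}
	return tok, err
}

func main() {
	l := NewLexer("f1 <= 10")
	for l.Next() {
		tok, err := l.Read()
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s %q\n", tok.Kind, tok.Text)
	}
}
```

Ordering the symbol list longest first is the simplest way to get longest-match behavior for a small, fixed set; a trie scales better once the set grows.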
Slide 44
Slide 44 text
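github.com/karupanerura/runetrie
• I built it because doing longest match in the lexer was tedious
• Gathers an arbitrary set of strings into a trie
• Can return the element that is the longest match at the start of a string
  • LongestMatchPrefixOf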
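The idea behind a trie-based longest prefix match can be sketched as follows. This is not runetrie's actual API; the `Trie` type, `NewTrie`, and `LongestPrefix` are hypothetical names used only to show the technique.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// node is one trie node keyed by rune.
type node struct {
	children map[rune]*node
	terminal bool // true if a registered string ends at this node
}

// Trie stores a set of strings for prefix lookups. (Hypothetical type.)
type Trie struct{ root *node }

// NewTrie builds a trie from the given words.
func NewTrie(words ...string) *Trie {
	t := &Trie{root: &node{children: map[rune]*node{}}}
	for _, w := range words {
		n := t.root
		for _, r := range w {
			child, ok := n.children[r]
			if !ok {
				child = &node{children: map[rune]*node{}}
				n.children[r] = child
			}
			n = child
		}
		n.terminal = true
	}
	return t
}

// LongestPrefix returns the longest registered string that s starts with.
func (t *Trie) LongestPrefix(s string) (string, bool) {
	n := t.root
	best := -1
	for i := 0; i < len(s); {
		r, size := utf8.DecodeRuneInString(s[i:])
		next, ok := n.children[r]
		if !ok {
			break
		}
		n = next
		i += size
		if n.terminal {
			best = i // remember the end of the longest match so far
		}
	}
	if best < 0 {
		return "", false
	}
	return s[:best], true
}

func main() {
	ops := NewTrie("<", "<=", ">", ">=", "=")
	fmt.Println(ops.LongestPrefix("<= 10")) // <= true
}
```

The walk keeps going after the first terminal node, which is what lets `<=` win over `<` without any special ordering of the registered strings.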
Slide 45
Slide 45 text
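Hand-writing the parser
• With enough grit, it works out
• But error handling ends up scattered all over the place, which is painful
• To fix the poor readability, it is nice to be able to write it in an internal-DSL-like style
  • This time I built my own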
Slide 46
Slide 46 text
tokenAcceptor

    type tokenAcceptor interface {
        accept(tokenReader) error
    }

    type tokenReader interface {
        Next() bool
        Read() (Token, error)
    }

    // Various tokenAcceptor implementations follow
Slide 47
Slide 47 text
tokenAcceptor usage example

    func acceptQuery(query *Query) tokenAcceptor {
        return tokenAcceptors{
            skipWhitespaceToken,
            acceptKeyword("SELECT"),
            acceptWhitespaceToken,
            acceptSelectQueryBody(query),
        }
    }

    func acceptSelectQueryBody(query *Query) tokenAcceptor {
        return tokenAcceptors{
            &conditionalTokenAcceptor{
                ifAccept: acceptKeyword("DISTINCT"),
                andThen:  acceptDistinctBody(query),
                orElse:   nopAcceptor,
            },
            ...
Slide 48
Slide 48 text
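How to use it
• Compose tokenAcceptors to match the syntax
• Something like conditional branching is also possible
• Instead of lookahead, there is a mechanism for rolling back the read position
  • https://github.com/karupanerura/gqlparser/blob/main/parser.go#L396
• What is this kind of thing usually called? A PEG as an internal DSL?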
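One way such composable acceptors with rollback could be implemented is sketched below. The type names `tokenAcceptor`, `tokenReader`, `tokenAcceptors`, `conditionalTokenAcceptor`, `acceptKeyword`, and `nopAcceptor` come from the slides, but every method body here is a guess, and `Pos`, `Reset`, `sliceReader`, `acceptorFunc`, and `parse` are hypothetical additions; the real gqlparser code may work quite differently. Whitespace tokens are left out to keep the sketch short.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

type Token struct{ Text string }

// tokenReader: Next and Read come from the slide; Pos and Reset are
// hypothetical additions that this sketch uses for rollback.
type tokenReader interface {
	Next() bool
	Read() (Token, error)
	Pos() int
	Reset(pos int)
}

type tokenAcceptor interface{ accept(tokenReader) error }

// sliceReader is a hypothetical tokenReader backed by a token slice.
type sliceReader struct {
	tokens []Token
	pos    int
}

func (r *sliceReader) Next() bool    { return r.pos < len(r.tokens) }
func (r *sliceReader) Pos() int      { return r.pos }
func (r *sliceReader) Reset(pos int) { r.pos = pos }
func (r *sliceReader) Read() (Token, error) {
	if !r.Next() {
		return Token{}, errors.New("unexpected end of input")
	}
	r.pos++
	return r.tokens[r.pos-1], nil
}

// acceptorFunc adapts a plain function to tokenAcceptor.
type acceptorFunc func(tokenReader) error

func (f acceptorFunc) accept(r tokenReader) error { return f(r) }

// tokenAcceptors runs its elements in order, stopping at the first error.
type tokenAcceptors []tokenAcceptor

func (as tokenAcceptors) accept(r tokenReader) error {
	for _, a := range as {
		if err := a.accept(r); err != nil {
			return err
		}
	}
	return nil
}

// conditionalTokenAcceptor branches on whether ifAccept succeeds.
type conditionalTokenAcceptor struct{ ifAccept, andThen, orElse tokenAcceptor }

func (c *conditionalTokenAcceptor) accept(r tokenReader) error {
	start := r.Pos()
	if err := c.ifAccept.accept(r); err != nil {
		r.Reset(start) // roll back whatever ifAccept consumed
		return c.orElse.accept(r)
	}
	return c.andThen.accept(r)
}

var nopAcceptor = acceptorFunc(func(tokenReader) error { return nil })

// acceptKeyword accepts one token equal to kw, ignoring case.
func acceptKeyword(kw string) tokenAcceptor {
	return acceptorFunc(func(r tokenReader) error {
		tok, err := r.Read()
		if err != nil {
			return err
		}
		if !strings.EqualFold(tok.Text, kw) {
			return fmt.Errorf("expected %s, got %q", kw, tok.Text)
		}
		return nil
	})
}

// parse accepts "SELECT [DISTINCT] *" and reports whether DISTINCT was present.
func parse(words ...string) (distinct bool, err error) {
	r := &sliceReader{}
	for _, w := range words {
		r.tokens = append(r.tokens, Token{Text: w})
	}
	err = tokenAcceptors{
		acceptKeyword("SELECT"),
		&conditionalTokenAcceptor{
			ifAccept: acceptKeyword("DISTINCT"),
			andThen:  acceptorFunc(func(tokenReader) error { distinct = true; return nil }),
			orElse:   nopAcceptor,
		},
		acceptKeyword("*"),
	}.accept(r)
	return distinct, err
}

func main() {
	fmt.Println(parse("SELECT", "DISTINCT", "*"))
}
```

Rolling back on failure means each acceptor can simply consume tokens and report an error, without needing a separate peek operation.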
Slide 49
Slide 49 text
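The part not composed from tokenAcceptors
• Only this part is defined as a single standalone tokenAcceptor, kept independent of the rest
• The parser for the WHERE clause
  • It is a Pratt parser
  • AND, OR, and the other operators each have their own precedence
• https://github.com/karupanerura/gqlparser/blob/main/condition_parser.go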
Slide 50
Slide 50 text
It worked out nicely
Slide 51
Slide 51 text
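Summary
• Writing a parser is hard
  • Using a parser generator or similar is the smarter choice
  • Hand-writing one is basically a thorny path
• Writing a parser is fun
  • You get to write code you would not normally write in everyday web development
  • It makes a perfect change of pace, so I recommend it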