Slide 1

Slide 1 text

Writing a Parser in Go
kamakura.go #6
@karupanerura

Slide 2

Slide 2 text

@karupanerura
• Perl / Go / Java / TypeScript / etc.
• Software Engineer @ DeNA Co., Ltd.
• Representative Director @ Japan Perl Association
• By the time I noticed, the GoCon CfP had already closed

Slide 3

Slide 3 text

Parser history
• I have written a fair number of parsers
• Perl (JSON5, TOML::Parser, Text::MustacheTemplate)
• C (c-geohex3, MySQL::Dump::Parser::XS, etc.)
• Go (gqlparser, google-cloud-workflow-emulator)
• But I have never studied the subject systematically
• If I say anything odd, please call it out

Slide 4

Slide 4 text

jpa.Promotion()

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

2024/10/05 (Sat) Future University Hakodate

Slide 7

Slide 7 text

Gophers are welcome too

Slide 8

Slide 8 text

Main topic

Slide 9

Slide 9 text

What I will cover
• The basics of parsers that target text
• A brief introduction to well-known algorithms and tools
• How I implemented a parser that I actually wrote in Go

Slide 10

Slide 10 text

What I will not cover
• Parsing anything other than text (binary formats and so on)
• The details of well-known algorithms
• How to use existing libraries
• ChatGPT-4o

Slide 11

Slide 11 text

Parser

Slide 12

Slide 12 text

The thing that interprets a specific syntax

Slide 13

Slide 13 text

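What is a parser?
• Processing that interprets text following some syntax and turns it into a structure a program can work with
• Called syntax analysis, or parsing
• The example most familiar to Gophers is probably the parser for Go code itself
• go/parser
• go/ast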
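To make the go/parser and go/ast mention concrete, here is a minimal sketch that parses a Go expression with the standard library and inspects the root of the resulting tree. The helper name `TopLevel` is made up for this illustration; `parser.ParseExpr`, `ast.BinaryExpr`, and `types.ExprString` are standard-library APIs.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/types"
)

// TopLevel (hypothetical helper) parses src as a Go expression and returns the
// operator and the two operands of its root node, which must be a binary
// expression.
func TopLevel(src string) (op, left, right string, err error) {
	expr, err := parser.ParseExpr(src)
	if err != nil {
		return "", "", "", err
	}
	bin, ok := expr.(*ast.BinaryExpr)
	if !ok {
		return "", "", "", fmt.Errorf("%q is not a binary expression", src)
	}
	// types.ExprString renders an AST node back into source-like text.
	return bin.Op.String(), types.ExprString(bin.X), types.ExprString(bin.Y), nil
}

func main() {
	op, left, right, err := TopLevel("123+456")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("Binary{OP: %q, L: %s, R: %s}\n", op, left, right)
}
```

The root of `123+456` is a single `*ast.BinaryExpr` whose operands are the two literals, which is the same shape as the Binary nodes used in the arithmetic examples of this talk.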

Slide 14

Slide 14 text

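The general procedure for parsing a syntax
• (Tokenize) Group the text into meaningful units
• Example: "123+456" -> "123", "+", "456"
• The component that does this is called a lexer, tokenizer, scanner, and so on
• (Parse) Convert those units into a structure that expresses their meaning
• Example: "123", "+", "456" -> Plus{Left: 123, Right: 456}
• "Parser" in the narrow sense sometimes refers to this step alone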
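The tokenize step above can be sketched in a few lines of Go. This is only an illustration under simple assumptions (non-negative integers, single-character symbols, whitespace ignored); the names `Token`, `TokenKind`, and `Tokenize` are invented for the example and are not taken from any library.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// TokenKind distinguishes the two kinds of token this sketch knows about.
type TokenKind int

const (
	Integer TokenKind = iota
	Symbol
)

// Token is one meaningful unit of the input text.
type Token struct {
	Kind TokenKind
	Text string
}

// Tokenize splits src into integer and symbol tokens, skipping whitespace.
func Tokenize(src string) ([]Token, error) {
	var tokens []Token
	runes := []rune(src)
	for i := 0; i < len(runes); {
		r := runes[i]
		switch {
		case unicode.IsSpace(r):
			i++
		case r >= '0' && r <= '9':
			// Consume the whole run of digits as a single token.
			start := i
			for i < len(runes) && runes[i] >= '0' && runes[i] <= '9' {
				i++
			}
			tokens = append(tokens, Token{Kind: Integer, Text: string(runes[start:i])})
		case strings.ContainsRune("+-*/()", r):
			tokens = append(tokens, Token{Kind: Symbol, Text: string(r)})
			i++
		default:
			return nil, fmt.Errorf("unexpected character %q at offset %d", r, i)
		}
	}
	return tokens, nil
}

func main() {
	tokens, err := Tokenize("123+456")
	if err != nil {
		fmt.Println("tokenize error:", err)
		return
	}
	for _, t := range tokens {
		fmt.Printf("%q ", t.Text)
	}
	fmt.Println()
}
```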

Slide 15

Slide 15 text

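Other terms that come up often
• AST (Abstract Syntax Tree)
• Represents, as a structure, the meaning of a text when it is interpreted under some syntax
• The parser described earlier usually produces one of these as its result
• BNF
• A syntax for describing the structure of a syntax
• Derivatives such as ABNF also exist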

Slide 16

Slide 16 text

The case of arithmetic expressions

Slide 17

Slide 17 text

1+2

Slide 18

Slide 18 text

Integer(1) Symbol('+') Integer(2)

Slide 19

Slide 19 text

Binary{ OP: '+', L: Integer(1), R: Integer(2), }

Slide 20

Slide 20 text

123+456

Slide 21

Slide 21 text

Integer(123) Symbol('+') Integer(456)

Slide 22

Slide 22 text

Binary{ OP: '+', L: Integer(123), R: Integer(456), }

Slide 23

Slide 23 text

1+(2-1)*2+3

Slide 24

Slide 24 text

Integer(1) Symbol('+') Symbol('(') Integer(2) Symbol('-') Integer(1) Symbol(')') Symbol('*') Integer(2) Symbol('+') Integer(3)

Slide 25

Slide 25 text

Binary{
    OP: '+',
    L: Binary{
        OP: '+',
        L: Integer(1),
        R: Binary{
            OP: '*',
            L: Binary{OP: '-', L: Integer(2), R: Integer(1)},
            R: Integer(2),
        },
    },
    R: Integer(3),
}
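The tree notation on these slides is close to real Go, so it can be turned into a small runnable sketch. The `Node` interface and the `Eval` method are assumptions added here to show that the nesting alone encodes the evaluation order; the slides only define the `Binary` and `Integer` shapes.

```go
package main

import "fmt"

// Node is any AST node that can be evaluated to an integer.
// (Eval is an addition for this sketch; the slides only show the tree shape.)
type Node interface {
	Eval() int
}

// Integer is a literal leaf node.
type Integer int

func (n Integer) Eval() int { return int(n) }

// Binary is an operator node with a left and a right operand.
type Binary struct {
	OP   rune
	L, R Node
}

func (b Binary) Eval() int {
	l, r := b.L.Eval(), b.R.Eval()
	switch b.OP {
	case '+':
		return l + r
	case '-':
		return l - r
	case '*':
		return l * r
	default:
		panic(fmt.Sprintf("unknown operator %q", b.OP))
	}
}

// Example builds the tree for 1+(2-1)*2+3 exactly as written on the slide.
func Example() Node {
	return Binary{
		OP: '+',
		L: Binary{
			OP: '+',
			L:  Integer(1),
			R: Binary{
				OP: '*',
				L:  Binary{OP: '-', L: Integer(2), R: Integer(1)},
				R:  Integer(2),
			},
		},
		R: Integer(3),
	}
}

func main() {
	fmt.Println(Example().Eval()) // (1 + ((2-1)*2)) + 3 = 6
}
```

No parentheses or precedence rules survive in the tree: the multiplication sits deeper than the additions, so it is evaluated first.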

Slide 26

Slide 26 text

This is the kind of thing a parser does

Slide 27

Slide 27 text

Things get a bit tricky once operator precedence is involved
You need something like a Pratt parser
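As a rough illustration of what a Pratt-style (top-down operator precedence) parser looks like, here is a minimal sketch. It works directly on the characters of the input, assumes non-negative integers, `+ - * /`, parentheses, and no whitespace, and returns a fully parenthesized string rather than an AST so that the grouping is easy to see. The binding-power numbers and all names are choices made for this example.

```go
package main

import "fmt"

// bindingPower gives each binary operator its precedence; higher binds tighter.
var bindingPower = map[byte]int{'+': 10, '-': 10, '*': 20, '/': 20}

type parser struct {
	src string
	pos int
}

// peek returns the current byte, or 0 at the end of input.
func (p *parser) peek() byte {
	if p.pos < len(p.src) {
		return p.src[p.pos]
	}
	return 0
}

// parseExpr parses an expression, consuming only operators that bind more
// tightly than minBP. Recursing with the operator's own binding power makes
// operators of equal precedence left-associative.
func (p *parser) parseExpr(minBP int) (string, error) {
	left, err := p.parsePrefix()
	if err != nil {
		return "", err
	}
	for {
		op := p.peek()
		bp, ok := bindingPower[op]
		if !ok || bp <= minBP {
			return left, nil
		}
		p.pos++
		right, err := p.parseExpr(bp)
		if err != nil {
			return "", err
		}
		left = fmt.Sprintf("(%s %c %s)", left, op, right)
	}
}

// parsePrefix parses an integer literal or a parenthesized expression.
func (p *parser) parsePrefix() (string, error) {
	switch c := p.peek(); {
	case c == '(':
		p.pos++
		inner, err := p.parseExpr(0)
		if err != nil {
			return "", err
		}
		if p.peek() != ')' {
			return "", fmt.Errorf("expected ')' at offset %d", p.pos)
		}
		p.pos++
		return inner, nil
	case c >= '0' && c <= '9':
		start := p.pos
		for c := p.peek(); c >= '0' && c <= '9'; c = p.peek() {
			p.pos++
		}
		return p.src[start:p.pos], nil
	default:
		return "", fmt.Errorf("unexpected input at offset %d", p.pos)
	}
}

// Parse returns src with every binary operation explicitly parenthesized.
func Parse(src string) (string, error) {
	p := &parser{src: src}
	out, err := p.parseExpr(0)
	if err != nil {
		return "", err
	}
	if p.pos != len(p.src) {
		return "", fmt.Errorf("unexpected input at offset %d", p.pos)
	}
	return out, nil
}

func main() {
	out, err := Parse("1+(2-1)*2+3")
	fmt.Println(out, err)
}
```

For `1+(2-1)*2+3` the grouping comes out as `((1 + ((2 - 1) * 2)) + 3)`, which is the same shape as the Binary tree shown for that expression.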

Slide 28

Slide 28 text

Implementing this clearly looks like a hassle

Slide 29

Slide 29 text

Good news

Slide 30

Slide 30 text

It can be generated automatically

Slide 31

Slide 31 text

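goyacc
• The Go version of yacc
• yacc is a tool that has been around for a long time
• You describe the structure of the syntax, and the action to run for each rule, using BNF
• It generates only the parser in the narrow sense
• It does not build a lexer for you
• It is maintained by the Go project itself: it shipped with Go as `go tool yacc` until Go 1.8 and now lives at golang.org/x/tools/cmd/goyacc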

Slide 32

Slide 32 text

goyacc examples
• Plenty turn up if you search the web

Slide 33

Slide 33 text

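There are similar tools besides goyacc
• Tools like these are generally called parser generators
• github.com/alecthomas/participle
• Expresses the syntax with Go struct tags and the like
• ANTLR
• Can generate parsers in many languages
• And so on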

Slide 34

Slide 34 text

Why not just use one of these?

Slide 35

Slide 35 text

Slide 36

Slide 36 text

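Sometimes it does not work out that way
• goyacc and friends are, by and large, fairly tedious (personal opinion)
• BNF-style grammars are not much fun to write (personal opinion)
• You have to properly learn the tool and its grammar syntax
• They are not simple enough for just anyone to pick up casually
• You lose fine-grained control over the details
• Performance tuning can also be difficult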

Slide 37

Slide 37 text

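Note: in most cases you are better off using a parser generator
• Writing a parser is hard work
• The more complex the syntax, the bigger the payoff
• Even without fine-grained control, things often work out anyway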

Slide 38

Slide 38 text

Let's write one by hand

Slide 39

Slide 39 text

Subject: a GQL parser
• Not GraphQL, but a proprietary SQL-like language in Google Cloud
• The query language of Cloud Firestore in Datastore mode (Cloud Datastore)
• https://cloud.google.com/datastore/docs/reference/gql_reference
• An implementation using goyacc already exists
• https://github.com/nshmura/dsio/blob/master/gql/parser.go.y

Slide 40

Slide 40 text

GQL examples
• SELECT * FROM foo
• SELECT __key__ FROM foo
• SELECT DISTINCT f1, f2 FROM foo
• SELECT DISTINCT ON (f1, f2) f1, f2, f3 FROM foo
• SELECT * FROM foo WHERE f1 = 1 AND (f2 = 2 OR f3 = 3)

Slide 41

Slide 41 text

github.com/karupanerura/gqlparser
• Born while I was building github.com/karupanerura/datastore-cli
• github.com/nshmura/dsio already existed, but I did not know about it
• Its interface would not have fit my use case anyway
• Parses GQL and builds an AST

Slide 42

Slide 42 text

Note: from here on, a walkthrough of the gqlparser implementation

Slide 43

Slide 43 text

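Hand-writing the lexer
• A rough-and-ready approach gets you surprisingly far
• Watch out for keywords that partially match one another
• Decide on a rule, such as always taking the longest match
• Returning something io.Reader-like is more convenient than returning a slice
• Sometimes you want to do something like lookahead
• https://github.com/karupanerura/gqlparser/blob/main/lexer.go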

Slide 44

Slide 44 text

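github.com/karupanerura/runetrie
• I built it because doing longest-match in the lexer by hand was tedious
• Collects an arbitrary set of strings into a trie
• Can return the element that is the longest match against the start of a string
• LongestMatchPrefixOf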
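The idea behind longest-match lookup with a trie can be sketched as follows. This is not runetrie's actual API or implementation: the `Trie` type, `NewTrie`, and `LongestPrefix` are hypothetical names written from scratch for illustration, and the operator set is just an example of keywords where one is a prefix of another.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// node is one trie node; terminal marks the end of a registered word.
type node struct {
	children map[rune]*node
	terminal bool
}

// Trie (hypothetical) stores a set of strings keyed rune by rune.
type Trie struct {
	root *node
}

// NewTrie builds a trie containing the given words.
func NewTrie(words ...string) *Trie {
	t := &Trie{root: &node{children: map[rune]*node{}}}
	for _, w := range words {
		n := t.root
		for _, r := range w {
			next, ok := n.children[r]
			if !ok {
				next = &node{children: map[rune]*node{}}
				n.children[r] = next
			}
			n = next
		}
		n.terminal = true
	}
	return t
}

// LongestPrefix returns the longest registered word that s starts with.
func (t *Trie) LongestPrefix(s string) (string, bool) {
	n, end := t.root, -1
	for i := 0; i < len(s); {
		r, size := utf8.DecodeRuneInString(s[i:])
		next, ok := n.children[r]
		if !ok {
			break
		}
		n = next
		i += size
		if n.terminal {
			end = i // remember the longest complete word seen so far
		}
	}
	if end < 0 {
		return "", false
	}
	return s[:end], true
}

func main() {
	ops := NewTrie("<", "<=", ">", ">=", "=", "!=")
	fmt.Println(ops.LongestPrefix("<= 10"))
	fmt.Println(ops.LongestPrefix("< 10"))
}
```

Because the walk keeps going after the first complete word, `<=` wins over `<` whenever both match, which is the rule the lexer needs.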

Slide 45

Slide 45 text

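Hand-writing the parser
• With enough grit, it can be done
• But error handling ends up scattered all over the place, which is painful
• To fix the readability problem, it is nice to be able to write it in something like an internal DSL
• This time I built my own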

Slide 46

Slide 46 text

tokenAcceptor
type tokenAcceptor interface {
    accept(tokenReader) error
}
type tokenReader interface {
    Next() bool
    Read() (Token, error)
}
// Various tokenAcceptor implementations follow

Slide 47

Slide 47 text

tokenAcceptor usage example
func acceptQuery(query *Query) tokenAcceptor {
    return tokenAcceptors{
        skipWhitespaceToken,
        acceptKeyword("SELECT"),
        acceptWhitespaceToken,
        acceptSelectQueryBody(query),
    }
}
func acceptSelectQueryBody(query *Query) tokenAcceptor {
    return tokenAcceptors{
        &conditionalTokenAcceptor{
            ifAccept: acceptKeyword("DISTINCT"),
            andThen:  acceptDistinctBody(query),
            orElse:   nopAcceptor,
        },
        ...

Slide 48

Slide 48 text

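How to use it
• Compose tokenAcceptors to match the syntax
• Something like conditional branching is also supported
• Instead of lookahead, there is a mechanism that rolls back the read position
• https://github.com/karupanerura/gqlparser/blob/main/parser.go#L396
• What is this kind of thing generally called? A PEG written as an internal DSL?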
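The composition and rollback ideas above can be sketched in a heavily simplified form. This style closely resembles what is usually called parser combinators. Nothing below is gqlparser's real code: the function-based `acceptor` type, the slice-backed `tokenReader`, whitespace-split tokens, and every name here are assumptions made to keep the example short.

```go
package main

import (
	"fmt"
	"strings"
)

type Token string

// tokenReader is a slice-backed reader whose position can be restored.
type tokenReader struct {
	tokens []Token
	pos    int
}

func (r *tokenReader) next() (Token, bool) {
	if r.pos >= len(r.tokens) {
		return "", false
	}
	t := r.tokens[r.pos]
	r.pos++
	return t, true
}

// acceptor consumes tokens from the reader or reports why it cannot.
type acceptor func(r *tokenReader) error

// keyword accepts exactly one token equal to kw.
func keyword(kw Token) acceptor {
	return func(r *tokenReader) error {
		if t, ok := r.next(); !ok || t != kw {
			return fmt.Errorf("expected %q", kw)
		}
		return nil
	}
}

// capture stores the next token into dst.
func capture(dst *Token) acceptor {
	return func(r *tokenReader) error {
		t, ok := r.next()
		if !ok {
			return fmt.Errorf("unexpected end of input")
		}
		*dst = t
		return nil
	}
}

// seq runs the acceptors in order and stops at the first error.
func seq(as ...acceptor) acceptor {
	return func(r *tokenReader) error {
		for _, a := range as {
			if err := a(r); err != nil {
				return err
			}
		}
		return nil
	}
}

// cond tries ifAccept; on failure it rolls the reader back and runs orElse.
func cond(ifAccept, andThen, orElse acceptor) acceptor {
	return func(r *tokenReader) error {
		saved := r.pos
		if err := ifAccept(r); err != nil {
			r.pos = saved // roll back instead of looking ahead
			return orElse(r)
		}
		return andThen(r)
	}
}

func nop(*tokenReader) error { return nil }

// Query is a toy result for "SELECT [DISTINCT] <field> FROM <kind>".
type Query struct {
	Distinct    bool
	Field, Kind Token
}

func acceptSimpleQuery(q *Query) acceptor {
	return seq(
		keyword("SELECT"),
		cond(keyword("DISTINCT"),
			func(*tokenReader) error { q.Distinct = true; return nil },
			nop),
		capture(&q.Field),
		keyword("FROM"),
		capture(&q.Kind),
	)
}

func parse(src string) (Query, error) {
	var q Query
	r := &tokenReader{}
	for _, f := range strings.Fields(src) {
		r.tokens = append(r.tokens, Token(f))
	}
	if err := acceptSimpleQuery(&q)(r); err != nil {
		return Query{}, err
	}
	if r.pos != len(r.tokens) {
		return Query{}, fmt.Errorf("unexpected trailing tokens")
	}
	return q, nil
}

func main() {
	fmt.Println(parse("SELECT DISTINCT f1 FROM foo"))
	fmt.Println(parse("SELECT f1 FROM foo"))
}
```

In the second query the DISTINCT check consumes `f1`, fails, and restores the position, so the following `capture` still sees `f1`. That is the rollback standing in for lookahead.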

Slide 49

Slide 49 text

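The part not composed from tokenAcceptors
• This part alone is defined as a single standalone tokenAcceptor
• The parser for the WHERE clause
• It is a Pratt parser
• There are precedence rules among AND, OR, and the other operators
• https://github.com/karupanerura/gqlparser/blob/main/condition_parser.go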

Slide 50

Slide 50 text

It worked out nicely

Slide 51

Slide 51 text

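Summary
• Writing a parser is hard
• Using a parser generator or the like is the smarter choice
• Hand-writing one is basically a thorny path
• Writing a parser is fun
• You get to write code you would not normally write in web development
• It makes a perfect change of pace, so I recommend it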