Slide 1

Slide 1 text

What is expected? Yuichiro Kaneko (@yui-knk), 2019/12/14, Heisei Ruby Kaigi 01 Keynote

Slide 2

Slide 2 text

Self-introduction
• Yuichiro Kaneko
• Works at Arm Treasure Data
• Audience team (writes Rails applications)
• CRuby committer since 2015/12
• GitHub (yui-knk)

Slide 3

Slide 3 text

We are hiring https://www.treasuredata.com/company/careers/jobs/?team=Engineering

Slide 4

Slide 4 text

We are hiring https://www.treasuredata.com/company/careers/jobs/?team=Engineering

Slide 5

Slide 5 text

What is expected?

Slide 6

Slide 6 text

What is expected?
• What kind of "string" can appear at the position marked ^^^?

    def xxx
        ^^^

Slide 7

Slide 7 text

What is expected?
• Define a method named "m1"

    def m1
    end

Slide 8

Slide 8 text

What is expected?
• "def" can appear there too

    def def

Slide 9

Slide 9 text

What is expected?
• "def" can appear there too

    def def
    def def a; end

Slide 10

Slide 10 text

What is expected?

    $ ruby -wc -e 'def def a; end'
    Syntax OK

Slide 11

Slide 11 text

What is expected?
• This defines a method named "def" that takes one argument, "a"
• The second `def` is a `reswords`, which is an `fname`, so the input matches `k_def fname ...`

    def def a
        ^^^ ^
         |  |-- args
         +----- method name
    end
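The point above can be checked from Ruby itself. The following is a minimal sketch; the class name `Demo` and the method body are made up for illustration and are not part of the talk.

```ruby
# Hypothetical class, used only to show that `def def a` defines
# a method whose name is the reserved word "def".
class Demo
  def def a
    a * 2
  end
end

demo = Demo.new
p Demo.instance_methods(false)  # methods defined directly on Demo
p demo.public_send(:def, 21)    # call the method named "def"
```

`public_send` is used here only so that the call site does not depend on how the parser treats `def` after a dot.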

Slide 12

Slide 12 text

What is expected?
• At a given point, how can we find out which "strings" (expected tokens) may come next?

Slide 13

Slide 13 text

How Ruby script is processed

Slide 14

Slide 14 text

How Ruby script is processed

    Step          Input        Output     Debug      Source
    Tokenization  Ruby script  Tokens     --dump=y   parse.y
    Parsing       Tokens       AST        --dump=p   parse.y
    Compile       AST          Byte code  --dump=i   compile.c

                       Parsing
                         ___
                            \
    Ruby script -> Tokens -> AST -> Byte code (insns / ISeq)
              __/               __/
    Tokenization           Compile

Slide 15

Slide 15 text

Ruby script

    1 + 2

Slide 16

Slide 16 text

Tokenization

                       Parsing
                         ___
                            \
    Ruby script -> Tokens -> AST -> Byte code (insns / ISeq)
              __/               __/
    Tokenization           Compile

Slide 17

Slide 17 text

Tokenization
• A token carries the following two pieces of information:
  • a token type (tINTEGER)
  • a semantic value (1)

    1 + 2
    ^ ^ ^^
    | | |+--- '\n' / "end-of-input"
    | | +---- tINTEGER (2)
    | +------ '+'
    +-------- tINTEGER (1)

Slide 18

Slide 18 text

Tokenization

    $ ruby --dump=y -e '1 + 2' | grep Shifting
    Shifting token "integer literal" (1.0-1.1: 1)
    Shifting token '+' (1.2-1.3: )
    Shifting token "integer literal" (1.4-1.5: 2)
    Shifting token '\n' (1.5-1.5: )
    Shifting token "end-of-input" (1.5-1.5: )

On Ruby 2.7.0preview3
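The token stream can also be inspected from Ruby with the standard `ripper` library. This is only a rough sketch: Ripper reports its own event names (`:on_int`, `:on_op`, ...) rather than the parse.y token names shown above, and it also reports whitespace.

```ruby
require "ripper"

# Ripper.lex returns one entry per token: [[line, column], type, text, state].
# Keep only the type and the text, which correspond loosely to the
# "token type" and "semantic value" of the slide.
tokens = Ripper.lex("1 + 2").map { |(_pos, type, text, _state)| [type, text] }

tokens.each { |type, text| puts format("%-8s %p", type, text) }
```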

Slide 19

Slide 19 text

Parsing

                       Parsing
                         ___
                            \
    Ruby script -> Tokens -> AST -> Byte code (insns / ISeq)
              __/               __/
    Tokenization           Compile

Slide 20

Slide 20 text

Parsing
• Checks whether the token sequence conforms to Ruby's syntax
• Builds the AST (Abstract Syntax Tree)

Slide 21

Slide 21 text

Parsing

parse.y:

    numeric : simple_numeric
            | tUMINUS_NUM simple_numeric %prec tLOWEST
            ;

    simple_numeric : tINTEGER
                   | tFLOAT
                   | tRATIONAL
                   | tIMAGINARY
                   ;

    %%
    program : { } top_compstmt

Slide 22

Slide 22 text

Parsing

parse.y:

    numeric : simple_numeric
            | tUMINUS_NUM simple_numeric %prec tLOWEST
            ;

    simple_numeric : tINTEGER
                   | tFLOAT
                   | tRATIONAL
                   | tIMAGINARY
                   ;

    %%
    program : { } top_compstmt

Rules: each of the three definitions above is marked as a rule.

Slide 23

Slide 23 text

Parsing

parse.y:

    numeric : simple_numeric
            | tUMINUS_NUM simple_numeric %prec tLOWEST
            ;

    simple_numeric : tINTEGER
                   | tFLOAT
                   | tRATIONAL
                   | tIMAGINARY
                   ;

    %%
    program : { } top_compstmt

Slide 24

Slide 24 text

Parsing

parse.y:

    numeric : simple_numeric
            | tUMINUS_NUM simple_numeric %prec tLOWEST
            ;

    simple_numeric : tINTEGER
                   | tFLOAT
                   | tRATIONAL
                   | tIMAGINARY
                   ;

    %%
    program : { } top_compstmt

Examples: 1 (tINTEGER), 2.1 (tFLOAT), 3r (tRATIONAL), 4i (tIMAGINARY)

Slide 25

Slide 25 text

Parsing

parse.y:

    numeric : simple_numeric
            | tUMINUS_NUM simple_numeric %prec tLOWEST
            ;

    simple_numeric : tINTEGER
                   | tFLOAT
                   | tRATIONAL
                   | tIMAGINARY
                   ;

    %%
    program : { } top_compstmt

Example input: 1

Slide 26

Slide 26 text

Parsing

parse.y:

    numeric : simple_numeric
            | tUMINUS_NUM simple_numeric %prec tLOWEST
            ;

    simple_numeric : tINTEGER
                   | tFLOAT
                   | tRATIONAL
                   | tIMAGINARY
                   ;

    %%
    program : { } top_compstmt

Example input: 1

Slide 27

Slide 27 text

Parsing

parse.y:

    numeric : simple_numeric
            | tUMINUS_NUM simple_numeric %prec tLOWEST
            ;

    simple_numeric : tINTEGER
                   | tFLOAT
                   | tRATIONAL
                   | tIMAGINARY
                   ;

    %%
    program : { } top_compstmt

Goal: program

Slide 28

Slide 28 text

Parsing

    $ ruby --dump=y -e '1 + 2'
    # Shifting token "integer literal" (1)
    "integer literal"
    simple_numeric
    numeric
    literal
    primary
    arg

Slide 29

Slide 29 text

Parsing

    # Shifting token '+'
    arg '+'
    # Shifting token "integer literal" (2)
    arg '+' "integer literal"
    arg '+' simple_numeric
    arg '+' numeric
    arg '+' literal
    arg '+' primary
    arg '+' arg
    arg
    expr
    stmt
    top_stmt
    top_stmts

Slide 30

Slide 30 text

Parsing

    # Shifting token '\n'
    top_stmts '\n'
    top_stmts term
    top_stmts terms
    top_stmts opt_terms
    top_compstmt
    program
    # Completed
    # Shifting token "end-of-input"

Slide 31

Slide 31 text

Build AST

    $ ruby --dump=p -e '1 + 2'
    NODE_SCOPE
      NODE_OPCALL (:+)
        NODE_LIT (1)
        NODE_LIST
          NODE_LIT (2)
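A similar tree can be obtained from inside Ruby. The sketch below assumes CRuby 2.6 or later, where `RubyVM::AbstractSyntaxTree` is available; the helper `dump` is a made-up name, and node type names such as `LIT` vary between Ruby versions.

```ruby
# CRuby-only API; the output format differs from `ruby --dump=p`.
root = RubyVM::AbstractSyntaxTree.parse("1 + 2")

# Hypothetical helper: print the tree with two spaces of indentation per level.
def dump(node, depth = 0)
  unless node.is_a?(RubyVM::AbstractSyntaxTree::Node)
    puts "#{'  ' * depth}#{node.inspect}"  # leaf value such as :+ or nil
    return
  end
  puts "#{'  ' * depth}#{node.type}"
  node.children.each { |child| dump(child, depth + 1) }
end

dump(root)
```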

Slide 32

Slide 32 text

Compile

                       Parsing
                         ___
                            \
    Ruby script -> Tokens -> AST -> Byte code (insns / ISeq)
              __/               __/
    Tokenization           Compile

Slide 33

Slide 33 text

Compile
• Compiles the AST into byte code
• See "compile.c"

    $ ruby --dump=i -e '1 + 2'
    == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,5)> (catch: FALSE)
    0000 putobject_INT2FIX_1_                   (   1)[Li]
    0001 putobject                    2
    0003 opt_plus
    0005 leave
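The same disassembly can be produced from a script. This is a minimal sketch that assumes CRuby; the exact instruction names and operands can differ between Ruby versions.

```ruby
# CRuby-only API: compile source text into an instruction sequence (ISeq).
iseq = RubyVM::InstructionSequence.compile("1 + 2")

puts iseq.disasm  # human-readable listing, similar to `ruby --dump=i`
p iseq.eval       # run the compiled code
```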

Slide 34

Slide 34 text

Parser is today's topic
• Expected tokens are a matter of syntax analysis (parsing)

                       Parsing
                         ___
                            \
    Ruby script -> Tokens -> AST -> Byte code (insns / ISeq)
              __/               __/
    Tokenization           Compile

Slide 35

Slide 35 text

What is parser?

Slide 36

Slide 36 text

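What is parser?
• Parsing means computing how the input string can be generated from the grammar
• Ruby uses GNU Bison to generate an LALR(1) parser
• GNU Bison takes parse.y as input and outputs the C code of the parser
• The generated parser builds a parsing table and analyzes the syntax with a pushdown automaton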

Slide 37

Slide 37 text

What is grammar?
• Production rule:
  • e.g. simple_numeric: tINTEGER
• In the earlier example, everything that can be produced by applying production rules starting from program is the language defined by that grammar

    simple_numeric : tINTEGER
                   | tFLOAT
                   | tRATIONAL
                   | tIMAGINARY
                   ;

Slide 38

Slide 38 text

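What is grammar?
• Nonterminal symbol: a symbol that appears on the left-hand side of a rule
  • e.g. program, numeric, simple_numeric
  • It may also appear on a right-hand side
• Terminal symbol: a symbol that appears only on the right-hand side of rules
  • e.g. tINTEGER, tUMINUS_NUM, tLOWEST

    numeric : simple_numeric
            | tUMINUS_NUM simple_numeric %prec tLOWEST
            ;

(The slide labels the symbols in this rule as terminal or nonterminal.)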

Slide 39

Slide 39 text

How parser works?

    L : L ';' E        /* Rule 1 */
      | E              /* Rule 2 */
      ;
    E : E ',' P        /* Rule 3 */
      | P              /* Rule 4 */
      ;
    P : 'a'            /* Rule 5 */
      | '(' M ')'      /* Rule 6 */
      ;
    M : /* nothing */  /* Rule 7 */
      | L              /* Rule 8 */
      ;

Input: a;()

Slide 40

Slide 40 text

How parser works?
• Shift: push the next token onto the stack
• Reduce: use a rule to replace the rightmost n symbols on the stack

In each line, the stack is to the left of "|" and the remaining input tokens are to the right.

                        | "a" ";" "(" ")" $end   # Shift
    "a"                 | ";" "(" ")" $end       # Reduce by rule 5
    P                   | ";" "(" ")" $end       # Reduce by rule 4
    E                   | ";" "(" ")" $end       # Reduce by rule 2
    L                   | ";" "(" ")" $end       # Shift
    L ";"               | "(" ")" $end           # Shift
    L ";" "("           | ")" $end               # Reduce by rule 7
    L ";" "(" M         | ")" $end               # Shift
    L ";" "(" M ")"     | $end                   # Reduce by rule 6
    L ";" P             | $end                   # Reduce by rule 4
    L ";" E             | $end                   # Reduce by rule 1
    L                   | $end                   # Shift
    L $end              |                        # accept

The highlighted step is the reduce by Rule 6, P: '(' M ')'.
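The trace above can be replayed mechanically. The following sketch is not a parser: the action list is written out by hand from the slide, whereas a real LALR parser reads the actions from its table. The names `RULES`, `ACTIONS` and `replay` are made up for illustration, and the final shift of `$end` is omitted.

```ruby
# Rules of the toy grammar: rule number => [left-hand side, right-hand side].
# Nonterminals are Symbols, terminals are Strings.
RULES = {
  1 => [:L, [:L, ";", :E]],
  2 => [:L, [:E]],
  3 => [:E, [:E, ",", :P]],
  4 => [:E, [:P]],
  5 => [:P, ["a"]],
  6 => [:P, ["(", :M, ")"]],
  7 => [:M, []],
  8 => [:M, [:L]],
}.freeze

# Actions copied by hand from the trace, in order.
ACTIONS = [[:shift], [:reduce, 5], [:reduce, 4], [:reduce, 2], [:shift], [:shift],
           [:reduce, 7], [:shift], [:reduce, 6], [:reduce, 4], [:reduce, 1]].freeze

# Applies each action and returns a snapshot of the stack after every step.
def replay(input, actions)
  stack = []
  rest = input.dup
  actions.map do |kind, rule|
    if kind == :shift
      stack.push(rest.shift)           # move the next input token onto the stack
    else
      lhs, rhs = RULES.fetch(rule)
      popped = stack.pop(rhs.size)     # take the rightmost n symbols
      raise "rule #{rule} does not match #{popped.inspect}" unless popped == rhs
      stack.push(lhs)                  # replace them with the left-hand side
    end
    stack.dup
  end
end

replay(["a", ";", "(", ")"], ACTIONS).each { |stack| puts stack.join(" ") }
```

The check inside the reduce branch makes the sketch fail loudly if an action in the list does not fit the stack, which is a cheap way to confirm the hand-copied trace is consistent with the grammar.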

Slide 41

Slide 41 text

How parser works?
• When should the parser shift, and when should it reduce?
• It decides based on the parsing table

Slide 42

Slide 42 text

LALR table
• sN: shift, and push N onto the stack
• rN: reduce by rule N
• acc: accept
• blank: syntax error
• GOTO: for nonterminal symbols; push N onto the stack

https://www.cs.uic.edu/~spopuri/cparser.html#dragonbook-tables

Slide 43

Slide 43 text

How to create the table
• Consider a rule with a '.' added to its right-hand side (1 to 4)
• The '.' shows how far the rule has been read
• In 2, the parser expects ';' to come next, so it shifts if the token is ';'
• In 4, it reduces to L
• These are called LR(0) items

    0 L: L ';' E
    1 L: . L ';' E
    2 L: L . ';' E
    3 L: L ';' . E
    4 L: L ';' E .
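Enumerating the LR(0) items of a single rule is just a matter of moving the dot. A minimal sketch, where `lr0_items` is a made-up helper name:

```ruby
# Returns every LR(0) item of one rule as a string: the dot moves from the
# far left (nothing read yet) to the far right (whole right-hand side read).
def lr0_items(lhs, rhs)
  (0..rhs.size).map do |dot|
    "#{lhs}: " + (rhs[0...dot] + ["."] + rhs[dot..-1]).join(" ")
  end
end

puts lr0_items("L", ["L", "';'", "E"])
```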

Slide 44

Slide 44 text

How to create the table
• LALR(1) uses LR(1) items
• An LR(1) item is an LR(0) item plus one token of lookahead

    L: E . / [$end, ';', ')']

Slide 45

Slide 45 text

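How to create the table
• In case 1, what is expected next is E
• E may be produced by a reduce with another rule (3, 4)
• Likewise, P may be produced by another rule (5, 6)
• 'a' and '(' are terminal symbols, so they are not produced by other rules
• The items can be divided into several groups

    1 L: L ';' . E
    3 E: . E ',' P
    4  | . P
    5 P: . 'a'
    6  | . '(' M ')'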
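Grouping items this way is the closure operation used when building LR states. The sketch below computes it for the toy grammar; the representation (an item as `[rule_index, dot_position]`, nonterminals as Symbols, terminals as Strings) and the names `GRAMMAR` and `closure` are assumptions made for illustration.

```ruby
# Toy grammar as [left-hand side, right-hand side] pairs, indexed from 0.
GRAMMAR = [
  [:L, [:L, ";", :E]],
  [:L, [:E]],
  [:E, [:E, ",", :P]],
  [:E, [:P]],
  [:P, ["a"]],
  [:P, ["(", :M, ")"]],
  [:M, []],
  [:M, [:L]],
].freeze

# Closure of a set of LR(0) items: whenever the dot stands in front of a
# nonterminal, add every rule of that nonterminal with the dot at position 0.
def closure(items)
  result = items.dup
  queue = items.dup
  until queue.empty?
    rule, dot = queue.shift
    sym = GRAMMAR[rule][1][dot]
    next unless sym.is_a?(Symbol)  # terminal, or dot at the end of the rule
    GRAMMAR.each_with_index do |(lhs, _rhs), index|
      item = [index, 0]
      next unless lhs == sym && !result.include?(item)
      result << item
      queue << item
    end
  end
  result
end

# Start from "L: L ';' . E" (rule 0 with the dot at position 2).
closure([[0, 2]]).each do |rule, dot|
  lhs, rhs = GRAMMAR[rule]
  puts "#{lhs}: " + (rhs[0...dot] + ["."] + rhs[dot..-1]).join(" ")
end
```

Starting from the single item with the dot before E, the result contains the two E rules and the two P rules, matching the group listed on the slide.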

Slide 46

Slide 46 text

How to create the table

Slide 47

Slide 47 text

How to create the table
• Initial state
• Accept

Slide 48

Slide 48 text

How to create the table
• Shift P

Slide 49

Slide 49 text

How to create the table
• reduce

Slide 50

Slide 50 text

How to create the table
• shift or reduce

Slide 51

Slide 51 text

How to get expected tokens
• If the parser is in state 3, then ';' ',' ')' '$' are the expected tokens

https://www.cs.uic.edu/~spopuri/cparser.html#dragonbook-tables

Slide 52

Slide 52 text

Bison’s compressed table

Slide 53

Slide 53 text

Table is sparse
• Only 37/78 = 47% of the action table is filled
• Only 10/52 = 19% of the GOTO table is filled
• Ruby 2.7.3 pre3 has 1234 states and 411 symbols

https://www.cs.uic.edu/~spopuri/cparser.html#dragonbook-tables

Slide 54

Slide 54 text

Compress table (1)
• Compression by Default Reductions and Default GOTOs
• The action table is compressed horizontally, the GOTO table vertically
  • state 5 -> r3
  • GOTO of P -> 9

https://www.cs.uic.edu/~spopuri/cparser.html#dragonbook-tables

Slide 55

Slide 55 text

Compress table (2)
• Even after introducing Default Reductions, the table is still sparse
• Compression by double displacement

Partially modified from https://www.cs.uic.edu/~spopuri/cparser.html#table-compression

Slide 56

Slide 56 text

Compress table (2)
• The table is represented by three arrays: yytable, which holds the actual values; yycheck, which is the guard table; and yypact, which manages the offsets

     0: [  ,  ,  ,  ,   ,  1,  2,   ]
     2: [  ,  ,  ,  ,   ,  1,  2,   ]
     3: [ 8,  ,  , 9,   ,   ,   ,   ]
     4: [  ,  ,  ,  , 10,   ,   ,   ]
     6: [  ,  ,  , 9,   ,   ,   ,   ]
     7: [  ,  ,  ,  ,   ,   ,   , 11]
     9: [  ,  ,  ,  ,   ,  1,  2,   ]
    10: [  ,  ,  ,  ,   ,  1,  2,   ]
    12: [  ,  ,  ,  , 10,   ,   ,   ]

    yycheck [0, 5, 6, 3, 7, 4, 3, 2, 9, -1, -1, -1, 10]
    yytable [8, 1, 2, 9, 11, 10, 9, 6, 12, 0, 0, 0, 13]
    yypact  [-4, -5, -4, 0, 1, -5, 3, -3, -5, -4, -4, -5, 1, -5]

Slide 57

Slide 57 text

Compress table (2)
• Consider the case of state 0: yypact[0] = -4

     0: [  ,  ,  ,  ,   ,  1,  2,   ]
     2: [  ,  ,  ,  ,   ,  1,  2,   ]
     3: [ 8,  ,  , 9,   ,   ,   ,   ]
     4: [  ,  ,  ,  , 10,   ,   ,   ]
     6: [  ,  ,  , 9,   ,   ,   ,   ]
     7: [  ,  ,  ,  ,   ,   ,   , 11]
     9: [  ,  ,  ,  ,   ,  1,  2,   ]
    10: [  ,  ,  ,  ,   ,  1,  2,   ]
    12: [  ,  ,  ,  , 10,   ,   ,   ]

    yycheck [0, 5, 6, 3, 7, 4, 3, 2, 9, -1, -1, -1, 10]
    yytable [8, 1, 2, 9, 11, 10, 9, 6, 12, 0, 0, 0, 13]
    yypact  [-4, -5, -4, 0, 1, -5, 3, -3, -5, -4, -4, -5, 1, -5]

Slide 58

Slide 58 text

Compress table (2)
• For index = 5, the offset is yypact[0] + 5 = 1, and yycheck[1] = 5 matches the index

     0: [  ,  ,  ,  ,   ,  1,  2,   ]
     2: [  ,  ,  ,  ,   ,  1,  2,   ]
     3: [ 8,  ,  , 9,   ,   ,   ,   ]
     4: [  ,  ,  ,  , 10,   ,   ,   ]
     6: [  ,  ,  , 9,   ,   ,   ,   ]
     7: [  ,  ,  ,  ,   ,   ,   , 11]
     9: [  ,  ,  ,  ,   ,  1,  2,   ]
    10: [  ,  ,  ,  ,   ,  1,  2,   ]
    12: [  ,  ,  ,  , 10,   ,   ,   ]

    yycheck [0, 5, 6, 3, 7, 4, 3, 2, 9, -1, -1, -1, 10]
    yytable [8, 1, 2, 9, 11, 10, 9, 6, 12, 0, 0, 0, 13]
    yypact  [-4, -5, -4, 0, 1, -5, 3, -3, -5, -4, -4, -5, 1, -5]

Slide 59

Slide 59 text

Compress table (2)
• The value for index = 5 is 1 (= yytable[1])

     0: [  ,  ,  ,  ,   ,  1,  2,   ]
     2: [  ,  ,  ,  ,   ,  1,  2,   ]
     3: [ 8,  ,  , 9,   ,   ,   ,   ]
     4: [  ,  ,  ,  , 10,   ,   ,   ]
     6: [  ,  ,  , 9,   ,   ,   ,   ]
     7: [  ,  ,  ,  ,   ,   ,   , 11]
     9: [  ,  ,  ,  ,   ,  1,  2,   ]
    10: [  ,  ,  ,  ,   ,  1,  2,   ]
    12: [  ,  ,  ,  , 10,   ,   ,   ]

    yycheck [0, 5, 6, 3, 7, 4, 3, 2, 9, -1, -1, -1, 10]
    yytable [8, 1, 2, 9, 11, 10, 9, 6, 12, 0, 0, 0, 13]
    yypact  [-4, -5, -4, 0, 1, -5, 3, -3, -5, -4, -4, -5, 1, -5]
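The lookup walked through above can be written down as a short sketch that rebuilds the uncompressed rows from the three arrays on the slide. Two things are assumptions rather than facts from the talk: that -5 in yypact marks a state with no explicit entries (the states missing from the listing), and that each row has 8 token columns. The method names `lookup` and `row` are made up.

```ruby
YYCHECK = [0, 5, 6, 3, 7, 4, 3, 2, 9, -1, -1, -1, 10].freeze
YYTABLE = [8, 1, 2, 9, 11, 10, 9, 6, 12, 0, 0, 0, 13].freeze
YYPACT  = [-4, -5, -4, 0, 1, -5, 3, -3, -5, -4, -4, -5, 1, -5].freeze

YYPACT_NINF = -5  # assumption: "this state has only a default action"
NTOKENS = 8       # assumption: number of token columns in the slide's rows

# Double displacement: the row offset plus the token number gives an index.
# The entry is valid only if the guard table holds the same token number.
def lookup(state, token)
  base = YYPACT[state]
  return nil if base == YYPACT_NINF
  index = base + token
  return nil if index < 0 || index >= YYTABLE.size
  return nil unless YYCHECK[index] == token
  YYTABLE[index]
end

# Rebuild one uncompressed row; nil stands for an empty cell.
def row(state)
  (0...NTOKENS).map { |token| lookup(state, token) }
end

[0, 3, 4, 7].each { |state| puts "#{state}: #{row(state).inspect}" }
```

The tokens with a non-nil entry in a row are the ones the state handles explicitly. As the later slides explain, that set alone is not the full list of expected tokens once default reductions are involved.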

Slide 60

Slide 60 text

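Characteristics of the compressed table
• double displacement
  • The original table can be restored, so this is not a problem
• default reductions
  • Error detection is delayed, so the expected tokens change
  • Computing the expected tokens depends on the state stack at that moment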

Slide 61

Slide 61 text

Delayed error detection
• In the following example, is "when" really the only expected token?

    $ ruby -e 'case a;'
    -e:1: syntax error, unexpected end-of-input, expecting `when'

Slide 62

Slide 62 text

Delayed error detection
• "in" can also be written there

    $ ruby -wce 'case a; in b; end'
    -e:1: warning: Pattern matching is experimental, and the behavior may change in future versions of Ruby!
    Syntax OK

Slide 63

Slide 63 text

Delayed error detection

    case a;
           ^

    State 375
      737 opt_terms: terms . ["`when'", "`in'"]
      ...
      $default reduce using rule 737 (opt_terms)

    State 587
      ...
      $default reduce using rule 330 (@18)

    State 717
      331 primary: k_case expr_value opt_terms @18 . case_body k_end
      367 k_when: . "`when'"
      464 case_body: . k_when case_args then compstmt cases

      "`when'" shift, and go to state 719
      k_when go to state 720
      case_body go to state 841

The default reductions move the parser all the way to state 717.

Slide 64

Slide 64 text

Dependence on the stack

Slide 65

Slide 65 text

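Dependence on the stack
• ';' ',' '$end' should really be errors
• To check whether a token really is an error, the reduce has to be performed for real
• The result of this computation depends on the stack at that moment
• The tests become more complex

Partially modified from https://www.cs.uic.edu/~spopuri/cparser.html#modified-tables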

Slide 66

Slide 66 text

Workaround
• Setting lr.default-reduction to accepting reduces the places where default reductions are used
• https://www.gnu.org/software/bison/manual/html_node/Default-Reductions.html

Slide 67

Slide 67 text

Pitfall of the workaround
• ruby can no longer be built
• The token should have been a tLABEL

    $ ./miniruby -v
    :33: syntax error, unexpected ':'
    def self.start full_mark: true, immediate_mark: true, ...

Slide 68

Slide 68 text

Pitfall of the workaround
• EXPR_LABEL|EXPR_ENDFN is required for the lexer to emit tLABEL

    #define IS_LABEL_POSSIBLE() (\
            (IS_lex_state(EXPR_LABEL|EXPR_ENDFN) && !cmd_state) || \
            IS_ARG())

    static enum yytokentype
    parse_ident(struct parser_params *p, int c, int cmd_state)
    {
        ...
        if (IS_LABEL_POSSIBLE()) {
            if (IS_LABEL_SUFFIX(0)) {
                SET_LEX_STATE(EXPR_ARG|EXPR_LABELED);
                nextc(p);
                set_yylval_name(TOK_INTERN());
                return tLABEL;
            }
        }

Slide 69

Slide 69 text

Pitfall of the workaround
• lex_state is set in the action that follows fname

    | k_def singleton dot_or_colon {SET_LEX_STATE(EXPR_FNAME);} fname
        {
            $4 = p->in_def;
            p->in_def = 1;
            SET_LEX_STATE(EXPR_ENDFN|EXPR_LABEL); /* force for args */
            local_push(p, 0);
            $$ = p->cur_arg;
            p->cur_arg = 0;
        }
      f_arglist

Slide 70

Slide 70 text

Before
• The action (@26) is executed unconditionally

    State 858

      346 @26: . %empty
      347 primary: k_def singleton dot_or_colon @25 fname . @26 f_arglist bodystmt k_end

      $default reduce using rule 346 (@26)

      @26 go to state 961

Slide 71

Slide 71 text

After
• The parser now requires tLABEL before the action is executed

    State 858

      346 @26: . %empty ["local variable or method", "global variable", "instance variable", "constant", "class variable", tLABEL, "**", "(", "*", "**arg", "&", '&', '*', '(', ';', '\n']
      347 primary: k_def singleton dot_or_colon @25 fname . @26 f_arglist bodystmt k_end

      "local variable or method" reduce using rule 346 (@26)
      ...
      tLABEL reduce using rule 346 (@26)
      ...

      @26 go to state 961

Slide 72

Slide 72 text

Current implementation

Slide 73

Slide 73 text

     static enum yytokentype
    -yylex(YYSTYPE *lval, YYLTYPE *yylloc, struct parser_params *p)
    +yylex(YYSTYPE *lval, YYLTYPE *yylloc, struct parser_params *p, int yystate, short *yyss, short *yyssp)
     {
         enum yytokentype t;
    +    VALUE yysstack;

         p->lval = lval;
         lval->val = Qundef;
    +
    +    yysstack = yysstack_new(yyss, yyssp);
    +
    +    if (p->debug) {
    +        VALUE tokens = expected_tokens(yystate, yysstack);
    +        rb_parser_printf(p, "\nexpected_tokens (state = %d): %"PRIsVALUE"\n", yystate, tokens);
    +    }
    +

• The state and the stack are added as arguments
• The stack is copied

https://github.com/ruby/ruby/compare/master...yui-knk:feature/expected_tokens_v2_7_0_preview3_heisei_01?expand=1

Slide 74

Slide 74 text

    /* See also: yysyntax_error and yybackup */
    static VALUE
    expected_tokens(const int yystate, VALUE yysstack)
    {
        VALUE ary = rb_ary_new();

        for (int yytoken = 0; yytoken < YYNTOKENS; ++yytoken) {
            push_expected_token(ary, yystate, yytoken, rb_ary_dup(yysstack));
        }

        return ary;
    }

• push_expected_token is called for every token

Slide 75

Slide 75 text

    static void
    push_expected_token(VALUE ary, const int yystate, const int yytoken, VALUE yysstack)
    {
        int yyn = yypact[yystate];

        /* See: yydefault label */
        if (yypact_value_is_default(yyn)) {
            int new_state;

            if ((new_state = default_reduce(yystate, yytoken, yysstack)) >= 0) {
                /* yysstack is changed */
                push_expected_token(ary, new_state, yytoken, yysstack);
            }
            return;
        }

• In the default reduction case, actually perform the reduce and call itself recursively

Slide 76

Slide 76 text

        yyn += yytoken;
        if (yyn < 0 || YYLAST < yyn || yycheck[yyn] != yytoken) {
            int new_state;

            if ((new_state = default_reduce(yystate, yytoken, yysstack)) >= 0) {
                /* yysstack is changed */
                push_expected_token(ary, new_state, yytoken, yysstack);
            }
            return;
        }

• In the default reduction case, actually perform the reduce and call itself recursively

Slide 77

Slide 77 text

        yyn = yytable[yyn];
        if (yyn <= 0) {
            /* reduction case */
            if (!yytable_value_is_error(yyn)) {
                rb_ary_push(ary, rb_str_new2(yytname[yytoken]));
                return;
            }
        }
        else {
            /* shift case */
            rb_ary_push(ary, rb_str_new2(yytname[yytoken]));
            return;
        }
    }

• In both the reduction case and the shift case, the token is added to the expected tokens

Slide 78

Slide 78 text

    $ ./miniruby --dump=y -e 'case a;'
    ...
    Entering state 375
    Reading a token:
    expected_tokens (state = 375): ["\"`when'\"", "\"`in'\"", "';'"]
    ...
    -e:1: syntax error, unexpected end-of-input, expecting `when'
    ...

Slide 79

Slide 79 text

• The implementation depends heavily on Bison internals such as yypact and yytable

        yyn += yytoken;
        if (yyn < 0 || YYLAST < yyn || yycheck[yyn] != yytoken) {
            int new_state;

            if ((new_state = default_reduce(yystate, yytoken, yysstack)) >= 0) {
                /* yysstack is changed */
                push_expected_token(ary, new_state, yytoken, yysstack);
            }
            return;
        }

Slide 80

Slide 80 text

• There is also code out there like tool/ytab.sed

    #!/bin/sed -f
    # This file is used when generating code for the Ruby parser.
    ...
    s/^yysyntax_error (/&struct parser_params *p, /
    s/ yysyntax_error (/&p, /
    s/\( YYFPRINTF *(\)yyoutput,/\1p,/
    s/\( YYFPRINTF *(\)yyo,/\1p,/
    s/\( YYFPRINTF *(\)stderr,/\1p,/
    s/\( YYDPRINTF *((\)stderr,/\1p,/
    s/^\([ ]*\)\(yyerror[ ]*([ ]*parser,\)/\1parser_\2/
    s!^ *extern char \*getenv();!/* & */!
    s/^\(#.*\)".*\.tab\.c"/\1"parse.c"/
    /^\(#.*\)".*\.y"/s:\\\\:/:g

Slide 81

Slide 81 text

Future work
• A way to test this needs to be worked out
• Code that depends on yyss and yyssp may cause a SEGV, so another approach should be considered

Slide 82

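Slide 82 text

Summary
• Expected tokens can be determined from the parser's information
• Handling Default Reductions well is difficult
• On the other hand, Ruby depends on the behavior of Default Reductions, so we need to find a good way to live with them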

Slide 83

Slide 83 text

Acknowledgements
• @takeshinoda and @hkdnet
  • Reviewing the slide

Slide 84

Slide 84 text

References
• Ruby implementation in general
  • http://i.loveruby.net/ja/rhg/book/
  • "Ruby Under a Microscope" (Japanese edition)
• Parser
  • http://i.loveruby.net/ja/rhg/book/
  • "Compilers: Principles, Techniques, and Tools" (Japanese edition, Information & Computing series)
• Bison
  • https://www.cs.uic.edu/~spopuri/cparser.html

Slide 85

Slide 85 text

Thank you !!