Slide 1

Slide 1 text

Dissecting and Reconstructing Ruby Syntactic Structures
@ydah / Yudai Takada
RubyKaigi 2025 | Matsuyama, Ehime
Ehime Prefectural Convention Hall
17 April 2025

Slide 2

Slide 2 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Software engineer @ x.com/@ydah_ github.com/@ydah https://ydah.net/ Yudai Takada / ydah "Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah

Slide 3

Slide 3 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah "Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Drinks Sponsor

Slide 4

Slide 4 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah "Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Ask the Speakers

Slide 5

Slide 5 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Ruby Committer (2024-12~) Mainly develop parser generator and parser Multiple Node Location, Refacor parse.y A committer of Lrama Parameterizing Rules, Inlining Syntax Diagrams Yudai Takada / ydah "Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah

Slide 6

Slide 6 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah I'm in "Five Chariot Stars" A castle inhabited by the creator of the Ruby, Two monsters, and the Organizers of TRICK Residents of parse.y ❯ git shortlog - ns parse.y 1354 Nobuyoshi Nakada 362 Yukihiro "Matz" Matsumoto 252 yui - knk 174 Yusuke Endoh 75 ydah The Patch Monster The Creator of the Ruby The Parser Monster The Organizer of TRICK Me

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

Unsolicited Ads

Slide 10

Slide 10 text

KansaiRubyKaigi08
Kyoto Pontocho Kaburenjo Theater
2025-06-28 (Sat)
@shimbaco / @pocke
RubyKansai, Kyoto.rb, Kobe.rb, AKASHI.rb, RubyMaizuru, Kyobashi.rb, Ruby Tuesday, Shinosaka.rb, naniwa.rb
Unsolicited Ads

Slide 11

Slide 11 text

CFP IS NOW LIVE!! Submit your talk by May 7th Unsolicited Ads

Slide 12

Slide 12 text

Beginning of the Journey
"Our ignorance can be divided into problems and mysteries. When we face a problem, we may not know its solution, but we have insight, increasing knowledge, and an inkling of what we are looking for. When we face a mystery, however, we can only stare in wonder and bewilderment, not knowing what an explanation would even look like." -- Noam Chomsky

Slide 13

Slide 13 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah What is Grammar Structure?

Slide 14

Slide 14 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Grammar Overview: The rules or system for combining words to create meaningful sentences in a language we use daily Analogy: It functions like a "blueprint" or "tra ff i c rules" for speaking and writing Importance: It serves as the foundation for smooth communication Bene fi t: Grammar enables us to accurately convey and understand each other's intentions

Slide 15

Slide 15 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Programming Languages A formal language for describing instructions to be executed by a computer Elements that represent di ff erent aspects: Syntax Semantics ALGOL (1958) C (1972) Simula (1967) Pascal (1970) BCPL (1966) PL/I (1964) LISP (1958) Scheme (1975) Common Lisp (1984) Clojure (2007) ML (1973) Ruby (1995) Smalltalk (1972) C++ (1985) Eiffel (1986) Java (1995) Python (1991) Objective-C (1984) Groovy (2003) Scala (2004) Perl (1987) Go (2009) Rust (2010) D (2001) PHP (1995) JavaScript (1995) Lua (1993) Ada Haskell (1990) OCaml (1996) F# (2005) Standard ML (1983) C# (2000) Kotlin (2011) Crystal (2014) Elixir (2011) CoffeeScript (2009) Swift (2014) TypeScript (2012) Dart (2011) Julia (2012) CLU

Slide 16

Slide 16 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Syntax A combination of letters, symbols, and words (keywords) that make up a program "Grammar rules" for description format Syntax deals with the "form" and "structure" of the code

Slide 17

Slide 17 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah A rule that de fi nes the following for a syntactically correct program: What does it mean? What happens when it is executed? "Colorless green ideas sleep furiously" Semantics APPLE

Slide 18

Slide 18 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Syntactic Structures

Slide 19

Slide 19 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Syntactic Structures Rules governing how language elements combine to form valid programs De fi nes meaningful arrangements of tokens (keywords, identi fi ers, operators) Parallel to grammar in natural languages Speci fi es correct token order and structural patterns

Slide 20

Slide 20 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Backus-Naur Form A meta-syntactic notation used to describe the syntax of formal languages. To provide a means for de fi ning structure rules of a language clearly and without ambiguity. %union { int i; } %token number % % program : expr ; expr : term '+' expr | term ; term : factor '*' term | factor ; factor : number ;

Slide 21

Slide 21 text

Ruby Grammar Structure "Language is the most massive and inclusive art we know, a mountainous and anonymous work of unconscious generations." --Edward Sapir

Slide 22

Slide 22 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah What is parse.y?

Slide 23

Slide 23 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah parse.y File de fi ning the rules and tokens needed to parse Ruby programs and generate abstract syntax trees Parser generated using GNU Bison for Ruby 3.2 and earlier Parser generated using Lrama parser generator for Ruby 3.3 and later

Slide 24

Slide 24 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Blueprint of Parser Lrama (Parser Generator) parse.y Parser

Slide 25

Slide 25 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Structure of parse.y %{ / / C code %} %union { NODE * node; . . . } %token keywo r d_class "'class'" . . . %type singleton st r ings . . . % % p r og r am : top_compstmt . . . % % / / C code

Slide 26

Slide 26 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Generative grammar part %{ / / C code %} %union { NODE * node; . . . } %token keywo r d_class "'class'" . . . %type singleton st r ings . . . % % p r og r am : top_compstmt . . . % % / / C code

Slide 27

Slide 27 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah %token NUMBER % % expr : NUMBER { $$ = $1; } | expr '+' expr { $$ = $1 + $3; } ; Backus-Naur Form

Slide 28

Slide 28 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah %token NUMBER % % expr : NUMBER { $$ = $1; } | expr '+' expr { $$ = $1 + $3; } ; LHS (Left Hand Side)

Slide 29

Slide 29 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah %token NUMBER % % expr : NUMBER { $$ = $1; } | expr '+' expr { $$ = $1 + $3; } ; RHS (Right Hand Side)

Slide 30

Slide 30 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah %token NUMBER % % expr : NUMBER { $$ = $1; } | expr '+' expr { $$ = $1 + $3; } ; Terminal Symbols

Slide 31

Slide 31 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah %token NUMBER % % expr : NUMBER { $$ = $1; } | expr '+' expr { $$ = $1 + $3; } ; Nonterminal Symbols

Slide 32

Slide 32 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah A synonym for parse.y "Demon Castle parse.y" (2017) "Monstrous" lex_state (2017) parse.y is "ຐڥ" (2019) The current parse.y is a hell (2021)

Slide 33

Slide 33 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Why is it so complicated? ruby-lang.org - About Ruby https://www.ruby-lang.org/en/about/ "Ruby is simple in appearance, but is very complex inside, just like our human body 1 ." In the design of programming languages, there is often a trade-o ff between " fl exibility" and "complexity" of grammar

Slide 34

Slide 34 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah State of parse.y 16,024 Line (Commit b68fe5) Terminal symbols: 162 Non-terminal symbols(Generative rules): 303 PHP (Terms: 184 / Non-Terms: 186) Perl (Terms: 132 / Non-Terms: 113)

Slide 35

Slide 35 text

Dissecting Ruby Syntactic Structures "A special kind of beauty exists which is born in language, of language, and for language." -- Gaston Bachelard

Slide 36

Slide 36 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Basic Structural Element Expression (expr) A piece of code that produces a single "Value" by being evaluated operations (1 + 2 * 3), command calls (method(arg)) ...etc Statement (stmt) A complete unit of instruction for performing an Action declaration (def foo; end), assignment (foo = bar), control fl ow (if foo) ...etc

Slide 37

Slide 37 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Relationship between expr and stmt Statements often contain expressions Assignment: `x = y + 5;` where `y + 5` is the expression and the command is assigned the evaluated value if: `if (score >= 60); ... end`, `score >= 60` is an expression that evaluates to a Boolean, and the result is a statement that determines whether subsequent blocks are executed.

Slide 38

Slide 38 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Relationship between expr and stmt Expression Statement counter++: `counter++` evaluates a value with the side e ff ect of changing the value of counter. The expression is treated as a statement by itself Function call: `process_data();` is an expression that calls the function process_data, but serves itself as a statement if it is intended only to perform processing without a return value

Slide 39

Slide 39 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Generation rule hierarchy Suggested that grammar has a hierarchical structure Hierarchical structures exist for parsers to resolve ambiguities

Slide 40

Slide 40 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah PHP - stmt and expr https://github.com/php/php-src/blob/master/Zend/zend_language_parser.y statement: '{' inner_statement_list '}' { $$ = $2; } | if_stmt { $$ = $1; } | alt_if_stmt { $$ = $1; } | T_WHILE '(' expr ')' while_statement { $$ = zend_ast_create(ZEND_AST_WHILE, $3, $5); } | T_DO statement T_WHILE '(' expr ')' ';' { $$ = zend_ast_create(ZEND_AST_DO_WHILE, $2, $5); } | T_FOR '(' for_exprs ';' for_cond_exprs ';' for_exprs ')' for_statement { $$ = zend_ast_create(ZEND_AST_FOR, $3, $5, $7, $9); } | T_SWITCH '(' expr ')' switch_case_list { $$ = zend_ast_create(ZEND_AST_SWITCH, $3, $5); } | T_BREAK optional_expr ';' { $$ = zend_ast_create(ZEND_AST_BREAK, $2); } | T_CONTINUE optional_expr ';' { $$ = zend_ast_create(ZEND_AST_CONTINUE, $2); } | T_RETURN optional_expr ';' { $$ = zend_ast_create(ZEND_AST_RETURN, $2); } | T_GLOBAL global_var_list ';' { $$ = $2; } | T_STATIC static_var_list ';' { $$ = $2; } | T_ECHO echo_expr_list ';' { $$ = $2; } | T_INLINE_HTML { $$ = zend_ast_create(ZEND_AST_ECHO, $1); } | expr ';' { $$ = $1; } | T_UNSET '(' unset_variables possible_comma ')' ';' { $$ = $3; } | T_FOREACH '(' expr T_AS foreach_variable ')' foreach_statement { $$ = zend_ast_create(ZEND_AST_FOREACH, $3, $5, NULL, $7); } | T_FOREACH '(' expr T_AS foreach_variable T_DOUBLE_ARROW foreach_variable ')' foreach_statement { $$ = zend_ast_create(ZEND_AST_FOREACH, $3, $7, $5, $9); } | T_DECLARE '(' const_list ')' { if (!zend_handle_encoding_declaration($3)) { YYERROR; } } declare_statement { $$ = zend_ast_create(ZEND_AST_DECLARE, $3, $6); } | ';' / * empty statement * / { $$ = NULL; } | T_TRY '{' inner_statement_list '}' catch_list f i nally_statement { $$ = zend_ast_create(ZEND_AST_TRY, $3, $5, $6); } | T_GOTO T_STRING ';' { $$ = zend_ast_create(ZEND_AST_GOTO, $2); } | T_STRING ':' { $$ = zend_ast_create(ZEND_AST_LABEL, $1); } | T_VOID_CAST expr ';' { $$ = zend_ast_create(ZEND_AST_CAST_VOID, $2); } ; expr: 
variable { $$ = $1; } | T_LIST '(' array_pair_list ')' '=' expr { $3 - > attr = ZEND_ARRAY_SYNTAX_LIST; $$ = zend_ast_create(ZEND_AST_ASSIGN, $3, $6); } | '[' array_pair_list ']' '=' expr { $2 - > attr = ZEND_ARRAY_SYNTAX_SHORT; $$ = zend_ast_create(ZEND_AST_ASSIGN, $2, $5); } | variable '=' expr { $$ = zend_ast_create(ZEND_AST_ASSIGN, $1, $3); } | variable '=' ampersand variable { $$ = zend_ast_create(ZEND_AST_ASSIGN_REF, $1, $4); } | T_CLONE expr { $$ = zend_ast_create(ZEND_AST_CLONE, $2); } | variable T_PLUS_EQUAL expr { $$ = zend_ast_create_assign_op(ZEND_ADD, $1, $3); } | variable T_MINUS_EQUAL expr { $$ = zend_ast_create_assign_op(ZEND_SUB, $1, $3); } | variable T_MUL_EQUAL expr { $$ = zend_ast_create_assign_op(ZEND_MUL, $1, $3); } | variable T_POW_EQUAL expr { $$ = zend_ast_create_assign_op(ZEND_POW, $1, $3); } | variable T_DIV_EQUAL expr { $$ = zend_ast_create_assign_op(ZEND_DIV, $1, $3); } | variable T_CONCAT_EQUAL expr { $$ = zend_ast_create_assign_op(ZEND_CONCAT, $1, $3); } | variable T_MOD_EQUAL expr { $$ = zend_ast_create_assign_op(ZEND_MOD, $1, $3); } | variable T_AND_EQUAL expr { $$ = zend_ast_create_assign_op(ZEND_BW_AND, $1, $3); } | variable T_OR_EQUAL expr { $$ = zend_ast_create_assign_op(ZEND_BW_OR, $1, $3); } | variable T_XOR_EQUAL expr { $$ = zend_ast_create_assign_op(ZEND_BW_XOR, $1, $3); } | variable T_SL_EQUAL expr { $$ = zend_ast_create_assign_op(ZEND_SL, $1, $3); } | variable T_SR_EQUAL expr { $$ = zend_ast_create_assign_op(ZEND_SR, $1, $3); } | variable T_COALESCE_EQUAL expr { $$ = zend_ast_create(ZEND_AST_ASSIGN_COALESCE, $1, $3); } | variable T_INC { $$ = zend_ast_create(ZEND_AST_POST_INC, $1); } | T_INC variable { $$ = zend_ast_create(ZEND_AST_PRE_INC, $2); } | variable T_DEC { $$ = zend_ast_create(ZEND_AST_POST_DEC, $1); } | expr '>' expr { $$ = zend_ast_create(ZEND_AST_GREATER, $1, $3); } | expr T_IS_GREATER_OR_EQUAL expr { $$ = zend_ast_create(ZEND_AST_GREATER_EQUAL, $1, $3); } | expr T_SPACESHIP expr { $$ = 
zend_ast_create_binary_op(ZEND_SPACESHIP, $1, $3); } | expr T_INSTANCEOF class_name_reference { $$ = zend_ast_create(ZEND_AST_INSTANCEOF, $1, $3); } | '(' expr ')' { $$ = $2; if ($$ - > kind = = ZEND_AST_CONDITIONAL) $$ - > attr = ZEND_PARENTHESIZED_CONDITIONAL; } | new_dereferenceable { $$ = $1; } | new_non_dereferenceable { $$ = $1; } | expr '?' expr ':' expr { $$ = zend_ast_create(ZEND_AST_CONDITIONAL, $1, $3, $5); } | expr '?' ':' expr { $$ = zend_ast_create(ZEND_AST_CONDITIONAL, $1, NULL, $4); } | expr T_COALESCE expr { $$ = zend_ast_create(ZEND_AST_COALESCE, $1, $3); } | internal_functions_in_yacc { $$ = $1; } | T_INT_CAST expr { $$ = zend_ast_create_cast(IS_LONG, $2); } | T_DOUBLE_CAST expr { $$ = zend_ast_create_cast(IS_DOUBLE, $2); } | T_STRING_CAST expr { $$ = zend_ast_create_cast(IS_STRING, $2); } | T_ARRAY_CAST expr { $$ = zend_ast_create_cast(IS_ARRAY, $2); } | T_OBJECT_CAST expr { $$ = zend_ast_create_cast(IS_OBJECT, $2); } | T_BOOL_CAST expr { $$ = zend_ast_create_cast(_IS_BOOL, $2); } | T_UNSET_CAST expr { $$ = zend_ast_create_cast(IS_NULL, $2); } | T_EXIT ctor_arguments { zend_ast * name = zend_ast_create_zval_from_str(ZSTR_KNOWN(ZEND_STR_EXIT)); name - > attr = ZEND_NAME_FQ; $$ = zend_ast_create(ZEND_AST_CALL, name, $2); } | '@' expr { $$ = zend_ast_create(ZEND_AST_SILENCE, $2); } | scalar { $$ = $1; } | '`' backticks_expr '`' { $$ = zend_ast_create(ZEND_AST_SHELL_EXEC, $2); } | T_PRINT expr { $$ = zend_ast_create(ZEND_AST_PRINT, $2); } | T_YIELD { $$ = zend_ast_create(ZEND_AST_YIELD, NULL, NULL); CG(extra_fn_flags) |= ZEND_ACC_GENERATOR; } | T_YIELD expr { $$ = zend_ast_create(ZEND_AST_YIELD, $2, NULL); CG(extra_fn_flags) |= ZEND_ACC_GENERATOR; } | T_YIELD expr T_DOUBLE_ARROW expr { $$ = zend_ast_create(ZEND_AST_YIELD, $4, $2); CG(extra_fn_flags) |= ZEND_ACC_GENERATOR; } | T_YIELD_FROM expr { $$ = zend_ast_create(ZEND_AST_YIELD_FROM, $2); CG(extra_fn_flags) |= ZEND_ACC_GENERATOR; } | T_THROW expr { $$ = zend_ast_create(ZEND_AST_THROW, $2); } | 
inline_function { $$ = $1; } | attributes inline_function { $$ = zend_ast_with_attributes($2, $1); } | T_STATIC inline_function { $$ = $2; ((zend_ast_decl *) $$) - > flags |= ZEND_ACC_STATIC; } | attributes T_STATIC inline_function { $$ = zend_ast_with_attributes($3, $1); ((zend_ast_decl *) $$) - > flags |= ZEND_ACC_STATIC; } | match { $$ = $1; } ;

Slide 41

Slide 41 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Perl - stmt and expr https://github.com/Perl/perl5/blob/blead/perly.y barestmt: PLUGSTMT { $$ = $PLUGSTMT; } | KW_FORMAT startformsub formname formblock { CV * fmtcv = PL_compcv; newFORM($startformsub, $formname, $formblock); $$ = NULL; if (CvOUTSIDE(fmtcv) & & !CvEVAL(CvOUTSIDE(fmtcv))) { pad_add_weakref(fmtcv); } parser - > parsed_sub = 1; } | KW_SUB_named subname startsub / * sub declaration or def i nition not within scope of 'use feature "signatures"' * / { init_named_cv(PL_compcv, $subname); parser - > in_my = 0; parser - > in_my_stash = NULL; } proto subattrlist optsubbody { SvREFCNT_inc_simple_void(PL_compcv); $subname - > op_type = = OP_CONST ? newATTRSUB($startsub, $subname, $proto, $subattrlist, $optsubbody) : newMYSUB($startsub, $subname, $proto, $subattrlist, $optsubbody) ; $$ = NULL; intro_my(); parser - > parsed_sub = 1; } | sigsub_or_method_named subname startsub / * sub declaration or def i nition under 'use feature * "signatures"'. (Note that a signature isn't * allowed in a declaration) * / { init_named_cv(PL_compcv, $subname); if($sigsub_or_method_named = = KW_METHOD_named) { croak_kw_unless_class("method"); class_prepare_method_parse(PL_compcv); } parser - > in_my = 0; parser - > in_my_stash = NULL; } subattrlist optsigsubbody { OP * body = $optsigsubbody; SvREFCNT_inc_simple_void(PL_compcv); $subname - > op_type = = OP_CONST ? 
newATTRSUB($startsub, $subname, NULL, $subattrlist, body) : newMYSUB( $startsub, $subname, NULL, $subattrlist, body) ; $$ = NULL; intro_my(); parser - > parsed_sub = 1; } | PHASER startsub { switch($PHASER) { case KEY_ADJUST : croak_kw_unless_class("ADJUST"); class_prepare_method_parse(PL_compcv); break; default: NOT_REACHED; } } optsubbody { OP * body = $optsubbody; SvREFCNT_inc_simple_void(PL_compcv); CV * cv; switch($PHASER) { case KEY_ADJUST : cv = newATTRSUB($startsub, NULL, NULL, NULL, body); class_add_ADJUST(PL_curstash, cv); break; } $$ = NULL; } | KW_PACKAGE BAREWORD[version] BAREWORD[package] PERLY_SEMICOLON / * version and package appear in the reverse order to what may be * expected, because toke.c has already pushed both of them to a stack * by calling force_next() from within force_version(). * When the parser pops them back out again they appear swapped * / { package($package); if ($version) package_version($version); $$ = NULL; } | KW_CLASS BAREWORD[version] BAREWORD[package] subattrlist PERLY_SEMICOLON { package($package); if ($version) package_version($version); $$ = NULL; class_setup_stash(PL_curstash); if ($subattrlist) { class_apply_attributes(PL_curstash, $subattrlist); } } | KW_USE_or_NO startsub { CvSPECIAL_on(PL_compcv); / * It's a BEGIN {} * / } BAREWORD[version] BAREWORD[module] optlistexpr PERLY_SEMICOLON / * version and package appear in reverse order for the same reason as * KW_PACKAGE; see comment above * / { SvREFCNT_inc_simple_void(PL_compcv); utilize($KW_USE_or_NO, $startsub, $version, $module, $optlistexpr); parser - > parsed_sub = 1; $$ = NULL; } | KW_IF PERLY_PAREN_OPEN remember mexpr PERLY_PAREN_CLOSE mblock else { $$ = block_end($remember, newCONDOP(0, $mexpr, op_scope($mblock), $else)); parser - > copline = (line_t)$KW_IF; } | KW_UNLESS PERLY_PAREN_OPEN remember mexpr PERLY_PAREN_CLOSE mblock else { $$ = block_end($remember, newCONDOP(0, $mexpr, $else, op_scope($mblock))); parser - > copline = (line_t)$KW_UNLESS; } | KW_GIVEN 
PERLY_PAREN_OPEN remember mexpr PERLY_PAREN_CLOSE mblock { $$ = block_end($remember, newGIVENOP($mexpr, op_scope($mblock), 0)); parser - > copline = (line_t)$KW_GIVEN; } | KW_WHEN PERLY_PAREN_OPEN remember mexpr PERLY_PAREN_CLOSE mblock { $$ = block_end($remember, newWHENOP($mexpr, op_scope($mblock))); } | KW_DEFAULT block { $$ = newWHENOP(0, op_scope($block)); } | KW_WHILE PERLY_PAREN_OPEN remember texpr PERLY_PAREN_CLOSE mintro mblock cont { $$ = block_end($remember, newWHILEOP(0, 1, NULL, $texpr, $mblock, $cont, $mintro)); parser - > copline = (line_t)$KW_WHILE; } | KW_UNTIL PERLY_PAREN_OPEN remember iexpr PERLY_PAREN_CLOSE mintro mblock cont { $$ = block_end($remember, newWHILEOP(0, 1, NULL, $iexpr, $mblock, $cont, $mintro)); parser - > copline = (line_t)$KW_UNTIL; } | KW_FOR PERLY_PAREN_OPEN remember mnexpr[init_mnexpr] PERLY_SEMICOLON { parser - > expect = XTERM; } texpr PERLY_SEMICOLON { parser - > expect = XTERM; } mintro mnexpr[iterate_mnexpr] PERLY_PAREN_CLOSE mblock { OP * initop = $init_mnexpr; OP * forop = newWHILEOP(0, 1, NULL, scalar($texpr), $mblock, $iterate_mnexpr, $mintro); if (initop) { forop = op_prepend_elem(OP_LINESEQ, initop, op_append_elem(OP_LINESEQ, newOP(OP_UNSTACK, OPf_SPECIAL), forop)); } PL_hints |= HINT_BLOCK_SCOPE; $$ = block_end($remember, forop); parser - > copline = (line_t)$KW_FOR; } barestmt: PLUGSTMT | KW_PACKAGE BAREWORD[version] BAREWORD[package] PERLY_SEMICOLON / * version and package appear in the reverse order to what may be * expected, because toke.c has already pushed both of them to a stack * by calling force_next() from within force_version(). 
* When the parser pops them back out again they appear swapped * / { package($package); if ($version) package_version($version); $$ = NULL; } | KW_CLASS BAREWORD[version] BAREWORD[package] subattrlist PERLY_SEMICOLON { package($package); if ($version) package_version($version); $$ = NULL; class_setup_stash(PL_curstash); if ($subattrlist) { class_apply_attributes(PL_curstash, $subattrlist); } } | KW_USE_or_NO startsub { CvSPECIAL_on(PL_compcv); / * It's a BEGIN {} * / } BAREWORD[version] BAREWORD[module] optlistexpr PERLY_SEMICOLON / * version and package appear in reverse order for the same reason as * KW_PACKAGE; see comment above * / { SvREFCNT_inc_simple_void(PL_compcv); utilize($KW_USE_or_NO, $startsub, $version, $module, $optlistexpr); parser - > parsed_sub = 1; $$ = NULL; } | KW_IF PERLY_PAREN_OPEN remember mexpr PERLY_PAREN_CLOSE mblock else { $$ = block_end($remember, newCONDOP(0, $mexpr, op_scope($mblock), $else)); parser - > copline = (line_t)$KW_IF; } | KW_UNLESS PERLY_PAREN_OPEN remember mexpr PERLY_PAREN_CLOSE mblock else { $$ = block_end($remember, newCONDOP(0, $mexpr, $else, op_scope($mblock))); parser - > copline = (line_t)$KW_UNLESS; } | KW_GIVEN PERLY_PAREN_OPEN remember mexpr PERLY_PAREN_CLOSE mblock { $$ = block_end($remember, newGIVENOP($mexpr, op_scope($mblock), 0)); parser - > copline = (line_t)$KW_GIVEN; } | KW_WHEN PERLY_PAREN_OPEN remember mexpr PERLY_PAREN_CLOSE mblock { $$ = block_end($remember, newWHENOP($mexpr, op_scope($mblock))); } | KW_DEFAULT block { $$ = newWHENOP(0, op_scope($block)); } | KW_WHILE PERLY_PAREN_OPEN remember texpr PERLY_PAREN_CLOSE mintro mblock cont { $$ = block_end($remember, newWHILEOP(0, 1, NULL, $texpr, $mblock, $cont, $mintro)); parser - > copline = (line_t)$KW_WHILE; } | KW_UNTIL PERLY_PAREN_OPEN remember iexpr PERLY_PAREN_CLOSE mintro mblock cont { $$ = block_end($remember, newWHILEOP(0, 1, NULL, $iexpr, $mblock, $cont, $mintro)); parser - > copline = (line_t)$KW_UNTIL; } | KW_FOR PERLY_PAREN_OPEN remember 
mnexpr[init_mnexpr] PERLY_SEMICOLON { parser - > expect = XTERM; } texpr PERLY_SEMICOLON { parser - > expect = XTERM; } mintro mnexpr[iterate_mnexpr] PERLY_PAREN_CLOSE mblock { OP * initop = $init_mnexpr; OP * forop = newWHILEOP(0, 1, NULL, scalar($texpr), $mblock, $iterate_mnexpr, $mintro); if (initop) { forop = op_prepend_elem(OP_LINESEQ, initop, op_append_elem(OP_LINESEQ, newOP(OP_UNSTACK, OPf_SPECIAL), forop)); } PL_hints |= HINT_BLOCK_SCOPE; $$ = block_end($remember, forop); parser - > copline = (line_t)$KW_FOR; } | KW_FOR KW_MY remember my_scalar PERLY_PAREN_OPEN mexpr PERLY_PAREN_CLOSE mblock cont { $$ = block_end($remember, newFOROP(0, $my_scalar, $mexpr, $mblock, $cont)); parser - > copline = (line_t)$KW_FOR; } | KW_FOR KW_MY remember PERLY_PAREN_OPEN my_list_of_scalars PERLY_PAREN_CLOSE PERLY_PAREN_OPEN mexpr PERLY_PAREN_CLOSE mblock cont { if ($my_list_of_scalars - > op_type = = OP_PADSV) / * degenerate case of 1 var: for my ($x) .... F l ag it so it can be special - cased in newFOROP * / $my_list_of_scalars - > op_flags |= OPf_PARENS; $$ = block_end($remember, newFOROP(0, $my_list_of_scalars, $mexpr, $mblock, $cont)); parser - > copline = (line_t)$KW_FOR; } | KW_FOR scalar PERLY_PAREN_OPEN remember mexpr PERLY_PAREN_CLOSE mblock cont { $$ = block_end($remember, newFOROP(0, op_lvalue($scalar, OP_ENTERLOOP), $mexpr, $mblock, $cont)); parser - > copline = (line_t)$KW_FOR; barestmt: PLUGSTMT { $$ = $PLUGSTMT; } | KW_FORMAT startformsub formname formblock { CV * fmtcv = PL_compcv; newFORM($startformsub, $formname, $formblock); $$ = NULL; if (CvOUTSIDE(fmtcv) & & !CvEVAL(CvOUTSIDE(fmtcv))) { pad_add_weakref(fmtcv); } parser - > parsed_sub = 1; } | KW_SUB_named subname startsub / * sub declaration or def i nition not within scope of 'use feature "signatures"' * / { init_named_cv(PL_compcv, $subname); parser - > in_my = 0; parser - > in_my_stash = NULL; } proto subattrlist optsubbody { SvREFCNT_inc_simple_void(PL_compcv); $subname - > op_type = = OP_CONST ? 
newATTRSUB($startsub, $subname, $proto, $subattrlist, $optsubbody) : newMYSUB($startsub, $subname, $proto, $subattrlist, $optsubbody) ; $$ = NULL; intro_my(); parser - > parsed_sub = 1; } expr : expr[lhs] ANDOP expr[rhs] { $$ = newLOGOP(OP_AND, 0, $lhs, $rhs); } | expr[lhs] PLUGIN_LOGICAL_AND_LOW_OP[op] expr[rhs] { $$ = build_inf i x_plugin($lhs, $rhs, $op); } | expr[lhs] OROP[operator] expr[rhs] { $$ = newLOGOP($operator, 0, $lhs, $rhs); } | expr[lhs] PLUGIN_LOGICAL_OR_LOW_OP[op] expr[rhs] { $$ = build_inf i x_plugin($lhs, $rhs, $op); } | listexpr %prec PREC_LOW ;

Slide 42

Slide 42 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Ruby - stmt and expr https://github.com/ruby/ruby/blob/master/parse.y stmt : keyword_alias f i tem {SET_LEX_STATE(EXPR_FNAME|EXPR_FITEM);} f i tem { $$ = NEW_ALIAS($2, $4, &@$, &@1); / * % ripper: alias!($ : 2, $ : 4) % * / } | keyword_alias tGVAR tGVAR { $$ = NEW_VALIAS($2, $3, &@$, &@1); / * % ripper: var_alias!($ : 2, $ : 3) % * / } | keyword_alias tGVAR tBACK_REF { char buf[2]; buf[0] = '$'; buf[1] = (char)RNODE_BACK_REF($3) - > nd_nth; $$ = NEW_VALIAS($2, rb_intern2(buf, 2), &@$, &@1); / * % ripper: var_alias!($ : 2, $ : 3) % * / } | keyword_alias tGVAR tNTH_REF { static const char mesg[] = "can't make alias for the number variables"; / * %%% * / yyerror1(&@3, mesg); / * % % * / $$ = NEW_ERROR(&@$); / * % ripper[error] : alias_error!(ERR_MESG(), $ : 3) % * / } | keyword_undef undef_list { nd_set_f i rst_loc($2, @1.beg_pos); RNODE_UNDEF($2) - > keyword_loc = @1; $$ = $2; / * % ripper: undef!($ : 2) % * / } | stmt modif i er_if expr_value { $$ = new_if(p, $3, remove_begin($1), 0, &@$, &@2, &NULL_LOC, &NULL_LOC); f i xpos($$, $3); / * % ripper: if_mod!($ : 3, $ : 1) % * / } | stmt modif i er_unless expr_value { $$ = new_unless(p, $3, remove_begin($1), 0, &@$, &@2, &NULL_LOC, &NULL_LOC); f i xpos($$, $3); / * % ripper: unless_mod!($ : 3, $ : 1) % * / } | stmt modif i er_while expr_value { clear_block_exit(p, false); if ($1 & & nd_type_p($1, NODE_BEGIN)) { $$ = NEW_WHILE(cond(p, $3, &@3), RNODE_BEGIN($1) - > nd_body, 0, &@$, &@2, &NULL_LOC); } else { $$ = NEW_WHILE(cond(p, $3, &@3), $1, 1, &@$, &@2, &NULL_LOC); } / * % ripper: while_mod!($ : 3, $ : 1) % * / } | stmt modif i er_until expr_value { clear_block_exit(p, false); if ($1 & & nd_type_p($1, NODE_BEGIN)) { $$ = NEW_UNTIL(cond(p, $3, &@3), RNODE_BEGIN($1) - > nd_body, 0, &@$, &@2, &NULL_LOC); } else { $$ = NEW_UNTIL(cond(p, $3, &@3), $1, 1, &@$, &@2, &NULL_LOC); } / * % ripper: until_mod!($ : 3, $ : 1) % * / } | stmt modif i er_rescue 
after_rescue stmt { p - > ctxt.in_rescue = $3.in_rescue; NODE * resq; YYLTYPE loc = code_loc_gen(&@2, &@4); resq = NEW_RESBODY(0, 0, remove_begin($4), 0, &loc); $$ = NEW_RESCUE(remove_begin($1), resq, 0, &@$); / * % ripper: rescue_mod!($ : 1, $ : 4) % * / } | k_END allow_exits '{' compstmt(stmts) '}' { if (p - > ctxt.in_def) { rb_warn0("END in method; use at_exit"); } restore_block_exit(p, $allow_exits); p - > ctxt = $k_END; { NODE * scope = NEW_SCOPE2(0 / * tbl * / , 0 / * args * / , $compstmt / * body * / , &@$); $$ = NEW_POSTEXE(scope, &@$, &@1, &@3, &@5); } / * % ripper: END!($:compstmt) % * / } | command_asgn | mlhs '=' lex_ctxt command_call_value { $$ = node_assign(p, (NODE *)$1, $4, $3, &@$); / * % ripper: massign!($ : 1, $ : 4) % * / } | asgn(mrhs) | mlhs '=' lex_ctxt mrhs_arg modif i er_rescue after_rescue stmt[resbody] { p - > ctxt.in_rescue = $3.in_rescue; YYLTYPE loc = code_loc_gen(&@modif i er_rescue, &@resbody); $resbody = NEW_RESBODY(0, 0, remove_begin($resbody), 0, &loc); loc.beg_pos = @mrhs_arg.beg_pos; $mrhs_arg = NEW_RESCUE($mrhs_arg, $resbody, 0, &loc); $$ = node_assign(p, (NODE *)$mlhs, $mrhs_arg, $lex_ctxt, &@$); / * % ripper: massign!($ : 1, rescue_mod!($ : 4, $ : 7)) % * / } | mlhs '=' lex_ctxt mrhs_arg { $$ = node_assign(p, (NODE *)$1, $4, $3, &@$); / * % ripper: massign!($ : 1, $ : 4) % * / } | expr | error { (void)yynerrs; $$ = NEW_ERROR(&@$); } ; expr : command_call | expr keyword_and expr { $$ = logop(p, idAND, $1, $3, &@2, &@$); / * % ripper: binary!($ : 1, ID2VAL(idAND), $ : 3) % * / } | expr keyword_or expr { $$ = logop(p, idOR, $1, $3, &@2, &@$); / * % ripper: binary!($ : 1, ID2VAL(idOR), $ : 3) % * / } | keyword_not '\n'? expr { $$ = call_uni_op(p, method_cond(p, $3, &@3), METHOD_NOT, &@1, &@$); / * % ripper: unary!(ID2VAL(idNOT), $ : 3) % * / } | '!' 
command_call { $$ = call_uni_op(p, method_cond(p, $2, &@2), '!', &@1, &@$); / * % ripper: unary!(ID2VAL('\'!\''), $ : 2) % * / } | arg tASSOC { value_expr($arg); } p_in_kwarg[ctxt] p_pvtbl p_pktbl p_top_expr_body[body] { pop_pktbl(p, $p_pktbl); pop_pvtbl(p, $p_pvtbl); p - > ctxt.in_kwarg = $ctxt.in_kwarg; $$ = NEW_CASE3($arg, NEW_IN($body, 0, 0, &@body), &@$, &NULL_LOC, &NULL_LOC); / * % ripper: case!($:arg, in!($:body, Qnil, Qnil)) % * / } | arg keyword_in { value_expr($arg); } p_in_kwarg[ctxt] p_pvtbl p_pktbl p_top_expr_body[body] { pop_pktbl(p, $p_pktbl); pop_pvtbl(p, $p_pvtbl); p - > ctxt.in_kwarg = $ctxt.in_kwarg; $$ = NEW_CASE3($arg, NEW_IN($body, NEW_TRUE(&@body), NEW_FALSE(&@body), &@body), &@$, &NULL_LOC, &NULL_LOC); / * % ripper: case!($:arg, in!($:body, Qnil, Qnil)) % * / } | arg %prec tLBRACE_ARG ;

Slide 43

Slide 43 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Generation rule hierarchy in PHP

Slide 44

Slide 44 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Generation rule hierarchy in Perl

Slide 45

Slide 45 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Generation rule hierarchy in Ruby

Slide 46

Slide 46 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Generation rule hierarchy in Ruby

Slide 47

Slide 47 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Generation rule hierarchy in Ruby

Slide 48

Slide 48 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Generation rule hierarchy in Ruby

Slide 49

Slide 49 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Generation rule hierarchy in Ruby

Slide 50

Slide 50 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Characteristic of Grammatical Structure Many programming languages typically just distinguish between statements (stmt) and expressions (expr) Ruby's grammar is more nuanced, breaking things down into `stmt`, `expr`, `arg`, and `primary` categories Are these gradated generation rules the source of a fl exible grammar?

Slide 51

Slide 51 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Role of arg A generation rule specialized primarily for representing arguments passed to method calls It deals with context-speci fi c syntax and interpretation rules for method argument lists Keyword argument(key: value), assign(meth(a = v)), Brackets omitted(puts 'foo')

Slide 52

Slide 52 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah The reason expr and arg were split Disambiguation f x + y

Slide 53

Slide 53 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah The reason expr and arg were split Disambiguation f x + y f(x + y) f(x) + y

Slide 54

Slide 54 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah The reason expr and arg were split Ruby-Speci fi c Flexible Syntax Support puts "hello", 1 args

Slide 55

Slide 55 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Role of primary Represents the most basic building block of an expression Provides the basis for building more complex expressions Literals(123), variable references(foo), parenthesized expressions((1+2)), and syntax with some keywords(if/ unless/return)

Slide 56

Slide 56 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah The reason arg and primary were split Class de fi nition with super class class A : : B < C : : D; end

Slide 57

Slide 57 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah The reason arg and primary were split Class de fi nition with super class class A : : B < C : : D; end class inheritance? comparison operation?

Slide 58

Slide 58 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah The reason arg and primary were split then that becomes ambiguous begin; rescue = > foo then; end

Slide 59

Slide 59 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah The reason arg and primary were split then that becomes ambiguous begin; rescue = > foo then; end LHS of rescue clause? conditional expression?

Slide 60

Slide 60 text

Reconstructing Ruby Syntactic Structures "I’m in favor of tradition. I’m respectful of and a lover of the tradition. There’s no deconstruction without the memory of the tradition." -- Jacques Derrida

Slide 61

Slide 61 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Method of Organizing

Slide 62

Slide 62 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Think in the code we usually write Duplication of code reduces maintainability and readability def insert(*args) conn = checkout_connenction conn.exec(insert_sql(*args)) ensure checkin_connenction if conn end def update(*args) conn = checkout_connenction conn.exec(update_sql(*args)) ensure checkin_connenction if conn end

Slide 63

Slide 63 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Extract redundant processing to methods def insert(*args) conn = checkout_connenction conn.exec(insert_sql(*args)) ensure checkin_connenction if conn end def update(*args) conn = checkout_connenction conn.exec(update_sql(*args)) ensure checkin_connenction if conn end def checkout conn = checkout_connenction yield conn ensure checkin_connenction if conn end def insert(*args) checkout { | conn| conn.exec(insert_sql(*args))} end def update(*args) checkout { | conn| conn.exec(update_sql(*args))} end

Slide 64

Slide 64 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Find the same structure stmt : keyword_alias f i tem . . . | lhs '=' lex_ctxt mrhs { $$ = node_assign(p, $1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } command_asgn : lhs '=' lex_ctxt command_rhs { $$ = node_assign(p, $1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } arg : lhs '=' lex_ctxt arg_rhs { $$ = node_assign(p, $1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / }

Slide 65

Slide 65 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Find the same structure stmt : keyword_alias f i tem . . . | lhs '=' lex_ctxt mrhs { $$ = node_assign(p, $1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } command_asgn : lhs '=' lex_ctxt command_rhs { $$ = node_assign(p, $1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } arg : lhs '=' lex_ctxt arg_rhs { $$ = node_assign(p, $1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / }

Slide 66

Slide 66 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Extract the structure %rule asgn(rhs) : lhs '=' lex_ctxt rhs { $$ = node_assign(p, (NODE *)$1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } ; stmt: keyword_alias f i tem . . . | asgn(rhs) ; command_asgn: asgn(command_rhs) ; arg: asgn(arg_rhs) ;

Slide 67

Slide 67 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Structuring power of Lrama Parameterizing Rules A feature that allows you to pass parameters to grammar rules Enables de fi ning generic rules that abstract over similar grammatical structures Think of it like generics (Java/C#) or templates (C++) but for your grammar de fi nitions

Slide 68

Slide 68 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Basic Syntax %rule asgn(rhs) : lhs '=' lex_ctxt rhs { $$ = node_assign(p, (NODE *)$1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } ; command_asgn: asgn(command_rhs) ; arg: asgn(arg_rhs) ;

Slide 69

Slide 69 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah %rule asgn(rhs) : lhs '=' lex_ctxt rhs { $$ = node_assign(p, (NODE *)$1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } ; Identifying pre fi x command_asgn: asgn(command_rhs) ; arg: asgn(arg_rhs) ;

Slide 70

Slide 70 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah %rule asgn(rhs) : lhs '=' lex_ctxt rhs { $$ = node_assign(p, (NODE *)$1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } ; Rule name command_asgn: asgn(command_rhs) ; arg: asgn(arg_rhs) ;

Slide 71

Slide 71 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah %rule asgn(rhs) : lhs '=' lex_ctxt rhs { $$ = node_assign(p, (NODE *)$1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } ; Parameter command_asgn: asgn(command_rhs) ; arg: asgn(arg_rhs) ;

Slide 72

Slide 72 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Pass parameters %rule asgn(rhs) : lhs '=' lex_ctxt rhs { $$ = node_assign(p, (NODE *)$1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } ; arg: asgn(arg_rhs) ; command_asgn: asgn(command_rhs) ;

Slide 73

Slide 73 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Pass parameters %rule asgn(rhs) : lhs '=' lex_ctxt rhsarg_rhs { $$ = node_assign(p, (NODE *)$1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } ; command_asgn: asgn(command_rhs) ; arg: asgn(arg_rhs) ;

Slide 74

Slide 74 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Pass parameters %rule asgn(rhs) : lhs '=' lex_ctxt rhscommand_rhs { $$ = node_assign(p, (NODE *)$1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } ; command_asgn: asgn(command_rhs) ; arg: asgn(arg_rhs) ;

Slide 75

Slide 75 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Expand parameters %rule asgn(rhs) : lhs '=' lex_ctxt rhs { $$ = node_assign(p, (NODE *)$1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } ; arg: asgn(arg_rhs) ; arg : lhs '=' lex_ctxt arg_rhs { $$ = node_assign(p, $1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / }

Slide 76

Slide 76 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Expand parameters %rule asgn(rhs) : lhs '=' lex_ctxt rhs { $$ = node_assign(p, (NODE *)$1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } ; arg: asgn(arg_rhs) ; arg : lhs '=' lex_ctxt arg_rhs { $$ = node_assign(p, $1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / }

Slide 77

Slide 77 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Expand parameters %rule asgn(rhs) : lhs '=' lex_ctxt rhs { $$ = node_assign(p, (NODE *)$1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } ; arg: asgn(arg_rhs) ; arg : lhs '=' lex_ctxt arg_rhs { $$ = node_assign(p, $1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / }

Slide 78

Slide 78 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Expand parameters %rule asgn(rhs) : lhs '=' lex_ctxt rhs { $$ = node_assign(p, (NODE *)$1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } ; arg: asgn(arg_rhs) ; arg : lhs '=' lex_ctxt arg_rhs { $$ = node_assign(p, $1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / }

Slide 79

Slide 79 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Bene fi ts Reduce Code Duplication: Avoid writing nearly identical rules multiple times (e.g., lists separated by commas, lists separated by semicolons) Increase Reusability: De fi ne a generic pattern once and instantiate it with di ff erent speci fi c tokens or types Improve Maintainability: Changes or fi xes only need to happen in the single generic rule de fi nition Enhance Readability: Makes the overall structure and intent of the grammar clearer and more declarative

Slide 80

Slide 80 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Actual result command_asgn: lhs '=' lex_ctxt command_rhs { $$ = node_assign(p, $1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } | var_lhs tOP_ASGN lex_ctxt command_rhs { $$ = new_op_assign(p, $1, $2, $4, $3, &@$); / * % ripper: opassign!($ : 1, $ : 2, $ : 4) % * / } | primary_value '[' opt_call_args rbracket tOP_ASGN lex_ctxt command_rhs { $$ = new_ary_op_assign(p, $1, $3, $5, $7, &@3, &@$, &NULL_LOC, &@2, &@4, &@5); / * % ripper: opassign!(aref_f i eld!($ : 1, $ : 3), $ : 5, $ : 7) % * / } | primary_value call_op ident_or_const tOP_ASGN lex_ctxt command_rhs { $$ = new_attr_op_assign(p, $1, $2, $3, $4, $6, &@$, &@2, &@3, &@4); / * % ripper: opassign!(f i eld!($ : 1, $ : 2, $ : 3), $ : 4, $ : 6) % * / } | primary_value tCOLON2 tCONSTANT tOP_ASGN lex_ctxt command_rhs { YYLTYPE loc = code_loc_gen(&@1, &@3); $$ = new_const_op_assign(p, NEW_COLON2($1, $3, &loc), $4, $6, $5, &@$); / * % ripper: opassign!(const_path_f i eld!($ : 1, $ : 3), $ : 4, $ : 6) % * / } | primary_value tCOLON2 tIDENTIFIER tOP_ASGN lex_ctxt command_rhs { $$ = new_attr_op_assign(p, $1, idCOLON2, $3, $4, $6, &@$, &@2, &@3, &@4); / * % ripper: opassign!(f i eld!($ : 1, $ : 2, $ : 3), $ : 4, $ : 6) % * / } | defn_head[head] f_opt_paren_args[args] '=' endless_command[bodystmt] { endless_method_name(p, $head - > nd_mid, &@head); restore_defun(p, $head); $bodystmt = new_scope_body(p, $args, $bodystmt, &@$); ($$ = $head - > nd_def) - > nd_loc = @$; RNODE_DEFN($$) - > nd_defn = $bodystmt; / * % ripper: bodystmt!($:bodystmt, Qnil, Qnil, Qnil) % * / / * % ripper: def!($:head, $:args, $:$) % * / local_pop(p); } | defs_head[head] f_opt_paren_args[args] '=' endless_command[bodystmt] { endless_method_name(p, $head - > nd_mid, &@head); restore_defun(p, $head); $bodystmt = new_scope_body(p, $args, $bodystmt, &@$); ($$ = $head - > nd_def) - > nd_loc = @$; RNODE_DEFS($$) - > nd_defn = $bodystmt; / * % ripper: bodystmt!($:bodystmt, Qnil, 
Qnil, Qnil) % * / / * % ripper: defs!(*$:head[0 . . 2], $:args, $:$) % * / local_pop(p); } | backref tOP_ASGN lex_ctxt command_rhs { VALUE MAYBE_UNUSED(e) = rb_backref_error(p, $1); $$ = NEW_ERROR(&@$); / * % ripper[error] : assign_error!(?e, opassign!(var_f i eld!($ : 1), $ : 2, $ : 4)) % * / } ;

Slide 81

Slide 81 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Assignment command_asgn: lhs '=' lex_ctxt command_rhs { $$ = node_assign(p, $1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } | var_lhs tOP_ASGN lex_ctxt command_rhs { $$ = new_op_assign(p, $1, $2, $4, $3, &@$); / * % ripper: opassign!($ : 1, $ : 2, $ : 4) % * / } | primary_value '[' opt_call_args rbracket tOP_ASGN lex_ctxt command_rhs { $$ = new_ary_op_assign(p, $1, $3, $5, $7, &@3, &@$, &NULL_LOC, &@2, &@4, &@5); / * % ripper: opassign!(aref_f i eld!($ : 1, $ : 3), $ : 5, $ : 7) % * / } | primary_value call_op ident_or_const tOP_ASGN lex_ctxt command_rhs { $$ = new_attr_op_assign(p, $1, $2, $3, $4, $6, &@$, &@2, &@3, &@4); / * % ripper: opassign!(f i eld!($ : 1, $ : 2, $ : 3), $ : 4, $ : 6) % * / } | primary_value tCOLON2 tCONSTANT tOP_ASGN lex_ctxt command_rhs { YYLTYPE loc = code_loc_gen(&@1, &@3); $$ = new_const_op_assign(p, NEW_COLON2($1, $3, &loc), $4, $6, $5, &@$); / * % ripper: opassign!(const_path_f i eld!($ : 1, $ : 3), $ : 4, $ : 6) % * / } | primary_value tCOLON2 tIDENTIFIER tOP_ASGN lex_ctxt command_rhs { $$ = new_attr_op_assign(p, $1, idCOLON2, $3, $4, $6, &@$, &@2, &@3, &@4); / * % ripper: opassign!(f i eld!($ : 1, $ : 2, $ : 3), $ : 4, $ : 6) % * / } | defn_head[head] f_opt_paren_args[args] '=' endless_command[bodystmt] { endless_method_name(p, $head - > nd_mid, &@head); restore_defun(p, $head); $bodystmt = new_scope_body(p, $args, $bodystmt, &@$); ($$ = $head - > nd_def) - > nd_loc = @$; RNODE_DEFN($$) - > nd_defn = $bodystmt; / * % ripper: bodystmt!($:bodystmt, Qnil, Qnil, Qnil) % * / / * % ripper: def!($:head, $:args, $:$) % * / local_pop(p); } | defs_head[head] f_opt_paren_args[args] '=' endless_command[bodystmt] { endless_method_name(p, $head - > nd_mid, &@head); restore_defun(p, $head); $bodystmt = new_scope_body(p, $args, $bodystmt, &@$); ($$ = $head - > nd_def) - > nd_loc = @$; RNODE_DEFS($$) - > nd_defn = $bodystmt; / * % ripper: bodystmt!($:bodystmt, Qnil, Qnil, 
Qnil) % * / / * % ripper: defs!(*$:head[0 . . 2], $:args, $:$) % * / local_pop(p); } | backref tOP_ASGN lex_ctxt command_rhs { VALUE MAYBE_UNUSED(e) = rb_backref_error(p, $1); $$ = NEW_ERROR(&@$); / * % ripper[error] : assign_error!(?e, opassign!(var_f i eld!($ : 1), $ : 2, $ : 4)) % * / } ;

Slide 82

Slide 82 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Operator assignment command_asgn: lhs '=' lex_ctxt command_rhs { $$ = node_assign(p, $1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } | var_lhs tOP_ASGN lex_ctxt command_rhs { $$ = new_op_assign(p, $1, $2, $4, $3, &@$); / * % ripper: opassign!($ : 1, $ : 2, $ : 4) % * / } | primary_value '[' opt_call_args rbracket tOP_ASGN lex_ctxt command_rhs { $$ = new_ary_op_assign(p, $1, $3, $5, $7, &@3, &@$, &NULL_LOC, &@2, &@4, &@5); / * % ripper: opassign!(aref_f i eld!($ : 1, $ : 3), $ : 5, $ : 7) % * / } | primary_value call_op ident_or_const tOP_ASGN lex_ctxt command_rhs { $$ = new_attr_op_assign(p, $1, $2, $3, $4, $6, &@$, &@2, &@3, &@4); / * % ripper: opassign!(f i eld!($ : 1, $ : 2, $ : 3), $ : 4, $ : 6) % * / } | primary_value tCOLON2 tCONSTANT tOP_ASGN lex_ctxt command_rhs { YYLTYPE loc = code_loc_gen(&@1, &@3); $$ = new_const_op_assign(p, NEW_COLON2($1, $3, &loc), $4, $6, $5, &@$); / * % ripper: opassign!(const_path_f i eld!($ : 1, $ : 3), $ : 4, $ : 6) % * / } | primary_value tCOLON2 tIDENTIFIER tOP_ASGN lex_ctxt command_rhs { $$ = new_attr_op_assign(p, $1, idCOLON2, $3, $4, $6, &@$, &@2, &@3, &@4); / * % ripper: opassign!(f i eld!($ : 1, $ : 2, $ : 3), $ : 4, $ : 6) % * / } | defn_head[head] f_opt_paren_args[args] '=' endless_command[bodystmt] { endless_method_name(p, $head - > nd_mid, &@head); restore_defun(p, $head); $bodystmt = new_scope_body(p, $args, $bodystmt, &@$); ($$ = $head - > nd_def) - > nd_loc = @$; RNODE_DEFN($$) - > nd_defn = $bodystmt; / * % ripper: bodystmt!($:bodystmt, Qnil, Qnil, Qnil) % * / / * % ripper: def!($:head, $:args, $:$) % * / local_pop(p); } | defs_head[head] f_opt_paren_args[args] '=' endless_command[bodystmt] { endless_method_name(p, $head - > nd_mid, &@head); restore_defun(p, $head); $bodystmt = new_scope_body(p, $args, $bodystmt, &@$); ($$ = $head - > nd_def) - > nd_loc = @$; RNODE_DEFS($$) - > nd_defn = $bodystmt; / * % ripper: bodystmt!($:bodystmt, 
Qnil, Qnil, Qnil) % * / / * % ripper: defs!(*$:head[0 . . 2], $:args, $:$) % * / local_pop(p); } | backref tOP_ASGN lex_ctxt command_rhs { VALUE MAYBE_UNUSED(e) = rb_backref_error(p, $1); $$ = NEW_ERROR(&@$); / * % ripper[error] : assign_error!(?e, opassign!(var_f i eld!($ : 1), $ : 2, $ : 4)) % * / } ;

Slide 83

Slide 83 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Endless method de fi nition command_asgn: lhs '=' lex_ctxt command_rhs { $$ = node_assign(p, $1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } | var_lhs tOP_ASGN lex_ctxt command_rhs { $$ = new_op_assign(p, $1, $2, $4, $3, &@$); / * % ripper: opassign!($ : 1, $ : 2, $ : 4) % * / } | primary_value '[' opt_call_args rbracket tOP_ASGN lex_ctxt command_rhs { $$ = new_ary_op_assign(p, $1, $3, $5, $7, &@3, &@$, &NULL_LOC, &@2, &@4, &@5); / * % ripper: opassign!(aref_f i eld!($ : 1, $ : 3), $ : 5, $ : 7) % * / } | primary_value call_op ident_or_const tOP_ASGN lex_ctxt command_rhs { $$ = new_attr_op_assign(p, $1, $2, $3, $4, $6, &@$, &@2, &@3, &@4); / * % ripper: opassign!(f i eld!($ : 1, $ : 2, $ : 3), $ : 4, $ : 6) % * / } | primary_value tCOLON2 tCONSTANT tOP_ASGN lex_ctxt command_rhs { YYLTYPE loc = code_loc_gen(&@1, &@3); $$ = new_const_op_assign(p, NEW_COLON2($1, $3, &loc), $4, $6, $5, &@$); / * % ripper: opassign!(const_path_f i eld!($ : 1, $ : 3), $ : 4, $ : 6) % * / } | primary_value tCOLON2 tIDENTIFIER tOP_ASGN lex_ctxt command_rhs { $$ = new_attr_op_assign(p, $1, idCOLON2, $3, $4, $6, &@$, &@2, &@3, &@4); / * % ripper: opassign!(f i eld!($ : 1, $ : 2, $ : 3), $ : 4, $ : 6) % * / } | defn_head[head] f_opt_paren_args[args] '=' endless_command[bodystmt] { endless_method_name(p, $head - > nd_mid, &@head); restore_defun(p, $head); $bodystmt = new_scope_body(p, $args, $bodystmt, &@$); ($$ = $head - > nd_def) - > nd_loc = @$; RNODE_DEFN($$) - > nd_defn = $bodystmt; / * % ripper: bodystmt!($:bodystmt, Qnil, Qnil, Qnil) % * / / * % ripper: def!($:head, $:args, $:$) % * / local_pop(p); } | defs_head[head] f_opt_paren_args[args] '=' endless_command[bodystmt] { endless_method_name(p, $head - > nd_mid, &@head); restore_defun(p, $head); $bodystmt = new_scope_body(p, $args, $bodystmt, &@$); ($$ = $head - > nd_def) - > nd_loc = @$; RNODE_DEFS($$) - > nd_defn = $bodystmt; / * % ripper: 
bodystmt!($:bodystmt, Qnil, Qnil, Qnil) % * / / * % ripper: defs!(*$:head[0 . . 2], $:args, $:$) % * / local_pop(p); } | backref tOP_ASGN lex_ctxt command_rhs { VALUE MAYBE_UNUSED(e) = rb_backref_error(p, $1); $$ = NEW_ERROR(&@$); / * % ripper[error] : assign_error!(?e, opassign!(var_f i eld!($ : 1), $ : 2, $ : 4)) % * / } ;

Slide 84

Slide 84 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Actual result command_asgn : asgn(command_rhs) | op_asgn(command_rhs) | def_endless_method(endless_command) ; command_asgn: lhs '=' lex_ctxt command_rhs { $$ = node_assign(p, $1, $4, $3, &@$); / * % ripper: assign!($ : 1, $ : 4) % * / } | var_lhs tOP_ASGN lex_ctxt command_rhs { $$ = new_op_assign(p, $1, $2, $4, $3, &@$); / * % ripper: opassign!($ : 1, $ : 2, $ : 4) % * / } | primary_value '[' opt_call_args rbracket tOP_ASGN lex_ctxt command_rhs { $$ = new_ary_op_assign(p, $1, $3, $5, $7, &@3, &@$, &NULL_LOC, &@2, &@4, &@5); / * % ripper: opassign!(aref_f i eld!($ : 1, $ : 3), $ : 5, $ : 7) % * / } | primary_value call_op ident_or_const tOP_ASGN lex_ctxt command_rhs { $$ = new_attr_op_assign(p, $1, $2, $3, $4, $6, &@$, &@2, &@3, &@4); / * % ripper: opassign!(f i eld!($ : 1, $ : 2, $ : 3), $ : 4, $ : 6) % * / } | primary_value tCOLON2 tCONSTANT tOP_ASGN lex_ctxt command_rhs { YYLTYPE loc = code_loc_gen(&@1, &@3); $$ = new_const_op_assign(p, NEW_COLON2($1, $3, &loc), $4, $6, $5, &@$); / * % ripper: opassign!(const_path_f i eld!($ : 1, $ : 3), $ : 4, $ : 6) % * / } | primary_value tCOLON2 tIDENTIFIER tOP_ASGN lex_ctxt command_rhs { $$ = new_attr_op_assign(p, $1, idCOLON2, $3, $4, $6, &@$, &@2, &@3, &@4); / * % ripper: opassign!(f i eld!($ : 1, $ : 2, $ : 3), $ : 4, $ : 6) % * / } | defn_head[head] f_opt_paren_args[args] '=' endless_command[bodystmt] { endless_method_name(p, $head - > nd_mid, &@head); restore_defun(p, $head); $bodystmt = new_scope_body(p, $args, $bodystmt, &@$); ($$ = $head - > nd_def) - > nd_loc = @$; RNODE_DEFN($$) - > nd_defn = $bodystmt; / * % ripper: bodystmt!($:bodystmt, Qnil, Qnil, Qnil) % * / / * % ripper: def!($:head, $:args, $:$) % * / local_pop(p); } | defs_head[head] f_opt_paren_args[args] '=' endless_command[bodystmt] { endless_method_name(p, $head - > nd_mid, &@head); restore_defun(p, $head); $bodystmt = new_scope_body(p, $args, $bodystmt, &@$); ($$ = $head - > nd_def) - > 
nd_loc = @$; RNODE_DEFS($$) - > nd_defn = $bodystmt; / * % ripper: bodystmt!($:bodystmt, Qnil, Qnil, Qnil) % * / / * % ripper: defs!(*$:head[0 . . 2], $:args, $:$) % * / local_pop(p); } | backref tOP_ASGN lex_ctxt command_rhs { VALUE MAYBE_UNUSED(e) = rb_backref_error(p, $1); $$ = NEW_ERROR(&@$); / * % ripper[error] : assign_error!(?e, opassign!(var_f i eld!($ : 1), $ : 2, $ : 4)) % * / } ;

Slide 85

Slide 85 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah https://bugs.ruby-lang.org/issues/21153 Precise feedback on grammar # ok Foo | | = p 1 # syntax error : : Foo | | = p 1

Slide 86

Slide 86 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Generation rules before fi x arg : asgn(lhs, arg_rhs) | op_asgn(arg_rhs) | tCOLON3 tCONSTANT tOP_ASGN lex_ctxt arg_rhs { YYLTYPE loc = code_loc_gen(&@1, &@2); $$ = new_const_op_assign(p, NEW_COLON3($2, &loc), $3, $5, $4, &@$); / * % ripper: opassign!(top_const_f i eld!($ : 2), $ : 3, $ : 5) % * / } : (snip) command_asgn : asgn(command_rhs) | op_asgn(command_rhs) | def_endless_method(endless_command) ;

Slide 87

Slide 87 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Also an operator assignment arg : asgn(lhs, arg_rhs) | op_asgn(arg_rhs) | tCOLON3 tCONSTANT tOP_ASGN lex_ctxt arg_rhs { YYLTYPE loc = code_loc_gen(&@1, &@2); $$ = new_const_op_assign(p, NEW_COLON3($2, &loc), $3, $5, $4, &@$); / * % ripper: opassign!(top_const_f i eld!($ : 2), $ : 3, $ : 5) % * / } : (snip) command_asgn : asgn(command_rhs) | op_asgn(command_rhs) | def_endless_method(endless_command) ;

Slide 88

Slide 88 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah How to Fix %rule op_asgn : var_lhs tOP_ASGN lex_ctxt rhs { $$ = new_op_assign(p, $1, $2, $4, $3, &@$); / * % ripper: opassign!($ : 1, $ : 2, $ : 4) % * / } | primary_value '[' opt_call_args rbracket tOP_ASGN lex_ctxt rhs { $$ = new_ary_op_assign(p, $1, $3, $5, $7, &@3, &@$, &NULL_LOC, &@2, &@4, &@5); / * % ripper: opassign!(aref_f i eld!($ : 1, $ : 3), $ : 5, $ : 7) % * / } | primary_value call_op tIDENTIFIER tOP_ASGN lex_ctxt rhs { $$ = new_attr_op_assign(p, $1, $2, $3, $4, $rhs, &@$, &@2, &@3, &@4); / * % ripper: opassign!(f i eld!($ : 1, $ : 2, $ : 3), $ : 4, $ : 6) % * / } | primary_value call_op tCONSTANT tOP_ASGN lex_ctxt rhs { $$ = new_attr_op_assign(p, $1, $2, $3, $4, $rhs, &@$, &@2, &@3, &@4); / * % ripper: opassign!(f i eld!($ : 1, $ : 2, $ : 3), $ : 4, $ : 6) % * / } | primary_value tCOLON2 tIDENTIFIER tOP_ASGN lex_ctxt rhs { $$ = new_attr_op_assign(p, $1, idCOLON2, $3, $4, $rhs, &@$, &@2, &@3, &@4); / * % ripper: opassign!(f i eld!($ : 1, $ : 2, $ : 3), $ : 4, $ : 6) % * / } | primary_value tCOLON2 tCONSTANT tOP_ASGN lex_ctxt rhs { YYLTYPE loc = code_loc_gen(&@1, &@3); $$ = new_const_op_assign(p, NEW_COLON2($1, $3, &loc), $4, $6, $5, &@$); / * % ripper: opassign!(const_path_f i eld!($ : 1, $ : 3), $ : 4, $ : 6) % * / } | backref tOP_ASGN lex_ctxt rhs { VALUE MAYBE_UNUSED(e) = rb_backref_error(p, $1); $$ = NEW_ERROR(&@$); / * % ripper[error] : assign_error!(?e, opassign!(var_f i eld!($ : 1), $ : 2, $ : 4)) % * / } ; arg : asgn(lhs, arg_rhs) | op_asgn(arg_rhs) | tCOLON3 tCONSTANT tOP_ASGN lex_ctxt arg_rhs { YYLTYPE loc = code_loc_gen(&@1, &@2); $$ = new_const_op_assign(p, NEW_COLON3($2, &loc), $3, $5, $4, &@$); / * % ripper: opassign!(top_const_f i eld!($ : 2), $ : 3, $ : 5) % * / } : (snip) command_asgn : asgn(command_rhs) | op_asgn(command_rhs) : (snip)

Slide 89

Slide 89 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah How to Fix %rule op_asgn : var_lhs tOP_ASGN lex_ctxt rhs { $$ = new_op_assign(p, $1, $2, $4, $3, &@$); / * % ripper: opassign!($ : 1, $ : 2, $ : 4) % * / } | primary_value '[' opt_call_args rbracket tOP_ASGN lex_ctxt rhs { $$ = new_ary_op_assign(p, $1, $3, $5, $7, &@3, &@$, &NULL_LOC, &@2, &@4, &@5); / * % ripper: opassign!(aref_f i eld!($ : 1, $ : 3), $ : 5, $ : 7) % * / } | primary_value call_op tIDENTIFIER tOP_ASGN lex_ctxt rhs { $$ = new_attr_op_assign(p, $1, $2, $3, $4, $rhs, &@$, &@2, &@3, &@4); / * % ripper: opassign!(f i eld!($ : 1, $ : 2, $ : 3), $ : 4, $ : 6) % * / } | primary_value call_op tCONSTANT tOP_ASGN lex_ctxt rhs { $$ = new_attr_op_assign(p, $1, $2, $3, $4, $rhs, &@$, &@2, &@3, &@4); / * % ripper: opassign!(f i eld!($ : 1, $ : 2, $ : 3), $ : 4, $ : 6) % * / } | primary_value tCOLON2 tIDENTIFIER tOP_ASGN lex_ctxt rhs { $$ = new_attr_op_assign(p, $1, idCOLON2, $3, $4, $rhs, &@$, &@2, &@3, &@4); / * % ripper: opassign!(f i eld!($ : 1, $ : 2, $ : 3), $ : 4, $ : 6) % * / } | primary_value tCOLON2 tCONSTANT tOP_ASGN lex_ctxt rhs { YYLTYPE loc = code_loc_gen(&@1, &@3); $$ = new_const_op_assign(p, NEW_COLON2($1, $3, &loc), $4, $6, $5, &@$); / * % ripper: opassign!(const_path_f i eld!($ : 1, $ : 3), $ : 4, $ : 6) % * / } | tCOLON3 tCONSTANT tOP_ASGN lex_ctxt rhs { YYLTYPE loc = code_loc_gen(&@tCOLON3, &@tCONSTANT); $$ = new_const_op_assign(p, NEW_COLON3($2, &loc), $3, $5, $4, &@$); / * % ripper: opassign!(top_const_f i eld!($ : 2), $ : 3, $ : 5) % * / } | backref tOP_ASGN lex_ctxt rhs { VALUE MAYBE_UNUSED(e) = rb_backref_error(p, $1); $$ = NEW_ERROR(&@$); / * % ripper[error] : assign_error!(?e, opassign!(var_f i eld!($ : 1), $ : 2, $ : 4)) % * / } ; arg : asgn(lhs, arg_rhs) | op_asgn(arg_rhs) | tCOLON3 tCONSTANT tOP_ASGN lex_ctxt arg_rhs { YYLTYPE loc = code_loc_gen(&@1, &@2); $$ = new_const_op_assign(p, NEW_COLON3($2, &loc), $3, $5, $4, &@$); / * % ripper: opassign!(top_const_f i eld!($ : 
2), $ : 3, $ : 5) % * / } : (snip) command_asgn : asgn(command_rhs) | op_asgn(command_rhs) : (snip)

Slide 90

Slide 90 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah The di ff erence is clari fi ed by making it common arg : asgn(lhs, arg_rhs) | op_asgn(arg_rhs) | tCOLON3 tCONSTANT tOP_ASGN lex_ctxt arg_rhs { YYLTYPE loc = code_loc_gen(&@1, &@2); $$ = new_const_op_assign(p, NEW_COLON3($2, &loc), $3, $5, $4, &@$); / * % ripper: opassign!(top_const_f i eld!($ : 2), $ : 3, $ : 5) % * / } : (snip) command_asgn : asgn(command_rhs) | op_asgn(command_rhs) : (snip)

Slide 91

Slide 91 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah The di ff erence is clari fi ed by making it common arg : asgn(lhs, arg_rhs) | op_asgn(arg_rhs) | tCOLON3 tCONSTANT tOP_ASGN lex_ctxt arg_rhs { YYLTYPE loc = code_loc_gen(&@1, &@2); $$ = new_const_op_assign(p, NEW_COLON3($2, &loc), $3, $5, $4, &@$); / * % ripper: opassign!(top_const_f i eld!($ : 2), $ : 3, $ : 5) % * / } : (snip) command_asgn : asgn(command_rhs) | op_asgn(command_rhs) : (snip)

Slide 92

Slide 92 text

Conclusion "Language is a process of free creation; its laws and principles are fixed, but the manner in which the principles of generation are used is free and infinitely varied. Even the interpretation and use of words involves a process of free creation." -- Noam Chomsky

Slide 93

Slide 93 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Dissecting and Reconstructing Ruby Syntactic Structures Ruby requires multiple production rule layers due to its syntax fl exibility Traditional BNF notation had limitations for expressing this complexity Lrama's new notation enables structure abstraction This improves maintainability while preserving Ruby's fl exible syntax

Slide 94

Slide 94 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Ruby 3.5 Standard library enhancements e.g. improve `list` actions to make them easier to use Inlining enhancements Re fl ects the result of dog feeding in parse.y Working Toward a Universal Parser Compatibility for Prism interfaces

Slide 95

Slide 95 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Acknowledgements LR parser gangs Dragon books club Yuichiro Kaneko (@yui-knk) Yuta Saito (@kateinoigakukun) My wife Mai and my child Mahiro

Slide 96

Slide 96 text

"Dissecting and Reconstructing Ruby Syntactic Structures" ʔ @ydah Stargaze at github.com/ruby/lrama 96