Does Ruby Parser dream of highly expressive grammar?

Yudai Takada (髙田 雄大) @ydah
May 15, 2025
RubyKaigi 2024
The Ruby parser grammar file has been subjected to many complicated workarounds due to the limitations of Bison. You've probably thought at least once or twice that richer DSL support would make grammar files more readable.

For example, patterns involving multiple repetitions separated by commas, like method call arguments, are quite common. It is only natural to want a concise way to express them, given their frequency. In addition, resolving cases where newly added grammar conflicts with the existing grammar is challenging in Ruby, which boasts a flexible grammar. It would be easier to add to or change the grammar if there were a solution that could easily resolve such conflicts.

In this talk, I cover the outcomes of adding functionality that extends the DSL of the parser generator. I believe this extension enhances the maintainability of the grammar, thereby contributing to the future development of Ruby's syntax.

ANDPAD inc

May 16, 2024

Transcript

  1. Copyright © 2020-Present ANDPAD Inc. This information is confidential and was prepared by ANDPAD Inc. for the use of our client. It is not to be relied on by any 3rd party. Proprietary & Confidential. Unauthorized reproduction or copying is prohibited.
    Does Ruby Parser dream of highly expressive grammar?
    Yudai TAKADA @ydah
    2024/05/16 RubyKaigi 2024
  2. About me
    Yudai TAKADA @ydah
    Software Engineer at ANDPAD
    Member of Re-architecting team
    https://ydah.net
  3. Service Summary
    Cloud-based Project Management Software. Products powered by Ruby.
    Office and construction-site roles: Sales / Site Supervisor / Architect, Office worker / Managerial position, Carpenter / Vendor, Manufacturer / Distribution
    (Ruby logo Copyright © 2006, Yukihiro Matsumoto)
    © 2024 ANDPAD All Rights Reserved.
  4. We are Platinum (Party) Sponsor!
  5. About me
    Yudai TAKADA @ydah
    RuboCop RSpec team
    Co-Founder of Kyobashi.rb
    Member of RubyKansai & Asakura.rb
    Maintainer of Committee
    Member of Dragon book study club
  6. About me
  8. About me
    Yudai TAKADA @ydah
    RuboCop RSpec team
    Committer of Lrama (new!)
    Co-Founder of Kyobashi.rb
    Member of RubyKansai & Asakura.rb
    Maintainer of Committee
    Member of Dragon book study club
  9. Introduction
    “This is the job I chose. It's the job I'm staking my whole life on.”
    - Kotaro Higashi (東 光太郎)
  10. What is Lrama?
    • https://github.com/ruby/lrama
    • LALR(1) parser generator written in Ruby
    • Outputs a LALR parser written in C, with “parse.y” as input
    • Replaced GNU Bison in Ruby 3.3.0
    • https://bugs.ruby-lang.org/issues/19637
  11. Why use Lrama instead of Bison?
    • We needed to support multiple versions of Bison
    • Ruby must be buildable with different versions of Bison
    • New features are added to Bison, but we cannot rely on them
    • It is difficult to extend Bison's functionality
    • Ruby is not the only user of Bison
    • Other common parser generators have similar problems
  12. What's the issue?
    • Maintainability
    • “parse.y” is difficult
    • It has been likened to a Demon Castle, Hell, “魔境” (a demon-haunted realm)
    • What's the difficulty?
    • Grammar provided by Bison is primitive
    • Parser and Lexer are tightly coupled
    • Difficult to resolve shift/reduce (S/R) or reduce/reduce (R/R) conflicts
    • Need to improve the development experience
  13. Today’s Talk
  14. Today’s Talk
    • Improve the developer experience of parse.y
    • Fight 3 challenges to grammar maintainability:
        1. Primitive Grammar
        2. Tightly coupled Lexer & Parser
        3. Difficult to resolve Conflicts
    • Consider the possibility of a grammatical extension approach
  15. Primitive Grammar
    “I have now given you the Ultra Rings, which mark you as members of the Galactic Federation. When those rings shine, you will know the great power I have given you.”
    - Ultraman Ace (ウルトラマンA)
  16. Primitive Grammar
    • Rules have the same structure and the same actions

        opt_args_tail : ',' args_tail
                          {
                              $$ = $2;
                          /*% ripper: get_value($:2); %*/
                          }
                      | /* none */
                          {
                              $$ = new_args_tail(p, 0, 0, 0, &@0);
                          /*% ripper: rb_ary_new_from_args(3, Qnil, Qnil, Qnil); %*/
                          }
                      ;

        opt_block_args_tail : ',' block_args_tail
                                {
                                    $$ = $2;
                                /*% ripper: get_value($:2); %*/
                                }
                            | /* none */
                                {
                                    $$ = new_args_tail(p, 0, 0, 0, &@0);
                                /*% ripper: rb_ary_new_from_args(3, Qnil, Qnil, Qnil); %*/
                                }
                            ;
  17. Primitive Grammar
    • The only difference between the two rules is the nonterminal symbol after ',': args_tail vs. block_args_tail
  18. Primitive Grammar
    • “Code should be clear and simple - straightforward logic, natural expression, conventional language use, meaningful names, neat formatting, helpful comments - and it should avoid clever tricks and unusual constructions.”
    • Brian W. Kernighan, Rob Pike. 1999/02/04. The Practice of Programming. Addison-Wesley
  19. Parameterizing Rules
  20. Parameterizing Rules
    • The definition of a non-terminal symbol can be parameterized with other (terminal or non-terminal) symbols
    • The idea of Parameterizing Rules comes from the Menhir LR(1) parser generator
    • https://gallium.inria.fr/~fpottier/menhir/manual.html#sec32
  21. Parameterizing Rules
    • How to define a Rule: start with the `%rule` directive

        %rule option(X) : /* empty */
                        | X
                        ;

        %rule preceded(opening, X) : opening X { $$ = $2; }
                                   ;
  22. Parameterizing Rules
    • How to define a Rule: define the rule name and the parameters passed to the rule, here `option(X)` and `preceded(opening, X)`
  23. Parameterizing Rules
    • How to define a Rule: define the RHS, which may refer to the parameters
  24. Parameterizing Rules
    • How to use a Rule

        %rule option(X) : /* empty */
                        | X
                        ;

        %%

        opt_foo : option(foo)
                ;
  25. Parameterizing Rules
    • How to use a Rule: the symbol `foo` is passed as the argument for the parameter `X`
  26. Parameterizing Rules
    • Registers a rule in the resolver based on the definition of the rule

        %rule option(X) : /* empty */ { $$ = 0; }
                        | X
                        ;

    • Registered information: rule name, parameter names, parameter count, RHS, user code
  27. Parameterizing Rules
    • Find the parameterizing rule by rule name and parameter count

        opt_foo : option(number)
                ;

    • Lookup key: rule name `option`, parameter count 1
    • Registered parameterizing rules:
        • name: list, param_count: 1
        • name: no_empty_list, param_count: 1
        • name: pair, param_count: 2
        • name: option, param_count: 1
        • name: preceded, param_count: 2
  28. Parameterizing Rules
    • When several registered rules have the same name and parameter count
    • Lookup key: rule name `option`, parameter count 1
    • Registered parameterizing_rules:
        • name: list, param_count: 1
        • name: no_empty_list, param_count: 1
        • name: pair, param_count: 2
        • name: option, param_count: 1
        • name: preceded, param_count: 2
        • name: option, param_count: 1
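
The lookup on slides 27 and 28 can be sketched in Ruby roughly as follows. This is a minimal illustration, not Lrama's actual code: `ParamRule`, `Resolver`, `add` and `find` are hypothetical names, and the choice that the most recently registered rule wins when name and parameter count collide is an assumption the slides do not state.

```ruby
# Hypothetical sketch of a parameterizing-rule resolver.
# ParamRule and Resolver are illustrative names, not Lrama's classes.
ParamRule = Struct.new(:name, :params, :rhs_list) do
  def param_count
    params.size
  end
end

class Resolver
  def initialize
    @rules = []
  end

  # Register a rule definition; returns self so calls can be chained.
  def add(rule)
    @rules << rule
    self
  end

  # Find a rule by name and parameter count.
  # Assumption: if several rules match, the one registered last wins.
  def find(name, arg_count)
    @rules.reverse.find { |r| r.name == name && r.param_count == arg_count }
  end
end

resolver = Resolver.new
resolver.add(ParamRule.new("option", ["X"], [["%empty"], ["X"]]))
resolver.add(ParamRule.new("pair", ["X", "Y"], [["X", "Y"]]))
resolver.add(ParamRule.new("preceded", ["opening", "X"], [["opening", "X"]]))
# A second definition with the same name and parameter count.
resolver.add(ParamRule.new("option", ["X"], [["X"]]))

p resolver.find("option", 1).rhs_list
p resolver.find("option", 2)
```

Keying the lookup on both name and parameter count lets `option(X)` and a hypothetical two-parameter `option(X, Y)` coexist without ambiguity.
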
  29. Parameterizing Rules
    • Example of rule expansion

        %rule option(X) : /* empty */
                        | X
                        ;

        %%

        opt_foo : option(number)
                ;

    • The empty alternative corresponds to the empty branch marked here:

        if foo
        else
          # <- this
        end
  30. Parameterizing Rules
    • Example of rule expansion

        %rule option(X) : /* empty */
                        | X
                        ;

        opt_foo : option(number)
                ;

    • Expanded rules:

        opt_foo       : option_number
                      ;
        option_number : %empty
                      | number
                      ;

    • The name of the generated nonterminal is the rule name concatenated with the argument names: `option` + `number` gives `option_number`
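
The expansion shown above can be sketched in a few lines of Ruby. This is only an illustration under assumptions: `expand_call` and the hash layout (`:name`, `:params`, `:rhs_list`) are hypothetical, and the sketch handles just plain symbol substitution, not nested calls or user code.

```ruby
# Hypothetical sketch: expand one call of a parameterizing rule.
# rule: { name:, params:, rhs_list: }; args: symbol names passed by the caller.
def expand_call(rule, args)
  unless rule[:params].size == args.size
    raise ArgumentError, "wrong number of arguments for #{rule[:name]}"
  end

  # Bind each parameter to the argument in the same position.
  bindings = rule[:params].zip(args).to_h
  # Generated nonterminal: rule name joined with the argument names.
  new_name = ([rule[:name]] + args).join("_")
  productions = rule[:rhs_list].map do |rhs|
    rhs.empty? ? ["%empty"] : rhs.map { |sym| bindings.fetch(sym, sym) }
  end
  [new_name, productions]
end

option = { name: "option", params: ["X"], rhs_list: [[], ["X"]] }
name, productions = expand_call(option, ["number"])
puts "opt_foo: #{name}"
productions.each { |rhs| puts "#{name}: #{rhs.join(' ')}" }
```

Symbols that are not parameters (for example a literal `','`) pass through unchanged because `bindings.fetch(sym, sym)` falls back to the symbol itself.
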
  32. Parameterizing Rules
    • Using parameterizing rules, the duplicated opt_args_tail and opt_block_args_tail rules from slide 16 ...
  33. Parameterizing Rules
    • ... can be abstracted in this way!!

        %rule args_tail(tail) : ',' tail
                                  {
                                      $$ = $2;
                                  /*% ripper: get_value($:2); %*/
                                  }
                              | /* none */
                                  {
                                      $$ = new_args_tail(p, 0, 0, 0, &@0);
                                  /*% ripper: rb_ary_new_from_args(3, Qnil, Qnil, Qnil); %*/
                                  }
                              ;

        %%

        opt_args_tail : args_tail(args_tail)
                      ;

        opt_block_args_tail : args_tail(block_args_tail)
                            ;
  34. Parameterizing Rules with Tag

        %rule foo(X) : X { $$ = $1; }
                     ;

        %%

        program : foo(number) { $$ = $1; }
                ;

    • But we can't access the symbol as it is, because type information must be specified for every symbol referenced in an action.
  35. Parameterizing Rules with Tag
    • Provide type information for terminal/nonterminal symbols

        %union {
            int num;
            char *str;
        }

        %token <num> tNUMBER
        %token <str> tSTRING
        %type <num> number
  36. Parameterizing Rules with Tag
    • The `<num>` and `<str>` annotations are Tags
  37. Parameterizing Rules with Tag
    • But the type is hard to specify with `%token` or `%type`

        %rule foo(X) : X { $$ = $1; }
                     ;

        %%

        program : foo(number, string) { $$ = $1; }
                ;
  38. Parameterizing Rules with Tag
    • After expansion, `program` refers to the generated nonterminal `foo_number_string`, which never appears by name in the grammar file
  39. Parameterizing Rules with Tag
    • Example of how to specify a Tag:

        %rule foo(X) : X { $$ = $1; }
                     ;

        %%

        program : foo(number) <i> { $$ = $1; }
                ;

    • Specify a Tag after the parameterizing rule call.
  40. Parameterizing Rules with Tag
    • Other examples:

        %union {
            int num;
            char *str;
        }

        %token <num> number
        %token <str> string

        %rule foo(X) : X { $$ = $1; }
                     ;

        %%

        program : foo(number) <num> { $$ = $1; }
                | foo(string) <str> { $$ = $1; }
                ;
  41. Parameterizing Rules with Tag
    • `foo(number) <num>` is an int and `foo(string) <str>` is a char *, so `$$ = $1` inside `foo(X)` is int or char * depending on the call
  42. Parameterizing Rules with Tag
    • Reduce redundant <tag> specifications on callers

        %rule foo(X) : X { $$ = $1; }
                     | foo(X) <i> X { $$ = $1 + $2; }
                     ;

        %%

        program : foo(number) <i> { $$ = $1; }
                ;

        other : bar foo(number) <i> baz { $$ = $1 + $2 + $3; }
              ;
  44. Parameterizing Rules with Tag
    • Tag can be specified on the callee side!

        %rule foo(X) <i> : X { $$ = $1; }
                         | foo(X) X { $$ = $1 + $2; }
                         ;

        %%

        program : foo(number) { $$ = $1; }
                ;

        other : bar foo(number) baz { $$ = $1 + $2 + $3; }
              ;
  45. Parameterizing Rules with Tag
    • About Tag Priority

        %rule foo(X) <int> : X { $$ = $1; }
                           ;

        %%

        program : foo(number) <uint> { $$ = $1; }
                ;

    • When a tag is specified on both the callee and caller sides like this, the tag on the caller side takes precedence.
  46. Parameterizing Rules with Tag
    • In this example, `<uint>` is used as the type referenced by `$$` on the callee side and by `$1` on the caller side.
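
The priority rule above amounts to a simple fallback. As a minimal Ruby sketch, assuming a hypothetical helper `effective_tag` (not part of Lrama's API) that receives the tag from the `%rule` definition and the tag from the call site, either of which may be absent:

```ruby
# Hypothetical helper: choose the tag of the nonterminal generated
# from a parameterizing rule call.
# callee_tag: tag on the %rule definition, or nil if none was given.
# caller_tag: tag written after the call, or nil if none was given.
def effective_tag(callee_tag, caller_tag)
  # The caller-side tag takes precedence; otherwise fall back to the callee's.
  caller_tag || callee_tag
end

puts effective_tag("int", "uint") # both given
puts effective_tag("int", nil)    # only the definition has a tag
puts effective_tag(nil, "i")      # only the call has a tag
```

With this ordering, a definition can supply a default tag for all callers while an individual call can still override it.
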
  47. Productivity, Efficiency, and Robustness
    • “To help programmers be productive, I focus on making programs succinct in Ruby. I try to make programs compact. I also focus on providing very rich features in class libraries. If you have a library method to do the things that you want to do, you don't need to write much code.”
    • “Dynamic Productivity with Ruby - A Conversation with Yukihiro Matsumoto, Part II” 2003/11/17
    • https://www.artima.com/articles/dynamic-productivity-with-ruby
  48. Productivity, Efficiency, and Robustness
    • The following comma-separated list is a common structure

        list : item
             | list ',' item
             ;

    • It appears in Ruby code such as:

        def foo(bar, baz)
          # do something
        end

        [foo, bar, baz].each do |item|
          # do something
        end

        foo(bar, baz)
  49. Standard Library
  50. Standard Library
    • Provides a Standard Library of rules with a general structure, such as `option` or `list`
    • By default, it is joined with every grammar specification
    • It is stored in a file named lib/lrama/grammar/stdlib.y, which is embedded in the grammar file before parsing
    • If you don’t want to embed the Standard Library, add the `%no-stdlib` directive to the grammar file
  51. Standard Library
    • The following three rules can be written using a familiar regular-expression-like syntax

        Name               Recognizes                          Alias
        option(X)          ε | X                               X?
        list(X)            a possibly empty sequence of X’s    X*
        nonempty_list(X)   a nonempty sequence of X’s          X+
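
The alias column of the table can be read as a rewrite from a suffixed symbol to a rule call. The Ruby sketch below illustrates that reading only; `SUFFIX_RULES` and `desugar` are hypothetical names, and how Lrama actually handles the suffixes inside its parser is not shown in the slides.

```ruby
# Hypothetical sketch: rewrite the regex-like aliases into rule calls.
SUFFIX_RULES = { "?" => "option", "*" => "list", "+" => "nonempty_list" }.freeze

# token is one RHS symbol as written in the grammar, e.g. "item*" or "'\n'?".
def desugar(token)
  rule = SUFFIX_RULES[token[-1]]
  # Leave symbols without a suffix (and a bare suffix character) untouched.
  return token if rule.nil? || token.length < 2

  "#{rule}(#{token[0...-1]})"
end

puts desugar("'\\n'?")
puts desugar("item*")
puts desugar("item+")
puts desugar("')'")
```

A quoted character such as `'+'` ends with a quote rather than a suffix, so it passes through unchanged.
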
  52. Standard Library
    • In an actual replacement in parse.y, the non-terminal symbol for an optional line break was rewritten. Before:

        opt_nl : /* none */
               | '\n'
               ;

        rparen : opt_nl ')'
  53. Standard Library
    • After:

        rparen : '\n'? ')'
  54. Summary
    • Grammar provided by Bison is primitive
    • We gained the power to extract structure from parse.y!
    • Parameterizing rules can name patterns, which clarifies the intent of a rule and allows patterns to be reused
    • The Standard Library provides a simple way to write rules for common structures
  55. Tightly coupled Lexer & Parser
    “Wait, Taro. This bracelet is your new weapon.”
    - Mother of Ultra (ウルトラの母)
  56. Tightly coupled Lexer & Parser
    • “Ruby Parser開発日誌 (6) - parse.yのMaintainabilityの話” (Ruby Parser Development Diary (6): on the maintainability of parse.y), かねこにっき, 2023/04/04.
    • https://yui-knk.hatenablog.com/entry/2023/04/04/190413
    • > Why are the parser and lexer tightly coupled?
    • > 2. To invalidate certain production rules under certain conditions
  57. Tightly coupled Lexer & Parser
    • `...` cannot be specified in a lambda's formal parameters

        irb(main):001> ->(a) {}
        => #<Proc:0x000000012316e698 (irb):1 (lambda)>
        irb(main):002> ->(a=1) {}
        => #<Proc:0x0000000123119918 (irb):2 (lambda)>
        irb(main):003> ->(...) {}
        <internal:kernel>:187:in `loop': (irb):3: syntax error, unexpected ..., expecting ')' (SyntaxError)
        ->(...) {}
           ^~~
  58. Tightly coupled Lexer & Parser
    • There's a hack that looks at the lex_state and returns a token that will cause a syntax error if it's a lambda formal parameter (excerpt)

        case '.': {
            int is_beg = IS_BEG();
            SET_LEX_STATE(EXPR_BEG);
            if ((c = nextc(p)) == '.') {
                if ((c = nextc(p)) == '.') {
                    if (p->ctxt.in_argdef) {
                        SET_LEX_STATE(EXPR_ENDARG);
                        return tBDOT3;
                    }
                    if (p->lex.paren_nest == 0 && looking_at_eol_p(p)) {
                        rb_warn0("... at EOL, should be parenthesized?");
                    }
                    else if (p->lex.lpar_beg >= 0 && p->lex.lpar_beg+1 == p->lex.paren_nest) {
                        if (IS_lex_state_for(last_state, EXPR_LABEL))
                            return tDOT3;
                    }
  60. Tightly coupled Lexer & Parser
    • Very abbreviated diagram of the RHS (each arrow means "has in its RHS"):

        f_arglist    -> f_args
        f_larglist   -> f_args
        f_args       -> args_tail
        args_tail    -> args_forward
        args_forward -> tBDOT3
  61. Tightly coupled Lexer & Parser
    • (Maybe) simplest solution: separate rules for lambdas, without args_forward

        f_arglist  -> f_args  -> args_tail  -> args_forward -> tBDOT3
        f_larglist -> f_largs -> largs_tail
  62. Tightly coupled Lexer & Parser
    • It's not good to have to make the same changes in all…
  63. Parameterizing Rules with Conditional Statements
  64. Parameterizing Rules with Conditional Statements
    • Conceptual diagram:

        f_arglist  -> f_args -> args_tail -> [lambda? No] -> args_forward -> tBDOT3
        f_larglist -> f_args -> args_tail -> [lambda? No] -> args_forward -> tBDOT3
  65. Parameterizing Rules with Conditional Statements
    • args_forward is included in the RHS unless it is a lambda
  66. Parameterizing Rules with Conditional Statements
    • args_forward should not be included in the RHS if it is a lambda
  67. Parameterizing Rules with Conditional Statements
    • [PoC] Intoroduce parameterizing rules with conditional statement #418
    • https://github.com/ruby/lrama/pull/418
  68. Parameterizing Rules with Conditional Statements
    • How to use a conditional (1)

        %rule defined_rule(X, condition) : X { $$ = $1; } %if(condition)
                                         ;

        %%

        r_true  : defined_rule(number, %true)
                ;

        r_false : defined_rule(number, %false)
                ;

    • This syntax is like Ruby's postfix if. If the condition is false, the production rule on that line is not expanded.
  69. Parameterizing Rules with Conditional Statements
      • Result (1)
        ❯ exe/lrama --trace=rules sample/calc.y
        Grammar rules:
        $accept -> r_true YYEOF
        defined_rule_number_true -> number
        r_true -> defined_rule_number_true
        r_false -> ε
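The postfix %if behaviour on slides 68 and 69 can be modelled with a small Python sketch. This is not Lrama's implementation: Alternative and expand_postfix_if are hypothetical names, and the instance naming only imitates the trace output shown above. An alternative whose condition parameter is false is simply left out of the expansion.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Alternative:
    """One alternative of a parameterized rule (hypothetical model)."""
    rhs: Tuple[str, ...]             # symbols; parameters are referenced by name
    condition: Optional[str] = None  # name of a boolean parameter, or None


def expand_postfix_if(name, params, alternatives, args):
    """Instantiate a parameterized rule, skipping alternatives whose
    postfix condition evaluates to false."""
    binding = dict(zip(params, args))
    # Instance name imitates the trace output, e.g. defined_rule_number_true.
    parts = [str(a).lower() if isinstance(a, bool) else a for a in args]
    instance = "_".join([name] + parts)
    productions = []
    for alt in alternatives:
        if alt.condition is not None and not binding[alt.condition]:
            continue  # condition is false: this alternative is not expanded
        rhs = tuple(binding.get(sym, sym) for sym in alt.rhs)
        productions.append((instance, rhs))
    return instance, productions


rule = [Alternative(rhs=("X",), condition="condition")]
params = ("X", "condition")
print(expand_postfix_if("defined_rule", params, rule, ("number", True)))
print(expand_postfix_if("defined_rule", params, rule, ("number", False)))
```

With %true the sketch yields one production, defined_rule_number_true -> number. With %false it yields none, which corresponds to r_false deriving only ε in the trace.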
  70. Parameterizing Rules with Conditional Statements
      • How to use a conditional (2)
        %rule defined_rule(X, condition): %if(condition) X %endif X { $$ = $1; }
                                        ;
        %%
        r_true  : defined_rule(number, %true)
                ;
        r_false : defined_rule(number, %false)
                ;
      If statement: if condition is false, the rule is equivalent to one without the RHS symbols between `%if` and `%endif`.
  71. Parameterizing Rules with Conditional Statements
      • Result (2)
        ❯ exe/lrama --trace=rules sample.y
        Grammar rules:
        $accept -> r_true YYEOF
        defined_rule_number_true -> number number
        r_true -> defined_rule_number_true
        defined_rule_number_false -> number
        r_false -> defined_rule_number_false
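The block form %if ... %endif on slides 70 and 71 can be sketched the same way. Again this is only an illustrative model, not Lrama's code: expand_if_block and the ("if", condition, symbols) tuple encoding are assumptions made for this example. Symbols inside the block are kept only when the condition is true; symbols outside it are always kept.

```python
def expand_if_block(rhs_items, binding):
    """Expand one RHS. Each item is either a symbol name or a block
    ("if", condition_name, [symbols]) standing for %if(...) ... %endif."""
    expanded = []
    for item in rhs_items:
        if isinstance(item, tuple) and item[0] == "if":
            _, condition, body = item
            if binding[condition]:
                # Condition is true: keep the symbols inside the block.
                expanded.extend(binding.get(sym, sym) for sym in body)
            # Condition is false: the block contributes nothing.
        else:
            expanded.append(binding.get(item, item))
    return expanded


# %rule defined_rule(X, condition): %if(condition) X %endif X
rhs = [("if", "condition", ["X"]), "X"]
print(expand_if_block(rhs, {"X": "number", "condition": True}))
print(expand_if_block(rhs, {"X": "number", "condition": False}))
```

The two calls give the right-hand sides number number and number, matching defined_rule_number_true and defined_rule_number_false in Result (2).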
  72. Summary
      • Parser and Lexer are tightly coupled
      • Parameterizing rules still have room to evolve!
      • The PoC suggests that evolving them makes them more flexible
      • What should we parameterize? Thinking about this may let us solve other problems as well
  73. Difficult to resolve Conflicts
      "Sun, make me stronger. If you love the Earth, your child, give me the power to fight Bemstar on equal terms!"
      (Hideki Go)
  74. Difficult to resolve Conflicts
      • It is known that the following arithmetic grammar does not work as expected
        expr : number { $$ = $1; }
             | expr op expr {
                 if ($2 == '+') $$ = $1 + $3;
                 else $$ = $1 * $3;
               }
             ;
        op   : PLUS { $$ = '+'; }
             | TIMES { $$ = '*'; }
             ;
  75. Difficult to resolve Conflicts
      • There is an S/R conflict despite the precedence declarations
        shift/reduce conflict on token PLUS:
          op: • PLUS (rule 3)
          expr: expr op expr • (rule 2)
        (snip)
        shift/reduce conflict on token TIMES:
          op: • TIMES (rule 4)
          expr: expr op expr • (rule 2)
  76. Difficult to resolve Conflicts
      • So how do you resolve conflicts from here?
        expr : number { $$ = $1; }
             | expr op expr {
                 if ($2 == '+') $$ = $1 + $3;
                 if ($2 == '*') $$ = $1 * $3;
               }
             ;
        op   : PLUS { $$ = '+'; }
             | TIMES { $$ = '*'; }
             ;
  77. Difficult to resolve Conflicts
      • Standard workaround:
        expr : number { $$ = $1; }
             | expr PLUS expr { $$ = $1 + $3; }
             | expr TIMES expr { $$ = $1 * $3; }
             ;
      Part of the structure of the original production rules is lost
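One way to see why the conflict survives the precedence declarations: in Bison-style generators a rule takes its precedence from the last terminal in its right-hand side, and expr op expr contains no terminal at all. The Python sketch below models only that decision. The precedence table is an assumption (the declarations are not shown on the slides; %left PLUS followed by %left TIMES is assumed), and rule_precedence and resolve are hypothetical helper names.

```python
TERMINALS = {"PLUS", "TIMES"}
# Assumed declarations, not shown on the slides: %left PLUS, then %left TIMES.
TOKEN_PRECEDENCE = {"PLUS": 1, "TIMES": 2}


def rule_precedence(rhs):
    """A rule inherits the precedence of the last terminal in its RHS."""
    for sym in reversed(rhs):
        if sym in TERMINALS:
            return TOKEN_PRECEDENCE.get(sym)
    return None  # no terminal in the RHS, so the rule has no precedence


def resolve(rhs, lookahead):
    """Decide a shift/reduce choice between reducing by rhs and shifting lookahead."""
    rule_prec = rule_precedence(rhs)
    token_prec = TOKEN_PRECEDENCE.get(lookahead)
    if rule_prec is None or token_prec is None:
        return "conflict"  # precedence cannot be compared
    if rule_prec > token_prec:
        return "reduce"
    if rule_prec < token_prec:
        return "shift"
    return "reduce"  # equal precedence with left associativity


print(resolve(["expr", "op", "expr"], "PLUS"))      # rule has no precedence
print(resolve(["expr", "PLUS", "expr"], "TIMES"))   # TIMES binds tighter
print(resolve(["expr", "TIMES", "expr"], "PLUS"))   # reduce the product first
```

Under these assumptions the rule through op stays a conflict, while the workaround rules with PLUS and TIMES written directly in the RHS are resolved by precedence.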
  78. Inlining
  79. Inlining
      • Provides a way to avoid conflicts without changing the grammar
      • The idea of Inlining comes from the Menhir LR(1) parser generator
      • https://gallium.inria.fr/~fpottier/menhir/manual.html#sec37
  80. Inlining
      • How to use an inline rule
        %rule %inline op: PLUS { + }
                        | TIMES { * }
                        ;
        %%
        expr : number { $$ = $1; }
             | expr op expr { $$ = $1 $2 $3; }
             ;
  85. Inlining
      • Rules and actions after inlining:
        ❯ exe/lrama --trace=actions sample.y
        Grammar rules with actions:
        $accept -> expr YYEOF {}
        expr -> number { $$ = $1; }
        expr -> expr PLUS expr { $$ = $1 + $3; }
        expr -> expr TIMES expr { $$ = $1 * $3; }
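The macro-like rewriting that produces the output above can be sketched in Python as a textual substitution. This is a simplified model, not Lrama's implementation: inline is a hypothetical function, it handles only one occurrence of the inlined symbol per rule, and it treats actions as plain strings. Each alternative of the inline rule replaces the symbol in the RHS, its action text replaces the matching $n reference, and later references are renumbered.

```python
import re


def inline(rule_rhs, rule_action, inline_name, inline_alts):
    """Expand one occurrence of inline_name in a rule.

    inline_alts is a list of (rhs_symbols, action_text) pairs, one per
    alternative of the inline rule."""
    pos = rule_rhs.index(inline_name)  # 0-based position of the inlined symbol
    results = []
    for alt_rhs, alt_action in inline_alts:
        new_rhs = rule_rhs[:pos] + list(alt_rhs) + rule_rhs[pos + 1:]
        shift = len(alt_rhs) - 1  # how far later $n references move

        def renumber(match):
            n = int(match.group(1))
            if n == pos + 1:
                return alt_action       # splice in the inline action text
            if n > pos + 1:
                return f"${n + shift}"  # renumber references after the splice
            return match.group(0)       # earlier references are unchanged

        # The pattern needs a digit after "$", so "$$" is left untouched.
        new_action = re.sub(r"\$(\d+)", renumber, rule_action)
        results.append((new_rhs, new_action))
    return results


op_alternatives = [(["PLUS"], "+"), (["TIMES"], "*")]
for rhs, action in inline(["expr", "op", "expr"], "$$ = $1 $2 $3;", "op", op_alternatives):
    print("expr ->", " ".join(rhs), "{", action, "}")
```

The sketch yields expr PLUS expr with action $$ = $1 + $3; and expr TIMES expr with action $$ = $1 * $3;, the same rules and actions as the trace output.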
  86. Inlining
      • However, the current implementation is still naive
        • Like macros, code is rewritten when rules are expanded
      • A new grammar is required
        • Keep the code in the action of the rule that defines the inlining
        • Reference it from the side that uses the inlining
      • We need to think about intermediate language design
        • Ultimately, we aim to output parsers written in various languages
        • We need a transpiler to convert to each language
        • We need to design an intermediate language that looks to the future
  87. Summary
      • S/R and R/R conflicts are difficult to resolve
      • Inlining is powerful because it can resolve conflicts without losing information about the structure of the grammar
      • Inlining is still a naive implementation and needs to be improved
      • We need a forward-looking intermediate language design
  88. Conclusion
      "Fearsome enemies will keep appearing one after another. But as long as we keep the spirit of the Ultra Guard, the peace of the Earth will surely be protected."
      (Dan Moroboshi)
  89. Conclusion
      • The results suggest that the three threats to maintainability can be fought effectively by extending the grammar
      • There is still room for extension, so developing a richer grammar can further enhance maintainability
      • We can now use more convenient parser generator features! (if we implement them 😉)
  90. Acknowledgements
      • @yui-knk
      • @junk0612
      • @amatsuda
      • @jinroq
  91. References
      • Lrama LALR(1) parser generator. https://github.com/ruby/lrama
      • Menhir Reference Manual (version 20231231). https://gallium.inria.fr/~fpottier/menhir/manual.html
      • Yuichiro Kaneko, "Ruby Parser開発日誌 (5) - Lrama LALR (1) parser generatorを実装した" [Ruby Parser development diary (5): implementing the Lrama LALR(1) parser generator], かねこにっき, 2023/03/13. https://yui-knk.hatenablog.com/entry/2023/03/13/101951
      • Yuichiro Kaneko, "Ruby Parser開発日誌 (6) - parse.yのMaintainabilityの話" [Ruby Parser development diary (6): on the maintainability of parse.y], かねこにっき, 2023/04/04. https://yui-knk.hatenablog.com/entry/2023/04/04/190413
      • Yuichiro Kaneko, "Lrama LRパーサジェネレータが切り開く、Rubyの構文解析の未来" [The future of Ruby parsing opened up by the Lrama LR parser generator], gihyo.jp, Ruby 3.3 release feature series, 2024/01/24. https://gihyo.jp/article/2024/01/ruby3.3-lrama
      • Bill Venners, Yukihiro Matsumoto, "Dynamic Productivity with Ruby - A Conversation with Yukihiro Matsumoto, Part II", 2003/11/17. https://www.artima.com/articles/dynamic-productivity-with-ruby
  92. Thank you!!