RNode with code locations

yui-knk
June 01, 2018

Transcript

  1. RNode with code locations Jun 1, 2018 in RubyKaigi 2018

    @yui-knk Yuichiro Kaneko
  4. Self-introduction

    • Yuichiro Kaneko
    • Asakusa.rb
    • A CRuby Committer (2015/12~)
    • GitHub (yui-knk)
    • Twitter (spikeolaf)
  5. I will join Treasure Data next week!!!

  6. Today's topic

    • RNode
      • Node of Abstract Syntax Tree
    • Code location
      • Location information of RNode
  7. Run the code

    • Run the code on …
      • Ruby 2.4
      • Ruby 2.5

        $ ruby --dump=p -e '"str".upcase'
  8. Ruby 2.4

        $ ruby --dump=p -e '"str".upcase'
        # @ NODE_SCOPE (line: 1)
        # +- nd_tbl: (empty)
        # +- nd_args:
        # |   (null node)
        # +- nd_body:
        #     @ NODE_PRELUDE (line: 1)
        #     +- nd_head:
        #     |   (null node)
        #     +- nd_body:
        #     |   @ NODE_CALL (line: 1)
        #     |   +- nd_mid: :upcase
        #     |   +- nd_recv:
        #     |   |   @ NODE_STR (line: 1)
        #     |   |   +- nd_lit: "str"
        #     |   +- nd_args:
        #     |       (null node)
        #     +- nd_compile_option:
        #         +- coverage_enabled: false
  9. Ruby 2.5

        $ ruby --dump=p -e '"str".upcase'
        # @ NODE_SCOPE (line: 1, code_range: (1,0)-(1,12))
        # +- nd_tbl: (empty)
        # +- nd_args:
        # |   (null node)
        # +- nd_body:
        #     @ NODE_PRELUDE (line: 1, code_range: (1,0)-(1,12))
        #     +- nd_head:
        #     |   (null node)
        #     +- nd_body:
        #     |   @ NODE_CALL (line: 1, code_range: (1,0)-(1,12))
        #     |   +- nd_mid: :upcase
        #     |   +- nd_recv:
        #     |   |   @ NODE_STR (line: 1, code_range: (1,0)-(1,5))
        #     |   |   +- nd_lit: "str"
        #     |   +- nd_args:
        #     |       (null node)
        #     +- nd_compile_option:
        #         +- coverage_enabled: false
  11. Today’s Topic

        @@ -1,8 +1,8 @@
         #     +- nd_body:
        -#     |   @ NODE_CALL (line: 1)
        +#     |   @ NODE_CALL (line: 1, code_range: (1,0)-(1,12))
         #     |   +- nd_mid: :upcase
         #     |   +- nd_recv:
        -#     |   |   @ NODE_STR (line: 1)
        +#     |   |   @ NODE_STR (line: 1, code_range: (1,0)-(1,5))
         #     |   |   +- nd_lit: "str"
         #     |   +- nd_args:
         #     |       (null node)
  12. Agenda

    • What code locations are
    • Why code locations are needed
    • Ruby crash course
    • How to implement code locations
    • The future plan of code locations feature
    • Conclusion
  13. What code locations are

  14. Location information in programming

    • Location information about a script is used in various situations.
  15. Exception

        Traceback (most recent call last):
                1: from src/exception.rb:5:in `<main>'
        src/exception.rb:2:in `a': undefined method `foo' for "":String (NoMethodError)
  16. Warning src/warning.rb:2: warning: instance variable @a not initialized

  17. Location information in programming

    • Location information about a script is used in various situations.
      • "An exception is raised from line number XX" (Exception)
      • "Instance variable of line number XX not initialized" (Warning)
      • "No test cases for line number XX" (Coverage)
  18. Is a line number enough to represent a location?

  19. Location and position

    • A position is represented by 2 numbers:
      • Line number (lineno)
      • Distance from the beginning of the line (column)
  20. Location and position

    • 4 numbers are needed to represent “begin” and “end”.
    • "Code position" is a pair of lineno and column.
    • "Code location" is a pair of begin position and end position.

        1 + 2
        ^ ^ ^
        | | +- @3 (1.4-1.5)
        | +--- @2 (1.2-1.3)
        +----- @1 (1.0-1.1)

        @3 (1.4-1.5)    code location
            1.4         code position (begin): lineno (1), column (4)
            1.5         code position (end)
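The two terms on this slide can be modelled in a few lines of Ruby. This is only a sketch: `CodePosition`, `CodeLocation`, and `cover?` are hypothetical names chosen for illustration, not CRuby's internal types.

```ruby
# Hypothetical Ruby model of "code position" and "code location".
CodePosition = Struct.new(:lineno, :column) do
  include Comparable

  # Order positions by line first, then by column.
  def <=>(other)
    [lineno, column] <=> [other.lineno, other.column]
  end

  def to_s
    "#{lineno}.#{column}"
  end
end

CodeLocation = Struct.new(:beg_pos, :end_pos) do
  # True when `other` lies entirely inside this location.
  def cover?(other)
    beg_pos <= other.beg_pos && other.end_pos <= end_pos
  end

  def to_s
    "(#{beg_pos}-#{end_pos})"
  end
end

# Locations of the first and third tokens in `1 + 2`, as on the slide.
loc1  = CodeLocation.new(CodePosition.new(1, 0), CodePosition.new(1, 1))
loc3  = CodeLocation.new(CodePosition.new(1, 4), CodePosition.new(1, 5))
whole = CodeLocation.new(loc1.beg_pos, loc3.end_pos)

puts loc3                # (1.4-1.5)
puts whole               # (1.0-1.5)
puts whole.cover?(loc3)  # true
```

Making positions comparable is what later allows a "parent covers child" check to be written as two comparisons.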
  21. Location in Ruby

    • Up to Ruby 2.4, Ruby held *only* line numbers.
    • Since Ruby 2.5, Ruby holds line numbers and columns.
    • Today’s main topic is “Column”.
  22. Minor details about column

    • 0-based / 1-based
      • 0-based
      • This varies between programming languages and editors.
    • From the beginning of line / file
      • Line
  23. Minor details about column

    • Byte length / Character length
      • Byte length
    • “構文木に詳細な位置情報をもたせる計画” ("A plan to give the syntax tree detailed location information")
      • https://bugs.ruby-lang.org/projects/ruby-trunk/wiki/Node-position-memo
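The difference between the two ways of counting only shows up with multibyte characters. A minimal sketch, assuming a UTF-8 source line (the line `あ + 1` is a made-up example):

```ruby
line = "\u3042 + 1"  # the source line `あ + 1`, encoded as UTF-8

char_column = line.index("+")                 # column counted in characters
byte_column = line[0...char_column].bytesize  # column counted in bytes

puts char_column  # 2
puts byte_column  # 4
```

With byte-length columns, `+` sits at column 4 here because `あ` occupies three bytes in UTF-8.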
  24. Why code locations are needed

  25. For coverage features

    • For branch coverage and method coverage (Ruby 2.5~).
    • "An introduction and future of Ruby coverage library"
      • http://rubykaigi.org/2017/presentations/mametter.html (30:50-)
  26. What is branch coverage

    • "Branch coverage tells you which branches are executed, and which not." (doc/NEWS-2.5.0)

        (a == 2) ? :t : :f
  27. What is branch coverage

    • You may forget to write test code for the `then` case.
    • `n/m`
      • `n`: How many times the “then clause” is executed.
      • `m`: How many times the “else clause” is executed.

        0/1: (a == 2) ? :t : :f
  29. Use-case (1)

    • Code locations can be used for visualizing branch coverage results.

        0/1: (a == 2) ? :t : :f

    YOU SHOULD WRITE TEST !!!
  31. Use-case (2)

    • One line can contain more than one branch.
    • In these cases, line numbers alone cannot tell us which clause is executed.

        (a == b) ? ((c == d) ? :A : :B) : :C
        obj&.foo? ? "a" : "b"
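Branch coverage can be tried with the standard `coverage` library (the `branches: true` option is available since Ruby 2.5). The sketch below is illustrative only: the temporary file and its two-line script are made up, and each branch and clause key in the result carries the begin and end line and column.

```ruby
require "coverage"
require "tempfile"

# Hypothetical script: `a` is 1, so only the else clause runs.
source = <<~RUBY
  a = 1
  (a == 2) ? :t : :f
RUBY

file = Tempfile.new(["branch_demo", ".rb"])
file.write(source)
file.close

Coverage.start(branches: true)
load file.path
result = Coverage.result
_path, data = result.find { |path, _| File.basename(path) == File.basename(file.path) }

clause_counts = {}
data[:branches].each do |(type, _id, *range), clauses|
  puts "#{type} at #{range.inspect}"
  clauses.each do |(clause, _clause_id, *clause_range), count|
    clause_counts[clause] = count
    puts "  #{clause} at #{clause_range.inspect}: executed #{count} time(s)"
  end
end
```

Because every clause is keyed by its own columns, two branches on the same line remain distinguishable in the result.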
  32. Ruby crash course

  33. How Ruby script is processed

        Step          Input        Output    Debug     Source
        Tokenization  Ruby script  Tokens    --dump=y  parse.y
        Parsing       Tokens       AST       --dump=p  parse.y
        Compile       AST          Bytecode  --dump=i  compile.c

    Ruby script -> [Tokenization] -> Tokens -> [Parsing] -> AST -> [Compile] -> Byte code (insns / ISeq)
  34. Ruby script 1 + 2

  35. Tokenization

    Ruby script -> [Tokenization] -> Tokens -> [Parsing] -> AST -> [Compile] -> Byte code (insns / ISeq)
  36. Tokenization

    • Each token has
      • a token type (tINTEGER)
      • a semantic value (1)

        1 + 2
        ^ ^ ^^
        | | |+--- '\n' / "end-of-input"
        | | +---- tINTEGER (2)
        | +------ '+'
        +-------- tINTEGER (1)
  37. Tokenization

    On Ruby 2.5.1 (the semantic values 1 and 2 are not shown):

        $ ruby --dump=y -e '1 + 2' | grep Shifting
        Shifting token tINTEGER (1.0-1.1: )
        Shifting token '+' (1.2-1.3: )
        Shifting token tINTEGER (1.4-1.5: )
        Shifting token '\n' (1.5-1.5: )
        Shifting token "end-of-input" (1.5-1.5: )
  39. Tokenization

    On Ruby 2.6.0preview1:

        $ ruby --dump=y -e '1 + 2' | grep Shifting
        Shifting token tINTEGER (1.0-1.1: 1)
        Shifting token '+' (1.2-1.3: )
        Shifting token tINTEGER (1.4-1.5: 2)
        Shifting token '\n' (1.5-1.5: )
        Shifting token "end-of-input" (1.5-1.5: )
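A similar view of tokens and their positions is available from Ruby itself through the standard `ripper` library. This is a different tool from `--dump=y` (Ripper event names such as `on_int` are not the parser's token names), so treat it as an approximation of the output above.

```ruby
require "ripper"

# Ripper.lex returns [[lineno, column], event, token, ...] for each token.
raw = Ripper.lex("1 + 2")

tokens = raw.reject { |_pos, event, *_rest| event == :on_sp }.map do |(lineno, column), event, tok, *_rest|
  # The end column is the begin column plus the token's byte length.
  [event, tok, "#{lineno}.#{column}-#{lineno}.#{column + tok.bytesize}"]
end

tokens.each { |event, tok, loc| puts "#{event} #{tok.inspect} (#{loc})" }
# on_int "1" (1.0-1.1)
# on_op "+" (1.2-1.3)
# on_int "2" (1.4-1.5)
```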
  40. r61997 / 46e2fad

  41. Parsing

    Ruby script -> [Tokenization] -> Tokens -> [Parsing] -> AST -> [Compile] -> Byte code (insns / ISeq)
  42. Parsing

    • Analyzes tokens conforming to the rules of Ruby syntax.
    • Builds AST.
  50. Parsing (parse.y)

        %%
        program         : { } top_compstmt

        numeric         : simple_numeric
                        | tUMINUS_NUM simple_numeric %prec tLOWEST
                        ;

        simple_numeric  : tINTEGER
                        | tFLOAT
                        | tRATIONAL
                        | tIMAGINARY
                        ;

    • Rules: `program`, `numeric`, and `simple_numeric`.
    • Tokens: `tINTEGER` (1), `tFLOAT` (2.1), `tRATIONAL` (3r), `tIMAGINARY` (4i).
    • For example, `1` is a `tINTEGER`, which matches `simple_numeric`.
    • Goal: `program`.
  51. Parsing

        $ ruby --dump=y -e '1 + 2'

        # Shifting token tINTEGER (1)
        tINTEGER
        simple_numeric
        numeric
        literal
        primary
        arg
  52. Parsing

        # Shifting token '+'
        arg '+'
        # Shifting token tINTEGER (2)
        arg '+' tINTEGER
        arg '+' simple_numeric
        arg '+' numeric
        arg '+' literal
        arg '+' primary
        arg '+' arg
        arg
        expr
        stmt
        top_stmt
        top_stmts
  53. Parsing

        # Shifting token '\n'
        top_stmts '\n'
        top_stmts term
        top_stmts terms
        top_stmts opt_terms
        top_compstmt
        program
        # Completed
  54. Build AST

        $ ruby --dump=p -e '1 + 2'

        NODE_SCOPE
          NODE_PRELUDE
            NODE_OPCALL (:+)
              NODE_LIT (1)
              NODE_ARRAY
                NODE_LIT (2)
  58. Build AST

        typedef struct RNode {
            VALUE flags;                  /* contains node_type */
            union {
                struct RNode *node;
                ...
            } u1;                         /* u1, u2, u3 contain various data */
            union {
                struct RNode *node;
                ...
            } u2;
            union {
                struct RNode *node;
                ...
            } u3;
            rb_code_location_t nd_loc;    /* contains location information */
        } NODE;
  63. Build AST

    • Builds AST in actions.
    • $1 stands for the value of the 1st component (`arg`).

        arg
            | arg '+' arg
                {
                    $$ = call_bin_op(p, $1, '+', $3, &@2, &@$);    /* action */
                }

        static NODE *
        call_bin_op(struct parser_params *p, NODE *recv, ID id, NODE *arg1,
                    const YYLTYPE *op_loc, const YYLTYPE *loc)
        {
            NODE *expr;
            value_expr(recv);
            value_expr(arg1);
            /* create NODE_OPCALL */
            expr = NEW_OPCALL(recv, id, NEW_LIST(arg1, &arg1->nd_loc), loc);
            nd_set_line(expr, op_loc->beg_pos.lineno);
            return expr;
        }
  64. Build AST

        $ ruby --dump=p -e '1 + 2'

        NODE_SCOPE
          NODE_PRELUDE
            NODE_OPCALL (:+)
              NODE_LIT (1)
              NODE_ARRAY
                NODE_LIT (2)
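On Ruby 2.6 and later, the same tree can be walked from Ruby with `RubyVM::AbstractSyntaxTree`, which is CRuby-specific. A minimal sketch (the helper `dump` is a made-up name); node type names depend on the Ruby version, so they may not match the 2018 dump above exactly:

```ruby
root = RubyVM::AbstractSyntaxTree.parse("1 + 2")

# Print every node with its code location, depth-first, and collect the pairs.
def dump(node, depth = 0, out = [])
  loc = "#{node.first_lineno}.#{node.first_column}-#{node.last_lineno}.#{node.last_column}"
  out << [node.type, loc]
  puts "#{'  ' * depth}#{node.type} (#{loc})"
  node.children.grep(RubyVM::AbstractSyntaxTree::Node).each do |child|
    dump(child, depth + 1, out)
  end
  out
end

nodes = dump(root)
```

The `OPCALL` node for `1 + 2` should report the location `1.0-1.5`, matching `@$` in the grammar action.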
  65. Compile

    Ruby script -> [Tokenization] -> Tokens -> [Parsing] -> AST -> [Compile] -> Byte code (insns / ISeq)
  67. Compile

    • Compiles the AST into byte code.
    • See “compile.c”.

        $ ruby --dump=i -e '1 + 2'
        == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,5)>==============================
        0000 putobject_OP_INT2FIX_O_1_C_                                      (   1)[Li]
        0001 putobject        2
        0003 opt_plus         <callinfo!mid:+, argc:1, ARGS_SIMPLE>, <callcache>
        0006 leave

    ISeq: 4 insn(s)
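The same disassembly can be produced from inside Ruby with `RubyVM::InstructionSequence`, which is CRuby-specific. The exact instructions and formatting differ between Ruby versions, so the output will not match the 2018 listing line for line.

```ruby
# Compile a source string to an instruction sequence and disassemble it.
iseq = RubyVM::InstructionSequence.compile("1 + 2")
disasm = iseq.disasm

puts disasm
```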
  68. References

    • "Ruby Hacking Guide" (Part 2: Syntax analysis)
      • https://ruby-hacking-guide.github.io/ [EN]
      • http://i.loveruby.net/ja/rhg/book/ [JA]
    • "Ruby Under a Microscope" / "Rubyのしくみ"
  69. How to implement code locations

  70. Goal

    • Branch coverage
      • To pass code locations to compile phase.
    • Method coverage
      • To store code locations on ISeq.
    • What should we implement
      • Embed code locations into each NODE.
  71. Hint

    • The original source of location information is the Ruby script.
    • To use location information in step n, step n-1 must provide it.
    • So, to use location information in the compile phase, we need to pass it all the way from "Tokenization" to "Compile".

    Ruby script -> [Tokenization] -> Tokens -> [Parsing] -> AST -> [Compile] -> Byte code (insns / ISeq)
  72. parser_params crash course

    • One of the main data structures of the parser.
    • Too Big!!!

        struct parser_params {
            rb_imemo_tmpbuf_t *heap;

            YYSTYPE *lval;

            struct {
                rb_strterm_t *strterm;
                VALUE (*gets)(struct parser_params*,VALUE);
                VALUE input;
                VALUE prevline;
                VALUE lastline;
                VALUE nextline;
                const char *pbeg;
                const char *pcur;
                const char *pend;
                const char *ptok;
                long gets_ptr;
                enum lex_state_e state;
                /* track the nest level of any parens "()[]{}" */
                int paren_nest;
                /* keep p->lex.paren_nest at the beginning of lambda "->"
                   to detect tLAMBEG and keyword_do_LAMBDA */
                int lpar_beg;
                /* track the nest level of only braces "{}" */
                int brace_nest;
            } lex;
            stack_type cond_stack;
            stack_type cmdarg_stack;
            int tokidx;
            int toksiz;
            int tokline;
            int heredoc_end;
            int heredoc_indent;
            int heredoc_line_indent;
            char *tokenbuf;
            struct local_vars *lvtbl;
            int line_count;
            int ruby_sourceline;        /* current line no. */
            char *ruby_sourcefile;      /* current source file */
            VALUE ruby_sourcefile_string;
            rb_encoding *enc;
            token_info *token_info;
            VALUE compile_option;

            VALUE debug_buffer;
            VALUE debug_output;

            ID cur_arg;

            rb_ast_t *ast;

            unsigned int command_start:1;
            unsigned int eofp: 1;
            unsigned int ruby__end__seen: 1;
            unsigned int debug: 1;
            unsigned int has_shebang: 1;
            unsigned int in_defined: 1;
            unsigned int in_main: 1;
            unsigned int in_kwarg: 1;
            unsigned int in_def: 1;
            unsigned int in_class: 1;
            unsigned int token_seen: 1;
            unsigned int token_info_enabled: 1;
        # if WARN_PAST_SCOPE
            unsigned int past_scope_enabled: 1;
        # endif
            unsigned int error_p: 1;
            unsigned int cr_seen: 1;

        #ifndef RIPPER
            /* Ruby core only */

            unsigned int do_print: 1;
            unsigned int do_loop: 1;
            unsigned int do_chomp: 1;
            unsigned int do_split: 1;
            unsigned int warn_location: 1;

            NODE *eval_tree_begin;
            NODE *eval_tree;
            VALUE error_buffer;
            VALUE debug_lines;
            VALUE coverage;
            const struct rb_block *base_block;
        #else
            /* Ripper only */

            VALUE delayed;
            int delayed_line;
            int delayed_col;

            VALUE value;
            VALUE result;
            VALUE parsing_thread;
        #endif
        };
  73. parser_params crash course

    • It has struct for lexer (`lex`).
    • Lexer processes input in units of lines.
    • *Basically* processes from top to bottom.

        struct parser_params {
            ...
            struct {
                ...
                VALUE prevline;       /* lines */
                VALUE lastline;
                VALUE nextline;
                const char *pbeg;     /* pointers */
                const char *pcur;
                const char *pend;
                const char *ptok;
                ...
            } lex;
            ...
        };
  74. What is column

        /* parse.y */
        /*
          Structure of Lexer Buffer:

          lex.pbeg     lex.ptok     lex.pcur     lex.pend
             |            |            |            |
             |------------+------------+------------|
                          |<---------->|
                              token
        */
  75. What is column

    • When token '+' is recognized:

        1 + 2
        ^ ^^ ^
        | || +--- lex.pend
        | |+----- lex.pcur
        | +------ lex.ptok
        +-------- lex.pbeg

    • When token tINTEGER (2) is recognized:

        1 + 2
        ^   ^^
        |   |+--- lex.pcur, lex.pend
        |   +---- lex.ptok
        +-------- lex.pbeg
  76. What is column

    • `lex.ptok - lex.pbeg` (begin) and `lex.pcur - lex.pbeg` (end).
    • Column is a difference between pointers when a token is recognized.

        |--| lex.pcur - lex.pbeg (end)
        |-|  lex.ptok - lex.pbeg (begin)
        1 + 2
        ^ ^^ ^
        | || +--- lex.pend
        | |+----- lex.pcur
        | +------ lex.ptok
        +-------- lex.pbeg
  77. What is column

    • We must store the columns somewhere before the next token is recognized, because the pointers then move on.

        1 + 2
        ^ ^^ ^
        | || +--- lex.pend
        | |+----- lex.pcur
        | +------ lex.ptok
        +-------- lex.pbeg

        1 + 2
        ^   ^^
        |   |+--- lex.pcur, lex.pend
        |   +---- lex.ptok
        +-------- lex.pbeg
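The pointer arithmetic can be imitated in Ruby with the standard `strscan` library. This toy lexer is only a sketch, not how parse.y is written: `lex_line` is a made-up helper, and `scanner.pos` stands in for the distance of `lex.ptok` or `lex.pcur` from `lex.pbeg`.

```ruby
require "strscan"

# Toy lexer for a single line of integers and arithmetic operators.
def lex_line(line, lineno = 1)
  scanner = StringScanner.new(line)
  tokens = []
  until scanner.eos?
    next if scanner.skip(/[ \t]+/)  # whitespace is not a token

    ptok = scanner.pos              # begin column: lex.ptok - lex.pbeg
    type =
      if scanner.scan(/\d+/) then :tINTEGER
      elsif scanner.scan(%r{[-+*/]}) then scanner.matched
      else raise ArgumentError, "unexpected input at column #{ptok}"
      end
    pcur = scanner.pos              # end column: lex.pcur - lex.pbeg

    # Save both columns now; scanning the next token moves the position on.
    tokens << [type, "#{lineno}.#{ptok}-#{lineno}.#{pcur}"]
  end
  tokens
end

lexed = lex_line("1 + 2")
p lexed
# [[:tINTEGER, "1.0-1.1"], ["+", "1.2-1.3"], [:tINTEGER, "1.4-1.5"]]
```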
  78. From Ruby script to tokens

    • Copy location information to `YYLTYPE *yylloc` in `yylex`.
      • The `yylloc` argument is newly added to `yylex`.
      • Call `RUBY_SET_YYLLOC` to set `yylloc`.
  82. yylex

        static enum yytokentype
        yylex(YYSTYPE *lval, YYLTYPE *yylloc, struct parser_params *p)  /* yylloc: new argument */
        {
            enum yytokentype t;

            p->lval = lval;
            lval->val = Qundef;
            t = parser_yylex(p);                                        /* create token */
            if (has_delayed_token(p))
                dispatch_delayed_token(p, t);
            else if (t != 0)
                dispatch_scan_event(p, t);

            /* set `yylloc` */
            if (p->lex.strterm && (p->lex.strterm->flags & STRTERM_HEREDOC))
                RUBY_SET_YYLLOC_FROM_STRTERM_HEREDOC(*yylloc);
            else
                RUBY_SET_YYLLOC(*yylloc);

            return t;
        }
  83. From tokens to Nodes

    • Now we can use `@n` in each action.
      • `@n` stands for the location of the nth component of the right hand side.
      • `@$` stands for the location of the left hand side grouping (`YYLTYPE yyloc`).
        • Set by `YYLLOC_DEFAULT`.
    • https://www.gnu.org/software/bison/manual/html_node/Tracking-Locations.html#Tracking-Locations
  87. `@$` and `@n` for `1 + 2`

        |---| @$ (1.0-1.5)
        1 + 2
        ^ ^ ^
        | | +- @3 (1.4-1.5)
        | +--- @2 (1.2-1.3)
        +----- @1 (1.0-1.1)

        arg
            | arg '+' arg
                {
                    $$ = call_bin_op(p, $1, '+', $3, &@2, &@$);    /* (1) &@$ is (1.0-1.5) */
                }

        static NODE *
        call_bin_op(struct parser_params *p, NODE *recv, ID id, NODE *arg1,
                    const YYLTYPE *op_loc, const YYLTYPE *loc)
        {
            ...
            expr = NEW_OPCALL(recv, id, NEW_LIST(arg1, &arg1->nd_loc), loc);    /* (2) */
            ...
        }
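Bison's default rule for `@$` is to span from the beginning of the first right-hand-side component to the end of the last one. A Ruby rendering of that rule, as a sketch only (`Location` and `default_location` are hypothetical names, and the empty-rule case is left out):

```ruby
Location = Struct.new(:beg_lineno, :beg_column, :end_lineno, :end_column) do
  def to_s
    "(#{beg_lineno}.#{beg_column}-#{end_lineno}.#{end_column})"
  end
end

# Begin of the first component, end of the last component.
def default_location(rhs)
  first = rhs.first
  last = rhs.last
  Location.new(first.beg_lineno, first.beg_column, last.end_lineno, last.end_column)
end

# @1, @2, @3 for `1 + 2`.
rhs = [
  Location.new(1, 0, 1, 1),
  Location.new(1, 2, 1, 3),
  Location.new(1, 4, 1, 5),
]

merged = default_location(rhs)
puts merged  # (1.0-1.5)
```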
  88. `NODE_OPCALL` is simple

    • All the location information needed for NODE_OPCALL is available when NODE_OPCALL is created.
      • NODE_LIT (1)
      • mid (:+)
      • NODE_ARRAY (2)

        NODE_OPCALL   1 + 2
        -----------   ------------
        arg         : arg '+' arg
  89. `NODE_*ASGN` family is not simple

    • `NODE_LASGN` (local variable assignment) or `NODE_IASGN` (instance variable assignment).

        @a = 1

        NODE_SCOPE
          NODE_IASGN (:@a)
            NODE_LIT (1)
  93. NODE_IASGN

        NODE_IASGN    @a
        ----------    -------------
        lhs         : user_variable
                        {
                        /*%%%*/
                            $$ = assignable(p, $1, 0, &@$);
                        /*%
                        %*/
                        }

        NODE_IASGN    NODE_IASGN = 1
        ----------    ------------------------
        arg         : lhs '=' arg_rhs
                        {
                        /*%%%*/
                            $$ = node_assign(p, $1, $3, &@$);
                        /*%
                        %*/
                        /*% ripper: assign!($1, $3) %*/
                        }

    • First action: create NODE_IASGN.
    • Second action: the location is determined, so update the location.
  94. `NODE_*ASGN` family is not simple

    • Create `NODE_*ASGN`.
    • Assign the right hand side.
    • So the location of `NODE_*ASGN` must be updated when the right hand side is assigned.
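The two-step life of an assignment node can be mimicked in Ruby. This is a loose analogy rather than the C implementation: `AsgnNode` is a made-up struct, the two helpers only borrow the names `assignable` and `node_assign`, and locations are plain strings.

```ruby
AsgnNode = Struct.new(:type, :name, :value, :loc)

# Step 1 (rule `lhs : user_variable`): only `@a` has been seen so far.
def assignable(name, loc)
  AsgnNode.new(:IASGN, name, nil, loc)
end

# Step 2 (rule `arg : lhs '=' arg_rhs`): attach the value and widen the location.
def node_assign(lhs, rhs, loc)
  lhs.value = rhs
  lhs.loc = loc
  lhs
end

node = assignable(:@a, "1.0-1.2")  # location of `@a`
before = node.loc
node_assign(node, 1, "1.0-1.6")    # location of `@a = 1`

puts "#{node.type} #{before} -> #{node.loc}"  # IASGN 1.0-1.2 -> 1.0-1.6
```

Without the second step, the node would keep the location of `@a` alone.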
  95. `NODE_ITER` is not simple

    • `NODE_ITER` (method call with block).

        3.times { foo }

        NODE_ITER
          NODE_CALL (:times)
            NODE_LIT (3)
          NODE_SCOPE
            NODE_VCALL (:foo)
  98. NODE_CALL and NODE_ITER

        NODE_CALL     3 . times
        -----------   --------------------------------------------------
        method_call | primary_value call_op operation2 opt_paren_args
                        {
                        }

        NODE_ITER     { NODE_ITER }
        -----------   ---------------------
        brace_block : '{' brace_body '}'
                        {
                        }

        NODE_ITER     NODE_CALL NODE_ITER
        ---------     -----------------------
        primary     | method_call brace_block
                        {
                        /*%%%*/
                            block_dup_check(p, $1->nd_args, $2);
                            $$ = method_add_block(p, $1, $2, &@$);
                        /*%
                        %*/
                        }

    • `method_call` is `3.times` and `brace_block` is `{ foo }`.
    • `method_add_block`: update location.
  99. `NODE_ITER` is not simple

    • `NODE_CALL` is created.
    • `NODE_ITER` is created.
    • `NODE_ITER` is added to `NODE_CALL`.
    • The location of `NODE_ITER` must be updated when it is passed to `NODE_CALL`.
  105. $ git shortlog -s -n parse.y | head -10

           884  nobu
           362  matz
           133  mame
            88  yui-knk    <- Me
            55  ko1
            38  aamine
            37  akr
            33  naruse
            25  usa
             7  normal

    On "v2_6_0_preview1".
    • nobu: x 10
    • @nobu is the lord of Demon Castle "parse.y".
  106. How to test

    • Define some rules and check that all Ruby files in the "test" directory follow them.
    • Related files:
      • "ext/-test-/ast/ast.c"
      • "test/-ext-/ast/test_ast.rb"

    On "v2_6_0_preview1".
  107. "ext/-test-" and "test/-ext-"

    • "ext/-test-" contains C extensions which are used in Ruby's tests.
    • "ext/-test-/ast/ast.c" defines `AST` module and `AST::Node` class.

    On "v2_6_0_preview1".
  108. "ext/-test-" and "test/-ext-"

    • "test/-ext-" contains test cases which depend on "ext/-test-".

    On "v2_6_0_preview1".
  110. Rule 1

    • `lineno` is initialized with `0` and `column` with `-1`.
    • Validate that every node location is updated at least once.

        NODE_IF (line: 1, location: (0,-1)-(0,-1))
  115. Rule 2

    • Validate that children do not exceed their parent's location.

        3.times { foo }

        NODE_ITER [1.0-1.15]                -> covers [1.0-1.8] and [1.8-1.15]
          NODE_CALL (:times) [1.0-1.8]      -> covers [1.0-1.1]
            NODE_LIT (3) [1.0-1.1]
          NODE_SCOPE [1.8-1.15]             -> covers [1.10-1.13]
            NODE_VCALL (:foo) [1.10-1.13]
  118. test/-ext-/ast/test_ast.rb

        # Check all ruby files in "test" directory
        Dir.glob("test/**/*.rb", base: SRCDIR).each do |path|
          define_method("test_ranges:#{path}") do
            helper = Helper.new("#{SRCDIR}/#{path}")
            helper.validate_range              # validate each file

            assert_equal([], helper.errors)
          end
        end
  119. def validate_range0(node) beg_pos, end_pos = node.beg_pos, node.end_pos children = node.children.compact

    min = children.map(&:beg_pos).min max = children.map(&:end_pos).max unless beg_pos <= min @errors << { type: :min_validation_error, min: min, beg_pos: beg_pos, node: node } end unless max <= end_pos @errors << { type: :max_validation_error, max: max, end_pos: end_pos, node: node } end children.each do |child| validate_range0(child) end end ast = AST.parse_file(@path) validate_not_cared0(ast) test/-ext-/ast/test_ast.rb
  120. def validate_range0(node) beg_pos, end_pos = node.beg_pos, node.end_pos children = node.children.compact

    min = children.map(&:beg_pos).min max = children.map(&:end_pos).max unless beg_pos <= min @errors << { type: :min_validation_error, min: min, beg_pos: beg_pos, node: node } end unless max <= end_pos @errors << { type: :max_validation_error, max: max, end_pos: end_pos, node: node } end children.each do |child| validate_range0(child) end end ast = AST.parse_file(@path) validate_not_cared0(ast) Generate AST test/-ext-/ast/test_ast.rb
  121. def validate_range0(node) beg_pos, end_pos = node.beg_pos, node.end_pos children = node.children.compact

    min = children.map(&:beg_pos).min max = children.map(&:end_pos).max unless beg_pos <= min @errors << { type: :min_validation_error, min: min, beg_pos: beg_pos, node: node } end unless max <= end_pos @errors << { type: :max_validation_error, max: max, end_pos: end_pos, node: node } end children.each do |child| validate_range0(child) end end ast = AST.parse_file(@path) validate_not_cared0(ast) Generate AST test/-ext-/ast/test_ast.rb Check ranges
  122. test/-ext-/ast/test_ast.rb (annotations: Generate AST; Check ranges; Check children)

    def validate_range0(node)
      beg_pos, end_pos = node.beg_pos, node.end_pos
      children = node.children.compact

      min = children.map(&:beg_pos).min
      max = children.map(&:end_pos).max

      unless beg_pos <= min
        @errors << { type: :min_validation_error, min: min, beg_pos: beg_pos, node: node }
      end

      unless max <= end_pos
        @errors << { type: :max_validation_error, max: max, end_pos: end_pos, node: node }
      end

      children.each do |child|
        validate_range0(child)
      end
    end

    ast = AST.parse_file(@path)
    validate_not_cared0(ast)
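The range check above can be tried without a Ruby build that ships the AST module. The following is a minimal sketch under stated assumptions: `RangeNode`, `validate_range`, `GOOD_TREE`, and `BAD_TREE` are hypothetical names, positions are plain `[line, column]` arrays, and leaf nodes are skipped explicitly. It is not the code in test_ast.rb.

```ruby
# Hypothetical stand-in for an AST node; positions are [line, column] pairs.
RangeNode = Struct.new(:type, :beg_pos, :end_pos, :children)

# Collect an error for every node whose range does not enclose its children.
def validate_range(node, errors = [])
  children = node.children.compact
  return errors if children.empty? # a leaf has nothing to enclose

  # Array#<=> compares the line first, then the column.
  min = children.map(&:beg_pos).min
  max = children.map(&:end_pos).max

  if (node.beg_pos <=> min) > 0
    errors << { type: :min_validation_error, node: node.type }
  end
  if (max <=> node.end_pos) > 0
    errors << { type: :max_validation_error, node: node.type }
  end

  children.each { |child| validate_range(child, errors) }
  errors
end

# "1 + 2": the call spans (1,0)-(1,5) and encloses both operands.
GOOD_TREE = RangeNode.new(:OPCALL, [1, 0], [1, 5], [
  RangeNode.new(:LIT, [1, 0], [1, 1], []),
  RangeNode.new(:LIT, [1, 4], [1, 5], []),
])

# Same operands, but the parent range stops too early at column 3.
BAD_TREE = RangeNode.new(:OPCALL, [1, 0], [1, 3], GOOD_TREE.children)

p validate_range(GOOD_TREE)                      # => []
p validate_range(BAD_TREE).map { |e| e[:type] }  # => [:max_validation_error]
```

Returning the error list instead of raising keeps one bad node from hiding the others in the same file.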
  123. The future plan of code locations feature

  124. Case 1 (Proc/Method) • Add new methods to Proc/Method which return their code location.

    def a(&block)
      p block.code_location
    end

    a do
      1 + 2
    end
    # => [[5, 2], [7, 3]]

    p self.class.instance_method(:a).code_location
    # => [[1, 0], [3, 3]]

    https://github.com/yui-knk/ruby/tree/feature/rb_iseq_code_location
  128. Case 2 (NoMethodError) • Give `NoMethodError` a more detailed message.

    class A
      def foo
        nil
      end
    end

    A.new.foo.foo

    Traceback (most recent call last):
    /tmp/test.rb:7:in `<main>': undefined method `foo' for nil:NilClass (NoMethodError)

    A.new.foo.foo
             ^^^^

    https://github.com/yui-knk/ruby/tree/feature/node_id
  129. Case 3 (AST module)

    AST.parse("1 + 2")
    # => #<AST::Node(NODE_SCOPE(0) 1:0, 1:5 (4)): >

    AST.parse("1 + 2").children[1]
    # => #<AST::Node(NODE_OPCALL(36) 1:0, 1:5 (3)): >

    AST.parse("1 + 2").children[1].children
    # => [#<AST::Node(NODE_LIT(59) 1:0, 1:1 (0)): >, #<AST::Node(NODE_ARRAY(42) 1:4, 1:5 (2)): >]
  131. • We discussed this topic at Developers Meeting yesterday.

  132. Committed

  133. Conference Driven Development !!!

  136. Case 3 (AST module) • We can get child nodes.

    RubyVM::AST.parse("1 + 2")
    # => #<RubyVM::AST::Node(NODE_SCOPE(0) 1:0, 1:5): >

    RubyVM::AST.parse("1 + 2").children[1]
    # => #<RubyVM::AST::Node(NODE_OPCALL(36) 1:0, 1:5): >

    RubyVM::AST.parse("1 + 2").children[1].children
    # => [#<RubyVM::AST::Node(NODE_LIT(59) 1:0, 1:1): >, #<RubyVM::AST::Node(NODE_ARRAY(42) 1:4, 1:5): >]
  137. Case 3 (AST module) • We can get location information.

    [RubyVM::AST.parse("1 + 2").first_lineno, RubyVM::AST.parse("1 + 2").first_column]
    # => [1, 0]

    [RubyVM::AST.parse("1 + 2").last_lineno, RubyVM::AST.parse("1 + 2").last_column]
    # => [1, 5]
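The four accessors can be gathered into one begin/end pair. This is a small sketch, not part of the talk: `location_of` is a hypothetical helper, and it assumes CRuby 2.6 or later, where the released module is named `RubyVM::AbstractSyntaxTree` rather than the preview name `RubyVM::AST` used on the slides. The guard keeps the script harmless on interpreters without the module.

```ruby
# Hypothetical helper: collect a node's location as [[line, col], [line, col]].
def location_of(node)
  [[node.first_lineno, node.first_column],
   [node.last_lineno, node.last_column]]
end

# Assumption: the released name RubyVM::AbstractSyntaxTree (CRuby 2.6+).
if defined?(RubyVM::AbstractSyntaxTree)
  root = RubyVM::AbstractSyntaxTree.parse("1 + 2")
  p location_of(root)  # => [[1, 0], [1, 5]]
end
```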
  138. Enjoy programming with Ruby 2.6.0-preview2!

  139. Conclusion

  140. Acknowledgments • @mametter • @nobu • @ko1 • @shyouhei • @takeshinoda • @hkdnet • @HaiTo • @littlestarling
  141. Conclusion

    • AST nodes have location information.
    • I shared the future plans for the code locations feature.
    • If you have any ideas for using location information, please let me know :)
    • https://bugs.ruby-lang.org/
    • You now have the map of the Demon Castle "parse.y"; let's hack "parse.y" :)
  142. Thank you!!!

  143. Bonus track

  144. How to implement more detailed message of `NoMethodError`

  145. Target code

    class A
      def foo
        nil
      end
    end

    A.new.foo.foo

  146. node_id • Add unique id (per file), “node_id”, to Node.

    # @ NODE_CALL (line: 7, location: (7,0)-(7,13))* 13
    # +- nd_mid: :foo
    # +- nd_recv:
    # |   @ NODE_CALL (line: 7, location: (7,0)-(7,9)) 12
    # |   +- nd_mid: :foo
    # |   +- nd_recv:
    # |   |   @ NODE_CALL (line: 7, location: (7,0)-(7,5)) 11
    # |   |   +- nd_mid: :new
    # |   |   +- nd_recv:
    # |   |   |   @ NODE_CONST (line: 7, location: (7,0)-(7,1)) 10
    # |   |   |   +- nd_vid: :A
    # |   |   +- nd_args:
    # |   |       (null node)
    # |   +- nd_args:
    # |       (null node)
    # +- nd_args:
    #     (null node)
  148. node_id • Store node_id of insn on an ISeq. • We can distinguish between `#foo`s by node_id.

    == disasm: #<ISeq:<main>@src/no_method_error2.rb:1 (1,0)-(7,13)> (catch: FALSE)
    0000 putspecialobject 3 ( 1)[ 0][Li]
    0002 putnil [ 9]
    0003 defineclass :A, <class:A>, 0
    0007 pop
    0008 getinlinecache 15, <is:0> ( 7)[ 10][Li]
    0011 getconstant :A
    0013 setinlinecache <is:0>
    0015 opt_send_without_block <callinfo!mid:new, argc:0, ARGS_SIMPLE>, <callcache>[ 11]
    0018 opt_send_without_block <callinfo!mid:foo, argc:0, ARGS_SIMPLE>, <callcache>[ 12]
    0021 opt_send_without_block <callinfo!mid:foo, argc:0, ARGS_SIMPLE>, <callcache>[ 13]
    0024 leave [ 10]
  149. Exceptions • Exceptions have an ISeq and a program counter (pc).

    == disasm: #<ISeq:<main>@src/no_method_error2.rb:1 (1,0)-(7,13)> (catch: FALSE)
    0000 putspecialobject 3 ( 1)[ 0][Li]
    0002 putnil [ 9]
    0003 defineclass :A, <class:A>, 0
    0007 pop
    0008 getinlinecache 15, <is:0> ( 7)[ 10][Li]
    0011 getconstant :A
    0013 setinlinecache <is:0>
    0015 opt_send_without_block <callinfo!mid:new, argc:0, ARGS_SIMPLE>, <callcache>[ 11]
    0018 opt_send_without_block <callinfo!mid:foo, argc:0, ARGS_SIMPLE>, <callcache>[ 12]
    0021 opt_send_without_block <callinfo!mid:foo, argc:0, ARGS_SIMPLE>, <callcache>[ 13]
    0024 leave [ 10]
  152. Exceptions • Get node_id from an exception. (annotations: Exception; Get node_id (13))

    == disasm: #<ISeq:<main>@src/no_method_error2.rb:1 (1,0)-(7,13)> (catch: FALSE)
    0000 putspecialobject 3 ( 1)[ 0][Li]
    0002 putnil [ 9]
    0003 defineclass :A, <class:A>, 0
    0007 pop
    0008 getinlinecache 15, <is:0> ( 7)[ 10][Li]
    0011 getconstant :A
    0013 setinlinecache <is:0>
    0015 opt_send_without_block <callinfo!mid:new, argc:0, ARGS_SIMPLE>, <callcache>[ 11]
    0018 opt_send_without_block <callinfo!mid:foo, argc:0, ARGS_SIMPLE>, <callcache>[ 12]
    0021 opt_send_without_block <callinfo!mid:foo, argc:0, ARGS_SIMPLE>, <callcache>[ 13]
    0024 leave [ 10]
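Why the pc is enough to pick the right `#foo` can be shown with a toy table. The sketch below is an illustration only: `Insn`, `SEND_INSNS`, and `node_id_at` are hypothetical names, and the table copies just the three send instructions and their node_ids from the disassembly. It does not reflect how CRuby stores this data internally.

```ruby
# Hypothetical, simplified view of the three send instructions:
# program counter, method id, and the node_id stored with each.
Insn = Struct.new(:pc, :mid, :node_id)

SEND_INSNS = [
  Insn.new(15, :new, 11),
  Insn.new(18, :foo, 12),
  Insn.new(21, :foo, 13),
].freeze

# Look up the node_id recorded for the instruction at the given pc.
def node_id_at(insns, pc)
  insn = insns.find { |i| i.pc == pc }
  insn && insn.node_id
end

# The method name alone is ambiguous: two instructions call :foo.
p SEND_INSNS.count { |i| i.mid == :foo }  # => 2

# The pc of the failing instruction selects exactly one node_id.
p node_id_at(SEND_INSNS, 21)  # => 13
```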
  153. • Parse the source code file and find Node by node_id.

    # @ NODE_CALL (line: 7, location: (7,0)-(7,13))* 13
    # +- nd_mid: :foo
    # +- nd_recv:
    # |   @ NODE_CALL (line: 7, location: (7,0)-(7,9)) 12
    # |   +- nd_mid: :foo
    # |   +- nd_recv:
    # |   |   @ NODE_CALL (line: 7, location: (7,0)-(7,5)) 11
    # |   |   +- nd_mid: :new
    # |   |   +- nd_recv:
    # |   |   |   @ NODE_CONST (line: 7, location: (7,0)-(7,1)) 10
    # |   |   |   +- nd_vid: :A
    # |   |   +- nd_args:
    # |   |       (null node)
    # |   +- nd_args:
    # |       (null node)
    # +- nd_args:
    #     (null node)
  156. • Get location of Node.

    A.new.foo.foo

    NODE_CALL (:foo) (line: 7, location: (7,0)-(7,13))* 13
    NODE_CALL (:foo) (line: 7, location: (7,0)-(7,9)) 12
    NODE_CALL (:new) (line: 7, location: (7,0)-(7,5)) 11
    NODE_CONST (:A) (line: 7, location: (7,0)-(7,1)) 10
  158. • Get location of Node.

    A.new.foo.foo

    NODE_CALL (:foo) (line: 7, location: (7,0)-(7,13))* 13
    NODE_CALL (:foo) (line: 7, location: (7,0)-(7,9)) 12
    NODE_CALL (:new) (line: 7, location: (7,0)-(7,5)) 11
    NODE_CONST (:A) (line: 7, location: (7,0)-(7,1)) 10

    A.new.foo
  160. • Build an error message.

    A.new.foo.foo

    NODE_CALL (:foo) (line: 7, location: (7,0)-(7,13))* 13
    NODE_CALL (:foo) (line: 7, location: (7,0)-(7,9)) 12
    NODE_CALL (:new) (line: 7, location: (7,0)-(7,5)) 11
    NODE_CONST (:A) (line: 7, location: (7,0)-(7,1)) 10

    A.new.foo .foo
              ^^^^
  161. How to implement more detailed message of `NoMethodError`

    • Add unique id (per file), “node_id”, to Node.
    • Store node_id of insn on an ISeq.
    • Get node_id from an exception.
    • Parse the source code file and find Node by node_id.
    • Get location of Node.
    • Build an error message.
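The last three steps (find the Node by node_id, get its location, build the message) can be sketched on a toy tree. Everything named here is hypothetical: `CallNode`, `find_node`, `underline_call`, and `CALL_TREE` are illustration-only stand-ins that reuse the columns and node_ids from the dump of `A.new.foo.foo`. The feature/node_id branch may well do this differently.

```ruby
# Hypothetical stand-in for a call node: only the fields these steps need.
CallNode = Struct.new(:node_id, :mid, :first_column, :last_column, :recv)

# Walk down the receiver chain until the node with the given node_id is found.
def find_node(node, node_id)
  return nil if node.nil?
  return node if node.node_id == node_id
  find_node(node.recv, node_id)
end

# Underline the part of the call that follows its receiver.
def underline_call(source_line, node)
  start = node.recv ? node.recv.last_column : node.first_column
  width = node.last_column - start
  "#{source_line}\n#{' ' * start}#{'^' * width}"
end

# A.new.foo.foo with the columns and node_ids from the node dump.
CALL_TREE =
  CallNode.new(13, :foo, 0, 13,
    CallNode.new(12, :foo, 0, 9,
      CallNode.new(11, :new, 0, 5,
        CallNode.new(10, :A, 0, 1, nil))))

puts underline_call("A.new.foo.foo", find_node(CALL_TREE, 13))
# A.new.foo.foo
#          ^^^^
```

Subtracting the receiver's end column from the call's end column is what isolates `.foo` from `A.new.foo`; with node_id 12 the same helper underlines the first `.foo` instead.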
  162. Thank you!!!