RNode with code locations

yui-knk
June 01, 2018


Transcript

  1. RNode with code locations
    Jun 1, 2018 at RubyKaigi 2018
    @yui-knk
    Yuichiro Kaneko


  4. Self-introduction
    • Yuichiro Kaneko

    • Asakusa.rb

    • A CRuby Committer (2015/12~)

    • GitHub (yui-knk)

    • Twitter (spikeolaf)


  5. I will join Treasure Data next week!!!

  6. Today's topic
    • RNode

    • Node of Abstract Syntax Tree

    • Code location

    • Location information of RNode


  7. Run the code
    • Run the code on …

    • Ruby 2.4

    • Ruby 2.5
    $ ruby --dump=p -e '"str".upcase'


  8. Ruby 2.4
    $ ruby --dump=p -e '"str".upcase'
    # @ NODE_SCOPE (line: 1)
    # +- nd_tbl: (empty)
    # +- nd_args:
    # |   (null node)
    # +- nd_body:
    #     @ NODE_PRELUDE (line: 1)
    #     +- nd_head:
    #     |   (null node)
    #     +- nd_body:
    #     |   @ NODE_CALL (line: 1)
    #     |   +- nd_mid: :upcase
    #     |   +- nd_recv:
    #     |   |   @ NODE_STR (line: 1)
    #     |   |   +- nd_lit: "str"
    #     |   +- nd_args:
    #     |       (null node)
    #     +- nd_compile_option:
    #         +- coverage_enabled: false

  9. Ruby 2.5
    $ ruby --dump=p -e '"str".upcase'
    # @ NODE_SCOPE (line: 1, code_range: (1,0)-(1,12))
    # +- nd_tbl: (empty)
    # +- nd_args:
    # |   (null node)
    # +- nd_body:
    #     @ NODE_PRELUDE (line: 1, code_range: (1,0)-(1,12))
    #     +- nd_head:
    #     |   (null node)
    #     +- nd_body:
    #     |   @ NODE_CALL (line: 1, code_range: (1,0)-(1,12))
    #     |   +- nd_mid: :upcase
    #     |   +- nd_recv:
    #     |   |   @ NODE_STR (line: 1, code_range: (1,0)-(1,5))
    #     |   |   +- nd_lit: "str"
    #     |   +- nd_args:
    #     |       (null node)
    #     +- nd_compile_option:
    #         +- coverage_enabled: false

  11. @@ -1,8 +1,8 @@
     # +- nd_body:
    -# | @ NODE_CALL (line: 1)
    +# | @ NODE_CALL (line: 1, code_range: (1,0)-(1,12))
     # | +- nd_mid: :upcase
     # | +- nd_recv:
    -# | | @ NODE_STR (line: 1)
    +# | | @ NODE_STR (line: 1, code_range: (1,0)-(1,5))
     # | | +- nd_lit: "str"
     # | +- nd_args:
     # | (null node)
    Today’s Topic: the `code_range` added on the `+` lines.

  12. Agenda
    • What code locations are

    • Why code locations are needed

    • Ruby crash course

    • How to implement code locations

    • The future plan for the code locations feature

    • Conclusion


  13. What code locations are


  14. Location information in programming
    • Location information of a script is used in various situations.

  15. Exception
    Traceback (most recent call last):
            1: from src/exception.rb:5:in `<main>'
    src/exception.rb:2:in `a': undefined method `foo' for "":String (NoMethodError)

  16. Warning
    src/warning.rb:2: warning: instance variable @a not initialized


  17. Location information in programming
    • Location information of a script is used in various situations.

    • "An exception is raised from line number XX" (Exception)

    • "Instance variable of line number XX not initialized" (Warning)

    • "No test cases for line number XX" (Coverage)

  18. Is a line number enough to represent a location?

  19. Location and position
    • Location is represented by 2 numbers:

    • Line number (lineno)

    • Distance from beginning of line (column)


  20. Location and position
    • 4 numbers are needed to represent “begin” and “end”.

    • "Code position" is a pair of lineno and column.

    • "Code location" is a pair of begin position and end position.
    1 + 2
    ^ ^ ^
    | | +- @3 (1.4-1.5)
    | +--- @2 (1.2-1.3)
    +----- @1 (1.0-1.1)

    @3 (1.4-1.5)
        Code location:         1.4-1.5
        Code position (begin): 1.4 = Lineno (1), Column (4)
        Code position (end):   1.5
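The two definitions above can be sketched as plain Ruby value objects. This is only an illustration: `CodePosition` and `CodeLocation` are hypothetical names chosen here, not classes that Ruby provides, and the `lineno.column` print format simply mirrors the notation used on the slide.

```ruby
# Hypothetical value objects; Ruby itself does not define these classes.
CodePosition = Struct.new(:lineno, :column) do
  def to_s
    "#{lineno}.#{column}"
  end
end

CodeLocation = Struct.new(:beg_pos, :end_pos) do
  def to_s
    "#{beg_pos}-#{end_pos}"
  end
end

# The three tokens of `1 + 2`, with 0-based columns as in the diagram.
at1 = CodeLocation.new(CodePosition.new(1, 0), CodePosition.new(1, 1))
at2 = CodeLocation.new(CodePosition.new(1, 2), CodePosition.new(1, 3))
at3 = CodeLocation.new(CodePosition.new(1, 4), CodePosition.new(1, 5))

puts at1, at2, at3
```

Four integers per location (two per position) are all that is stored, which matches the "4 numbers" bullet.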


  21. Location in Ruby
    • Ruby holds *only* line numbers until Ruby 2.4.

    • Ruby holds line numbers and columns since Ruby 2.5.

    • Today’s main topic is “Column”.


  22. Minor details about column
    • 0-based / 1-based

    • 0-based

    • Varies across programming languages and editors.

    • From the beginning of line / file

    • Line


  23. Minor details about column
    • Byte length / Character length

    • Byte length

    • “A plan to give the syntax tree detailed location information” (wiki page written in Japanese)

    • https://bugs.ruby-lang.org/projects/ruby-trunk/wiki/Node-position-memo
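The difference between a byte-based and a character-based column only shows up when a line contains multibyte characters. The snippet below is an illustrative sketch, not code from the talk: it measures the distance to `+` both ways for a made-up UTF-8 line.

```ruby
# "\u00E9" (e with acute accent) is 1 character but 2 bytes in UTF-8.
line = "\u00E9 + 1"

char_index  = line.index("+")                # distance in characters
byte_column = line[0...char_index].bytesize  # distance in bytes

puts "character index: #{char_index}"
puts "byte column:     #{byte_column}"
```

With a byte-based column, the `+` is reported one position further to the right than an editor that counts characters would show.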


  24. Why code locations are needed

  25. For coverage features
    • For branch coverage and method coverage (Ruby 2.5~).

    • "An introduction and future of Ruby coverage library"

    • http://rubykaigi.org/2017/presentations/mametter.html (30:50-)

  26. What is branch coverage
    • "Branch coverage tells you which branches are executed, and which not." (doc/NEWS-2.5.0)
    (a == 2) ? :t : :f

  27. What is branch coverage
    • You may forget to write tests for `then` cases.

    • `n/m`

    • `n`: How many times the “then clause” is executed.

    • `m`: How many times the “else clause” is executed.
    0/1: (a == 2) ? :t : :f


  29. Use-case (1)
    • Code locations can be used for visualizing branch coverage results.
    0/1: (a == 2) ? :t : :f
    YOU SHOULD WRITE TEST !!!

  31. Use-case (2)
    • One line can contain one or more branches.

    • In these cases, line numbers alone cannot tell us which clause was executed.
    (a == b) ? ((c == d) ? :A : :B) : :C
    obj&.foo? ? "a" : "b"
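The nested ternary above can be tried with the `coverage` library that ships with Ruby 2.5 and later. The following is a minimal sketch rather than anything shown in the talk: the method name `pick`, its arguments and the temporary file are made up for the demonstration. Each key in the result carries begin and end line/column numbers, which is what separates the two branches that share one line.

```ruby
require "coverage"
require "tempfile"

# Coverage only measures files loaded after Coverage.start,
# so the code under test is written to a temporary file first.
source = <<~RUBY
  def pick(a, b, c, d)
    (a == b) ? ((c == d) ? :A : :B) : :C
  end
  pick(1, 1, 2, 3) # takes the outer `then` and the inner `else`
RUBY

file = Tempfile.new(["branch_demo", ".rb"])
file.write(source)
file.close
path = file.path # keep the path; it becomes nil after unlink

Coverage.start(branches: true)
load path
result = Coverage.result
file.unlink

# Look the file up by basename in case the path was normalised.
_, data = result.find { |key, _| File.basename(key) == File.basename(path) }
branches = data[:branches]

branches.each do |(type, _id, l1, c1, l2, c2), targets|
  puts "#{type} (#{l1},#{c1})-(#{l2},#{c2})"
  targets.each do |(target, _tid, tl1, tc1, tl2, tc2), count|
    puts "  #{target} (#{tl1},#{tc1})-(#{tl2},#{tc2}): #{count}"
  end
end
```

Both `if` entries report the same line number, so a tool that only had line numbers could not say which of the two ternaries a count belongs to.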


  32. Ruby crash course


  33. How Ruby script is processed
    Step          Input        Output     Debug     Source
    Tokenization  Ruby script  Tokens     --dump=y  parse.y
    Parsing       Tokens       AST        --dump=p  parse.y
    Compile       AST          Byte code  --dump=i  compile.c

              Parsing ___
                         \
    Ruby script -> Tokens -> AST -> Byte code (insns / ISeq)
               __/              __/
    Tokenization         Compile

  34. Ruby script
    1 + 2


  36. Tokenization
    • Each token has

    • a token type (tINTEGER)

    • a semantic value (1)
    1 + 2
    ^ ^ ^^
    | | |+--- '\n' / "end-of-input"
    | | +---- tINTEGER (2)
    | +------ '+'
    +-------- tINTEGER (1)


  37. Tokenization
    On Ruby 2.5.1 (the semantic values 1 and 2 are not printed):
    $ ruby --dump=y -e '1 + 2' | grep Shifting
    Shifting token tINTEGER (1.0-1.1: )
    Shifting token '+' (1.2-1.3: )
    Shifting token tINTEGER (1.4-1.5: )
    Shifting token '\n' (1.5-1.5: )
    Shifting token "end-of-input" (1.5-1.5: )

  39. Tokenization
    $ ruby --dump=y -e '1 + 2' | grep Shifting
    On Ruby 2.6.0preview1
    Shifting token tINTEGER (1.0-1.1: 1)
    Shifting token '+' (1.2-1.3: )
    Shifting token tINTEGER (1.4-1.5: 2)
    Shifting token '\n' (1.5-1.5: )
    Shifting token "end-of-input" (1.5-1.5: )
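Token positions can also be inspected from Ruby code with `Ripper.lex`, the lexer interface of the bundled `ripper` library. This is a separate route from `--dump=y` and is used here only as an illustration. `Ripper.lex` reports where each token begins; the end column below is computed by adding the token's byte length, which is an assumption of this sketch that holds for single-line tokens.

```ruby
require "ripper"

# Ripper.lex returns entries of the form [[lineno, column], event, token, state].
tokens = Ripper.lex("1 + 2").reject { |_pos, event| event == :on_sp }

tokens.each do |(lineno, column), event, token|
  end_column = column + token.bytesize # computed here, not reported by Ripper
  puts format("%-7s %-4s (%d.%d-%d.%d)", event, token.inspect, lineno, column, lineno, end_column)
end
```

The begin columns 0, 2 and 4 agree with the `1.0`, `1.2` and `1.4` printed in the "Shifting token" lines.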


  40. r61997 / 46e2fad


  42. Parsing
    • Analyzes tokens according to the rules of Ruby syntax.

    • Builds AST.


  44. Parsing
    parse.y

    numeric        : simple_numeric
                   | tUMINUS_NUM simple_numeric   %prec tLOWEST
                   ;

    simple_numeric : tINTEGER
                   | tFLOAT
                   | tRATIONAL
                   | tIMAGINARY
                   ;

    %%
    program        :  {
                      }
                     top_compstmt

    Rules: `numeric`, `simple_numeric` and `program`.

  46. Parsing
    parse.y

    numeric        : simple_numeric
                   | tUMINUS_NUM simple_numeric   %prec tLOWEST
                   ;

    simple_numeric : tINTEGER      /* 1   */
                   | tFLOAT        /* 2.1 */
                   | tRATIONAL     /* 3r  */
                   | tIMAGINARY    /* 4i  */
                   ;

    %%
    program        :  {
                      }
                     top_compstmt

  50. Parsing
    parse.y

    numeric        : simple_numeric
                   | tUMINUS_NUM simple_numeric   %prec tLOWEST
                   ;

    simple_numeric : tINTEGER
                   | tFLOAT
                   | tRATIONAL
                   | tIMAGINARY
                   ;

    %%
    program        :  {            /* Goal */
                      }
                     top_compstmt

  51. Parsing
    $ ruby --dump=y -e '1 + 2'
    # Shifting token tINTEGER (1)
    tINTEGER
    simple_numeric
    numeric
    literal
    primary
    arg


  52. Parsing
    # Shifting token '+'
    arg '+'
    # Shifting token tINTEGER (2)
    arg '+' tINTEGER
    arg '+' simple_numeric
    arg '+' numeric
    arg '+' literal
    arg '+' primary
    arg '+' arg
    arg
    expr
    stmt
    top_stmt
    top_stmts


  53. Parsing
    # Shifting token '\n'
    top_stmts '\n'
    top_stmts term
    top_stmts terms
    top_stmts opt_terms
    top_compstmt
    program # Completed


  54. Build AST
    $ ruby --dump=p -e '1 + 2'
    NODE_SCOPE
      NODE_PRELUDE
        NODE_OPCALL (:+)
          NODE_LIT (1)
          NODE_ARRAY
            NODE_LIT (2)

  58. Build AST
    typedef struct RNode {
        VALUE flags;                /* Contains node_type */
        union {                     /* u1, u2, u3: contain various data */
            struct RNode *node;
            ...
        } u1;
        union {
            struct RNode *node;
            ...
        } u2;
        union {
            struct RNode *node;
            ...
        } u3;
        rb_code_location_t nd_loc;  /* Contains location information */
    } NODE;

  59. Build AST
    • Builds AST in actions.

    • $1 stands for the value of the 1st component (`arg`).
    arg | arg '+' arg
            {   /* Action */
                $$ = call_bin_op(p, $1, '+', $3, &@2, &@$);
            }

    static NODE *
    call_bin_op(struct parser_params *p, NODE *recv, ID id, NODE *arg1,
                const YYLTYPE *op_loc, const YYLTYPE *loc)
    {
        NODE *expr;
        value_expr(recv);
        value_expr(arg1);
        expr = NEW_OPCALL(recv, id, NEW_LIST(arg1, &arg1->nd_loc), loc);  /* Create NODE_OPCALL */
        nd_set_line(expr, op_loc->beg_pos.lineno);
        return expr;
    }

  64. Build AST
    $ ruby --dump=p -e '1 + 2'
    NODE_SCOPE
      NODE_PRELUDE
        NODE_OPCALL (:+)
          NODE_LIT (1)
          NODE_ARRAY
            NODE_LIT (2)
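A similar tree, including the code location of every node, can be printed from Ruby itself with `RubyVM::AbstractSyntaxTree`. That API is available in CRuby 2.6 and later and is not part of these slides, so treat this as a sketch; node names differ between Ruby versions (for example, newer versions print no `NODE_PRELUDE`). The helper name `print_tree` is made up here.

```ruby
# RubyVM::AbstractSyntaxTree is CRuby-specific (2.6 and later).
root = RubyVM::AbstractSyntaxTree.parse("1 + 2")

# Print each node type with its code location, indenting by depth.
def print_tree(node, depth = 0)
  loc = "(#{node.first_lineno},#{node.first_column})-(#{node.last_lineno},#{node.last_column})"
  puts "#{'  ' * depth}#{node.type} #{loc}"
  node.children.each do |child|
    print_tree(child, depth + 1) if child.is_a?(RubyVM::AbstractSyntaxTree::Node)
  end
end

print_tree(root)
```

The `OPCALL` node spans the whole expression, while the two literal nodes cover only their own token.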


  67. Compile
    • Compiles the AST into byte code.

    • See “compile.c”.
    $ ruby --dump=i -e '1 + 2'
    == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,5)>==============================
    0000 putobject_OP_INT2FIX_O_1_C_                                      (   1)[Li]
    0001 putobject        2
    0003 opt_plus         <callinfo!mid:+, argc:1, ARGS_SIMPLE>, <callcache>
    0006 leave
    This ISeq contains 4 insn(s).

  68. References
    • "Ruby Hacking Guide" (Part 2: Syntax analysis)

    • https://ruby-hacking-guide.github.io/ [EN]

    • http://i.loveruby.net/ja/rhg/book/ [JA]

    • "Ruby Under a Microscope" / "Rubyのしくみ" (Japanese edition)

  69. How to implement code locations

  70. Goal
    • Branch coverage

    • To pass code locations to compile phase.

    • Method coverage

    • To store code locations on ISeq.

    • What should we implement

    • Embed code locations into each NODE.


  71. Hint
    • The original source of location information is the Ruby script.

    • If we want to use location information in the "n"th step, the "n-1"th step must already provide it.

    • In this case, location information must be passed all the way from "Tokenization" to "Compile" so that the compile phase can use it.

              Parsing ___
                         \
    Ruby script -> Tokens -> AST -> Byte code (insns / ISeq)
               __/              __/
    Tokenization         Compile

  72. parser_params crash course
    • One of the main data structures of the parser.

    • Too Big!!!
    struct parser_params {
        rb_imemo_tmpbuf_t *heap;

        YYSTYPE *lval;

        struct {
            rb_strterm_t *strterm;
            VALUE (*gets)(struct parser_params*,VALUE);
            VALUE input;
            VALUE prevline;
            VALUE lastline;
            VALUE nextline;
            const char *pbeg;
            const char *pcur;
            const char *pend;
            const char *ptok;
            long gets_ptr;
            enum lex_state_e state;
            /* track the nest level of any parens "()[]{}" */
            int paren_nest;
            /* keep p->lex.paren_nest at the beginning of lambda "->" to detect tLAMBEG and keyword_do_LAMBDA */
            int lpar_beg;
            /* track the nest level of only braces "{}" */
            int brace_nest;
        } lex;
        stack_type cond_stack;
        stack_type cmdarg_stack;
        int tokidx;
        int toksiz;
        int tokline;
        int heredoc_end;
        int heredoc_indent;
        int heredoc_line_indent;
        char *tokenbuf;
        struct local_vars *lvtbl;
        int line_count;
        int ruby_sourceline;        /* current line no. */
        char *ruby_sourcefile;      /* current source file */
        VALUE ruby_sourcefile_string;
        rb_encoding *enc;
        token_info *token_info;
        VALUE compile_option;

        VALUE debug_buffer;
        VALUE debug_output;

        ID cur_arg;

        rb_ast_t *ast;

        unsigned int command_start:1;
        unsigned int eofp: 1;
        unsigned int ruby__end__seen: 1;
        unsigned int debug: 1;
        unsigned int has_shebang: 1;
        unsigned int in_defined: 1;
        unsigned int in_main: 1;
        unsigned int in_kwarg: 1;
        unsigned int in_def: 1;
        unsigned int in_class: 1;
        unsigned int token_seen: 1;
        unsigned int token_info_enabled: 1;
    # if WARN_PAST_SCOPE
        unsigned int past_scope_enabled: 1;
    # endif
        unsigned int error_p: 1;
        unsigned int cr_seen: 1;

    #ifndef RIPPER
        /* Ruby core only */

        unsigned int do_print: 1;
        unsigned int do_loop: 1;
        unsigned int do_chomp: 1;
        unsigned int do_split: 1;
        unsigned int warn_location: 1;

        NODE *eval_tree_begin;
        NODE *eval_tree;
        VALUE error_buffer;
        VALUE debug_lines;
        VALUE coverage;
        const struct rb_block *base_block;
    #else
        /* Ripper only */

        VALUE delayed;
        int delayed_line;
        int delayed_col;

        VALUE value;
        VALUE result;
        VALUE parsing_thread;
    #endif
    };

  73. parser_params crash course
    • It has a struct for the lexer (`lex`).

    • The lexer processes input in units of lines.

    • *Basically* processes from top to bottom.
    struct parser_params {
        ...
        struct {
            ...
            VALUE prevline;     /* Lines */
            VALUE lastline;     /* Lines */
            VALUE nextline;     /* Lines */
            const char *pbeg;   /* Pointers */
            const char *pcur;   /* Pointers */
            const char *pend;   /* Pointers */
            const char *ptok;   /* Pointers */
            ...
        } lex;
        ...
    };

  74. What is column
    /* parse.y */
    /*
        Structure of Lexer Buffer:

     lex.pbeg     lex.ptok     lex.pcur     lex.pend
        |            |            |            |
        |------------+------------+------------|
                     |<---------->|
                         token
    */

  75. What is column
    • When token '+' is recognized (first diagram).

    • When token tINTEGER (2) is recognized (second diagram).
    1 + 2
    ^ ^^ ^
    | || +--- lex.pend
    | |+----- lex.pcur
    | +------ lex.ptok
    +-------- lex.pbeg

    1 + 2
    ^   ^^
    |   |+--- lex.pcur, lex.pend
    |   +---- lex.ptok
    +-------- lex.pbeg

  76. What is column
    • `lex.ptok - lex.pbeg` (begin) and `lex.pcur - lex.pbeg` (end).

    • A column is the difference between two pointers at the moment a token is recognized.
    |--| lex.pcur - lex.pbeg (end)
    |-|  lex.ptok - lex.pbeg (begin)
    1 + 2
    ^ ^^ ^
    | || +--- lex.pend
    | |+----- lex.pcur
    | +------ lex.ptok
    +-------- lex.pbeg
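The pointer arithmetic above can be imitated in Ruby with byte offsets standing in for the C pointers. This is a toy sketch, not the real lexer: `toy_lex` is a hypothetical helper that only understands integers and a few operators, and the local names `pbeg`, `ptok` and `pcur` are borrowed from `parser_params` purely to make the correspondence visible.

```ruby
require "strscan"

# pbeg: start of the line, ptok: start of the current token,
# pcur: current read position. All three are byte offsets here.
def toy_lex(line)
  scanner = StringScanner.new(line)
  pbeg = 0
  tokens = []
  until scanner.eos?
    next if scanner.skip(/\s+/)
    ptok = scanner.pos
    text = scanner.scan(/\d+|[-+*\/]/) or raise "unexpected input at #{ptok}"
    pcur = scanner.pos
    # Columns must be recorded now; the next token overwrites ptok and pcur.
    tokens << [text, ptok - pbeg, pcur - pbeg]
  end
  tokens
end

toy_lex("1 + 2").each do |text, beg_col, end_col|
  puts "#{text.inspect}: begin=#{beg_col} end=#{end_col}"
end
```

For `+` this yields begin 2 and end 3, the same values as `(1.2-1.3)`.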


  77. What is column
    • We must store columns somewhere before the next token is recognized.
    1 + 2
    ^ ^^ ^
    | || +--- lex.pend
    | |+----- lex.pcur
    | +------ lex.ptok
    +-------- lex.pbeg

    1 + 2
    ^   ^^
    |   |+--- lex.pcur, lex.pend
    |   +---- lex.ptok
    +-------- lex.pbeg

  78. From Ruby script to tokens
    • Copy location information to `YYLTYPE *yylloc` in `yylex`.

    • The `yylloc` argument is newly added to `yylex`.

    • Call `RUBY_SET_YYLLOC` to set `yylloc`.


  82. static enum yytokentype
    yylex(YYSTYPE *lval, YYLTYPE *yylloc, struct parser_params *p)  /* New argument: yylloc */
    {
        enum yytokentype t;

        p->lval = lval;
        lval->val = Qundef;
        t = parser_yylex(p);                                        /* Create token */
        if (has_delayed_token(p))
            dispatch_delayed_token(p, t);
        else if (t != 0)
            dispatch_scan_event(p, t);

        if (p->lex.strterm && (p->lex.strterm->flags & STRTERM_HEREDOC))
            RUBY_SET_YYLLOC_FROM_STRTERM_HEREDOC(*yylloc);          /* Set `yylloc` */
        else
            RUBY_SET_YYLLOC(*yylloc);                               /* Set `yylloc` */

        return t;
    }

  83. From tokens to Nodes
    • Now we can use `@n` in each action.

    • `@n` stands for the location of the nth component of the right hand side.

    • `@$` stands for the location of the left hand side grouping (`YYLTYPE yyloc`).

    • Set by `YYLLOC_DEFAULT`.

    • https://www.gnu.org/software/bison/manual/html_node/Tracking-Locations.html#Tracking-Locations
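How `@$` is derived from the `@n` values can be sketched in a few lines of Ruby. This mimics the usual default behaviour described in the Bison manual (begin of the first component, end of the last one); it is not Ruby's actual `YYLLOC_DEFAULT` macro, and `Position`, `Location` and `default_location` are hypothetical names. The special case of an empty right hand side is left out.

```ruby
Position = Struct.new(:lineno, :column)
Location = Struct.new(:beg_pos, :end_pos)

# The left hand side starts where the first component starts
# and ends where the last component ends.
def default_location(rhs_locations)
  Location.new(rhs_locations.first.beg_pos, rhs_locations.last.end_pos)
end

# Locations @1, @2 and @3 for `1 + 2`.
rhs = [
  Location.new(Position.new(1, 0), Position.new(1, 1)),
  Location.new(Position.new(1, 2), Position.new(1, 3)),
  Location.new(Position.new(1, 4), Position.new(1, 5)),
]

lhs = default_location(rhs)
puts "@$ = (#{lhs.beg_pos.lineno}.#{lhs.beg_pos.column}-#{lhs.end_pos.lineno}.#{lhs.end_pos.column})"
```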


  87.
    |---| @$ (1.0-1.5)
    1 + 2
    ^ ^ ^
    | | +- @3 (1.4-1.5)
    | +--- @2 (1.2-1.3)
    +----- @1 (1.0-1.1)

    arg | arg '+' arg
            {
                $$ = call_bin_op(p, $1, '+', $3, &@2, &@$);  /* &@$ is (1.0-1.5) */
            }

    static NODE *
    call_bin_op(struct parser_params *p, NODE *recv, ID id, NODE *arg1,
                const YYLTYPE *op_loc, const YYLTYPE *loc)
    {
        ...
        expr = NEW_OPCALL(recv, id, NEW_LIST(arg1, &arg1->nd_loc), loc);  /* loc is &@$ */
        ...
    }

  88. `NODE_OPCALL` is simple
    • All the locations that NODE_OPCALL needs are available when NODE_OPCALL is created.

    • NODE_LIT (1)

    • mid (:+)

    • NODE_ARRAY (2)
    NODE_OPCALL   1   +   2
    -----------   ------------
    arg         : arg '+' arg

  89. `NODE_*ASGN` family is not simple
    • `NODE_LASGN` (local variable assignment) or `NODE_IASGN` (instance variable assignment).
    @a = 1
    NODE_SCOPE
      NODE_IASGN (:@a)
        NODE_LIT (1)

  93.
    NODE_IASGN   @a
    ----------   -------------
    lhs        : user_variable
                    {
                    /*%%%*/
                        $$ = assignable(p, $1, 0, &@$);    /* Create NODE_IASGN */
                    /*% %*/
                    }

    NODE_IASGN   NODE_IASGN = 1
    ----------   ------------------------
    arg        : lhs '=' arg_rhs                           /* Location is determined */
                    {
                    /*%%%*/
                        $$ = node_assign(p, $1, $3, &@$);  /* Update location */
                    /*% %*/
                    /*% ripper: assign!($1, $3) %*/
                    }

  94. `NODE_*ASGN` family is not simple
    • Create `NODE_*ASGN`.

    • Assign right hand side.

    • So the location of `NODE_*ASGN` must be updated when the right hand side is assigned.
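The two-step construction can be modelled in Ruby to show why the location has to be written twice. This is a simplified sketch, not the code in parse.y: the function names `assignable` and `node_assign` are borrowed from the slides, but their bodies, the `Node` struct and the argument lists are assumptions made for illustration.

```ruby
Position = Struct.new(:lineno, :column)
Location = Struct.new(:beg_pos, :end_pos)
Node     = Struct.new(:type, :name, :value, :loc)

# Step 1: the node is created when only the left hand side (`@a`) has been
# seen, so the only location available is that of `@a` itself.
def assignable(name, loc)
  Node.new(:IASGN, name, nil, loc)
end

# Step 2: the right hand side is attached and the location is replaced by
# the location of the whole `lhs '=' arg_rhs` grouping.
def node_assign(lhs, rhs, loc)
  lhs.value = rhs
  lhs.loc = loc
  lhs
end

# `@a = 1`: `@a` is (1.0-1.2), `1` is (1.5-1.6), the whole statement is (1.0-1.6).
lhs_loc   = Location.new(Position.new(1, 0), Position.new(1, 2))
whole_loc = Location.new(Position.new(1, 0), Position.new(1, 6))
lit = Node.new(:LIT, nil, 1, Location.new(Position.new(1, 5), Position.new(1, 6)))

node = assignable(:@a, lhs_loc)
before = node.loc.end_pos.column
node_assign(node, lit, whole_loc)
after = node.loc.end_pos.column

puts "end column before: #{before}, after: #{after}"
```

Without the update in step 2, the assignment node would claim to end at column 2, before the `=` sign.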


  95. `NODE_ITER` is not simple
    • `NODE_ITER` (method call with block).
    3.times { foo }
    NODE_ITER
      NODE_CALL (:times)
        NODE_LIT (3)
      NODE_SCOPE
        NODE_VCALL (:foo)

  96. NODE_CALL 3 . times
    ----------- --------------------------------------------------
    method_call | primary_value call_op operation2 opt_paren_args
    {
    }
    NODE_ITER { NODE_ITER }
    ----------- ---------------------
    brace_block : '{' brace_body '}'
    {
    }
    NODE_ITER NODE_CALL NODE_ITER
    --------- -----------------------
    primary | method_call brace_block
    {
    /*%%%*/
    block_dup_check(p, $1->nd_args, $2);
    $$ = method_add_block(p, $1, $2, &@$);
    /*% %*/
    }

    View Slide

  97. NODE_CALL 3 . times
    ----------- --------------------------------------------------
    method_call | primary_value call_op operation2 opt_paren_args
    {
    }
    NODE_ITER { NODE_ITER }
    ----------- ---------------------
    brace_block : '{' brace_body '}'
    {
    }
    NODE_ITER NODE_CALL NODE_ITER
    --------- -----------------------
    primary | method_call brace_block
    {
    /*%%%*/
    block_dup_check(p, $1->nd_args, $2);
    $$ = method_add_block(p, $1, $2, &@$);
    /*% %*/
    }
    `3.times` `{ foo }`

    View Slide

  98. NODE_CALL     3             .       times
      -----------   --------------------------------------------------
      method_call | primary_value call_op operation2 opt_paren_args
                    {
                    }

      NODE_ITER     {   NODE_ITER  }
      -----------   ---------------------
      brace_block : '{' brace_body '}'
                    {
                    }

      NODE_ITER   NODE_CALL   NODE_ITER
      ---------   -----------------------
      primary   | method_call brace_block
                  {
                  /*%%%*/
                      block_dup_check(p, $1->nd_args, $2);
                      $$ = method_add_block(p, $1, $2, &@$);
                  /*% %*/
                  }

      $1 = `3.times`, $2 = `{ foo }`
      Update location

  99. `NODE_ITER` is not simple
    • `NODE_CALL` is created.

    • `NODE_ITER` is created.

    • `NODE_ITER` is added to `NODE_CALL`.

    • So the location of `NODE_ITER` must be updated when it is
    passed to `NODE_CALL`.
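
A minimal Ruby sketch of that update, assuming the block node is first built from `{ foo }` alone. `Loc`, `Node` and `add_block` are hypothetical names for illustration, not the C function `method_add_block`; the locations are the ones listed for `3.times { foo }` in this talk.

```ruby
# Hypothetical stand-ins for a code location and an AST node.
Loc  = Struct.new(:beg_line, :beg_col, :end_line, :end_col)
Node = Struct.new(:type, :children, :loc)

# Combines a call with its block. The ITER node initially covers only the
# block, so its start is moved back to where the call starts.
def add_block(call, iter)
  iter.children.unshift(call)
  iter.loc = Loc.new(call.loc.beg_line, call.loc.beg_col,
                     iter.loc.end_line, iter.loc.end_col)
  iter
end

call = Node.new(:CALL, [], Loc.new(1, 0, 1, 8))   # `3.times`
iter = Node.new(:ITER, [], Loc.new(1, 8, 1, 15))  # `{ foo }`

p add_block(call, iter).loc.to_a # => [1, 0, 1, 15]
```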

  102. $ git shortlog -s -n parse.y | head -10
    XXX ????
    362 matz
    133 mame
    88 yui-knk   <- Me
    55 ko1
    38 aamine
    37 akr
    33 naruse
    25 usa
    7 normal
    On "v2_6_0_preview1".

  105. $ git shortlog -s -n parse.y | head -10
    884 nobu
    362 matz
    133 mame
    88 yui-knk
    55 ko1
    38 aamine
    37 akr
    33 naruse
    25 usa
    7 normal
    On "v2_6_0_preview1".
    x 10
    @nobu is the lord of Demon Castle "parse.y".

  106. How to test
    • Define some rules and check that all Ruby files in the "test"
    directory follow them.

    • Related files:

    • "ext/-test-/ast/ast.c"

    • "test/-ext-/ast/test_ast.rb"
    On "v2_6_0_preview1".

  107. "ext/-test-" and "test/-ext-"
    • "ext/-test-" contains C extensions which are used in
    Ruby's tests.

    • "ext/-test-/ast/ast.c" defines `AST` module and
    `AST::Node` class.
    On "v2_6_0_preview1".

  108. "ext/-test-" and "test/-ext-"
    • "test/-ext-" contains test cases which depend on
    "ext/-test-".
    On "v2_6_0_preview1".

  109. Rule 1
    • `lineno` is initialized with `0` and `column` with `-1`.

    • Validate that all node locations are updated at least once.
    NODE_IF (line: 1, location: (0,-1)-(0,-1))
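
Rule 1 can be sketched as a tree walk that collects every node still carrying the initial `(0,-1)` position. This is an illustration on plain `Struct` nodes; `Node`, `UNSET` and `untouched_nodes` are hypothetical names and not the helper used in Ruby's test suite.

```ruby
# Hypothetical node: positions are [lineno, column] pairs.
Node = Struct.new(:type, :beg_pos, :end_pos, :children)

# The initial value of a position that was never updated.
UNSET = [0, -1].freeze

# Collects all nodes whose begin or end position was never updated.
def untouched_nodes(node, found = [])
  found << node if node.beg_pos == UNSET || node.end_pos == UNSET
  node.children.each { |child| untouched_nodes(child, found) }
  found
end

tree = Node.new(:SCOPE, [1, 0], [1, 10], [
  Node.new(:IF, [0, -1], [0, -1], [])
])

p untouched_nodes(tree).map(&:type) # => [:IF]
```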

  114. Rule 2
    • Validate children do not exceed a parent location.
    3.times { foo }
    NODE_ITER [1.0-1.15] -> covers [1.0-1.8] and [1.8-1.15]
      NODE_CALL (:times) [1.0-1.8] -> covers [1.0-1.1]
        NODE_LIT (3) [1.0-1.1]
      NODE_SCOPE [1.8-1.15] -> covers [1.10-1.13]
        NODE_VCALL (:foo) [1.10-1.13]

  118. # test/-ext-/ast/test_ast.rb
    # Check all Ruby files in the "test" directory.
    Dir.glob("test/**/*.rb", base: SRCDIR).each do |path|
      define_method("test_ranges:#{path}") do
        helper = Helper.new("#{SRCDIR}/#{path}")
        helper.validate_range  # Validate each file.
        assert_equal([], helper.errors)
      end
    end

  122. # test/-ext-/ast/test_ast.rb
    def validate_range0(node)
      beg_pos, end_pos = node.beg_pos, node.end_pos
      children = node.children.compact
      return if children.empty?  # A leaf node has no child ranges to check.

      # Check ranges
      min = children.map(&:beg_pos).min
      max = children.map(&:end_pos).max
      unless beg_pos <= min
        @errors << { type: :min_validation_error, min: min, beg_pos: beg_pos,
                     node: node }
      end
      unless max <= end_pos
        @errors << { type: :max_validation_error, max: max, end_pos: end_pos,
                     node: node }
      end

      # Check children
      children.each do |child|
        validate_range0(child)
      end
    end

    # Generate AST
    ast = AST.parse_file(@path)
    validate_not_cared0(ast)
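
For readers without the `-test-/ast` extension, Rule 2 can be tried on plain `Struct` nodes. The sketch below is an assumption-laden rewrite rather than the test's code: `Node`, `pos_le?`, `range_errors` and `sample_tree` are hypothetical names, and positions are `[line, column]` arrays compared with `Array#<=>`.

```ruby
# Hypothetical node: positions are [lineno, column] pairs.
Node = Struct.new(:type, :beg_pos, :end_pos, :children)

# True when position a is at or before position b (line first, then column).
def pos_le?(a, b)
  (a <=> b) <= 0
end

# Returns one entry per node whose children start before it or end after it.
def range_errors(node, errors = [])
  kids = node.children
  return errors if kids.empty?  # a leaf has nothing to compare against

  min = kids.map(&:beg_pos).min
  max = kids.map(&:end_pos).max
  errors << [:min_validation_error, node.type] unless pos_le?(node.beg_pos, min)
  errors << [:max_validation_error, node.type] unless pos_le?(max, node.end_pos)
  kids.each { |kid| range_errors(kid, errors) }
  errors
end

# `3.times { foo }` with the locations listed under Rule 2.
def sample_tree
  Node.new(:ITER, [1, 0], [1, 15], [
    Node.new(:CALL, [1, 0], [1, 8], [Node.new(:LIT, [1, 0], [1, 1], [])]),
    Node.new(:SCOPE, [1, 8], [1, 15], [Node.new(:VCALL, [1, 10], [1, 13], [])])
  ])
end

p range_errors(sample_tree) # => []
```

Shrinking the `SCOPE` node so that it ends before its `VCALL` child makes the same walk report a `:max_validation_error` for `SCOPE`.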

  123. The future plan of code
    locations feature

  124. Case 1 (Proc/Method)
    • Add new methods to Proc/Method which return their
    code location.
    def a(&block)
      p block.code_location
    end

    a do
      1 + 2
    end
    # => [[5, 2], [7, 3]]
    p self.class.instance_method(:a).code_location
    # => [[1, 0], [3, 3]]
    https://github.com/yui-knk/ruby/tree/feature/rb_iseq_code_location

  128. Case 2 (NoMethodError)
    • Give `NoMethodError` a more detailed message.
    class A
      def foo
        nil
      end
    end

    A.new.foo.foo
    Traceback (most recent call last):
    /tmp/test.rb:7:in `<main>': undefined method `foo' for nil:NilClass
    (NoMethodError)
    A.new.foo.foo
             ^^^^
    https://github.com/yui-knk/ruby/tree/feature/node_id

  129. Case 3 (AST module)
    AST.parse("1 + 2")
    # => #
    AST.parse("1 + 2").children[1]
    # => #
    AST.parse("1 + 2").children[1].children
    # => [#,
    #]

  131. • We discussed this topic at the Developers Meeting yesterday.

  132. Committed

  133. Conference Driven
    Development !!!

  136. Case 3 (AST module)
    • We can get children nodes.
    RubyVM::AST.parse("1 + 2")
    # => #
    RubyVM::AST.parse("1 + 2").children[1]
    # => #
    RubyVM::AST.parse("1 + 2").children[1].children
    # => [#,
    #]

  137. Case 3 (AST module)
    • We can get location information.
    [RubyVM::AST.parse("1 + 2").first_lineno,
    RubyVM::AST.parse("1 + 2").first_column]
    # => [1, 0]
    [RubyVM::AST.parse("1 + 2").last_lineno,
    RubyVM::AST.parse("1 + 2").last_column]
    # => [1, 5]
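
The slides use the preview name `RubyVM::AST`. In the released Ruby 2.6.0 the module is called `RubyVM::AbstractSyntaxTree`, so a sketch of the same query that should run on Ruby 2.6 or later looks like this (`location_of` is a hypothetical helper name):

```ruby
# Assumes Ruby 2.6 or later, where the module is RubyVM::AbstractSyntaxTree.
# Returns [[first_lineno, first_column], [last_lineno, last_column]] of the
# root node of the parsed source.
def location_of(source)
  node = RubyVM::AbstractSyntaxTree.parse(source)
  [[node.first_lineno, node.first_column],
   [node.last_lineno, node.last_column]]
end

p location_of("1 + 2") # => [[1, 0], [1, 5]]
```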

  138. Enjoy programming with
    Ruby 2.6.0-preview2!

  139. Conclusion

  140. Acknowledgments
    • @mametter

    • @nobu

    • @ko1

    • @shyouhei

    • @takeshinoda

    • @hkdnet

    • @HaiTo

    • @littlestarling

  141. Conclusion
    • AST nodes now have location information.

    • I shared the future plans for the code locations feature.

    • If you have any ideas for using location information, please
    let me know :)

    • https://bugs.ruby-lang.org/

    • You now have the map of Demon Castle "parse.y", so let's
    hack "parse.y" :)

  142. Thank you!!!

  143. Bonus track

  144. How to implement a more detailed
    `NoMethodError` message

  145. Target code
    class A
      def foo
        nil
      end
    end

    A.new.foo.foo

  146. # @ NODE_CALL (line: 7, location: (7,0)-(7,13))* 13
    # +- nd_mid: :foo
    # +- nd_recv:
    # | @ NODE_CALL (line: 7, location: (7,0)-(7,9)) 12
    # | +- nd_mid: :foo
    # | +- nd_recv:
    # | | @ NODE_CALL (line: 7, location: (7,0)-(7,5)) 11
    # | | +- nd_mid: :new
    # | | +- nd_recv:
    # | | | @ NODE_CONST (line: 7, location: (7,0)-(7,1)) 10
    # | | | +- nd_vid: :A
    # | | +- nd_args:
    # | | (null node)
    # | +- nd_args:
    # | (null node)
    # +- nd_args:
    # (null node)
    node_id
    • Add unique id (per file), “node_id”, to Node.

  147. == disasm: #@src/no_method_error2.rb:1 (1,0)-(7,13)> (catch: FALSE)
    0000 putspecialobject 3 ( 1)[ 0][Li]
    0002 putnil [ 9]
    0003 defineclass :A, , 0
    0007 pop
    0008 getinlinecache 15, ( 7)[ 10][Li]
    0011 getconstant :A
    0013 setinlinecache
    0015 opt_send_without_block , [ 11]
    0018 opt_send_without_block , [ 12]
    0021 opt_send_without_block , [ 13]
    0024 leave [ 10]
    node_id
    • Store node_id of insn on an ISeq.

  148. • Store node_id of insn on an ISeq.

    • We can distinguish between `#foo`s by node_id.
    == disasm: #@src/no_method_error2.rb:1 (1,0)-(7,13)> (catch: FALSE)
    0000 putspecialobject 3 ( 1)[ 0][Li]
    0002 putnil [ 9]
    0003 defineclass :A, , 0
    0007 pop
    0008 getinlinecache 15, ( 7)[ 10][Li]
    0011 getconstant :A
    0013 setinlinecache
    0015 opt_send_without_block , [ 11]
    0018 opt_send_without_block , [ 12]
    0021 opt_send_without_block , [ 13]
    0024 leave [ 10]

  150. Exceptions
    • Exceptions have an ISeq and a program counter (pc).
    == disasm: #@src/no_method_error2.rb:1 (1,0)-(7,13)> (catch: FALSE)
    0000 putspecialobject 3 ( 1)[ 0][Li]
    0002 putnil [ 9]
    0003 defineclass :A, , 0
    0007 pop
    0008 getinlinecache 15, ( 7)[ 10][Li]
    0011 getconstant :A
    0013 setinlinecache
    0015 opt_send_without_block , [ 11]
    0018 opt_send_without_block , [ 12]
    0021 opt_send_without_block , [ 13]
    0024 leave [ 10]
    Exception

  152. Exceptions
    • Get node_id from an exception.
    == disasm: #@src/no_method_error2.rb:1 (1,0)-(7,13)> (catch: FALSE)
    0000 putspecialobject 3 ( 1)[ 0][Li]
    0002 putnil [ 9]
    0003 defineclass :A, , 0
    0007 pop
    0008 getinlinecache 15, ( 7)[ 10][Li]
    0011 getconstant :A
    0013 setinlinecache
    0015 opt_send_without_block , [ 11]
    0018 opt_send_without_block , [ 12]
    0021 opt_send_without_block , [ 13]
    0024 leave [ 10]
    Exception Get node_id (13)
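
As a rough sketch, the lookup from a program counter to a node_id can be pictured as a table keyed by instruction offset. The Hash below is a hypothetical stand-in for whatever the ISeq stores; its entries are copied from the three `opt_send_without_block` lines above, and it assumes the pc is the offset of the failing instruction.

```ruby
# Instruction offset => node_id, copied from the disassembly above.
# A Hash is only a stand-in for the table kept on the ISeq.
NODE_ID_BY_PC = { 15 => 11, 18 => 12, 21 => 13 }.freeze

# Returns the node_id recorded for the instruction at `pc`, or nil.
def node_id_at(pc)
  NODE_ID_BY_PC[pc]
end

p node_id_at(21) # => 13
```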

  153. # @ NODE_CALL (line: 7, location: (7,0)-(7,13))* 13
    # +- nd_mid: :foo
    # +- nd_recv:
    # | @ NODE_CALL (line: 7, location: (7,0)-(7,9)) 12
    # | +- nd_mid: :foo
    # | +- nd_recv:
    # | | @ NODE_CALL (line: 7, location: (7,0)-(7,5)) 11
    # | | +- nd_mid: :new
    # | | +- nd_recv:
    # | | | @ NODE_CONST (line: 7, location: (7,0)-(7,1)) 10
    # | | | +- nd_vid: :A
    # | | +- nd_args:
    # | | (null node)
    # | +- nd_args:
    # | (null node)
    # +- nd_args:
    # (null node)
    • Parse the source code file and find Node by node_id.
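
Finding the node is then a search over the re-parsed tree. The sketch below models only the receiver chain of `A.new.foo.foo` with the ids and locations from the dump; `Node`, `sample_call_chain` and `find_by_node_id` are hypothetical names, and a real implementation would have to visit every child, not just `recv`.

```ruby
# Hypothetical node with just the fields needed for this example.
Node = Struct.new(:type, :mid, :node_id, :location, :recv)

# Builds the receiver chain of `A.new.foo.foo`.
def sample_call_chain
  const = Node.new(:CONST, :A,   10, [[7, 0], [7, 1]], nil)
  new_  = Node.new(:CALL,  :new, 11, [[7, 0], [7, 5]], const)
  foo1  = Node.new(:CALL,  :foo, 12, [[7, 0], [7, 9]], new_)
  Node.new(:CALL, :foo, 13, [[7, 0], [7, 13]], foo1)
end

# Walks down the receivers until a node with the given id is found.
def find_by_node_id(node, id)
  until node.nil?
    return node if node.node_id == id
    node = node.recv
  end
  nil
end

found = find_by_node_id(sample_call_chain, 13)
p found.location # => [[7, 0], [7, 13]]
```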

  156. A.new.foo.foo
    NODE_CALL (:foo) (line: 7, location: (7,0)-(7,13))* 13
    NODE_CALL (:foo) (line: 7, location: (7,0)-(7,9)) 12
    NODE_CALL (:new) (line: 7, location: (7,0)-(7,5)) 11
    NODE_CONST (:A) (line: 7, location: (7,0)-(7,1)) 10
    • Get location of Node.

  158. A.new.foo.foo
    NODE_CALL (:foo) (line: 7, location: (7,0)-(7,13))* 13
    NODE_CALL (:foo) (line: 7, location: (7,0)-(7,9)) 12
    NODE_CALL (:new) (line: 7, location: (7,0)-(7,5)) 11
    NODE_CONST (:A) (line: 7, location: (7,0)-(7,1)) 10
    A.new.foo
    • Get location of Node.

  160. A.new.foo.foo
    NODE_CALL (:foo) (line: 7, location: (7,0)-(7,13))* 13
    NODE_CALL (:foo) (line: 7, location: (7,0)-(7,9)) 12
    NODE_CALL (:new) (line: 7, location: (7,0)-(7,5)) 11
    NODE_CONST (:A) (line: 7, location: (7,0)-(7,1)) 10
    A.new.foo
    .foo
    ^^^^
    • Build an error message.

  161. How to implement a more detailed
    `NoMethodError` message
    • Add unique id (per file), “node_id”, to Node.

    • Store node_id of insn on an ISeq.

    • Get node_id from an exception.

    • Parse the source code file and find Node by node_id.

    • Get location of Node.

    • Build an error message.
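
The last two steps can be sketched as string arithmetic on columns. This is an illustration, not the code on the branch: `underline_call` is a hypothetical helper, and it assumes the receiver and the call sit on the same source line.

```ruby
# Locations are [[beg_line, beg_col], [end_line, end_col]].
# Underlines the part of the call that follows the receiver.
def underline_call(source_line, recv_loc, call_loc)
  from = recv_loc[1][1]  # column where the receiver ends
  to   = call_loc[1][1]  # column where the whole call ends
  "#{source_line}\n#{' ' * from}#{'^' * (to - from)}"
end

puts underline_call("A.new.foo.foo", [[7, 0], [7, 9]], [[7, 0], [7, 13]])
# Prints:
# A.new.foo.foo
#          ^^^^
```

Subtracting the receiver's end column from the call's end column is what isolates the second `.foo`, which is why both locations are needed.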

  162. Thank you!!!
