The Method JIT Compiler

RubyKaigi 2018
http://rubykaigi.org/2018

Takashi Kokubun

June 02, 2018

Transcript

  1. TREASURE DATA
    The Method JIT Compiler for Ruby 2.6
    Takashi Kokubun / @k0kubun
    RubyKaigi 2018

  2. TREASURE DATA
    Maintainer of ERB, Haml
    Developing JIT compiler for Ruby 2.6
    @k0kubun

  3. View Slide

  4. • 2017 Sep: LLVM JIT (EN)
    • 2017 Nov: YARV MJIT (EN)
    • 2017 Dec: YARV MJIT (JA)
    • 2018 Feb: ERB generation (JA)
    • 2018 Apr: Preview2 optimizations (EN)
    My past talks about Ruby's JIT
    https://speakerdeck.com/k0kubun

  5. 1. Current Status
    2. JIT on Rails
    3. Dive Into Native Code
    4. Method Inlining
    Today's talk

  6. 1. CURRENT STATUS

  7. • JIT in 2.6.0-preview2 is not production ready yet
    • Fixing bugs caused by a race condition
    • I'll introduce current status of:
    • Implementation
    • Portability
    • Performance
    Current status

  8. IMPLEMENTATION

  9. MJIT: Ruby's JIT architecture
    Ruby Process
    Disk Memory
    MJIT worker
    Thread
    Method #1
    Bytecode
    Ruby VM
    Thread
    Interpret

  10. MJIT: Ruby's JIT architecture
    Ruby Process
    Disk Memory
    MJIT worker
    Thread
    Method #1
    Bytecode
    Ruby VM
    Thread
    Interpret
    Request
    JIT-ing #1

  11. MJIT: Ruby's JIT architecture
    Ruby Process
    Disk Memory
    Method #1
    Bytecode
    Ruby VM
    Thread
    Interpret
    Request
    JIT-ing #1
    Method #1
    C code
    MJIT worker
    Thread
    Generate

  12. MJIT: Ruby's JIT architecture
    Ruby Process
    Disk Memory
    Method #1
    Bytecode
    Ruby VM
    Thread
    Interpret
    Request
    JIT-ing #1
    Method #1
    C code
    MJIT worker
    Thread
    Generate
    Method #1
    SO file
    Run C compiler

  13. MJIT: Ruby's JIT architecture
    Ruby Process
    Disk Memory
    Method #1
    Bytecode
    Ruby VM
    Thread
    Interpret
    Request
    JIT-ing #1
    Method #1
    C code
    MJIT worker
    Thread
    Generate
    Method #1
    SO file
    Run C compiler
    Method #1
    Native code
    Load

  14. MJIT: Ruby's JIT architecture
    Ruby Process
    Disk Memory
    Method #1
    Bytecode
    Interpret
    Request
    JIT-ing #1
    Method #1
    C code
    MJIT worker
    Thread
    Generate
    Method #1
    SO file
    Run C compiler
    Method #1
    Native code
    Load
    Ruby VM
    Thread
    Call
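The flow on these architecture slides (interpret, request JIT-ing, compile on a worker thread, then call native code) can be sketched in plain Ruby. This is only a toy model under stated assumptions: `SketchMethod`, `CALL_THRESHOLD`, and the lambda standing in for native code are hypothetical names, not part of MJIT, and no C compiler is involved.

```ruby
# Toy model of the slides' flow; all names here are hypothetical, not MJIT's API.
CALL_THRESHOLD = 5 # assumed value, for illustration only

class SketchMethod
  attr_reader :name, :calls
  attr_accessor :native

  def initialize(name, &interpreted)
    @name = name
    @interpreted = interpreted # stands in for bytecode interpreted by the VM
    @native = nil              # stands in for native code loaded from a .so file
    @calls = 0
  end

  # What the "Ruby VM thread" does on every call.
  def call(queue)
    return @native.call if @native             # "Call" the native code
    @calls += 1
    queue << self if @calls == CALL_THRESHOLD  # "Request JIT-ing"
    @interpreted.call                          # "Interpret" the bytecode
  end

  # Stand-in for "Generate C code", "Run C compiler", and "Load".
  def compile
    body = @interpreted
    -> { body.call }
  end
end

queue = Queue.new
worker = Thread.new do
  # The "MJIT worker thread": compiles requested methods one by one.
  while (requested = queue.pop)
    requested.native = requested.compile
  end
end

three = SketchMethod.new(:three) { 1 + 2 }
results = Array.new(10) { three.call(queue) }
queue << nil # tell the worker to stop
worker.join
puts results.uniq.inspect
puts three.native ? "compiled" : "interpreted only"
```

The VM thread never waits for the worker in this sketch: it keeps interpreting until the compiled version shows up, which mirrors the asynchronous design drawn on the slides.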

  15. How is this implemented?
    Ruby Process
    Disk Memory
    Method #1
    Bytecode
    Interpret
    Request
    JIT-ing #1
    Method #1
    C code
    MJIT worker
    Thread
    Generate
    Method #1
    SO file
    Run C compiler
    Method #1
    Native code
    Load
    Ruby VM
    Thread
    Call

  16. How is this implemented?
    Ruby Code
    def three
    1 + 2
    end
    Bytecode
    putobject 1
    putobject 2
    opt_plus
    leave
    =

  17. How is this implemented?
    Ruby Code
    def three
    1 + 2
    end
    Bytecode
    putobject 1
    putobject 2
    opt_plus
    leave
    =
    C code
    three() {
    VALUE stack[2];
    /* putobject 1 */
    stack[0] = 1;
    }

  18. How is this implemented?
    Ruby Code
    def three
    1 + 2
    end
    Bytecode
    putobject 1
    putobject 2
    opt_plus
    leave
    =
    C code
    three() {
    VALUE stack[2];
    /* putobject 1 */
    stack[0] = 1;
    /* putobject 2 */
    stack[1] = 2;
    }

  19. How is this implemented?
    Ruby Code
    def three
    1 + 2
    end
    Bytecode
    putobject 1
    putobject 2
    opt_plus
    leave
    =
    C code
    three() {
    VALUE stack[2];
    /* putobject 1 */
    stack[0] = 1;
    /* putobject 2 */
    stack[1] = 2;
    /* opt_plus */
    stack[0] = opt_plus(
    stack[0], stack[1]
    );
    }

  20. How is this implemented?
    Ruby Code
    def three
    1 + 2
    end
    Bytecode
    putobject 1
    putobject 2
    opt_plus
    leave
    =
    C code
    three() {
    VALUE stack[2];
    /* putobject 1 */
    stack[0] = 1;
    /* putobject 2 */
    stack[1] = 2;
    /* opt_plus */
    stack[0] = opt_plus(
    stack[0], stack[1]
    );
    /* leave */
    return stack[0];
    }
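To look at the bytecode for `three` yourself, CRuby exposes `RubyVM::InstructionSequence`. A minimal sketch, assuming CRuby; instruction names and the listing format vary between Ruby versions, so the output may not match the slide line for line.

```ruby
def three
  1 + 2
end

# RubyVM::InstructionSequence is specific to CRuby.
iseq = RubyVM::InstructionSequence.of(method(:three))
puts iseq.disasm
```

Each line of the listing corresponds to one of the instructions that the generated C function on this slide handles in order.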

  21. C code generator
    ERB template: mjit_compile.inc.erb

  22. C code generator
    ERB template: mjit_compile.inc.erb
    VM instructions: insns.def

  23. C code generator
    ERB template: mjit_compile.inc.erb
    VM instructions: insns.def
    C code generator: mjit_compile.inc
    Render Copy definition
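The idea of rendering a C code generator from an ERB template plus instruction definitions can be sketched as follows. This is an illustration only: the `INSNS` table and the template are hypothetical and far simpler than the real `insns.def` and `mjit_compile.inc.erb`.

```ruby
require "erb"

# Hypothetical, drastically simplified instruction table.
# In CRuby the definitions come from insns.def.
INSNS = {
  "putobject" => "stack[sp++] = operand;",
  "opt_plus"  => "sp--; stack[sp - 1] = opt_plus(stack[sp - 1], stack[sp]);",
  "leave"     => "return stack[sp - 1];",
}

# Hypothetical template: one `case` per instruction, each of which prints
# the C code for that instruction into the file `f`.
TEMPLATE = <<~'ERB'
  switch (insn) {
  <%- INSNS.each do |name, body| -%>
    case BIN(<%= name %>):
      fprintf(f, "  /* <%= name %> */\n");
      fprintf(f, "  <%= body %>\n");
      break;
  <%- end -%>
  }
ERB

generator_source = ERB.new(TEMPLATE, trim_mode: "-").result(binding)
puts generator_source
```

Because the generator is rendered from the instruction table, adding or changing an instruction in the table changes the generated code without hand-editing the generator.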

  24. • Based on Ruby 2.5’s Ruby VM
    • If JIT is disabled, everything must work in 2.6
    • JIT implementation is automatically generated
    • To keep up with frequent Ruby VM changes
    Ruby's JIT design

  25. PORTABILITY

  26. C compiler supports
                   GCC    Clang   Visual C++   Intel C++ Compiler
    MJIT worker    ○      ○       ○            ○
    JIT header     ○      ○       ×            ○
    CLI support    ○      ○       ○            ×
    Support plan   Done   Done    Next         Later
    Now MJIT worker (native thread, dynamic loading) runs on Windows and UNIX

  27. Platform supports with GCC
                   Linux   MinGW   Solaris   NetBSD   FreeBSD
    JIT header     ○       △       ○         ○        ○
    test_jit.rb    ○       ○       ○         ?        ×
    MinGW header is not minimized and thus compilation speed is slow.
    I guess NetBSD works but we don’t have NetBSD RubyCI. GCC on FreeBSD is crashing.

  28. Platform supports with Clang
                   Linux   macOS   OpenBSD
    JIT header     ○       ○       ○
    test_jit.rb    ○       ○       ?
    I guess OpenBSD works but we don’t have OpenBSD RubyCI

  29. PERFORMANCE

  30. RubyKaigi LT: LLVM JIT

  31. RubyKaigi LT: LLVM JIT

  32. RubyKaigi (This talk)

    https://benchmark-driver.github.io/benchmarks/mjit/commits.html

  33. RubyKaigi (This talk)

    https://benchmark-driver.github.io/benchmarks/mjit/commits.html
    2.6.0
    Preview1
    2.6.0
    Preview2
    5.7x faster

  34. Optcarrot (fps)
    [Bar chart: Ruby 2.0, trunk, trunk+JIT]
    1.49x → 2.03x
    https://gist.github.com/k0kubun/95c81358af6f34b4d0a71425da871178

  35. Rails (Discourse)

  36. Rails (Discourse)

  37. 2. JIT ON RAILS

  38. • Generated code should be faster in general
    • What's different from Optcarrot?
    Why Rails becomes slow with JIT?

  39. 1. longjmp by exception is slow
    2. Profiling method calls has overhead
    3. JIT-ed call is canceled too often
    4. JIT compilation has overhead
    5. Calling JIT-ed code has overhead
    My hypothesis

  40. • When a method returns from inside its child block,
    it calls longjmp(3)
    • The VM implements this with just a return statement
    and may be faster in that case
    longjmp by exception is slow

  41. Let's check if longjmp is called
    • Fortunately, longjmp was not used in this
    Discourse endpoint

  42. • MJIT counts method calls to decide which
    method to compile with JIT enabled
    • This was suspected in [Bug #14490]
    Profiling method calls has overhead

  43. Let's count it even if JIT is disabled

  44. No big difference by profiling method calls
                  trunk         modified      trunk
                  No options    No options    --jit
    JIT           ×             ×             ○
    Profiling     ×             ○             ○
    GET / percentile (ms):
    50            58.4ms        58.5ms        66.3ms
    75            65.4ms        64.6ms        72.3ms
    90            67.9ms        67.8ms        77.0ms
    99            131.1ms       127.3ms       133.3ms
    `ruby script/simple_bench.rb 1000` with:
    https://github.com/k0kubun/discourse/tree/20fc03558f16aff94c6c017347783374cf4a0ca8

  45. • MJIT has a kind of de-optimization that falls back to
    VM interpretation when any assumption is not met
    • e.g. method redefinition
    • Such fallback might be an overhead
    JIT-ed call is cancelled too often

  46. Let's log all JIT cancellation

  47. The ratio of JIT cancellation
                                  JIT-ed calls   Cancel by opt_xxx     Cancel by call cache
    Optcarrot                     49,171,765     786,842 (1.60%)       0 (0.00%)
    Discourse (1,000 requests)    168,925,050    19,394,792 (11.5%)    10,092,254 (5.97%)
    JIT cancel reasons:
    • opt_xxx: Non-core class is given to +, -, *, /, #[], etc.
    • call cache: Method redefinition, receiver class is changed
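The percentages in this table are simply cancellations divided by JIT-ed calls. A short sketch that recomputes them from the counts on the slide; the helper name `percent` is an assumption for this example.

```ruby
# Counts copied from the slide's table.
stats = {
  "Optcarrot" => { calls: 49_171_765, opt_xxx: 786_842, call_cache: 0 },
  "Discourse (1,000 requests)" => {
    calls: 168_925_050, opt_xxx: 19_394_792, call_cache: 10_092_254
  },
}

# Hypothetical helper: share of `part` in `total`, as a percentage.
def percent(part, total)
  (100.0 * part / total).round(2)
end

stats.each do |name, s|
  puts format("%-28s opt_xxx: %5.2f%%  call cache: %5.2f%%",
              name,
              percent(s[:opt_xxx], s[:calls]),
              percent(s[:call_cache], s[:calls]))
end
```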

  48. Why JIT cancel happens so often?
    • Current JIT doesn't discard any JIT-ed code
    whose assumption is not met
    • opt_xxx is performing badly when a receiver is not
    a core class like Integer, Float, String, Array, Hash

  49. Why JIT cancel happens so often?
    • Current JIT doesn't discard any JIT-ed code
    whose assumption is not met
    • opt_xxx is performing badly when a receiver is not
    a core class like Integer, Float, String, Array, Hash
    There are many #[] for non Hash/Array classes in Rails
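The cancellation pattern described here can be modelled in Ruby. This is a hypothetical sketch, not MJIT's code: `aref_jit`, `aref_interpreted`, `STATS`, and `Params` are names invented for the example, and the "fast path" is just an ordinary `[]` call standing in for specialized native code.

```ruby
# Hypothetical model of an opt_aref-style fast path with JIT cancellation.
CORE_AREF_CLASSES = [Array, Hash].freeze
STATS = Hash.new(0)

# Stands in for the VM's generic method dispatch.
def aref_interpreted(recv, key)
  recv[key]
end

# Stands in for JIT-ed code that assumes a core receiver class.
def aref_jit(recv, key)
  STATS[:jit_calls] += 1
  if CORE_AREF_CLASSES.include?(recv.class)
    recv[key]                   # assumption holds: stay on the fast path
  else
    STATS[:cancels] += 1        # assumption not met: cancel the JIT-ed call
    aref_interpreted(recv, key) # and fall back to the VM
  end
end

Params = Struct.new(:id) # a non-core receiver, as is common in Rails code

values = [
  aref_jit([10, 20], 1),
  aref_jit({ a: 1 }, :a),
  aref_jit(Params.new(7), :id),
]
puts values.inspect
puts format("cancelled %d of %d JIT-ed calls", STATS[:cancels], STATS[:jit_calls])
```

Every call on a non-core receiver pays for the failed check plus the fallback, which is why frequent `#[]` calls on such objects hurt.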

  50. I fixed this issue for #[] (r…)

  51. opt_xxx cancel is decreased much
                          JIT-ed calls   Cancel by opt_xxx     Cancel by call cache
    Discourse (Before)    168,925,050    19,394,792 (11.5%)    10,092,254 (5.97%)
    Discourse (After)     75,150,482     2,849,825 (3.79%)     3,072,673 (4.09%)
    #[] has a major impact on Rails. Others are to be improved...

  52. • Appending a method to JIT-ed queue may have
    overhead
    • GCC or Clang may use the same CPU core, or it
    may cost to transfer data to another core
    JIT compilation has overhead

  53. Prepare RubyVM::MJIT.stop and
    Stop JIT compilation

  54. JIT enabled vs JIT stopped
    JIT enabled
    1000 requests
    JIT enabled
    1000 requests
    JIT stopped
    1000 requests
    RubyVM::MJIT.stop
    Measure

  55. JIT compilation had overhead
                       No options   --jit → Stop   --jit
    Code is JIT-ed     ×            ○              ○
    JIT is going on    ×            ×              ○
    GET / percentile (ms):
    50                 60.4ms       65.1ms         68.4ms
    75                 66.9ms       72.4ms         74.8ms
    90                 69.6ms       75.8ms         80.0ms
    99                 125.4ms      145.6ms        137.2ms
    But this overhead is excluded from [Bug #14490] degradation…

  56. • JIT-ed code behaves slower only on an exception
    or JIT cancellation, but they weren't the culprit
    • JIT compilation does not dominate the slowness
    • Then, calling native code has overhead…?
    Calling JIT-ed code has overhead

  57. Let's check JIT-ed call overhead

  58. Let's check JIT-ed call overhead

  59. Calling JIT-ed code was slow
    JIT disabled JIT enabled
    Duration 2.17s 2.45s

  60. Let's profile with perf

  61. Guard for --jit-wait takes time

  62. Skip --jit-wait check in main branch
    r63480

  63. Improved a little
    JIT disabled JIT enabled
    Duration 2.17s 2.31s (-0.14s)

  64. Remaining …s was for
    Additional memory access here
    …But it wasn’t a big deal in Rails

  65. What if there are a lot of methods?

  66. Calling many different methods is slow
    Called methods   1 method   15 methods
    JIT disabled     3.69s      3.71s
    JIT enabled      3.79s      5.34s
    Duration with the same total calls
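A benchmark of this shape can be sketched as below. The slides do not show the script that was actually used, so this is an assumption-laden reconstruction: `build_bench`, `TOTAL_CALLS`, and the trivial method bodies are all made up for illustration, and the numbers it prints will not match the table.

```ruby
require "benchmark"

TOTAL_CALLS = 150_000 # assumed, scaled-down call count (divisible by 1 and 15)

# Builds an object with `count` trivial methods and a `run` method that calls
# each of them in turn until `total_calls` calls have been made.
def build_bench(count)
  defs  = (0...count).map { |i| "def m#{i}; #{i}; end" }.join("\n")
  calls = (0...count).map { |i| "m#{i}" }.join("; ")
  klass = Class.new
  klass.class_eval(<<~RUBY)
    #{defs}
    def run(total_calls)
      i = 0
      while i < total_calls
        #{calls}
        i += #{count}
      end
      i
    end
  RUBY
  klass.new
end

[1, 15].each do |count|
  bench = build_bench(count)
  elapsed = Benchmark.realtime { bench.run(TOTAL_CALLS) }
  puts format("%2d method(s): %.3fs", count, elapsed)
end
```

Running the same script with and without the `--jit` option gives the two rows of such a table.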

  67. Calling many different methods is slow
    [Line chart: VM vs. JIT; x axis 1 to 39, y axis 0 to 6]

  68. Calling many different methods is slow
    [Line chart: VM vs. JIT; x axis 1 to 39, y axis 0 to 6; marked points: 6, 12, 19]

  69. Hot methods of Optcarrot
    Top 6 methods
    dominate 50%

  70. Hot methods of Discourse
    Top 6 methods are only 18%
    They are not so hot.
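The "top 6 methods" share can be computed from any per-method profile. A minimal sketch with a hypothetical helper `top_share`; the sample data below is synthetic and only exercises the function, it is not taken from the Optcarrot or Discourse profiles.

```ruby
# Returns the percentage of all samples taken by the n hottest methods.
def top_share(samples_by_method, n)
  total = samples_by_method.values.sum
  return 0.0 if total.zero?
  (100.0 * samples_by_method.values.max(n).sum / total).round(1)
end

# Synthetic profiles, made up purely for illustration.
concentrated = { "a" => 50, "b" => 30, "c" => 20 }
flat = (1..10).to_h { |i| ["m#{i}", 10] }

puts top_share(concentrated, 1)
puts top_share(flat, 1)
```

A profile where a few methods dominate benefits from compiling just those methods, while a flat profile needs many methods compiled before JIT-ed code covers a comparable share of the time.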

  71. Why does this happen?
    [Line chart: VM vs. JIT; x axis 1 to 39, y axis 0 to 6; marked points: 6, 12, 19]

  72. Let's see "perf stat"
    6 methods
    40 methods

  73. "insn per cycle" is very different
    6 methods
    40 methods

  74. …x cycles for almost the same insns
    6 methods
    40 methods

  75. 6 methods

  76. 40 methods

  77. Each method (so file) is using one page (… MB)

  78. 40 methods w/ the same so file PoC

  79. • Ongoing JIT compilation may have overhead
    • JIT cancel is happening frequently (to be fixed)
    • It stalls to load many different methods (to be fixed)
    Reason of Rails slowdown on JIT

  80. 3. DIVE INTO NATIVE CODE

  81. Example
    Ruby Code
    def three
    1 + 2
    end
    Bytecode
    putobject 1
    putobject 2
    opt_plus
    leave
    =

  82. Example
    Ruby Code
    def three
    1 + 2
    end
    Bytecode
    putobject 1
    putobject 2
    opt_plus
    leave
    =
    C code
    three() {
    VALUE stack[2];
    /* putobject 1 */
    stack[0] = 1;
    /* putobject 2 */
    stack[1] = 2;
    /* opt_plus */
    stack[0] = opt_plus(
    stack[0], stack[1]
    );
    /* leave */
    return stack[0];
    }

  83. More detailed definition
    before inlining opt_plus

  84. opt_plus inlined

  85. Native code
    generated by GCC
    (output of perf)

  86. Integer#+
    redefinition check

  87. JIT cancel handler
    Let's ignore this
    Integer#+
    redefinition check

  88. Integer#+
    redefinition check

  89. SET_SP: VM's behavior
    which can be removed
    Integer#+
    redefinition check

  90. Check interrupts like
    SIGINT, another thread
    Interruption handler
    Integer#+
    redefinition check
    SET_SP: VM's behavior
    which can be removed

  91. Interruption handler
    Check interrupts like
    SIGINT, another thread
    Pop VM call frame
    Integer#+
    redefinition check
    SET_SP: VM's behavior
    which can be removed

  92. Pop VM call frame
    Interruption handler
    Check interrupts like
    SIGINT, another thread
    Return 3
    FIX2INT(0x7) == 3
    Integer#+
    redefinition check
    SET_SP: VM's behavior
    which can be removed
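The `FIX2INT(0x7) == 3` annotation reflects how CRuby tags small integers: the value is shifted left by one bit and the lowest bit is set to 1. A small Ruby sketch of that encoding; `int2fix` and `fix2int` are illustrative re-implementations written for this example, not the C macros themselves.

```ruby
# Illustrative Ruby versions of the Fixnum tagging used by CRuby's VALUE.
def int2fix(n)
  (n << 1) | 1 # shift left and set the tag bit
end

def fix2int(value)
  value >> 1   # drop the tag bit
end

puts format("int2fix(3)   = 0x%x", int2fix(3))
puts format("fix2int(0x7) = %d", fix2int(0x7))
puts format("fix2int(0xd) = %d", fix2int(0xd))
```

So a native function that returns the constant `0x7` is returning the Ruby integer 3 without computing anything at run time.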

  93. So what?

  94. Instruction dispatch
    Instruction dispatch
    Instruction dispatch
    Instruction dispatch
    1. Instruction dispatch
    cost is removed

  95. Program counter motion
    Program counter motion
    Program counter motion
    Program counter motion
    2. No program counter
    motion

  96. Stack pointer motion
    Stack pointer motion
    Stack pointer motion
    Forgot to remove this
    3. Stack pointer motion
    is reduced

  97. And also...

  98. 4. This optimization is
    delegated to GCC

  99. What if it's more complex?
    def six
    4 + 8 - 3 * 4 / 2
    end

  100. View Slide

  101. Fixnum#+ redefinition check
    Fixnum#* redefinition check
    Fixnum#/ redefinition check
    Fixnum#- redefinition check
    Return 6
    FIX2INT(0xd) == 6

  102. Last Example: while loop
    def while_loop
    i = 0
    while i < 1000000
    i += 1
    end
    end
    i = 0
    while i < 2000
    while_loop
    i += 1
    end

  103. Last Example: while loop
    VM: 22.9s JIT: 2.8s
    Why does it become so much faster?
    8.18x faster
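A scaled-down version of this measurement can be run as follows. The loop sizes are assumptions chosen so the script finishes quickly; `while_loop` here takes a limit and returns the counter, which differs slightly from the slide's method. The last line only recomputes the speedup from the two durations on the slide.

```ruby
require "benchmark"

# Same shape as the slide's method, but with a configurable limit.
def while_loop(limit)
  i = 0
  while i < limit
    i += 1
  end
  i
end

# Assumed, much smaller sizes than the slide's 1,000,000 x 2,000.
elapsed = Benchmark.realtime do
  20.times { while_loop(100_000) }
end
puts format("elapsed: %.3fs", elapsed)

# Speedup implied by the durations reported on the slide.
speedup = (22.9 / 2.8).round(2)
puts "reported speedup: #{speedup}x"
```

Running it once without options and once with `--jit` gives the two durations to compare.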

  104. Let's see the native code of while_loop
    i = 0
    while i < 1000000
    i += 1
    end

  105. View Slide

  106. They are JIT cancel handlers
    Let's ignore them

  107. View Slide

  108. They are interruption handlers
    Let's ignore them

  109. View Slide

  110. They are write-barrier-related slow paths
    Let's ignore them

  111. View Slide

  112. It is a Bignum promotion handler
    Let's ignore it

  113. View Slide

  114. View Slide

  115. Again, let's see the native code of
    i = 0
    while i < 1000000
    i += 1
    end

  116. i = 0

  117. i = 0
    check
    interrupts

  118. i = 0
    Fixnum?(i)
    for #<
    check
    interrupts

    View Slide

  119. i = 0
    check
    interrupts
    Fixnum#<
    redefined?
    Fixnum?(i)
    for #<

  120. i = 0
    check
    interrupts
    i < 1000000
    Fixnum?(i)
    for #<
    Fixnum#<
    redefined?

  121. i = 0
    i < 1000000
    check
    interrupts
    check
    interrupts
    Fixnum?(i)
    for #<
    Fixnum#<
    redefined?

  122. i = 0
    i < 1000000
    check
    interrupts
    check
    interrupts
    Fixnum#+
    redefined?
    Fixnum?(i)
    for #<
    Fixnum#<
    redefined?

  123. i = 0
    i < 1000000
    check
    interrupts
    check
    interrupts
    i + 1
    Fixnum?(i)
    for #<
    Fixnum#<
    redefined?
    Fixnum#+
    redefined?

  124. i = 0
    i < 1000000
    check
    interrupts
    check
    interrupts
    Int overflow?
    i + 1
    Fixnum?(i)
    for #<
    Fixnum#<
    redefined?
    Fixnum#+
    redefined?

  125. i = 0
    i < 1000000
    check
    interrupts
    check
    interrupts
    Int overflow?
    can't optimize #+ ?
    i + 1
    Fixnum?(i)
    for #<
    Fixnum#<
    redefined?
    Fixnum#+
    redefined?

  126. i = 0
    i < 1000000
    check
    interrupts
    check
    interrupts
    Int overflow?
    can't optimize #+ ?
    i + 1
    Fixnum?(i)
    for #<
    Fixnum#<
    redefined?
    Fixnum#+
    redefined?
    set i for VM
    + check WB

  127. i = 0
    i < 1000000
    check
    interrupts
    check
    interrupts
    Int overflow?
    can't optimize #+ ?
    i + 1
    Fixnum?(i)
    for #<
    Fixnum#<
    redefined?
    Fixnum#+
    redefined?
    set i for VM
    + check WB
    set i for JIT

  128. i = 0
    i < 1000000
    check
    interrupts
    check
    interrupts
    Int overflow?
    i + 1
    Fixnum?(i)
    for #<
    Fixnum#<
    redefined?
    Fixnum#+
    redefined?
    can't optimize #+ ?
    set i for VM
    + check WB
    set i for JIT

  129. • #+ and #< are performed on registers, not on the VM stack
    • #+ and #< share some instructions to check redefinition
    • Unnecessary type checks are omitted from the loop
    Why while loop becomes faster?

  130. 4. METHOD INLINING

  131. • Many optimizations are possible because C
    compiler can know definitions
    • If we could inline methods, C compiler would
    be able to optimize more
    Let C compiler work hard

  132. 1. x
    2. x
    3. x
    When is method inlining possible?

  133. 1. JIT compiler can know definitions
    2. JIT compiler can modify code to call a method
    3. Inlined code can be invalidated
    When is method inlining possible?

  134. 1. JIT compiler can know definitions
    2. JIT compiler can modify code to call a method
    3. Inlined code can be invalidated
    When is method inlining possible?

  135. 1. JIT compiler can know definitions
    2. JIT compiler can modify code to call a method
    3. Inlined code can be invalidated
    When is method inlining possible?
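The third condition, that inlined code can be invalidated, can be illustrated with a small Ruby model. Everything here is hypothetical: `RedefinitionSerial`, `InlinedCall`, and `Calculator` are invented for the example, and a global serial bumped from `method_added` is only one simple way to detect redefinition, not necessarily what MJIT does.

```ruby
# Hypothetical global counter that changes whenever a watched method is (re)defined.
module RedefinitionSerial
  @serial = 0
  class << self
    attr_reader :serial

    def bump
      @serial += 1
    end
  end
end

class Calculator
  # Ruby calls this hook after every method definition in the class.
  def self.method_added(name)
    RedefinitionSerial.bump
    super
  end

  def double(x)
    x * 2
  end
end

# Stands in for a call site with an inlined method body.
class InlinedCall
  def initialize(klass, name)
    @klass = klass
    @name = name
    compile
  end

  def call(receiver, *args)
    # Guard: if anything was redefined since "compilation", throw the copy away.
    compile unless @serial == RedefinitionSerial.serial
    @body.bind(receiver).call(*args)
  end

  private

  def compile
    @serial = RedefinitionSerial.serial
    @body = @klass.instance_method(@name) # snapshot of the current definition
  end
end

site = InlinedCall.new(Calculator, :double)
calc = Calculator.new
before = site.call(calc, 2)

Calculator.class_eval do
  def double(x) # redefinition bumps the serial via method_added
    x * 3
  end
end
after = site.call(calc, 2)
puts [before, after].inspect
```

Without the serial check the call site would keep using its stale snapshot of `double`, which is exactly the bug that invalidation prevents.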

  136. • Ruby method
    • called by Ruby method
    • called by C method
    • Ruby block
    • yield-ed by Ruby method
    • called by C method
    • C method
    • called by Ruby method
    • called by C method
    Major inline targets

  137. • Ruby method
    • called by Ruby method => easy
    • called by C method
    • Ruby block
    • yield-ed by Ruby method
    • called by C method
    • C method
    • called by Ruby method
    • called by C method
    Major inline targets
    JIT compiler can deal
    with bytecode easily
    Method cache can be
    used for invalidation

  138. • Ruby method
    • called by Ruby method => easy
    • called by C method
    • Ruby block
    • yield-ed by Ruby method => medium
    • called by C method
    • C method
    • called by Ruby method => medium
    • called by C method
    Major inline targets
    yield doesn't
    have cache
    Sometimes it's hard
    to know definitions

  139. • Ruby method
    • called by Ruby method => easy
    • called by C method => hard
    • Ruby block
    • yield-ed by Ruby method => medium
    • called by C method => hard
    • C method
    • called by Ruby method => medium
    • called by C method => hard
    Major inline targets
    There is no cache
    key for invalidation
    How to modify
    C code?

  140. Ruby → C → Ruby inlining problem
    ret = 0
    1000000.times do |i|
    ret += i
    end
    ret

  141. ret = 0
    1000000.times do |i|
    ret += i
    end
    ret
    Ruby -> C method call
    medium
    Integer#times is defined with C
    Ruby → C → Ruby inlining problem

  142. ret = 0
    1000000.times do |i|
    ret += i
    end
    ret
    Ruby -> C method call
    medium
    Integer#times is defined with C
    C -> Ruby block call
    hard
    Ruby → C → Ruby inlining problem

  143. What if...
    Ruby can be faster than C?

  144. What if...
    Ruby can be faster than C?

  145. Let's define Integer#times with Ruby
    https://github.com/rubinius/rubinius/blob/master/core/integer.rb
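A pure-Ruby `Integer#times` can look roughly like this. It is a sketch written for this transcript, not the Rubinius source linked above, and it uses the hypothetical name `times_in_ruby` so that the built-in method is left untouched.

```ruby
class Integer
  # Hypothetical pure-Ruby counterpart of Integer#times.
  def times_in_ruby
    return to_enum(:times_in_ruby) unless block_given?

    i = 0
    while i < self
      yield i
      i += 1
    end
    self
  end
end

ret = 0
1000.times_in_ruby do |i|
  ret += i
end
puts ret
```

Because both the loop and the block are now Ruby code, a JIT compiler that inlines Ruby methods and blocks gets a chance to see the whole loop at once.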

  146. • Ruby method
    • called by Ruby method => easy
    • called by C method => hard
    • Ruby block
    • yield-ed by Ruby method => medium
    • called by C method => hard
    • C method
    • called by Ruby method => medium
    • called by C method => hard
    I implemented a prototype to inline this
    https://github.com/k0kubun/ruby/commits/mjit-inline-send-yield

  147. Time to benchmark

  148. Integer#times benchmark results
           Integer#times in C   Integer#times in Ruby
    VM     145.44s (1.00x)      156.38s (0.93x)
    JIT
    time ruby --disable-gems times_loop.rb

  149. Integer#times benchmark results
           Integer#times in C   Integer#times in Ruby
    VM     145.44s (1.00x)      156.38s (0.93x)
    JIT    104.80s (1.39x)
    time ruby --disable-gems times_loop.rb

  150. Integer#times benchmark results
           Integer#times in C   Integer#times in Ruby
    VM     145.44s (1.00x)      156.38s (0.93x)
    JIT    104.80s (1.39x)      56.46s (2.56x)
    time ruby --disable-gems times_loop.rb

  151. C language is dead

  152. • Rails performance is going to be improved
    • JIT can eliminate many instructions
    • C language will be useless in the future
    Conclusion