Slide 1

Slide 1 text

Ruby 3 JIT's roadmap RubyConf China 2020 Takashi Kokubun / @k0kubun

Slide 2

Slide 2 text

Self introduction • GitHub, Twitter: @k0kubun • Treasure Data, Inc. • Ruby committer • JIT: 2017~2020

Slide 3

Slide 3 text

GitHub Sponsors - Thank you!

Slide 4

Slide 4 text

Agenda 1. What can Ruby's JIT do? 2. Ruby 3 JIT's roadmap 3. Recent progress in Ruby 3 4. Current challenges

Slide 5

Slide 5 text

1. What can Ruby's JIT do?

Slide 6

Slide 6 text

Ruby JIT's architecture (diagram labels: Ruby interpreter, JIT thread, Ruby thread)

Slide 7

Slide 7 text

Ruby JIT's architecture (diagram labels: Ruby interpreter, JIT thread, Ruby thread, Method, .c file)

Slide 8

Slide 8 text

Ruby JIT's architecture (diagram labels: Ruby interpreter, JIT thread, Ruby thread, Method, .so file, C compiler, .c file, Run)

Slide 9

Slide 9 text

Ruby JIT's architecture (diagram labels: Ruby interpreter, JIT thread, Ruby thread, Method, .so file, C compiler, .c file, Run, Load and call)

Slide 10

Slide 10 text

What we can do with Ruby's JIT • Compile hot Ruby methods to native code • Eliminate VM interpretation cost: SP / PC • Optimize based on what the C compiler can know • Apply the Ruby VM-specific optimizations we implemented
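To see whether any of this applies to a given process, it helps to check whether the JIT is actually on. The sketch below is a minimal, hedged example: it assumes the interpreter was started with the `--jit` flag and that it exposes `RubyVM::MJIT.enabled?` (Ruby 2.6 through 3.2 do; other releases may not define the module, which is why every step is guarded). The helper name `mjit_enabled?` is made up for this illustration.

```ruby
# Hypothetical helper: reports whether MJIT is available and switched on.
# Guards every step so it also runs on interpreters without RubyVM::MJIT.
def mjit_enabled?
  return false unless defined?(RubyVM::MJIT)
  return false unless RubyVM::MJIT.respond_to?(:enabled?)

  RubyVM::MJIT.enabled? ? true : false
end

puts "MJIT enabled: #{mjit_enabled?}"
```

Run it once as `ruby check.rb` and once as `ruby --jit check.rb` to compare the two modes on a release that ships MJIT.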

Slide 11

Slide 11 text

What we CAN'T do with Ruby's JIT • Optimize a short-running program: the JIT needs time to optimize many methods, and things may be slower while the C compiler is running • Optimize based on the native code generated by the C compiler • Deoptimize based on the native instruction pointer / stack pointer

Slide 12

Slide 12 text

Use case: Obviate micro optimizations (labels: Slower, Faster)

Slide 13

Slide 13 text

Use case: Obviate micro optimizations

Slide 14

Slide 14 text

Use case: Obviate micro optimizations (chart: num.zero? vs. num == 0, iteration/sec on Ruby, VM vs. JIT)
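A rough way to reproduce this kind of comparison locally is sketched below. It assumes the two variants being compared are `num.zero?` and `num == 0`; the method names, the iteration count, and the use of the standard `benchmark` library are choices made for this illustration, not the talk's actual benchmark setup.

```ruby
require 'benchmark'

# Variant 1: the readable predicate form.
def zero_by_predicate?(num)
  num.zero?
end

# Variant 2: the hand-optimized comparison form.
def zero_by_comparison?(num)
  num == 0
end

# Times both variants and returns elapsed seconds for each.
def compare_zero_checks(iterations = 1_000_000)
  {
    predicate: Benchmark.realtime { iterations.times { |i| zero_by_predicate?(i) } },
    comparison: Benchmark.realtime { iterations.times { |i| zero_by_comparison?(i) } }
  }
end

compare_zero_checks.each do |name, seconds|
  puts format('%-10s %.3fs', name, seconds)
end
```

Running the script with and without the JIT enabled shows how much of the gap between the two forms, if any, remains once the JIT has warmed up.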

Slide 15

Slide 15 text

2. Ruby 3 JIT's roadmap

Slide 16

Slide 16 text

Ruby JIT's goals • Optcarrot: 3x faster than Ruby 2.0 • Sinatra, Rails: 10% throughput increase vs VM

Slide 17

Slide 17 text

mame/optcarrot (chart: frames/sec across three Ruby releases, VM vs. JIT)

Slide 18

Slide 18 text

benchmark-driver/sinatra (chart: requests/sec across three Ruby releases, VM vs. JIT)

Slide 19

Slide 19 text

k0kubun/railsbench (chart: requests/sec across three Ruby releases, VM vs. JIT)

Slide 20

Slide 20 text

What should we do? • Current status: • Programs like Optcarrot run faster • Sinatra and Rails are still slightly slower than in no-JIT mode • Let's look at each major Ruby feature and at the JIT core

Slide 21

Slide 21 text

Ruby 3 JIT's roadmap 1. Variables / Constants 2. Method inlining 3. Constant folding 4. Object allocation 5. Deoptimization 6. Scalability

Slide 22

Slide 22 text

1. Variables / Constants • Local variables: ⚠ • Instance variables: ✅ • Global variables: ❌ • Constants: ❌

Slide 23

Slide 23 text

2. Method inlining • Ruby method: ✅ • C method: ✅ • super: ⚠ • yield: ⚠

Slide 24

Slide 24 text

3. Constant folding • VM-optimized instructions: ⚠ • C method: ⚠

Slide 25

Slide 25 text

4. Object allocation • Stack allocation: ⚠ • Static allocation: ❌

Slide 26

Slide 26 text

5. Deoptimization • Reduce safepoints: ✅ • Zero-cost deoptimization: ❌

Slide 27

Slide 27 text

6. Scalability • Single-page code: ✅ • Code size reduction: ⚠ • JIT dispatch cost: ⚠

Slide 28

Slide 28 text

3. Recent progress in Ruby 3

Slide 29

Slide 29 text

Decrease ICache misses • ICache: Instruction Cache • Sinatra and Rails spend a lot of time on ICache misses • JIT increases that time further

Slide 30

Slide 30 text

VTune: mame/optcarrot - VM

Slide 31

Slide 31 text

VTune: mame/optcarrot - JIT

Slide 32

Slide 32 text

VTune: benchmark-driver/sinatra - VM

Slide 33

Slide 33 text

VTune: benchmark-driver/sinatra - JIT

Slide 34

Slide 34 text

Decrease ICache misses • We implemented: • Deduplication of the same code • Hot / cold partitioning

Slide 35

Slide 35 text

Decrease ICache misses

Slide 36

Slide 36 text

Merge type checks on ivar access • An instance variable's index is class-specific • So we can check the class only once per method instead of on every access
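The idea can be modeled in plain Ruby. The sketch below is a toy simulation, not CRuby's implementation: `ToyClass`, `ToyObject`, `guard`, and the counter hash are all hypothetical names, with `iv_index_tbl` standing in for the per-class table that maps an ivar name to its slot index. It counts how many class checks each strategy performs.

```ruby
# Toy stand-in for a class: maps ivar names to slot indexes.
class ToyClass
  attr_reader :iv_index_tbl

  def initialize(*names)
    @iv_index_tbl = names.each_with_index.to_h
  end
end

# Toy stand-in for an object: a class pointer plus a flat slot array.
ToyObject = Struct.new(:klass, :slots)

POINT = ToyClass.new(:x, :y)

# Class guard; counts how often it runs.
def guard(obj, klass, counter)
  counter[:checks] += 1
  obj.klass.equal?(klass)
end

# Slow path: look the index up in the class's table.
def generic_get(obj, name)
  obj.slots[obj.klass.iv_index_tbl.fetch(name)]
end

# One guard per ivar access.
def sum_per_access(obj, counter)
  x = guard(obj, POINT, counter) ? obj.slots[0] : generic_get(obj, :x)
  y = guard(obj, POINT, counter) ? obj.slots[1] : generic_get(obj, :y)
  x + y
end

# One guard for the whole method; indexes are then known to be valid.
def sum_merged(obj, counter)
  return generic_get(obj, :x) + generic_get(obj, :y) unless guard(obj, POINT, counter)

  obj.slots[0] + obj.slots[1]
end
```

Both functions return the same sum for the same object; the merged version simply pays for the class check once because the cached indexes stay valid as long as the class does not change.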

Slide 37

Slide 37 text

Merge type checks on ivar access • RObject (embed=false): flags, numiv, ivptr → heap, iv_index_tbl • RObject (embed=true): flags, ary[], ary[], ary[]

Slide 38

Slide 38 text

Merge type checks on ivar access (chart: frames/sec across two Ruby releases, VM vs. JIT)

Slide 39

Slide 39 text

Inline C method call • Inlining C methods had been hard because of: • The difficulty of detecting whether it's safe to omit a call frame • Lots of indirection between a method call and the actual C function

Slide 40

Slide 40 text

Inline C method call • We introduced a new type of method definition in CRuby core, called “builtin method”

Slide 41

Slide 41 text

Inline C method call • We also added a way to “annotate” a C function in a “builtin method” • Now we can declare that a C function is safe to inline

Slide 42

Slide 42 text

Inline C method call (chart: Kernel#class, i/s; VM, JIT, JIT inlining; 1.7x, 1.3x)

Slide 43

Slide 43 text

Inline C method call (label: x is Fixnum)

Slide 44

Slide 44 text

4. Current challenges

Slide 45

Slide 45 text

Allow exception on “inline” method • A method that raises an exception can't be annotated “inline” • But many methods raise: • TypeError • NoMemoryError • Could we lazily update the backtrace and other state?

Slide 46

Slide 46 text

Optimize VM -> JIT call • A VM -> JIT call is slower than a VM -> VM call • Fixing this might offset the slowdown from ICache misses • Prepare a fast path / VM insn specialized for JIT calls

Slide 47

Slide 47 text

Optimize VM -> JIT call • Benchmark methods: def foo3; nil; end, def foo2; nil; end, def foo1; nil; end • Chart: time to call a method returning nil (ns, 0 to 32) vs. number of called methods (1 to 97), VM vs. JIT
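A measurement in the same spirit can be sketched as follows. This is an approximation under stated assumptions, not the script behind the chart: `build_host`, `call_all`, and `ns_per_call` are hypothetical names, the iteration count is arbitrary, and the reported time also includes the overhead of the surrounding loop and of the `call_all` wrapper.

```ruby
require 'benchmark'

# Builds an anonymous class with foo1..fooN, each returning nil,
# plus call_all, which calls every one of them directly.
def build_host(count)
  host = Class.new
  (1..count).each { |i| host.class_eval("def foo#{i}; nil; end") }
  calls = (1..count).map { |i| "foo#{i}" }.join('; ')
  host.class_eval("def call_all; #{calls}; end")
  host
end

# Average wall-clock nanoseconds per fooN call.
def ns_per_call(count, iterations: 100_000)
  obj = build_host(count).new
  elapsed = Benchmark.realtime { iterations.times { obj.call_all } }
  elapsed * 1_000_000_000 / (iterations * count)
end

[1, 5, 9].each do |count|
  puts format('%2d methods: %.1f ns/call', count, ns_per_call(count))
end
```

Comparing the output of a plain run with a JIT-enabled run for a growing number of methods gives a rough view of how per-call cost changes as more distinct methods are involved.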

Slide 48

Slide 48 text

Improve inlining decision • Rails has polymorphic methods • If such a method is inlined into a specific caller, the receiver class can be specific to that caller
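The effect can be illustrated with a small profile of receiver classes per call site. Everything here is a hypothetical model for illustration (`CallSiteProfile`, the site names, and the `render_*` callers are invented): the shared helper sees several classes, while a copy of the call site placed in each caller, as inlining would do, sees only one.

```ruby
# Records which receiver classes each named call site has observed.
class CallSiteProfile
  def initialize
    @seen = Hash.new { |hash, site| hash[site] = [] }
  end

  def record(site, receiver)
    klass = receiver.class
    @seen[site] << klass unless @seen[site].include?(klass)
    receiver
  end

  def classes_at(site)
    @seen[site]
  end

  def monomorphic?(site)
    @seen[site].size == 1
  end
end

PROFILE = CallSiteProfile.new

# Shared helper: its single call site is reached from every caller.
def label(value)
  PROFILE.record(:label_shared, value).to_s
end

# Each caller also carries its own copy of the call site,
# modeling what the site would observe after inlining.
def render_count(count)
  label(count)
  PROFILE.record(:inlined_into_render_count, count).to_s
end

def render_name(name)
  label(name)
  PROFILE.record(:inlined_into_render_name, name).to_s
end

render_count(1)
render_name(:ruby)
```

After these two calls, the shared site has seen both Integer and Symbol, while each per-caller site has seen exactly one class, which is the situation in which a class-specific fast path becomes possible.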

Slide 49

Slide 49 text

Improve inlining decision

Slide 50

Slide 50 text

… and many more things in the roadmap • Inline `super` and `yield` • Optimize local variables • Optimize constants • …

Slide 51

Slide 51 text

Summary • We reviewed Ruby 3 JIT's roadmap and what we've implemented so far. • While the JIT isn't useful for Rails yet, we've made progress toward that goal.