Slide 1

Slide 1 text

Why Ruby's JIT was slow RubyKaigi Takeout 2021 @k0kubun / Takashi Kokubun

Slide 2

Slide 2 text

Self introduction ● GitHub, Twitter: @k0kubun ● Ruby committer ○ JIT ○ IRB: Color, ls, show_source ○ Struct keyword_init ● Treasure Data

Slide 3

Slide 3 text

My first RubyKaigi: 2015

Slide 4

Slide 4 text

Hamlit will be Haml 6 (?) (Manually merged)

Slide 5

Slide 5 text

Why Ruby's JIT was slow

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

https://gist.github.com/k0kubun/cbc5251be1c19e36b7b7f786db302465

Slide 9

Slide 9 text

https://gist.github.com/k0kubun/cbc5251be1c19e36b7b7f786db302465

Slide 10

Slide 10 text

Why was Ruby's JIT slow? ● The "MJIT" architecture was making Rails slow ○ MJIT: C compiler + dlopen ○ A lot of duplication in generated code ■ Ruby 3.0 fixed it ■ Ruby 3.1 will have a better default config for it

Slide 11

Slide 11 text

Other drawbacks of MJIT ● Compilation is too slow ○ 5 min to fully warm up 1,000 methods on Railsbench ● Compilation overhead is too large ● PIC (position-independent code) is slower by several cycles

Slide 12

Slide 12 text

JIT's architecture matters! ● It took 3 years to fix MJIT's bottleneck ○ and other drawbacks still remain ● It impacts your daily Ruby usage ○ MJIT will make your Rails app slower during its long warm-up ○ MJIT requires a C compiler at runtime

Slide 13

Slide 13 text

MJIT for competitive programming? How about supporting JIT in AtCoder? (competitive programming website)

Slide 14

Slide 14 text

MJIT for competitive programming? https://docs.google.com/spreadsheets/d/1PmsqufkF3wjKN6g1L0STS80yP4a6u-VdGiEv5uOHe0M/edit Removed --jit because it actually makes 2s of use slower.

Slide 15

Slide 15 text

vs Golang https://youtu.be/mMwC0QenvcA?t=5188

Slide 16

Slide 16 text

vs Python https://youtu.be/vucLAqv7qpc

Slide 17

Slide 17 text

The goal of this talk ● Discuss the JIT architecture of Ruby ○ It will impact your future use of Ruby ○ JIT authors could reduce development effort

Slide 18

Slide 18 text

Layers of concerns ● VM: RTL, YARV ● JIT compiler: RTL-MJIT (Feature #12589), YARV-MJIT (mjit_compile.c, Ruby 2.6~3.0), MIR-based JIT (MIR), yjit_codegen.c (YJIT) ● Codegen: MJIT (C compiler), MIR, YJIT

Slide 19

Slide 19 text

Discussion points ● Maintainability ● Internal Representation ● How to compile and generate code ● What optimization is feasible

Slide 20

Slide 20 text

Maintainability

Slide 21

Slide 21 text

Why did we choose MJIT? ● One reason: maintainability ○ You can use gdb and see C code while debugging JIT-ed methods

Slide 22

Slide 22 text

Automatic support of new instructions ● Support new instructions automatically ○ Koichi's idea ○ This may be helpful for any JIT ● People think MJIT pastes C code and lets GCC do everything, but that's wrong ○ GCC alone can't perform most Ruby-specific optimizations

Slide 23

Slide 23 text

Language to implement JIT: C vs Ruby ● Use Ruby to write the JIT compiler? ○ Ractor: always multi-ractor or create and stop every time? ○ Inter-process communication with a JIT process?

Slide 24

Slide 24 text

Internal Representation

Slide 25

Slide 25 text

YARV vs RTL ● YARV-MJIT is no longer slower than RTL-MJIT ○ We didn’t need to rewrite the VM for JIT’s performance.

Slide 26

Slide 26 text

What RTL had ● Register-based instructions ● Speculative instructions

Slide 27

Slide 27 text

Speculative instructions ● These work like a profiler of runtime information ● Alternatively, we could let JIT generate code for profiling ○ The current MJIT generates the most speculative code first, and then recompiles the code with some optimizations disabled when cancelled ○ YJIT's basic block versioning also profiles type information, etc.
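The "most speculative code first, then recompile when cancelled" flow can be sketched as a toy model in plain Ruby. This is only an illustration, not MJIT's implementation: `ToyJIT`, `Cancelled`, and `compile_count` are hypothetical names, and the "compiled code" is a lambda rather than machine code.

```ruby
# Toy model only: "compiled code" is a Ruby lambda, not machine code.
# All names here (ToyJIT, Cancelled, compile_count) are hypothetical.
class ToyJIT
  Cancelled = Class.new(StandardError)

  attr_reader :compile_count

  def initialize
    @compile_count = 0
    @speculate_integer = true  # start with the optimization enabled
    @code = compile
  end

  # Run the current code. If a guard fails, disable the optimization,
  # recompile, and retry with the more generic code.
  def add(a, b)
    @code.call(a, b)
  rescue Cancelled
    @speculate_integer = false
    @code = compile
    @code.call(a, b)
  end

  private

  def compile
    @compile_count += 1
    if @speculate_integer
      # Most speculative version: guard on the assumed types, then fast path.
      lambda do |a, b|
        raise Cancelled unless a.is_a?(Integer) && b.is_a?(Integer)
        a + b
      end
    else
      # Generic version: ordinary method dispatch, no guard.
      ->(a, b) { a.public_send(:+, b) }
    end
  end
end

jit = ToyJIT.new
jit.add(1, 2)    # guard passes, fast path
jit.add(1.5, 2)  # guard fails, code is recompiled without the speculation
```

After the second call the object has compiled twice and keeps using the generic version, which mirrors the idea that a cancelled optimization stays disabled.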

Slide 28

Slide 28 text

C method inlining ● LLVM, MIR ● Rewrite everything in Ruby: YARV ● TruffleRuby does both

Slide 29

Slide 29 text

Compilation and Code Generation

Slide 30

Slide 30 text

Compilation: Sync vs Async ● MJIT: Concurrently JIT-compile methods in an MJIT worker thread ● YJIT: Ruby threads JIT-compile methods during execution
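The difference between the two models can be sketched in plain Ruby. This is a rough illustration under stated assumptions, not how either JIT is implemented: `AsyncJIT` and `SyncJIT` are hypothetical classes, and "compiling" a method here just means storing a lambda under its name.

```ruby
# Toy model: "compiling" a method just means storing a lambda for it.
# AsyncJIT and SyncJIT are hypothetical names used only for illustration.

# Async (MJIT-like): requests are queued and a worker thread compiles them
# while the requesting thread keeps running.
class AsyncJIT
  def initialize
    @queue = Queue.new
    @compiled = {}
    @lock = Mutex.new
    @worker = Thread.new do
      while (name = @queue.pop)
        code = ->(x) { x * 2 } # stand-in for a slow compilation
        @lock.synchronize { @compiled[name] = code }
      end
    end
  end

  def request(name)
    @queue << name # returns immediately; compiled code appears later
  end

  def compiled?(name)
    @lock.synchronize { @compiled.key?(name) }
  end

  def shutdown
    @queue << nil # nil ends the worker loop
    @worker.join
  end
end

# Sync (YJIT-like): the calling thread compiles before it continues.
class SyncJIT
  def initialize
    @compiled = {}
  end

  def request(name)
    @compiled[name] = ->(x) { x * 2 } # compiled by the time this returns
  end

  def compiled?(name)
    @compiled.key?(name)
  end
end
```

With `SyncJIT`, `compiled?` is true as soon as `request` returns. With `AsyncJIT`, it only becomes true once the worker has processed the queue, which is why slow compilation translates into a long warm-up in the async model.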

Slide 31

Slide 31 text

Compilation speed ● MJIT: 50~200ms for a single compile, and minutes for compaction ● YJIT, MIR: < 1ms
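As a back-of-the-envelope check, these per-compile figures can be combined with the 1,000-method Railsbench number from the earlier slide. The calculation assumes methods are compiled one at a time in sequence, which is an assumption made for this sketch rather than something the slides state.

```ruby
# Rough estimate only; assumes strictly sequential compilation.
METHODS = 1_000            # Railsbench figure from the earlier slide
PER_COMPILE_MS = 50..200   # MJIT: time for a single compile

low_s  = METHODS * PER_COMPILE_MS.min / 1000.0
high_s = METHODS * PER_COMPILE_MS.max / 1000.0
puts "MJIT, sequential compiles: #{low_s}s to #{high_s}s"

# YJIT, MIR: under 1 ms per compile, so under about 1 s for the same count.
fast_upper_s = METHODS * 1 / 1000.0
puts "YJIT/MIR upper bound: #{fast_upper_s}s"
```

Under that assumption, the compiles alone take roughly 50 to 200 seconds, and the compaction step, which the slide puts at minutes, would come on top of that.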

Slide 32

Slide 32 text

Code generation ● C compiler ○ Slow startup ● LLVM ○ Binary size, build complexity ● MIR ● YJIT's assembler

Slide 33

Slide 33 text

Feasible optimizations

Slide 34

Slide 34 text

JIT code dispatch ● call vs jmp (direct threading) ○ jmp is hard for MJIT ○ But fortunately call seems faster, even in YJIT

Slide 35

Slide 35 text

Deoptimization ● On-stack replacement ○ mprotect + SEGV handler ○ Code patching ● It’s hard for MJIT to manipulate low-level information

Slide 36

Slide 36 text

Method frame skip ● We already have one since Ruby 2.7 ○ We also supported frame skip of more methods in Ruby 3.0 ● Next: Lazy method frame push ○ This is probably feasible in MJIT as well

Slide 37

Slide 37 text

Conclusion ● JIT's architecture may impact: ○ When you can use it ○ Warmup speed ○ Performance of VM and JIT ○ Build and runtime dependencies