Optimizing Production Performance with MRI JIT / RubyConf 2021

RubyConf 2021
https://rubyconf.org/

Takashi Kokubun

October 27, 2021

Transcript

  1. Optimizing Production Performance with MRI JIT @k0kubun / Takashi Kokubun

  2. Self introduction • GitHub, Twitter: @k0kubun • Ruby committer: JIT, ERB, IRB • Company: Treasure Data
  3. Agenda • Introduction to MRI JIT • Tuning JIT performance for Rails • Warming up MRI JIT • The future of MRI JIT
  4. Introduction to MRI JIT

  5. The list of MRI JITs • MJIT • YJIT • MIR
  6. MJIT • Merged in Ruby 2.6 • Optionally enabled by --jit • Runs a C compiler at runtime • Supports GCC, Clang, and MSVC
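
Since MJIT is opt-in, it can be worth confirming from inside the process that `--jit` actually took effect. A minimal sketch, assuming a Ruby version that defines `RubyVM::MJIT` (the helper name `mjit_enabled?` is hypothetical, and the guards make it return false where that module is missing):

```ruby
# Report whether MJIT is active in the current process.
# RubyVM::MJIT is not defined on every Ruby version or implementation,
# so every lookup is guarded instead of assuming the constant exists.
def mjit_enabled?
  return false unless defined?(RubyVM::MJIT)
  return false unless RubyVM::MJIT.respond_to?(:enabled?)

  RubyVM::MJIT.enabled? ? true : false
end

puts "MJIT enabled: #{mjit_enabled?}"
```

Running the script once with `--jit` and once without should show the difference on a Ruby that ships MJIT.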
  7. YJIT • To be merged in Ruby 3.1 • Optionally enabled by --yjit • In-process x86 assembler
  8. MIR • JIT framework, motivated by MJIT • Planned to be integrated with MRI • Inline C functions without a C compiler
  9. Layers of JIT implementation [diagram; columns: VM, JIT compiler, Codegen; labels: RTL, YARV, RTL-MJIT, YARV-MJIT (mjit_compile.c), MJIT (C compiler), MIR-based JIT, MIR, YJIT, yjit_codegen.c, Ruby 2.6~3.0, Feature #12589]
  10. Today's theme • How to make your application faster in Ruby 3.0 ◦ i.e. with MJIT
  11. Tuning JIT performance for Rails

  12. If you don't tune MJIT https://speed.yjit.org

  13. Tuned peak performance https://gist.github.com/k0kubun/cbc5251be1c19e36b7b7f786db302465

  14. Key ideas • Some versions are slow • TracePoint, GC.compact, and Ractor • Change --jit-max-cache for Ruby 3.0 • Wait until everything is compiled
  15. Some versions are slow • Don't use Ruby 2.x ◦ Ruby 3.0 has better CPU cache efficiency • Even Ruby 3 has slow versions ◦ MJIT doesn't work properly in Ruby 3.0.1 ◦ Ruby 3.0.0 is OK, but others might have throttling issues
  16. TracePoint, GC.compact, and Ractor • MJIT can be disabled when GC.compact or TracePoint is used ◦ Ruby 3.1 shows "JIT cancel" on --jit-verbose=1 when it happens • However, Ruby 3.1 supports TracePoint :class events for Zeitwerk • MJIT has performance issues when you have Ractors
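
One way to notice that JIT-ed code was thrown away is to scan the captured --jit-verbose=1 output for the "JIT cancel" marker named above. A minimal sketch; the helper name and the sample lines are made up for illustration, and only the "JIT cancel" marker comes from the slide:

```ruby
# Return the lines of a captured --jit-verbose=1 log that mention "JIT cancel".
def jit_cancel_lines(log_lines)
  log_lines.select { |line| line.include?("JIT cancel") }
end

# Illustrative log lines; the text after each marker is invented.
sample_log = [
  "JIT success: example_method",
  "JIT cancel: example reason",
]

cancelled = jit_cancel_lines(sample_log)
puts "JIT was cancelled #{cancelled.size} time(s)"
```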
  17. Change --jit-max-cache for Ruby 3.0 • The default --jit-max-cache is 100 in Ruby 3.0 • It should be large enough to compile everything, like 10,000 ◦ Use --jit-verbose=1 to see what's happening
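
The flags from this slide can be assembled into a launch command. A minimal sketch, assuming Ruby 3.0; the helper name `mjit_command` and the script name `app.rb` are placeholders, and 10,000 is simply the example value from the slide:

```ruby
# Build the argument list for launching a script under MJIT.
# max_cache defaults to the slide's example value; tune it for your app.
def mjit_command(script, max_cache: 10_000, verbose: 1)
  ["ruby", "--jit", "--jit-max-cache=#{max_cache}", "--jit-verbose=#{verbose}", script]
end

cmd = mjit_command("app.rb") # "app.rb" is a placeholder script name
puts cmd.join(" ")
# To actually launch it: system(*cmd)
```

Keeping the command as an array avoids shell quoting problems if it is later passed to `system`.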
  18. Wait until everything is compiled • When a C compiler is running, the interpreter becomes slower ◦ We've found no workaround so far • So be sure to see the end of compilation with --jit-verbose=1 ◦ This can take some minutes
  19. Warming up MRI JIT

  20. --jit-min-calls • The default of --jit-min-calls is 10,000 • You need to wait until the benchmarked path is used 10,000 times
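
In a benchmark script, this usually means exercising the hot path at least that many times before measuring. A minimal sketch; `warm_up` and `hot_path` are hypothetical names, and `hot_path` stands in for whatever code you intend to benchmark:

```ruby
# Call the block `min_calls` times so the methods it exercises reach the
# --jit-min-calls threshold. Returns the number of calls made.
def warm_up(min_calls = 10_000)
  count = 0
  min_calls.times do
    yield
    count += 1
  end
  count
end

# Hypothetical hot path standing in for the code under benchmark.
def hot_path(n)
  (1..n).reduce(:+)
end

calls = warm_up { hot_path(100) }
puts "warmed up with #{calls} calls"
```

Reaching the threshold only makes a method eligible for compilation; the compile itself still has to finish before timings reflect JIT-ed code.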
  21. The lifecycle of JIT-ed code • MJIT's code has multiple stages: ◦ Fragmented code with full optimizations ◦ Fragmented code with partial optimizations ◦ Compacted code with partial optimizations • All methods should be in the last stage to see the peak performance
  22. JIT recompile • MJIT disables optimizations that didn't work and recompiles the code • Look for "JIT recompile" shown by --jit-verbose=1 • Your log should NOT end with "JIT recompile" to see the peak performance
  23. Optimization switches for each method • disable_ivar_cache • disable_exivar_cache • disable_send_cache • disable_inlining • disable_const_cache
  24. JIT compaction • Once everything is compiled, MJIT schedules "JIT compaction" • Your --jit-verbose=1 log should end with this to see the peak performance
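
The "log should end with JIT compaction" rule can be checked mechanically on captured --jit-verbose=1 output. A minimal sketch; the helper name `peak_ready?` is hypothetical and the sample lines are invented, with only the "JIT compaction" and "JIT recompile" markers taken from the slides:

```ruby
# True when the most recent JIT event in the log is a compaction, i.e. the
# log does not end with a compile or a "JIT recompile" that is still pending.
def peak_ready?(log_lines)
  last_event = log_lines.reverse_each.find { |line| line.include?("JIT ") }
  !last_event.nil? && last_event.include?("JIT compaction")
end

# Illustrative log lines; the text after each marker is invented.
still_warming = ["JIT success: a", "JIT compaction: done", "JIT recompile: a"]
warmed_up     = ["JIT success: a", "JIT recompile: a", "JIT compaction: done"]

puts "still_warming ready? #{peak_ready?(still_warming)}"
puts "warmed_up ready?     #{peak_ready?(warmed_up)}"
```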
  25. The future of MRI JIT

  26. Why do we have multiple JITs? • Are we competing? ◦ No, we contribute to each other's project as well • Multi-tier JIT? ◦ Efficiently mixing the code of MJIT and YJIT might be hard ◦ At least MJIT needs to be replaced by MIR for better control
  27. A short-term idea • We should probably focus on YJIT ◦ It is already faster and has more developers than MJIT ◦ MJIT's warmup is too slow by design
  28. A long-term idea • Unblock inlining over C methods ◦ YJIT cannot inline and optimize C methods as is ◦ MJIT has Ruby → C inlining, but not C → Ruby yet ◦ Rewrite more C methods to Ruby and/or integrate MIR
  29. Conclusion • There's a way to speed up Rails with MJIT • We're shifting to YJIT for better performance