JRuby: Ruby on the Modern JVM

A talk on JRuby 10 and how we are leaping forward with Ruby 3.4 support, modern JVM features, and next-gen optimizations.

Delivered at RedDotRubyConf in Singapore on July 26, 2024.

headius

July 26, 2024

Transcript

  1. Karaoke! • #RubyKaraoke tonight after the conference! • Cash Studio @ Prinsep Street • 8pm-10pm • Let's sing and celebrate Ruby together!
  2. What is JRuby? • Ruby on the Java Virtual Machine (JVM) • Ruby implementation first, JVM language second • Many benefits from JVM ecosystem • Ruby code should "just work" • Different extension API, no forking, parallel threads • Thousands of production users, 17 years of real-world use
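The "parallel threads" bullet is easiest to see with a CPU-bound workload. The following is a minimal sketch in plain Ruby; the helper name `parallel_sum` and its `workers:` parameter are made up for illustration and do not come from the talk. The same code runs on CRuby and JRuby, but on JRuby the threads can execute Ruby code on several cores at once, while CRuby's global VM lock lets only one of them run Ruby code at a time.

```ruby
# Hypothetical helper for illustration: split a range into slices and
# sum each slice on its own thread.
def parallel_sum(range, workers: 4)
  slice_size = (range.size / workers.to_f).ceil
  slice_size = 1 if slice_size < 1          # guard against empty ranges

  threads = range.each_slice(slice_size).map do |slice|
    Thread.new { slice.sum }                # each thread sums one slice
  end

  # Thread#value joins the thread and returns the block's result.
  threads.sum(&:value)
end

puts parallel_sum(1..1_000_000)
```

Timing this under both implementations with a heavier per-element computation is one way to observe the difference in thread scaling.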
  3. Example Users • Kami: homework app for teachers, students • LogStash: you probably have already used it • Datek: point-of-sale, airplane refueling terminals, ad kiosks • Looker: business analytics platform from Google • Kinetic Data: business portal and automation platform • Let us know if you use JRuby!
  4. JVM Advantage • World class garbage collectors (many choices) • Native JIT (JRuby was the first JIT for Ruby) • Optimizes your whole application • Monitoring and profiling tools (many choices, most are free) • Deploy anywhere (single file, obfuscated code, sell your app!) • Thousands of JVM developers making it better
  5. JRuby 9.4 • Stable release, supported until 2026(?) • Ruby 3.1 compatible • Java 8+ compatible • rvm, ruby-install, ruby-build, OS packages, Windows installer, Docker images, tarballs, zips, whatever • Thousands of users, billions of requests worldwide
  6. JRuby Compiler Pipeline • [Diagram: within JRuby internals, Ruby (.rb) source is parsed to Ruby instructions (IR), which the interpreter runs and the JIT compiles to Java instructions (Java bytecode); within the Java Virtual Machine, the bytecode interpreter executes that bytecode, C1 compiles it to native code, and C2 compiles it to better native code.]
  7. Better Extension Performance • [Chart: millions of loads per second (higher is better) for small, medium, and large; legend: MRI (oj), JRuby (oj); values as extracted: 0.13, 0.29, 1.1 and 0.06, 0.11, 0.72]
  8. Better Scaling • Requests per second per MB of memory (16-way concurrency) • [Chart: legend: CRuby, CRuby + YJIT, JRuby (300MB heap); values as extracted: 1.72 rps/MB, 0.92 rps/MB, 0.8 rps/MB] • One JRuby process can run your entire site
  9. JRuby 10! • Major leap forward! • Ruby 3.4 support, Java 17 (or 21) minimum • New Prism parser with complete language features • Targeted optimization across the board • Our biggest jump since JRuby 9000 (9.0.0.0) • Releasing late this year...now is the time to contribute!
  10. Ruby 3.4 Compatible • Language specs: 98.6% passing • Core specs: 97% passing • Same default and bundled gems, some with JRuby extensions • Releasing right after CRuby 3.4 • And preview releases maybe?
  11. Ruby Parser • Two parsers: Java (port of C version) and Prism (new C-based) • Future is Prism by default • Ship native library for most platforms • Ship WASM build that runs on JVM without native code • Prism already integrated in JRuby 9.4!
  12. Compatibility Before Optimization • Compatibility first • Keep up with Ruby features • Support users and fix bugs • Now: caught up with CRuby, more compatible than ever • Finally doing long-planned optimizations!
  13. Keyword-like Methods • Special methods can see into the method that called them • __method__/__callee__ look at method's name • block_given? needs passed block • JVM does not let us access caller directly (security etc) • Copying data to heap is slow and breaks optimization • Optimized: pass caller data to special call sites
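To make the "keyword-like" behavior concrete, here is a small plain-Ruby sketch. The class `Greeter` and its methods are made up for illustration. It shows only the observable Ruby semantics that any implementation has to preserve, not how JRuby implements them: each of these calls needs information from the frame of the method that contains it (its name, or the block it was given).

```ruby
class Greeter
  def hello
    # __method__ is the name the method was defined with;
    # __callee__ is the name it was actually called by.
    [__method__, __callee__]
  end
  alias_method :hi, :hello

  def maybe_yield
    # block_given? must look at the block passed to maybe_yield itself.
    block_given? ? yield : :no_block
  end
end

g = Greeter.new
p g.hello                        # => [:hello, :hello]
p g.hi                           # => [:hello, :hi]
p g.maybe_yield                  # => :no_block
p g.maybe_yield { :from_block }  # => :from_block
```

None of these calls receives the method name or block as an explicit argument, which is why the slide describes them as seeing into the method that called them.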
  14. __callee__ performance • [Chart: iterations per second for __callee__; legend: CRuby, JRuby 9.4, CRuby YJIT, JRuby 10; values as extracted: 8.4M, 4.4M, 2.5M]
  15. ALOAD 0
    ALOAD 4
    ALOAD 5
    ALOAD 2
    GETSTATIC org/jruby/runtime/Visibility.PUBLIC : Lorg/jruby/runtime/Visibility;
    ALOAD 3
    INVOKEVIRTUAL org/jruby/runtime/ThreadContext.preMethodFrameOnly (Lorg/jruby/RubyModul
    ALOAD 0
    ALOAD 2
    INVOKEDYNAMIC callVariable:__callee__(Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runt
    // handle kind 0x6 : INVOKESTATIC
    org/jruby/ir/targets/indy/SelfInvokeSite.bootstrap(Ljava/lang/invoke/MethodHandles$L
    // arguments: 0, 0, "-e", 1
    ]
    Annotations: Load method name on stack • Update frame on heap • Normal call to __callee__
  16. InvokeDynamic • JVM's special sauce for dynamic languages • Define a new "instruction" • Write code to connect call site to instruction logic • JIT optimizes like static JVM bytecode • Most JRuby optimizations use InvokeDynamic
  17. ALOAD 0
    ALOAD 2
    ALOAD 5
    INVOKEDYNAMIC callVariable:__callee__(Lorg/jruby/runtime/ThreadContext;Lorg/jruby/runt
    // handle kind 0x6 : INVOKESTATIC
    org/jruby/ir/targets/indy/FrameNameSite.frameNameBootstrap(Ljava/lang/invoke/MethodH
    // arguments: "-e", 1
    ]
    Annotations: Load method name on stack • Special call to name-aware instruction • FrameNameSite calls frameless __callee__ with name or makes normal method call
  18. __callee__ performance • [Chart: iterations per second for __callee__; legend: CRuby, JRuby 9.4, CRuby YJIT, JRuby 10; values as extracted: 8.4M, 4.4M, 2.5M]
  19. __callee__ performance • [Chart: iterations per second for __callee__; legend: CRuby, JRuby 9.4, CRuby YJIT, JRuby 10; values as extracted: 36.30M, 8.40M, 4.40M, 2.50M]
  20. block_given? performance • [Chart: iterations per second for block_given?; legend: CRuby, JRuby 9.4, CRuby YJIT, JRuby 10; values as extracted: 18.9M, 6.4M, 2.6M]
  21. block_given? performance • [Chart: iterations per second for block_given?; legend: CRuby, JRuby 9.4, CRuby YJIT, JRuby 10; values as extracted: 241.5M, 18.9M, 6.4M, 2.6M]
  22. String Interpolation • Too much bytecode to compile interpolated string • Harder to inline, slower to warm up • Optimized: special call site with static string and pattern • Small bytecode, better optimization
  23. ALOAD 0
    INVOKEDYNAMIC bufferString(Lorg/jruby/runtime/ThreadContext;)Lorg/jruby/RubyStr
    ALOAD 0
    INVOKEDYNAMIC frozen(Lorg/jruby/runtime/ThreadContext;)Lorg/jruby/RubyString;
    INVOKEVIRTUAL org/jruby/RubyString.cat19 (Lorg/jruby/RubyString;)Lorg/jruby/Rub
    ALOAD 8
    INVOKEVIRTUAL org/jruby/RubyString.appendAsDynamicString (Lorg/jruby/runtime/bu
    ALOAD 0
    INVOKEDYNAMIC frozen(Lorg/jruby/runtime/ThreadContext;)Lorg/jruby/RubyString;
    INVOKEVIRTUAL org/jruby/RubyString.cat19 (Lorg/jruby/RubyString;)Lorg/jruby/Rub
    ALOAD 9
    INVOKEVIRTUAL org/jruby/RubyString.appendAsDynamicString (Lorg/jruby/runtime/bu
    ALOAD 0
    INVOKEDYNAMIC frozen(Lorg/jruby/runtime/ThreadContext;)Lorg/jruby/RubyString;
    INVOKEVIRTUAL org/jruby/RubyString.cat19 (Lorg/jruby/RubyString;)Lorg/jruby/Rub
    ALOAD 10
    INVOKEVIRTUAL org/jruby/RubyString.appendAsDynamicString (Lorg/jruby/runtime/bu
    ASTORE 11
  24. 0001 putobject "foo"
    0003 getlocal_WC_0 a@0
    0005 dup
    0006 objtostring <calldata!mid:to_s, argc:0, FCALL|ARGS_SIMPLE>
    0008 anytostring
    0009 putobject "bar"
    0011 getlocal_WC_0 b@1
    0013 dup
    0014 objtostring <calldata!mid:to_s, argc:0, FCALL|ARGS_SIMPLE>
    0016 anytostring
    0017 putobject "baz"
    0019 getlocal_WC_0 c@2
    0021 dup
    0022 objtostring <calldata!mid:to_s, argc:0, FCALL|ARGS_SIMPLE>
    0024 anytostring
    0025 concatstrings 6
  25. BuildDynamicString • Allocate large enough buffer for all elements • Profile to pick "right size"? • Static strings copy directly into buffer • "Appendable" objects copy directly into buffer • Only non-String objects create temporary strings
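The buffer strategy described above can be modeled in a few lines of Ruby. This is only a sketch under stated assumptions: JRuby does this inside a Java call site, not in Ruby; the method name `build_dynamic_string`, the array-of-parts input, and the 16-byte guess for non-String elements are all invented for illustration. It also simplifies the slide's "appendable" category by treating only Strings as directly appendable.

```ruby
# Hypothetical Ruby model of the strategy, not JRuby's implementation.
def build_dynamic_string(parts)
  # Size the buffer up front: exact for Strings, an arbitrary guess
  # (16 bytes, an assumption) for everything else.
  estimate = parts.sum { |part| part.is_a?(String) ? part.bytesize : 16 }
  buffer = String.new(capacity: estimate, encoding: Encoding::UTF_8)

  parts.each do |part|
    # Strings are appended directly; any other object needs a temporary
    # String from to_s before it can be appended.
    buffer << (part.is_a?(String) ? part : part.to_s)
  end
  buffer
end

a, b, c = 1, :two, 3.0
puts build_dynamic_string(["foo", a, "bar", b, "baz", c]) == "foo#{a}bar#{b}baz#{c}"
```

The point of the single preallocated buffer is to avoid the repeated growth and intermediate appends visible in the longer bytecode listing.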
  26. ALOAD 0
    INVOKEDYNAMIC bufferString(Lorg/jruby/runtime/ThreadContext;)Lorg/jruby/RubyStr
    ALOAD 0
    INVOKEDYNAMIC frozen(Lorg/jruby/runtime/ThreadContext;)Lorg/jruby/RubyString;
    INVOKEVIRTUAL org/jruby/RubyString.cat19 (Lorg/jruby/RubyString;)Lorg/jruby/Rub
    ALOAD 8
    INVOKEVIRTUAL org/jruby/RubyString.appendAsDynamicString (Lorg/jruby/runtime/bu
    ALOAD 0
    INVOKEDYNAMIC frozen(Lorg/jruby/runtime/ThreadContext;)Lorg/jruby/RubyString;
    INVOKEVIRTUAL org/jruby/RubyString.cat19 (Lorg/jruby/RubyString;)Lorg/jruby/Rub
    ALOAD 9
    INVOKEVIRTUAL org/jruby/RubyString.appendAsDynamicString (Lorg/jruby/runtime/bu
    ALOAD 0
    INVOKEDYNAMIC frozen(Lorg/jruby/runtime/ThreadContext;)Lorg/jruby/RubyString;
    INVOKEVIRTUAL org/jruby/RubyString.cat19 (Lorg/jruby/RubyString;)Lorg/jruby/Rub
    ALOAD 10
    INVOKEVIRTUAL org/jruby/RubyString.appendAsDynamicString (Lorg/jruby/runtime/bu
    ASTORE 11
  27. ALOAD 0
    ALOAD 8
    ALOAD 9
    ALOAD 10
    INVOKEDYNAMIC buildDynamicString(Lorg/jruby/runtime
    // handle kind 0x6 : INVOKESTATIC
    org/jruby/ir/targets/indy/BuildDynamicStringSite.
    // arguments: "foo", "UTF-8",
  28. [Chart: string interpolation benchmark, axis 0 to 30 (units not shown); legend: CRuby 3.4 YJIT, JRuby 10; patterns measured: "#{n}", "foo#{n}", "#{n}bar", "#{n}#{n}", "foo#{n}bar", "#{n}bar#{n}", "#{n}#{n}bar", "foo#{n}#{n}", "#{n}#{n}#{n}", "foo#{n}bar#{n}baz", "foo#{n}bar#{n}baz#{n}", "#{x}#{x}#{x}#{x}#{x}", "#{x}#{x}#{x}#{x}#{x}#{x}#{x}#{x}#{x}#{x}"]
  29. Many Opportunities • Special variables like $~ and $_ • Read-only closure variables passed on stack • Block-receiving methods inlined with block • Smarter object shapes for instance variables, strings, arrays • Inline super calls, refinements, metaprogramming, and more • JRuby 10 will be faster and warm up more quickly!
  30. Fibers with Project Loom • Thread-based fibers don't scale • Enumerators use fibers • Structured concurrency is coming • Loom brings fibers to JVM • Easily handles thousands of fibers • Faster context-switching • Working with @ioquatix on async
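Two of the bullets above can be illustrated in plain Ruby: external enumeration with `Enumerator#next` relies on a fiber to suspend and resume the iteration, and a program may hold many paused fibers at once. This is a minimal sketch; the helper names `take_two_externally` and `run_fibers` are hypothetical, and the code says nothing about how any particular JRuby version backs its fibers.

```ruby
# Enumerator#next pauses the underlying each-iteration between calls,
# which Ruby implements with a fiber.
def take_two_externally(enum)
  [enum.next, enum.next]
end

# Create `count` fibers, pause each one mid-computation, then finish them.
def run_fibers(count)
  fibers = Array.new(count) do |i|
    Fiber.new do
      Fiber.yield i   # pause here, handing i back to the resumer
      i * 2           # value returned by the second resume
    end
  end

  fibers.each(&:resume)         # run every fiber up to its Fiber.yield
  fibers.sum { |f| f.resume }   # finish each fiber and add up the results
end

p take_two_externally([10, 20, 30].each)
puts run_fibers(1_000)
```

Raising the count passed to `run_fibers` is a simple way to probe how well a given runtime copes with large numbers of simultaneously paused fibers.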
  31. FFI With Project Panama • Foreign function interface (FFI) • With JVM help to make direct calls • Foreign memory API • JVM-assisted access, lifecycle • Make FFI calls as fast as C ext • Time to get rid of C extensions! • Generate Panama FFI, call from Ruby, better than writing C!
  32. JRuby Compiler Pipeline • [Diagram: within JRuby internals, Ruby (.rb) source is parsed to Ruby instructions (IR), which the interpreter runs and the JIT compiles to Java instructions (Java bytecode); within the Java Virtual Machine, the bytecode interpreter executes that bytecode, C1 compiles it to native code, and C2 compiles it to better native code.]
  33. Startup Time • Very hard for JRuby, due to JVM design • Project CRaC: checkpoint and restore (Linux-only) • Snapshot a warm JRuby and jump back in • Project Leyden: ahead-of-time optimization for OpenJDK • Early access builds are promising! • JVM JIT server: start with JIT code from previous run
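Startup cost of the kind discussed above can be measured from Ruby itself. The sketch below is an assumption-laden illustration, not the methodology used in the talk: the helper name `startup_seconds` and the best-of-N approach are made up here. It launches a fresh copy of whichever interpreter is running the script (located via `RbConfig.ruby`) with the trivial program `-e 1` and reports the fastest wall-clock time.

```ruby
require "rbconfig"

# Hypothetical helper: time a fresh interpreter running a trivial script.
def startup_seconds(runs: 3)
  timings = Array.new(runs) do
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    # exception: true raises if the child interpreter fails to start or
    # exits with a non-zero status.
    system(RbConfig.ruby, "-e", "1", exception: true)
    Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end
  timings.min   # best-of-N reduces noise from disk cache and scheduling
end

puts format("%.3fs", startup_seconds)
```

Running the same script under different interpreters or launch flags gives comparable numbers, since each run times the interpreter that executes it.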
  34. ruby -e 1 • [Chart: time in seconds to run -e 1; legend: CRuby 3.2, JRuby 9.4, JRuby 9.4 --dev, JRuby 9.4 Leyden, JRuby 9.4 CRaC; values as extracted: 0.21s, 0.635s, 1.271s, 1.686s, 0.053s]
  35. rails new testapp --skip-bundle • [Chart: time in seconds to run rails new testapp --skip-bundle; legend: CRuby, JRuby, JRuby --dev, JRuby Leyden, JRuby CRaC; values as extracted: 0.89s, 1.35s, 2.7s, 5.918s, 0.314s]
  36. Big Plans • Maintaining parity with CRuby features • Leveraging all the new JVM features • Optimizing all the things • Tooling for the enterprise • Mobile and embedded enhancements
  37. JRuby Needs You! • Sponsor @headius on GitHub • Help me keep building JRuby • Publicity for you or your company! • Commercial JRuby support now available • Contact [email protected] • Let us know how we can help you!