Six Years of Ruby Performance: A History

Ruby keeps getting faster. And people keep asking, “but how fast is it for Rails?” Rails is a great way to measure Ruby’s speed and how it has changed version by version. Let’s look at six years of performance for apps big and small.

How fast is 2.6.0? With JIT or not? How do I measure? How close is Ruby 3x3? Should I upgrade?

Noah Gibbs

July 05, 2019

Transcript

  1. Six Years of Ruby Performance History But How to Measure...?

  2. Have You Seen the Toy Museum? Brighton's fun. If you

    haven't seen the toy and model museum, it's right under the train station.
  3. I Love Questions - Seriously Am I speaking too fast?

    Tell me. Methodology question? Speak up. I can’t always see your hands, so just ask. Separately: AppFolio pays me to do Ruby stuff. Thank you, AppFolio!
  4. Ruby 3x3 In Japan, Matz spoke about Ruby 3x3 at

    RubyKaigi. He said the speed is doing well.
  5. Will JIT Fix It? With MJIT, Ruby 2.6 is 280%

    the speed of 2.0.0. We’re nearly done!
  6. Except... But JIT doesn’t help with Rails. With MJIT, Rails

    is slower than without it. MJIT will get a little faster, but...
  7. Ruby or Rails? I’m going to look at using Rails

    to measure Ruby speed because… Rails is important code that is slow right now.
  8. Old Way: Measure with RRB Rails Ruby Bench answers “how

    fast is Ruby for a real-world, production Rails app?”
  9. None
  10. None
  11. None
  12. How Does RRB Measure? Rails Ruby Bench uses a highly

    concurrent simulated real world workload on a real world forum app called Discourse.
  13. Throughput, Not Latency RRB uses 10 processes and 60 threads,

    based on measurements showing that to be the fastest configuration. We'll talk more about that later.
  14. "Real World", "Production" Us: It’s about 172% the speed. But the

    times are all mixed together. Which part is slow? RRB:
  15. "Real World", "Production" Us: It’s about 172% the speed. But the

    times are all mixed together. Which part is slow? RRB: ...
  16. Micro vs Macro Benchmarks Benchmarks come in many sizes. RRB

    is very large.
  17. Micro/Macro Small benchmarks are specific. Large ones are representative.

  18. Rails Ruby Bench is Macro RRB answers “how fast is

    this workload?” but not “which part is slow?”
  19. This is a Job For... To explore specific questions, we'll

    use a smaller benchmark. What should that benchmark be like?
  20. RSB == Rails Simpler Bench RSB should make it easy

    to time various operations. But not mixed into one single timing like RRB.
  21. Routes for Exploration We’ll begin exploring with a trivial “hello,

    world” route that returns a static string.
  22. Specific We’ll start with a single-process, single-thread concurrency model.

    Simple, with low latency.
  23. Specificity So I built it. Let's see what it says

    about a few different questions.
  24. First: Rails Overhead Rails takes some time to run for

    each request. A static “Hello, World” route is just benchmarking Rails overhead. What does it find?
  25. None
  26. First Off... WHAT? Remember how with RRB, half the gains

    came from Ruby 2.1 and 2.2? This doesn’t look like that at all!
  27. Wait, How Much? Also in this graph Ruby 2.6 gained

    only about 30% over 2.0.0p0, and not 70%+. What?
  28. Oh. Wait. We’re not measuring just Ruby framework code yet.

    We’re also measuring the app server (Puma.)
  29. Framework Time, App Time To measure Rails framework time versus

    the app server, I'll build a quick Rack app and measure it.
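
For readers unfamiliar with Rack, here is a minimal sketch of what a "quick Rack app" can look like. It is not the app used in RSB; the constant name `HELLO_APP` and the response text are assumptions made for illustration. It relies only on the Rack convention that an app is any object responding to `call(env)` and returning a status, a headers hash and a body that responds to `each`.

```ruby
# Minimal Rack-style application. HELLO_APP is a hypothetical name,
# not something taken from the RSB repository.
HELLO_APP = lambda do |_env|
  body = "Hello, World!"
  headers = {
    "content-type"   => "text/plain",
    "content-length" => body.bytesize.to_s
  }
  # Rack expects [status, headers, body], where body responds to #each.
  [200, headers, [body]]
end

# In a config.ru file, an app server such as Puma would load it with:
#   run HELLO_APP

# Calling it directly shows the response triple without any server.
status, headers, body = HELLO_APP.call({})
puts status
puts headers.inspect
puts body.join
```

Because no framework code runs, timing an app like this approximates the cost of the app server alone.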
  30. None
  31. And That's Why They Say... This is why Rails has

    a special API mode. Normal Rails has a lot of overhead compared to Rack.
  32. And Now, the Final Total If we turn reqs/sec into

    secs/req and subtract, we get Rails' overhead per request.
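
The subtraction can be sketched in a few lines. The throughput numbers below are invented purely to make the arithmetic easy to follow; they are not measurements from the talk, and the method names are hypothetical. The conversion assumes a single process and a single thread, so that seconds per request is simply the inverse of requests per second.

```ruby
# Convert throughput (requests/second) into time per request (seconds).
# Valid as a per-request cost only when requests are handled one at a time.
def seconds_per_request(requests_per_second)
  1.0 / requests_per_second
end

# Rails overhead = time per Rails request minus time per bare Rack request.
def rails_overhead_per_request(rails_rps, rack_rps)
  seconds_per_request(rails_rps) - seconds_per_request(rack_rps)
end

# Hypothetical throughput figures, NOT results from the slides.
rails_rps = 1_000.0   # 1 ms per request
rack_rps  = 10_000.0  # 0.1 ms per request

overhead_ms = rails_overhead_per_request(rails_rps, rack_rps) * 1000
puts format("Rails overhead: %.2f ms per request", overhead_ms)
```

With these made-up inputs the difference is 1 ms minus 0.1 ms, so the script reports 0.90 ms of framework overhead per request.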
  33. None
  34. Ruby 3x3 Means Concurrency We've looked at framework overhead. What

    about concurrency? Is concurrency improving?
  35. Exploration: Concurrency On a multicore machine we get better throughput

    with more processes and threads. But how many?
  36. Maximum CRuby Throughput We’ll look for a configuration which will

    use a GIL-friendly number of processes and threads.
  37. The Experiment I set up RSB to run with varying

    numbers of processes and threads on Ruby from 2.0 to 2.6. For instance, 4 processes beats 4 threads...
  38. None
  39. Threads Bad? In fact, sometimes threads are clearly worse than

    no threads. Let's look at 4 processes - with only 1 thread/process versus with 8.
  40. None
  41. Sometimes? With more processes, threads help... But only a little.

    Here's 8 processes, with 1 thread versus 4 threads.
  42. None
  43. But Why? This is a workload difference between RRB and

    RSB. RRB has lots of non-Ruby work, and lots of context switches.
  44. CRuby Threads, CRuby GIL CRuby only benefits from threads for

    non-Ruby time - database, I/O, C extensions, etc. RSB is nearly 100% Ruby work.
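
A small experiment can illustrate the point. This is a rough sketch, not part of RSB: `sleep` stands in for non-Ruby time such as a database call, a summing loop stands in for pure Ruby work, and all names are hypothetical. On CRuby the waiting workload should finish much sooner with threads, while the pure Ruby workload should take about as long threaded as serial. Exact timings depend on the machine and Ruby version.

```ruby
# Wall-clock time of a block, using a monotonic clock.
def elapsed
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Run the work `count` times, one after another.
def run_serial(count, &work)
  Array.new(count) { work.call }
end

# Run the work `count` times, each in its own thread, and collect results.
def run_threaded(count, &work)
  Array.new(count) { Thread.new { work.call } }.map(&:value)
end

WAITING   = -> { sleep 0.1 }  # stand-in for database or I/O time
RUBY_WORK = -> { 200_000.times.reduce(0) { |sum, i| sum + i } }  # pure Ruby

{ "waiting" => WAITING, "ruby work" => RUBY_WORK }.each do |label, work|
  serial   = elapsed { run_serial(4, &work) }
  threaded = elapsed { run_threaded(4, &work) }
  puts format("%-9s serial: %.3fs  threaded: %.3fs", label, serial, threaded)
end
```

The threads overlap during `sleep` because a waiting thread releases the GIL, which is the same reason threads help a workload like RRB more than one like RSB.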
  45. One More Observation On each previous graph, from left to

    right, each row does roughly the same thing...
  46. So... That means each Ruby is handling concurrency the same

    way.
  47. Is There a Win Here? One reminder: if we could

    run Ruby concurrently (with Guilds?) then we might be able to get far more speed from this benchmark.
  48. Rails and MJIT We know MJIT isn't helping for Rails

    yet. But is it coming closer? Let's check. That means comparing 2.6 performance to prerelease Ruby 2.7 performance.
  49. Rails w/ MJIT - Getting Closer Ruby 2.6 vs recent

    2.7 - with 2.6, MJIT is 83% the speed. With 2.7, it's 94%.
  50. Rack w/ MJIT - No Change Ruby 2.6, Rack w/

    MJIT is 97% the speed. With 2.7, it's also 97%.
  51. What's Next? There’s a lot more I can benchmark with

    RSB, and I do. I post the results to engineering.appfolio.com. Future comparisons will include app servers, memory allocators and floating point.
  52. Methodology RSB is pretty methodologically solid, though the code is

    new. This isn’t a 100% full description, but...
  53. Methodology Misc Dedicated EC2 Linux Instance; Localhost, not TCP; Multiple

    consecutive batches, random order; One route/concurrency config per batch; Many requests/batch; No Redis or cache; No destructive routes; Fail batch > 0.01% error; Configurable warmups; Low-overhead C load tester (wg/wrk); Rails 4.2 for Ruby 2.0 -> 2.6+ compatibility
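
Two of the methodology points, batches run in random order and failing any batch with more than 0.01% errors, can be sketched as follows. This is an illustration under stated assumptions, not RSB's actual code: the `Batch` struct, the method names, the route name and the choice to treat an empty batch as a failure are all hypothetical.

```ruby
# Hypothetical batch description; RSB's real data structures may differ.
Batch = Struct.new(:ruby_version, :route, :processes, :threads)

MAX_ERROR_RATE = 0.0001 # 0.01%, the failure threshold named on the slide

# Build `repeats` batches for every version/route/concurrency combination,
# then shuffle so that no configuration always runs first or last.
def build_batches(ruby_versions:, routes:, concurrency:, repeats:, rng: Random.new)
  combos = ruby_versions.product(routes, concurrency)
  batches = combos.flat_map do |version, route, (processes, threads)|
    Array.new(repeats) { Batch.new(version, route, processes, threads) }
  end
  batches.shuffle(random: rng)
end

# A batch fails when its error rate exceeds the threshold.
# Treating an empty batch as failed is an assumption of this sketch.
def batch_failed?(total_requests, error_count)
  return true if total_requests.zero?
  error_count.fdiv(total_requests) > MAX_ERROR_RATE
end

batches = build_batches(
  ruby_versions: ["2.0.0", "2.6.0"],
  routes: ["/static"],              # hypothetical route name
  concurrency: [[1, 1], [4, 1]],    # [processes, threads] pairs
  repeats: 3
)
puts batches.first(3).map(&:to_a).inspect
puts batch_failed?(100_000, 11)
```

Randomizing the order spreads slow drift on the host, such as noisy neighbours or thermal effects, across all configurations instead of penalizing whichever one runs last.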
  54. The Code, The Data RSB Code: https://github.com/noahgibbs/rsb Data for these

    graphs: https://github.com/noahgibbs/rrb_datavis
  55. Questions? Comments? Objections? Recitations? Show Tunes? Should you discover an

    important question you missed, I can be reached as @codefolio on Twitter or at the.codefolio.guy@gmail.com. I love talking Ruby! These slides: http://bit.ly/brighton2019-gibbs