Slide 1

Slide 1 text

Ruby Performance: The Last Mile

Slide 2

Slide 2 text

Charles Oliver Nutter @headius

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

I've spent ten years building

Slide 6

Slide 6 text

I love Ruby

Slide 7

Slide 7 text

I want Ruby to succeed

Slide 8

Slide 8 text

I believe JRuby is the way

Slide 9

Slide 9 text

Do you want performance today?

Slide 10

Slide 10 text

Do you want concurrency today?

Slide 11

Slide 11 text

Do you know JRuby?

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

What you love from Ruby
• Latest Ruby language features
• Mostly-same Ruby standard library
• Pure-Ruby gems work great
• Native gems with JRuby support
• It walks like Ruby, talks like Ruby
• It is Ruby!

Slide 14

Slide 14 text

With the power of the JVM
• Fast JIT to native code
• Fully parallel threading
• Leading-edge garbage collectors
• Access to Java, Scala, Clojure, ...
• But it's still Ruby!

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

Do you know Jay?

Slide 17

Slide 17 text

Matz's Keynote

Slide 18

Slide 18 text

Performance

Slide 19

Slide 19 text

Concurrency

Slide 20

Slide 20 text

By 2020

Slide 21

Slide 21 text

But what about today?

Slide 22

Slide 22 text

Performance

Slide 23

Slide 23 text

What do you optimize for?
• Easy to develop with: short time until first deploy
• Fast startup: good response cycle at command line
• Straight-line performance: many operations per second
• Parallelism: utilize many cores to get more done

Slide 24

Slide 24 text

State of the Art: Production-quality Rubies

Slide 25

Slide 25 text

Production Quality?
• Support for 99%+ of Ruby language features
• Important parts of standard library
• Runs typical Ruby applications and libraries
• Healthy extension ecosystem
• CRuby, JRuby are the only real options right now

Slide 26

Slide 26 text

CRuby (MRI)
• Up until 1.9, AST interpreter
• YARV bytecode VM introduced for 1.9.0
• GC and performance improvements through 2.x series
• Ruby 2.3 is latest, released in December
• Future work on JIT, GC, happening now

Slide 27

Slide 27 text

JRuby
• Many redesigns since creation in 2001
• AST interpreter until 2007
• Simple AST-to-bytecode JIT until JRuby 9000
• Optimizing compiler with JIT for 9k
• JRuby 9.0.5 is current, 9.1 in a couple weeks
• Next-gen Truffle runtime in the works

Slide 28

Slide 28 text

Lies, damn lies, and benchmarks

Slide 29

Slide 29 text

CRuby: red/black tree (chart; versions 1.8.7, 1.9.3, 2.0, 2.1, 2.2, 2.3)

Slide 30

Slide 30 text

CRuby: red/black tree (chart, rescaled axis; versions 1.8.7, 1.9.3, 2.0, 2.1, 2.2, 2.3)

Slide 31

Slide 31 text

JRuby: red/black tree (chart; JRuby 9.1 vs CRuby 2.3)

Slide 32

Slide 32 text

JRuby: red/black tree (chart; JRuby 9.1 vs CRuby 2.3)

Slide 33

Slide 33 text

JRuby: red/black tree (chart; JRuby 9.1 vs CRuby 2.3): 4x FASTER

Slide 34

Slide 34 text

Ruby Code

Slide 35

Slide 35 text

Ruby Code -> Parser -> JRuby AST

Slide 36

Slide 36 text

JRuby AST -> Compiler -> JRuby IR

Slide 37

Slide 37 text

JRuby IR -> JIT -> JVM Bytecode

Slide 38

Slide 38 text

What can we do with this?

Slide 39

Slide 39 text

Block Pass-through

def foo(&b)
  bar(&b)
end

def bar
  yield
end

Slide 40

Slide 40 text

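Block Pass-through

require "benchmark"

# Repeatedly time one million iterations, each making five
# block pass-through calls to foo (defined on the previous slide).
loop {
  puts Benchmark.measure {
    i = 0
    while i < 1_000_000
      i += 1
      foo { }; foo { }; foo { }; foo { }; foo { }
    end
  }
}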

Slide 41

Slide 41 text

Block Passing (chart; CRuby 2.3, JRuby 1.7.24, JRuby 9.1)

Slide 42

Slide 42 text

define_method

define_method :add do |a, b|
  a + b
end
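
A minimal sketch of the kind of comparison this slide sets up, assuming the point is the cost of calling a method defined with define_method versus one defined with def. The class name Calc, the method names, and the iteration count are hypothetical and not from the talk; results will vary by runtime and version.

```ruby
require "benchmark"

# Hypothetical class for illustration; not from the talk.
class Calc
  # Ordinary method definition.
  def add_def(a, b)
    a + b
  end

  # Same body, but defined from a block. The block is a closure,
  # which typically makes calls more expensive unless the runtime
  # can optimize the closure away.
  define_method :add_block do |a, b|
    a + b
  end
end

calc = Calc.new
n = 200_000

Benchmark.bm(14) do |x|
  x.report("def")           { n.times { calc.add_def(1, 2) } }
  x.report("define_method") { n.times { calc.add_block(1, 2) } }
end
```

Running the same file under CRuby and JRuby gives a rough local version of the chart on the next slide.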

Slide 43

Slide 43 text

define_method (chart; CRuby 2.3, JRuby 1.7.24, JRuby 9.1)

Slide 44

Slide 44 text

Postfix rescue

foo rescue nil
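
A short sketch of what postfix rescue expands to and where its cost comes from. The method names parse_postfix and parse_explicit are hypothetical; the timings are only indicative and depend on the runtime.

```ruby
require "benchmark"

# Postfix rescue is shorthand for a begin/rescue block that
# catches StandardError and evaluates to the fallback expression.
def parse_postfix(s)
  Integer(s) rescue s
end

# The explicit form of the same logic.
def parse_explicit(s)
  begin
    Integer(s)
  rescue StandardError
    s
  end
end

n = 100_000
Benchmark.bm(14) do |x|
  # No exception is raised on this path.
  x.report("valid input")   { n.times { parse_postfix("42") } }
  # An exception is raised and discarded on every call.
  x.report("invalid input") { n.times { parse_postfix("abc") } }
end
```

The gap between the two rows is the price of raising an exception only to throw it away.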

Slide 45

Slide 45 text

csv.rb Converters

Converters = {
  integer: lambda { |f|
    Integer(f.encode(ConverterEncoding)) rescue f
  },
  float: lambda { |f|
    Float(f.encode(ConverterEncoding)) rescue f
  },

Slide 46

Slide 46 text

Postfix rescue (chart; CRuby 2.3 vs JRuby 9.1)

Slide 47

Slide 47 text

Postfix rescue (chart; CRuby 2.3 vs JRuby 9.1)

Slide 48

Slide 48 text

CRuby starts up the fastest

Slide 49

Slide 49 text

JRuby runs the fastest

Slide 50

Slide 50 text

And we're getting faster

Slide 51

Slide 51 text

Concurrency

Slide 52

Slide 52 text

Parallelism

Slide 53

Slide 53 text

Concurrency? Parallelism?
• Parallelism happens in the hardware, e.g. multi-core
• Concurrency happens in the software, e.g. Thread API
• You can have concurrency without parallelism
• You can have both with JRuby
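
A small sketch of concurrency without parallelism, assuming that sleep is an acceptable stand-in for blocking IO. The thread count and wait time are arbitrary choices for illustration.

```ruby
require "benchmark"

# Four threads that each wait 0.2s, standing in for blocking IO.
# The waits overlap, so the work is concurrent on any Ruby,
# whether or not threads can run Ruby code in parallel.
elapsed = Benchmark.realtime do
  4.times.map { Thread.new { sleep 0.2 } }.each(&:join)
end

puts format("4 x 0.2s waits took %.2fs", elapsed)
```

The total is close to a single wait rather than four, even on a runtime that never runs two Ruby threads at the same instant.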

Slide 54

Slide 54 text

No content

Slide 55

Slide 55 text

Parallelism in Ruby
• On CRuby, usually process-level
  • Ruby threads prevented from running in parallel
  • Extensions, IO can opt to release lock
• On JRuby, usually thread-level
  • Ruby thread == JVM thread == OS thread
  • Single-process, shared memory
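
One way to observe the difference described above is to time CPU-bound work serially and then across threads. This is a sketch under stated assumptions: the fib workload, the constant N, and the thread count are hypothetical, and the outcome depends on the runtime and the number of cores available.

```ruby
require "benchmark"

# CPU-bound work: naive Fibonacci (hypothetical workload).
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

N = 25

# Run the work four times, one after another.
serial = Benchmark.realtime { 4.times { fib(N) } }

# Run the same work on four threads.
threaded = Benchmark.realtime do
  4.times.map { Thread.new { fib(N) } }.each(&:join)
end

puts "#{RUBY_ENGINE}: serial #{serial.round(2)}s, 4 threads #{threaded.round(2)}s"
```

On CRuby the two numbers are expected to be similar, since only one thread runs Ruby code at a time. On JRuby with several cores the threaded run should come out noticeably lower.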

Slide 56

Slide 56 text

A Mailing Queue
• A simple example of concurrency
• For each job, construct an email to send
• Some computation added to make processing heavier
• "Ruby Concurrency and Parallelism: A Practical Tutorial"
  https://www.toptal.com/ruby/ruby-concurrency-and-parallelism-a-practical-primer

Slide 57

Slide 57 text

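require "./lib/mailer"
require "benchmark"

# Deliver the emails one after another and time the whole run.
# ARGV[0] is a String, so convert it before calling times.
puts Benchmark.measure {
  (ARGV[0] || 10_000).to_i.times do |i|
    Mailer.deliver do
      from    "eki_#{i}@eqbalq.com"
      to      "jill_#{i}@example.com"
      subject "Threading and Forking (#{i})"
      body    "Some content"
    end
  end
}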

Slide 58

Slide 58 text

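POOL_SIZE = (ARGV[0] || 10).to_i

jobs = Queue.new

# Fill the queue with job indexes before starting any workers.
(ARGV[1] || 10_000).to_i.times { |i| jobs.push i }

workers = POOL_SIZE.times.map do
  Thread.new do
    begin
      # Non-blocking pop: raises ThreadError once the queue is empty.
      while x = jobs.pop(true)
        Mailer.deliver do
          ...
        end
      end
    rescue ThreadError
      # Queue drained; let this worker finish.
    end
  end
end

# Wait for every worker to finish.
workers.each(&:join)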

Slide 59

Slide 59 text

CRuby: mailer * 1000 (chart; time in seconds for synchronous, 4 threads, 4 forks)

Slide 60

Slide 60 text

JRuby: mailer * 1000 (chart; synchronous vs 4 threads)

Slide 61

Slide 61 text

JRuby vs MRI (chart; times improvement for CRuby forks and JRuby threads; bar labels: 3.37x, 3.09x)

Slide 62

Slide 62 text

But Threads are bad, right?

Slide 63

Slide 63 text

Most users will never Thread.new

Slide 64

Slide 64 text

You'll deploy one Rails server for your entire site

Slide 65

Slide 65 text

You'll cut your instance count tenfold

Slide 66

Slide 66 text

Or maybe 100 times

Slide 67

Slide 67 text

Libraries and frameworks will Thread.new for you

Slide 68

Slide 68 text

And on JRuby, you'll have more efficient apps

Slide 69

Slide 69 text

So we're done?

Slide 70

Slide 70 text

Move to JRuby

Slide 71

Slide 71 text

Now your app is fast!

Slide 72

Slide 72 text

Right?

Slide 73

Slide 73 text

It is possible to write efficient Ruby code

Slide 74

Slide 74 text

But it's very easy to write inefficient Ruby code

Slide 75

Slide 75 text

Great Features, Hidden Costs
• Blocks are expensive to create, slower than method calls
• case/when is an O(n) cascade of calls
• Singleton classes/methods are costly and hurt method cache
• Literal arrays, hashes, strings have to be constructed, GCed
• Flow-control exceptions can be very expensive and hard to find
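
To make the last bullet concrete, here is a sketch that compares an exception used as flow control with a plain return. The Found class and both method names are hypothetical and exist only for this illustration; absolute timings depend on the runtime.

```ruby
require "benchmark"

# Hypothetical exception used only to escape from a loop.
class Found < StandardError; end

# Exception as flow control: every hit builds an exception
# object (and usually a backtrace) just to carry one value out.
def first_negative_raise(list)
  list.each { |x| raise Found, x.to_s if x < 0 }
  nil
rescue Found => e
  e.message.to_i
end

# Plain return from the enclosing method.
def first_negative_return(list)
  list.each { |x| return x if x < 0 }
  nil
end

data = [1, 2, -3, 4]
n = 50_000

Benchmark.bm(8) do |x|
  x.report("raise")  { n.times { first_negative_raise(data) } }
  x.report("return") { n.times { first_negative_return(data) } }
end
```

Both methods return the same value; only the way they exit the loop differs, which is why this cost is easy to miss when reading the code.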

Slide 76

Slide 76 text

What happens if your code does not run fast enough?

Slide 77

Slide 77 text

You need to know your app

Slide 78

Slide 78 text

You need good tools

Slide 79

Slide 79 text

And the JVM has great tools

Slide 80

Slide 80 text

CRuby Tooling
• Basic GC stats built in
• Simple profilers in standard library
• Some third-party tools
  • stackprof, ruby-prof, perftools.rb, ...
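
As a small illustration of the built-in GC stats, the sketch below uses GC.count and GC.stat from the core GC module. The allocation loop is an arbitrary workload chosen for this example, and the set of keys in GC.stat differs between runtimes and versions.

```ruby
# Number of GC runs so far in this process.
before = GC.count

# Allocate a batch of short-lived strings to create garbage.
100_000.times { "x" * 100 }

puts "GC runs during loop: #{GC.count - before}"

# GC.stat returns a Hash of runtime-specific counters.
stats = GC.stat
puts "GC.stat reports #{stats.size} counters, e.g. #{stats.keys.first(3).inspect}"
```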

Slide 81

Slide 81 text

JVM tooling is JRuby tooling

Slide 82

Slide 82 text

JVM Tooling
• Wide range of GCs: parallel, concurrent, realtime, pauseless
• Built-in tools for analyzing GC, JIT, thread, IO, heap
• Built-in remote monitoring via JMX
• Dozens of tools out there for profiling, management, and more

Slide 83

Slide 83 text

VisualVM
• Graphical console into your application
• Monitor GC, threads, CPU usage
• Sampled or full profiling with GUI browser
• Live memory dumping, heap inspection
• Ships with every OpenJDK install

Slide 84

Slide 84 text

No content

Slide 85

Slide 85 text

No content

Slide 86

Slide 86 text

Java Mission Control
• Extremely low-overhead application recording
• Analyze results offline in JMC
• GC, CPU, heap events, IO operations all browsable
• Commercial feature, free for development use

Slide 87

Slide 87 text

No content

Slide 88

Slide 88 text

No content

Slide 89

Slide 89 text

No content

Slide 90

Slide 90 text

More on GC
• JVM GCs are incredibly tunable with sensible defaults
• Tools like http://gceasy.io and JClarity give you a deeper view
• These are the best GCs and the best tools in the world

Slide 91

Slide 91 text

Finding Bottlenecks

Slide 92

Slide 92 text

Profiling

Slide 93

Slide 93 text

Profiling Tools
• Command line options: --profile, --sample
• JVM command-line profilers like prof
• Many graphical sampling/complete profiling options
• Flame graphs, stack profilers, you name it!

Slide 94

Slide 94 text

No content

Slide 95

Slide 95 text

No content

Slide 96

Slide 96 text

No content

Slide 97

Slide 97 text

...and this is just the beginning

Slide 98

Slide 98 text

Wrapping Up

Slide 99

Slide 99 text

Ruby is alive and well

Slide 100

Slide 100 text

CRuby continues to improve

Slide 101

Slide 101 text

Ruby 3 is very exciting

Slide 102

Slide 102 text

JRuby is performance today

Slide 103

Slide 103 text

JRuby is concurrency today

Slide 104

Slide 104 text

JRuby has tools today

Slide 105

Slide 105 text

JRuby makes you happier!

Slide 106

Slide 106 text

Thank You!
Charles Oliver Nutter
@headius
http://jruby.org
https://github.com/jruby/jruby