Slide 1

Slide 1 text

REAL WORLD RUBY PERFORMANCE Aaron Quint / @aq / RubyConf 2014

Slide 2

Slide 2 text

@tmm1 @SamSaffron @_ko1 SHOUTOUT

Slide 3

Slide 3 text

We’ll come back to who I am later. It’s [relatively] unimportant. SKIPPING THE INTRO

Slide 4

Slide 4 text

I’ve learned so much over the past 5 years, what could I share? This TALK was HARD TO WRITE

Slide 5

Slide 5 text

It’s a ⌘+C ⌘+V culture. TIPS And tricks are the CLIFF NOTES of tech learning

Slide 6

Slide 6 text

How to THINK about a problem is much more interesting than how to solve it. As a mentor I want to teach philosophy not snippets

Slide 7

Slide 7 text

The tools and tricks will change over time. Today, Take away the process

Slide 8

Slide 8 text

A multi-step process. Ruby Performance as therapy

Slide 9

Slide 9 text

It’s a multi-step process. Relax, Open up. We’re going to go deep.

Slide 10

Slide 10 text

Step 1: Acceptance

Slide 11

Slide 11 text

It’s your Fault.

Slide 12

Slide 12 text

Really?

Slide 13

Slide 13 text

Yes.

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

It’s not you, It’s me.

Slide 16

Slide 16 text

It’s not you, It’s me.

Slide 17

Slide 17 text

— George Costanza (Inventor of “It’s not you, it’s me”) It’s not you, It’s me.

Slide 18

Slide 18 text

Performance is about context

Slide 19

Slide 19 text

Doesn’t scale for what? To what degree? With what hardware? … “X Doesn’t SCALE” IS BS

Slide 20

Slide 20 text

So when we talk about our ruby being slow

Slide 21

Slide 21 text

No content

Slide 22

Slide 22 text

Rails

Slide 23

Slide 23 text

Rails 10ms

Slide 24

Slide 24 text

Rails Your application 10ms

Slide 25

Slide 25 text

Rails Your application DB 10ms

Slide 26

Slide 26 text

Rails Your application DB 10ms 20ms

Slide 27

Slide 27 text

Rails Your application DB Cache 10ms 20ms

Slide 28

Slide 28 text

Rails Your application DB Cache 10ms 20ms 10ms

Slide 29

Slide 29 text

Rails: 10ms · DB: 20ms · Cache: 10ms · Your application: 250ms
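To put numbers on that build-up, here is a minimal sketch of the arithmetic. It assumes the figures pair up in the order the slides introduce them (Rails 10ms, DB 20ms, Cache 10ms, and the final 250ms belonging to "Your application"); that pairing is a reading of the slide sequence, not something the deck states outright.

```ruby
# Assumed pairing of components and timings, read off the slide build order.
BREAKDOWN_MS = {
  "Rails" => 10,
  "DB" => 20,
  "Cache" => 10,
  "Your application" => 250
}.freeze

# Returns each component's share of the total request time, in percent.
def time_shares(breakdown_ms)
  total = breakdown_ms.values.sum
  breakdown_ms.transform_values { |ms| (100.0 * ms / total).round(1) }
end

total_ms = BREAKDOWN_MS.values.sum
time_shares(BREAKDOWN_MS).each do |component, pct|
  puts format("%-18s %5d ms  %5.1f%%", component, BREAKDOWN_MS[component], pct)
end
puts format("%-18s %5d ms", "Total", total_ms)
```

Under that reading, application code accounts for roughly 86% of a 290ms request, which is the point of the next slide.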

Slide 30

Slide 30 text

IT’S MY FAULT.

Slide 31

Slide 31 text

Step 2: Diagnosis

Slide 32

Slide 32 text

Where did I go wrong?

Slide 33

Slide 33 text

METRICS! Measurement! MMMNUMBERS! Milliseconds MATTER!

Slide 34

Slide 34 text

Use the right one for the job. Tools abound!

Slide 35

Slide 35 text

Step 3: Treatment

Slide 36

Slide 36 text

What are the steps to fix this problem?

Slide 37

Slide 37 text

How many strokes for the lowest #? Playing golf.

Slide 38

Slide 38 text

Two angles of optimization

Slide 39

Slide 39 text

Proxies/Balancers Application Datastores Filesystem/OS/Hardware Individual Request Path (Controller#action)

Slide 40

Slide 40 text

aka, speeding up a single query, controller action, or code path Vertical: Fix individual Elements

Slide 41

Slide 41 text

aka, Adding more workers per-node, buying better hardware Horizontal: Address hardware or software across a cluster

Slide 42

Slide 42 text

Important Themes:

Slide 43

Slide 43 text

Context is crucial to acceptance

Slide 44

Slide 44 text

Visibility and Introspectability are crucial to diagnosis

Slide 45

Slide 45 text

Knowing your tools is crucial to treatment

Slide 46

Slide 46 text

I’m Aaron Quint. I’m the chief Scientist at Paperless Post.

Slide 47

Slide 47 text

No content

Slide 48

Slide 48 text

Opposing forces. Features vs. speed

Slide 49

Slide 49 text

We realized that being fast meant being stable

Slide 50

Slide 50 text

CASE STUDIES in performance therapy

Slide 51

Slide 51 text

No content

Slide 52

Slide 52 text

Case 1: JSON FOR DAYS

Slide 53

Slide 53 text

No content

Slide 54

Slide 54 text

No content

Slide 55

Slide 55 text

No content

Slide 56

Slide 56 text

package:7292:1123434234234

Slide 57

Slide 57 text

package:7290:11234342343424 partner:8:11234342343424 partner:8:11234342343424

Slide 58

Slide 58 text

package:7290:11234342343424 partner:8:11234342343424 partner:8:11234342343424 package:7292:1123434234234

Slide 59

Slide 59 text

package:7290:11234342343424 partner:8:11234342343424 partner:8:11234342343424 package:7292:1123434234234

Slide 60

Slide 60 text

Uncached performance is still a problem

Slide 61

Slide 61 text

ppprofiler to the rescue

Slide 62

Slide 62 text

ppprofiler

Slide 63

Slide 63 text

ppprofiler
• Auto-cache toggling
• Benchmark
• rblineprof
• AS::Notifications counts (SQL/Cache, etc.)
• MemoryProfiler (NEW!)
• Gist-able (markdown) output
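ppprofiler's source is not shown in the slides, so the following is only a stdlib stand-in for two of the listed ideas: timing a block with Benchmark and emitting a markdown table that can be pasted into a gist. The helper names `profile_report` and `to_markdown` are hypothetical, and the allocation count from `GC.stat` is a rough substitute for a real memory profiler, not what ppprofiler does.

```ruby
require "benchmark"
require "json"

# Hypothetical helper: runs the block `runs` times and records the average
# wall time plus a rough count of objects allocated per run.
def profile_report(label, runs: 5)
  GC.start
  allocated_before = GC.stat(:total_allocated_objects)
  seconds = Benchmark.realtime { runs.times { yield } }
  allocated = GC.stat(:total_allocated_objects) - allocated_before
  {
    label: label,
    runs: runs,
    avg_ms: (seconds * 1000.0 / runs).round(3),
    objects_per_run: allocated / runs
  }
end

# Hypothetical helper: renders report rows as a markdown table.
def to_markdown(rows)
  lines = ["| label | runs | avg ms | objects/run |", "| --- | --- | --- | --- |"]
  rows.each do |r|
    lines << "| #{r[:label]} | #{r[:runs]} | #{r[:avg_ms]} | #{r[:objects_per_run]} |"
  end
  lines.join("\n")
end

# Made-up payload standing in for a large JSON response.
payload = Array.new(1_000) { |i| { id: i, name: "package #{i}", tags: %w[a b c] } }

rows = [
  profile_report("JSON.generate") { JSON.generate(payload) },
  profile_report("JSON.pretty_generate") { JSON.pretty_generate(payload) }
]
puts to_markdown(rows)
```

Comparing rows before and after a change gives a repeatable record of whether an optimization actually helped.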

Slide 64

Slide 64 text

No content

Slide 65

Slide 65 text

No content

Slide 66

Slide 66 text

No content

Slide 67

Slide 67 text

Rinse and Repeat Make the slowest lines faster

Slide 68

Slide 68 text

No content

Slide 69

Slide 69 text

No content

Slide 70

Slide 70 text

No content

Slide 71

Slide 71 text

No content

Slide 72

Slide 72 text

No content

Slide 73

Slide 73 text

Case 2: FINGER IN THE SOCKET

Slide 74

Slide 74 text

Before Vday we were looking for any wins

Slide 75

Slide 75 text

IN BETWEEN THE LINES! stackprof + stackprof-remote

Slide 76

Slide 76 text

No content

Slide 77

Slide 77 text

Ruby Process (Unicorn)

Slide 78

Slide 78 text

Ruby Process (Unicorn)

Slide 79

Slide 79 text

Ruby Process (Unicorn)

Slide 80

Slide 80 text

Ruby Process (Unicorn) AC::Dispatch

Slide 81

Slide 81 text

Ruby Process (Unicorn) AC::Dispatch MyController::Create

Slide 82

Slide 82 text

Ruby Process (Unicorn) AC::Dispatch MyController::Create Template::Render

Slide 83

Slide 83 text

Ruby Process (Unicorn) AC::Dispatch MyController::Create Template::Render Ar::Find

Slide 84

Slide 84 text

Ruby Process (Unicorn)

Slide 85

Slide 85 text

Ruby Process (Unicorn) StackProf.start rb_profile_frames() rb_profile_frames() rb_profile_frames() rb_profile_frames() StackProf.stop StackProf.dump
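The start/sample/stop cycle on this slide can be sketched with the stackprof gem. This is a minimal illustration, not the setup used in the talk: `sample_cpu` and `top_frames` are hypothetical helpers, the workload is made up, and the sketch reads the aggregated samples with `StackProf.results` rather than writing a dump file. It falls back to simply running the block when the gem is not installed.

```ruby
begin
  require "stackprof" # third-party gem: gem install stackprof
rescue LoadError
  # Without the gem the helpers below run the block but collect nothing.
end

# Hypothetical helper: samples the block in CPU mode and returns the
# aggregated results hash, or nil when stackprof is unavailable.
def sample_cpu(interval: 1000)
  unless defined?(StackProf)
    yield
    return nil
  end
  StackProf.start(mode: :cpu, interval: interval) # interval in microseconds
  begin
    yield
  ensure
    StackProf.stop
  end
  StackProf.results
end

# Hypothetical helper: frames with the most self samples, hottest first.
def top_frames(results, limit = 5)
  return [] if results.nil?
  results[:frames].values.sort_by { |frame| -frame[:samples] }.first(limit)
end

# Made-up CPU-bound workload standing in for a request.
results = sample_cpu { 300_000.times { |i| Math.sqrt(i).to_s } }

if results
  puts "Samples: #{results[:samples]}"
  top_frames(results).each do |frame|
    puts format("%6d  %6d  %s", frame[:total_samples], frame[:samples], frame[:name])
  end
else
  puts "stackprof not installed; nothing sampled"
end
```

Because the profiler only records a stack at each sampling tick, the overhead stays low enough to consider running it against a live process.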

Slide 86

Slide 86 text

[paperless@production-webapp10 current]$ stackprof tmp/stackprof-cpu-30715-1391204970.dump
==================================
  Mode: cpu(1000)
  Samples: 1761 (3.61% miss rate)
  GC: 128 (7.27%)
==================================
     TOTAL    (pct)     SAMPLES    (pct)     FRAME
       344  (19.5%)         342  (19.4%)     Statsd#send_to_socket
       393  (22.3%)          44   (2.5%)     Statsd#sampled
        44   (2.5%)          44   (2.5%)     block in ActiveRecord::ConnectionAdapters::PostgreSQLPoolAdapter#execute
        56   (3.2%)          29   (1.6%)     block in ActiveSupport::Notifications::Fanout#listeners_for
        29   (1.6%)          29   (1.6%)     ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#extract_pg_identifier_from_name
        26   (1.5%)          26   (1.5%)     ActiveSupport::Notifications::Fanout::Subscribers::Evented#subscribed_to?
        25   (1.4%)          25   (1.4%)     String#blank?
        25   (1.4%)          25   (1.4%)     block (2 levels) in ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#select
        24   (1.4%)          24   (1.4%)     ActiveRecord::Base.scoped_methods
        22   (1.2%)          22   (1.2%)     Dalli::Server::KSocket#kgio_wait_readable
        21   (1.2%)          21   (1.2%)     ActiveSupport::CoreExtensions::Hash::Keys#assert_valid_keys
        42   (2.4%)          20   (1.1%)     block in Dalli::Server::KSocket#readfull
        28   (1.6%)          19   (1.1%)     ActiveRecord::ConnectionAdapters::ConnectionHandler#retrieve_connection_pool
        18   (1.0%)          18   (1.0%)     #.instrumenter
        17   (1.0%)          16   (0.9%)     Dalli::Server#deserialize
        15   (0.9%)          15   (0.9%)     block (2 levels) in ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#select_raw
        14   (0.8%)          14   (0.8%)     #.decode_www_form_component
        13   (0.7%)          13   (0.7%)     Dalli::Server#write
        15   (0.9%)          11   (0.6%)     ActiveSupport::CoreExtensions::Time::Calculations#minus_with_coercion
        10   (0.6%)          10   (0.6%)     block in ActiveRecord::Base.with_scope
        10   (0.6%)          10   (0.6%)     block in ActiveRecord::ConnectionAdapters::QueryCache#cache_sql
        21   (1.2%)          10   (0.6%)     Yajl::Encoder.encode
        10   (0.6%)          10   (0.6%)     Set#add
        10   (0.6%)          10   (0.6%)     block (2 levels) in ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#result_as_array
        10   (0.6%)          10   (0.6%)     ActiveSupport::CoreExtensions::Time::Calculations::ClassMethods#time_with_datetime_fallback
         9   (0.5%)           9   (0.5%)     ActiveRecord::DynamicFinderMatch#initialize
         9   (0.5%)           9   (0.5%)     ActiveSupport::LogSubscriber.logger
         9   (0.5%)           9   (0.5%)     block in ActionController::Base.action_methods
         9   (0.5%)           9   (0.5%)     block in ActionController::Base.action_methods
         9   (0.5%)           9   (0.5%)     block (2 levels) in ActiveRecord::Base.connection_handler=

Slide 87

Slide 87 text

Hmm, why is statsd slow?

Slide 88

Slide 88 text

Pull out good old benchmark

Slide 89

Slide 89 text

$ ruby test/profile/statsd.rb
                            user     system      total        real
udp with connect        0.010000   0.000000   0.010000 (  0.074522)
udp without connect     0.120000   0.530000   0.650000 ( 13.096515)
statsd with connect     0.000000   0.090000   0.090000 (  0.103520)
statsd without connect  0.100000   0.620000   0.720000 ( 13.483539)
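The script test/profile/statsd.rb is not included in the slides, so the following is only a guess at the shape of the "udp with connect" versus "udp without connect" comparison. It assumes the difference is between calling `connect` once on a `UDPSocket` and passing the host and port on every `send`. The local sink socket, payload, and iteration count are all made up so the example runs without a statsd server, and the timings it prints will not match the slide (against a loopback IP address the gap may be small; an unconnected send to a hostname has to resolve the address on every call).

```ruby
require "benchmark"
require "socket"

# Hypothetical local sink so no real statsd server is needed.
SINK = UDPSocket.new
SINK.bind("127.0.0.1", 0)
HOST = "127.0.0.1"
PORT = SINK.addr[1]
PAYLOAD = "page.views:1|c" # statsd-style counter line, for illustration
N = 2_000

# Connect once, then send without repeating the destination.
def send_with_connect(host, port, payload, n)
  socket = UDPSocket.new
  socket.connect(host, port)
  n.times { socket.send(payload, 0) }
  n
ensure
  socket&.close
end

# Pass the destination on every send.
def send_without_connect(host, port, payload, n)
  socket = UDPSocket.new
  n.times { socket.send(payload, 0, host, port) }
  n
ensure
  socket&.close
end

Benchmark.bm(22) do |bm|
  bm.report("udp with connect")    { send_with_connect(HOST, PORT, PAYLOAD, N) }
  bm.report("udp without connect") { send_without_connect(HOST, PORT, PAYLOAD, N) }
end
```

Keeping both variants in one script makes it easy to rerun the comparison against the real metrics host before and after changing the client.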

Slide 90

Slide 90 text

WIN!

Slide 91

Slide 91 text

No content

Slide 92

Slide 92 text

Case 3: THE HOLIDAY SCALE

Slide 93

Slide 93 text

No content

Slide 94

Slide 94 text

Sometimes you can throw money at the problem

Slide 95

Slide 95 text

No content

Slide 96

Slide 96 text

Case 4: SHRINKING THE GAP

Slide 97

Slide 97 text

Start at the top, work your way down. Starting with a HITLIST

Slide 98

Slide 98 text

Number of Requests × 90th Percentile Response Time = Total Time
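One way to build such a hitlist is sketched below. It reads the slide as a score per endpoint equal to request count times 90th percentile response time, sorted from largest to smallest; that reading is an assumption. The controller names and timings are invented, and the nearest-rank percentile is just one common definition, not necessarily the one used in the talk.

```ruby
# Nearest-rank percentile: the smallest value with at least pct% of the
# samples at or below it.
def percentile(values, pct)
  sorted = values.sort
  rank = (pct * sorted.size).fdiv(100).ceil
  sorted[[rank - 1, 0].max]
end

# Builds the hitlist: one row per action, highest total time first.
def hitlist(samples_by_action)
  rows = samples_by_action.map do |action, times_ms|
    p90 = percentile(times_ms, 90)
    { action: action, requests: times_ms.size, p90_ms: p90, score_ms: times_ms.size * p90 }
  end
  rows.sort_by { |row| -row[:score_ms] }
end

# Invented response times in milliseconds, keyed by hypothetical actions.
SAMPLES_MS = {
  "InvitesController#show" => [100] * 9 + [400],
  "HomeController#index" => [20] * 200,
  "AdminController#report" => [900, 1000]
}.freeze

hitlist(SAMPLES_MS).each do |row|
  puts format("%-26s %4d req  p90 %5d ms  total %6d ms",
              row[:action], row[:requests], row[:p90_ms], row[:score_ms])
end
```

In this made-up data the fastest endpoint tops the list because it is hit so often, which is why the score weighs traffic as well as latency.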

Slide 99

Slide 99 text

No content

Slide 100

Slide 100 text

No content

Slide 101

Slide 101 text

Using Stackprof flamegraphs on production.

Slide 102

Slide 102 text

Using Stackprof flamegraphs on production. SET IT ON FIRE!

Slide 103

Slide 103 text

No content

Slide 104

Slide 104 text

No content

Slide 105

Slide 105 text

No content

Slide 106

Slide 106 text

No content

Slide 107

Slide 107 text

No content

Slide 108

Slide 108 text

Big wins are not the point

Slide 109

Slide 109 text

If you’re not failing you’re not being honest

Slide 110

Slide 110 text

Don’t just make tools, learn to use them

Slide 111

Slide 111 text

twitter: @aq github.com/quirkey github.com/paperlesspost Thanks!