Real World Ruby Performance
Aaron Quint
November 19, 2014
My talk from RubyConf 2014 about Ruby Performance and the philosophy of performance.
Transcript
Real World Ruby Performance. Aaron Quint / @aq / RubyConf 2014
Shoutout: @tmm1 @SamSaffron @_ko1
Skipping the intro. We’ll come back to who I am later. It’s [relatively] unimportant.
This talk was hard to write. I’ve learned so much over the past 5 years, what could I share?
Tips and tricks are the Cliff Notes of tech learning. It’s a ⌘+C ⌘+V culture.
As a mentor I want to teach philosophy, not snippets. How to think about a problem is much more interesting than how to solve it.
Today, take away the process. The tools and tricks will change over time.
Ruby performance as therapy. A multi-step process.
Relax, open up. We’re going to go deep.
Step 1: Acceptance
It’s your Fault.
Really?
Yes.
It’s not you, It’s me.
— George Costanza (Inventor of “It’s not you, it’s me”)
Performance is about context
“X doesn’t scale” is BS. Doesn’t scale for what? To what degree? With what hardware? …
So when we talk about our Ruby being slow
Request time by layer (diagram): Rails 10ms, DB 20ms, Cache 10ms, Your application 250ms
IT’s MY FAULT.
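To put a number on that slide, here is a minimal sketch that turns the per-layer timings into shares of the total. It assumes the figures on the diagram pair up as Rails 10ms, DB 20ms, Cache 10ms and your application 250ms; that pairing is read off the slide build order and is an assumption, as is the helper name `share_by_layer`.

```ruby
# Timings taken from the slide; the pairing of numbers to layers is an assumption.
TIMINGS_MS = {
  "Rails" => 10,
  "DB" => 20,
  "Cache" => 10,
  "Your application" => 250
}.freeze

# Hypothetical helper: percentage of total request time spent in each layer.
def share_by_layer(timings)
  total = timings.values.sum.to_f
  timings.transform_values { |ms| (ms / total * 100).round(1) }
end

# Print the layers, largest share first.
share_by_layer(TIMINGS_MS).sort_by { |_, pct| -pct }.each do |layer, pct|
  puts format("%-18s %5.1f%%", layer, pct)
end
```

Under those assumed numbers the application code accounts for roughly 86% of the request, which is the point of the slide.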
Step 2: Diagnosis
Where did I go wrong?
METRICS! Measurement! MMMNUMBERS! Milliseconds MATTER!
Tools abound! Use the right one for the job.
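As a starting point for measurement, a minimal sketch using only Ruby's standard `Benchmark` module. The helper name `measure_ms` and the workload are made up for illustration; this is not a tool from the talk.

```ruby
require "benchmark"

# Hypothetical helper: run a block and return [result, elapsed milliseconds].
def measure_ms
  result = nil
  seconds = Benchmark.realtime { result = yield }
  [result, (seconds * 1000.0).round(3)]
end

# Example workload: sum the integers 1..100_000.
value, ms = measure_ms { (1..100_000).reduce(:+) }
puts "sum=#{value} took #{ms}ms"
```

Wall-clock time from a single run is noisy, so in practice it helps to repeat the measurement and compare distributions rather than single numbers.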
Step 3: Treatment
What are the steps to fix this problem?
Playing golf. How many strokes for the lowest #?
Two angles of optimization
Layers: Proxies/Balancers, Application, Datastores, Filesystem/OS/Hardware. Individual request path (Controller#action).
Vertical: Fix individual elements, aka speeding up a single query, controller action, or code path.
Horizontal: Address hardware or software across a cluster, aka adding more workers per node, buying better hardware.
Important Themes:
Context is crucial to acceptance
Visibility and introspectability are crucial to diagnosis
Knowing your tools is crucial to treatment
I’m Aaron Quint. I’m the chief Scientist at Paperless Post.
Opposing forces. Features vs. speed
We realized that being fast meant being stable
CASE STUDIES in performance therapy
Case 1: JSON FOR DAYS
package:7290:11234342343424 partner:8:11234342343424 partner:8:11234342343424 package:7292:1123434234234
Uncached performance is still a problem
ppprofiler to the rescue
ppprofiler
• Auto-cache toggling
• Benchmark
• rblineprof
• AS::Notifications counts (SQL/Cache, etc.)
• MemoryProfiler (NEW!)
• Gist-able (markdown) output
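The idea of bundling timing, allocation counts and markdown output into one report can be sketched with the standard library alone. This is not ppprofiler's implementation; `profile_block` and `to_markdown` are hypothetical names, and `GC.stat(:total_allocated_objects)` stands in for a real memory profiler.

```ruby
require "benchmark"

# Hypothetical mini-report: time a block and count objects allocated while it ran.
def profile_block(label)
  GC.start
  before = GC.stat(:total_allocated_objects)
  seconds = Benchmark.realtime { yield }
  allocated = GC.stat(:total_allocated_objects) - before
  { label: label, ms: (seconds * 1000).round(2), allocated_objects: allocated }
end

# Render the reports as a markdown table that can be pasted into a gist.
def to_markdown(reports)
  header = ["| step | ms | allocated objects |", "| --- | --- | --- |"]
  rows = reports.map { |r| "| #{r[:label]} | #{r[:ms]} | #{r[:allocated_objects]} |" }
  (header + rows).join("\n")
end

reports = [
  profile_block("build strings") { Array.new(1_000) { |i| "item-#{i}" } },
  profile_block("join")          { (1..1_000).to_a.join(",") }
]
puts to_markdown(reports)
```

Keeping the output as markdown makes it easy to paste a before and after run side by side when reviewing a change.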
Rinse and repeat. Make the slowest lines faster.
Case 2: FINGER IN THE SOCKET
Before Valentine’s Day we were looking for any wins
IN BETWEEN THE LINES! stackprof + stackprof-remote
Ruby Process (Unicorn) AC::Dispatch MyController::Create Template::Render Ar::Find
Ruby Process (Unicorn) StackProf.start rb_profile_frames() rb_profile_frames() rb_profile_frames() rb_profile_frames() StackProf.stop StackProf.dump
[paperless@production-webapp10 current]$ stackprof tmp/stackprof-cpu-30715-1391204970.dump
==================================
  Mode: cpu(1000)
  Samples: 1761 (3.61% miss rate)
  GC: 128 (7.27%)
==================================
     TOTAL    (pct)     SAMPLES    (pct)     FRAME
       344  (19.5%)         342  (19.4%)     Statsd#send_to_socket
       393  (22.3%)          44   (2.5%)     Statsd#sampled
        44   (2.5%)          44   (2.5%)     block in ActiveRecord::ConnectionAdapters::PostgreSQLPoolAdapter#execute
        56   (3.2%)          29   (1.6%)     block in ActiveSupport::Notifications::Fanout#listeners_for
        29   (1.6%)          29   (1.6%)     ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#extract_pg_identifier_from_name
        26   (1.5%)          26   (1.5%)     ActiveSupport::Notifications::Fanout::Subscribers::Evented#subscribed_to?
        25   (1.4%)          25   (1.4%)     String#blank?
        25   (1.4%)          25   (1.4%)     block (2 levels) in ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#select
        24   (1.4%)          24   (1.4%)     ActiveRecord::Base.scoped_methods
        22   (1.2%)          22   (1.2%)     Dalli::Server::KSocket#kgio_wait_readable
        21   (1.2%)          21   (1.2%)     ActiveSupport::CoreExtensions::Hash::Keys#assert_valid_keys
        42   (2.4%)          20   (1.1%)     block in Dalli::Server::KSocket#readfull
        28   (1.6%)          19   (1.1%)     ActiveRecord::ConnectionAdapters::ConnectionHandler#retrieve_connection_pool
        18   (1.0%)          18   (1.0%)     #<Module:0x00000002004b08>.instrumenter
        17   (1.0%)          16   (0.9%)     Dalli::Server#deserialize
        15   (0.9%)          15   (0.9%)     block (2 levels) in ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#select_raw
        14   (0.8%)          14   (0.8%)     #<Module:0x000000033e25d0>.decode_www_form_component
        13   (0.7%)          13   (0.7%)     Dalli::Server#write
        15   (0.9%)          11   (0.6%)     ActiveSupport::CoreExtensions::Time::Calculations#minus_with_coercion
        10   (0.6%)          10   (0.6%)     block in ActiveRecord::Base.with_scope
        10   (0.6%)          10   (0.6%)     block in ActiveRecord::ConnectionAdapters::QueryCache#cache_sql
        21   (1.2%)          10   (0.6%)     Yajl::Encoder.encode
        10   (0.6%)          10   (0.6%)     Set#add
        10   (0.6%)          10   (0.6%)     block (2 levels) in ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#result_as_array
        10   (0.6%)          10   (0.6%)     ActiveSupport::CoreExtensions::Time::Calculations::ClassMethods#time_with_datetime_fallback
         9   (0.5%)           9   (0.5%)     ActiveRecord::DynamicFinderMatch#initialize
         9   (0.5%)           9   (0.5%)     ActiveSupport::LogSubscriber.logger
         9   (0.5%)           9   (0.5%)     block in ActionController::Base.action_methods
         9   (0.5%)           9   (0.5%)     block in ActionController::Base.action_methods
         9   (0.5%)           9   (0.5%)     block (2 levels) in ActiveRecord::Base.connection_handler=
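Reading that report hinges on the difference between the TOTAL and SAMPLES columns. As a rough sketch of how a sampling profiler arrives at them (not stackprof's actual code): TOTAL counts samples in which a frame appears anywhere on the stack, SAMPLES counts those in which it was the frame executing. The `aggregate` helper and the tiny sample set are invented for illustration; only the frame names come from the output above.

```ruby
# Each sample is a call stack, outermost frame first.
# Returns [total, self_samples], two hashes of frame name => count.
def aggregate(samples)
  total = Hash.new(0)
  self_samples = Hash.new(0)
  samples.each do |stack|
    next if stack.empty?
    stack.uniq.each { |frame| total[frame] += 1 } # once per sample, even if recursive
    self_samples[stack.last] += 1                 # innermost frame was on the CPU
  end
  [total, self_samples]
end

# Illustrative samples, not real profiler data.
samples = [
  %w[Statsd#sampled Statsd#send_to_socket],
  %w[Statsd#sampled Statsd#send_to_socket],
  %w[Statsd#sampled],
  %w[Yajl::Encoder.encode]
]

total, self_samples = aggregate(samples)
total.sort_by { |_, count| -count }.each do |frame, count|
  puts format("%5d %7d  %s", count, self_samples[frame], frame)
end
```

This is why `Statsd#sampled` can show a high TOTAL with few SAMPLES in the report: most of its time is spent inside the method it calls, `Statsd#send_to_socket`.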
Hmm, why is statsd slow?
Pull out good old benchmark
$ ruby test/profile/statsd.rb
                             user     system      total        real
udp with connect         0.010000   0.000000   0.010000 (  0.074522)
udp without connect      0.120000   0.530000   0.650000 ( 13.096515)
statsd with connect      0.000000   0.090000   0.090000 (  0.103520)
statsd without connect   0.100000   0.620000   0.720000 ( 13.483539)
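The script behind that benchmark is not shown in the slides. A minimal sketch of the same comparison, using only `UDPSocket` and `Benchmark` from the standard library, might look like the following. The method names, the payload format and the throwaway local receiver are all assumptions for illustration.

```ruby
require "benchmark"
require "socket"

# Connect once, then send every message to the socket's fixed destination.
def send_with_connect(host, port, messages)
  socket = UDPSocket.new
  socket.connect(host, port)
  messages.each { |message| socket.send(message, 0) }
  messages.size
ensure
  socket&.close
end

# Pass the destination on every send, so it is handled per call.
def send_without_connect(host, port, messages)
  socket = UDPSocket.new
  messages.each { |message| socket.send(message, 0, host, port) }
  messages.size
ensure
  socket&.close
end

# A throwaway local receiver so the datagrams have somewhere to go.
receiver = UDPSocket.new
receiver.bind("127.0.0.1", 0)
port = receiver.addr[1]
messages = Array.new(100) { |i| "requests:#{i}|c" }

Benchmark.bm(22) do |bm|
  bm.report("udp with connect")    { send_with_connect("127.0.0.1", port, messages) }
  bm.report("udp without connect") { send_without_connect("127.0.0.1", port, messages) }
end
receiver.close
```

With an IP literal on loopback the gap will probably be small. One plausible explanation for the large gap in the talk's numbers is that an unconnected send with a hostname resolves the address on every call, but the slides do not confirm this.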
WIN!
Case 3: THE HOLIDAY SCALE
Sometimes you can throw money at the problem
Case 4: SHRINKING THE GAP
Starting with a hitlist. Start at the top, work your way down.
Number of Requests × 90th Percentile Response Time = Total Time
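The hitlist formula is simple enough to sketch directly. The version below assumes response times are available per controller action and uses a nearest-rank percentile; the action names, the numbers and the helpers `percentile` and `hitlist` are all hypothetical.

```ruby
# Nearest-rank percentile of a non-empty list; pct is an Integer from 1 to 100.
def percentile(values, pct)
  sorted = values.sort
  rank = (pct * sorted.size + 99) / 100 # integer ceiling of pct% of the count
  sorted[[rank, 1].max - 1]
end

# requests_by_action maps "Controller#action" to a list of response times in ms.
# Returns rows sorted by total time (requests x p90), biggest offender first.
def hitlist(requests_by_action)
  rows = requests_by_action.map do |action, times|
    p90 = percentile(times, 90)
    { action: action, requests: times.size, p90_ms: p90, total_ms: times.size * p90 }
  end
  rows.sort_by { |row| -row[:total_ms] }
end

# Hypothetical data for illustration only.
sample = {
  "PackagesController#show" => [120, 130, 110, 400, 125, 118, 122, 119, 121, 500],
  "HomeController#index"    => [20, 22, 19, 25, 21, 23, 20, 24, 22, 30] * 10,
  "AdminController#report"  => [2000, 2500]
}

hitlist(sample).each do |row|
  puts format("%-26s %4d req  p90 %5dms  total %6dms",
              row[:action], row[:requests], row[:p90_ms], row[:total_ms])
end
```

Multiplying by request count keeps a rarely hit slow action from outranking a moderately slow one that runs constantly, which is what makes the list useful for working from the top down.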
Using Stackprof flamegraphs on production. SET IT ON FIRE!
Big wins are not the point
If you’re not failing you’re not being honest
Don’t just make tools, learn to use them
Thanks! twitter: @aq, github.com/quirkey, github.com/paperlesspost