Real World Ruby Performance
Aaron Quint
November 19, 2014
My talk from RubyConf 2014 about Ruby Performance and the philosophy of performance.
Transcript
REAL WORLD RUBY PERFORMANCE. Aaron Quint / @aq / RubyConf 2014
SHOUTOUT: @tmm1 @SamSaffron @_ko1
SKIPPING THE INTRO: We'll come back to who I am later. It's [relatively] unimportant.
This TALK was HARD TO WRITE: I've learned so much over the past 5 years, what could I share?
TIPS and tricks are the CLIFF NOTES of tech learning: It's a ⌘+C ⌘+V culture.
As a mentor I want to teach philosophy, not snippets: How to THINK about a problem is much more interesting than how to solve it.
Today, take away the process: The tools and tricks will change over time.
Ruby Performance as therapy: A multi-step process.
It's a multi-step process. Relax, open up. We're going to go deep.
Step 1: Acceptance
It’s your Fault.
Really?
Yes.
"It's not you, it's me."
George Costanza (inventor of "It's not you, it's me")
Performance is about context
"X DOESN'T SCALE" IS BS: Doesn't scale for what? To what degree? With what hardware? …
So when we talk about our Ruby being slow...
Rails: 10ms
DB: 20ms
Cache: 10ms
Your application: 250ms
IT'S MY FAULT.
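The arithmetic behind that conclusion can be sketched in a few lines. This assumes the 250ms figure belongs to "Your application", which is what the order of the slide build-up suggests; the variable names are only illustrative.

```ruby
# Timings read off the request breakdown slide, in milliseconds.
# Assumption: the 250ms figure is the "Your application" slice.
timings_ms = {
  "Rails" => 10,
  "DB" => 20,
  "Cache" => 10,
  "Your application" => 250
}

total = timings_ms.values.sum

# Share of the total request time spent in each layer, as a percentage.
shares = timings_ms.transform_values { |ms| (100.0 * ms / total).round(1) }

shares.each { |layer, pct| puts format("%-17s %5.1f%%", layer, pct) }
puts format("%-17s %5d ms", "Total", total)
```

Under that assumption, application code accounts for well over four fifths of the request, which is the point of the slide.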
Step 2: Diagnosis
Where did I go wrong?
METRICS! Measurement! MMMNUMBERS! Milliseconds MATTER!
Tools abound! Use the right one for the job.
Step 3: Treatment
What are the steps to fix this problem?
Playing golf: How many strokes for the lowest #?
Two angles of optimization
[Diagram: the stack (Proxies/Balancers, Application, Datastores, Filesystem/OS/Hardware) and an individual request path (Controller#action)]
Vertical: Fix individual elements, aka speeding up a single query, controller action, or code path.
Horizontal: Address hardware or software across a cluster, aka adding more workers per node or buying better hardware.
Important Themes:
Context is crucial to acceptance
Visibility and introspectability are crucial to diagnosis
Knowing your tools is crucial to treatment
I'm Aaron Quint. I'm the Chief Scientist at Paperless Post.
Features vs. speed: Opposing forces.
We realized that being fast meant being stable
CASE STUDIES in performance therapy
Case 1: JSON FOR DAYS
[Slide: cache keys]
package:7290:11234342343424
partner:8:11234342343424
partner:8:11234342343424
package:7292:1123434234234
Uncached performance is still a problem
ppprofiler to the rescue
ppprofiler
• Auto-cache toggling
• Benchmark
• Rblineprof
• AS::Notification counts (SQL/Cache, etc.)
• MemoryProfiler (NEW!)
• Gist-able (markdown) output
Rinse and repeat: Make the slowest lines faster
Case 2: FINGER IN THE SOCKET
Before Valentine's Day we were looking for any wins
IN BETWEEN THE LINES! stackprof + stackprof-remote
Ruby Process (Unicorn): AC::Dispatch > MyController::Create > Template::Render > Ar::Find
Ruby Process (Unicorn): StackProf.start, then rb_profile_frames() at each sample, then StackProf.stop and StackProf.dump
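The start / sample / stop / dump cycle on this slide is the general shape of a sampling profiler. The sketch below is not stackprof and does not use its API. It is a toy, pure-Ruby illustration of the same idea, with a hypothetical `ToySampler` class that periodically records the top frame of a target thread and then reports the most frequently seen frames.

```ruby
# Toy sampling profiler: NOT stackprof, just the same idea in plain Ruby.
# ToySampler and busy_work are hypothetical names used for illustration.
class ToySampler
  def initialize(interval: 0.001)
    @interval = interval          # seconds to sleep between samples
    @samples = Hash.new(0)        # frame label => number of times seen
  end

  # Start sampling, run the block on the target thread, then stop.
  def run(target = Thread.current)
    running = true
    sampler = Thread.new do
      while running
        # Grab only the top frame of the target thread's stack.
        frame = target.backtrace_locations(0, 1)&.first
        @samples[frame.label] += 1 if frame
        sleep @interval
      end
    end
    yield
  ensure
    running = false
    sampler&.join
  end

  # "Dump": frames sorted by sample count, most frequent first.
  def report
    @samples.sort_by { |_label, count| -count }
  end
end

def busy_work
  50_000.times.reduce(0) { |sum, i| sum + i * i }
end

profiler = ToySampler.new
deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + 0.5
profiler.run do
  busy_work while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
end

profiler.report.first(5).each do |label, count|
  puts format("%6d  %s", count, label)
end
```

Because the sampler only looks at the stack occasionally, its overhead stays low, which is what makes this style of profiler usable against production processes. A real tool samples from C and records whole stacks rather than just the top frame.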
[paperless@production-webapp10 current]$ stackprof tmp/stackprof-cpu-30715-1391204970.dump
==================================
  Mode: cpu(1000)
  Samples: 1761 (3.61% miss rate)
  GC: 128 (7.27%)
==================================
     TOTAL    (pct)     SAMPLES    (pct)     FRAME
       344  (19.5%)         342  (19.4%)     Statsd#send_to_socket
       393  (22.3%)          44   (2.5%)     Statsd#sampled
        44   (2.5%)          44   (2.5%)     block in ActiveRecord::ConnectionAdapters::PostgreSQLPoolAdapter#execute
        56   (3.2%)          29   (1.6%)     block in ActiveSupport::Notifications::Fanout#listeners_for
        29   (1.6%)          29   (1.6%)     ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#extract_pg_identifier_from_name
        26   (1.5%)          26   (1.5%)     ActiveSupport::Notifications::Fanout::Subscribers::Evented#subscribed_to?
        25   (1.4%)          25   (1.4%)     String#blank?
        25   (1.4%)          25   (1.4%)     block (2 levels) in ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#select
        24   (1.4%)          24   (1.4%)     ActiveRecord::Base.scoped_methods
        22   (1.2%)          22   (1.2%)     Dalli::Server::KSocket#kgio_wait_readable
        21   (1.2%)          21   (1.2%)     ActiveSupport::CoreExtensions::Hash::Keys#assert_valid_keys
        42   (2.4%)          20   (1.1%)     block in Dalli::Server::KSocket#readfull
        28   (1.6%)          19   (1.1%)     ActiveRecord::ConnectionAdapters::ConnectionHandler#retrieve_connection_pool
        18   (1.0%)          18   (1.0%)     #<Module:0x00000002004b08>.instrumenter
        17   (1.0%)          16   (0.9%)     Dalli::Server#deserialize
        15   (0.9%)          15   (0.9%)     block (2 levels) in ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#select_raw
        14   (0.8%)          14   (0.8%)     #<Module:0x000000033e25d0>.decode_www_form_component
        13   (0.7%)          13   (0.7%)     Dalli::Server#write
        15   (0.9%)          11   (0.6%)     ActiveSupport::CoreExtensions::Time::Calculations#minus_with_coercion
        10   (0.6%)          10   (0.6%)     block in ActiveRecord::Base.with_scope
        10   (0.6%)          10   (0.6%)     block in ActiveRecord::ConnectionAdapters::QueryCache#cache_sql
        21   (1.2%)          10   (0.6%)     Yajl::Encoder.encode
        10   (0.6%)          10   (0.6%)     Set#add
        10   (0.6%)          10   (0.6%)     block (2 levels) in ActiveRecord::ConnectionAdapters::PostgreSQLAdapter#result_as_array
        10   (0.6%)          10   (0.6%)     ActiveSupport::CoreExtensions::Time::Calculations::ClassMethods#time_with_datetime_fallback
         9   (0.5%)           9   (0.5%)     ActiveRecord::DynamicFinderMatch#initialize
         9   (0.5%)           9   (0.5%)     ActiveSupport::LogSubscriber.logger
         9   (0.5%)           9   (0.5%)     block in ActionController::Base.action_methods
         9   (0.5%)           9   (0.5%)     block in ActionController::Base.action_methods
         9   (0.5%)           9   (0.5%)     block (2 levels) in ActiveRecord::Base.connection_handler=
Hmm, why is statsd slow?
Pull out good old benchmark
$ ruby test/profile/statsd.rb
                            user     system      total        real
udp with connect        0.010000   0.000000   0.010000 (  0.074522)
udp without connect     0.120000   0.530000   0.650000 ( 13.096515)
statsd with connect     0.000000   0.090000   0.090000 (  0.103520)
statsd without connect  0.100000   0.620000   0.720000 ( 13.483539)
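The benchmark script itself is not shown on the slides. The following is a minimal sketch of how the "udp with connect" and "udp without connect" rows could be compared using only the standard library. The local listener, payload, and iteration count are assumptions made for illustration, not the values from the talk.

```ruby
require "benchmark"
require "socket"

# Assumption: a local UDP listener stands in for the statsd daemon.
receiver = UDPSocket.new
receiver.bind("127.0.0.1", 0)            # port 0 lets the OS pick a free port
host = "127.0.0.1"
port = receiver.local_address.ip_port

message = "requests:1|c"                 # statsd-style counter payload
iterations = 1_000

# Connected socket: the destination is fixed once, up front.
connected = UDPSocket.new
connected.connect(host, port)

# Unconnected socket: the destination is passed on every send.
unconnected = UDPSocket.new

results = Benchmark.bm(20) do |x|
  x.report("udp with connect") do
    iterations.times { connected.send(message, 0) }
  end
  x.report("udp without connect") do
    iterations.times { unconnected.send(message, 0, host, port) }
  end
end

[connected, unconnected, receiver].each(&:close)
```

With a literal loopback address the gap may be small. A plausible reason for the much larger difference on the slide is that an unconnected send has to handle the destination address on every call, which gets expensive when the target is a hostname, but that explanation is an inference rather than something stated in the deck.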
WIN!
Case 3: THE HOLIDAY SCALE
Sometimes you can throw money at the problem
Case 4: SHRINKING THE GAP
Starting with a HITLIST: Start at the top, work your way down.
Number of Requests × 90th Percentile Response Time = Total Time
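The hitlist formula can be sketched as follows. The action names and timings are made-up sample data, and the nearest-rank percentile is one common definition chosen here for simplicity. The deck does not say how the 90th percentile was computed.

```ruby
# Hypothetical sample data: response times in ms, grouped by controller action.
timings = {
  "cards#show"   => [120, 95, 110, 400, 130, 105, 98, 125, 115, 102],
  "cards#create" => [800, 950, 700],
  "home#index"   => [20, 25, 22, 30, 21, 19, 24, 23, 26, 28, 27, 22]
}

# Nearest-rank percentile: the smallest value with at least pct% of
# the values at or below it.
def percentile(values, pct)
  sorted = values.sort
  rank = (pct * sorted.size / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

# Total time = number of requests x 90th percentile response time.
hitlist = timings.map do |action, times|
  p90 = percentile(times, 90)
  { action: action, requests: times.size, p90: p90, total: times.size * p90 }
end

# Worst offenders first: start at the top and work down.
hitlist = hitlist.sort_by { |row| -row[:total] }

hitlist.each do |row|
  puts format("%-13s %3d req x %4d ms = %5d ms",
              row[:action], row[:requests], row[:p90], row[:total])
end
```

Ranking by the product rather than by response time alone keeps a rarely hit slow action from outranking a moderately slow action that is hit constantly.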
SET IT ON FIRE! Using stackprof flamegraphs on production.
Big wins are not the point
If you’re not failing you’re not being honest
Don’t just make tools, learn to use them
Thanks! twitter: @aq, github.com/quirkey, github.com/paperlesspost