Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Scaling Rails for Black Friday / Cyber Monday a...
Search
Christian Joudrey
February 18, 2015
Technology
6
5.8k
Scaling Rails for Black Friday / Cyber Monday at Shopify
Talk given at ConFoo 2015 on February 18th, 2015 and RailsConf 2015.
Christian Joudrey
February 18, 2015
Tweet
Share
More Decks by Christian Joudrey
See All by Christian Joudrey
Writing NES games! with assembly!!
cjoudrey
1
740
Developing at Scale
cjoudrey
3
480
Tips and Tricks from Shopify's codebase
cjoudrey
2
560
Scaling Shopify
cjoudrey
3
530
#pairwithme
cjoudrey
3
250
Two-factor authentication
cjoudrey
4
390
Automate your Infrastructure with Chef
cjoudrey
9
610
Other Decks in Technology
See All in Technology
LLMで構造化出力の成功率をグンと上げる方法
keisuketakiguchi
0
770
猫でもわかるQ_CLI(CDK開発編)+ちょっとだけKiro
kentapapa
0
3.4k
人に寄り添うAIエージェントとアーキテクチャ #BetAIDay
layerx
PRO
9
2.2k
Claude Codeから我々が学ぶべきこと
oikon48
10
2.8k
オブザーバビリティプラットフォーム開発におけるオブザーバビリティとの向き合い / Hatena Engineer Seminar #34 オブザーバビリティの実現と運用編
arthur1
0
380
生成AI時代におけるAI・機械学習技術を用いたプロダクト開発の深化と進化 #BetAIDay
layerx
PRO
1
1.2k
相互運用可能な学修歴クレデンシャルに向けた標準技術と国際動向
fujie
0
240
Backlog AI アシスタントが切り開く未来
vvatanabe
1
130
Amazon GuardDuty での脅威検出:脅威検出の実例から学ぶ
kintotechdev
0
100
20250807 Applied Engineer Open House
sakana_ai
PRO
0
210
AWS re:Inforce 2025 re:Cap Update Pickup & AWS Control Tower の運用における考慮ポイント
htan
1
240
React Server ComponentsでAPI不要の開発体験
polidog
PRO
0
210
Featured
See All Featured
YesSQL, Process and Tooling at Scale
rocio
173
14k
Measuring & Analyzing Core Web Vitals
bluesmoon
8
550
Raft: Consensus for Rubyists
vanstee
140
7.1k
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
8
750
Optimizing for Happiness
mojombo
379
70k
The Language of Interfaces
destraynor
158
25k
10 Git Anti Patterns You Should be Aware of
lemiorhan
PRO
656
60k
The Illustrated Children's Guide to Kubernetes
chrisshort
48
50k
KATA
mclloyd
32
14k
Speed Design
sergeychernyshev
32
1.1k
Designing for Performance
lara
610
69k
The Cult of Friendly URLs
andyhume
79
6.5k
Transcript
scaling rails for Black Friday Cyber Monday
cjoudrey @
None
None
None
None
None
the stack
nginx unicorn • rails 4 • mysql 5.6 (percona) ruby
2.1 •
95 app servers 1,800 unicorn workers 18 job servers 1,400
job workers
scale?
None
400,000 reqs/min
3.7B$ annual GMV that’s 7,000$ per min
Black Friday Cyber Monday
Black Friday Cyber FUNday
None
~ 600,000 reqs/min
1 request 1 process =
scale++ ↓ resp. time ↑ workers
page caching
None
None
shopify/cacheable
generational caching
gzip • etag + 304 not modified
class PostsController < ApplicationController def index response_cache do @posts =
@shop.posts.paginate(params[:page]) respond_with(@posts) end end def cache_key_data { shop_id: @shop.id, path: request.path, format: request.format, params: params.slice(:page), shop_version: @shop.version } end end
md5( { shop_id: 1, path: '/posts', format: 'text/html', params: {
page: 2 }, shop_version: 123 }.to_s ) GET /posts?page=2
None
sale
query caching
shopify/identity_cache
full model caching
opt-in by design
after_commit expiry
class Product < ActiveRecord::Base include IdentityCache has_many :images cache_has_many :images,
:embed => true end product = Product.fetch(id) images = product.fetch_images
class Product < ActiveRecord::Base include IdentityCache cache_index :shop_id, :handle, :unique
=> true end Product.fetch_by_shop_id_and_handle(shop_id, handle)
None
sale
background jobs
webhooks emails • fraud detection • payment processing
None
priority queues payment • default • low realtime •
throttling
now what?
measure it! if it moves...
statsd
shopify/statsd-instrument
Liquid::Template.extend StatsD::Instrument Liquid::Template.statsd_measure :render, 'Liquid.Template.render'
PaymentProcessingJob.statsd_count :perform, 'PaymentProcessingJob.processed'
None
None
load testing
genghis khan
simulate Black Friday Cyber Monday before it happens
several times per week
slow queries
# User@Host: shopify[shopify] @ [127.0.0.1] # Thread_id: 264419969 Schema: shopify
Last_errno: 0 Killed: 0 # Query_time: 0.150491 Lock_time: 0.000057 Rows_sent: 1 Rows_examined: 147841 Rows_affected: 0 Rows_read: 147841 # Bytes_sent: 1214 Tmp_tables: 0 Tmp_disk_tables: 0 Tmp_table_sizes: 0 # InnoDB_trx_id: FF7021AAA # QC_Hit: No Full_scan: No Full_join: No Tmp_table: No Tmp_table_on_disk: No # Filesort: Yes Filesort_on_disk: No Merge_passes: 0 # InnoDB_IO_r_ops: 0 InnoDB_IO_r_bytes: 0 InnoDB_IO_r_wait: 0.000000 # InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.000000 # InnoDB_pages_distinct: 475 SET timestamp=1393385020; SELECT `discounts`.* FROM `discounts` WHERE `discounts`.`shop_id` = 1745470 AND `discounts`.`status` = 'enabled' ORDER BY ISNULL(ends_at) DESC, ends_at DESC LIMIT 1
determining root cause
https://github.com/newobj/nginx-x-rid-header nginx request_id header proxy_set_header X-Request-ID "$request_id"; log_format main '...
$request_id' step 1
https://gist.github.com/mnutt/566725 Complete 200 OK in 100ms (Views: 60ms | ActiveRecord:
40ms | request_id=bc12813bce...) log_process_action ActionController::Instrumentation step 2
https://github.com/basecamp/marginalia User Load (0.3ms) SELECT `users`.* FROM `users` WHERE `users`.`id`
= 1 LIMIT 1 /*application:Shopify, controller:users,action:show, request_id:bc12813bce...*/ basecamp/marginalia step 3
# User@Host: shopify[shopify] @ [127.0.0.1] # Thread_id: 264419969 Schema: shopify
Last_errno: 0 Killed: 0 # Query_time: 0.150491 Lock_time: 0.000057 Rows_sent: 1 Rows_examined: 147841 Rows_affected: 0 Rows_read: 147841 # Bytes_sent: 1214 Tmp_tables: 0 Tmp_disk_tables: 0 Tmp_table_sizes: 0 # InnoDB_trx_id: FF7021AAA # QC_Hit: No Full_scan: No Full_join: No Tmp_table: No Tmp_table_on_disk: No # Filesort: Yes Filesort_on_disk: No Merge_passes: 0 # InnoDB_IO_r_ops: 0 InnoDB_IO_r_bytes: 0 InnoDB_IO_r_wait: 0.000000 # InnoDB_rec_lock_wait: 0.000000 InnoDB_queue_wait: 0.000000 # InnoDB_pages_distinct: 475 SET timestamp=1393385020; SELECT `discounts`.* FROM `discounts` WHERE `discounts`.`shop_id` = 1745470 AND `discounts`.`status` = 'enabled' ORDER BY ISNULL(ends_at) DESC, ends_at DESC LIMIT 1 /*application:Shopify,controller:orders,action:pay, request_id:bc12813bce...*/ profit!
access.log rails.log slow_query.log profit! (2)
bonus! background jobs
schema migration with zero downtime
soundcloud/lhm
class NewIndexOnOrders < ActiveRecord::Migration def self.up Lhm.change_table :orders do |m|
m.add_index [:shop_id, :customer_id] end end def self.down # end end
orders 20140520_orders
insert/delete/update triggers
INSERT INTO ... SELECT ... insert/delete/update triggers
async caveat
resiliency
A resilient system is one that functions with one or
more components being unavailable or unacceptably slow. -
don’t let a take you down minor dependencies
shopify/toxiproxy ☢ redis memcached sessions your app toxiproxy
def load_customer if customer_id = session[:customer_id] @customer = Customer.find_by_id(customer_id) end
end
def load_customer if customer_id = session[:customer_id] @customer = Customer.find_by_id(customer_id) end
rescue Sessions::DataStoreUnavailable @customer = nil end
def test_storefront_resilient_to_sessions_down Toxiproxy[:sessions_data_store].down do get '/' assert_response :success end end
rinse & repeat http://www.shopify.com/technology/16906928-building-and-testing-resilient-ruby-on-rails-applications
what about resources? slow
shard 2 shard 3 shard 1 shop 4, 5, 6
shop 7, 8, 9 shop 1, 2, 3
rails request shard 2 shard 3 shard 1 shop 4,
5, 6 shop 7, 8, 9 shop 1, 2, 3
rails request shard 2 shard 3 shard 1 shop 4,
5, 6 shop 7, 8, 9 shop 1, 2, 3
how can we fail fast?
shopify/semian smart circuit-breaker
shopify/semian Semian.register(:mysql_shard_1, tickets: 5, timeout: 0.5, error_threshold: 100, error_timeout: 10,
success_threshold: 2)
shopify/semian Semian[:mysql_shard_1].acquire do # Query the resource end
rails request shard 2 shard 3 shard 1 shop 4,
5, 6 shop 7, 8, 9 shop 1, 2, 3
what else can go wrong?
Shopify Shipping rate providers Payment gateways Fulfillment services (FedEX, UPS,
USPS, etc..) (Stripe, PayPal, etc..) (Shipwire, etc…) Internal services (MySQL, Memcached, etc..)
manual circuit breakers around external dependencies
thanks! :)