How Square Stays Up
pui
July 18, 2012
A talk I gave on the tools and processes Square uses to stay stable and available.
Transcript
How Square Stays Up: Tools and Processes Square Uses to Maintain Stability and Availability
Erica Kwan (@pui_ling)
1. Developing 2. Deploying 3. Monitoring 4. On-calling
1. Developing
We pair program (sometimes)
We solo, then get a code review (other times)
Why?
PCI compliance. Read all about it: http://en.wikipedia.org/wiki/Payment_Card_Industry_Data_Security_Standard
It is also good practice
git checkout -b topic-branch
(do work*)
git checkout master
git merge --no-ff topic-branch
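The topic-branch flow above can be tried end to end in a throwaway repository. This is a minimal sketch, not Square's actual setup: the file names, commit messages and demo identity are made up for illustration. It shows that `--no-ff` records a merge commit even when a fast-forward would have been possible.

```shell
set -eu

# Scratch repository so nothing real is touched.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the first branch "master" on any git version
git config user.name "Demo"               # placeholder identity for the demo
git config user.email "demo@example.com"
git config commit.gpgsign false           # avoid needing a signing key in the scratch repo

echo "base" > README
git add README
git commit -q -m "Initial commit"

# Do the work on a topic branch.
git checkout -q -b topic-branch
echo "monkeys" > monkeys.txt
git add monkeys.txt
git commit -q -m "Monkeys"

# Merge back without fast-forwarding.
git checkout -q master
git merge -q --no-ff -m "Merge topic-branch" topic-branch

# rev-list prints the commit followed by its parents: 3 words means 2 parents.
words="$(git rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')"
echo "HEAD and its parents: $words hashes"
```

Because the merge commit has two parents, the history keeps a visible record of which commits came in together from the topic branch.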
A clean commit history helps
A super good git workflow: http://sandofsky.com/blog/git-workflow.html
git rebase --interactive
git rebase protip: git config rebase.autosquash true
git commit -m "squash! Monkeys"
pick 8374d8e Monkeys
squash 8374d8e squash! Monkeys
pick 259a7e6 Better monkeys
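To see autosquash reorder the todo list like this, here is a small sketch in a scratch repository. The files and the `commit_file` helper are hypothetical. The editors are replaced with `true` so the generated todo list and squash message are accepted as they are.

```shell
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo"               # placeholder identity for the demo
git config user.email "demo@example.com"
git config commit.gpgsign false
git config rebase.autosquash true         # the protip from the slide

# commit_file <file> <line> <message>: hypothetical helper for the demo
commit_file() {
  echo "$2" >> "$1"
  git add "$1"
  git commit -q -m "$3"
}

commit_file README "base" "Initial commit"
commit_file monkeys.txt "monkeys" "Monkeys"
commit_file better.txt "better monkeys" "Better monkeys"
commit_file monkeys.txt "more monkeys" "squash! Monkeys"

# Autosquash moves "squash! Monkeys" directly under "Monkeys" in the todo list.
# Using `true` as both editors accepts the list and the combined message unchanged.
GIT_SEQUENCE_EDITOR=true GIT_EDITOR=true git rebase -q -i HEAD~3

git log --format=%s
```

After the rebase the follow-up commit is folded into "Monkeys", so four commits become three.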
2. Deploying
We deploy lots
but there are processes around deploys
Some history
We do canary deploys
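The slides do not show how the canary step is implemented. As a rough sketch of the general idea only: push the new build to one host, check it, and continue to the rest only if it looks healthy. `deploy_to`, `healthy` and the host names are hypothetical stand-ins, not Square's tooling.

```shell
set -eu

# Hypothetical stand-ins: a real setup would push a build and query health checks.
deploy_to() { echo "deploying $2 to $1"; }
healthy()   { [ "$1" != "bad-host" ]; }   # pretend only "bad-host" fails its checks

# canary_deploy <version> <canary-host> <other-hosts...>
canary_deploy() {
  version="$1"; canary="$2"; shift 2
  deploy_to "$canary" "$version"
  if ! healthy "$canary"; then
    echo "canary $canary unhealthy, aborting"
    return 1
  fi
  for host in "$@"; do
    deploy_to "$host" "$version"
  done
  echo "deploy of $version complete"
}

canary_deploy v42 app1 app2 app3
```

The point of the design is that a bad build only ever reaches the canary host.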
Our full deploys do rolling restarts
And automatically run integration tests
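One way to picture a rolling restart followed by integration tests, as a sketch under assumptions: hosts are restarted one at a time, each must come back healthy before the next is touched, and the tests run once at the end. All three helper functions are hypothetical placeholders.

```shell
set -eu

# Hypothetical placeholders for real restart, health-check and test commands.
restart_host()          { echo "restarting $1"; }
wait_until_healthy()    { echo "$1 is healthy"; }
run_integration_tests() { echo "integration tests passed"; }

# rolling_restart <hosts...>: one host at a time, halt on the first failure
rolling_restart() {
  for host in "$@"; do
    restart_host "$host"
    wait_until_healthy "$host" || { echo "halting: $host did not come back"; return 1; }
  done
  run_integration_tests
}

rolling_restart app1 app2 app3
```

Restarting serially keeps most of the fleet serving traffic at any moment.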
3. Monitoring
We use common monitoring tools
We have application level checks
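The talk does not name the checks or the tools that run them. As an illustration only, an application-level check can be written as a small script that follows the widely used Nagios plugin convention of exit code 0 for OK, 1 for WARNING and 2 for CRITICAL. The value being checked and the thresholds are hypothetical.

```shell
# check_threshold <value> <warn> <crit>
# Prints a status line and returns 0 (OK), 1 (WARNING) or 2 (CRITICAL).
check_threshold() {
  value="$1"; warn="$2"; crit="$3"
  if [ "$value" -ge "$crit" ]; then
    echo "CRITICAL - value=$value"
    return 2
  elif [ "$value" -ge "$warn" ]; then
    echo "WARNING - value=$value"
    return 1
  fi
  echo "OK - value=$value"
  return 0
}

# Example: 12 pending jobs (made-up number), warn at 50, critical at 100.
check_threshold 12 50 100
```

A monitoring system that understands these exit codes can alert on the application's own notion of health rather than only on CPU or disk.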
We have custom metrics dashboards
Graphite (whisper) + Cubism.js
http://square.github.com/cubism/
http://d3js.org/
More info: Horizon Graph http://vis.berkeley.edu/papers/horizon/
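Custom metrics reach Graphite as plain text lines of the form "path value timestamp", which Carbon accepts on port 2003 by default. A minimal sketch of formatting such a line follows. The metric name and the host in the comment are made up, and the slides do not say how Square emits its metrics.

```shell
# graphite_line <metric.path> <value> [unix-timestamp]
# Formats one line of Graphite's plaintext protocol; defaults to the current time.
graphite_line() {
  printf '%s %s %s\n' "$1" "$2" "${3:-$(date +%s)}"
}

# Hypothetical metric with a fixed timestamp so the output is reproducible.
graphite_line payments.api.requests 42 1342569600

# Sending is typically a matter of piping the line to Carbon, for example with
# netcat (hypothetical host):
#   graphite_line payments.api.requests 42 | nc graphite.example.com 2003
```

Once the series exists in Graphite, a dashboard such as Cubism.js can pull it and render it as a horizon graph.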
4. On-Calling
Engineers are responsible for their work
Ad-hoc at first
First real on-call rotations were simple
Original escalation path:
Engineer 1
Engineer 2
@jack
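Read as a procedure, the original path is: page each person in order until someone acknowledges. A toy sketch of that loop is below. The `page` function and the `ACKED_BY` variable are hypothetical stand-ins for a real paging system.

```shell
# page <person>: hypothetical stand-in that succeeds only if that person acknowledges
page() { [ "$1" = "$ACKED_BY" ]; }

# escalate <people in escalation order...>
escalate() {
  for person in "$@"; do
    echo "paging $person"
    if page "$person"; then
      echo "$person acknowledged"
      return 0
    fi
  done
  echo "nobody acknowledged"
  return 1
}

ACKED_BY="engineer-2"   # simulate the second engineer picking up
escalate engineer-1 engineer-2 @jack
```

In this simulated run the page never reaches @jack, because engineer-2 acknowledges first.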
General on-call could not be responsible for everything
Now, every engineering team has an on-call rotation
Process is still evolving
Do these 4 things well all the time
@pui_ling /pui