Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
How Square Stays Up
Search
pui
July 18, 2012
Technology
3
280
How Square Stays Up
I talk I gave on the tools and processes Square uses to stay stable and available
pui
July 18, 2012
Tweet
Share
Other Decks in Technology
See All in Technology
M&A 後の統合をどう進めるか ─ ナレッジワーク × Poetics が実践した組織とシステムの融合
kworkdev
PRO
1
420
AI駆動開発を事業のコアに置く
tasukuonizawa
1
130
Digitization部 紹介資料
sansan33
PRO
1
6.8k
MCPでつなぐElasticsearchとLLM - 深夜の障害対応を楽にしたい / Bridging Elasticsearch and LLMs with MCP
sashimimochi
0
150
会社紹介資料 / Sansan Company Profile
sansan33
PRO
15
400k
Bedrock PolicyでAmazon Bedrock Guardrails利用を強制してみた
yuu551
0
190
StrandsとNeptuneを使ってナレッジグラフを構築する
yakumo
1
100
顧客との商談議事録をみんなで読んで顧客解像度を上げよう
shibayu36
0
210
Data Hubグループ 紹介資料
sansan33
PRO
0
2.7k
マーケットプレイス版Oracle WebCenter Content For OCI
oracle4engineer
PRO
5
1.6k
Claude_CodeでSEOを最適化する_AI_Ops_Community_Vol.2__マーケティングx_AIはここまで進化した.pdf
riku_423
2
530
セキュリティについて学ぶ会 / 2026 01 25 Takamatsu WordPress Meetup
rocketmartue
1
300
Featured
See All Featured
Why Your Marketing Sucks and What You Can Do About It - Sophie Logan
marketingsoph
0
72
How to audit for AI Accessibility on your Front & Back End
davetheseo
0
180
Marketing to machines
jonoalderson
1
4.6k
Effective software design: The role of men in debugging patriarchy in IT @ Voxxed Days AMS
baasie
0
220
Leveraging Curiosity to Care for An Aging Population
cassininazir
1
160
HDC tutorial
michielstock
1
370
Taking LLMs out of the black box: A practical guide to human-in-the-loop distillation
inesmontani
PRO
3
2k
Java REST API Framework Comparison - PWX 2021
mraible
34
9.1k
The Organizational Zoo: Understanding Human Behavior Agility Through Metaphoric Constructive Conversations (based on the works of Arthur Shelley, Ph.D)
kimpetersen
PRO
0
240
How To Stay Up To Date on Web Technology
chriscoyier
791
250k
The Hidden Cost of Media on the Web [PixelPalooza 2025]
tammyeverts
2
180
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
12
1k
Transcript
How Square Stays Up Tools and Processes Square Uses to
Maintain Stability and Availability
@pui_ling Erica Kwan
1 2 3 4 Developing Deploying Monitoring On-calling
Developing 1
We pair program (sometimes)
We solo, then get a code review (other times)
Why?
PCI Compliance Read all about it: http://en.wikipedia.org/wiki/Payment_Card_Industry_Data_Security_Standard
It is also good practice
git checkout -b topic-branch do work* git checkout master git
merge --no-ff topic-branch
A clean commit history helps
A super good git workflow: http://sandofsky.com/blog/git-workflow.html
git rebase --interactive
git rebase protip: config rebase.autosquash = true
git commit -m “squash! Monkeys”
pick 8374d8e Monkeys squash 8374d8e squash! Monkeys pick 259a7e6 Better
monkeys
Deploying 2
We deploy lots
but there are processes around deploys
Some history
We do canary deploys
None
Our full deploys do rolling restarts
And automatically run integration tests
Monitoring 3
We use common monitoring tools
We have application level checks
We have custom metrics dashboards
Graphite (whisper) + Cubism.js http://square.github.com/cubism/ http://d3js.org/ More info:
Horizon Graph http://vis.berkeley.edu/papers/horizon/
None
On-Calling 4
Engineers are responsible for their work
Ad-hoc at first
First real on-call rotations were simple
Original escalation path:
Engineer 1
Engineer 2
@jack
General on-call could not be responsible for everything
Now, every engineering team has an on-call rotation
Process is still evolving
Do these 4 things well all the time
@pui_ling /pui