Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
How Square Stays Up
Search
pui
July 18, 2012
Technology
280
3
Share
How Square Stays Up
I talk I gave on the tools and processes Square uses to stay stable and available
pui
July 18, 2012
Other Decks in Technology
See All in Technology
名刺メーカーDevグループ 紹介資料
sansan33
PRO
0
1.1k
研究開発部メンバーの働き⽅ / Sansan R&D Profile
sansan33
PRO
4
23k
こんなアーキテクチャ図はいやだ / Anti-pattern in AWS Architecture Diagrams
naospon
1
440
Practical TypeProf: Lessons from Analyzing Optcarrot
mame
0
250
AI バイブコーティングでキーボード不要?!
samakada
0
530
Shipping AI Agents — Lessons from Production
vvatanabe
0
180
目的ファーストのハーネス設計 ~ハーネスの変更容易性を高めるための優先順位~
gotalab555
8
2.1k
AWS Agent Registry の基礎・概要を理解する/aws-agent-registry-intro
ren8k
3
370
AzureのIaC管理からログ調査まで、随所に役立つSkillsとCustom-Instructions / Boosting IaC and Log Analysis with Skills
aeonpeople
0
220
AI와 협업하는 조직으로의 여정
arawn
0
230
60分で学ぶ最新Webフロントエンド
mizdra
PRO
35
18k
2026年、知っておくべき最新 サーバレスTips10選/serverless-10-tips
slsops
13
5.2k
Featured
See All Featured
From π to Pie charts
rasagy
0
160
The Art of Programming - Codeland 2020
erikaheidi
57
14k
BBQ
matthewcrist
89
10k
Chasing Engaging Ingredients in Design
codingconduct
0
170
The Illustrated Children's Guide to Kubernetes
chrisshort
51
52k
XXLCSS - How to scale CSS and keep your sanity
sugarenia
250
1.3M
職位にかかわらず全員がリーダーシップを発揮するチーム作り / Building a team where everyone can demonstrate leadership regardless of position
madoxten
62
53k
Odyssey Design
rkendrick25
PRO
2
570
From Legacy to Launchpad: Building Startup-Ready Communities
dugsong
0
200
JAMstack: Web Apps at Ludicrous Speed - All Things Open 2022
reverentgeek
1
420
Breaking role norms: Why Content Design is so much more than writing copy - Taylor Woolridge
uxyall
0
260
Leveraging LLMs for student feedback in introductory data science courses - posit::conf(2025)
minecr
1
230
Transcript
How Square Stays Up Tools and Processes Square Uses to
Maintain Stability and Availability
@pui_ling Erica Kwan
1 2 3 4 Developing Deploying Monitoring On-calling
Developing 1
We pair program (sometimes)
We solo, then get a code review (other times)
Why?
PCI Compliance Read all about it: http://en.wikipedia.org/wiki/Payment_Card_Industry_Data_Security_Standard
It is also good practice
git checkout -b topic-branch do work* git checkout master git
merge --no-ff topic-branch
A clean commit history helps
A super good git workflow: http://sandofsky.com/blog/git-workflow.html
git rebase --interactive
git rebase protip: config rebase.autosquash = true
git commit -m “squash! Monkeys”
pick 8374d8e Monkeys squash 8374d8e squash! Monkeys pick 259a7e6 Better
monkeys
Deploying 2
We deploy lots
but there are processes around deploys
Some history
We do canary deploys
None
Our full deploys do rolling restarts
And automatically run integration tests
Monitoring 3
We use common monitoring tools
We have application level checks
We have custom metrics dashboards
Graphite (whisper) + Cubism.js http://square.github.com/cubism/ http://d3js.org/ More info:
Horizon Graph http://vis.berkeley.edu/papers/horizon/
None
On-Calling 4
Engineers are responsible for their work
Ad-hoc at first
First real on-call rotations were simple
Original escalation path:
Engineer 1
Engineer 2
@jack
General on-call could not be responsible for everything
Now, every engineering team has an on-call rotation
Process is still evolving
Do these 4 things well all the time
@pui_ling /pui