How Square Stays Up
pui
July 18, 2012
A talk I gave on the tools and processes Square uses to stay stable and available.
Transcript
How Square Stays Up: Tools and Processes Square Uses to Maintain Stability and Availability
@pui_ling Erica Kwan
1. Developing 2. Deploying 3. Monitoring 4. On-Calling
1. Developing
We pair program (sometimes)
We solo, then get a code review (other times)
Why?
PCI Compliance. Read all about it: http://en.wikipedia.org/wiki/Payment_Card_Industry_Data_Security_Standard
It is also good practice
git checkout -b topic-branch
(do work*)
git checkout master
git merge --no-ff topic-branch
A clean commit history helps
A super good git workflow: http://sandofsky.com/blog/git-workflow.html
git rebase --interactive
git rebase protip: git config rebase.autosquash true
git commit -m "squash! Monkeys"
pick 8374d8e Monkeys
squash 8374d8e squash! Monkeys
pick 259a7e6 Better monkeys
2. Deploying
We deploy lots
but there are processes around deploys
Some history
We do canary deploys
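The slides do not say how Square's canary deploys are implemented. As a generic sketch of the idea only, assuming a hypothetical list of hosts and hypothetical `deploy` and `error_rate` callables (none of these names come from the talk), a canary deploy pushes the new version to a small subset of hosts first and continues only if that subset stays healthy:

```python
from typing import Callable, Sequence


def canary_deploy(
    hosts: Sequence[str],
    version: str,
    deploy: Callable[[str, str], None],
    error_rate: Callable[[str], float],
    canary_count: int = 1,
    max_error_rate: float = 0.01,
) -> bool:
    """Deploy to a few canary hosts first; continue only if they look healthy.

    Every parameter here is an illustrative assumption, not Square's tooling.
    Returns True if the whole fleet was deployed, False if the canary failed.
    """
    canaries = list(hosts[:canary_count])
    rest = list(hosts[canary_count:])

    for host in canaries:
        deploy(host, version)

    # Stop before touching the rest of the fleet if any canary is unhealthy.
    if any(error_rate(host) > max_error_rate for host in canaries):
        return False

    for host in rest:
        deploy(host, version)
    return True
```

In practice the health signal would likely be observed over some time window before the rollout continues; the single `error_rate` call stands in for that.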
Our full deploys do rolling restarts
And automatically run integration tests
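The two slides above describe rolling restarts followed by automatic integration tests, without implementation detail. One way this could look, as a minimal sketch with hypothetical `restart`, `is_healthy`, and `run_integration_tests` callables (assumptions for illustration, not Square's deploy system):

```python
from typing import Callable, Sequence


def rolling_restart(
    hosts: Sequence[str],
    restart: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    run_integration_tests: Callable[[], bool],
    batch_size: int = 1,
) -> bool:
    """Restart hosts in small batches, then run integration tests.

    All callables are hypothetical stand-ins. Returns False as soon as a
    batch fails its health check, otherwise the integration test result.
    """
    for start in range(0, len(hosts), batch_size):
        batch = hosts[start:start + batch_size]
        for host in batch:
            restart(host)
        # Halt the rollout if a restarted host does not come back healthy,
        # so the remaining hosts keep serving untouched.
        if not all(is_healthy(host) for host in batch):
            return False

    # Only after every batch is healthy, verify the service end to end.
    return run_integration_tests()
```

Restarting in batches keeps most of the fleet serving traffic at any moment, which is the point of a rolling restart.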
3. Monitoring
We use common monitoring tools
We have application level checks
We have custom metrics dashboards
Graphite (whisper) + Cubism.js
http://square.github.com/cubism/
http://d3js.org/
More info: Horizon Graph http://vis.berkeley.edu/papers/horizon/
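For readers unfamiliar with Graphite, custom metrics are commonly fed to it over its plaintext protocol: one line per data point in the form "path value timestamp". The sketch below shows that format. The metric name, host, and port 2003 (Carbon's usual plaintext default) are assumptions for illustration; the talk does not say how Square's metrics are emitted.

```python
import socket
import time
from typing import Optional


def format_metric(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Build one Graphite plaintext-protocol line: 'path value timestamp\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n"


def send_metric(line: str, host: str = "localhost", port: int = 2003) -> None:
    """Send one metric line to a Carbon listener (host and port are assumptions)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Hypothetical metric name, shown only to illustrate the line format.
line = format_metric("payments.api.requests", 42, 1342569600)
```

Calling `send_metric(line)` would require a Carbon listener to be reachable at the given host and port.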
4. On-Calling
Engineers are responsible for their work
Ad-hoc at first
First real on-call rotations were simple
Original escalation path:
Engineer 1
Engineer 2
@jack
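The escalation path above can be read as an ordered list: page the first responder, and move to the next only if nobody acknowledges. A minimal sketch of that logic, where `page` is a hypothetical callable that returns True when the responder acknowledges (the talk does not name a paging tool or timeout):

```python
from typing import Callable, Optional, Sequence

# Order taken from the slides; everything else here is illustrative.
ESCALATION_PATH = ["Engineer 1", "Engineer 2", "@jack"]


def escalate(path: Sequence[str], page: Callable[[str], bool]) -> Optional[str]:
    """Page each responder in order until one acknowledges.

    Returns the responder who acknowledged, or None if nobody did.
    """
    for responder in path:
        if page(responder):
            return responder
    return None
```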
General on-call could not be responsible for everything
Now, every engineering team has an on-call rotation
Process is still evolving
Do these 4 things well all the time
@pui_ling /pui