How Square Stays Up
pui
July 18, 2012
A talk I gave on the tools and processes Square uses to stay stable and available.
Transcript
How Square Stays Up: Tools and Processes Square Uses to Maintain Stability and Availability
@pui_ling Erica Kwan
1. Developing
2. Deploying
3. Monitoring
4. On-calling
1. Developing
We pair program (sometimes)
We solo, then get a code review (other times)
Why?
PCI Compliance
Read all about it: http://en.wikipedia.org/wiki/Payment_Card_Industry_Data_Security_Standard
It is also good practice
git checkout -b topic-branch
do work*
git checkout master
git merge --no-ff topic-branch
A clean commit history helps
A super good git workflow: http://sandofsky.com/blog/git-workflow.html
git rebase --interactive
git rebase protip: set rebase.autosquash = true (git config rebase.autosquash true)
git commit -m "squash! Monkeys"
pick 8374d8e Monkeys
squash 8374d8e squash! Monkeys
pick 259a7e6 Better monkeys
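The reordering that autosquash performs on the rebase todo list can be sketched roughly as follows. This is a simplified model, not git's implementation: it only handles the `squash! <subject>` prefix and requires an exact subject match. The function name `autosquash` and the short hashes in the usage note are made up for illustration.

```python
PREFIX = "squash! "

def autosquash(commits):
    """Build a rebase plan from (sha, subject) pairs given oldest first.

    Each commit whose subject is "squash! <subject>" is moved directly
    after the commit with that subject and marked "squash". Everything
    else stays a "pick" in its original order.
    """
    plan = []  # list of (action, sha, subject)
    for sha, subject in commits:
        if subject.startswith(PREFIX):
            target = subject[len(PREFIX):]
            # Look for the earlier commit this one wants to be squashed into.
            idx = next((i for i, (_, _, s) in enumerate(plan) if s == target), None)
            if idx is not None:
                # Skip past squashes already attached to the same target.
                j = idx + 1
                while j < len(plan) and plan[j][0] == "squash":
                    j += 1
                plan.insert(j, ("squash", sha, subject))
                continue
        # No prefix, or no matching target: leave it as a normal pick.
        plan.append(("pick", sha, subject))
    return plan
```

For example, `autosquash([("a1", "Monkeys"), ("b2", "Better monkeys"), ("c3", "squash! Monkeys")])` places `c3` directly after `a1` as a squash and leaves `b2` as the final pick.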
2. Deploying
We deploy lots
but there are processes around deploys
Some history
We do canary deploys
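The slides do not describe how the canary is judged, so the following is only a minimal sketch of one common approach: send the new build to a single host, compare its error rate with the rest of the fleet, and promote or roll back. The function name, the `tolerance` parameter, and the three verdict strings are assumptions for illustration, not Square's tooling.

```python
def canary_verdict(canary_errors, canary_requests,
                   baseline_errors, baseline_requests,
                   tolerance=0.01):
    """Decide what to do with a canary host (hypothetical policy).

    Returns "wait" if the canary has served no traffic yet, "promote" if
    its error rate is within `tolerance` of the baseline fleet's rate,
    and "rollback" otherwise.
    """
    if canary_requests == 0:
        return "wait"
    canary_rate = canary_errors / canary_requests
    baseline_rate = (baseline_errors / baseline_requests
                     if baseline_requests else 0.0)
    if canary_rate <= baseline_rate + tolerance:
        return "promote"
    return "rollback"
```

A canary with 50 errors in 1,000 requests against a baseline of 10 errors in 10,000 would be rolled back under the default tolerance.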
Our full deploys do rolling restarts
And automatically run integration tests
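A rolling restart followed by automatic integration tests might look something like the sketch below. The callbacks `restart`, `is_healthy`, and `run_integration_tests` are hypothetical stand-ins for whatever the real deploy tooling does; the slides do not say what it is.

```python
def rolling_restart(hosts, restart, is_healthy, batch_size=1):
    """Restart hosts one batch at a time.

    Stops at the first batch that fails its health check, so the rest of
    the fleet keeps serving on the old version. Returns the hosts that
    were restarted and passed, plus a success flag.
    """
    restarted = []
    for start in range(0, len(hosts), batch_size):
        batch = hosts[start:start + batch_size]
        for host in batch:
            restart(host)
        if not all(is_healthy(host) for host in batch):
            return restarted, False
        restarted.extend(batch)
    return restarted, True

def full_deploy(hosts, restart, is_healthy, run_integration_tests):
    """Do a rolling restart, then run integration tests only if it succeeded."""
    _, ok = rolling_restart(hosts, restart, is_healthy)
    return ok and run_integration_tests()
```

Restarting in small batches keeps most of the fleet serving traffic at any moment, which is the main reason to prefer it over restarting everything at once.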
3. Monitoring
We use common monitoring tools
We have application level checks
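An application-level check looks at something the application itself knows, rather than host CPU or disk. The sketch below uses the Nagios plugin convention of status codes 0, 1, and 2 for OK, WARNING, and CRITICAL. The slides do not name the monitoring tool, and the queue-depth metric and thresholds here are invented for illustration.

```python
OK, WARNING, CRITICAL = 0, 1, 2

def check_queue_depth(depth, warn_at=100, crit_at=1000):
    """Map a (hypothetical) job queue depth to a status code and message."""
    if depth >= crit_at:
        return CRITICAL, f"CRITICAL - queue depth {depth}"
    if depth >= warn_at:
        return WARNING, f"WARNING - queue depth {depth}"
    return OK, f"OK - queue depth {depth}"
```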
We have custom metrics dashboards
Graphite (whisper) + Cubism.js
http://square.github.com/cubism/
http://d3js.org/
More info: Horizon Graph http://vis.berkeley.edu/papers/horizon/
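Custom metrics reach Graphite through its plaintext protocol: one line per data point in the form `path value timestamp`, sent to the carbon receiver (port 2003 by default). A minimal sketch of a sender follows. The metric name in the usage note and the `localhost` default are assumptions, and the slides do not show how Square actually emits metrics.

```python
import socket
import time

def graphite_line(path, value, timestamp=None):
    """Format one data point for Graphite's plaintext protocol."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n"

def send_metric(path, value, host="localhost", port=2003, timestamp=None):
    """Send a single data point to a carbon plaintext receiver.

    host and port are assumptions; 2003 is carbon's default plaintext port.
    """
    line = graphite_line(path, value, timestamp)
    with socket.create_connection((host, port), timeout=2) as sock:
        sock.sendall(line.encode("ascii"))
```

For example, `graphite_line("deploys.count", 1, 1342569600)` produces the line `deploys.count 1 1342569600`, which a dashboard such as Cubism.js can then read back from Graphite.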
4. On-Calling
Engineers are responsible for their work
Ad-hoc at first
First real on-call rotations were simple
Original escalation path:
Engineer 1
Engineer 2
@jack
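An escalation path like the one above can be modeled as an ordered chain: page each responder in turn and stop at the first one who acknowledges. This is only an illustrative sketch. The `page` and `acknowledged` callbacks are hypothetical, and a real system would wait for a timeout between steps.

```python
def escalate(chain, page, acknowledged):
    """Page responders in order until one acknowledges.

    Returns the responder who acknowledged, or None if nobody in the
    chain did.
    """
    for responder in chain:
        page(responder)
        if acknowledged(responder):
            return responder
    return None
```

With the chain `["Engineer 1", "Engineer 2", "@jack"]`, an unacknowledged page to Engineer 1 falls through to Engineer 2, and only reaches @jack if both engineers miss it.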
General on-call could not be responsible for everything
Now, every engineering team has an on-call rotation
Process is still evolving
Do these 4 things well all the time
@pui_ling /pui