production: an owner's manual
Igor Wiedler
April 23, 2018
from exec(ut) 2018
Transcript
production: an owner's manual
hello!
broken computers
getting sidetracked now, so sorry*
* not sorry
back to serious business
a production system is a system that serves real users
the goal of operations is to ensure services are reliable
in order to provide a good user experience
failure
app
app, linux kernel, cpu, dram, disk, network, power supply, switches, load balancer, dns, submarine cables, routers, fiber
app, linux kernel, the cloud
• cosmic rays
• disk failure
• power outages
• software bugs
• ...
entropy
capacity
cascading failure
system design
redundancy
scale
p1 m3 c1 m2 m1 p2 c2
data storage
protocols
monitoring
many components many req/s
measure all the things?
✅ ⏱
golden signals • latency • traffic • errors • saturation
  0 -  50 [1620]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎ (74.55%)
 50 - 100 [ 447]: ∎∎∎∎∎∎∎∎∎∎ (20.57%)
100 - 150 [  49]: ∎ (2.25%)
150 - 200 [  15]: (0.69%)
200 - 250 [  15]: (0.69%)
250 - 300 [  10]: (0.46%)
300 - 350 [   6]: (0.28%)
350 - 400 [   1]: (0.05%)
400 - 450 [   0]: (0.00%)
450 - 500 [   4]: (0.18%)
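A latency histogram like the one above can be produced from raw request durations with a few lines of code. The following is a minimal sketch, not the tool used in the talk: the function name `latency_histogram`, the 50 ms bucket width, the bar scaling, and the choice to fold out-of-range samples into the last bucket are all assumptions made for illustration.

```python
from collections import Counter


def latency_histogram(latencies_ms, bucket_ms=50, num_buckets=10, bar_width=50):
    """Render request latencies as fixed-width text buckets (hypothetical helper)."""
    counts = Counter()
    for ms in latencies_ms:
        # Assumption: samples at or above the top edge are folded into the last bucket.
        index = min(int(ms // bucket_ms), num_buckets - 1)
        counts[index] += 1

    total = len(latencies_ms)
    lines = []
    for i in range(num_buckets):
        n = counts[i]
        share = n / total if total else 0.0
        # Bar length is proportional to the bucket's share of all requests.
        bar = "∎" * int(share * bar_width)
        low, high = i * bucket_ms, (i + 1) * bucket_ms
        lines.append(f"{low:>3} - {high:>3} [{n:>4}]: {bar} ({share:.2%})")
    return lines


if __name__ == "__main__":
    # Made-up sample durations in milliseconds, for demonstration only.
    for line in latency_histogram([12, 18, 33, 41, 47, 62, 95, 130, 480]):
        print(line)
```

The point of looking at the full distribution rather than a single average is that the long tail (the few requests in the 450 - 500 bucket) stays visible.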
saturation traffic latency errors
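To make the four golden signals concrete, here is a rough sketch that derives them from a window of request records. Everything named here is hypothetical: the `Request` record, the `golden_signals` and `percentile` helpers, the rule that a status of 500 or above counts as an error, and the definition of saturation as in-flight requests divided by capacity. Real systems will define each signal in their own terms.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    duration_ms: float
    status: int  # assumption: HTTP-style status code


def percentile(sorted_values, p):
    """Nearest-rank percentile of an already sorted list; None if it is empty."""
    if not sorted_values:
        return None
    rank = max(1, math.ceil(p / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def golden_signals(requests, window_s, in_flight, capacity):
    """Summarize one time window as latency, traffic, errors, and saturation."""
    durations = sorted(r.duration_ms for r in requests)
    total = len(requests)
    # Assumption: only 5xx responses count as errors.
    errors = sum(1 for r in requests if r.status >= 500)
    return {
        "latency_p50_ms": percentile(durations, 50),
        "latency_p99_ms": percentile(durations, 99),
        "traffic_rps": total / window_s,
        "error_rate": errors / total if total else 0.0,
        # Assumption: saturation is the fraction of capacity currently in use.
        "saturation": in_flight / capacity,
    }


if __name__ == "__main__":
    # Made-up sample window, for demonstration only.
    window = [Request(10, 200), Request(20, 200), Request(30, 500), Request(400, 200)]
    print(golden_signals(window, window_s=2, in_flight=5, capacity=10))
```

Reporting percentiles instead of a mean keeps slow outliers from disappearing into the average, which matches the histogram view of latency.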
humans
oops, deleted the database
bad human!
why does this button even exist?
app, linux kernel, cpu, dram, disk, network, power supply, switches, load balancer, dns, submarine cables, routers, fiber, humans
epic failure is almost always systemic
failure
recap
• a production system serves real users
• users like things that work and are fast
• epic failure is almost always systemic
thx @igorwhilefalse