Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Servers are doomed to fail
Search
JBD
May 17, 2019
Technology
3
1.5k
Servers are doomed to fail
JBD
May 17, 2019
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.1k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.9k
Debugging Code Generation in Go
rakyll
5
1.6k
Are you ready for production?
rakyll
8
2.9k
Serverless Containers
rakyll
1
260
Critical Path Analysis
rakyll
0
640
Monitoring and Debugging Containers
rakyll
2
1.1k
CPDD
rakyll
0
4.2k
Other Decks in Technology
See All in Technology
どうなる Remix 3
tanakahisateru
2
350
マーケットプレイス版Oracle WebCenter Content For OCI
oracle4engineer
PRO
3
1.3k
よくわからない人向けの IAM Identity Center とちょっとした落とし穴
kazzpapa3
2
680
Design and implementation of "Markdown to Google Slides" / phpconfuk 2025
k1low
1
390
コード1ミリもわからないけど Claude CodeでFigjamプラグインを作った話
abokadotyann
1
160
Datadog On-Call と Cloud SIEM で作る SOC 基盤
kuriyosh
0
150
ソフトウェアテストのAI活用_ver1.50
fumisuke
0
290
「データ無い! 腹立つ! 推論する!」から 「データ無い! 腹立つ! データを作る」へ チームでデータを作り、育てられるようにするまで / How can we create, use, and maintain data ourselves?
moznion
4
2.1k
us-east-1 の障害が 起きると なぜ ソワソワするのか
miu_crescent
PRO
2
740
龍昌餃子で理解するWebサーバーの並行処理モデル - 東葛.dev #9
kozy4324
1
140
激動の2025年、Modern Data Stackの最新技術動向
sagara
0
1.2k
Black Hat USA 2025 Recap ~ クラウドセキュリティ編 ~
kyohmizu
0
510
Featured
See All Featured
Music & Morning Musume
bryan
46
6.9k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
285
14k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
31
2.7k
Performance Is Good for Brains [We Love Speed 2024]
tammyeverts
12
1.3k
Art, The Web, and Tiny UX
lynnandtonic
303
21k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
116
20k
We Have a Design System, Now What?
morganepeng
54
7.9k
Designing for humans not robots
tammielis
254
26k
Practical Orchestrator
shlominoach
190
11k
Let's Do A Bunch of Simple Stuff to Make Websites Faster
chriscoyier
508
140k
Designing Experiences People Love
moore
142
24k
Learning to Love Humans: Emotional Interface Design
aarron
274
41k
Transcript
Servers are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Serverless is also doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Systems are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Is failure OK? Is failure an unexpected case?
Failure is not an exception. Systems change all the time.
“I haven’t touched the code for a century, it should
just work.” Said no one ever.
Failure is expected. Yes, it is.
None
@rakyll monitoring debugging postmortem
Monitoring is about saying if something is broken.
“99.99% of the requests should return in 100ms.”
@rakyll
@rakyll
Debugging
Debugging is collaborative.
Debugging comes in flavors. Logs Traces Metrics ...
Postmortems
Postmortems
Postmortems
Blameless? Focus on identifying problems.
Collaboration Design for collaboration.
Design for failure Set SLOs, plan for instrumentation, plan for
debugging.
Cross-stack debugging Accountability across stack with high cardinality data. speakerdeck.com/rakyll/rpc-metrics-at-google
Correlation Jump from monitoring/debugging data to data.
On-call debugging Jump from distributed tracing data to on-call information.
who to page?
Dynamic collection Capability to enable more collection in production when
needed.
Continuous collection Continuously collect signals, generate fleet-wide analysis reports.
Introspection Introspection pages provided from the services.
@rakyll monitoring debugging postmortem
Thank you Jaana B. Dogan Google
[email protected]