Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Servers are doomed to fail
Search
JBD
May 17, 2019
Technology
3
1.5k
Servers are doomed to fail
JBD
May 17, 2019
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.1k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.9k
Debugging Code Generation in Go
rakyll
5
1.6k
Are you ready for production?
rakyll
8
2.8k
Serverless Containers
rakyll
1
250
Critical Path Analysis
rakyll
0
610
Monitoring and Debugging Containers
rakyll
2
1.1k
CPDD
rakyll
0
4.2k
Other Decks in Technology
See All in Technology
20250705 Headlamp: 專注可擴展性的 Kubernetes 用戶界面
pichuang
0
270
赤煉瓦倉庫勉強会「Databricksを選んだ理由と、絶賛真っ只中のデータ基盤移行体験記」
ivry_presentationmaterials
2
360
DBのスキルで生き残る技術 - AI時代におけるテーブル設計の勘所
soudai
PRO
48
19k
いつの間にか入れ替わってる!?新しいAWS Security Hubとは?
cmusudakeisuke
0
120
KubeCon + CloudNativeCon Japan 2025 Recap Opening & Choose Your Own Adventureシリーズまとめ
mmmatsuda
0
270
MobileActOsaka_250704.pdf
akaitadaaki
0
120
マネジメントって難しい、けどおもしろい / Management is tough, but fun! #em_findy
ar_tama
7
1.1k
Flutter向けPDFビューア、pdfrxのpdfium WASM対応について
espresso3389
0
130
使いたいMCPサーバーはWeb APIをラップして自分で作る #QiitaBash
bengo4com
0
1.9k
Backlog ユーザー棚卸しRTA、多分これが一番早いと思います
__allllllllez__
1
150
CRE Camp #1 エンジニアリングを民主化するCREチームでありたい話
mntsq
1
120
Zephyr RTOSを使った開発コンペに参加した件
iotengineer22
1
220
Featured
See All Featured
Stop Working from a Prison Cell
hatefulcrawdad
271
21k
Principles of Awesome APIs and How to Build Them.
keavy
126
17k
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
34
3.1k
How To Stay Up To Date on Web Technology
chriscoyier
790
250k
CSS Pre-Processors: Stylus, Less & Sass
bermonpainter
357
30k
Chrome DevTools: State of the Union 2024 - Debugging React & Beyond
addyosmani
7
740
I Don’t Have Time: Getting Over the Fear to Launch Your Podcast
jcasabona
32
2.4k
Git: the NoSQL Database
bkeepers
PRO
430
65k
How to train your dragon (web standard)
notwaldorf
95
6.1k
Intergalactic Javascript Robots from Outer Space
tanoku
271
27k
Into the Great Unknown - MozCon
thekraken
40
1.9k
Making Projects Easy
brettharned
116
6.3k
Transcript
Servers are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Serverless is also doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Systems are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Is failure OK? Is failure an unexpected case?
Failure is not an exception. Systems change all the time.
“I haven’t touched the code for a century, it should
just work.” Said no one ever.
Failure is expected. Yes, it is.
None
@rakyll monitoring debugging postmortem
Monitoring is about saying if something is broken.
“99.99% of the requests should return in 100ms.”
@rakyll
@rakyll
Debugging
Debugging is collaborative.
Debugging comes in flavors. Logs Traces Metrics ...
Postmortems
Postmortems
Postmortems
Blameless? Focus on identifying problems.
Collaboration Design for collaboration.
Design for failure Set SLOs, plan for instrumentation, plan for
debugging.
Cross-stack debugging Accountability across stack with high cardinality data. speakerdeck.com/rakyll/rpc-metrics-at-google
Correlation Jump from monitoring/debugging data to data.
On-call debugging Jump from distributed tracing data to on-call information.
who to page?
Dynamic collection Capability to enable more collection in production when
needed.
Continuous collection Continuously collect signals, generate fleet-wide analysis reports.
Introspection Introspection pages provided from the services.
@rakyll monitoring debugging postmortem
Thank you Jaana B. Dogan Google
[email protected]