Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Servers are doomed to fail
Search
JBD
May 17, 2019
Technology
1.6k
3
Share
Embed
Copy iframe code
Copy JS code
Copy link
Start on current slide
Servers are doomed to fail
JBD
May 17, 2019
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.2k
eBPF in Microservices Observability
rakyll
1
1.8k
OpenTelemetry at AWS
rakyll
1
1.9k
Debugging Code Generation in Go
rakyll
5
1.7k
Are you ready for production?
rakyll
8
3k
Serverless Containers
rakyll
1
300
Critical Path Analysis
rakyll
0
700
Monitoring and Debugging Containers
rakyll
2
1.2k
CPDD
rakyll
0
4.3k
Other Decks in Technology
See All in Technology
5分でわかるDuckDB Quack
chanyou0311
4
270
小さいから、全部わかる。— 常駐AI "xangi" のすすめ
sugupoko
0
160
When Platform Engineering Meets GenAI
sucitw
0
200
スタートアップにおけるアジャイルの実践について #shibuyagile
murabayashi
1
160
コミュニティの有益性 ~JAWS Days 2026 での体験を通して~ / The Benefits of a Community ~Through My Experience at JAWS Days 2026~
seike460
PRO
0
300
AI時代における最適なQA組織の作り方
ymty
3
190
自作お家AIエージェントスタックチャンFWで困っている所紹介
74th
0
160
はてなのサービス基盤を支える Kubernetes《足腰》
masayoshimaezawa
0
230
40代で“やっとエンジニアになれた”――閉じた学びを開き、空の青さを知る / 20260628 Naoki Takahashi
shift_evolve
PRO
4
1.3k
きのこカンファレンス2026_肩書きを外したとき私は誰か
yamasatimi
1
100
スタートアップにAmazon EKSは早すぎる? マルチプロダクト戦略を加速する Platform Engineeringの実践 / Is Amazon EKS Too Soon for Startups? Practical Platform Engineering to Accelerate a Multi-Product Strategy
elmodev09
1
1.9k
AWS Summit の片隅で、体育座りしながらコミュニティがにぎわう理由を考えた
k_adachi_01
2
290
Featured
See All Featured
Principles of Awesome APIs and How to Build Them.
keavy
128
18k
What Being in a Rock Band Can Teach Us About Real World SEO
427marketing
0
1k
Leveraging Curiosity to Care for An Aging Population
cassininazir
1
280
The World Runs on Bad Software
bkeepers
PRO
72
12k
A Modern Web Designer's Workflow
chriscoyier
698
190k
Redefining SEO in the New Era of Traffic Generation
szymonslowik
1
350
Breaking role norms: Why Content Design is so much more than writing copy - Taylor Woolridge
uxyall
0
330
Test your architecture with Archunit
thirion
1
2.3k
Speed Design
sergeychernyshev
33
1.9k
GitHub's CSS Performance
jonrohan
1033
470k
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
128
56k
Music & Morning Musume
bryan
47
7.2k
Transcript
Servers are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Serverless is also doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Systems are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Is failure OK? Is failure an unexpected case?
Failure is not an exception. Systems change all the time.
“I haven’t touched the code for a century, it should
just work.” Said no one ever.
Failure is expected. Yes, it is.
None
@rakyll monitoring debugging postmortem
Monitoring is about saying if something is broken.
“99.99% of the requests should return in 100ms.”
@rakyll
@rakyll
Debugging
Debugging is collaborative.
Debugging comes in flavors. Logs Traces Metrics ...
Postmortems
Postmortems
Postmortems
Blameless? Focus on identifying problems.
Collaboration Design for collaboration.
Design for failure Set SLOs, plan for instrumentation, plan for
debugging.
Cross-stack debugging Accountability across stack with high cardinality data. speakerdeck.com/rakyll/rpc-metrics-at-google
Correlation Jump from monitoring/debugging data to data.
On-call debugging Jump from distributed tracing data to on-call information.
who to page?
Dynamic collection Capability to enable more collection in production when
needed.
Continuous collection Continuously collect signals, generate fleet-wide analysis reports.
Introspection Introspection pages provided from the services.
@rakyll monitoring debugging postmortem
Thank you Jaana B. Dogan Google
[email protected]