Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Servers are doomed to fail
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
JBD
May 17, 2019
Technology
3
1.6k
Servers are doomed to fail
JBD
May 17, 2019
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.2k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.9k
Debugging Code Generation in Go
rakyll
5
1.6k
Are you ready for production?
rakyll
8
2.9k
Serverless Containers
rakyll
1
280
Critical Path Analysis
rakyll
0
690
Monitoring and Debugging Containers
rakyll
2
1.1k
CPDD
rakyll
0
4.3k
Other Decks in Technology
See All in Technology
The Rise of Browser Automation: AI-Powered Web Interaction in 2026
marcthompson_seo
0
210
Phase07_実務適用
overflowinc
0
510
めちゃくちゃ開発するQAエンジニアになって感じたメリットとこれからの課題感
ryuhei0000yamamoto
0
220
モジュラモノリス導入から4年間の総括:アーキテクチャと組織の相互作用について / Architecture and Organizational Interaction
nazonohito51
3
1.2k
20260321_エンベディングってなに?RAGってなに?エンベディングの説明とGemini Embedding 2 の紹介
tsho
0
130
_Architecture_Modernization_から学ぶ現状理解から設計への道のり.pdf
satohjohn
2
570
Everything Claude Code を眺める
oikon48
13
8.9k
フロントエンド刷新 4年間の軌跡
yotahada3
0
520
Phase06_ClaudeCode実践
overflowinc
0
530
Windows ファイル共有(SMB)を再確認する
murachiakira
PRO
0
210
【社内勉強会】新年度からコーディングエージェントを使いこなす - 構造と制約で引き出すClaude Codeの実践知
nwiizo
8
4.8k
頼れる Agentic AI を支える Datadog のオブザーバビリティ / Powering Reliable Agentic AI with Datadog Observability
aoto
PRO
0
240
Featured
See All Featured
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
128
55k
The Cost Of JavaScript in 2023
addyosmani
55
9.8k
技術選定の審美眼(2025年版) / Understanding the Spiral of Technologies 2025 edition
twada
PRO
118
110k
Amusing Abliteration
ianozsvald
0
140
What Being in a Rock Band Can Teach Us About Real World SEO
427marketing
0
200
Building the Perfect Custom Keyboard
takai
2
720
Done Done
chrislema
186
16k
Marketing to machines
jonoalderson
1
5k
VelocityConf: Rendering Performance Case Studies
addyosmani
333
24k
Bootstrapping a Software Product
garrettdimon
PRO
307
120k
Embracing the Ebb and Flow
colly
88
5k
Fantastic passwords and where to find them - at NoRuKo
philnash
52
3.6k
Transcript
Servers are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Serverless is also doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Systems are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Is failure OK? Is failure an unexpected case?
Failure is not an exception. Systems change all the time.
“I haven’t touched the code for a century, it should
just work.” Said no one ever.
Failure is expected. Yes, it is.
None
@rakyll monitoring debugging postmortem
Monitoring is about saying if something is broken.
“99.99% of the requests should return in 100ms.”
@rakyll
@rakyll
Debugging
Debugging is collaborative.
Debugging comes in flavors. Logs Traces Metrics ...
Postmortems
Postmortems
Postmortems
Blameless? Focus on identifying problems.
Collaboration Design for collaboration.
Design for failure Set SLOs, plan for instrumentation, plan for
debugging.
Cross-stack debugging Accountability across stack with high cardinality data. speakerdeck.com/rakyll/rpc-metrics-at-google
Correlation Jump from monitoring/debugging data to data.
On-call debugging Jump from distributed tracing data to on-call information.
who to page?
Dynamic collection Capability to enable more collection in production when
needed.
Continuous collection Continuously collect signals, generate fleet-wide analysis reports.
Introspection Introspection pages provided from the services.
@rakyll monitoring debugging postmortem
Thank you Jaana B. Dogan Google
[email protected]