Lock in $30 Savings on PRO—Offer Ends Soon! ⏳
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Servers are doomed to fail
Search
JBD
May 17, 2019
Technology
3
1.5k
Servers are doomed to fail
JBD
May 17, 2019
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.2k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.9k
Debugging Code Generation in Go
rakyll
5
1.6k
Are you ready for production?
rakyll
8
2.9k
Serverless Containers
rakyll
1
270
Critical Path Analysis
rakyll
0
660
Monitoring and Debugging Containers
rakyll
2
1.1k
CPDD
rakyll
0
4.2k
Other Decks in Technology
See All in Technology
障害対応訓練、その前に
coconala_engineer
0
100
初めてのDatabricks AI/BI Genie
taka_aki
0
210
IAMユーザーゼロの運用は果たして可能なのか
yama3133
2
490
Python 3.14 Overview
lycorptech_jp
PRO
1
120
生成AI時代におけるグローバル戦略思考
taka_aki
0
200
寫了幾年 Code,然後呢?軟體工程師必須重新認識的 DevOps
cheng_wei_chen
1
1.5k
MySQLとPostgreSQLのコレーション / Collation of MySQL and PostgreSQL
tmtms
1
1k
AI 駆動開発勉強会 フロントエンド支部 #1 w/あずもば
1ftseabass
PRO
0
400
S3を正しく理解するための内部構造の読解
nrinetcom
PRO
3
170
Lessons from Migrating to OpenSearch: Shard Design, Log Ingestion, and UI Decisions
sansantech
PRO
1
150
Snowflakeでデータ基盤を もう一度作り直すなら / rebuilding-data-platform-with-snowflake
pei0804
6
1.6k
Microsoft Agent 365 についてゆっくりじっくり理解する!
skmkzyk
0
380
Featured
See All Featured
Fantastic passwords and where to find them - at NoRuKo
philnash
52
3.5k
The innovator’s Mindset - Leading Through an Era of Exponential Change - McGill University 2025
jdejongh
PRO
1
61
Agile Actions for Facilitating Distributed Teams - ADO2019
mkilby
0
85
Optimizing for Happiness
mojombo
379
70k
Music & Morning Musume
bryan
46
7k
Marketing to machines
jonoalderson
1
4.3k
Reflections from 52 weeks, 52 projects
jeffersonlam
355
21k
Art, The Web, and Tiny UX
lynnandtonic
304
21k
What's in a price? How to price your products and services
michaelherold
246
13k
Building a Scalable Design System with Sketch
lauravandoore
463
34k
Into the Great Unknown - MozCon
thekraken
40
2.2k
Faster Mobile Websites
deanohume
310
31k
Transcript
Servers are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Serverless is also doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Systems are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Is failure OK? Is failure an unexpected case?
Failure is not an exception. Systems change all the time.
“I haven’t touched the code for a century, it should
just work.” Said no one ever.
Failure is expected. Yes, it is.
None
@rakyll monitoring debugging postmortem
Monitoring is about saying if something is broken.
“99.99% of the requests should return in 100ms.”
@rakyll
@rakyll
Debugging
Debugging is collaborative.
Debugging comes in flavors. Logs Traces Metrics ...
Postmortems
Postmortems
Postmortems
Blameless? Focus on identifying problems.
Collaboration Design for collaboration.
Design for failure Set SLOs, plan for instrumentation, plan for
debugging.
Cross-stack debugging Accountability across stack with high cardinality data. speakerdeck.com/rakyll/rpc-metrics-at-google
Correlation Jump from monitoring/debugging data to data.
On-call debugging Jump from distributed tracing data to on-call information.
who to page?
Dynamic collection Capability to enable more collection in production when
needed.
Continuous collection Continuously collect signals, generate fleet-wide analysis reports.
Introspection Introspection pages provided from the services.
@rakyll monitoring debugging postmortem
Thank you Jaana B. Dogan Google
[email protected]