Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Servers are doomed to fail
Search
JBD
May 17, 2019
Technology
3
1.5k
Servers are doomed to fail
JBD
May 17, 2019
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.1k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.9k
Debugging Code Generation in Go
rakyll
5
1.6k
Are you ready for production?
rakyll
8
2.8k
Serverless Containers
rakyll
1
260
Critical Path Analysis
rakyll
0
620
Monitoring and Debugging Containers
rakyll
2
1.1k
CPDD
rakyll
0
4.2k
Other Decks in Technology
See All in Technology
会社にデータエンジニアがいることでできるようになること
10xinc
9
1.6k
見てわかるテスト駆動開発
recruitengineers
PRO
6
630
Claude Code x Androidアプリ 開発
kgmyshin
1
600
人と組織に偏重したEMへのアンチテーゼ──なぜ、EMに設計力が必要なのか/An antithesis to the overemphasis of people and organizations in EM
dskst
6
630
フルカイテン株式会社 エンジニア向け採用資料
fullkaiten
0
8.6k
[CVPR2025論文読み会] Linguistics-aware Masked Image Modelingfor Self-supervised Scene Text Recognition
s_aiueo32
0
210
攻撃と防御で実践するプロダクトセキュリティ演習~導入パート~
recruitengineers
PRO
3
340
JOAI発表資料 @ 関東kaggler会
joai_committee
1
380
コスト削減の基本の「キ」~ コスト消費3大リソースへの対策 ~
smt7174
2
180
GitHub Copilot coding agent を推したい / AIDD Nagoya #1
tnir
4
4.7k
AIドリブンのソフトウェア開発 - うまいやり方とまずいやり方
okdt
PRO
9
650
マイクロモビリティシェアサービスを支える プラットフォームアーキテクチャ
grimoh
1
240
Featured
See All Featured
Scaling GitHub
holman
462
140k
Large-scale JavaScript Application Architecture
addyosmani
512
110k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
283
13k
Embracing the Ebb and Flow
colly
87
4.8k
GitHub's CSS Performance
jonrohan
1031
460k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
29
1.8k
Designing for Performance
lara
610
69k
Done Done
chrislema
185
16k
Product Roadmaps are Hard
iamctodd
PRO
54
11k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
46
7.6k
Git: the NoSQL Database
bkeepers
PRO
431
65k
Designing Dashboards & Data Visualisations in Web Apps
destraynor
231
53k
Transcript
Servers are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Serverless is also doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Systems are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Is failure OK? Is failure an unexpected case?
Failure is not an exception. Systems change all the time.
“I haven’t touched the code for a century, it should
just work.” Said no one ever.
Failure is expected. Yes, it is.
None
@rakyll monitoring debugging postmortem
Monitoring is about saying if something is broken.
“99.99% of the requests should return in 100ms.”
@rakyll
@rakyll
Debugging
Debugging is collaborative.
Debugging comes in flavors. Logs Traces Metrics ...
Postmortems
Postmortems
Postmortems
Blameless? Focus on identifying problems.
Collaboration Design for collaboration.
Design for failure Set SLOs, plan for instrumentation, plan for
debugging.
Cross-stack debugging Accountability across stack with high cardinality data. speakerdeck.com/rakyll/rpc-metrics-at-google
Correlation Jump from monitoring/debugging data to data.
On-call debugging Jump from distributed tracing data to on-call information.
who to page?
Dynamic collection Capability to enable more collection in production when
needed.
Continuous collection Continuously collect signals, generate fleet-wide analysis reports.
Introspection Introspection pages provided from the services.
@rakyll monitoring debugging postmortem
Thank you Jaana B. Dogan Google
[email protected]