Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Servers are doomed to fail
Search
JBD
May 17, 2019
Technology
3
1.5k
Servers are doomed to fail
JBD
May 17, 2019
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.1k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.8k
Debugging Code Generation in Go
rakyll
5
1.5k
Are you ready for production?
rakyll
8
2.7k
Serverless Containers
rakyll
1
240
Critical Path Analysis
rakyll
0
540
Monitoring and Debugging Containers
rakyll
2
1.1k
CPDD
rakyll
0
4.1k
Other Decks in Technology
See All in Technology
チームが毎日小さな変化と適応を続けたら1年間でスケール可能なアジャイルチームができた話 / Building a Scalable Agile Team
kakehashi
2
220
30分でわかる「リスクから学ぶKubernetesコンテナセキュリティ」/30min-k8s-container-sec
mochizuki875
3
440
シフトライトなテスト活動を適切に行うことで、無理な開発をせず、過剰にテストせず、顧客をビックリさせないプロダクトを作り上げているお話 #RSGT2025 / Shift Right
nihonbuson
3
2.1k
「隙間家具OSS」に至る道/Fujiwara Tech Conference 2025
fujiwara3
6
6.3k
Docker Desktop で Docker を始めよう
zembutsu
PRO
0
150
あなたの知らないクラフトビールの世界
miura55
0
120
comilioとCloudflare、そして未来へと向けて
oliver_diary
6
440
PaaSの歴史と、 アプリケーションプラットフォームのこれから
jacopen
7
1.3k
完全自律型AIエージェントとAgentic Workflow〜ワークフロー構築という現実解
pharma_x_tech
0
320
When Windows Meets Kubernetes…
pichuang
0
300
Reactフレームワークプロダクトを モバイルアプリにして、もっと便利に。 ユーザに価値を届けよう。/React Framework with Capacitor
rdlabo
0
110
iPadOS18でフローティングタブバーを解除してみた
sansantech
PRO
1
120
Featured
See All Featured
Mobile First: as difficult as doing things right
swwweet
222
9k
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
26
1.9k
Making Projects Easy
brettharned
116
6k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
49
2.2k
Intergalactic Javascript Robots from Outer Space
tanoku
270
27k
Measuring & Analyzing Core Web Vitals
bluesmoon
5
210
VelocityConf: Rendering Performance Case Studies
addyosmani
327
24k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
666
120k
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
173
51k
The Cost Of JavaScript in 2023
addyosmani
46
7.2k
Six Lessons from altMBA
skipperchong
27
3.6k
The Power of CSS Pseudo Elements
geoffreycrofte
74
5.4k
Transcript
Servers are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Serverless is also doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Systems are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Is failure OK? Is failure an unexpected case?
Failure is not an exception. Systems change all the time.
“I haven’t touched the code for a century, it should
just work.” Said no one ever.
Failure is expected. Yes, it is.
None
@rakyll monitoring debugging postmortem
Monitoring is about saying if something is broken.
“99.99% of the requests should return in 100ms.”
@rakyll
@rakyll
Debugging
Debugging is collaborative.
Debugging comes in flavors. Logs Traces Metrics ...
Postmortems
Postmortems
Postmortems
Blameless? Focus on identifying problems.
Collaboration Design for collaboration.
Design for failure Set SLOs, plan for instrumentation, plan for
debugging.
Cross-stack debugging Accountability across stack with high cardinality data. speakerdeck.com/rakyll/rpc-metrics-at-google
Correlation Jump from monitoring/debugging data to data.
On-call debugging Jump from distributed tracing data to on-call information.
who to page?
Dynamic collection Capability to enable more collection in production when
needed.
Continuous collection Continuously collect signals, generate fleet-wide analysis reports.
Introspection Introspection pages provided from the services.
@rakyll monitoring debugging postmortem
Thank you Jaana B. Dogan Google
[email protected]