Servers are doomed to fail
JBD
May 17, 2019
Transcript
Servers are doomed to fail
Jaana B. Dogan (@rakyll)
Serverless is also doomed to fail
Systems are doomed to fail
Is failure OK? Is failure an unexpected case?
Failure is not an exception. Systems change all the time.
“I haven’t touched the code for a century, it should just work.” Said no one ever.
Failure is expected. Yes, it is.
Monitoring, debugging, postmortem
Monitoring is about telling whether something is broken.
“99.99% of requests should return within 100 ms.”
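A target like the one quoted can be checked mechanically over a window of observed latencies. The sketch below is only an illustration: the function name `slo_met`, the inclusive threshold, and the sample data are assumptions, not something the talk specifies.

```python
def slo_met(latencies_ms, threshold_ms=100.0, target=0.9999):
    """Return (met, fraction) for a latency SLO over one window of requests."""
    if not latencies_ms:
        raise ValueError("no requests observed in this window")
    # Count requests that returned within the threshold (inclusive, by assumption).
    fast = sum(1 for ms in latencies_ms if ms <= threshold_ms)
    fraction = fast / len(latencies_ms)
    return fraction >= target, fraction


# 10,000 requests with two slow ones: 99.98% are fast, so the SLO is missed.
window = [20.0] * 9998 + [250.0] * 2
met, fraction = slo_met(window)
print(f"SLO met: {met} ({fraction:.2%} within 100 ms)")
```

At four nines, a single slow request in 10,000 still meets the target; a second one breaks it.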
Debugging
Debugging is collaborative.
Debugging comes in flavors: logs, traces, metrics, ...
Postmortems
Blameless? Focus on identifying problems.
Collaboration: Design for collaboration.
Design for failure: Set SLOs, plan for instrumentation, plan for debugging.
Cross-stack debugging: Accountability across the stack with high-cardinality data. See speakerdeck.com/rakyll/rpc-metrics-at-google
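One way to read "high-cardinality data" is that every RPC keeps all of its labels, so errors can later be sliced by whichever layer of the stack turns out to matter. The sketch below assumes that reading; the class `RpcErrorCounts` and the label names are hypothetical and are not taken from the linked deck.

```python
from collections import Counter


class RpcErrorCounts:
    """Error counts keyed by the full set of labels on each RPC."""

    def __init__(self):
        self._counts = Counter()

    def record(self, **labels):
        # Keep every label; dropping them early loses the ability to slice later.
        self._counts[frozenset(labels.items())] += 1

    def by(self, label):
        """Aggregate counts along one label, e.g. 'caller' or 'method'."""
        grouped = Counter()
        for labels, count in self._counts.items():
            grouped[dict(labels).get(label, "unknown")] += count
        return grouped


errors = RpcErrorCounts()
errors.record(service="inventory", method="Reserve", caller="checkout", zone="us-east1-b")
errors.record(service="inventory", method="Reserve", caller="checkout", zone="us-east1-c")
errors.record(service="inventory", method="Lookup", caller="search", zone="us-east1-b")
print(errors.by("caller").most_common())
```

Grouping by `caller` points at the upstream service, while grouping the same data by `zone` points at infrastructure.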
Correlation: Jump from one piece of monitoring/debugging data to another.
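A common way to make such jumps possible is to attach a trace ID to each metric observation, so that a slow latency bucket leads to a concrete trace and then to its logs. The following is a minimal sketch under that assumption; `LatencyHistogram`, the bucket bounds, and the log lines are all made up for illustration.

```python
import bisect

BOUNDS_MS = [50, 100, 250, 500]  # upper bounds of the latency buckets


class LatencyHistogram:
    def __init__(self):
        # One extra bucket catches everything above the last bound.
        self.counts = [0] * (len(BOUNDS_MS) + 1)
        self.exemplars = [None] * (len(BOUNDS_MS) + 1)

    def observe(self, latency_ms, trace_id):
        i = bisect.bisect_left(BOUNDS_MS, latency_ms)
        self.counts[i] += 1
        self.exemplars[i] = trace_id  # remember the latest trace seen in this bucket

    def slowest_exemplar(self):
        """Trace ID from the highest non-empty bucket, or None."""
        for trace_id in reversed(self.exemplars):
            if trace_id is not None:
                return trace_id
        return None


# Logs indexed by the same trace ID, so a metric can lead straight to them.
logs_by_trace = {
    "trace-a1": ["GET /cart 200 in 42ms"],
    "trace-b7": ["GET /cart retry 1", "GET /cart 200 in 480ms"],
}

hist = LatencyHistogram()
hist.observe(42, "trace-a1")
hist.observe(480, "trace-b7")
suspect = hist.slowest_exemplar()
print(suspect, logs_by_trace[suspect])
```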
On-call debugging: Jump from distributed tracing data to on-call information. Who to page?
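The "who to page?" question can be sketched as a lookup from the most suspicious span in a trace to the rotation that owns its service. Everything here is hypothetical: the `Span` fields, the `ONCALL` table, and the heuristic of preferring failing spans and then the slowest one.

```python
from dataclasses import dataclass


@dataclass
class Span:
    service: str
    duration_ms: float
    error: bool = False


# Hypothetical ownership table; a real one would come from an on-call system.
ONCALL = {
    "frontend": "team-web-oncall",
    "checkout": "team-payments-oncall",
    "inventory": "team-catalog-oncall",
}


def who_to_page(spans, oncall=ONCALL):
    """Pick the slowest failing span (or the slowest span if none failed) and return its owner."""
    failing = [s for s in spans if s.error]
    suspect = max(failing or spans, key=lambda s: s.duration_ms)
    return oncall.get(suspect.service, "unowned: escalate")


trace = [
    Span("frontend", 510.0),
    Span("checkout", 480.0, error=True),
    Span("inventory", 30.0),
]
print(who_to_page(trace))
```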
Dynamic collection: Capability to enable more collection in production when needed.
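One possible shape for this capability is a sampler whose rate can be changed while the service is running, for example from an admin endpoint. The sketch below assumes a simple "one in every N requests" policy; the class `DynamicSampler` and its methods are illustrative names, not an existing API.

```python
import threading


class DynamicSampler:
    """Collect detailed data for one in every N requests; N can change at runtime."""

    def __init__(self, sample_every=100):
        self._lock = threading.Lock()
        self._sample_every = sample_every
        self._seen = 0

    def set_sample_every(self, n):
        if n < 1:
            raise ValueError("sample_every must be at least 1")
        with self._lock:
            self._sample_every = n
            self._seen = 0  # restart the cycle so the new rate applies immediately

    def should_collect(self):
        with self._lock:
            self._seen += 1
            if self._seen >= self._sample_every:
                self._seen = 0
                return True
            return False


sampler = DynamicSampler(sample_every=100)
normal = sum(sampler.should_collect() for _ in range(1000))
sampler.set_sample_every(1)  # during an incident: collect everything
incident = sum(sampler.should_collect() for _ in range(50))
print(normal, incident)
```

Resetting the counter when the rate changes keeps the behavior predictable for whoever is turning collection up during an incident.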
Continuous collection: Continuously collect signals, generate fleet-wide analysis reports.
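A fleet-wide report can be as simple as merging the samples each host collected and ranking the result. The sketch below assumes per-host profiler sample counts keyed by function name; the host names, function names, and `fleet_report` itself are invented for illustration.

```python
from collections import Counter


def fleet_report(host_profiles, top=3):
    """Merge per-host sample counts and return the hottest functions fleet-wide."""
    total = Counter()
    for samples in host_profiles.values():
        total.update(samples)
    grand_total = sum(total.values())
    if grand_total == 0:
        return []
    # Each row is (function, samples, share of all samples).
    return [(fn, n, n / grand_total) for fn, n in total.most_common(top)]


profiles = {
    "host-1": {"json.encode": 40, "db.query": 50, "gc": 10},
    "host-2": {"json.encode": 70, "db.query": 20, "gc": 10},
}
for fn, samples, share in fleet_report(profiles):
    print(f"{fn:12} {samples:4d} samples  {share:.0%}")
```

In this made-up data, `db.query` dominates on host-1, but the merged view shows `json.encode` costing the most across the fleet.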
Introspection: Introspection pages provided by the services.
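An introspection page can be a small HTTP endpoint that the service exposes about itself. The sketch below uses only the Python standard library; the `/debug/status` path, the `STATS` fields, and `start_introspection` are assumptions chosen for the example rather than a convention from the talk.

```python
import json
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

START = time.time()
STATS = {"requests_served": 0, "last_error": None}  # updated by the service itself


class IntrospectionHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/debug/status":
            self.send_error(404)
            return
        body = json.dumps({"uptime_s": round(time.time() - START, 1), **STATS}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def start_introspection(port=0):
    """Serve the status page on localhost in a background thread."""
    server = ThreadingHTTPServer(("127.0.0.1", port), IntrospectionHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `start_introspection(8081)` at service startup would make `http://127.0.0.1:8081/debug/status` available; binding to localhost only is a deliberate choice here, since such pages often reveal internal details.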
Monitoring, debugging, postmortem
Thank you. Jaana B. Dogan, Google