Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Servers are doomed to fail
Search
JBD
May 17, 2019
Technology
3
1.6k
Servers are doomed to fail
JBD
May 17, 2019
Tweet
Share
More Decks by JBD
See All by JBD
eBPF in Microservices Observability at eBPF Day
rakyll
1
2.2k
eBPF in Microservices Observability
rakyll
1
1.7k
OpenTelemetry at AWS
rakyll
1
1.9k
Debugging Code Generation in Go
rakyll
5
1.6k
Are you ready for production?
rakyll
8
2.9k
Serverless Containers
rakyll
1
270
Critical Path Analysis
rakyll
0
670
Monitoring and Debugging Containers
rakyll
2
1.1k
CPDD
rakyll
0
4.2k
Other Decks in Technology
See All in Technology
ALB「証明書上限問題」からの脱却
nishiokashinji
0
200
Vivre en Bitcoin : le tutoriel que votre banquier ne veut pas que vous voyiez
rlifchitz
0
270
【Oracle Cloud ウェビナー】ランサムウェアが突く「侵入の隙」とバックアップの「死角」 ~ 過去の教訓に学ぶ — 侵入前提の防御とデータ保護 ~
oracle4engineer
PRO
0
100
Kaggleコンペティション「MABe Challenge - Social Action Recognition in Mice」振り返り
yu4u
1
490
Proxmoxで作る自宅クラウド入門
koinunopochi
0
110
人工知能のための哲学塾 ニューロフィロソフィ篇 第零夜 「ニューロフィロソフィとは何か?」
miyayou
0
460
スクラムマスターが スクラムチームに入って取り組む5つのこと - スクラムガイドには書いてないけど入った当初から取り組んでおきたい大切なこと -
scrummasudar
3
2.2k
サラリーマンソフトウェアエンジニアのキャリア
yuheinakasaka
41
19k
AI Agent Agentic Workflow の可観測性 / Observability of AI Agent Agentic Workflow
yuzujoe
2
1.9k
これまでのネットワーク運用を変えるかもしれないアプデをおさらい
hatahata021
3
180
Sansan Engineering Unit 紹介資料
sansan33
PRO
1
3.6k
AI時代のアジャイルチームを目指して ー スクラムというコンフォートゾーンからの脱却 ー / Toward Agile Teams in the Age of AI
takaking22
11
6.7k
Featured
See All Featured
Java REST API Framework Comparison - PWX 2021
mraible
34
9.1k
Pawsitive SEO: Lessons from My Dog (and Many Mistakes) on Thriving as a Consultant in the Age of AI
davidcarrasco
0
47
SERP Conf. Vienna - Web Accessibility: Optimizing for Inclusivity and SEO
sarafernandez
1
1.3k
GraphQLとの向き合い方2022年版
quramy
50
14k
Marketing Yourself as an Engineer | Alaka | Gurzu
gurzu
0
110
Designing for Timeless Needs
cassininazir
0
120
Darren the Foodie - Storyboard
khoart
PRO
2
2.2k
The Mindset for Success: Future Career Progression
greggifford
PRO
0
210
The #1 spot is gone: here's how to win anyway
tamaranovitovic
1
890
Why Our Code Smells
bkeepers
PRO
340
58k
Utilizing Notion as your number one productivity tool
mfonobong
2
200
Exploring anti-patterns in Rails
aemeredith
2
230
Transcript
Servers are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Serverless is also doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Systems are doomed to fail Jaana B. Dogan
[email protected]
@rakyll
Is failure OK? Is failure an unexpected case?
Failure is not an exception. Systems change all the time.
“I haven’t touched the code for a century, it should
just work.” Said no one ever.
Failure is expected. Yes, it is.
None
@rakyll monitoring debugging postmortem
Monitoring is about saying if something is broken.
“99.99% of the requests should return in 100ms.”
@rakyll
@rakyll
Debugging
Debugging is collaborative.
Debugging comes in flavors. Logs Traces Metrics ...
Postmortems
Postmortems
Postmortems
Blameless? Focus on identifying problems.
Collaboration Design for collaboration.
Design for failure Set SLOs, plan for instrumentation, plan for
debugging.
Cross-stack debugging Accountability across stack with high cardinality data. speakerdeck.com/rakyll/rpc-metrics-at-google
Correlation Jump from monitoring/debugging data to data.
On-call debugging Jump from distributed tracing data to on-call information.
who to page?
Dynamic collection Capability to enable more collection in production when
needed.
Continuous collection Continuously collect signals, generate fleet-wide analysis reports.
Introspection Introspection pages provided from the services.
@rakyll monitoring debugging postmortem
Thank you Jaana B. Dogan Google
[email protected]