Production Debugging
What to do when shit's on fire
josh_robb
April 09, 2014
Transcript
Production Debugging
Production Debugging What to do when shit’s on fire
Me
- @josh_robb
- Code Nanny @ Pushpay.com
- “We’re Hiring” (tm)
Overview
- Context
- Tools
- Demos
- Wrap-up
Context
- OODA (John Boyd)
OODA
- Observe
- Orient
- Decide
- Act
Stressful situations
- Getting shot at in an aeroplane
- Fuck that
- Getting shot at full stop!
- Flying a helicopter with no engine
How to train for stress?
Emotions
- Denial
- Fear
- Anxiety
- Fatigue (not an emotion, but related)
Bad Judgement
- All of these things lead to poor-quality decision making.
What would NPH do?
- First, do no harm!
Don’t make things worse
Evaluate your options:
- Can you roll back?
- Can you get a new job?
- Can you roll forward?
MTTR
- Optimize for Mean Time To Recovery!
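To make the MTTR slide concrete, here is a minimal sketch of the arithmetic, assuming MTTR is measured as the average of (recovered minus detected) over a set of incidents. The incident timestamps and the function name `mean_time_to_recovery` are made up for illustration; they are not from the talk.

```python
from datetime import datetime, timedelta


def mean_time_to_recovery(incidents):
    """Average of (recovered - detected) over (detected, recovered) pairs."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((end - start for start, end in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incident log: (detected, recovered) timestamps.
incidents = [
    (datetime(2014, 4, 1, 9, 0), datetime(2014, 4, 1, 9, 30)),    # 30 min outage
    (datetime(2014, 4, 3, 14, 0), datetime(2014, 4, 3, 14, 10)),  # 10 min outage
    (datetime(2014, 4, 7, 22, 0), datetime(2014, 4, 7, 22, 50)),  # 50 min outage
]

mttr = mean_time_to_recovery(incidents)
print(mttr)  # 0:30:00
```

Anything that shortens the gap between detection and recovery (a fast rollback, a quick roll-forward) lowers this number, which is the quantity the slide says to optimize.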
Don’t roll back in fear
Roll forward to victory (Etsy)
Failure
If you work somewhere failure is unacceptable (apart from avionics or medical gear):
- Get a new job
- Seriously
Failure in tech is unavoidable
- Learn from it
Postmortems
- Postmortems are important
- Blameless ones are best
- http://codeascraft.com/2012/05/22/blameless-postmortems/
Demo Scenarios
- Two of them
- In one app
- Introducing BrokenApp
BrokenApp
Two scenarios:
- Hang
- High CPU
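One rough way to tell the two scenarios apart is to compare CPU time with wall-clock time: a blocked process burns almost no CPU, while a spinning one burns close to a full core. The sketch below is a Python stand-in, not the talk's BrokenApp (an ASP.NET app); `cpu_utilisation` and `spin` are hypothetical helpers, and it assumes the hang is an idle, blocked one.

```python
import time


def cpu_utilisation(work, *args):
    """Fraction of wall-clock time this process spent on CPU while running work()."""
    wall_start, cpu_start = time.monotonic(), time.process_time()
    work(*args)
    wall = time.monotonic() - wall_start
    cpu = time.process_time() - cpu_start
    return cpu / wall if wall > 0 else 0.0


def spin(seconds):
    """Stand-in for the high-CPU scenario: busy-wait until the deadline."""
    end = time.monotonic() + seconds
    while time.monotonic() < end:
        pass


idle = cpu_utilisation(time.sleep, 0.2)  # stand-in for a hang: blocked, little CPU
busy = cpu_utilisation(spin, 0.2)        # stand-in for high CPU: spinning
print(f"blocked: {idle:.2f}  spinning: {busy:.2f}")
```

This is the same signal you read off Task Manager or Resource Monitor by eye: requests not completing with a flat CPU graph suggests a hang, a pegged CPU suggests a hot loop.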
Tools
- Process tools
- Tracing tools
- Dump analysis
Tools
- Process tools: Orient
- Tracing tools: Observe
- Dump analysis: Decide
- ACT?
Process Tools
Windows Server 2012/2008
- Resource Monitor!
Also:
- Task Manager
- Process Explorer
- Procmon
Process Tools Demo
Tracing Tools
PerfView
- Process sampling tool
- Great for "what's happening over time?"
- Live profiling
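The idea behind a sampling tool can be sketched in a few lines: look at what a thread is executing at regular intervals and count what you see. The Python below only illustrates the technique; it is not how PerfView works internally (PerfView collects ETW events on Windows), and `sample_stacks` and `busy_loop` are hypothetical names.

```python
import collections
import sys
import threading
import time


def sample_stacks(thread_id, duration_s=0.5, interval_s=0.01):
    """Every interval_s, record which function the given thread is executing."""
    counts = collections.Counter()
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            counts[frame.f_code.co_name] += 1
        time.sleep(interval_s)
    return counts


def busy_loop(stop_flag):
    """Hypothetical hot function standing in for a high-CPU request."""
    x = 0
    while not stop_flag[0]:
        x += 1


stop_flag = [False]
worker = threading.Thread(target=busy_loop, args=(stop_flag,))
worker.start()
samples = sample_stacks(worker.ident)
stop_flag[0] = True
worker.join()

# The function that shows up in most samples is where the time is going.
print(samples.most_common(3))
```

Because the sampler only peeks at the stack, it adds little overhead to the thing being observed, which is why this style of tool is usable on a live production process.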
PerfView
- XCopy deployable
- Offline analysis #FTW
- NOTE: Enable ASP.NET tracing (DISM)
- Demo!
Tracing Tools
Message Analyzer
- Network traffic
- Packet sniffing
Dump Analysis
- WinDbg (for masochists these days)
WinDbg
WinDbg
- yeah
- no
- just say no
- DebugDiag FTW!
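What dump analysis buys you in the hang scenario is a view of every thread's stack at one moment, so you can see what each one is blocked on. The sketch below shows that idea in Python, inside a live process rather than on a dump file. It is an analogy only, not what WinDbg or DebugDiag do; `dump_all_threads` and `handle_request` are hypothetical names.

```python
import sys
import threading
import time
import traceback


def dump_all_threads():
    """Return {thread name: formatted stack} for every live thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    return {
        names.get(tid, str(tid)): "".join(traceback.format_stack(frame))
        for tid, frame in sys._current_frames().items()
    }


lock = threading.Lock()


def handle_request():
    """Hypothetical handler that blocks on a lock somebody else holds."""
    with lock:
        pass


lock.acquire()  # the main thread holds the lock, so the handler hangs
hung = threading.Thread(target=handle_request, name="request-1")
hung.start()
time.sleep(0.2)  # give the handler time to block on the lock

dump = dump_all_threads()
print(dump["request-1"])  # stack ends in handle_request, waiting on the lock

lock.release()  # let the handler finish so the example exits cleanly
hung.join()
```

Reading the dump answers the "Decide" question from the tools slide: once you know which call every stuck thread is waiting in, you know what to fix or restart.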