Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
GitHub Universe 2015 Talk - Your software is br...
Search
Sponsored
·
Ship Features Fearlessly
Turn features on and off without deploys. Used by thousands of Ruby developers.
→
James Smith
October 02, 2015
Technology
120
1
Share
GitHub Universe 2015 Talk - Your software is broken — pay attention: Rethinking production monitoring
My talk from GitHub Universe 2015's "Deploy" track
James Smith
October 02, 2015
More Decks by James Smith
See All by James Smith
Why Are Android Apps So Crash-Prone?
loopj
0
190
RailsConf 2016 Talk - Your software is broken — pay attention: Rethinking production monitoring
loopj
1
420
Building A Popular Open-Source Android Library - Best practices and lessons learned
loopj
4
490
Building A Popular Open-Source javascript Library
loopj
0
97
JavaScript Stack Traces: The good, the bad, and the ugly
loopj
1
220
Other Decks in Technology
See All in Technology
Databricks 月刊サービスアップデート 2026年05月号
tyosi1212
0
200
「嘘をつくテスト」の失敗例から学ぶ 良いテストコード #frontend_phpcon_do
asumikam
0
160
Java正規表現エンジン(NFA)の仕組みと パフォーマンスを維持するための最適化手法
takeuchi_132917
0
180
ChatworkとBPaaS 異なる特性で学んだAI機能開発の ベストプラクティス
kubell_hr
2
2.2k
TypeScript Compiler APIとPHP-Parserを活用し、TypeScriptとPHPで型を共有する
shuta13
0
350
MIERUNE JCT 発表資料「宇宙から伊能忠敬ごっこ」
syuchimu
0
110
Oracle AI Database@AWS:サービス概要のご紹介
oracle4engineer
PRO
4
2.8k
実装は速くなった、レビューはどうする? ― 自身のレビューをAIで再現させるサーヴァントエンジニアリングのすゝめ / Implementation got faster. So what about reviews? — An invitation to Servant Engineering: Recreating your own code reviews with AI
nrslib
6
3k
AI フレンドリーなエラー監視を TypeScript で実現する
shinyaigeek
2
250
「気づいたら仕事が終わっている」バクラクAIエージェント本番運用の裏側 / layerx-bakuraku-aie2026
yuya4
18
9k
「コーディング」しない人のための Claude Code 入門 ChatGPT の次の一歩 — 業務に組み込む 育成・共有・自動化
rfdnxbro
2
1.1k
Ruby::Boxでできること、Refinementsでできること
joker1007
3
380
Featured
See All Featured
BBQ
matthewcrist
89
10k
SEO in 2025: How to Prepare for the Future of Search
ipullrank
3
3.5k
What Being in a Rock Band Can Teach Us About Real World SEO
427marketing
0
240
Designing Dashboards & Data Visualisations in Web Apps
destraynor
231
55k
Redefining SEO in the New Era of Traffic Generation
szymonslowik
1
320
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
35
3.5k
Gemini Prompt Engineering: Practical Techniques for Tangible AI Outcomes
mfonobong
2
420
Designing Powerful Visuals for Engaging Learning
tmiket
1
390
Testing 201, or: Great Expectations
jmmastey
46
8.2k
Building Adaptive Systems
keathley
44
3k
Balancing Empowerment & Direction
lara
6
1.1k
Chasing Engaging Ingredients in Design
codingconduct
0
210
Transcript
RETHINKING PRODUCTION MONITORING YOUR SOFTWARE IS BROKEN — PAY ATTENTION
JAMES SMITH loopj loopj
None
CODE TEST DEPLOY YOLO ¯\_(ϑ)_/¯
CODE TEST DEPLOY YOLO CODE TEST DEPLOY CONFIDENCE ¯\_(ϑ)_/¯ :)
STABILITY PERFORMANCE AVAILABILITY
DELIVERING AN AWESOME EXPERIENCE TO CUSTOMERS
WHY MONITORING MATTERS
YOUR APP WILL LIVE OR DIE BASED ON ITS QUALITY
— CUSTOMERS HAVE A CHOICE
84% OF USERS ABANDON AFTER TWO CRASHES
49% OF ENGINEERING TIME FINDING & FIXING BUGS
SINS OF PRODUCTION MONITORING WHAT AM I DOING WRONG?
1. PRETENDING NOTHING IS WRONG
“But I’ve written tests!” “The QA Team will check that!”
“Works great for me!”
2. WAITING FOR CUSTOMERS TO COMPLAIN
“Nobody complained so everything must be OK”
3. LACK OF VISIBILITY
“We’ll just check the logs” “Did you remember to add
a log statement?”
4. LACK OF OWNERSHIP
“Not my problem!” “I’ve got a feature to ship” “My
code works fine”
HOW CAN WE DO BETTER?
ACCEPT AUTOMATE AGGREGATE NOTIFY PRIORITIZE DIAGNOSE TEND CORE PRINCIPLES OF
PRODUCTION MONITORING
1. ACCEPT ACCEPT THAT YOUR SOFTWARE WILL BREAK AFTER SHIPPING
2. AUTOMATE ADD HOOKS TO DETECT CRASHES/ERRORS/ISSUES IN PRODUCTION
3. AGGREGATE DON'T JUST HAVE A STREAM OF EVENTS -
GROUP LIKE ISSUES TOGETHER
4. NOTIFY ALERT YOUR DEV TEAM WHERE THEY ALREADY COMMUNICATE
5. PRIORITIZE YOU CAN'T FIX EVERY ERROR - SO FOCUS
ON THE MOST HARMFUL ONES
6. DIAGNOSE KNOWING ABOUT ISSUES ISN'T ENOUGH - THEY MUST
BE ACTIONABLE
7. TEND MAKE AN ORGANIZATIONAL CHANGE - SOMEONE NEEDS TO
CARE ABOUT ERRORS
TAKING ACTION
TOOLS
USES “FAILURE” HOOKS
ASSESS IMPACT
ASSESS SEVERITY
CAPTURES DIAGNOSTIC DATA
WORKFLOW
USE TEAM CHAT
EMBRACE COLLABORATION
TRACK PROGRESS OF FIXES
TEAM STRUCTURES
EMBRACE RAPID ITERATION
CREATE A “BUG TEAM”
OR CREATE A “BUG ROTATION”
OR KNOW “WHO LAST TOUCHED THIS CODE”?
TL;DR
AVOID THE SINS
EMBRACE CORE PRINCIPLES
TAKE ACTION
THANK YOU!
QUESTIONS?
IS HIRING! bugsnag.com/jobs @bugsnag