Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
GitHub Universe 2015 Talk - Your software is br...
Search
Sponsored
·
Ship Features Fearlessly
Turn features on and off without deploys. Used by thousands of Ruby developers.
→
James Smith
October 02, 2015
Technology
1
110
GitHub Universe 2015 Talk - Your software is broken — pay attention: Rethinking production monitoring
My talk from GitHub Universe 2015's "Deploy" track
James Smith
October 02, 2015
Tweet
Share
More Decks by James Smith
See All by James Smith
Why Are Android Apps So Crash-Prone?
loopj
0
190
RailsConf 2016 Talk - Your software is broken — pay attention: Rethinking production monitoring
loopj
1
410
Building A Popular Open-Source Android Library - Best practices and lessons learned
loopj
4
490
Building A Popular Open-Source javascript Library
loopj
0
96
JavaScript Stack Traces: The good, the bad, and the ugly
loopj
1
220
Other Decks in Technology
See All in Technology
韓非子に学ぶAI活用術
tomfook
2
550
スケールアップ企業でQA組織が機能し続けるための組織設計と仕組み〜ボトムアップとトップダウンを両輪としたアプローチ〜
tarappo
4
370
AWS Systems Managerのハイブリッドアクティベーションを使用したガバメントクラウド環境の統合管理
toru_kubota
0
160
SSoT(Single Source of Truth)で「壊して再生」する設計
kawauso
2
330
【PHPerKaigi2026】OpenTelemetry SDKを使ってPHPでAPMを自作する
fendo181
1
190
Laravelで学ぶOAuthとOpenID Connectの基礎と実装
kyoshidaxx
4
1.8k
テストプロセスにおけるAI活用 :人間とAIの共存
hacomono
PRO
0
160
モジュラモノリス導入から4年間の総括:アーキテクチャと組織の相互作用について / Architecture and Organizational Interaction
nazonohito51
6
2.9k
品質を経営にどう語るか #jassttokyo / Communicating the Strategic Value of Quality to Executive Leadership
kyonmm
PRO
3
1.2k
「コントロールの三分法」で考える「コト」への向き合い方 / phperkaigi2026
blue_goheimochi
0
150
Navigation APIと見るSvelteKitのWeb標準志向
yamanoku
2
110
Phase12_総括_自走化
overflowinc
0
1.4k
Featured
See All Featured
コードの90%をAIが書く世界で何が待っているのか / What awaits us in a world where 90% of the code is written by AI
rkaga
61
43k
Unlocking the hidden potential of vector embeddings in international SEO
frankvandijk
0
210
Navigating the moral maze — ethical principles for Al-driven product design
skipperchong
2
300
More Than Pixels: Becoming A User Experience Designer
marktimemedia
3
360
Test your architecture with Archunit
thirion
1
2.2k
A better future with KSS
kneath
240
18k
Imperfection Machines: The Place of Print at Facebook
scottboms
269
14k
First, design no harm
axbom
PRO
2
1.1k
Designing for humans not robots
tammielis
254
26k
16th Malabo Montpellier Forum Presentation
akademiya2063
PRO
0
79
How People are Using Generative and Agentic AI to Supercharge Their Products, Projects, Services and Value Streams Today
helenjbeal
1
140
Speed Design
sergeychernyshev
33
1.6k
Transcript
RETHINKING PRODUCTION MONITORING YOUR SOFTWARE IS BROKEN — PAY ATTENTION
JAMES SMITH loopj loopj
None
CODE TEST DEPLOY YOLO ¯\_(ϑ)_/¯
CODE TEST DEPLOY YOLO CODE TEST DEPLOY CONFIDENCE ¯\_(ϑ)_/¯ :)
STABILITY PERFORMANCE AVAILABILITY
DELIVERING AN AWESOME EXPERIENCE TO CUSTOMERS
WHY MONITORING MATTERS
YOUR APP WILL LIVE OR DIE BASED ON ITS QUALITY
— CUSTOMERS HAVE A CHOICE
84% OF USERS ABANDON AFTER TWO CRASHES
49% OF ENGINEERING TIME FINDING & FIXING BUGS
SINS OF PRODUCTION MONITORING WHAT AM I DOING WRONG?
1. PRETENDING NOTHING IS WRONG
“But I’ve written tests!” “The QA Team will check that!”
“Works great for me!”
2. WAITING FOR CUSTOMERS TO COMPLAIN
“Nobody complained so everything must be OK”
3. LACK OF VISIBILITY
“We’ll just check the logs” “Did you remember to add
a log statement?”
4. LACK OF OWNERSHIP
“Not my problem!” “I’ve got a feature to ship” “My
code works fine”
HOW CAN WE DO BETTER?
ACCEPT AUTOMATE AGGREGATE NOTIFY PRIORITIZE DIAGNOSE TEND CORE PRINCIPLES OF
PRODUCTION MONITORING
1. ACCEPT ACCEPT THAT YOUR SOFTWARE WILL BREAK AFTER SHIPPING
2. AUTOMATE ADD HOOKS TO DETECT CRASHES/ERRORS/ISSUES IN PRODUCTION
3. AGGREGATE DON'T JUST HAVE A STREAM OF EVENTS -
GROUP LIKE ISSUES TOGETHER
4. NOTIFY ALERT YOUR DEV TEAM WHERE THEY ALREADY COMMUNICATE
5. PRIORITIZE YOU CAN'T FIX EVERY ERROR - SO FOCUS
ON THE MOST HARMFUL ONES
6. DIAGNOSE KNOWING ABOUT ISSUES ISN'T ENOUGH - THEY MUST
BE ACTIONABLE
7. TEND MAKE AN ORGANIZATIONAL CHANGE - SOMEONE NEEDS TO
CARE ABOUT ERRORS
TAKING ACTION
TOOLS
USES “FAILURE” HOOKS
ASSESS IMPACT
ASSESS SEVERITY
CAPTURES DIAGNOSTIC DATA
WORKFLOW
USE TEAM CHAT
EMBRACE COLLABORATION
TRACK PROGRESS OF FIXES
TEAM STRUCTURES
EMBRACE RAPID ITERATION
CREATE A “BUG TEAM”
OR CREATE A “BUG ROTATION”
OR KNOW “WHO LAST TOUCHED THIS CODE”?
TL;DR
AVOID THE SINS
EMBRACE CORE PRINCIPLES
TAKE ACTION
THANK YOU!
QUESTIONS?
IS HIRING! bugsnag.com/jobs @bugsnag