Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
GitHub Universe 2015 Talk - Your software is br...
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
James Smith
October 02, 2015
Technology
110
1
Share
GitHub Universe 2015 Talk - Your software is broken — pay attention: Rethinking production monitoring
My talk from GitHub Universe 2015's "Deploy" track
James Smith
October 02, 2015
More Decks by James Smith
See All by James Smith
Why Are Android Apps So Crash-Prone?
loopj
0
190
RailsConf 2016 Talk - Your software is broken — pay attention: Rethinking production monitoring
loopj
1
410
Building A Popular Open-Source Android Library - Best practices and lessons learned
loopj
4
490
Building A Popular Open-Source javascript Library
loopj
0
97
JavaScript Stack Traces: The good, the bad, and the ugly
loopj
1
220
Other Decks in Technology
See All in Technology
Eight Engineering Unit 紹介資料
sansan33
PRO
3
7.2k
幾億の壁を超えて/Beyond Countless Walls(JP)
ikuodanaka
0
140
AWS DevOps Agentはチームメイトになれるのか?/ Can AWS DevOps Agent become a teammate
kinunori
6
670
マルチプロダクトの信頼性を効率良く保っていくために
kworkdev
PRO
0
120
弁護士ドットコム株式会社 エンジニア職向け 会社紹介資料
bengo4com
1
120
Azure PortalなどにみるWebアクセシビリティ
tomokusaba
0
370
JEDAI in Osaka 2026イントロ
taka_aki
0
270
Azure Speech で音声対応してみよう
kosmosebi
0
150
Azure Lifecycle with Copilot CLI
torumakabe
3
990
AI時代における技術的負債への取り組み
codenote
0
1.2k
All About Sansan – for New Global Engineers
sansan33
PRO
1
1.4k
Snowflake Intelligence導入で 分かった活用のコツ
wonohe
0
110
Featured
See All Featured
HU Berlin: Industrial-Strength Natural Language Processing with spaCy and Prodigy
inesmontani
PRO
0
320
Evolving SEO for Evolving Search Engines
ryanjones
0
180
[SF Ruby Conf 2025] Rails X
palkan
2
950
Building Flexible Design Systems
yeseniaperezcruz
330
40k
Money Talks: Using Revenue to Get Sh*t Done
nikkihalliwell
0
200
The B2B funnel & how to create a winning content strategy
katarinadahlin
PRO
1
330
Bioeconomy Workshop: Dr. Julius Ecuru, Opportunities for a Bioeconomy in West Africa
akademiya2063
PRO
1
94
Discover your Explorer Soul
emna__ayadi
2
1.1k
How GitHub (no longer) Works
holman
316
150k
The innovator’s Mindset - Leading Through an Era of Exponential Change - McGill University 2025
jdejongh
PRO
1
160
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
27k
How Software Deployment tools have changed in the past 20 years
geshan
0
33k
Transcript
RETHINKING PRODUCTION MONITORING YOUR SOFTWARE IS BROKEN — PAY ATTENTION
JAMES SMITH loopj loopj
None
CODE TEST DEPLOY YOLO ¯\_(ϑ)_/¯
CODE TEST DEPLOY YOLO CODE TEST DEPLOY CONFIDENCE ¯\_(ϑ)_/¯ :)
STABILITY PERFORMANCE AVAILABILITY
DELIVERING AN AWESOME EXPERIENCE TO CUSTOMERS
WHY MONITORING MATTERS
YOUR APP WILL LIVE OR DIE BASED ON ITS QUALITY
— CUSTOMERS HAVE A CHOICE
84% OF USERS ABANDON AFTER TWO CRASHES
49% OF ENGINEERING TIME FINDING & FIXING BUGS
SINS OF PRODUCTION MONITORING WHAT AM I DOING WRONG?
1. PRETENDING NOTHING IS WRONG
“But I’ve written tests!” “The QA Team will check that!”
“Works great for me!”
2. WAITING FOR CUSTOMERS TO COMPLAIN
“Nobody complained so everything must be OK”
3. LACK OF VISIBILITY
“We’ll just check the logs” “Did you remember to add
a log statement?”
4. LACK OF OWNERSHIP
“Not my problem!” “I’ve got a feature to ship” “My
code works fine”
HOW CAN WE DO BETTER?
ACCEPT AUTOMATE AGGREGATE NOTIFY PRIORITIZE DIAGNOSE TEND CORE PRINCIPLES OF
PRODUCTION MONITORING
1. ACCEPT ACCEPT THAT YOUR SOFTWARE WILL BREAK AFTER SHIPPING
2. AUTOMATE ADD HOOKS TO DETECT CRASHES/ERRORS/ISSUES IN PRODUCTION
3. AGGREGATE DON'T JUST HAVE A STREAM OF EVENTS -
GROUP LIKE ISSUES TOGETHER
4. NOTIFY ALERT YOUR DEV TEAM WHERE THEY ALREADY COMMUNICATE
5. PRIORITIZE YOU CAN'T FIX EVERY ERROR - SO FOCUS
ON THE MOST HARMFUL ONES
6. DIAGNOSE KNOWING ABOUT ISSUES ISN'T ENOUGH - THEY MUST
BE ACTIONABLE
7. TEND MAKE AN ORGANIZATIONAL CHANGE - SOMEONE NEEDS TO
CARE ABOUT ERRORS
TAKING ACTION
TOOLS
USES “FAILURE” HOOKS
ASSESS IMPACT
ASSESS SEVERITY
CAPTURES DIAGNOSTIC DATA
WORKFLOW
USE TEAM CHAT
EMBRACE COLLABORATION
TRACK PROGRESS OF FIXES
TEAM STRUCTURES
EMBRACE RAPID ITERATION
CREATE A “BUG TEAM”
OR CREATE A “BUG ROTATION”
OR KNOW “WHO LAST TOUCHED THIS CODE”?
TL;DR
AVOID THE SINS
EMBRACE CORE PRINCIPLES
TAKE ACTION
THANK YOU!
QUESTIONS?
IS HIRING! bugsnag.com/jobs @bugsnag