$30 off During Our Annual Pro Sale. View Details »
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
GitHub Universe 2015 Talk - Your software is br...
Search
James Smith
October 02, 2015
Technology
1
110
GitHub Universe 2015 Talk - Your software is broken — pay attention: Rethinking production monitoring
My talk from GitHub Universe 2015's "Deploy" track
James Smith
October 02, 2015
Tweet
Share
More Decks by James Smith
See All by James Smith
Why Are Android Apps So Crash-Prone?
loopj
0
180
RailsConf 2016 Talk - Your software is broken — pay attention: Rethinking production monitoring
loopj
1
410
Building A Popular Open-Source Android Library - Best practices and lessons learned
loopj
4
480
Building A Popular Open-Source javascript Library
loopj
0
93
JavaScript Stack Traces: The good, the bad, and the ugly
loopj
1
220
Other Decks in Technology
See All in Technology
_第4回__AIxIoTビジネス共創ラボ紹介資料_20251203.pdf
iotcomjpadmin
0
130
ESXi のAIOps だ!2025冬
unnowataru
0
350
AIBuildersDay_track_A_iidaxs
iidaxs
4
1.3k
Claude Codeを使った情報整理術
knishioka
10
5.7k
MySQLのSpatial(GIS)機能をもっと充実させたい ~ MyNA望年会2025LT
sakaik
0
120
AIエージェント開発と活用を加速するワークフロー自動生成への挑戦
shibuiwilliam
5
850
「もしもデータ基盤開発で『強くてニューゲーム』ができたなら今の僕はどんなデータ基盤を作っただろう」
aeonpeople
0
250
Oracle Database@AWS:サービス概要のご紹介
oracle4engineer
PRO
1
400
New Relic 1 年生の振り返りと Cloud Cost Intelligence について #NRUG
play_inc
0
230
たまに起きる外部サービスの障害に備えたり備えなかったりする話
egmc
0
410
まだ間に合う! Agentic AI on AWSの現在地をやさしく一挙おさらい
minorun365
17
2.7k
ソフトウェアエンジニアとAIエンジニアの役割分担についてのある事例
kworkdev
PRO
0
250
Featured
See All Featured
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
122
21k
Build The Right Thing And Hit Your Dates
maggiecrowley
38
3k
Bridging the Design Gap: How Collaborative Modelling removes blockers to flow between stakeholders and teams @FastFlow conf
baasie
0
410
Digital Projects Gone Horribly Wrong (And the UX Pros Who Still Save the Day) - Dean Schuster
uxyall
0
110
Jess Joyce - The Pitfalls of Following Frameworks
techseoconnect
PRO
1
31
Lightning Talk: Beautiful Slides for Beginners
inesmontani
PRO
1
410
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
141
34k
Abbi's Birthday
coloredviolet
0
3.8k
Color Theory Basics | Prateek | Gurzu
gurzu
0
150
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
254
22k
Have SEOs Ruined the Internet? - User Awareness of SEO in 2025
akashhashmi
0
190
Lightning talk: Run Django tests with GitHub Actions
sabderemane
0
92
Transcript
RETHINKING PRODUCTION MONITORING YOUR SOFTWARE IS BROKEN — PAY ATTENTION
JAMES SMITH loopj loopj
None
CODE TEST DEPLOY YOLO ¯\_(ϑ)_/¯
CODE TEST DEPLOY YOLO CODE TEST DEPLOY CONFIDENCE ¯\_(ϑ)_/¯ :)
STABILITY PERFORMANCE AVAILABILITY
DELIVERING AN AWESOME EXPERIENCE TO CUSTOMERS
WHY MONITORING MATTERS
YOUR APP WILL LIVE OR DIE BASED ON ITS QUALITY
— CUSTOMERS HAVE A CHOICE
84% OF USERS ABANDON AFTER TWO CRASHES
49% OF ENGINEERING TIME FINDING & FIXING BUGS
SINS OF PRODUCTION MONITORING WHAT AM I DOING WRONG?
1. PRETENDING NOTHING IS WRONG
“But I’ve written tests!” “The QA Team will check that!”
“Works great for me!”
2. WAITING FOR CUSTOMERS TO COMPLAIN
“Nobody complained so everything must be OK”
3. LACK OF VISIBILITY
“We’ll just check the logs” “Did you remember to add
a log statement?”
4. LACK OF OWNERSHIP
“Not my problem!” “I’ve got a feature to ship” “My
code works fine”
HOW CAN WE DO BETTER?
ACCEPT AUTOMATE AGGREGATE NOTIFY PRIORITIZE DIAGNOSE TEND CORE PRINCIPLES OF
PRODUCTION MONITORING
1. ACCEPT ACCEPT THAT YOUR SOFTWARE WILL BREAK AFTER SHIPPING
2. AUTOMATE ADD HOOKS TO DETECT CRASHES/ERRORS/ISSUES IN PRODUCTION
3. AGGREGATE DON'T JUST HAVE A STREAM OF EVENTS -
GROUP LIKE ISSUES TOGETHER
4. NOTIFY ALERT YOUR DEV TEAM WHERE THEY ALREADY COMMUNICATE
5. PRIORITIZE YOU CAN'T FIX EVERY ERROR - SO FOCUS
ON THE MOST HARMFUL ONES
6. DIAGNOSE KNOWING ABOUT ISSUES ISN'T ENOUGH - THEY MUST
BE ACTIONABLE
7. TEND MAKE AN ORGANIZATIONAL CHANGE - SOMEONE NEEDS TO
CARE ABOUT ERRORS
TAKING ACTION
TOOLS
USES “FAILURE” HOOKS
ASSESS IMPACT
ASSESS SEVERITY
CAPTURES DIAGNOSTIC DATA
WORKFLOW
USE TEAM CHAT
EMBRACE COLLABORATION
TRACK PROGRESS OF FIXES
TEAM STRUCTURES
EMBRACE RAPID ITERATION
CREATE A “BUG TEAM”
OR CREATE A “BUG ROTATION”
OR KNOW “WHO LAST TOUCHED THIS CODE”?
TL;DR
AVOID THE SINS
EMBRACE CORE PRINCIPLES
TAKE ACTION
THANK YOU!
QUESTIONS?
IS HIRING! bugsnag.com/jobs @bugsnag