Upgrade to PRO for Only $50/Year—Limited-Time Offer! 🔥
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
GitHub Universe 2015 Talk - Your software is br...
Search
James Smith
October 02, 2015
Technology
1
110
GitHub Universe 2015 Talk - Your software is broken — pay attention: Rethinking production monitoring
My talk from GitHub Universe 2015's "Deploy" track
James Smith
October 02, 2015
Tweet
Share
More Decks by James Smith
See All by James Smith
Why Are Android Apps So Crash-Prone?
loopj
0
180
RailsConf 2016 Talk - Your software is broken — pay attention: Rethinking production monitoring
loopj
1
400
Building A Popular Open-Source Android Library - Best practices and lessons learned
loopj
4
480
Building A Popular Open-Source javascript Library
loopj
0
93
JavaScript Stack Traces: The good, the bad, and the ugly
loopj
1
210
Other Decks in Technology
See All in Technology
EM歴1年10ヶ月のぼくがぶち当たった苦悩とこれからへ向けて
maaaato
0
270
Snowflakeでデータ基盤を もう一度作り直すなら / rebuilding-data-platform-with-snowflake
pei0804
2
780
生成AI時代の自動E2Eテスト運用とPlaywright実践知_引持力哉
legalontechnologies
PRO
0
210
エンジニアとPMのドメイン知識の溝をなくす、 AIネイティブな開発プロセス
applism118
4
1k
Uncertainty in the LLM era - Science, more than scale
gaelvaroquaux
0
810
ログ管理の新たな可能性?CloudWatchの新機能をご紹介
ikumi_ono
1
540
日本Rubyの会の構造と実行とあと何か / hokurikurk01
takahashim
4
950
“決まらない”NSM設計への処方箋 〜ビットキーにおける現実的な指標デザイン事例〜 / A Prescription for "Stuck" NSM Design: Bitkey’s Practical Case Study
bitkey
PRO
1
580
Debugging Edge AI on Zephyr and Lessons Learned
iotengineer22
0
120
計算機科学をRubyと歩む 〜DFA型正規表現エンジンをつくる~
ydah
3
200
学習データって増やせばいいんですか?
ftakahashi
1
250
最近のLinux普段づかいWaylandデスクトップ元年
penguin2716
1
670
Featured
See All Featured
How To Stay Up To Date on Web Technology
chriscoyier
791
250k
For a Future-Friendly Web
brad_frost
180
10k
Fireside Chat
paigeccino
41
3.7k
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
12
970
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
333
22k
Building a Scalable Design System with Sketch
lauravandoore
463
34k
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
35
2.3k
A Modern Web Designer's Workflow
chriscoyier
698
190k
Music & Morning Musume
bryan
46
7k
Agile that works and the tools we love
rasmusluckow
331
21k
Build your cross-platform service in a week with App Engine
jlugia
234
18k
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
35
3.3k
Transcript
RETHINKING PRODUCTION MONITORING YOUR SOFTWARE IS BROKEN — PAY ATTENTION
JAMES SMITH loopj loopj
None
CODE TEST DEPLOY YOLO ¯\_(ϑ)_/¯
CODE TEST DEPLOY YOLO CODE TEST DEPLOY CONFIDENCE ¯\_(ϑ)_/¯ :)
STABILITY PERFORMANCE AVAILABILITY
DELIVERING AN AWESOME EXPERIENCE TO CUSTOMERS
WHY MONITORING MATTERS
YOUR APP WILL LIVE OR DIE BASED ON ITS QUALITY
— CUSTOMERS HAVE A CHOICE
84% OF USERS ABANDON AFTER TWO CRASHES
49% OF ENGINEERING TIME FINDING & FIXING BUGS
SINS OF PRODUCTION MONITORING WHAT AM I DOING WRONG?
1. PRETENDING NOTHING IS WRONG
“But I’ve written tests!” “The QA Team will check that!”
“Works great for me!”
2. WAITING FOR CUSTOMERS TO COMPLAIN
“Nobody complained so everything must be OK”
3. LACK OF VISIBILITY
“We’ll just check the logs” “Did you remember to add
a log statement?”
4. LACK OF OWNERSHIP
“Not my problem!” “I’ve got a feature to ship” “My
code works fine”
HOW CAN WE DO BETTER?
ACCEPT AUTOMATE AGGREGATE NOTIFY PRIORITIZE DIAGNOSE TEND CORE PRINCIPLES OF
PRODUCTION MONITORING
1. ACCEPT ACCEPT THAT YOUR SOFTWARE WILL BREAK AFTER SHIPPING
2. AUTOMATE ADD HOOKS TO DETECT CRASHES/ERRORS/ISSUES IN PRODUCTION
3. AGGREGATE DON'T JUST HAVE A STREAM OF EVENTS -
GROUP LIKE ISSUES TOGETHER
4. NOTIFY ALERT YOUR DEV TEAM WHERE THEY ALREADY COMMUNICATE
5. PRIORITIZE YOU CAN'T FIX EVERY ERROR - SO FOCUS
ON THE MOST HARMFUL ONES
6. DIAGNOSE KNOWING ABOUT ISSUES ISN'T ENOUGH - THEY MUST
BE ACTIONABLE
7. TEND MAKE AN ORGANIZATIONAL CHANGE - SOMEONE NEEDS TO
CARE ABOUT ERRORS
TAKING ACTION
TOOLS
USES “FAILURE” HOOKS
ASSESS IMPACT
ASSESS SEVERITY
CAPTURES DIAGNOSTIC DATA
WORKFLOW
USE TEAM CHAT
EMBRACE COLLABORATION
TRACK PROGRESS OF FIXES
TEAM STRUCTURES
EMBRACE RAPID ITERATION
CREATE A “BUG TEAM”
OR CREATE A “BUG ROTATION”
OR KNOW “WHO LAST TOUCHED THIS CODE”?
TL;DR
AVOID THE SINS
EMBRACE CORE PRINCIPLES
TAKE ACTION
THANK YOU!
QUESTIONS?
IS HIRING! bugsnag.com/jobs @bugsnag