Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Four years of breaking things in production, on...
Search
Eric Sigler
November 09, 2017
Technology
0
58
Four years of breaking things in production, on purpose.
Presented at Chaos Day Twin Cities, November 2017.
Eric Sigler
November 09, 2017
Tweet
Share
More Decks by Eric Sigler
See All by Eric Sigler
Instrumenting The Rest Of The Company: Hunting For Metrics
esigler
0
380
A Brief Introduction To DevOps
esigler
0
110
Humans are terrible compilers: A User's Guide
esigler
0
120
Do You Know If Your Service Is Working Properly? A Guide To Being Paranoid.
esigler
0
180
"Is there any strong objection?"
esigler
0
230
Fear, Uncertainty, and Continuous Deployment
esigler
1
120
3AM, a survey.
esigler
0
230
Strategies For Being On Call & Keeping Your Sanity At The Same Time
esigler
0
160
Engineering for Engineers
esigler
0
93
Other Decks in Technology
See All in Technology
ComposeではないコードをCompose化する case ビズリーチ / DroidKaigi 2025 koyasai
visional_engineering_and_design
0
100
やる気のない自分との向き合い方/How to Deal with Your Unmotivated Self
sanogemaru
0
470
AI時代こそ求められる設計力- AWSクラウドデザインパターン3選で信頼性と拡張性を高める-
kenichirokimura
3
290
神回のメカニズムと再現方法/Mechanisms and Playbook for Kamikai scrumat2025
moriyuya
4
720
Developer Advocate / Community Managerなるには?
tsho
0
130
Performance Insights 廃止から Database Insights 利用へ/transition-from-performance-insights-to-database-insights
emiki
0
200
SwiftUIのGeometryReaderとScrollViewを基礎から応用まで学び直す:設計と活用事例
fumiyasac0921
0
160
衛星画像超解像化によって実現する2D, 3D空間情報の即時生成と“AI as a Service”/ Real-time generation spatial data enabled_by satellite image super-resolution
lehupa
0
140
「AI駆動PO」を考えてみる - 作る速さから価値のスループットへ:検査・適応で未来を開発 / AI-driven product owner. scrummat2025
yosuke_nagai
3
820
LLMアプリの地上戦開発計画と運用実践 / 2025.10.15 GPU UNITE 2025
smiyawaki0820
1
390
OpenAI gpt-oss ファインチューニング入門
kmotohas
2
1.2k
許しとアジャイル
jnuank
1
140
Featured
See All Featured
StorybookのUI Testing Handbookを読んだ
zakiyama
31
6.2k
GraphQLの誤解/rethinking-graphql
sonatard
73
11k
Stop Working from a Prison Cell
hatefulcrawdad
271
21k
Design and Strategy: How to Deal with People Who Don’t "Get" Design
morganepeng
132
19k
Bootstrapping a Software Product
garrettdimon
PRO
307
110k
The Power of CSS Pseudo Elements
geoffreycrofte
79
6k
Become a Pro
speakerdeck
PRO
29
5.5k
Done Done
chrislema
185
16k
Mobile First: as difficult as doing things right
swwweet
224
10k
KATA
mclloyd
32
15k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
30
2.9k
How To Stay Up To Date on Web Technology
chriscoyier
791
250k
Transcript
Eric Sigler, Head of DevOps, PagerDuty @esigler Four years of
breaking things in production, on purpose.
@esigler Obligatory disclaimer: This is what works for us. Take
away ideas, not dogmas.
@esigler
@esigler 2013: Every Friday, 1 hour. 2013 2014 2015 2016
2017
@esigler 2013 2014 2015 2016 2017
None
@esigler 2014: Expanding Scope 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler 2015: Automation 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler 2016: Adding In Randomness 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler Also 2016: Putting It All Together 2013 2014 2015
2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler 2017: Distributing Knowledge 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler Failure Friday sessions: 133 Faults injected: 708 Fault injections
resulting in a public postmortem: 3
@esigler Simulated full AZ failures: 4 Simulated full Region failures:
3 Simulated partial Disaster Recovery: 2
@esigler Tickets created from Failure Friday: over 225 Distinct services
that had faults injected: 49
@esigler
@esigler Optimized for learning first, tooling second Built the toolchain
to enable other teams Distributed chaos engineering knowledge
@esigler