Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Four years of breaking things in production, on...
Search
Eric Sigler
November 09, 2017
Technology
0
52
Four years of breaking things in production, on purpose.
Presented at Chaos Day Twin Cities, November 2017.
Eric Sigler
November 09, 2017
Tweet
Share
More Decks by Eric Sigler
See All by Eric Sigler
Instrumenting The Rest Of The Company: Hunting For Metrics
esigler
0
330
A Brief Introduction To DevOps
esigler
0
99
Humans are terrible compilers: A User's Guide
esigler
0
110
Do You Know If Your Service Is Working Properly? A Guide To Being Paranoid.
esigler
0
160
"Is there any strong objection?"
esigler
0
210
Fear, Uncertainty, and Continuous Deployment
esigler
1
110
3AM, a survey.
esigler
0
210
Strategies For Being On Call & Keeping Your Sanity At The Same Time
esigler
0
160
Engineering for Engineers
esigler
0
84
Other Decks in Technology
See All in Technology
【令和最新版】AWS Direct Connectと愉快なGWたちのおさらい
minorun365
PRO
2
140
AWS Lambdaと歩んだ“サーバーレス”と今後 #lambda_10years
yoshidashingo
1
110
マイベストのデータ基盤の現在と未来 / mybest-data-infra-asis-tobe
mybestinc
2
1.9k
Mini Tokyo 3D × PLATEAU - 公共交通デジタルツインにリアルな風景を
nagix
1
230
地理情報データをデータベースに格納しよう~ GPUを活用した爆速データベース PG-Stromの紹介 ~
sakaik
1
110
ドメイン名の終活について - JPAAWG 7th -
mikit
31
17k
スクラムチームを立ち上げる〜チーム開発で得られたもの・得られなかったもの〜
ohnoeight
2
290
GraphRAGを用いたLLMによるパーソナライズド推薦の生成
naveed92
0
190
いざ、BSC討伐の旅
nikinusu
2
610
Railsで4GBのデカ動画ファイルのアップロードと配信、どう実現する?
asflash8
1
210
[FOSS4G 2024 Japan LT] LLMを使ってGISデータ解析を自動化したい!
nssv
1
170
Terraform未経験の御様に対してどの ように導⼊を進めていったか
tkikuchi
1
260
Featured
See All Featured
Mobile First: as difficult as doing things right
swwweet
222
8.9k
Happy Clients
brianwarren
97
6.7k
Writing Fast Ruby
sferik
627
61k
A better future with KSS
kneath
238
17k
Fantastic passwords and where to find them - at NoRuKo
philnash
50
2.9k
Optimising Largest Contentful Paint
csswizardry
33
2.9k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
92
16k
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
4
360
Code Reviewing Like a Champion
maltzj
520
39k
Documentation Writing (for coders)
carmenintech
65
4.4k
4 Signs Your Business is Dying
shpigford
180
21k
How To Stay Up To Date on Web Technology
chriscoyier
788
250k
Transcript
Eric Sigler, Head of DevOps, PagerDuty @esigler Four years of
breaking things in production, on purpose.
@esigler Obligatory disclaimer: This is what works for us. Take
away ideas, not dogmas.
@esigler
@esigler 2013: Every Friday, 1 hour. 2013 2014 2015 2016
2017
@esigler 2013 2014 2015 2016 2017
None
@esigler 2014: Expanding Scope 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler 2015: Automation 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler 2016: Adding In Randomness 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler Also 2016: Putting It All Together 2013 2014 2015
2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler 2017: Distributing Knowledge 2013 2014 2015 2016 2017
@esigler 2013 2014 2015 2016 2017
@esigler Failure Friday sessions: 133 Faults injected: 708 Fault injections
resulting in a public postmortem: 3
@esigler Simulated full AZ failures: 4 Simulated full Region failures:
3 Simulated partial Disaster Recovery: 2
@esigler Tickets created from Failure Friday: over 225 Distinct services
that had faults injected: 49
@esigler
@esigler Optimized for learning first, tooling second Built the toolchain
to enable other teams Distributed chaos engineering knowledge
@esigler