Lock in $30 Savings on PRO—Offer Ends Soon! ⏳
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
dojo.pdf
Search
Rich Burroughs
April 18, 2019
Technology
0
130
dojo.pdf
Rich Burroughs
April 18, 2019
Tweet
Share
More Decks by Rich Burroughs
See All by Rich Burroughs
Virtual_Kubernetes_Clusters__Tips_and_Tricks_-_Rejekts.pdf
richburroughs
0
1.2k
What On-Call Does to Us
richburroughs
1
120
Other Decks in Technology
See All in Technology
マイクロサービスへの5年間 ぶっちゃけ何をしてどうなったか
joker1007
8
4.8k
コミューンのデータ分析AIエージェント「Community Sage」の紹介
fufufukakaka
0
510
ログ管理の新たな可能性?CloudWatchの新機能をご紹介
ikumi_ono
1
790
エンジニアとPMのドメイン知識の溝をなくす、 AIネイティブな開発プロセス
applism118
4
1.3k
Sansanが実践する Platform EngineeringとSREの協創
sansantech
PRO
2
900
品質のための共通認識
kakehashi
PRO
3
260
多様なデジタルアイデンティティを攻撃からどうやって守るのか / 20251212
ayokura
0
470
ExpoのインダストリーブースでみたAWSが見せる製造業の未来
hamadakoji
0
130
Lessons from Migrating to OpenSearch: Shard Design, Log Ingestion, and UI Decisions
sansantech
PRO
1
130
大企業でもできる!ボトムアップで拡大させるプラットフォームの作り方
findy_eventslides
1
800
re:Invent 2025 ふりかえり 生成AI版
takaakikakei
1
210
Haskell を武器にして挑む競技プログラミング ─ 操作的思考から意味モデル思考へ
naoya
6
1.5k
Featured
See All Featured
The Hidden Cost of Media on the Web [PixelPalooza 2025]
tammyeverts
1
100
Music & Morning Musume
bryan
46
7k
Bootstrapping a Software Product
garrettdimon
PRO
307
120k
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
128
54k
Imperfection Machines: The Place of Print at Facebook
scottboms
269
13k
Docker and Python
trallard
47
3.7k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
122
21k
YesSQL, Process and Tooling at Scale
rocio
174
15k
Stop Working from a Prison Cell
hatefulcrawdad
273
21k
[Rails World 2023 - Day 1 Closing Keynote] - The Magic of Rails
eileencodes
37
2.6k
Fantastic passwords and where to find them - at NoRuKo
philnash
52
3.5k
GraphQLとの向き合い方2022年版
quramy
50
14k
Transcript
Learning Through Failure Rich Burroughs Community Manager Gremlin, Inc. @richburroughs
None
None
Complexity is constantly increasing
None
None
None
What's changed?
None
None
"Catastrophe is always just around the corner"
"Change introduces new forms of failure"
"All practitioner actions are gambles"
None
None
What are some ways we can learn more about systems?
None
None
None
Chaos Engineering
"The science of performing intentional experimentation on a system by
injecting precise and measured amounts of harm to observe how the system responds for the purpose of improving the system’s resilience."
None
Prerequisites —Observability —Blameless Culture
Scientific Method —Ask a question —Research —Form a hypothesis —Experiment
to test the hypothesis —Analyze data and draw a conclusion —Share the results
Types of attacks —Shutdown —CPU —Memory —I/O —Network Latency —Packet
Loss —DNS —Blackhole
None
The goal is to experiment in Production
None
Example experiment —Application: Front End —Attack: CPU —Hypothesis: Adding CPU
load will cause additional hosts to spin up in our Autoscaling Group —Abort condition: Latency increases by 20%
Example experiment #2 —Application: Front End —Attack: Blackhole —Hypothesis: Blackholing
the hostname for the Twilio API will cause the SMS transmissions to time out —Abort condition: Error rate increases by 20%
Don't experiment on things you know are broken
None
Questions —Were we able to measure the results? —Did the
system respond the way we expected? —Are there things we need to fix?
Run experiments to simulate an incident you've had
What comes after Game Days?
Continuous Chaos
Maturity model —Running manual experiments —Running experiments using Chaos Engineering
tools —Regularly scheduled Game Days —Experimenting in Production —Continuous Chaos
Next steps: —Join our Chaos Engineering Slack: gremlin.com/ slack —Read
tutorials: gremlin.com/community —Chaos Conf: chaosconf.io —Gremlin Free: go.gremlin.com/richchaos
Thank you! Twitter: @richburroughs Email:
[email protected]
Slides: https://github.com/richburroughs/ dojo201904