Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
dojo.pdf
Search
Rich Burroughs
April 18, 2019
Technology
0
130
dojo.pdf
Rich Burroughs
April 18, 2019
Tweet
Share
More Decks by Rich Burroughs
See All by Rich Burroughs
Virtual_Kubernetes_Clusters__Tips_and_Tricks_-_Rejekts.pdf
richburroughs
0
1.2k
What On-Call Does to Us
richburroughs
1
120
Other Decks in Technology
See All in Technology
【NoMapsTECH 2025】AI Edge Computing Workshop
akit37
0
220
テストを軸にした生き残り術
kworkdev
PRO
0
210
サラリーマンの小遣いで作るtoCサービス - Cloudflare Workersでスケールする開発戦略
shinaps
2
470
新アイテムをどう使っていくか?みんなであーだこーだ言ってみよう / 20250911-rpi-jam-tokyo
akkiesoft
0
320
はじめてのOSS開発からみえたGo言語の強み
shibukazu
3
950
LLMを搭載したプロダクトの品質保証の模索と学び
qa
0
1.1k
データ分析エージェント Socrates の育て方
na0
5
1.5k
Codeful Serverless / 一人運用でもやり抜く力
_kensh
7
450
Snowflake×dbtを用いたテレシーのデータ基盤のこれまでとこれから
sagara
0
110
複数サービスを支えるマルチテナント型Batch MLプラットフォーム
lycorptech_jp
PRO
1
860
AIのグローバルトレンド2025 #scrummikawa / global ai trend
kyonmm
PRO
1
310
なぜスクラムはこうなったのか?歴史が教えてくれたこと/Shall we explore the roots of Scrum
sanogemaru
5
1.7k
Featured
See All Featured
Automating Front-end Workflow
addyosmani
1370
200k
Bootstrapping a Software Product
garrettdimon
PRO
307
110k
The Myth of the Modular Monolith - Day 2 Keynote - Rails World 2024
eileencodes
26
3k
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
27k
A better future with KSS
kneath
239
17k
GraphQLとの向き合い方2022年版
quramy
49
14k
Rebuilding a faster, lazier Slack
samanthasiow
83
9.2k
Learning to Love Humans: Emotional Interface Design
aarron
273
40k
Practical Orchestrator
shlominoach
190
11k
Code Reviewing Like a Champion
maltzj
525
40k
Rails Girls Zürich Keynote
gr2m
95
14k
Stop Working from a Prison Cell
hatefulcrawdad
271
21k
Transcript
Learning Through Failure Rich Burroughs Community Manager Gremlin, Inc. @richburroughs
None
None
Complexity is constantly increasing
None
None
None
What's changed?
None
None
"Catastrophe is always just around the corner"
"Change introduces new forms of failure"
"All practitioner actions are gambles"
None
None
What are some ways we can learn more about systems?
None
None
None
Chaos Engineering
"The science of performing intentional experimentation on a system by
injecting precise and measured amounts of harm to observe how the system responds for the purpose of improving the system’s resilience."
None
Prerequisites —Observability —Blameless Culture
Scientific Method —Ask a question —Research —Form a hypothesis —Experiment
to test the hypothesis —Analyze data and draw a conclusion —Share the results
Types of attacks —Shutdown —CPU —Memory —I/O —Network Latency —Packet
Loss —DNS —Blackhole
None
The goal is to experiment in Production
None
Example experiment —Application: Front End —Attack: CPU —Hypothesis: Adding CPU
load will cause additional hosts to spin up in our Autoscaling Group —Abort condition: Latency increases by 20%
Example experiment #2 —Application: Front End —Attack: Blackhole —Hypothesis: Blackholing
the hostname for the Twilio API will cause the SMS transmissions to time out —Abort condition: Error rate increases by 20%
Don't experiment on things you know are broken
None
Questions —Were we able to measure the results? —Did the
system respond the way we expected? —Are there things we need to fix?
Run experiments to simulate an incident you've had
What comes after Game Days?
Continuous Chaos
Maturity model —Running manual experiments —Running experiments using Chaos Engineering
tools —Regularly scheduled Game Days —Experimenting in Production —Continuous Chaos
Next steps: —Join our Chaos Engineering Slack: gremlin.com/ slack —Read
tutorials: gremlin.com/community —Chaos Conf: chaosconf.io —Gremlin Free: go.gremlin.com/richchaos
Thank you! Twitter: @richburroughs Email:
[email protected]
Slides: https://github.com/richburroughs/ dojo201904