dojo.pdf
Rich Burroughs
April 18, 2019
Transcript
Learning Through Failure
Rich Burroughs, Community Manager, Gremlin, Inc. (@richburroughs)
Complexity is constantly increasing
What's changed?
"Catastrophe is always just around the corner"
"Change introduces new forms of failure"
"All practitioner actions are gambles"
What are some ways we can learn more about systems?
Chaos Engineering
"The science of performing intentional experimentation on a system by injecting precise and measured amounts of harm to observe how the system responds for the purpose of improving the system’s resilience."
Prerequisites
- Observability
- Blameless Culture
Scientific Method
- Ask a question
- Research
- Form a hypothesis
- Experiment to test the hypothesis
- Analyze data and draw a conclusion
- Share the results
Types of attacks
- Shutdown
- CPU
- Memory
- I/O
- Network Latency
- Packet Loss
- DNS
- Blackhole
The goal is to experiment in Production
Example experiment
- Application: Front End
- Attack: CPU
- Hypothesis: Adding CPU load will cause additional hosts to spin up in our Autoscaling Group
- Abort condition: Latency increases by 20%
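The abort condition in this example can be sketched in code. The snippet below is only an illustration, not something from the talk or from any Chaos Engineering tool: the `Experiment` class, `should_abort`, `run`, and the latency numbers are all hypothetical, and it assumes "increases by 20%" means a relative increase over a baseline measured before the attack.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    """Hypothetical record of the fields listed on the slide."""
    application: str
    attack: str
    hypothesis: str
    abort_threshold: float  # relative increase over baseline, e.g. 0.20 for 20%


def should_abort(baseline: float, current: float, threshold: float) -> bool:
    """Return True when `current` exceeds `baseline` by more than `threshold`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (current - baseline) / baseline > threshold


def run(experiment: Experiment, baseline_latency_ms: float, latency_samples_ms: list) -> str:
    """Check each latency sample taken during the attack; stop at the first breach."""
    for i, sample in enumerate(latency_samples_ms):
        if should_abort(baseline_latency_ms, sample, experiment.abort_threshold):
            return f"aborted at sample {i}"
    return "completed"


cpu_experiment = Experiment(
    application="Front End",
    attack="CPU",
    hypothesis="Adding CPU load will cause additional hosts to spin up in our Autoscaling Group",
    abort_threshold=0.20,
)

# Made-up samples: +5%, +18%, then +25% over a 100 ms baseline.
print(run(cpu_experiment, 100.0, [105.0, 118.0, 125.0]))  # prints "aborted at sample 2"
```

In a real experiment the samples would come from whatever monitoring is already in place, which is why observability is listed as a prerequisite.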
Example experiment #2
- Application: Front End
- Attack: Blackhole
- Hypothesis: Blackholing the hostname for the Twilio API will cause the SMS transmissions to time out
- Abort condition: Error rate increases by 20%
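The hypothesis in this second example is about timeout behavior, which can be illustrated without touching a real network. The sketch below is an assumption-laden simulation: `blackholed_send` is a hypothetical stand-in that simply sleeps to mimic a call whose packets are being dropped, and `send_with_timeout` is a made-up wrapper. It does not use the Twilio API or any attack tooling.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def blackholed_send(message: str, hang_seconds: float) -> str:
    """Hypothetical stand-in for an SMS API call: it hangs, then returns."""
    time.sleep(hang_seconds)
    return "sent"


def send_with_timeout(message: str, timeout_seconds: float, hang_seconds: float) -> str:
    """Return 'sent' if the call finishes in time, otherwise 'timed out'."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(blackholed_send, message, hang_seconds)
    try:
        return future.result(timeout=timeout_seconds)
    except FutureTimeout:
        return "timed out"
    finally:
        # Do not block the caller waiting for the hung call to finish.
        pool.shutdown(wait=False)


# Simulated blackhole: the call hangs longer than the client is willing to wait.
result = send_with_timeout("hello", timeout_seconds=0.05, hang_seconds=0.3)
print(result)  # prints "timed out"
print("hypothesis confirmed" if result == "timed out" else "hypothesis refuted")
```

If the client had no timeout configured at all, the same experiment would show requests hanging instead of failing, which is the kind of surprise this experiment is meant to surface.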
Don't experiment on things you know are broken
Questions
- Were we able to measure the results?
- Did the system respond the way we expected?
- Are there things we need to fix?
Run experiments to simulate an incident you've had
What comes after Game Days?
Continuous Chaos
Maturity model
- Running manual experiments
- Running experiments using Chaos Engineering tools
- Regularly scheduled Game Days
- Experimenting in Production
- Continuous Chaos
Next steps:
- Join our Chaos Engineering Slack: gremlin.com/slack
- Read tutorials: gremlin.com/community
- Chaos Conf: chaosconf.io
- Gremlin Free: go.gremlin.com/richchaos
Thank you!
- Twitter: @richburroughs
- Email: [email protected]
- Slides: https://github.com/richburroughs/dojo201904