Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
dojo.pdf
Search
Rich Burroughs
April 18, 2019
Technology
0
130
dojo.pdf
Rich Burroughs
April 18, 2019
Tweet
Share
More Decks by Rich Burroughs
See All by Rich Burroughs
Virtual_Kubernetes_Clusters__Tips_and_Tricks_-_Rejekts.pdf
richburroughs
0
1.2k
What On-Call Does to Us
richburroughs
1
120
Other Decks in Technology
See All in Technology
スマートファクトリーの第一歩 〜AWSマネージドサービスで 実現する予知保全と生成AI活用まで
ganota
2
210
なぜスクラムはこうなったのか?歴史が教えてくれたこと/Shall we explore the roots of Scrum
sanogemaru
5
1.6k
おやつは300円まで!の最適化を模索してみた
techtekt
PRO
0
300
5分でカオスエンジニアリングを分かった気になろう
pandayumi
0
230
新アイテムをどう使っていくか?みんなであーだこーだ言ってみよう / 20250911-rpi-jam-tokyo
akkiesoft
0
210
Automating Web Accessibility Testing with AI Agents
maminami373
0
1.2k
自作JSエンジンに推しプロポーザルを実装したい!
sajikix
1
170
JTCにおける内製×スクラム開発への挑戦〜内製化率95%達成の舞台裏/JTC's challenge of in-house development with Scrum
aeonpeople
0
210
Autonomous Database - Dedicated 技術詳細 / adb-d_technical_detail_jp
oracle4engineer
PRO
4
10k
研究開発と製品開発、両利きのロボティクス
youtalk
1
520
AWSで始める実践Dagster入門
kitagawaz
1
600
生成AIでセキュリティ運用を効率化する話
sakaitakeshi
0
620
Featured
See All Featured
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
139
34k
We Have a Design System, Now What?
morganepeng
53
7.8k
How to Think Like a Performance Engineer
csswizardry
26
1.9k
Testing 201, or: Great Expectations
jmmastey
45
7.7k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
29
2.9k
Measuring & Analyzing Core Web Vitals
bluesmoon
9
580
Speed Design
sergeychernyshev
32
1.1k
Reflections from 52 weeks, 52 projects
jeffersonlam
352
21k
Six Lessons from altMBA
skipperchong
28
4k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
32
1.5k
Faster Mobile Websites
deanohume
309
31k
Building Adaptive Systems
keathley
43
2.7k
Transcript
Learning Through Failure Rich Burroughs Community Manager Gremlin, Inc. @richburroughs
None
None
Complexity is constantly increasing
None
None
None
What's changed?
None
None
"Catastrophe is always just around the corner"
"Change introduces new forms of failure"
"All practitioner actions are gambles"
None
None
What are some ways we can learn more about systems?
None
None
None
Chaos Engineering
"The science of performing intentional experimentation on a system by
injecting precise and measured amounts of harm to observe how the system responds for the purpose of improving the system’s resilience."
None
Prerequisites —Observability —Blameless Culture
Scientific Method —Ask a question —Research —Form a hypothesis —Experiment
to test the hypothesis —Analyze data and draw a conclusion —Share the results
Types of attacks —Shutdown —CPU —Memory —I/O —Network Latency —Packet
Loss —DNS —Blackhole
None
The goal is to experiment in Production
None
Example experiment —Application: Front End —Attack: CPU —Hypothesis: Adding CPU
load will cause additional hosts to spin up in our Autoscaling Group —Abort condition: Latency increases by 20%
Example experiment #2 —Application: Front End —Attack: Blackhole —Hypothesis: Blackholing
the hostname for the Twilio API will cause the SMS transmissions to time out —Abort condition: Error rate increases by 20%
Don't experiment on things you know are broken
None
Questions —Were we able to measure the results? —Did the
system respond the way we expected? —Are there things we need to fix?
Run experiments to simulate an incident you've had
What comes after Game Days?
Continuous Chaos
Maturity model —Running manual experiments —Running experiments using Chaos Engineering
tools —Regularly scheduled Game Days —Experimenting in Production —Continuous Chaos
Next steps: —Join our Chaos Engineering Slack: gremlin.com/ slack —Read
tutorials: gremlin.com/community —Chaos Conf: chaosconf.io —Gremlin Free: go.gremlin.com/richchaos
Thank you! Twitter: @richburroughs Email:
[email protected]
Slides: https://github.com/richburroughs/ dojo201904