Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
dojo.pdf
Search
Rich Burroughs
April 18, 2019
Technology
0
130
dojo.pdf
Rich Burroughs
April 18, 2019
Tweet
Share
More Decks by Rich Burroughs
See All by Rich Burroughs
Virtual_Kubernetes_Clusters__Tips_and_Tricks_-_Rejekts.pdf
richburroughs
0
1.1k
What On-Call Does to Us
richburroughs
1
110
Other Decks in Technology
See All in Technology
【若手エンジニア応援LT会】ソフトウェアを学んできた私がインフラエンジニアを目指した理由
kazushi_ohata
0
150
OCI Network Firewall 概要
oracle4engineer
PRO
0
4.1k
安心してください、日本語使えますよ―Ubuntu日本語Remix提供休止に寄せて― 2024-11-17
nobutomurata
1
990
AGIについてChatGPTに聞いてみた
blueb
0
130
Lambda10周年!Lambdaは何をもたらしたか
smt7174
2
110
スクラムチームを立ち上げる〜チーム開発で得られたもの・得られなかったもの〜
ohnoeight
2
350
ドメイン名の終活について - JPAAWG 7th -
mikit
33
20k
[FOSS4G 2024 Japan LT] LLMを使ってGISデータ解析を自動化したい!
nssv
1
210
dev 補講: プロダクトセキュリティ / Product security overview
wa6sn
1
2.3k
隣接領域をBeyondするFinatextのエンジニア組織設計 / beyond-engineering-areas
stajima
1
270
rootlessコンテナのすゝめ - 研究室サーバーでもできる安全なコンテナ管理
kitsuya0828
3
380
Exadata Database Service on Dedicated Infrastructure(ExaDB-D) UI スクリーン・キャプチャ集
oracle4engineer
PRO
2
3.2k
Featured
See All Featured
The MySQL Ecosystem @ GitHub 2015
samlambert
250
12k
Bash Introduction
62gerente
608
210k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
44
6.8k
Why You Should Never Use an ORM
jnunemaker
PRO
54
9.1k
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
31
2.7k
Side Projects
sachag
452
42k
YesSQL, Process and Tooling at Scale
rocio
169
14k
Product Roadmaps are Hard
iamctodd
PRO
49
11k
Making the Leap to Tech Lead
cromwellryan
133
8.9k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
47
2.1k
The Language of Interfaces
destraynor
154
24k
Teambox: Starting and Learning
jrom
133
8.8k
Transcript
Learning Through Failure Rich Burroughs Community Manager Gremlin, Inc. @richburroughs
None
None
Complexity is constantly increasing
None
None
None
What's changed?
None
None
"Catastrophe is always just around the corner"
"Change introduces new forms of failure"
"All practitioner actions are gambles"
None
None
What are some ways we can learn more about systems?
None
None
None
Chaos Engineering
"The science of performing intentional experimentation on a system by
injecting precise and measured amounts of harm to observe how the system responds for the purpose of improving the system’s resilience."
None
Prerequisites —Observability —Blameless Culture
Scientific Method —Ask a question —Research —Form a hypothesis —Experiment
to test the hypothesis —Analyze data and draw a conclusion —Share the results
Types of attacks —Shutdown —CPU —Memory —I/O —Network Latency —Packet
Loss —DNS —Blackhole
None
The goal is to experiment in Production
None
Example experiment —Application: Front End —Attack: CPU —Hypothesis: Adding CPU
load will cause additional hosts to spin up in our Autoscaling Group —Abort condition: Latency increases by 20%
Example experiment #2 —Application: Front End —Attack: Blackhole —Hypothesis: Blackholing
the hostname for the Twilio API will cause the SMS transmissions to time out —Abort condition: Error rate increases by 20%
Don't experiment on things you know are broken
None
Questions —Were we able to measure the results? —Did the
system respond the way we expected? —Are there things we need to fix?
Run experiments to simulate an incident you've had
What comes after Game Days?
Continuous Chaos
Maturity model —Running manual experiments —Running experiments using Chaos Engineering
tools —Regularly scheduled Game Days —Experimenting in Production —Continuous Chaos
Next steps: —Join our Chaos Engineering Slack: gremlin.com/ slack —Read
tutorials: gremlin.com/community —Chaos Conf: chaosconf.io —Gremlin Free: go.gremlin.com/richchaos
Thank you! Twitter: @richburroughs Email:
[email protected]
Slides: https://github.com/richburroughs/ dojo201904