Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
dojo.pdf
Search
Rich Burroughs
April 18, 2019
Technology
0
130
dojo.pdf
Rich Burroughs
April 18, 2019
Tweet
Share
More Decks by Rich Burroughs
See All by Rich Burroughs
Virtual_Kubernetes_Clusters__Tips_and_Tricks_-_Rejekts.pdf
richburroughs
0
1.1k
What On-Call Does to Us
richburroughs
1
110
Other Decks in Technology
See All in Technology
KubeCon NA 2024 Recap: How to Move from Ingress to Gateway API with Minimal Hassle
ysakotch
0
200
Amazon SageMaker Unified Studio(Preview)、Lakehouse と Amazon S3 Tables
ishikawa_satoru
0
150
あの日俺達が夢見たサーバレスアーキテクチャ/the-serverless-architecture-we-dreamed-of
tomoki10
0
430
マイクロサービスにおける容易なトランザクション管理に向けて
scalar
0
110
Snowflake女子会#3 Snowpipeの良さを5分で語るよ
lana2548
0
230
日本版とグローバル版のモバイルアプリ統合の開発の裏側と今後の展望
miichan
1
130
サーバレスアプリ開発者向けアップデートをキャッチアップしてきた #AWSreInvent #regrowth_fuk
drumnistnakano
0
190
.NET 9 のパフォーマンス改善
nenonaninu
0
800
フロントエンド設計にモブ設計を導入してみた / 20241212_cloudsign_TechFrontMeetup
bengo4com
0
1.9k
ずっと昔に Star をつけたはずの思い出せない GitHub リポジトリを見つけたい!
rokuosan
0
150
統計データで2024年の クラウド・インフラ動向を眺める
ysknsid25
2
840
Amazon Kendra GenAI Index 登場でどう変わる? 評価から学ぶ最適なRAG構成
naoki_0531
0
100
Featured
See All Featured
Scaling GitHub
holman
458
140k
Done Done
chrislema
181
16k
Creating an realtime collaboration tool: Agile Flush - .NET Oxford
marcduiker
26
1.9k
Building a Scalable Design System with Sketch
lauravandoore
460
33k
A Tale of Four Properties
chriscoyier
157
23k
Raft: Consensus for Rubyists
vanstee
137
6.7k
Visualization
eitanlees
146
15k
Fantastic passwords and where to find them - at NoRuKo
philnash
50
2.9k
A designer walks into a library…
pauljervisheath
204
24k
Understanding Cognitive Biases in Performance Measurement
bluesmoon
26
1.5k
YesSQL, Process and Tooling at Scale
rocio
169
14k
Product Roadmaps are Hard
iamctodd
PRO
49
11k
Transcript
Learning Through Failure Rich Burroughs Community Manager Gremlin, Inc. @richburroughs
None
None
Complexity is constantly increasing
None
None
None
What's changed?
None
None
"Catastrophe is always just around the corner"
"Change introduces new forms of failure"
"All practitioner actions are gambles"
None
None
What are some ways we can learn more about systems?
None
None
None
Chaos Engineering
"The science of performing intentional experimentation on a system by
injecting precise and measured amounts of harm to observe how the system responds for the purpose of improving the system’s resilience."
None
Prerequisites —Observability —Blameless Culture
Scientific Method —Ask a question —Research —Form a hypothesis —Experiment
to test the hypothesis —Analyze data and draw a conclusion —Share the results
Types of attacks —Shutdown —CPU —Memory —I/O —Network Latency —Packet
Loss —DNS —Blackhole
None
The goal is to experiment in Production
None
Example experiment —Application: Front End —Attack: CPU —Hypothesis: Adding CPU
load will cause additional hosts to spin up in our Autoscaling Group —Abort condition: Latency increases by 20%
Example experiment #2 —Application: Front End —Attack: Blackhole —Hypothesis: Blackholing
the hostname for the Twilio API will cause the SMS transmissions to time out —Abort condition: Error rate increases by 20%
Don't experiment on things you know are broken
None
Questions —Were we able to measure the results? —Did the
system respond the way we expected? —Are there things we need to fix?
Run experiments to simulate an incident you've had
What comes after Game Days?
Continuous Chaos
Maturity model —Running manual experiments —Running experiments using Chaos Engineering
tools —Regularly scheduled Game Days —Experimenting in Production —Continuous Chaos
Next steps: —Join our Chaos Engineering Slack: gremlin.com/ slack —Read
tutorials: gremlin.com/community —Chaos Conf: chaosconf.io —Gremlin Free: go.gremlin.com/richchaos
Thank you! Twitter: @richburroughs Email:
[email protected]
Slides: https://github.com/richburroughs/ dojo201904