dojo.pdf
Rich Burroughs
April 18, 2019
Technology
Transcript
Learning Through Failure
Rich Burroughs, Community Manager, Gremlin, Inc.
@richburroughs
Complexity is constantly increasing
What's changed?
"Catastrophe is always just around the corner"
"Change introduces new forms of failure"
"All practitioner actions are gambles"
What are some ways we can learn more about systems?
Chaos Engineering
"The science of performing intentional experimentation on a system by injecting precise and measured amounts of harm to observe how the system responds for the purpose of improving the system's resilience."
Prerequisites
- Observability
- Blameless Culture
Scientific Method
- Ask a question
- Research
- Form a hypothesis
- Experiment to test the hypothesis
- Analyze data and draw a conclusion
- Share the results
Types of attacks
- Shutdown
- CPU
- Memory
- I/O
- Network Latency
- Packet Loss
- DNS
- Blackhole
The goal is to experiment in Production
Example experiment
- Application: Front End
- Attack: CPU
- Hypothesis: Adding CPU load will cause additional hosts to spin up in our Autoscaling Group
- Abort condition: Latency increases by 20%
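One way to picture the abort condition in the experiment above is as a check that runs against every latency sample taken while the attack is active. The following is a minimal sketch, not Gremlin's API: the `Experiment` class, the `should_abort` and `run` functions, and the sample values are all hypothetical, and "increases by 20%" is assumed to mean a relative increase over a baseline measured before the attack.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Experiment:
    """Hypothetical record of the fields shown on the slide."""
    application: str
    attack: str
    hypothesis: str
    max_latency_increase: float  # relative increase; 0.20 means 20% (assumption)


def should_abort(baseline_ms: float, observed_ms: float, max_increase: float) -> bool:
    """Return True when observed latency exceeds the baseline by more than max_increase."""
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    return (observed_ms - baseline_ms) / baseline_ms > max_increase


def run(experiment: Experiment, baseline_ms: float, samples: Iterable[float]) -> str:
    """Walk through latency samples taken during the attack; stop at the first breach."""
    for observed_ms in samples:
        if should_abort(baseline_ms, observed_ms, experiment.max_latency_increase):
            return "aborted"
    return "completed"


cpu_experiment = Experiment(
    application="Front End",
    attack="CPU",
    hypothesis="Adding CPU load will cause additional hosts to spin up "
               "in our Autoscaling Group",
    max_latency_increase=0.20,
)

# Illustrative latency samples in milliseconds, not real measurements.
print(run(cpu_experiment, baseline_ms=100.0, samples=[105.0, 112.0, 118.0]))
print(run(cpu_experiment, baseline_ms=100.0, samples=[105.0, 125.0, 110.0]))
```

The first run stays under the threshold and completes; the second hits a sample 25% above baseline and aborts before the remaining samples are read.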
Example experiment #2
- Application: Front End
- Attack: Blackhole
- Hypothesis: Blackholing the hostname for the Twilio API will cause the SMS transmissions to time out
- Abort condition: Error rate increases by 20%
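An error-rate abort condition like the one above needs a current error rate to compare against a baseline. The sketch below keeps a rolling window of request outcomes and checks it against the threshold. The names `RollingErrorRate` and `breaches_abort_condition` are hypothetical, and the slide does not say whether "20%" is a relative increase or percentage points; this sketch assumes relative, and treats any error at all as a breach when the baseline is zero.

```python
from collections import deque


class RollingErrorRate:
    """Error rate over the most recent `window` request outcomes."""

    def __init__(self, window: int) -> None:
        # deque with maxlen drops the oldest outcome once the window is full
        self._outcomes = deque(maxlen=window)

    def record(self, ok: bool) -> None:
        self._outcomes.append(ok)

    def rate(self) -> float:
        if not self._outcomes:
            return 0.0
        failures = sum(1 for ok in self._outcomes if not ok)
        return failures / len(self._outcomes)


def breaches_abort_condition(
    baseline_rate: float,
    current_rate: float,
    max_relative_increase: float = 0.20,
) -> bool:
    """Assumes the threshold is a relative increase over the baseline rate."""
    if baseline_rate == 0:
        # Assumption: with a zero baseline, any error counts as a breach.
        return current_rate > 0
    return (current_rate - baseline_rate) / baseline_rate > max_relative_increase


# Illustrative outcomes: 8 successes followed by 2 failures.
monitor = RollingErrorRate(window=10)
for ok in [True] * 8 + [False] * 2:
    monitor.record(ok)

print(monitor.rate())
print(breaches_abort_condition(baseline_rate=0.10, current_rate=monitor.rate()))
```

A rolling window is one design choice among several; it reacts to recent failures without being diluted by traffic from before the attack started.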
Don't experiment on things you know are broken
Questions
- Were we able to measure the results?
- Did the system respond the way we expected?
- Are there things we need to fix?
Run experiments to simulate an incident you've had
What comes after Game Days?
Continuous Chaos
Maturity model
- Running manual experiments
- Running experiments using Chaos Engineering tools
- Regularly scheduled Game Days
- Experimenting in Production
- Continuous Chaos
Next steps:
- Join our Chaos Engineering Slack: gremlin.com/slack
- Read tutorials: gremlin.com/community
- Chaos Conf: chaosconf.io
- Gremlin Free: go.gremlin.com/richchaos
Thank you!
Twitter: @richburroughs
Email: [email protected]
Slides: https://github.com/richburroughs/dojo201904