Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
dojo.pdf
Search
Rich Burroughs
April 18, 2019
Technology
0
130
dojo.pdf
Rich Burroughs
April 18, 2019
Tweet
Share
More Decks by Rich Burroughs
See All by Rich Burroughs
Virtual_Kubernetes_Clusters__Tips_and_Tricks_-_Rejekts.pdf
richburroughs
0
1.2k
What On-Call Does to Us
richburroughs
1
120
Other Decks in Technology
See All in Technology
E2Eテスト設計_自動化のリアル___Playwrightでの実践とMCPの試み__AIによるテスト観点作成_.pdf
findy_eventslides
1
460
生成AIを活用したZennの取り組み事例
ryosukeigarashi
0
210
社内報はAIにやらせよう / Let AI handle the company newsletter
saka2jp
5
610
関係性が駆動するアジャイル──GPTに人格を与えたら、対話を通してふりかえりを習慣化できた話
mhlyc
0
130
SOC2取得の全体像
shonansurvivors
1
400
LLM時代にデータエンジニアの役割はどう変わるか?
ikkimiyazaki
4
780
Large Vision Language Modelを用いた 文書画像データ化作業自動化の検証、運用 / shibuya_AI
sansan_randd
0
110
定期的な価値提供だけじゃない、スクラムが導くチームの共創化 / 20251004 Naoki Takahashi
shift_evolve
PRO
3
330
about #74462 go/token#FileSet
tomtwinkle
1
410
AI駆動開発を推進するためにサービス開発チームで 取り組んでいること
noayaoshiro
0
200
AI Agentと MCP Serverで実現する iOSアプリの 自動テスト作成の効率化
spiderplus_cb
0
510
成長自己責任時代のあるきかた/How to navigate the era of personal responsibility for growth
kwappa
3
280
Featured
See All Featured
Designing Dashboards & Data Visualisations in Web Apps
destraynor
231
53k
We Have a Design System, Now What?
morganepeng
53
7.8k
Scaling GitHub
holman
463
140k
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
9
960
The Cult of Friendly URLs
andyhume
79
6.6k
The Straight Up "How To Draw Better" Workshop
denniskardys
237
140k
jQuery: Nuts, Bolts and Bling
dougneiner
64
7.9k
GitHub's CSS Performance
jonrohan
1032
460k
Large-scale JavaScript Application Architecture
addyosmani
514
110k
For a Future-Friendly Web
brad_frost
180
9.9k
Rebuilding a faster, lazier Slack
samanthasiow
84
9.2k
Documentation Writing (for coders)
carmenintech
75
5k
Transcript
Learning Through Failure Rich Burroughs Community Manager Gremlin, Inc. @richburroughs
None
None
Complexity is constantly increasing
None
None
None
What's changed?
None
None
"Catastrophe is always just around the corner"
"Change introduces new forms of failure"
"All practitioner actions are gambles"
None
None
What are some ways we can learn more about systems?
None
None
None
Chaos Engineering
"The science of performing intentional experimentation on a system by
injecting precise and measured amounts of harm to observe how the system responds for the purpose of improving the system’s resilience."
None
Prerequisites —Observability —Blameless Culture
Scientific Method —Ask a question —Research —Form a hypothesis —Experiment
to test the hypothesis —Analyze data and draw a conclusion —Share the results
Types of attacks —Shutdown —CPU —Memory —I/O —Network Latency —Packet
Loss —DNS —Blackhole
None
The goal is to experiment in Production
None
Example experiment —Application: Front End —Attack: CPU —Hypothesis: Adding CPU
load will cause additional hosts to spin up in our Autoscaling Group —Abort condition: Latency increases by 20%
Example experiment #2 —Application: Front End —Attack: Blackhole —Hypothesis: Blackholing
the hostname for the Twilio API will cause the SMS transmissions to time out —Abort condition: Error rate increases by 20%
Don't experiment on things you know are broken
None
Questions —Were we able to measure the results? —Did the
system respond the way we expected? —Are there things we need to fix?
Run experiments to simulate an incident you've had
What comes after Game Days?
Continuous Chaos
Maturity model —Running manual experiments —Running experiments using Chaos Engineering
tools —Regularly scheduled Game Days —Experimenting in Production —Continuous Chaos
Next steps: —Join our Chaos Engineering Slack: gremlin.com/ slack —Read
tutorials: gremlin.com/community —Chaos Conf: chaosconf.io —Gremlin Free: go.gremlin.com/richchaos
Thank you! Twitter: @richburroughs Email:
[email protected]
Slides: https://github.com/richburroughs/ dojo201904