
Easy Recipes for Building Resilience with Chaos Engineering

Yury Nino
August 22, 2020

Transcript

  1. Easy Recipes for Building Resilience with Chaos Engineering. Ekoparty, August 22
  2. YURY NIÑO ROA Site Reliability Engineer Chaos Engineering Advocate @yurynino

    https://www.yurynino.dev/
  3. Agenda: • Why are you talking about cooking? • Cooking & Science & Chaos. • Scientific method. • Cooking chaos recipes. • Ingredients: Cloud, Chaos Tools & Observability. • Learn for the next dinner!
  4. Why are you talking about cooking?

  5. Chaos experimenting and cooking are both a combination of art and science! Chaos Engineering and cooking have many things in common ...
  6. How many of you consider yourselves cooks?

  7. Cook. Taken from https://medium.com/shyp-design/a-designer-s-scientific-method-12671b41efb7

  8. How many of you consider yourselves scientists?

  9. How many of you have ever done an experiment?

  10. Science

  11. Engineers without a PhD can cook and also be Scientists!

  12. Really ... Are you kidding me?

  13. A Cookbook is a hands-on guide to exploring a technology!

    A Cookbook is a guide for learning how to practice Chaos Engineering using recipes.
  14. Apply the scientific method to incident response! How? Through ...

  15. The infrastructure required by a software system can be as complex as the software itself. We need a hands-on guide to exploring the world of Chaos! Netflix, Twitter
  16. What is Chaos Engineering?

  17. Chaos Engineering is the discipline of experimenting with failures in production in order to reveal a system's weaknesses and to build confidence in its resilience. https://principlesofchaos.org/
  18. Why is Science important in Engineering?

  19. Chaos Engineering is a scientific method that consists of specifying and evaluating resilience hypotheses by 1) injecting faults in production, 2) observing the impact, and 3) building resilience. (Long Zhang, A Chaos Engineering System)
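The three steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the service is simulated with a random number generator, and the names `call_service`, `success_ratio`, `run_experiment`, the 20% fault rate, and the 99% threshold are all hypothetical, not taken from the talk or from any chaos tool.

```python
import random


def call_service(fault_rate: float, rng: random.Random) -> bool:
    """Simulated request (hypothetical stand-in for a real call).

    Returns True on success; fails with probability `fault_rate`.
    """
    return rng.random() >= fault_rate


def success_ratio(fault_rate: float, retries: int, requests: int,
                  rng: random.Random) -> float:
    """Fraction of requests that succeed, allowing `retries` extra attempts."""
    ok = 0
    for _ in range(requests):
        # A request succeeds if any of its attempts succeeds.
        if any(call_service(fault_rate, rng) for _ in range(retries + 1)):
            ok += 1
    return ok / requests


def run_experiment(fault_rate: float, retries: int, threshold: float = 0.99,
                   requests: int = 2000, seed: int = 42) -> dict:
    """Hypothesis: the success ratio stays >= `threshold` while faults are injected."""
    rng = random.Random(seed)
    baseline = success_ratio(0.0, retries, requests, rng)            # steady state
    under_fault = success_ratio(fault_rate, retries, requests, rng)  # 1) inject, 2) observe
    return {
        "baseline": baseline,
        "under_fault": under_fault,
        "hypothesis_holds": under_fault >= threshold,
    }


if __name__ == "__main__":
    print(run_experiment(fault_rate=0.2, retries=0))  # no resilience mechanism
    print(run_experiment(fault_rate=0.2, retries=3))  # 3) resilience built via retries
```

Running the experiment first without retries and then with them mirrors the loop on the slide: a rejected hypothesis points at a weakness, and the fix is verified by running the same experiment again.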
  20. Principles: • Hypothesize about Steady State • Run Experiments • Vary Real-World Events • Automate Experiments
  21. History: 2008 Chaos Engineering was born at Netflix. 2010 Chaos Monkey & Simian Army were launched. 2016 Gremlin was born. 2019 Chaos Massification 2017 SRE USENIX Chaos IQ ChaosConf 2018 Book Chaos Eng 2020 Book Chaos Eng
  22. Chaos Maturity Model From Chaos Engineering Book 2020

  23. To Cook

  24. To Cook: 1. Pick a Hypothesis: Recipe! 2. Choose the tools: Ingredients! 3. Launch an attack: Cook! 4. Notify the Org: Invite! 5. Run the Experiment: Enjoy! 6. Analyze the Results. 7. Automate.
  25. Ingredients: Chaos Monkey, Chaos Toolkit, Gremlin, Chaos Mesh

  26. Recipes with Chaos

  27. Recipe 1. Tools/Ingredients: Gremlin, AWS. Hypothesis: Cloud can fail :O. Environment: My Home. Duration: 2 minutes. Load: 1 request. Observability: AWS Console. Results: ???
  28. Recipe 2. Tools/Ingredients: Gremlin, Local. Hypothesis: Local can fail :O. Environment: My Home. Duration: 2 minutes. Load: 1 request. Observability: Local Console. Results: ??? https://www.youtube.com/watch?v=PcwdZB_blLc
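The recipe cards above share the same fields, so they can be kept as structured data and reviewed or versioned like code. The sketch below is one possible shape; the `Recipe` class and its field names are hypothetical, and only the values come from the Recipe 1 slide. The result is left empty because the slides do not state it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Recipe:
    """A chaos experiment written down as a recipe card (hypothetical structure)."""
    name: str
    ingredients: List[str]          # tools used in the experiment
    hypothesis: str
    environment: str
    duration_minutes: int
    load_requests: int
    observability: str
    results: Optional[str] = None   # unknown until the experiment has run

    def card(self) -> str:
        """Render the recipe as a plain-text card."""
        return "\n".join([
            f"Recipe: {self.name}",
            f"Tools/Ingredients: {', '.join(self.ingredients)}",
            f"Hypothesis: {self.hypothesis}",
            f"Environment: {self.environment}",
            f"Duration: {self.duration_minutes} minutes",
            f"Load: {self.load_requests} request(s)",
            f"Observability: {self.observability}",
            f"Results: {self.results if self.results is not None else '???'}",
        ])


# Values copied from the Recipe 1 slide.
recipe_1 = Recipe(
    name="Recipe 1",
    ingredients=["Gremlin", "AWS"],
    hypothesis="Cloud can fail",
    environment="My Home",
    duration_minutes=2,
    load_requests=1,
    observability="AWS Console",
)

if __name__ == "__main__":
    print(recipe_1.card())
```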
  29. More Recipes: • Introduce latency on security controls. • Disable service event logging. • API gateway shutdown. • Unencrypted S3 Bucket. • Disable MFA. • Permission collision in a shared IAM role policy.
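As an illustration of the first recipe in this list, latency on a security control, here is a minimal sketch that uses only the Python standard library. Everything in it is an assumption made for the example: `check_token` is a hypothetical security control, `with_latency` is a hypothetical fault injector, and the hypothesis under test is that the caller fails closed (denies access) when the control is too slow.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def check_token(token: str) -> bool:
    """Hypothetical security control: accepts a single known token."""
    return token == "valid-token"


def with_latency(func, delay_seconds: float):
    """Fault injection: wrap `func` so every call is delayed by `delay_seconds`."""
    def wrapper(*args, **kwargs):
        time.sleep(delay_seconds)
        return func(*args, **kwargs)
    return wrapper


def authorize(token: str, control, timeout_seconds: float) -> bool:
    """Ask the control for a decision; deny if it does not answer in time."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(control, token)
    try:
        return future.result(timeout=timeout_seconds)
    except FutureTimeout:
        return False  # fail closed: a slow control must never grant access
    finally:
        pool.shutdown(wait=False)  # do not block on the delayed call


if __name__ == "__main__":
    slow_control = with_latency(check_token, delay_seconds=0.5)
    print("healthy control:", authorize("valid-token", check_token, 0.2))
    print("slow control:   ", authorize("valid-token", slow_control, 0.1))
```

A real experiment would inject the delay with a chaos tool at the network or process level rather than in application code; the wrapper here only keeps the example self-contained.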
  30. Chaos Scenarios

  31. Who is practicing Chaos Engineering?

  33. Disasterpiece Theater: Whenever we launch features or make changes, we test the fault tolerance of that new code! In January of 2018, we started a rigorous process of identifying failures that are likely to happen and that we must be able to tolerate, and then purposely causing them to happen in production. This isn’t Chaos Engineering as practiced and evangelized by Netflix. It’s the first step; we call it Disasterpiece Theater. Taken from Chaos Engineering Book 2020
  34. LinkedOut Taken from Chaos Engineering Book 2020

  35. Evolution: CI/CD, Tooling, Culture, Evangelism, Team. Taken from Chaos Engineering Book 2020
  36. Security Chaos Engineering is the identification of security control failures through proactive experimentation to build confidence in the system’s ability to defend against malicious conditions in production. (Security Chaos Engineering Book)
  37. How to Cook: https://www.gremlin.com https://chaosengineering.slack.com https://github.com/dastergon/awesome-chaos-engineering https://www.infoq.com/chaos-engineering

  38. There is an ancient proverb that says: It's very difficult to find a black cat in a dark room, especially when there is no cat!
  39. Thanks for coming!!! @yurynino