Yury Niño - Chaos Engineering: Building Immunity in Distributed Systems

DevOps Days GDL 2020 - February 20th

DevOpsDays GDL

February 21, 2020

Transcript

  1. Chaos Engineering: Building Immunity in Distributed Systems. DevOpsDays Guadalajara, México, February 20th
  2. Yury Niño, DevOps Engineer, Chaos Engineering Advocate, @yurynino

  3. Agenda: Why do some survive pandemics? What is the immune system? Artificial immune software systems. Injecting chaos to build immunity. Chaos Engineering: what, why, who and how? Chaos example.
  4. None
  5. Why do some survive pandemics?

  6. Studies suggest that viral factors and medicines most likely contributed to reducing the number of deaths; however, survival is most likely associated with the host's immune system.
  7. The immune system is a complex adaptive network of cells and proteins that defends the body against infection. The immune system keeps a record of every germ it has ever defeated so it can recognise and destroy it. Vaccines!
  8. Why are you talking about this at a software conference?

  9. None
  10. What is an Artificial Immune Software System?

  11. An artificial immune system is an intelligent system that learns to recognize relevant patterns that have been seen previously. Using computational techniques, these systems are able to construct pattern detectors and defend systems against similar attacks.
  12. An artificial immune system is an intelligent system that recognizes and learns from faults injected previously. Using computational techniques, these systems are able to use resilience patterns to build confidence in the system's capability to withstand turbulence.
  13. The world is chaotic! Face it with resilience! Circuit Breaker Pattern, Bulkhead Pattern, Compensating Transactions, Health Endpoint Monitoring
  14. Chaos Engineering: What, Why, Who and How

  15. What is Chaos Engineering? It is the discipline of experimenting with failures in production in order to reveal a system's weaknesses and to build confidence in its resilience capability. https://principlesofchaos.org/
  16. What is Chaos Engineering? It is a tool that we use to build immunity in our software systems by injecting harm, like latency, CPU failure, or network black holes, to find and mitigate potential weaknesses. (Gremlin)
  17. Why Chaos Engineering? Because testing on DEV/STG is not enough. Because unpredictable events are bound to happen ON PROD. You need to know the unknown!
  18. Why Chaos Engineering? History of Chaos Engineering: 2008, Chaos Engineering began at Netflix. 2010, Chaos Monkey was launched. 2014, the role of Chaos Engineer was created. 2018, a lot of resources for Chaos Engineering. Kolton Andrus
  19. Who is a Chaos Engineer? What my mom thinks I do. What my friends think I do. What software engineers think I do. What I really do: help service owners to increase their resilience through education, tools and encouragement.
  20. Conclusion: Who is doing Chaos Engineering?

  21. How Chaos Engineering? Applying Chaos Principles: Hypothesize about Steady State, Run Experiments, Vary Real-World Events, Automate Experiments
  22. Chaos Days are dedicated days for your entire company to focus on building resilience instead of new products.
  23. How Chaos Engineering? Running Gamedays! The Master of Disaster (MoD) decides the failure and declares the start of the incident and the attack. The first on-call member sees, triages, and tries to mitigate whatever failure the MoD has caused. The team should find and solve the issue in less than 75% of the allocated time. Finally, they write up a postmortem. Inspired by James Burns’s work.
  24. Demo Time!

  25. Chaos Example

  26. Chaos Example

  27. Chaos Example

  28. Configuration

  29. Configuration

  30. Configuration

  31. Observability

  32. That is why I work for a Lab! • We practice Engineering • We practice Science • We practice Methods • We practice Chaos Engineering
  33. How to start with Chaos Engineering?

  34. How to start? https://chaosengineering.slack.com https://github.com/dastergon/awesome-chaos-engineering https://www.infoq.com/chaos-engineering

  35. None
  36. Screws fall out all the time!!! The world is an imperfect place. We cannot control the environment! But we can control how we face viruses, bacteria and failures. Mine :)
  37. Thanks for coming!!! @yurynino
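
As a rough illustration of slides 13, 16 and 21, here is a minimal Python sketch of a chaos experiment run against a simulated dependency. It is not taken from the talk or its demo. The names `CircuitBreaker`, `inject_faults`, `success_rate` and `run_experiment` are hypothetical, and injected latency is modelled as a `TimeoutError` rather than a real delay. The steady-state hypothesis assumed here is that a caller behind a circuit breaker always receives an answer, fresh or cached, even while faults are being injected.

```python
import random
import time


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures, then serves the fallback."""

    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to wait before a trial call
        self.failures = 0
        self.opened_at = None

    def call(self, func, fallback):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                return fallback()  # open: do not touch the dependency
            self.opened_at = None  # half-open: let one trial call through
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            return fallback()
        self.failures = 0
        return result


def inject_faults(func, failure_rate, rng):
    """Wrap `func` so that roughly `failure_rate` of calls raise TimeoutError."""
    def wrapper():
        if rng.random() < failure_rate:
            raise TimeoutError("injected fault")
        return func()
    return wrapper


def success_rate(call, requests):
    """Fraction of `requests` calls that return without raising."""
    ok = 0
    for _ in range(requests):
        try:
            call()
            ok += 1
        except Exception:
            pass
    return ok / requests


def run_experiment(requests=100, failure_rate=0.3, seed=42):
    """Compare an unprotected caller with one behind a circuit breaker."""
    rng = random.Random(seed)
    dependency = inject_faults(lambda: "fresh", failure_rate, rng)
    breaker = CircuitBreaker()
    return {
        "unprotected": success_rate(dependency, requests),
        "protected": success_rate(
            lambda: breaker.call(dependency, fallback=lambda: "cached"), requests
        ),
    }


if __name__ == "__main__":
    rates = run_experiment()
    # Steady-state hypothesis: protected callers always receive an answer.
    print(rates, "hypothesis holds:", rates["protected"] == 1.0)
```

Raising `failure_rate` plays the role of varying real-world events, and calling `run_experiment` from a scheduled job would be one simple way to automate the experiment. A real experiment would target an actual service and measure its own steady-state metric rather than this toy success rate.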