Slide 1

Slide 1 text

Chaos Engineering: Building Immunity in Distributed Systems. DevOpsDays Guadalajara, México, February 20th

Slide 2

Slide 2 text

YURY NIÑO DevOps Engineer Chaos Engineering Advocate @yurynino

Slide 3

Slide 3 text

Agenda: Why do some survive pandemics? What is the Immune System? Artificial Immune Software Systems. Injecting Chaos to build Immunity. Chaos Engineering: What, Why, Who and How? Chaos Example.

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

Why do some survive pandemics?

Slide 6

Slide 6 text

Studies suggest that viral factors and medicines most likely contributed to reducing the number of deaths; however, survival is most likely associated with the host's immune system.

Slide 7

Slide 7 text

The immune system is a complex adaptive network of cells and proteins that defends the body against infection. The immune system keeps a record of every germ it has ever defeated so it can recognise and destroy that germ again. Vaccines!

Slide 8

Slide 8 text

Why are you talking about this at a software conference?

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

What is an Artificial Immune Software System?

Slide 11

Slide 11 text

An artificial immune system is an intelligent system that learns to recognize relevant patterns that have been seen previously. Using computational techniques, these systems are able to construct pattern detectors and defend the system against similar attacks.

Slide 12

Slide 12 text

An artificial immune system is an intelligent system that recognizes and learns from previously injected faults. Using computational techniques, these systems are able to use resilience patterns to build confidence in the system's capability to withstand turbulence.

Slide 13

Slide 13 text

The world is chaotic! Face it with resilience! ● Circuit Breaker Pattern ● Bulkhead Pattern ● Compensating Transactions ● Health Endpoint Monitoring
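
To make the first pattern on this slide concrete, here is a minimal sketch of a circuit breaker in Python. It is not taken from the talk: the class name, the `max_failures` and `reset_after` parameters, and the injectable `clock` are all assumptions chosen for illustration. The idea is that after a few consecutive failures the breaker "opens" and fails fast, then allows a single trial call once a cool-down has passed.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker rejects a call without trying it."""


class CircuitBreaker:
    """Illustrative circuit breaker; names and defaults are hypothetical."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds
        self.clock = clock                # injectable so it can be tested
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Open: fail fast instead of hitting the broken dependency.
                raise CircuitOpenError("circuit open, failing fast")
            # Cool-down elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each call to a fragile dependency as `breaker.call(fetch, url)` and treat `CircuitOpenError` as a signal to use a fallback. Production libraries add thread safety and metrics, which this sketch leaves out.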

Slide 14

Slide 14 text

Chaos Engineering: What, Why, Who and How

Slide 15

Slide 15 text

What is Chaos Engineering? It is the discipline of experimenting with failures in production systems in order to reveal their weaknesses and to build confidence in their resilience. https://principlesofchaos.org/

Slide 16

Slide 16 text

What is Chaos Engineering? It is a tool that we use to build immunity in our software systems by injecting harm, like latency, CPU failure, or network black holes, to find and mitigate potential weaknesses. (Gremlin)
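
As a rough illustration of "injecting harm" at the application level, the sketch below adds latency to a fraction of calls. It is not Gremlin's API or the speaker's tooling; `inject_latency`, its parameters, and the stubbed `fetch_profile` function are hypothetical names made up for this example.

```python
import random
import time
from functools import wraps


def inject_latency(probability, delay_seconds, rng=random.random, sleep=time.sleep):
    """Decorator that delays a fraction of calls to simulate a slow dependency.

    `rng` and `sleep` are injectable so the behaviour can be tested
    deterministically; both are assumptions of this sketch.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Delay roughly `probability` of all calls, then run the real code.
            if rng() < probability:
                sleep(delay_seconds)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@inject_latency(probability=0.1, delay_seconds=0.5)
def fetch_profile(user_id):
    # Hypothetical downstream call, stubbed for illustration.
    return {"id": user_id, "name": "demo"}
```

Starting with a small `probability` keeps the blast radius low while you watch how callers, timeouts, and retries react to the slow responses.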

Slide 17

Slide 17 text

Why Chaos Engineering? Because testing on DEV/STG is not enough. Because unpredictable events are bound to happen ON PROD. You need to know the unknown!

Slide 18

Slide 18 text

Why Chaos Engineering? History of Chaos Engineering: 2008 Chaos Engineering began at Netflix. 2010 Chaos Monkey was launched. 2014 Role of Chaos Engineer was created. 2018 A lot of resources for Chaos Engineering. Kolton Andrus

Slide 19

Slide 19 text

Who is a Chaos Engineer? What my mom thinks I do. What my friends think I do. What software engineers think I do. What I really do. Help service owners to increase their resilience through education, tools and encouragement.

Slide 20

Slide 20 text

Conclusion Who is doing Chaos Engineering?

Slide 21

Slide 21 text

How Chaos Engineering? Applying Chaos Principles: ● Hypothesize about Steady State ● Run Experiments ● Vary Real-World Events ● Automate Experiments
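
The principles on this slide can be sketched as a small experiment runner: check the steady-state hypothesis, inject a fault, check the hypothesis again, and always roll back. This is an assumption-laden toy rather than the speaker's demo: `run_experiment`, `ExperimentResult`, and the in-memory `system` dictionary are hypothetical stand-ins for real probes and real fault injection.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ExperimentResult:
    steady_before: bool  # hypothesis held before the fault
    steady_during: bool  # hypothesis held while the fault was active
    ran_fault: bool      # whether the fault was injected at all


def run_experiment(
    steady_state: Callable[[], bool],
    inject_fault: Callable[[], None],
    rollback: Callable[[], None],
) -> ExperimentResult:
    # Never inject chaos into a system that is already unhealthy.
    if not steady_state():
        return ExperimentResult(False, False, False)
    try:
        inject_fault()           # vary a real-world event
        during = steady_state()  # does the hypothesis still hold?
    finally:
        rollback()               # always undo the fault
    return ExperimentResult(True, during, True)


# Toy system: the hypothesis is "at least 2 replicas are serving traffic".
system = {"replicas": 3}
result = run_experiment(
    steady_state=lambda: system["replicas"] >= 2,
    inject_fault=lambda: system.update(replicas=system["replicas"] - 1),
    rollback=lambda: system.update(replicas=3),
)
```

Because the runner is an ordinary function, it can be called from a scheduler or CI job, which is one simple way to approach the "Automate Experiments" principle.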

Slide 22

Slide 22 text

Chaos Days are dedicated days for your entire company to focus on building resilience instead of new products.

Slide 23

Slide 23 text

How Chaos Engineering? Running Gamedays! Master of Disaster (MoD): decides the failure and declares the start of the incident and attack!!! First On-Call member: sees, triages, and tries to mitigate whatever failure the MoD has caused. Team: will find and solve the issue in less than 75% of the allocated time. Finally, they write up a postmortem! Inspired by James Burns’s work

Slide 24

Slide 24 text

Demo Time!

Slide 25

Slide 25 text

Chaos Example

Slide 26

Slide 26 text

Chaos Example

Slide 27

Slide 27 text

Chaos Example

Slide 28

Slide 28 text

Configuration

Slide 29

Slide 29 text

Configuration

Slide 30

Slide 30 text

Configuration

Slide 31

Slide 31 text

Observability

Slide 32

Slide 32 text

That is why I work for a Lab! ● We practice Engineering ● We practice Science ● We practice Methods ● We practice Chaos Engineering

Slide 33

Slide 33 text

How to start with Chaos Engineering?

Slide 34

Slide 34 text

How to start? https://chaosengineering.slack.com https://github.com/dastergon/awesome-chaos-engineering https://www.infoq.com/chaos-engineering

Slide 35

Slide 35 text

No content

Slide 36

Slide 36 text

Screws fall out all the time!!! The world is an imperfect place. We cannot control the environment! But we can control how we face viruses, bacteria and failures. Mine :)

Slide 37

Slide 37 text

Thanks for coming!!! @yurynino