Chaos Engineering in Containers

Yury Nino
September 11, 2019

Transcript

  1. Yury Niño Roa Software Engineer | DevOps Engineer @yurynino

    Systems Engineer. Software Engineering Specialist. Co-organizer of GDG Bogotá, GDG Cloud Bogotá and Women Techmakers Bogotá. “The best time to learn about fire is when you’re on fire.” (Jen Hammond, New Relic Engineering Manager)
  2. Image credit to Ashley McNamara’s Gopher Repo and Renee French

    Chaos Engineering in Containers Yury Niño Roa
  3. Agenda: Containers & Kubernetes • The Kubernetes Promise • Reliability Engineering • Resilience Engineering • Kubernetes is not perfect! • Chaos Engineering
  4. What is a Container? It is “a lightweight OS-level virtualization method”, a “stand-alone piece of executable software”, and “NOT a virtual machine”. A container running on a machine is simply a process on the host operating system like any other.
  5. SREs are focused on efficiency, automation, and reducing costs, taking manual and repetitive tasks and automating them.
  6. Resilience here is the ability to return to the steady state following a perturbation. Resilience Engineering is about the characteristics of resilient performance per se: how we can recognise it, how we can measure it, and how we can improve it.
  7. A distributed system in production needs to be resilient in order to be reliable, and this is precisely the target that we Software Engineers, Systems Engineers, Site Reliability Engineers and Chaos Engineers always aim for.
  8. Black Swans are events that come as a surprise, have a major effect, and are often inappropriately rationalized. The term is based on an ancient saying that presumed black swans did not exist.
  9. The World is Chaotic! Black swans take our systems down and keep them down for a long time. Laura Nolan, SRE at Slack
  10. Chaos Engineering: It is the discipline of experimenting in production on a distributed system in order to reveal its weaknesses and to build confidence in its resilience capability. https://principlesofchaos.org/
  11. Chaos Engineering: It is deliberately inducing stress or fault into software and/or hardware as a way of learning/verifying things about systems. https://www.gremlin.com
  12. History of Chaos Engineering: 2008 Chaos Engineering began at Netflix. 2010 Chaos Monkey was launched. 2014 Role of Chaos Engineer was created. 2018 A lot of resources for Chaos Engineering. Kolton Andrus
  13. I want to emphasize that both sides of the equation [unit, regression and chaos side] are required to get you the level of availability you want.
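
The definitions on slides 6, 10 and 11 (define a steady state, deliberately induce a fault, then learn whether the system returns to that steady state) can be sketched as a toy experiment. This is a minimal Python sketch under stated assumptions, not a real chaos tool: `ToyService`, `run_experiment`, the replica count and the 0.6 threshold are hypothetical names and values invented for illustration, and the "fault" only flips a flag in memory rather than touching any container or cluster.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyService:
    """Hypothetical stand-in for a replicated service (not a real API)."""
    replicas: int = 3
    healthy: list = field(init=False)

    def __post_init__(self):
        # Every replica starts healthy: this is the steady state.
        self.healthy = [True] * self.replicas

    def success_rate(self) -> float:
        """Steady-state metric: fraction of replicas able to serve traffic."""
        return sum(self.healthy) / self.replicas

    def kill_replica(self, index: int) -> None:
        """Fault injection: mark one replica as failed."""
        self.healthy[index] = False

    def reconcile(self) -> None:
        """Self-healing step: restart at most one failed replica."""
        for i, ok in enumerate(self.healthy):
            if not ok:
                self.healthy[i] = True
                return


def run_experiment(service, threshold=0.6, max_ticks=5, seed=0):
    """Run one experiment: check steady state, inject a fault, watch recovery."""
    rng = random.Random(seed)

    # 1. Only experiment on a system that is already in its steady state.
    if service.success_rate() < 1.0:
        raise RuntimeError("service is not in steady state; aborting")

    # 2. Inject the fault and record the metric while the fault is active.
    victim = rng.randrange(service.replicas)
    service.kill_replica(victim)
    rate_during_fault = service.success_rate()

    # 3. Count how many reconcile steps it takes to return to steady state.
    ticks = 0
    while service.success_rate() < 1.0 and ticks < max_ticks:
        service.reconcile()
        ticks += 1

    return {
        "victim": victim,
        "rate_during_fault": rate_during_fault,
        "hypothesis_held": rate_during_fault >= threshold,
        "ticks_to_recover": ticks,
        "recovered": service.success_rate() == 1.0,
    }


if __name__ == "__main__":
    print(run_experiment(ToyService(replicas=3)))
```

In this sketch the hypothesis is "the success rate stays at or above the threshold while one replica is down". With three replicas it holds, while with two replicas the same fault drops the rate to 0.5 and the hypothesis fails, which is the kind of weakness an experiment is meant to reveal.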