Training Engineering Teams with Chaos Engineering

Yury Nino
September 15, 2020

Transcript

  1. AGENDA: Topics to be covered
    Motivations • Identifying needs • Training Engineers
    How to • Trainer Role • Instruction Principles
    GameDays • Chaos Engineering • Chaos GameDays
    https://www.yurynino.dev/
  2. Humans are central to both the problem and the solution of challenges in engineering!
  3. Training Do’s
    • Be prepared.
    • State your objectives.
    • Be organized.
    • Use visuals.
    • Answer questions.
    • Be enthusiastic.
    • Provide feedback.
    • Be flexible.
    • Prepare for emergencies.
    • Encourage participation.
    • Establish rapport.
    • Be yourself.
  4. Mistakes to avoid
    • Starting late and wasting time.
    • Being poorly prepared and lacking knowledge.
    • Displaying distracting habits.
    • Ignoring participants and interrupting their questions.
    • Lacking enthusiasm.
    • Reading from a script.
  5. How to Ensure Participation
    • Establish an informal atmosphere.
    • Encourage participants to take control.
    • Accept participants where they are.
    • Communicate openly and honestly.
    • Tap participants for ideas.
  6. In an emergency
    • To be able to construct a mental representation.
    • To be able to assess risks and threats as relevant.
    • To be able to switch from a situation under control.
    • To be able to maintain a relevant level of confidence.
    • To be able to make a decision in a complex situation.
  7. In an emergency
    • To be able to make intelligent use of procedures.
    • To be able to use available resources.
    • To be able to manage time and pressure.
    • To be able to cooperate with crew members.
    • To be able to properly use and manage information.
  8. The world is an imperfect place. We cannot control the environment! But we can control how we face failures.
  9. What is Chaos Engineering?
    It is the discipline of experimenting with failures in production in order to reveal a system's weaknesses and to build confidence in its resilience.
    https://principlesofchaos.org/
  10. What is Chaos Engineering?
    It is a scientific method that consists of specifying and evaluating resilience hypotheses by:
    1) injecting faults in production
    2) observing the impact
    3) building resilience
    https://principlesofchaos.org/
  11. Chaos Engineering History
    2008: Chaos Engineering began at Netflix
    2010: Chaos Monkey & Simian Army were launched
    2016: Gremlin born
    2017: SRE Usenix • Chaos IQ born
    2018: ChaosConf • 1 Book • Chaos Monkey for Spring Boot
    2019: 1 Book • Chaos massification
    2020: 1 Book was published
  12. Chaos Principles
    1. Pick a Hypothesis: Recipe!
    2. Choose the tools: Ingredients!
    3. Launch an attack: Cook!
    4. Notify the Org: Invite!
    5. Run the Experiment: Enjoy!
    6. Analyze the Results
    7. Automate
  13. Chaos Methodology
    Before
    • Pick a hypothesis.
    • Pick a style.
    • Decide who.
    • Decide where.
    • Decide when.
    • Document.
    • Get approval!
    During
    • Detect the situation.
    • Take a deep breath.
    • Communicate.
    • Visit dashboards.
    • Analyze data.
    • Propose solutions.
    • Apply and solve!
    After
    • Write a postmortem: What Happened • Impact • Duration • Resolution Time • Resolution • Timeline • Action Items
  14. Chaos Engineering Motivations
    The infrastructure required by a software system can be as complex as the software itself. We need a hands-on guide to exploring the world of Chaos!
    Netflix • Twitter
  15. Who is practicing: Disasterpiece Theater
    Whenever they launch features or make changes, they test the fault tolerance of that new code! In January of 2018, they started a rigorous process of identifying failures that are likely to happen and that they must be able to tolerate, and then purposely causing them to happen in production. This isn’t Chaos Engineering as practiced and evangelized by Netflix. It’s the first step; they call it Disasterpiece Theater.
  16. Chaos GameDays
    GameDays are interactive, real-world learning exercises. They are designed to give players a chance to put their skills in a technology to the test. GameDays were created by Jesse Robbins, inspired by his experience and training as a firefighter.
  17. Chaos References
    GameDays: GameDays are interactive team-based learning exercises designed to give players a chance to put their skills to the test in a real-world, gamified, risk-free environment.
    Chaos GameDays: A Chaos GameDay is a practice event, and although it can take a whole day, it usually requires only a few hours. The goal of a GameDay is to practice how you, your team, and your supporting systems deal with real-world turbulent conditions.
  18. Chaos GameDays
    First on Call: Monitors, triages, and tries to mitigate failures caused by the Master of Disaster.
    Master of Disaster: Decides the failure and declares the start of the incident and attack!
    Team: Finds and solves the exhibited issues, and writes up the postmortem.
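
The experiment loop described in slides 10 and 12 (state a hypothesis, inject a fault, observe the impact, analyze the result) can be sketched in Python. This is a minimal illustration and not something taken from the deck: `ToyService`, `success_rate`, and `run_experiment` are hypothetical names, the service is an in-memory stand-in for a real replicated system, and the 0.99 threshold is an assumed value.

```python
from dataclasses import dataclass, field


@dataclass
class ToyService:
    """Hypothetical in-memory stand-in for a service with three replicas."""
    # Maps replica name -> healthy flag (assumed layout, for illustration only).
    replicas: dict = field(default_factory=lambda: {"a": True, "b": True, "c": True})

    def handle(self, request_id: int) -> bool:
        # Start at a replica chosen round-robin, then fail over to the next
        # healthy one. The request fails only if every replica is down.
        names = sorted(self.replicas)
        start = request_id % len(names)
        for offset in range(len(names)):
            if self.replicas[names[(start + offset) % len(names)]]:
                return True
        return False


def success_rate(service: ToyService, requests: int = 100) -> float:
    """Fraction of simulated requests that the service handles successfully."""
    ok = sum(service.handle(i) for i in range(requests))
    return ok / requests


def run_experiment(service: ToyService, victim: str, threshold: float = 0.99) -> dict:
    """Check the hypothesis that losing one replica keeps the success rate high."""
    baseline = success_rate(service)       # steady state before the fault
    service.replicas[victim] = False       # inject the fault
    try:
        observed = success_rate(service)   # observe the impact
    finally:
        service.replicas[victim] = True    # always roll the fault back
    return {
        "hypothesis": f"success rate stays >= {threshold} with replica {victim!r} down",
        "baseline": baseline,
        "observed": observed,
        "hypothesis_held": observed >= threshold,
    }


if __name__ == "__main__":
    print(run_experiment(ToyService(), victim="b"))
```

In a GameDay, the Master of Disaster would play the part of the fault-injection line, and the returned record could feed the postmortem fields listed in slide 13. The rollback in `finally` reflects the idea that an experiment should leave the system as it found it, even if the observation step fails.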