Training Site Reliability Engineers with Chaos Gamedays

Yury Nino
September 25, 2020

Transcript

  1. Millions watched live in disbelief, but behind the scenes, some feared it was inevitable. A riveting study of a tragedy. https://www.yurynino.dev/
  2. DevOps: DevOps is the combination of cultural philosophies, practices, and tools that increases an organization’s ability to deliver applications and services at high velocity: evolving and improving products at a faster pace than organizations using traditional software development and infrastructure management processes. Source: https://aws.amazon.com/devops/what-is-devops/
  3. SRE is an implementation of DevOps: DevOps is a mindset for enhanced collaboration between operations teams and developer teams. SRE is considered a specific technical expertise held by engineers.
  4. SRE History: 2003, Ben Treynor coined SRE. 2008, DevOps is born. 2014, first conference about SRE (SREcon). 2016-2018, SRE books are released. 2019, SRE reaches mass adoption.
  5. Facing SRE challenges requires a team prepared for alerts, emergencies, and incidents.
  6. SRE does not just exist in your processes; it also exists in your people.
  7. Sink or Swim is a low-investment approach, but it is not very inclusive and could be described as grokking SRE the hard way.
  8. With self-study techniques, new hires may feel alone if they do not have a channel for asking questions or getting support.
  9. A live, instructor-led class provides a useful structure for students, who can get their questions answered. It is an opportunity for everybody to learn!
  10. Heroes are a problem, even if they solve everything! They are expensive. They are not sustainable. They create dependencies. They hide the problems! They live with frustration.
  11. Chaos Engineering: the discipline of experimenting with failures in production in order to reveal a system's weaknesses and to build confidence in its resilience. https://principlesofchaos.org/
  12. Chaos Principles: Hypothesize about Steady State; Run Experiments; Vary Real-World Events; Automate Experiments.
  13. GameDays were created by Jesse Robbins, inspired by his experience and training as a firefighter. A Chaos GameDay is an event hosted to conduct chaos experiments that validate or invalidate a hypothesis about resilience.
  14. GameDays and Chaos GameDays: GameDays are interactive team-based learning exercises designed to give players a chance to put their skills to the test in a real-world, gamified, risk-free environment. A Chaos GameDay is a practice event, and although it can take a whole day, it usually requires only a few hours. The goal of a GameDay is to practice how you, your team, and your supporting systems deal with real-world turbulent conditions.
  15. Framework: Before, During, After.
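
The four Chaos Principles listed on slide 12 can be illustrated with a small simulation. What follows is only a minimal sketch, not something taken from the talk: `ToyService`, `run_experiment`, `run_suite`, and the 0.99 success-rate threshold are all hypothetical names and values chosen for illustration, and the "failure injection" just flips a counter on an in-memory object instead of touching real infrastructure.

```python
from dataclasses import dataclass


@dataclass
class ToyService:
    """Hypothetical stand-in for a real system: N replicas behind a load balancer."""
    replicas: int = 3
    failed: int = 0  # number of replicas currently "killed" by an experiment

    def success_rate(self, requests: int = 100) -> float:
        # In this toy model a request succeeds while at least one replica is alive.
        ok = sum(1 for _ in range(requests) if self.failed < self.replicas)
        return ok / requests


def run_experiment(service: ToyService, replicas_to_kill: int,
                   threshold: float = 0.99) -> dict:
    """One chaos experiment against the toy service."""
    # Principle 1: hypothesize about steady state.
    # Assumed hypothesis: success rate stays at or above `threshold`.
    baseline = service.success_rate()
    if baseline < threshold:
        # Do not inject failures into a system that is already unhealthy.
        return {"killed": replicas_to_kill, "status": "aborted",
                "hypothesis_holds": None}

    # Principle 2: run the experiment by injecting a failure.
    service.failed = min(replicas_to_kill, service.replicas)
    try:
        observed = service.success_rate()
    finally:
        # Always roll back the injected failure, even if measurement raises.
        service.failed = 0

    return {"killed": replicas_to_kill, "status": "completed",
            "observed": observed, "hypothesis_holds": observed >= threshold}


def run_suite(replicas: int = 3) -> list:
    # Principle 3: vary real-world events (here, how many replicas fail).
    # Principle 4: automate, so the whole suite runs without manual steps.
    return [run_experiment(ToyService(replicas), k)
            for k in range(1, replicas + 1)]


if __name__ == "__main__":
    for result in run_suite():
        print(result)
```

In a GameDay setting, a run where `hypothesis_holds` is false is the useful outcome: it marks the scenario the team should walk through and practice responding to.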