Gunnar Grosch
September 05, 2019

Performing chaos in a serverless world - Stockholm Serverless Meetup September 5 2019

The principles of chaos engineering have been battle-tested for years using traditional infrastructure and containerized microservices, but how do they work with serverless functions and managed services? Join as we move from talking about principles to performing real chaos in a serverless world!

Transcript

  1. Stockholm Serverless Meetup @gunnargrosch
    Chaos Engineering has been battle-tested for years using traditional infrastructure and containerized microservices, but how does it work with serverless functions and managed services?
  2. What we’ll cover
    • What is Chaos Engineering?
    • Running chaos experiments
    • Challenges when using Chaos Engineering for serverless
    • Serverless chaos experiments
  3. A resilient system is a highly available and durable system that can maintain an acceptable level of service in the face of failure.
  4. About me
    • Evangelist and co-founder at Opsio
    • Background in development and operations
    • Organizer of AWS User Groups and AWS Community Day Nordics
    • ServerlessDays Stockholm and Serverless Meetups organizer
    • Father of three chaos monkeys
  5. Chaos Engineering is the discipline of experimenting on a system in order to build confidence in the system’s capability to withstand turbulent conditions in production. (Source: principlesofchaos.org)
  6. Chaos Engineering is about finding the weaknesses in a system and fixing them before they break.
  7. “Everything fails, all the time!” (Werner Vogels, CTO, Amazon)
  8. Don’t ask what happens if a system fails, but ask what happens when it fails.
  9. Why run experiments?
    • Are your customers getting the experience they should?
    • Are downtime or issues costing you money?
    • Are you confident in your monitoring and alerting?
    • Is your organization ready to handle outages?
  10. Step 1: Define steady state
    • The normal behavior of a system over time
    • System metrics and business metrics
    • Steady state is not necessarily continuous
    • Business metrics are usually more useful
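A steady-state check can be reduced to comparing a metric observed during the experiment against its normal baseline. The sketch below is only an illustration: the function name `within_steady_state`, the 10% tolerance, and the "orders per minute" business metric are assumptions, not something taken from the talk.

```python
from statistics import mean


def within_steady_state(samples, baseline, tolerance=0.10):
    """Return True if the mean of samples stays within tolerance of baseline.

    samples:   metric values collected while the experiment runs
    baseline:  the metric's normal value (the steady state)
    tolerance: allowed relative deviation (assumed 10% here)
    """
    if not samples:
        raise ValueError("no samples collected")
    observed = mean(samples)
    return abs(observed - baseline) <= tolerance * baseline


# Hypothetical business metric: completed orders per minute.
baseline_orders_per_minute = 120.0
during_experiment = [118, 121, 115, 119]
print(within_steady_state(during_experiment, baseline_orders_per_minute))
```

A business metric is used in the example because the slide notes that business metrics are usually more useful than system metrics.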
  11. Step 2: Form your hypothesis
    • Chaos can be injected at any layer in the stack
    • Use what-ifs
    • Always fix known problems first!
  12. Step 3: Plan and run your experiment
    • Whiteboard the experiment in detail
    • Contain the blast radius
    • Notify the organization
    • Make sure to have a “stop” button
  13. Step 4: Measure and learn
    • Use metrics to prove or disprove the hypothesis
    • Was the system resilient to the injected failure?
    • Did anything unexpected happen?
    • Share your progress and success!
  14. Step 5: Scale up or abort and fix
    • With confidence you can scale up
    • Increased scope can reveal new effects
  15. Serverless means new challenges
    • No servers to manage
    • Less heavy lifting
    • Lots of services
    • Per-function configuration
    • More granular architectures
  16. Common serverless weaknesses
    • Missing error handling
    • Wrong timeout values
    • Missing fallback
    • Missing regional failover
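Three of these weaknesses (missing error handling, wrong timeout values, missing fallback) can be illustrated with a small wrapper around a downstream call. This is a minimal sketch, assuming a thread pool is an acceptable way to enforce the timeout; `call_with_fallback`, the pool size, and the example values are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Small shared pool used only to enforce a deadline on downstream calls.
_executor = ThreadPoolExecutor(max_workers=4)


def call_with_fallback(fn, timeout_s, fallback):
    """Run fn(); return fallback if it is too slow or raises."""
    future = _executor.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # The worker thread keeps running in the background; we only stop waiting.
        return fallback
    except Exception:
        # Error handling: a failing dependency degrades the answer, not the request.
        return fallback


def slow_recommendations():
    time.sleep(0.3)  # simulates a downstream service that is too slow
    return ["fresh", "items"]


print(call_with_fallback(slow_recommendations, timeout_s=0.05, fallback=["cached"]))
```

Whether a cached or default value is an acceptable fallback depends on the service; the sketch only shows where the timeout and fallback decisions sit.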
  17. Serverless chaos experiments
    • Inject errors in your code
        • One in X requests throws an error
        • Turn on and off using parameter or variable
    • Remove downstream services
    • Alter the concurrency of your functions
    • Restrict the capacity of your DynamoDB table
    • Add configuration errors
        • Security policies
        • CORS configuration
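The first experiment, throwing an error for one in X requests and switching it on and off with a variable, might look like the sketch below. It is not the code shown in the talk: the environment variable names `CHAOS_ENABLED` and `CHAOS_ONE_IN`, the `InjectedError` class, and the `inject_errors` decorator are all assumptions made for illustration.

```python
import os
import random
from functools import wraps


class InjectedError(Exception):
    """Raised only by the chaos wrapper, never by real business logic."""


def inject_errors(handler):
    """Wrap a Lambda-style handler so that roughly one in X calls fails."""
    @wraps(handler)
    def wrapper(event, context):
        # Read the switches on every call so the experiment can be
        # stopped through configuration alone (the "stop" button).
        enabled = os.environ.get("CHAOS_ENABLED", "false").lower() == "true"
        one_in = int(os.environ.get("CHAOS_ONE_IN", "10"))
        if enabled and one_in > 0 and random.randrange(one_in) == 0:
            raise InjectedError(f"injected failure (1 in {one_in})")
        return handler(event, context)
    return wrapper


@inject_errors
def handler(event, context):
    return {"statusCode": 200, "body": "ok"}
```

With `CHAOS_ENABLED` unset the wrapper is a pass-through, which keeps the blast radius at zero until the experiment is deliberately started.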
  18. Serverless chaos experiments
    • Add latency to your functions
        • Cold starts
        • Cloud provider issues
        • Runtime or code issues
        • Integration issues
        • Timeouts
    • Yan Cui wrote an article and published sample code.
    • Adrian Hornsby built a Lambda Layer around these ideas.
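Adding latency to a function can be sketched as a decorator that sleeps before the real handler runs. This is a generic illustration and not the sample code or Lambda Layer mentioned above; `inject_latency`, `chaos_enabled`, the `CHAOS_LATENCY_ENABLED` variable, and the 200 ms delay are hypothetical.

```python
import os
import time
from functools import wraps


def chaos_enabled():
    """Hypothetical switch: latency is injected only when this variable is 'true'."""
    return os.environ.get("CHAOS_LATENCY_ENABLED", "false").lower() == "true"


def inject_latency(delay_ms, enabled=lambda: True):
    """Return a decorator that delays each invocation by delay_ms when enabled() is true."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(event, context):
            if enabled():
                time.sleep(delay_ms / 1000.0)  # simulated slow dependency or cold start
            return handler(event, context)
        return wrapper
    return decorator


@inject_latency(200, enabled=chaos_enabled)
def handler(event, context):
    return {"statusCode": 200, "body": "ok"}
```

Callers with timeouts shorter than the injected delay should then fail in a visible way, which is the behavior the experiment is meant to surface.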
  19. Tools for serverless chaos experiments
    • Gremlin: gremlin.com
    • Chaos Toolkit: chaostoolkit.org
    • Thundra: thundra.io
    • Build Your Own: 127.0.0.1
  20. Failure injection
    • What if my function takes 300 ms extra for each invocation?
    • What if my function returns an error code?
    • What if there is an exception in the code?
    • Hypothesis: My app can handle failure injected at the function level.
    • Let’s do it!
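The three what-ifs above (extra latency, an error code, an exception) can be driven from one configuration object so that a single switch selects the failure mode. The sketch below assumes a made-up JSON schema (`enabled`, `mode`, `rate`, `latency_ms`, `status_code`, `message`) and a hypothetical `apply_failure` helper; it is not the demo shown in the talk.

```python
import json
import random
import time


def apply_failure(config, handler, event, context, rng=random):
    """Invoke handler, optionally injecting the failure described by config."""
    # rate is the fraction of invocations affected (1.0 means every call).
    if not config.get("enabled") or rng.random() >= config.get("rate", 1.0):
        return handler(event, context)
    mode = config.get("mode")
    if mode == "latency":
        time.sleep(config.get("latency_ms", 300) / 1000.0)
        return handler(event, context)
    if mode == "status_code":
        return {"statusCode": config.get("status_code", 500), "body": "injected failure"}
    if mode == "exception":
        raise RuntimeError(config.get("message", "injected exception"))
    return handler(event, context)  # unknown mode: fail safe, inject nothing


def handler(event, context):
    return {"statusCode": 200, "body": "ok"}


# Hypothetical configuration, for example loaded from a parameter or variable.
config = json.loads('{"enabled": true, "mode": "status_code", "status_code": 502, "rate": 1.0}')
print(apply_failure(config, handler, {}, None))
```

Setting `enabled` to false returns the function to normal behavior, and lowering `rate` is one way to contain the blast radius.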
  21. Summary
    • Everything fails, all the time.
    • A resilient system maintains an acceptable level of service in the face of failure.
    • Chaos Engineering is about building confidence in your system and your organization.
    • Serverless introduces new challenges for Chaos Engineering.
    • Design the smallest possible experiment to test the system without causing an outage.
    • Understand how failure plays out, then scale it up as confidence in the system grows.
    • You can do it!
  22. Do you want more?
    • Follow @serverlesschaos on Twitter
    • Try the Serverless Chaos Demo app: https://demo.serverlesschaos.com
    • YouTube videos and repositories: https://grosch.se
    • Chaos Engineering Slack Community: bit.ly/chaos-eng-slack
    • Chaos Engineering Google Group: https://groups.google.com/forum/#!forum/chaos-community
    • List of awesome Chaos Engineering resources: https://github.com/dastergon/awesome-chaos-engineering/