Chaos engineering for serverless applications - AWS Resiliency and Chaos Engineering October 27 2020

Presented at AWS Resiliency and Chaos Engineering, October 27th, 2020.

Video available here: https://youtu.be/a4SBhCqus_g

https://twitter.com/gunnargrosch
https://demo.serverlesschaos.com
https://github.com/gunnargrosch/failure-lambda

Planning and performing chaos experiments on instance- and container-based workloads have been battle-tested by companies of all sizes and industries. However, serverless functions and managed services present different failure modes and levels of abstraction. In this session, we focus on applying chaos engineering principles to serverless, both for serverless functions and managed services. This covers how hypotheses are formed to fit serverless, what the experiments can achieve, and how to perform them practically.

Gunnar Grosch

October 27, 2020

Transcript

  1. © 2020, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Chaos engineering for serverless applications. Gunnar Grosch, @gunnargrosch. AWS Resiliency and Chaos Engineering, October 27-28, 2020.
  2. Motivations behind chaos engineering
  3. Motivations behind chaos engineering: • Downtime or issues • Happy users • Learning from incidents • Monitoring and alerting • Organizational confidence
  4. Serverless chaos experiments
  5. What is serverless? • No infrastructure provisioning, no management • Automatic scaling • Pay for value • Highly available and secure
  6. Serverless chaos experiments: • Errors • Fallbacks • Timeouts • Events
  7. Serverless chaos experiments: • Inject errors into your code • Remove downstream services • Alter the concurrency of functions • Restrict the capacity of tables. Architecture diagram: Client, Amazon Simple Storage Service (Amazon S3), Amazon API Gateway, AWS Lambda, Amazon DynamoDB, AWS Lambda, Amazon Simple Storage Service (Amazon S3).
  8. Serverless chaos experiments: • Security policy errors • CORS configuration errors • Service configuration errors • Function disk space failure. Architecture diagram: Client, Amazon Simple Storage Service (Amazon S3), Amazon API Gateway, AWS Lambda, Amazon DynamoDB, AWS Lambda, Amazon Simple Storage Service (Amazon S3).
  9. Serverless chaos experiments: Add latency to your functions • Cold starts • Runtime or code issues • Integration issues • Timeouts. Architecture diagram: Client, Amazon Simple Storage Service (Amazon S3), Amazon API Gateway, AWS Lambda, Amazon DynamoDB, AWS Lambda, Amazon Simple Storage Service (Amazon S3).
  10. Chaos engineering for serverless tools: • Chaos-lambda (Python) • Failure-lambda (NodeJS)
  11. Failure-lambda (NodeJS): NPM package for NodeJS Lambdas. Configuration using Parameter Store or AWS AppConfig. Several failure modes: • Latency • Status code • Exception • Disk space • Denylist. Usage: const failureLambda = require('failure-lambda'); exports.handler = failureLambda(async (event, context) => { ... }) Configuration: { "isEnabled": false, "failureMode": "latency", "rate": 1, "minLatency": 100, "maxLatency": 400, "exceptionMsg": "Exception message!", "statusCode": 404, "diskSpace": 100, "denylist": [ "s3.*.amazonaws.com", "dynamodb.*.amazonaws.com" ] }
  12. Serverless chaos demo
  13. Serverless chaos demo
  14. Serverless chaos demo. Architecture diagram: Client, Amazon S3, Amazon API Gateway, AWS Lambda, Amazon DynamoDB, AWS Lambda, AWS Lambda.
  15. Serverless chaos demo: • What if my function takes an extra 300 ms for each invocation? • What if my function returns an error code? • What if I can’t get data from DynamoDB? Architecture diagram: Client, Amazon S3, Amazon API Gateway, AWS Lambda, Amazon DynamoDB, AWS Lambda, AWS Lambda.
  16. Serverless chaos CI/CD demo
  17. Serverless chaos CI/CD demo: Default deploy. • What if my function takes an extra 300 ms for each invocation? • What if my function returns an error code? • What if I can’t get data from DynamoDB?
  18. Serverless chaos CI/CD demo: Feature flag. • What if my function takes an extra 300 ms for each invocation? • What if my function returns an error code? • What if I can’t get data from DynamoDB?
  19. Serverless chaos CI/CD demo: Canary deploy. • What if my function takes an extra 300 ms for each invocation? • What if my function returns an error code? • What if I can’t get data from DynamoDB?
  20. Application configuration control
  21. Adding application configuration control: AWS AppConfig allows us to create, manage, and quickly deploy configuration. Features: • Validate failure configuration • Deploy configuration using a gradual or non-gradual deploy strategy • Monitor deployed configuration with automatic rollback
  22. Serverless chaos AppConfig demo
  23. Summary: Use chaos engineering to find weaknesses and fix them. Use chaos engineering to verify that your system behaves as expected. Use chaos engineering to build confidence. Chaos engineering should be done regularly. It’s easy to get started.
  24. Thank you! Gunnar Grosch, @gunnargrosch
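
The configuration shown on slide 11 (isEnabled, failureMode, rate, minLatency, maxLatency, exceptionMsg, statusCode) suggests how a failure-injecting wrapper around a Lambda handler might behave. The following is a minimal sketch of that idea under stated assumptions, not the failure-lambda source: the names withFailureInjection, shouldInject, pickLatency, and getConfig are hypothetical, the mode strings other than "latency" are assumed, and the disk space and denylist modes are left out.

```javascript
// Hypothetical sketch of config-driven failure injection; not the failure-lambda implementation.

// Decide whether this invocation is affected, given a rate between 0 and 1.
function shouldInject(rate, random = Math.random) {
  return random() < rate;
}

// Pick a whole number of milliseconds between minLatency and maxLatency (inclusive).
function pickLatency(minLatency, maxLatency, random = Math.random) {
  return Math.floor(minLatency + random() * (maxLatency - minLatency + 1));
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Wrap a handler so that it fails according to the config returned by getConfig().
// getConfig is an assumed async function, e.g. one that reads a parameter or an
// AppConfig profile; how the config is fetched is outside this sketch.
function withFailureInjection(handler, getConfig, random = Math.random) {
  return async (event, context) => {
    const config = await getConfig();
    if (!config.isEnabled || !shouldInject(config.rate, random)) {
      return handler(event, context);
    }
    switch (config.failureMode) {
      case 'latency':
        // Delay the invocation, then run the real handler.
        await sleep(pickLatency(config.minLatency, config.maxLatency, random));
        return handler(event, context);
      case 'statuscode': // assumed mode name
        // Skip the handler and return the configured status code.
        return { statusCode: config.statusCode };
      case 'exception': // assumed mode name
        throw new Error(config.exceptionMsg);
      default:
        // Unknown or unsupported mode: fall through to normal behavior.
        return handler(event, context);
    }
  };
}
```

A handler could then be wrapped in roughly the same way as on slide 11, for example `exports.handler = withFailureInjection(async (event, context) => { ... }, loadConfig)`, where loadConfig is whatever function returns the current configuration. Passing the random source as a parameter keeps the rate and latency logic deterministic in tests.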