ML and Serverless

In this developer-focused talk, we will learn about various ways to run containers on AWS without maintaining servers. The focus is on running Docker images on Lambda, with several demos showing the kind of solutions you can build this way. By the end of the presentation, you will have a good understanding of how to pick the best AWS service for your next serverless application.

Marek Kuczynski

May 10, 2021

Transcript

  1. © 2021, Amazon Web Services, Inc. or its Affiliates.

    Marek Kuczynski, Strategic Accounts SA, @marekq on Twitter
    Serverless and containers - what’s possible today?
  2. Key Lambda announcements from re:Invent 2020

    • Package code and dependencies as a Docker or Open Container Initiative (OCI) compatible container image (up to 10 GB).
    • Billing granularity for function duration reduced from 100 ms to 1 ms.
    • Allocate up to 10 GB of memory to a Lambda function and get access to up to 6 vCPUs in each execution environment.
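To make the billing change concrete, here is a minimal sketch of how a duration maps to billed time under each granularity. It assumes durations are rounded up to the next increment; the 42.3 ms invocation is a made-up example, not a figure from the talk.

```python
import math


def billed_ms(duration_ms: float, granularity_ms: int) -> int:
    """Round a function duration up to the next billing increment."""
    return math.ceil(duration_ms / granularity_ms) * granularity_ms


# Hypothetical invocation that runs for 42.3 ms.
old_billing = billed_ms(42.3, 100)  # 100 ms granularity (before re:Invent 2020)
new_billing = billed_ms(42.3, 1)    # 1 ms granularity (after)
print(old_billing, new_billing)
```

For short functions the difference is large: in this example the billed duration drops from 100 ms to 43 ms.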
  6. AWS Lambda and containers

    • Event: Amazon SNS, Amazon API Gateway, Amazon DynamoDB, Amazon S3, Amazon SQS and many more…
    • Lambda Function: Python, Node.js, Java, .NET Core, Go, Ruby; custom runtimes; container images
    • Databases, AWS Services, Third Party APIs
    • Automatically invoked
    • Max 15 minute execution time
    • Paused between invocations
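Because of the 15 minute limit, long-running work is often split so a function can stop cleanly before it times out. The sketch below is one possible approach, not something shown in the talk: it uses the `get_remaining_time_in_millis()` method of the Python context object, while `SAFETY_MARGIN_MS`, the `items` event field and `FakeContext` are hypothetical names introduced for illustration.

```python
SAFETY_MARGIN_MS = 10_000  # assumed buffer before the timeout; tune per workload


def handler(event, context):
    """Process items until the remaining execution time runs low."""
    items = event.get("items", [])
    processed = []
    for item in items:
        # Stop early so the unfinished items can be handed to another invocation.
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            break
        processed.append(item.upper())  # placeholder for the real work
    return {"processed": processed, "remaining": items[len(processed):]}


class FakeContext:
    """Hypothetical test double; each time check 'costs' step_ms of budget."""

    def __init__(self, remaining_ms, step_ms=1_000):
        self._remaining_ms = remaining_ms
        self._step_ms = step_ms

    def get_remaining_time_in_millis(self):
        current = self._remaining_ms
        self._remaining_ms -= self._step_ms
        return current


result = handler({"items": ["a", "b", "c", "d", "e"]}, FakeContext(12_000))
print(result)
```

The returned `remaining` list could then be re-queued, for example through Amazon SQS, so a later invocation picks up where this one stopped.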
  7. Packaging code for Lambda

    • Managed runtimes (max 250 MB): Function Code (/var/task), Function Layer (/opt), Function Layer (/opt), Operating system (AL or AL2)
    • Container images (max 10 GB): Function Container Image
  8. Handling requests (managed runtimes)

    Execution Environment: Runtime API, Runtime, Lambda Extensions (Optional)

        def handler(event, _):
            # Fall back to "World" when the event carries no name.
            name = event.get("name", "World")
            return f"Hello, {name}!"
  9. Handling requests (for container images)

    Execution Environment: Runtime API, Container Image (Function Code), Extensions (Optional)
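With a container image, the function code itself has to talk to the Runtime API, usually through a runtime interface client. The following is a simplified sketch of that loop using only the standard library, with a handler like the one on the managed-runtimes slide. The `AWS_LAMBDA_RUNTIME_API` variable, the `/2018-06-01/runtime/invocation/...` paths and the `Lambda-Runtime-Aws-Request-Id` header come from the documented Runtime API; the helper names are hypothetical, and error reporting is left out for brevity.

```python
import json
import os
import urllib.request


def handler(event, _context):
    name = event.get("name", "World")
    return f"Hello, {name}!"


def next_invocation(base_url):
    """Long-poll the Runtime API until the next event is available."""
    with urllib.request.urlopen(f"{base_url}/invocation/next") as resp:
        request_id = resp.headers["Lambda-Runtime-Aws-Request-Id"]
        return request_id, json.loads(resp.read())


def post_response(base_url, request_id, result):
    """Send the handler result back for the given request id."""
    req = urllib.request.Request(
        f"{base_url}/invocation/{request_id}/response",
        data=json.dumps(result).encode("utf-8"),
        method="POST",
    )
    urllib.request.urlopen(req).close()


def process_one(fetch, respond):
    """Handle exactly one invocation; fetch and respond are injectable."""
    request_id, event = fetch()
    result = handler(event, None)
    respond(request_id, result)
    return result


def run_forever():
    """Entry point a container image could call; only works inside Lambda."""
    base = f"http://{os.environ['AWS_LAMBDA_RUNTIME_API']}/2018-06-01/runtime"
    while True:
        process_one(
            lambda: next_invocation(base),
            lambda request_id, result: post_response(base, request_id, result),
        )
```

Passing `fetch` and `respond` as arguments keeps the loop testable outside Lambda. In practice, the AWS-provided runtime interface clients implement this loop for you, including the error paths that this sketch omits.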
  10. One use case - Machine Learning!

    • We can use Lambda for ML inference from many different event sources.
    • Lambda can be significantly cheaper and faster than SageMaker or EC2 for irregular, lower-volume invocations.
    • Many pretrained models can be found on HuggingFace.co. We will be using a standard “distilbert” model.
  11. Keeping ML inference performant

    • Loading the ML model for the first, “cold” invocation can take a few seconds. Make sure to load the model outside the Lambda handler, so it is loaded once per execution environment and reused by later invocations.
    • For synchronous invocations through API Gateway, you can use Provisioned Concurrency to significantly lower cold start times.
    • For queue- or stream-based invocations, the cold start is far less of a problem and you may not need to do anything.
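The first point can be sketched as follows. `load_model` is a hypothetical stand-in that simulates a slow load with a short sleep; in a real function it might wrap something like a Hugging Face `transformers` pipeline, which is an assumption here and not the talk's demo code. The keyword rule inside it is a placeholder, not a real classifier.

```python
import time

LOAD_COUNT = 0  # counts how often the expensive load runs


def load_model():
    """Hypothetical stand-in for an expensive model load."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    time.sleep(0.05)  # simulate slow deserialization of model weights

    def classify(text):
        # Placeholder logic instead of real inference.
        label = "POSITIVE" if "good" in text.lower() else "NEGATIVE"
        return {"label": label}

    return classify


# Module scope: runs once per execution environment, during initialization.
MODEL = load_model()


def handler(event, _context):
    # Only inference happens per invocation; the model is already in memory.
    return MODEL(event.get("text", ""))


first = handler({"text": "Serverless is good"}, None)
second = handler({"text": "Cold starts are slow"}, None)
print(first, second, LOAD_COUNT)
```

However many times `handler` runs in the same execution environment, `LOAD_COUNT` stays at 1, which is why only the cold invocation pays the loading cost.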
  12. Execution Environment Lifecycle

    Execution Environment: Initialization, Invoke, Invoke, Invoke, Invoke, Invoke, Shutdown
    Execution Environment: Initialization, Invoke, Invoke, Shutdown
    (horizontal axis: time)
  13. Sizing the Execution Environment

    • More memory = more CPU resources
    • From 128 MB to 10 GB
    • Up to 6 vCPUs
    • https://github.com/alexcasalboni/aws-lambda-power-tuning
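Since more memory also brings more CPU, a larger setting can finish sooner and end up cheaper. A rough sketch of that trade-off, assuming the compute charge scales with GB-seconds (memory multiplied by duration); the measured durations below are invented for illustration, and the power-tuning tool linked above gathers real ones.

```python
MIN_MEMORY_MB = 128
MAX_MEMORY_MB = 10_240  # 10 GB


def gb_seconds(memory_mb: int, duration_ms: float) -> float:
    """Relative compute cost of one invocation in GB-seconds."""
    if not MIN_MEMORY_MB <= memory_mb <= MAX_MEMORY_MB:
        raise ValueError(f"memory must be {MIN_MEMORY_MB}-{MAX_MEMORY_MB} MB")
    return (memory_mb / 1024) * (duration_ms / 1000)


def cheapest(measurements: dict) -> int:
    """Return the memory size (MB) with the lowest GB-seconds."""
    return min(measurements, key=lambda mb: gb_seconds(mb, measurements[mb]))


# Hypothetical average durations (ms) measured per memory setting (MB).
measurements = {512: 4000.0, 1024: 1800.0, 2048: 1000.0, 4096: 900.0}

for mb, ms in measurements.items():
    print(mb, gb_seconds(mb, ms))
print("cheapest:", cheapest(measurements))
```

With these made-up numbers the 1024 MB setting is cheapest, and 2048 MB costs about the same as 512 MB while responding four times faster.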
  14. Further reading and sources

    • The CDK code for the ML demo: https://github.com/marekq/lambda-pytorch
    • A Terraform sample for Lambda Docker: https://github.com/marekq/terraform-lambda-docker
    • Detailed documentation about the feature: https://docs.aws.amazon.com/lambda/latest/dg/images-create.html