All the Ops you need to know to Dev Serverless

TechMasters
September 28, 2018

By Chris Munns - Amazon Web Services

Presented at Functions 2018 / ServerlessDays Toronto

https://functions.events/2018/toronto/chris-munns/


Transcript

  1. © 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Chris Munns – Principal Developer Advocate – AWS Serverless. All the Ops you need to know to Dev Serverless
  2. About me: Chris Munns - [email protected], @chrismunns
    • Principal Developer Advocate - Serverless
    • New Yorker
    • Previously:
        • AWS Business Development Manager – DevOps, July ’15 - Feb ’17
        • AWS Solutions Architect, Nov 2011 - Dec 2014
        • Formerly on operations teams @Etsy and @Meetup
        • Little time at a hedge fund, Xerox and a few other startups
    • Rochester Institute of Technology: Applied Networking and Systems Administration ’05
    • Internet infrastructure geek
  3. Why are we here today? https://secure.flickr.com/photos/mgifford/4525333972
  4. Serverless means…
    • No servers to provision or manage
    • Scales with usage
    • Never pay for idle
    • Availability and fault tolerance built in
  5. Serverless applications
    • EVENT SOURCE: changes in data state, requests to endpoints, changes in resource state
    • FUNCTION: Node.js, Python, Java, C#, Go
    • SERVICES (ANYTHING)
  6. Two common cohorts of new serverless users
    • Developers who need to learn operations
    • Operations folks who need to learn development
  7. Y-Hack 2013 https://secure.flickr.com/photos/psd/4389135567/
  8. Two common cohorts of new serverless users
    • Developers who need to learn operations
    • Operations folks who need to learn development
    I’m more one of these folks!
  9. Two common cohorts of new serverless users
    • Operations folks who need to learn development
    • Developers who need to learn operations
    Going to focus more today on these folks
  10. 4 key operational areas
    • Availability
    • Networking
    • Security, Governance, Auditing
    • Monitoring, Metrics, Logs, Performance
  11. Basic Serverless API technology stack
    [Diagram labels: Internet, Mobile/Web apps, AWS, Amazon API Gateway, AWS Lambda functions, Amazon DynamoDB]
  12. Is this application available?
  13. Then our application gets some traffic...
  14. Is this application available? Ok! 100% Available!
  15. Is this application available? For 16 invocations for about 1 second...
  16. Is this application available? Availability is also a shared responsibility between AWS and you
    • If you misconfigure API Gateway and Lambda is fine, what is your availability of Lambda?
    • If your downstream services are DDoS’d, what layer’s fault is it?
        • More importantly, where do you resolve it?
    • If you run into concurrency limits but everything else is fine, is that an availability issue?
        • Concurrent executions is very much a soft limit!
  17. Concurrency controls
    • Concurrency is a shared pool by default
    • Separate using per function concurrency settings
        • Acts as reservation
        • Also acts as max concurrency per function
        • Especially critical for data sources like RDS
    • “Kill switch” – set per function concurrency to zero
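
The per-function concurrency setting from slide 17 can be applied through the Lambda API. The following is a minimal sketch, assuming a boto3 Lambda client is passed in; the helper names `concurrency_request` and `set_concurrency` are hypothetical and not part of the talk.

```python
def concurrency_request(function_name, limit):
    """Build keyword arguments for Lambda's PutFunctionConcurrency call."""
    if limit < 0:
        raise ValueError("limit must be zero or greater")
    return {
        "FunctionName": function_name,
        "ReservedConcurrentExecutions": limit,
    }


def set_concurrency(lambda_client, function_name, limit):
    """Reserve and cap concurrency; a limit of 0 acts as the kill switch.

    `lambda_client` is assumed to be a boto3 Lambda client,
    for example boto3.client("lambda").
    """
    return lambda_client.put_function_concurrency(
        **concurrency_request(function_name, limit)
    )
```

Calling `set_concurrency(client, "orders-writer", 0)` (a made-up function name) would throttle every invocation of that function until the setting is raised or removed.
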
  18. Is this application available? Availability means something different in serverless applications than it does for traditional “server-full” applications:
    • Availability only exists at the time of invocation, so availability becomes a % of total invocations rather than a time-period-based metric
    • The failure of downstream service(s) needs to be handled/reported in a way that can lead to retries/safe handling
    • Retries can further confuse this (depending on invoke source)
        • i.e. if I fail twice and the second retry succeeds, what's my availability?
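
The retry question on slide 18 can be made concrete with a little arithmetic. This is a rough sketch, not a metric definition from the talk: the function name is hypothetical, and which of the two numbers counts as "availability" is exactly the open question the slide raises.

```python
def invocation_availability(total, failed):
    """Availability as the percentage of invocations that succeeded."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= failed <= total:
        raise ValueError("failed must be between 0 and total")
    return 100.0 * (total - failed) / total


# One request that fails twice and succeeds on the second retry:
per_attempt = invocation_availability(total=3, failed=2)  # counts every attempt
per_request = invocation_availability(total=1, failed=0)  # counts the final outcome
```

Counting attempts gives roughly 33%, while counting the final outcome gives 100%, so the chosen unit changes the answer completely.
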
  19. (1) Lambda directly invoked via the Invoke API
    [Diagram labels: SDK clients, Lambda API, Lambda function]
    • API provided by the Lambda service
    • Used by all other services that invoke Lambda across all models
    • Supports sync and async
    • Can pass any event payload structure you want
    • Client included in every SDK
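
A direct call to the Invoke API, in either sync or async mode, might look like the sketch below. It assumes a boto3 Lambda client is supplied by the caller; `invoke_request` and `invoke` are hypothetical helper names.

```python
import json


def invoke_request(function_name, event, asynchronous=False):
    """Build keyword arguments for the Lambda Invoke API."""
    return {
        "FunctionName": function_name,
        # "RequestResponse" waits for the result; "Event" queues the call.
        "InvocationType": "Event" if asynchronous else "RequestResponse",
        "Payload": json.dumps(event).encode("utf-8"),
    }


def invoke(lambda_client, function_name, event, asynchronous=False):
    """Invoke a function; `lambda_client` is assumed to be a boto3 Lambda client."""
    return lambda_client.invoke(
        **invoke_request(function_name, event, asynchronous)
    )
```

The payload is any JSON-serializable structure, which matches the slide's point that the event shape is up to the caller.
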
  20. Lambda networking
    [Diagram: region containing the AWS Lambda VPC, which holds the Lambda function execution environment]
  21. Lambda networking
    [Diagram: region containing the AWS Lambda VPC, which holds the Lambda function execution environment]
    • Invocations can only come in via the AWS Lambda API
  22. Lambda networking
    [Diagram: region containing the AWS Lambda VPC, which holds the Lambda function execution environment]
    • Invocations can only come in via the AWS Lambda API
    • Today that API is available publicly in the region Lambda is running
  23. Lambda networking with a customer configured VPC
    [Diagram: region containing the AWS Lambda VPC (Lambda function execution environment) and the Customer VPC, joined by an elastic network interface]
  24. Lambda networking with a customer configured VPC
    [Diagram: region containing the AWS Lambda VPC (Lambda function execution environment) and the Customer VPC, joined by an elastic network interface]
    • AWS Lambda VPC: completely managed by the AWS Lambda team
    • Customer VPC: customer configured/managed. Customer controls Security Groups, Network ACLs, Route Tables
  25. Lambda networking with a customer configured VPC
    [Diagram: region containing the AWS Lambda VPC (Lambda function execution environment) and the Customer VPC, joined by an elastic network interface]
    • Invocations still can only come in via the AWS Lambda API
  26. Lambda networking with a customer configured VPC
    [Diagram: region containing the AWS Lambda VPC (Lambda function execution environment) and the Customer VPC, joined by an elastic network interface]
    • Invocations still can only come in via the AWS Lambda API
    • Even with a private API Gateway endpoint or a VPC Endpoint provided service
  27. Do I need to put my functions in an Amazon VPC? Putting your functions inside of a VPC provides little extra security benefit to your AWS Lambda functions
  28. Do I need a VPC? Should my Lambda function be in a VPC?
    • Does my function need to access any specific resources in a VPC?
        • No: Don’t put the function in a VPC
        • Yes: Does it also need to access resources or services in the public internet?
            • No: Put the function in a private subnet
            • Yes: Put the function in a subnet with a NAT’d route to the internet
  29. Do I need a VPC? Should my Lambda function be in a VPC?
    • Do I need to restrict outbound access from my function to the internet?
        • Yes: Put the function in a private subnet
        • No: Don’t put the function in a VPC
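
The two decision trees on slides 28 and 29 can be written down as one small function. This is only a sketch: the branch-to-outcome mapping is read off the flattened flowchart text, and merging both trees into a single function (with the hypothetical name `vpc_placement`) is an assumption of this example.

```python
def vpc_placement(needs_vpc_resources, needs_internet, restrict_outbound=False):
    """Return the placement suggested by the decision trees above."""
    if not needs_vpc_resources and not restrict_outbound:
        # Nothing in a VPC to reach and no need to lock down egress.
        return "no VPC"
    if needs_vpc_resources and needs_internet:
        return "subnet with a NAT'd route to the internet"
    return "private subnet"
```

For example, a function that reads from a database inside a VPC and also calls a public API would land in a subnet with a NAT'd route.
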
  30. Basic VPC Design
    [Diagram: a VPC spanning Availability Zone A and Availability Zone B; each AZ has a Lambda subnet and another subnet, with one VPC NAT gateway per AZ]
  31. Basic VPC Design
    • ALWAYS configure a minimum of 2 Availability Zones
    • Give your Lambda functions their own subnets
    • Give your Lambda subnets a large IP range to handle potential scale
    • If your functions need to talk to a resource on the internet, you need a NAT!
    • ENIs are a pain, we know, we’re working on it!
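
To judge whether a Lambda subnet range is "large" enough, it helps to count the addresses actually available for ENIs. A minimal sketch using the standard `ipaddress` module follows; the CIDR blocks are made up, and the only AWS-specific fact used is that five addresses are reserved in every VPC subnet.

```python
import ipaddress

# AWS reserves five addresses in every VPC subnet.
AWS_RESERVED_PER_SUBNET = 5


def usable_addresses(cidrs):
    """Count the addresses left for ENIs across the given subnets."""
    total = 0
    for cidr in cidrs:
        network = ipaddress.ip_network(cidr)
        total += max(network.num_addresses - AWS_RESERVED_PER_SUBNET, 0)
    return total


# Hypothetical Lambda subnets, one per Availability Zone.
lambda_subnets = ["10.0.0.0/20", "10.0.16.0/20"]
capacity = usable_addresses(lambda_subnets)
```

Comparing `capacity` against the expected peak number of network interfaces gives a quick sanity check before a function is attached to the VPC.
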
  32. Securing your Serverless Infrastructure. Photo by Markus Spiske on Unsplash
  33. Lambda permissions model: fine grained security controls for both execution and invocation
    Execution policies:
    • Define what AWS resources/API calls this function can access via IAM
    • Used in streaming invocations
    • E.g. “Lambda function A can read from DynamoDB table users”
    Function policies:
    • Used for sync and async invocations
    • E.g. “Actions on bucket X can invoke Lambda function Z”
    • Resource policies allow for cross account access
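
The two policy types from slide 33 can be sketched as plain Python data. The execution policy is an IAM policy document; the function policy is shown as keyword arguments for the Lambda AddPermission call (`add_permission` in boto3). The helper names, the chosen DynamoDB actions, and the statement ID are illustrative assumptions.

```python
def execution_policy(table_arn):
    """IAM policy letting a function read one DynamoDB table."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem", "dynamodb:Query"],
            "Resource": table_arn,
        }],
    }


def function_policy_request(function_name, bucket_arn):
    """Keyword arguments for AddPermission: S3 may invoke the function."""
    return {
        "FunctionName": function_name,
        "StatementId": "AllowS3Invoke",  # hypothetical statement ID
        "Action": "lambda:InvokeFunction",
        "Principal": "s3.amazonaws.com",
        "SourceArn": bucket_arn,
    }
```

The first answers "what can this function touch", the second answers "who may call this function".
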
  34. "Action": "s3:*" makes puppies cry. Photo by Matthew Henry on Unsplash
  35. AWS SAM Templates: <-THIS BECOMES THIS-> From: https://github.com/awslabs/aws-serverless-samfarm/blob/master/api/saml.yaml
  36. AWS SAM Policy Templates

        MyQueueFunction:
          Type: AWS::Serverless::Function
          Properties:
            ...
            Policies:
              # Gives permissions to poll an SQS Queue
              - SQSPollerPolicy:
                  queueName: !Ref MyQueue
            ...
        MyQueue:
          Type: AWS::SQS::Queue
          ...
  37. SAM Policy Templates: 40+ predefined policies. All found here: https://bit.ly/2xWycnj
  38. IAM + Lambda best practices
    • Where/when possible try to leverage the pre-created managed policies that exist today
    • If you are doing “service:*” be REALLY REALLY REALLY sure that’s what you should and need to do
    • Keep tight lockdown on who/what can invoke functions
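
The “service:*” warning on slide 38 is easy to check mechanically. Below is a rough sketch of a policy lint; the function name `wildcard_actions` is hypothetical, and it only inspects `Allow` statements in a policy document that has already been loaded into a dict.

```python
def wildcard_actions(policy):
    """Return Allow-ed actions shaped like "service:*" or "*"."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a dict
        statements = [statements]
    found = []
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        found.extend(a for a in actions if a == "*" or a.endswith(":*"))
    return found
```

A check like this could run in a build pipeline and fail the build whenever the returned list is not empty.
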
  39. I will turn on CloudTrail, Config, and CloudTrail Data Events
  40. BONUS: Hardcoded secrets make fish cry. Photo by Julieann Ragojo on Unsplash
  41. Lambda Environment Variables
    • Key-value pairs that you can dynamically pass to your function
    • Available via standard environment variable APIs such as process.env for Node.js or os.environ for Python
    • Can optionally be encrypted via AWS Key Management Service (KMS)
        • Allows you to specify in IAM what roles have access to the keys to decrypt the information
    • Useful for creating environments per stage (e.g. dev, testing, production)
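
Reading per-stage settings from environment variables in Python could look like the following sketch. The variable names `STAGE`, `TABLE_NAME` and `DEBUG` are invented for illustration, and the `environ` parameter exists only so the function can be exercised without touching the real environment.

```python
import os


def stage_config(environ=os.environ):
    """Read per-stage settings, with defaults for local runs."""
    stage = environ.get("STAGE", "dev")
    return {
        "stage": stage,
        "table_name": environ.get("TABLE_NAME", "users-" + stage),
        "debug": environ.get("DEBUG", "false").lower() == "true",
    }
```

The same code then runs unchanged in dev, testing and production, with only the function's configured variables differing.
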
  43. AWS Systems Manager – Parameter Store: centralized store to manage your configuration data
    • Supports hierarchies
    • Plain-text or encrypted with KMS
    • Can send notifications of changes to Amazon SNS/AWS Lambda
    • Can be secured with IAM
    • Calls recorded in CloudTrail
    • Can be tagged
    • Available via API/SDK
    Useful for: centralized environment variables, secrets control, feature flags

        from __future__ import print_function
        import json
        import boto3

        ssm = boto3.client('ssm', 'us-east-1')

        def get_parameters():
            # Fetch the SecureString parameter and decrypt it with KMS
            response = ssm.get_parameters(
                Names=['LambdaSecureString'], WithDecryption=True
            )
            for parameter in response['Parameters']:
                return parameter['Value']

        def lambda_handler(event, context):
            value = get_parameters()
            print("value1 = " + value)
            return value  # Echo back the first key value

    #somuchawesome
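
The "supports hierarchies" bullet can be illustrated by loading every parameter under a path in one go. This is a sketch under assumptions: an SSM client (such as `boto3.client("ssm")`) is passed in, the helper name `load_hierarchy` is hypothetical, and the parameter paths in the usage note are made up.

```python
def load_hierarchy(ssm_client, path):
    """Return {relative_name: value} for every parameter under `path`.

    `ssm_client` is assumed to be a boto3 SSM client.
    """
    prefix = path.rstrip("/") + "/"
    values = {}
    kwargs = {"Path": path, "Recursive": True, "WithDecryption": True}
    while True:
        page = ssm_client.get_parameters_by_path(**kwargs)
        for parameter in page["Parameters"]:
            name = parameter["Name"]
            key = name[len(prefix):] if name.startswith(prefix) else name
            values[key] = parameter["Value"]
        token = page.get("NextToken")
        if not token:
            return values
        kwargs["NextToken"] = token  # fetch the next page of results
```

With parameters stored as `/shop/prod/flags/new-checkout` and similar, `load_hierarchy(client, "/shop/prod")` would return a dict keyed by `flags/new-checkout`, which suits feature flags and per-stage configuration.
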
  44. Fun with logs and metrics. Weeeeeeee! https://secure.flickr.com/photos/ocarchives/5333790414
  45. Metrics and logging are a universal right! CloudWatch Metrics:
    • 7 built-in metrics for Lambda
        • Invocation Count, Invocation duration, Invocation errors, Throttled Invocation, Iterator Age, DLQ Errors, Concurrency
        • Can call “put-metric-data” from your function code for custom metrics
    • 7 built-in metrics for API Gateway
        • API Calls Count, Latency, 4XXs, 5XXs, Integration Latency, Cache Hit Count, Cache Miss Count
        • Error and Cache metrics support averages and percentiles
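
Publishing a custom metric from function code, the “put-metric-data” call mentioned on slide 45, might look like this sketch. It assumes a boto3 CloudWatch client is passed in; the helper names, the namespace and the metric name in the usage note are hypothetical.

```python
def metric_request(namespace, name, value, unit="Count", dimensions=None):
    """Build keyword arguments for CloudWatch PutMetricData."""
    datum = {"MetricName": name, "Value": float(value), "Unit": unit}
    if dimensions:
        datum["Dimensions"] = [
            {"Name": key, "Value": val} for key, val in sorted(dimensions.items())
        ]
    return {"Namespace": namespace, "MetricData": [datum]}


def publish_metric(cloudwatch_client, namespace, name, value, **options):
    """Send one data point; the client is assumed to be a boto3 CloudWatch client."""
    return cloudwatch_client.put_metric_data(
        **metric_request(namespace, name, value, **options)
    )
```

A handler could call `publish_metric(client, "Shop", "GizmosProcessed", 3)` once per invocation to record a business-level count next to the built-in metrics.
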
  46. Metrics and logging are a universal right! CloudWatch Logs:
    • API Gateway Logging
        • 2 levels of logging, ERROR and INFO
        • Optionally log method request/body content
        • Set globally in stage, or override per method
    • Lambda Logging
        • Logging directly from your code with your language’s equivalent of console.log()
        • Basic request information included
    • Log Pivots
        • Build metrics based on log filters
        • Jump to logs that generated metrics
    • Export logs to Amazon Elasticsearch Service or S3
        • Explore with Kibana or Athena/QuickSight
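
Logging from Lambda code in a form that log filters can match is sketched below. Emitting one JSON object per line is a design choice of this example, not something the slide prescribes, and the field names are hypothetical.

```python
import json


def log_line(level, message, **fields):
    """Format one JSON log line that a CloudWatch Logs filter can match."""
    record = {"level": level, "message": message}
    record.update(fields)
    return json.dumps(record, sort_keys=True)


def lambda_handler(event, context):
    # Anything printed to stdout ends up in the function's CloudWatch Logs.
    print(log_line("DEBUG", "in lambda_handler", order_id=event.get("order_id")))
    return {"ok": True}
```

With lines shaped like this, a metric filter on a field such as `level` can turn log entries into a CloudWatch metric.
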
  47. https://secure.flickr.com/photos/joeross/6544781203
  48. Dashboards https://secure.flickr.com/photos/joeross/6544781203
  50. Dashboarding tips
    • Make all metrics available – Good news, CloudWatch makes it easy!
    • Focus main landing/“on tv” dashboards on core user/business driven metrics
        • “If this metric goes up, does it directly correlate with a user having a problem?”
    • Make as many metrics available across team/function as possible
        • You can now embed CloudWatch “snapshots” in emails and other places!
  51. Tweak your function’s compute power
    Lambda exposes only a memory control, with the % of CPU core and network capacity allocated to a function proportionally. Is your code CPU, network or memory-bound? If so, it could be cheaper to choose more memory.
  53. Smart resource allocation: match resource allocation (up to 3 GB!) to logic
    Stats for a Lambda function that calculates all prime numbers <= 1000000, 1000 times:

        Memory     Duration         Cost
        128 MB     11.722965 sec    $0.024628
        256 MB     6.678945 sec     $0.028035
        512 MB     3.194954 sec     $0.026830
        1024 MB    1.465984 sec     $0.024638

    Best: 1024 MB for duration, 128 MB for cost. Worst: 128 MB for duration, 256 MB for cost.
    128 MB vs. 1024 MB: +$0.00001, -10.256981 sec
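
The trade-off on slide 53 can be recomputed directly from the slide's own numbers. This sketch only compares rows of that table; it does not model Lambda pricing, and the helper name `compare` is hypothetical.

```python
# (memory in MB, duration in seconds, cost in USD) copied from the slide.
RUNS = [
    (128, 11.722965, 0.024628),
    (256, 6.678945, 0.028035),
    (512, 3.194954, 0.026830),
    (1024, 1.465984, 0.024638),
]


def compare(baseline, candidate):
    """Return (extra cost, seconds saved, speed-up factor) for two rows."""
    _, base_time, base_cost = baseline
    _, new_time, new_cost = candidate
    return (
        round(new_cost - base_cost, 6),
        round(base_time - new_time, 6),
        round(base_time / new_time, 2),
    )


extra_cost, seconds_saved, speedup = compare(RUNS[0], RUNS[-1])
```

Going from 128 MB to 1024 MB costs about a thousandth of a cent more for the whole run while finishing roughly eight times faster.
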
  54. Impact of a memory change
    50% increase in memory: 95th percentile changes from 3s to 2.1s
    https://blog.newrelic.com/2017/06/20/lambda-functions-xray-traces-custom-serverless-metrics/
  55. Multithreading? Maybe!
    • <1.8GB is still single core
        • CPU bound workloads won’t see gains – processes share same resources
    • >1.8GB is multi core
        • CPU bound workloads will see gains, but need to multi thread
    • I/O bound workloads WILL likely see gains
        • e.g. parallel calculations to return
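
For the I/O bound case on slide 55, a thread pool is one straightforward way to overlap the waiting. The sketch below simulates the I/O with `time.sleep`; `fetch` and `fetch_all` are hypothetical stand-ins for real network or database calls.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def fetch(item):
    """Stand-in for an I/O-bound call such as an HTTP or database request."""
    time.sleep(0.05)  # simulated network wait
    return item * 2


def fetch_all(items, workers=8):
    """Run the I/O-bound calls concurrently and keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, items))


results = fetch_all(range(8))
```

Because the threads spend their time waiting rather than computing, this pattern can help at any memory size, whereas CPU bound work needs the multi core configurations described above.
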
  56. AWS X-Ray Integration with Serverless
    • API Gateway inserts a tracing header into HTTP calls and reports data back to X-Ray itself
    • Lambda instruments incoming requests for all supported languages and can capture calls made in code

        var AWSXRay = require('aws-xray-sdk-core');
        AWSXRay.middleware.setSamplingRules('sampling-rules.json');
        // Wrap the AWS SDK so calls made through it are traced
        var AWS = AWSXRay.captureAWS(require('aws-sdk'));
        var S3Client = new AWS.S3();
  57. X-Ray Trace Example
  58. How do I figure out what’s wrong? These tools are here, so use them!
    1. Turn on X-Ray now
        • Look at wrapping your own calls with it via the X-Ray SDKs
    2. Don’t underestimate the power of logging in Lambda
        • Simple “debug: in functionX” statements work great and are easy to find in CloudWatch Logs
    3. The most valuable metrics are the ones closest to your customer/use-case
        • How many gizmos did this function call/create/process/etc.
  59. FIN/ACK. 4 key operational areas:
    • Availability
    • Networking
    • Security, Governance, Auditing
    • Monitoring, Metrics, Logs, Performance
  60. Chris Munns, [email protected], @chrismunns https://www.flickr.com/photos/theredproject/3302110152/
  61. ? https://secure.flickr.com/photos/dullhunk/202872717/