Advanced Serverless Architectural Patterns on AWS [Devoxx Poland]

Advanced Serverless Architectural Patterns on AWS
Alex Casalboni, Sr. Technical Evangelist, Amazon Web Services
@alex_casalboni #DevoxxPL

About me
• Software Engineer & Web Developer
• Worked in a startup for 4.5 years
• ServerlessDays Organizer
• AWS customer since 2013

Agenda
• Serverless foundations (quickly, I promise!)
• Advanced serverless patterns:
1. Web application / API
2. Stream processing
3. Data lakes
4. Machine learning

Compute Spectrum AWS Lambda Amazon Kinesis Amazon S3 Amazon API
Gateway Amazon SQS Amazon DynamoDB AWS IoT Amazon EMR Amazon ElastiCache Amazon RDS Amazon Redshift Amazon Elasticsearch Managed Serverless Amazon EC2 Microsoft SQL Server “On Amazon EC2” Amazon Cognito Amazon CloudWatch Amazon Athena AWS X-Ray AWS Step Functions Amazon MQ Amazon SageMaker Amazon Neptune AWS Fargate Amazon DocumentDB

Serverless means…
• No server or container management
• Flexible scaling
• No idle capacity
• High availability

Lambda: The execution lifecycle
Timeline: Download your code → Start new container → Bootstrap the runtime → Start your code
A cold start runs all four steps; a warm start reuses the existing container and only starts your code.
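The practical consequence of the lifecycle is that module-level code runs once per container, while the handler runs on every invocation. A minimal Python sketch, assuming a hypothetical `EXPENSIVE_CLIENT` that stands in for an SDK client or database connection (it is not from the talk):

```python
import time

# Module-level code runs once per container, during the cold start.
INIT_TIME = time.time()
EXPENSIVE_CLIENT = {"created_at": INIT_TIME}  # hypothetical stand-in for an SDK client

invocation_count = 0


def handler(event, context):
    """Runs on every invocation; warm starts reuse the module-level state."""
    global invocation_count
    invocation_count += 1
    return {
        "cold_start": invocation_count == 1,
        "client_created_at": EXPENSIVE_CLIENT["created_at"],
    }
```

Calling `handler` twice in the same process mimics a cold start followed by a warm start: the second call reuses the client created during initialization.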

Tune your function’s resources
• Memory is the only control: CPU cores and network capacity are allocated to a function in proportion to its memory
• Is your code CPU-, network-, or memory-bound? If so, it could be cheaper to choose more memory
• More memory means more CPU cores and more network capacity
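Why more memory can be cheaper comes down to arithmetic: compute cost is billed per GB-second, so a function that runs proportionally faster with more memory can cost less overall. A rough sketch, where the price constant and both durations are assumptions for illustration (per-request charges are ignored):

```python
# Assumed price per GB-second; check current Lambda pricing for your region.
PRICE_PER_GB_SECOND = 0.0000166667


def invocation_cost(memory_mb, duration_seconds):
    """Compute-only cost of one invocation (request charge not included)."""
    gb_seconds = (memory_mb / 1024) * duration_seconds
    return gb_seconds * PRICE_PER_GB_SECOND


# Hypothetical measurements for the same CPU-bound function.
slow_small = invocation_cost(128, 10.0)   # 128 MB, 10 s  -> 1.25 GB-seconds
fast_large = invocation_cost(1024, 1.0)   # 1024 MB, 1 s  -> 1.00 GB-seconds

print(f"128 MB:  ${slow_small:.10f}")
print(f"1024 MB: ${fast_large:.10f}")
```

With these made-up numbers the larger configuration is both faster and cheaper; real durations have to be measured rather than guessed.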

“AWS Lambda Power Tuning”
Data-driven cost & performance optimization for AWS Lambda
Don’t guesstimate!
github.com/alexcasalboni/aws-lambda-power-tuning

Lambda best practices
• Minimize your package size & use only the SDK modules you need
• Put your dependencies (e.g. JAR files) in a separate directory
• Improve dependency injection with smaller and simpler IoC frameworks that load quickly on startup, like Dagger2
• Leverage smaller and faster frameworks like jackson-jr for Java data binding
• Use environment variables to modify operational behavior
• Secure secrets/tokens/passwords with Parameter Store and AWS Secrets Manager
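The environment-variable practice can be sketched in a few lines of Python. The variable names (`LOG_LEVEL`, `BATCH_LIMIT`, `SECRET_PARAMETER_NAME`) and defaults are hypothetical; the idea is that only the name of a secret lives in the environment, while the value itself stays in Parameter Store or Secrets Manager (the lookup is omitted here):

```python
import os


def load_settings(environ=os.environ):
    """Read operational settings from the environment, with safe defaults."""
    return {
        "log_level": environ.get("LOG_LEVEL", "INFO"),
        "batch_limit": int(environ.get("BATCH_LIMIT", "100")),
        # Only the parameter NAME is configured here, never the secret value.
        "secret_parameter_name": environ.get("SECRET_PARAMETER_NAME", "/my-app/api-token"),
    }


settings = load_settings({"LOG_LEVEL": "DEBUG", "BATCH_LIMIT": "5"})
print(settings)
```

Passing the mapping as a parameter keeps the function easy to test without touching the real process environment.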

AWS Serverless Application Model (SAM)
• AWS CloudFormation extension (Macro) to simplify serverless apps
• New serverless resource types: functions, APIs, and tables
• Local testing with SAM CLI
github.com/awslabs/serverless-application-model

Source Build Test Deploy AWS CodeCommit AWS CodeBuild Third Party
Tooling AWS CodeDeploy AWS CodePipeline AWS CodeStar AWS code services

Pattern 1 Web app / microservice / API

Web application (1) DynamoDB Lambda API Gateway Browser CloudFront Amazon
S3 Cognito

Choose the right API endpoint type Edge optimized: reduce latency
from anywhere on the Internet AWS Region API Gateway Internet edge location edge location edge location CloudFront Distribution API Gateway Managed

Web application (2) DynamoDB Lambda API Gateway Browser CloudFront S3
Cognito Lambda@Edge

Choose the right API endpoint type Regional AWS us-east-2 API
Gateway Internet AWS us-west-2 API Gateway Route 53 Lambda DynamoDB Lambda DynamoDB Global Tables

Regional API Gateway Internet API Gateway Route 53 Lambda DynamoDB
Lambda DynamoDB Global Tables Lambda@Edge CloudFront Choose the right API endpoint type AWS us-east-2 AWS us-west-2

Private: expose APIs only inside your VPC AWS Region API
Gateway Your VPC AWS Direct Connect On-premises Choose the right API endpoint type

DynamoDB Lambda API Gateway Browser CloudFront Amazon S3 Cognito Serverless
web app security

DynamoDB Lambda API Gateway Browser CloudFront S3 Cognito Serverless web
app security Static Content • Geo-Restrictions • Signed Cookies • Signed URLs • DDOS Protection • Bucket Policies • ACLs AuthZ • Cross Account • Throttling per method • Resource Policies • Usage Plans • Encryption at Rest • VPC Endpoint • Function policies • Env Variables • Parameters/Secrets

Lambda Authorizer Client Lambda API Gateway DynamoDB IAM Lambda authorizers
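A Lambda authorizer receives the caller's token and returns an IAM policy that API Gateway then enforces. A minimal sketch of a token-based authorizer in Python; the in-memory `VALID_TOKENS` table is a hypothetical stand-in for the DynamoDB lookup shown on the slide:

```python
# Hypothetical token table; a real authorizer would look this up (e.g. in DynamoDB).
VALID_TOKENS = {"allow-me": "user-123"}


def build_policy(principal_id, effect, method_arn):
    """Build the response shape API Gateway expects from an authorizer."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": method_arn,
                }
            ],
        },
    }


def handler(event, context):
    token = event.get("authorizationToken", "")
    principal = VALID_TOKENS.get(token)
    effect = "Allow" if principal else "Deny"
    return build_policy(principal or "anonymous", effect, event["methodArn"])
```

Unknown tokens get an explicit `Deny` policy rather than an exception, which keeps the behavior easy to reason about in this sketch.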

Pattern 2 Data processing (stream)

Streaming with Amazon Kinesis
Collect, process, and analyze video and data streams in real time
• Kinesis Data Firehose
• Kinesis Data Analytics (SQL)
• Kinesis Data Streams
• Kinesis Video Streams

Streaming data ingestion Amazon S3: Buffered files Kinesis Agent Record
producers Amazon Redshift: Table loads Amazon Elasticsearch Service: Domain loads Amazon S3: Source record backup Transformed records Put Records Kinesis Firehose: Delivery stream AWS Lambda: Transformations & enrichment Amazon DynamoDB: Lookup tables Raw Lookup Transformed
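The "Transformations & enrichment" Lambda in this flow receives a batch of base64-encoded records and must return each one with its `recordId`, a `result` status, and re-encoded `data`. A minimal Python sketch, assuming JSON payloads; the `enriched` field is a hypothetical placeholder for a real lookup-table enrichment:

```python
import base64
import json


def handler(event, context):
    """Transform a Firehose batch; every input recordId must appear in the output."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            payload["enriched"] = True  # hypothetical enrichment step
            line = json.dumps(payload) + "\n"  # newline-delimited JSON for S3
            data = base64.b64encode(line.encode("utf-8")).decode("utf-8")
            output.append({"recordId": record["recordId"], "result": "Ok", "data": data})
        except ValueError:
            # Malformed input: flag it so Firehose can route it to the error output.
            output.append(
                {"recordId": record["recordId"], "result": "ProcessingFailed", "data": record["data"]}
            )
    return {"records": output}
```

Records marked `ProcessingFailed` are the ones that source record backup helps recover later.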

Streaming data ingestion (HTTP) HTTP POST/PUT API Gateway Browser Amazon
S3: Buffered files Amazon Redshift: Table loads Amazon Elasticsearch Service: Domain loads Amazon S3: Source record backup AWS Lambda: Transformations & enrichment Amazon DynamoDB: Lookup tables Raw Lookup Transformed Transformed records Kinesis Firehose: Delivery stream

Streaming data ingestion (at the edge) Amazon S3: Buffered files
Amazon Redshift: Table loads Amazon Elasticsearch Service: Domain loads Amazon S3: Source record backup AWS Lambda: Transformations & enrichment Amazon DynamoDB: Lookup tables Raw Lookup Transformed Transformed records Kinesis Firehose: Delivery stream HTTP POST/PUT CloudFront Lambda@Edge Browser

Kinesis best practices
• Tune Firehose buffer size and buffer interval: larger objects = fewer Lambda invocations & Amazon S3 PUTs
• Enable compression to reduce storage costs
• Enable Parquet format transformation (columnar)
• Enable Source Record Backup for transformations: recover from transformation errors

Kinesis Data Streams and Lambda
• # of shards corresponds to concurrent invocations of Lambda function
• Batch size sets maximum # of records per invocation (min 1, max 10K)
Diagram: Streaming source → Data Stream → Processor Function → Other AWS services
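Each invocation of the processor function receives one batch of records from a single shard, with the payload base64-encoded under `kinesis.data`. A small Python sketch of such a handler; the `measurement` field and the per-key totals are hypothetical and only illustrate the decoding loop:

```python
import base64
import json


def handler(event, context):
    """Process one batch of Kinesis records (at most the configured batch size)."""
    totals = {}
    for record in event["Records"]:
        kinesis = record["kinesis"]
        payload = json.loads(base64.b64decode(kinesis["data"]))
        key = kinesis["partitionKey"]
        # "measurement" is an assumed field name in the producer's JSON payload.
        totals[key] = totals.get(key, 0) + payload["measurement"]
    return {"records_processed": len(event["Records"]), "totals": totals}
```

Because one shard maps to one concurrent invocation, records sharing a partition key arrive in order within a batch.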

Fan-out pattern
• Trade strict message ordering for higher throughput & lower latency
• Increase throughput, reduce processing latency
Diagram: Streaming source → Kinesis Data Streams: Stream → Lambda: Dispatcher function → Lambda: Processor function
github.com/aws-samples/aws-lambda-fanout

Real-time analytics Data Stream Kinesis Data Analytics: Time window aggregation
Kinesis Data Firehose: Error stream S3: Error records Record producers Lambda: Alert function DynamoDB SNS: Notifications

Kinesis Data Analytics: Time window aggregation
Aggregation: 10-minute tumbling window, from source stream to destination stream(s)

CREATE OR REPLACE PUMP "STREAM_PUMP" AS
  INSERT INTO "DESTINATION_SQL_STREAM"
    SELECT STREAM
      "device_id",
      STEP("SOURCE_SQL_STREAM_001".ROWTIME BY INTERVAL '10' MINUTE) AS "window_ts",
      SUM("measurement") AS "sample_sum",
      COUNT(*) AS "sample_count"
    FROM "SOURCE_SQL_STREAM_001"
    GROUP BY
      "device_id",
      STEP("SOURCE_SQL_STREAM_001".ROWTIME BY INTERVAL '10' MINUTE);
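To make the tumbling-window semantics concrete, the same aggregation can be simulated in plain Python: each record's timestamp is floored to the start of its 10-minute window, and sums and counts are kept per (device, window) pair. This is only an offline illustration of what the query computes, with timestamps assumed to be integer epoch seconds:

```python
from collections import defaultdict

WINDOW_SECONDS = 10 * 60  # 10-minute tumbling window, as in the query


def tumbling_window_aggregate(records):
    """records: iterable of (device_id, epoch_seconds, measurement) tuples."""
    totals = defaultdict(lambda: {"sample_sum": 0.0, "sample_count": 0})
    for device_id, timestamp, measurement in records:
        # Floor the timestamp to the start of its window (the role STEP plays in the query).
        window_ts = timestamp - (timestamp % WINDOW_SECONDS)
        bucket = totals[(device_id, window_ts)]
        bucket["sample_sum"] += measurement
        bucket["sample_count"] += 1
    return dict(totals)


sample = [("a", 0, 1.0), ("a", 599, 2.0), ("a", 600, 4.0), ("b", 10, 5.0)]
windows = tumbling_window_aggregate(sample)
```

Windows never overlap: the record at second 599 lands in the first window and the one at second 600 opens the next.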

Pattern 3 Data Lakes

Data lake characteristics
• Collect, store, process, consume, and analyze organizational data
• Structured, semi-structured, and unstructured data
• Decoupled compute and storage
• Fast automated ingestion
• Schema-on-read
• Complementary to data warehouses

Serverless data lake S3 Elasticsearch Glue DynamoDB Catalog & search
Cognito API Gateway API/UI Athena QuickSight Redshift Spectrum Analytics & processing Lambda Kinesis Streams Kinesis Firehose Direct Connect Ingest AWS IoT KMS CloudTrail IAM Macie Security & auditing

Glue Crawlers Glue Data Catalog QuickSight Redshift Spectrum Athena S3
Bucket(s) How to “serverlessly” query your data lake

Athena – Just SQL (Presto)
• Query duration: 44.66 seconds
• Data scanned: 169.53 GB
• Cost*: $0.85
* $5/TB or $0.005/GB

SELECT gram, year, sum(count)
FROM ngram
WHERE gram = 'just say no'
GROUP BY gram, year
ORDER BY year ASC;
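The cost figure follows directly from the amount of data scanned. A quick check in Python, using the $0.005/GB rate quoted on the slide (current pricing may differ):

```python
PRICE_PER_GB = 0.005  # rate quoted on the slide ($5/TB)


def athena_query_cost(gb_scanned):
    """Cost of one query, based only on the data it scans."""
    return gb_scanned * PRICE_PER_GB


cost = athena_query_cost(169.53)
print(f"${cost:.2f}")
```

Since cost scales with bytes scanned, anything that reduces the scan (partitioning, columnar formats, compression) reduces the bill as well.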

Athena best practices
• Partition data: s3://my-bucket/my-data/parquet/year=2018/month=11/day=25/
• Use columnar formats – Apache Parquet or ORC
• Compress files with splittable compression (bzip2)
• Optimize file sizes
aws.amazon.com/blogs/big-data/top-10-performance-tuning-tips-for-amazon-athena
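The partition layout above uses `key=value` path segments, so writers need to produce prefixes in exactly that shape. A small Python helper as a sketch; zero-padding single-digit months and days is an assumption, since the slide only shows two-digit values:

```python
from datetime import date


def partition_prefix(base, day):
    """Build a year=/month=/day= prefix for the given date."""
    # Zero-padding is an assumption; keep it consistent across all writers.
    return f"{base}/year={day.year}/month={day.month:02d}/day={day.day:02d}/"


prefix = partition_prefix("s3://my-bucket/my-data/parquet", date(2018, 11, 25))
print(prefix)
```

Queries that filter on `year`, `month`, and `day` can then skip every prefix that does not match.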

Serverless MapReduce Lambda: Splitter S3 Object DynamoDB: Mapper Results Lambda:
Mappers …. …. Lambda: Reducer S3 Results
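The splitter, mapper, and reducer roles can be illustrated locally with a word count. In this Python sketch each function stands in for one of the Lambda functions in the diagram, and plain in-memory lists replace S3 and DynamoDB; the chunk size and the word-count task are illustrative assumptions:

```python
from collections import Counter
from functools import reduce


def splitter(text, chunk_lines=2):
    """Split the input object into chunks, one per mapper invocation."""
    lines = text.splitlines()
    return [lines[i:i + chunk_lines] for i in range(0, len(lines), chunk_lines)]


def mapper(chunk):
    """Count words in one chunk (would run as one Lambda invocation)."""
    return Counter(word for line in chunk for word in line.lower().split())


def reducer(partials):
    """Merge all partial counts into the final result."""
    return reduce(lambda left, right: left + right, partials, Counter())


def run(text):
    return reducer(mapper(chunk) for chunk in splitter(text))


word_counts = run("a b\nb c\nc c")
```

In the serverless version the mappers run in parallel and write partial results to a shared store, which the reducer reads once they are all complete.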

Pywren - http://pywren.io
• Python library developed at UC Berkeley (University of California, Berkeley)
• Up to 40 TFLOPS of peak compute power
• Over 700 GB/sec of read and 500 GB/sec of write performance using S3
• “numpywren: Serverless Linear Algebra” https://arxiv.org/pdf/1810.09679.pdf

Pattern 4 Machine Learning

The Amazon ML Stack: Broadest & Deepest Set of Capabilities
AI services (Vision, Speech, Language, Chatbots, Forecasting, Recommendations): Rekognition Image, Rekognition Video, Polly, Transcribe, Translate, Comprehend, Comprehend Medical, Lex, Forecast, Textract, Personalize
ML services: Amazon SageMaker (Build, Train, Deploy)
• Pre-built algorithms & notebooks
• Data labeling (Ground Truth)
• One-click model training & tuning
• Optimization (Neo)
• One-click deployment & hosting
• Models without training data (reinforcement learning)
• Algorithms & models (AWS Marketplace)
ML frameworks & infrastructure: Frameworks, Interfaces, Infrastructure (EC2 P3 & P3dn, EC2 C5, FPGAs, Greengrass, Elastic Inference)

Image processing with Amazon Rekognition
1. Upload
2. Submit image
3. Store image
4. DetectFaces
5. DetectLabels
6. DetectModeration
7. DetectText
8. Store metadata & analysis
Components: Image, Step Functions, Lambda, DynamoDB, Elasticsearch

Media analysis solution
• S3: Web interface
• Cognito
• Amazon Rekognition Video: Detect objects, scenes, faces, & celebrities
• Elasticsearch: Search index
• API Gateway: REST APIs
• AWS Elemental MediaConvert: Transcode videos
• S3: Media storage
• Step Functions: Orchestrate analysis
• Transcribe
• Comprehend
https://aws.amazon.com/answers/media-entertainment/media-analysis-solution/

Amazon Connect (Serverless contact center)
• Real time and historical analytics
• High-quality voice capability
• Call recording
• Skills-based routing [Automatic Call Distribution (ACD)]

Intelligent call center chatbot
Components: Customer, Amazon Connect, Amazon Lex, Lambda: Chatbot Processing, DynamoDB: Customer Data, SNS: SMS Messaging
• Customer calls Connect to reschedule an appointment
• Connect calls Lex chatbot
• Lex chatbot calls Lambda function to get customer preferences and fulfil Intents
• Lambda function sends text message confirmation via SNS
• Customer receives appointment confirmation text message
• Lambda function writes updates to DynamoDB

Call center analytics Amazon Connect Customers Agents Call recordings S3:
Call recordings S3: Call transcripts Step Functions Transcribe Lambda S3: Sentiment, key phrases, entities Step Functions S3 Notifications for call transcripts Comprehend Lambda Athena QuickSight Contact trace records (CTRs) Kinesis Data Streams Kinesis Data Firehose S3: CTRs

Go Build! Here to help you build
