Slide 1

Slide 1 text

Building And Automating Serverless Auto-Scaling Data Pipelines In AWS
Damien Jones (he/him)
AWS Consultant @ Steamhaus
2024-10-03
AWS Community Day NL
@amazonwebshark
MrDamienJones
amazonwebshark.com
https://www.flaticon.com/

Slide 2

Slide 2 text

Agenda
Problem Definition
Solution Architecture
Demos
Summary & Questions
github.com/MrDamienJones/Community-Sessions

Slide 3

Slide 3 text

Damien Jones
Consultant @ Steamhaus
Using AWS since 2019
Creator @ amazonwebshark.com
Runner; Keen Gardener; Dog Dad
He/Him
Manchester UK
Fin Fan

Slide 4

Slide 4 text

The 4 Vs Of Big Data
Characteristics of Big Data…
…and events…
…and API requests…
…metrics
…traces
…logs
...

Slide 5

Slide 5 text

Variety “The state of being diverse or varied.”

Slide 6

Slide 6 text

Variety
“The state of being diverse or varied.”
Structure
Intent
Sensitivity

Slide 7

Slide 7 text

Velocity “The speed at which something is moving in a given direction.”

Slide 8

Slide 8 text

Velocity
“The speed at which something is moving in a given direction.”
Streaming or Batch
Synchronous or Asynchronous
Scheduling

Slide 9

Slide 9 text

Veracity “The quality of being true or the habit of telling the truth.”

Slide 10

Slide 10 text

Veracity
“The quality of being true or the habit of telling the truth.”
External Security
Validation & Health
Internal Security

Slide 11

Slide 11 text

Volume “The amount of space occupied.”

Slide 12

Slide 12 text

Volume
“The amount of space occupied.”
Size or Amount
Access Patterns
Backups

Slide 13

Slide 13 text

Building And Automating Serverless Auto-Scaling Data Pipelines In AWS

Slide 14

Slide 14 text

Building And Automating Serverless Auto-Scaling Data Pipelines In AWS

Slide 15

Slide 15 text

Building And Automating Serverless Auto-Scaling Data Pipelines In AWS AWS Lambda Amazon S3

Slide 16

Slide 16 text

AWS Lambda
Serverless compute service
Supports multiple languages
Auto-scales on demand
Up to 1,000 concurrent executions by default (account quota per Region, can be raised)
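
As a rough illustration of the "Lambda Function: API Call" step used later in the demo, the sketch below shows the general shape of a Python handler. It is not the code from the talk. The `fetch` and `store` parameters, the `api_url` and `output_key` event fields, and the default key are all hypothetical; in a deployed function they would typically be an HTTP request and an S3 upload.

```python
import json


def lambda_handler(event, context, fetch=None, store=None):
    """Fetch records from an API and hand them to a storage callback.

    `fetch` and `store` are hypothetical injection points (not part of the
    Lambda runtime) so the sketch runs without network or AWS access.
    """
    fetch = fetch or (lambda url: [])           # stand-in for an HTTP GET
    store = store or (lambda key, body: None)   # stand-in for an S3 upload

    records = fetch(event["api_url"])           # assumed event field
    key = event.get("output_key", "raw/latest.json")
    store(key, json.dumps(records))

    # A small JSON-serialisable result that a later workflow step can read.
    return {"statusCode": 200, "recordCount": len(records), "key": key}
```

Lambda itself only ever passes `event` and `context`; the extra keyword arguments exist so the logic can be exercised locally with fake callbacks.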

Slide 17

Slide 17 text

Amazon S3
Serverless object storage
Store anything for any reason
1000s of requests per second
Object protection & integrity checks
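
S3 request rates scale per key prefix, so the way objects are named matters for a pipeline. The sketch below builds Hive-style partitioned keys (`year=.../month=.../day=...`), a common layout that Glue and Athena can treat as table partitions. The dataset name, partition scheme, and file name are assumptions for illustration, not the layout used in the talk.

```python
from datetime import date


def partitioned_key(dataset: str, day: date, filename: str) -> str:
    """Return a Hive-style partitioned S3 object key for one day's data."""
    return (
        f"{dataset}/year={day.year}/month={day.month:02d}/"
        f"day={day.day:02d}/{filename}"
    )


# Hypothetical dataset and file name, using the date of the talk.
print(partitioned_key("sales", date(2024, 10, 3), "part-0000.parquet"))
```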

Slide 18

Slide 18 text

Building And Automating Serverless Auto-Scaling Data Pipelines In AWS

Slide 19

Slide 19 text

Building And Automating Serverless Auto-Scaling Data Pipelines In AWS Amazon Athena AWS Glue

Slide 20

Slide 20 text

Amazon Athena
Serverless interactive query service
Query & access controls
Create derived tables
Read-Only & Open Table support
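
One way to read "Create derived tables" is Athena's CREATE TABLE AS SELECT (CTAS) statement. The helper below only assembles the SQL text; the table names and S3 location in the usage line are hypothetical, and the statement would still need to be submitted to Athena separately.

```python
def ctas_query(new_table: str, source_table: str, location: str) -> str:
    """Build an Athena CTAS statement that writes Parquet to `location`.

    Values are interpolated directly, so this sketch is only suitable for
    trusted, hard-coded identifiers.
    """
    return (
        f"CREATE TABLE {new_table}\n"
        f"WITH (format = 'PARQUET', external_location = '{location}')\n"
        f"AS SELECT * FROM {source_table}"
    )


# Hypothetical names for illustration only.
print(ctas_query("demo_db.sales_summary", "demo_db.sales_raw",
                 "s3://example-bucket/derived/sales_summary/"))
```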

Slide 21

Slide 21 text

AWS Glue
Fully managed serverless ETL service
Crawlers discover data automatically
Up to 2000 concurrent ETL job runs
Data Catalog indexes data assets

Slide 22

Slide 22 text

Building And Automating Serverless Auto-Scaling Data Pipelines In AWS

Slide 23

Slide 23 text

Building And Automating Serverless Auto-Scaling Data Pipelines In AWS Amazon EventBridge Scheduler AWS Step Functions

Slide 24

Slide 24 text

AWS Step Functions
Serverless task orchestration
Invoke over 220 AWS services / 10k API calls
Design workflows visually and as code
Standard & Express workflows

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

AWS Step Functions Demo
Lambda Function: API Call
Glue Job: ETL
Athena Query: MSCK REPAIR TABLE
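
The three demo steps could be expressed in Amazon States Language roughly as follows. This is a sketch rather than the workflow shown in the talk: the state names, function name, job name, database, table, and workgroup are all placeholders. The `Resource` values are the Step Functions service integrations for Lambda, Glue, and Athena; the `.sync` suffix asks the workflow to wait for the job or query to finish.

```python
import json


def build_definition(function_name, job_name, database, table,
                     workgroup="primary"):
    """Return a three-step state machine definition as a plain dict."""
    return {
        "Comment": "API call, then ETL, then partition repair (sketch)",
        "StartAt": "CallApi",
        "States": {
            "CallApi": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {"FunctionName": function_name,
                               "Payload.$": "$"},
                "Next": "RunEtl",
            },
            "RunEtl": {
                "Type": "Task",
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {"JobName": job_name},
                "Next": "RepairTable",
            },
            "RepairTable": {
                "Type": "Task",
                "Resource": "arn:aws:states:::athena:startQueryExecution.sync",
                "Parameters": {
                    "QueryString": f"MSCK REPAIR TABLE {table}",
                    "QueryExecutionContext": {"Database": database},
                    "WorkGroup": workgroup,
                },
                "End": True,
            },
        },
    }


def execution_order(definition):
    """Follow the Next pointers from StartAt and list the state names."""
    order, name = [], definition["StartAt"]
    while name is not None:
        order.append(name)
        name = definition["States"][name].get("Next")
    return order


# Placeholder resource names; json.dumps gives the text to deploy.
definition = build_definition("api-call-fn", "etl-job", "demo_db", "sales_raw")
print(json.dumps(definition, indent=2))
```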

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

Amazon EventBridge Scheduler
Automate recurring & one-off tasks
Invoke over 220 AWS services
Set times or fixed-rate schedules
Checks target response

Slide 32

Slide 32 text

Amazon EventBridge Scheduler Demo
Set schedule
Link Step Function workflow
Set configuration
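
The same three demo steps can be approximated in code. The sketch below only builds the request parameters for a schedule that starts a Step Functions workflow; it makes no AWS call. The schedule name, ARNs, cron expression, and time zone are placeholders, and the helper function is hypothetical.

```python
import json


def schedule_request(name, state_machine_arn, role_arn, cron,
                     timezone="UTC", payload=None):
    """Build keyword arguments for an EventBridge Scheduler schedule."""
    return {
        "Name": name,
        # Set schedule: a cron expression evaluated in the given time zone.
        "ScheduleExpression": f"cron({cron})",
        "ScheduleExpressionTimezone": timezone,
        # Set configuration: run at the exact time, with no flexible window.
        "FlexibleTimeWindow": {"Mode": "OFF"},
        # Link workflow: the target is the state machine, started with the
        # permissions of the given IAM role.
        "Target": {
            "Arn": state_machine_arn,
            "RoleArn": role_arn,
            "Input": json.dumps(payload or {}),
        },
    }


# Placeholder ARNs and names; runs daily at 06:00 Amsterdam time.
request = schedule_request(
    "daily-pipeline",
    "arn:aws:states:eu-west-1:123456789012:stateMachine:demo-pipeline",
    "arn:aws:iam::123456789012:role/demo-scheduler-role",
    "0 6 * * ? *",
    timezone="Europe/Amsterdam",
)
print(json.dumps(request, indent=2))
```

With credentials configured, the resulting dict could be passed to `boto3.client("scheduler").create_schedule(**request)`.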

Slide 33

Slide 33 text

No content

Slide 34

Slide 34 text

Summary
Problem Definition
Solution Architecture
Demos
Summary & Questions
github.com/MrDamienJones/Community-Sessions

Slide 35

Slide 35 text

Thanks!
github.com/MrDamienJones/Community-Sessions
@amazonwebshark
MrDamienJones
amazonwebshark.com
AWS Community Day NL
Feedback