cloud
• Using Airflow
  ◦ Using dbt-cloud-plugin
  ◦ Using the Bash operator
• Using an automation server (CodeDeploy, GitLab CI/CD, Bamboo or Jenkins)
• Using cron
it anywhere
• Effective isolation and resource sharing
• Improved developer productivity and development pipeline
• Easy integration with Continuous Delivery pipelines
maintaining underlying infrastructure
  ◦ No EC2 machine needed
• Scale your application based on your needs
• Supports longer executions than AWS Lambda (which is limited to 15 minutes)
• Limited volume size: only 4 GB
• In 2019 AWS dropped the pricing for AWS Fargate by up to 50%
: 0.04048 $
• per vCPU per second (e.g. eu-west-1): 0.00001124444444 $

Example 1
A container running for 1 hour every day costs 0.3036 $ per month with 0.25 vCPU.

Example 2
A container running for 20 minutes every hour costs 2.4288 $ per month with 0.25 vCPU.

Note
When running dbt you can use the minimum container size, because the computation happens in the database.
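The two examples can be reproduced with a short calculation. This is a sketch under stated assumptions: it treats 0.04048 $ as the hourly per-vCPU rate (it matches the per-second rate multiplied by 3600), assumes a 30-day month, and counts vCPU cost only, because those assumptions reproduce both totals. Memory charges are not included, and the function name is illustrative.

```python
# Assumption: 0.04048 $ is the per-vCPU hourly rate (per-second rate x 3600).
VCPU_PRICE_PER_HOUR = 0.04048
DAYS_PER_MONTH = 30  # assumption: the examples use a 30-day month


def monthly_vcpu_cost(vcpu: float, hours_per_day: float) -> float:
    """Monthly vCPU-only cost in USD for a task that runs hours_per_day each day."""
    return VCPU_PRICE_PER_HOUR * vcpu * hours_per_day * DAYS_PER_MONTH


# Example 1: 0.25 vCPU, 1 hour every day.
example_1 = monthly_vcpu_cost(vcpu=0.25, hours_per_day=1)

# Example 2: 0.25 vCPU, 20 minutes every hour = 8 hours per day.
example_2 = monthly_vcpu_cost(vcpu=0.25, hours_per_day=24 * 20 / 60)

print(f"Example 1: {example_1:.4f} $ per month")
print(f"Example 2: {example_2:.4f} $ per month")
```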
VPC, Internet Gateway, Subnet, Security Group
• ECR Registry (or Docker repository)
• Elastic Container Service (ECS) cluster: only a logical grouping of tasks
• IAM role for the ECS task + IAM policy
• CloudWatch Log Group
• ECS task definition with launch type FARGATE
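To make the last item concrete, here is a minimal sketch of the arguments a Fargate task definition might take when registered through boto3. The family name, container name, image URI, role ARN, log group and the `dbt run` command are all hypothetical placeholders, and the smallest Fargate size (0.25 vCPU, 512 MiB) is chosen only because dbt pushes the computation to the database.

```python
def build_dbt_task_definition(image_uri, execution_role_arn, log_group, region):
    """Return keyword arguments for the ECS register_task_definition call."""
    return {
        "family": "dbt",  # hypothetical task family name
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",  # Fargate tasks must use awsvpc networking
        "cpu": "256",  # 0.25 vCPU
        "memory": "512",  # MiB
        "executionRoleArn": execution_role_arn,
        "containerDefinitions": [
            {
                "name": "dbt",  # hypothetical container name
                "image": image_uri,
                "essential": True,
                # Assumption: the image has dbt on its PATH.
                "command": ["dbt", "run"],
                "logConfiguration": {
                    "logDriver": "awslogs",
                    "options": {
                        "awslogs-group": log_group,
                        "awslogs-region": region,
                        "awslogs-stream-prefix": "dbt",
                    },
                },
            }
        ],
    }


# All values below are placeholders, not real resources.
definition = build_dbt_task_definition(
    image_uri="123456789012.dkr.ecr.eu-west-1.amazonaws.com/dbt:latest",
    execution_role_arn="arn:aws:iam::123456789012:role/dbt-ecs-task",
    log_group="/ecs/dbt",
    region="eu-west-1",
)

# With AWS credentials configured, this could be registered with:
#   boto3.client("ecs").register_task_definition(**definition)
```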
AWS Step Function.
• A Step Function can be triggered by CloudWatch Events using simple cron syntax
• Step Functions enable the execution of complex workflows
  ◦ We can ingest data from an API using a Lambda function, then trigger a dbt run and a dbt test
  ◦ We can be informed when a dbt run/test fails
• An example can be found here
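The workflow described above (ingest with Lambda, then dbt run, then dbt test, with a notification on failure) could be sketched as a state machine definition built in Python. This is an illustration, not the example the slide links to: every ARN, subnet ID, state name and the schedule are hypothetical, and using SNS for the notification is an assumption.

```python
import json

# Hypothetical schedule for the CloudWatch Events rule: every day at 06:00 UTC.
SCHEDULE_EXPRESSION = "cron(0 6 * * ? *)"

# Hypothetical resource identifiers.
INGEST_LAMBDA_ARN = "arn:aws:lambda:eu-west-1:123456789012:function:ingest"
CLUSTER_ARN = "arn:aws:ecs:eu-west-1:123456789012:cluster/dbt"
ALERT_TOPIC_ARN = "arn:aws:sns:eu-west-1:123456789012:dbt-alerts"
SUBNETS = ["subnet-00000000"]


def dbt_state(command, next_state=None):
    """Build a Task state that runs a dbt command on Fargate and waits for it."""
    state = {
        "Type": "Task",
        "Resource": "arn:aws:states:::ecs:runTask.sync",
        "Parameters": {
            "LaunchType": "FARGATE",
            "Cluster": CLUSTER_ARN,
            "TaskDefinition": "dbt",
            "NetworkConfiguration": {
                "AwsvpcConfiguration": {
                    "Subnets": SUBNETS,
                    "AssignPublicIp": "ENABLED",
                }
            },
            "Overrides": {
                "ContainerOverrides": [{"Name": "dbt", "Command": command}]
            },
        },
        # Any error in this state is routed to the notification state.
        "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "NotifyFailure"}],
    }
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


definition = {
    "StartAt": "IngestFromApi",
    "States": {
        "IngestFromApi": {
            "Type": "Task",
            "Resource": INGEST_LAMBDA_ARN,
            "Next": "DbtRun",
        },
        "DbtRun": dbt_state(["dbt", "run"], next_state="DbtTest"),
        "DbtTest": dbt_state(["dbt", "test"]),
        "NotifyFailure": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sns:publish",
            "Parameters": {"TopicArn": ALERT_TOPIC_ARN, "Message.$": "$.Cause"},
            "Next": "Failed",
        },
        "Failed": {"Type": "Fail"},
    },
}

state_machine_json = json.dumps(definition, indent=2)
```

The resulting JSON string is what would be supplied as the state machine definition; the schedule expression would go on the CloudWatch Events rule that starts it.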
image with the latest models on each merge to your master branch:
• CircleCI (e.g. you can use this orb to deploy to ECR)
• GitLab CI/CD
• AWS CodePipeline with CodeBuild
• GitHub Actions

The ECS task will always use the latest image.
to trigger dbt jobs in ECS from Airflow using boto3.
• Cheap alternative compared to dbt Cloud

Here is an example of what an ECS plugin for Airflow looks like:
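A minimal sketch of the boto3 call such a plugin might wrap, not the plugin code itself. The function name, its parameters and the default container name are hypothetical; the ECS client is passed in so the function can be exercised without AWS access.

```python
def run_dbt_on_ecs(ecs_client, cluster, task_definition, subnets, command,
                   container_name="dbt"):
    """Start a dbt command as a Fargate task and return the task ARN.

    ecs_client is expected to be a boto3 ECS client, i.e. boto3.client("ecs").
    container_name must match the container in the task definition
    ("dbt" is a hypothetical default).
    """
    response = ecs_client.run_task(
        cluster=cluster,
        taskDefinition=task_definition,
        launchType="FARGATE",
        networkConfiguration={
            "awsvpcConfiguration": {
                "subnets": subnets,
                "assignPublicIp": "ENABLED",
            }
        },
        overrides={
            "containerOverrides": [{"name": container_name, "command": command}]
        },
    )
    # run_task reports placement problems in "failures" instead of raising.
    if response.get("failures"):
        raise RuntimeError(f"ECS could not start the task: {response['failures']}")
    return response["tasks"][0]["taskArn"]


# Inside an Airflow DAG this could be wired up roughly as follows
# (cluster, task definition and subnet values are placeholders):
#   PythonOperator(
#       task_id="dbt_run",
#       python_callable=run_dbt_on_ecs,
#       op_kwargs={
#           "ecs_client": boto3.client("ecs"),
#           "cluster": "dbt",
#           "task_definition": "dbt",
#           "subnets": ["subnet-00000000"],
#           "command": ["dbt", "run"],
#       },
#   )
```

This only starts the task; a real plugin would likely also poll the task status so that a failed dbt run fails the Airflow task.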