
Autoscaling with Ladder

An introduction to autoscaling infrastructure and how an autoscaler is implemented, in this case Ladder, a general-purpose autoscaler. https://github.com/themotion/ladder

Xabier Larrakoetxea

June 21, 2017

Transcript

  1. (title slide)

  2. Autoscaling is a method used in cloud computing whereby the amount of computational resources in a server farm, typically measured by the number of active servers, scales automatically based on the load on the farm.
  3. Autoscaling is a method used in cloud computing whereby the amount of computational resources in a server farm, typically measured by the number of active servers, scales automatically based on the load on the farm. The goal is to meet contract requirements in the cheapest way.
  4. Autoscaling: Metrics. Which metric should I use to autoscale my platform? It depends on the requirement.
  5. Autoscaling: Use cases. Streaming video: the requirements are buffering and video quality, so autoscale based on load and latency; the load follows a seasonal, predictable pattern.
  6. Autoscaling: Use cases. Video generation on demand: batch video generation with a 1 h deadline, so autoscale based on video generation time; the load is spiky, on/off and unpredictable.
  7. Ladder: a general-purpose autoscaler. Designed with cloud native in mind; flexible and extensible; reliable; easy setup; simple and light.
  8. Ladder: Motivation. Before Ladder: tied to the infrastructure, poor flexibility, segregation and duplication, complex. With Ladder: decoupled from everything, simple architecture, flexible & extensible, everything in one place & reusable.
  9. Architecture. Reusable blocks; no DB nor SD (service discovery) dependency; YAML configuration; multiple autoscalers; an optional SD lock is planned.
  10. Architecture: Autoscalers. An autoscaler periodically calculates and decides the number of target units to set on the scaling target, based on inputs and policies.
  12. Architecture: Inputters. Inputters are an ephemeral block made of a gatherer and an arranger: it gets a value and returns a valid scaling-target value.
  13. Architecture: Gatherers. Prometheus metrics (the popular one!), AWS SQS messages, AWS CloudWatch. https://themotion.github.io/ladder/blocks/gatherers
  14. Architecture: Gatherers. (Diagram: Autoscalers 1..N, each with Inputters 1..N, and each inputter with its own Gatherer.)
  15. Architecture: Arrangers. Arrangers take a gatherer's input and the current scaling-target value, and return a newly calculated quantity for the scaling target.
  16. Architecture: Autoscalers. (Diagram: each inputter now pairs a Gatherer with an Arranger.)
  17. Architecture: Solvers. Solvers select one of the multiple inputters' results as the final result for the scaling target.
  18. Architecture: Arrangers. (Diagram: a Solver is added after each autoscaler's inputters.)
  19. Architecture: Filters. Filters process the input received from the solver in a chain; a filter can break the chain whenever it decides, and the chain returns the altered (or unaltered) original scaling quantity.
  20. Architecture: Filters. (Diagram: a chain of Filters 1..N follows the Solver.)
  21. Architecture: Scalers. Scalers know the current quantity and how to scale the scaling target up or down. They may wait for the scaling process to finish.
  22. Architecture: Scalers. (Diagram: a Scaler is added after the filter chain.)
  23. Architecture: Ladder. (Diagram: the complete pipeline per autoscaler: Inputters (Gatherer + Arranger) → Solver → Filters 1..N → Scaler.)
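The block pipeline in the slides above can be sketched in Go (the language Ladder itself is written in). This is only an illustrative sketch under stated assumptions; every type and function name below is mine, not Ladder's actual API.

```go
package main

import "fmt"

// Hypothetical types mirroring the slides' pipeline (not Ladder's real API).
type (
	// Gatherer fetches a raw metric value (e.g. from Prometheus or SQS).
	Gatherer func() (float64, error)
	// Arranger turns the gathered value and the current quantity into a
	// desired quantity for the scaling target.
	Arranger func(gathered float64, current int64) int64
	// Solver picks one result among the inputters' candidates.
	Solver func(candidates []int64) int64
	// Filter may alter the quantity; returning true breaks the chain.
	Filter func(q int64) (int64, bool)
)

// Inputter pairs a gatherer with an arranger.
type Inputter struct {
	Gather  Gatherer
	Arrange Arranger
}

// Scaler knows the current quantity and how to apply a new one.
type Scaler struct {
	Current func() (int64, error)
	Scale   func(q int64) error
}

// iterate runs one autoscaler pass: gather -> arrange -> solve -> filter -> scale.
func iterate(inputters []Inputter, solve Solver, filters []Filter, sc Scaler) error {
	current, err := sc.Current()
	if err != nil {
		return err
	}
	candidates := make([]int64, 0, len(inputters))
	for _, in := range inputters {
		v, err := in.Gather()
		if err != nil {
			return err
		}
		candidates = append(candidates, in.Arrange(v, current))
	}
	q := solve(candidates)
	for _, f := range filters {
		var brk bool
		if q, brk = f(q); brk {
			break
		}
	}
	return sc.Scale(q)
}

func main() {
	// Demo wiring: one inputter that wants 10 units, a pick-the-max solver,
	// and a limit filter capping the quantity at 5.
	in := Inputter{
		Gather:  func() (float64, error) { return 10, nil },
		Arrange: func(v float64, _ int64) int64 { return int64(v) },
	}
	maxSolver := func(cs []int64) int64 {
		m := cs[0]
		for _, c := range cs[1:] {
			if c > m {
				m = c
			}
		}
		return m
	}
	limit := func(q int64) (int64, bool) {
		if q > 5 {
			return 5, false
		}
		return q, false
	}
	sc := Scaler{
		Current: func() (int64, error) { return 2, nil },
		Scale:   func(q int64) error { fmt.Println("scaling to", q); return nil },
	}
	_ = iterate([]Inputter{in}, maxSolver, []Filter{limit}, sc)
}
```

Per the slides, Ladder runs multiple such autoscalers, each deciding periodically.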
  24. Requirements. Example: video generation on demand, batch video generation with a 1 h deadline; autoscale based on video generation time; the load is spiky, on/off and unpredictable.
  25. Scaler Example: ASG scaler

      scale:
        kind: aws_autoscaling_group
        config:
          auto_scaling_group_name: "ASG-prod-XXXXXXXX"
          aws_region: "eu-west-1"
          scale_up_wait_duration: 1m
          scale_down_wait_duration: 5s
          force_min_max: true
          remaining_closest_hour_limit_duration: 10m
          max_no_downscale_rch_limit: 180
  26. Scaler Example: ASG scaler (annotated). The config above scales an AWS ASG; it waits 1 min after upscaling and 5 s after downscaling; and before downscaling, every machine needs to have been running for at least 50 min.
  27. Filters Example: limits & spikes

      filters:
        - kind: scaling_kind_interval
          config:
            scale_up_duration: 30s
            scale_down_duration: 20m
        - kind: limit
          config:
            max: 2000
            min: 1
  28. Filters Example: limits & spikes (annotated). The filters above scale only after staying in the same state (wanting to upscale or downscale) for a given duration: upscale after wanting to upscale for 30 s, downscale after wanting to downscale for 20 min. The limit filter protects from scaling too far up or down.
  29. Gatherer Example: state of video generation

      gather:
        kind: prometheus_metric
        config:
          addresses:
            - http://prometheus.prod.bi.themotion.lan
            - http://prometheus2.prod.bi.themotion.lan
          query: >
            sum(number_queue_messages{queue=~".*x_jobs.*"})
            /
            (2400 / (sum(process_duration_seconds{process="x_job", quantile="0.9"} > 0 OR vector(0))))
  30. Gatherer Example: state of video generation (annotated). The query above reads a Prometheus metric in HA (two server addresses); the input is based on the remaining jobs and the processing time.
  31. Gatherer Example. Number of machines required based on process duration & jobs, using the query above with: effective processing time of 1 node in 1 hour: 40 min (2400 s); job processing duration (90th percentile): 75 s; jobs remaining: 4500.
  32. Gatherer Example. Number of machines required based on process duration & jobs: with those numbers, 1 node processes 2400 / 75 = 32 jobs in 1 hour, so 4500 jobs need 4500 / 32 ≈ 141 nodes, dynamically recalculated on every iteration.
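The sizing arithmetic on this slide can be reproduced in a few lines of Go. The function below is illustrative (its name and signature are assumptions); only the numbers come from the slides.

```go
package main

import (
	"fmt"
	"math"
)

// nodesNeeded mirrors the slide's formula: a node offers effectiveSecsPerHour
// seconds of processing per hour and each job takes jobDurationSecs (p90),
// so one node completes effectiveSecsPerHour/jobDurationSecs jobs per hour.
func nodesNeeded(pendingJobs, effectiveSecsPerHour, jobDurationSecs float64) int {
	jobsPerNodePerHour := effectiveSecsPerHour / jobDurationSecs
	return int(math.Ceil(pendingJobs / jobsPerNodePerHour))
}

func main() {
	// Slide numbers: 2400 s effective per node-hour, 75 s p90 job duration,
	// 4500 jobs remaining -> 2400/75 = 32 jobs/node/hour -> 141 nodes.
	fmt.Println(nodesNeeded(4500, 2400, 75)) // prints 141
}
```

This is the same quantity the PromQL query computes on every iteration, with the live values of jobs remaining and p90 duration.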