
Autoscaling with Ladder

An introduction to autoscaling infrastructure and how an autoscaler is implemented, in this case Ladder, a general-purpose autoscaler. https://github.com/themotion/ladder

Xabier Larrakoetxea

June 21, 2017

Transcript

  1. (title slide)

  2. Autoscaling is a method used in cloud computing whereby the amount of computational resources in a server farm, typically measured by the number of active servers, scales automatically based on the load on the farm.
  3. Autoscaling is a method used in cloud computing whereby the amount of computational resources in a server farm, typically measured by the number of active servers, scales automatically based on the load on the farm. The goal is to meet contract requirements in the cheapest way.
  4. Autoscaling: Metrics. Which metric should I use to autoscale my platform? It depends on the requirement.
  5. Autoscaling: Use cases. Streaming video: the requirements are buffering and video quality, so autoscale based on load and latency; the load follows a seasonal, predictable pattern.
  6. Autoscaling: Use cases. Video generation on demand: batch video generation with a 1 h deadline, so autoscale based on video generation time; the load is spiky, on/off and unpredictable.
  7. Ladder: a general-purpose autoscaler. Designed with cloud native in mind; flexible and extensible; reliable; easy setup; simple and light.
  8. Ladder: Motivation. Before Ladder: tied to the infrastructure, poor flexibility, segregation and duplication, complex. With Ladder: decoupled from everything, simple architecture, flexible & extensible, everything in one place & reusable.
  9. Architecture. Reusable blocks; no DB nor SD (service discovery) dependency; YAML configuration; multiple autoscalers; an optional SD lock is planned.
  10. Architecture: Autoscalers. An autoscaler periodically calculates and decides the number of target units to set on the scaling target, based on inputs and policies.
  12. Architecture: Inputters. Inputters are an ephemeral block made of a gatherer and an arranger: it gets a value and returns a valid scaling-target value.
  13. Architecture: Gatherers. Prometheus metrics (the popular one!), AWS SQS messages, AWS CloudWatch. https://themotion.github.io/ladder/blocks/gatherers
  14. Architecture: Gatherers. (Diagram: Autoscalers 1..N, each with Inputters 1..N, and each inputter with its own Gatherer.)
  15. Architecture: Arrangers. Arrangers take a gatherer's input and the current scaling-target value, and return a newly calculated quantity for the scaling target.
  16. Architecture: Autoscalers. (Diagram: each inputter now pairs a Gatherer with an Arranger.)
  17. Architecture: Solvers. Solvers select one of the multiple inputters' results as the final result for the scaling target.
  18. Architecture: Arrangers. (Diagram: a Solver is added after each autoscaler's inputters.)
  19. Architecture: Filters. Filters process the input received from the solver in a chain; a filter can break the chain whenever it decides, and the chain returns the altered (or unaltered) original scaling quantity.
  20. Architecture: Filters. (Diagram: a chain of Filters 1..N follows the Solver.)
  21. Architecture: Scalers. Scalers know the current quantity and how to scale the scaling target up or down. They may wait for the scaling process to finish.
  22. Architecture: Scalers. (Diagram: a Scaler is added after the filter chain.)
  23. Architecture: Ladder. (Diagram: the complete pipeline per autoscaler: Inputters (Gatherer + Arranger) → Solver → Filters 1..N → Scaler.)
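The block pipeline in the slides above can be sketched in Go (the language Ladder itself is written in). This is only an illustrative sketch under stated assumptions; every type and function name below is mine, not Ladder's actual API.

```go
package main

import "fmt"

// Hypothetical types mirroring the slides' pipeline (not Ladder's real API).
type (
	// Gatherer fetches a raw metric value (e.g. from Prometheus or SQS).
	Gatherer func() (float64, error)
	// Arranger turns the gathered value and the current quantity into a
	// desired quantity for the scaling target.
	Arranger func(gathered float64, current int64) int64
	// Solver picks one result among the inputters' candidates.
	Solver func(candidates []int64) int64
	// Filter may alter the quantity; returning true breaks the chain.
	Filter func(q int64) (int64, bool)
)

// Inputter pairs a gatherer with an arranger.
type Inputter struct {
	Gather  Gatherer
	Arrange Arranger
}

// Scaler knows the current quantity and how to apply a new one.
type Scaler struct {
	Current func() (int64, error)
	Scale   func(q int64) error
}

// iterate runs one autoscaler pass: gather -> arrange -> solve -> filter -> scale.
func iterate(inputters []Inputter, solve Solver, filters []Filter, sc Scaler) error {
	current, err := sc.Current()
	if err != nil {
		return err
	}
	candidates := make([]int64, 0, len(inputters))
	for _, in := range inputters {
		v, err := in.Gather()
		if err != nil {
			return err
		}
		candidates = append(candidates, in.Arrange(v, current))
	}
	q := solve(candidates)
	for _, f := range filters {
		var brk bool
		if q, brk = f(q); brk {
			break
		}
	}
	return sc.Scale(q)
}

func main() {
	// Demo wiring: one inputter that wants 10 units, a pick-the-max solver,
	// and a limit filter capping the quantity at 5.
	in := Inputter{
		Gather:  func() (float64, error) { return 10, nil },
		Arrange: func(v float64, _ int64) int64 { return int64(v) },
	}
	maxSolver := func(cs []int64) int64 {
		m := cs[0]
		for _, c := range cs[1:] {
			if c > m {
				m = c
			}
		}
		return m
	}
	limit := func(q int64) (int64, bool) {
		if q > 5 {
			return 5, false
		}
		return q, false
	}
	sc := Scaler{
		Current: func() (int64, error) { return 2, nil },
		Scale:   func(q int64) error { fmt.Println("scaling to", q); return nil },
	}
	_ = iterate([]Inputter{in}, maxSolver, []Filter{limit}, sc)
}
```

Per the slides, Ladder runs multiple such autoscalers, each deciding periodically.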
  24. Requirements. Example: video generation on demand, batch video generation with a 1 h deadline; autoscale based on video generation time; the load is spiky, on/off and unpredictable.
  25. Scaler Example: ASG scaler

      scale:
        kind: aws_autoscaling_group
        config:
          auto_scaling_group_name: "ASG-prod-XXXXXXXX"
          aws_region: "eu-west-1"
          scale_up_wait_duration: 1m
          scale_down_wait_duration: 5s
          force_min_max: true
          remaining_closest_hour_limit_duration: 10m
          max_no_downscale_rch_limit: 180
  26. Scaler Example: ASG scaler (annotated). The config above scales an AWS ASG; it waits 1 min after upscaling and 5 s after downscaling; and before downscaling, every machine needs to have been running for at least 50 min.
  27. Filters Example: limits & spikes

      filters:
        - kind: scaling_kind_interval
          config:
            scale_up_duration: 30s
            scale_down_duration: 20m
        - kind: limit
          config:
            max: 2000
            min: 1
  28. Filters Example: limits & spikes (annotated). The filters above scale only after staying in the same state (wanting to upscale or downscale) for a given duration: upscale after wanting to upscale for 30 s, downscale after wanting to downscale for 20 min. The limit filter protects from scaling too far up or down.
  29. Gatherer Example: state of video generation

      gather:
        kind: prometheus_metric
        config:
          addresses:
            - http://prometheus.prod.bi.themotion.lan
            - http://prometheus2.prod.bi.themotion.lan
          query: >
            sum(number_queue_messages{queue=~".*x_jobs.*"})
            /
            (2400 / (sum(process_duration_seconds{process="x_job", quantile="0.9"} > 0 OR vector(0))))
  30. Gatherer Example: state of video generation (annotated). The query above reads a Prometheus metric in HA (two server addresses); the input is based on the remaining jobs and the processing time.
  31. Gatherer Example. Number of machines required based on process duration & jobs, using the query above with: effective processing time of 1 node in 1 hour: 40 min (2400 s); job processing duration (90th percentile): 75 s; jobs remaining: 4500.
  32. Gatherer Example. Number of machines required based on process duration & jobs: with those numbers, 1 node processes 2400 / 75 = 32 jobs in 1 hour, so 4500 jobs need 4500 / 32 ≈ 141 nodes, dynamically recalculated on every iteration.
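The sizing arithmetic on this slide can be reproduced in a few lines of Go. The function below is illustrative (its name and signature are assumptions); only the numbers come from the slides.

```go
package main

import (
	"fmt"
	"math"
)

// nodesNeeded mirrors the slide's formula: a node offers effectiveSecsPerHour
// seconds of processing per hour and each job takes jobDurationSecs (p90),
// so one node completes effectiveSecsPerHour/jobDurationSecs jobs per hour.
func nodesNeeded(pendingJobs, effectiveSecsPerHour, jobDurationSecs float64) int {
	jobsPerNodePerHour := effectiveSecsPerHour / jobDurationSecs
	return int(math.Ceil(pendingJobs / jobsPerNodePerHour))
}

func main() {
	// Slide numbers: 2400 s effective per node-hour, 75 s p90 job duration,
	// 4500 jobs remaining -> 2400/75 = 32 jobs/node/hour -> 141 nodes.
	fmt.Println(nodesNeeded(4500, 2400, 75)) // prints 141
}
```

This is the same quantity the PromQL query computes on every iteration, with the live values of jobs remaining and p90 duration.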