Kubernetes on Spot Instances

How Delivery Hero uses Spot Instances in production. The caveats of running on spots can teach you a great deal about the resiliency of your applications.

Vojtěch Vondra

February 27, 2019

Transcript

  1. a cost-effective and reliable cloud setup
    Kubernetes on Spot Instances


  2. Own delivery fleets in 40 countries, delivering over a
    million orders a week
    Workloads across 4 continents in 3 AWS regions
    Typically a few hundred Kubernetes nodes running for
    web and worker workloads
    Logistics Tech in Delivery Hero


  3. Rails and Spring for web workloads, Scala & Akka for real-
    time apps, Python, R and various solvers for batch tasks
    Mostly Amazon RDS for PostgreSQL and Amazon
    DynamoDB for persistence
    Kubernetes deployments currently transitioning from
    kops to Amazon EKS, usually 1-2 minor versions behind
    Logistics Tech in Delivery Hero


  4. Cloud has been an obvious choice for us
    sample country highlighted:
    we open the floodgates at 11.30am
    dealing with peaks and spikiness


  5. …without spending time tuning to downscale
    Primary objective:
    Save money on infrastructure


  6. Our typical monthly AWS bill structure
    EC2 is typically the biggest component of our bills


  7. Focusing on big-impact savings first
    Where are the largest marginal improvements?
    → with the biggest cost contributor: EC2


  8. We could only reserve our base load without peaks, or be less elastic
    The business is too volatile for capacity predictions
    Our workloads change over time unpredictably (memory- vs. CPU-intensive)

    We do use them for Kubernetes masters (at least until we've migrated to EKS)
    Why not Reserved Instances?


  9. Impact on our AWS bill
    [chart: cost per order; order growth vs cost growth]


  10. Impact on our AWS bill
    [chart: cost per order; order growth vs cost growth, with the point where spots
    were introduced marked]


  11. Example regional workload


  12. Overview of Spot Instances
    Refresher on how the Spot Instance Market
    works


  13. Refresher on spot fleets
    Instead of a fixed list price, there is a floating market price at
    which instance types are available. This price is typically less than
    half of the list price.
    The catch?
    1. AWS can take the instance back from you at any time with a 2-minute
    warning, if the capacity is needed elsewhere.
    2. Some of your preferred instance types can be sold out.


  14. Refresher on spot fleets
    Spot instance markets are defined by an AZ and instance type.
    If you choose 2 instance types in 3 AZs, you are bidding in 6
    different instance spot markets (pools).
    The more pools you specify, the lower the chance of not being able
    to maintain the target capacity of the fleet (see the sketch below).
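
    As a rough illustration of those pools, here is a hedged sketch of a spot fleet
    request spanning 2 instance types and 3 subnets (one per AZ), i.e. 6 pools; the
    instance types, AMI, subnet IDs, role ARN and capacity are placeholders, not our
    production values:

    # 2 instance types x 3 subnets (AZs) = 6 spot pools to draw capacity from
    aws ec2 request-spot-fleet --spot-fleet-request-config '{
      "TargetCapacity": 10,
      "AllocationStrategy": "diversified",
      "IamFleetRole": "arn:aws:iam::111111111111:role/spot-fleet-role",
      "LaunchSpecifications": [
        {"InstanceType": "m5.xlarge", "SubnetId": "subnet-aaaa1111", "ImageId": "ami-00000000000000000"},
        {"InstanceType": "m5.xlarge", "SubnetId": "subnet-bbbb2222", "ImageId": "ami-00000000000000000"},
        {"InstanceType": "m5.xlarge", "SubnetId": "subnet-cccc3333", "ImageId": "ami-00000000000000000"},
        {"InstanceType": "c5.xlarge", "SubnetId": "subnet-aaaa1111", "ImageId": "ami-00000000000000000"},
        {"InstanceType": "c5.xlarge", "SubnetId": "subnet-bbbb2222", "ImageId": "ami-00000000000000000"},
        {"InstanceType": "c5.xlarge", "SubnetId": "subnet-cccc3333", "ImageId": "ami-00000000000000000"}
      ]
    }'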


  15. A lot of recent effort went into spot instance usability


  16. Provisioning spot fleets today
    Good console experience
    Well supported by infra automation tooling


  17. Adapting to spot fleets for a more resilient
    architecture
    General challenges with spots


  18. Termination handling
    Close all connections
    Finish any long polling
    Stop in-progress worker jobs
    Terminate your pods
    Remove from Load Balancer
    Re-scale to target capacity
    curl http://169.254.169.254/latest/meta-data/spot/termination-time
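
    A minimal sketch of what a node-local watcher for that endpoint can do; the node
    name lookup and the drain flags are assumptions, not our exact handler:

    #!/bin/sh
    # The endpoint returns 404 until the instance is marked for termination,
    # then 200 with the termination timestamp (roughly 2 minutes ahead).
    NODE_NAME=$(hostname)
    while true; do
      if curl -sf http://169.254.169.254/latest/meta-data/spot/termination-time >/dev/null; then
        # Cordon the node and evict pods so they reschedule onto surviving nodes
        kubectl drain "$NODE_NAME" --ignore-daemonsets --delete-local-data --force
        break
      fi
      sleep 5
    done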


  19. - That’s like Chaos Monkey, right?
    Yes, but it happens 24/7, not just business hours, and
    several instances might disappear at the same time.
    - My CTO’s never going to risk that!
    Termination handling


  20. Actual issues arising from the volatility
    Applications not terminating gracefully
    → abruptly terminated connections, stuck jobs
    Too much of the target capacity collocated on terminated nodes
    → too many pods of a deployment being affected at once
    New capacity not starting fast enough
    → a lot of apps starting at the same time can cause CPU starvation


  21. We spent some time making Java pods with Spring boot up more
    efficiently
    They have an ugly pattern of using 100% CPU until all classes
    are loaded, and then sitting mostly idle even under load
    What helped: -XX:TieredStopAtLevel=1 and removing
    bytecode-instrumenting APM agents
    Case study: Spring Boot behavior on boot
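
    A one-line sketch of wiring up that flag, assuming the JVM in the container picks
    up JAVA_TOOL_OPTIONS; the deployment name below is a placeholder:

    # Stop tiered compilation at C1 so startup burns less CPU on JIT work
    kubectl set env deployment/sample-spring-app JAVA_TOOL_OPTIONS="-XX:TieredStopAtLevel=1"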


  22. Kubernetes native handling
    of spot instance terminations
    How to approach all the challenges using
    Kubernetes


  23. DaemonSet running on all nodes which drains the node
    immediately upon seeing the notice.
    Optional Slack notification to give you a log to correlate
    monitoring noise / job disruptions with terminations.
    github.com/helm/charts/tree/master/incubator/kube-spot-termination-notice-handler
    (don’t worry, you’ll find all the links on the last slide)
    Spot Termination Notice Handler DaemonSet
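
    Roughly how the chart can be installed (Helm 2 syntax, current at the time); the
    release name and the Slack webhook value name are assumptions, so check the
    chart's values.yaml:

    helm repo add incubator https://kubernetes-charts-incubator.storage.googleapis.com/
    helm install incubator/kube-spot-termination-notice-handler \
      --name spot-termination-handler \
      --set slackUrl=https://hooks.slack.com/services/REPLACE_ME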


  24. Slack Notifications


  25. Spot instances can stick around
    for a long time (~1 year is no
    problem)
    Pods will pile up on those nodes;
    Kubernetes won't reschedule them by
    itself.
    Descheduler


  26. Prevent pods of the same deployment from running on the same node
    Target node CPU/memory utilization → redistribute pods to
    less utilized nodes
    github.com/kubernetes-incubator/descheduler
    Descheduler
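
    A hedged sketch of a descheduler policy file enabling the two strategies above,
    following the v1alpha1 schema from the descheduler repo; the thresholds are
    illustrative, not our production values:

    # policy.yaml, mounted into the descheduler pod (e.g. via a ConfigMap)
    apiVersion: descheduler/v1alpha1
    kind: DeschedulerPolicy
    strategies:
      RemoveDuplicates:
        enabled: true            # spread pods of one deployment across nodes
      LowNodeUtilization:
        enabled: true            # rebalance pods away from busy nodes
        params:
          nodeResourceUtilizationThresholds:
            thresholds:          # below this, a node counts as underutilized
              cpu: 20
              memory: 20
            targetThresholds:    # above this, a node is a source of evictions
              cpu: 50
              memory: 50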


  27. Goal: always have enough capacity to launch new pods
    Multiple strategies in Delivery Hero:
    1. Scaling spot fleet based on CPU/RAM reservations, not usage
    2. Overprovisioning using PodPriority
    github.com/helm/charts/tree/master/stable/cluster-overprovisioner
    Auto-scaling strategy
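
    A minimal sketch of the PodPriority overprovisioning idea (the same pattern the
    cluster-overprovisioner chart packages); the names, replica count and request
    sizes below are placeholders:

    # 1) A negative-priority class that any real workload preempts
    kubectl create priorityclass overprovisioning --value=-1 \
      --description="placeholder pods reserving spare capacity"
    # 2) Placeholder pods that only hold CPU/RAM requests (pause containers)
    kubectl create deployment overprovisioning --image=k8s.gcr.io/pause:3.1
    kubectl scale deployment/overprovisioning --replicas=2
    kubectl set resources deployment/overprovisioning --requests=cpu=1,memory=1Gi
    kubectl patch deployment overprovisioning \
      -p '{"spec":{"template":{"spec":{"priorityClassName":"overprovisioning"}}}}'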


  28. Auto-scaling spot fleet based on custom metrics
    [diagram: a DaemonSet publishes % CPU/RAM reserved per node to AWS CloudWatch;
    a Spot Fleet autoscaling policy scales the fleet between thresholds of 80% and
    50% CPU reserved]
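
    A hedged sketch of the publishing side of that loop; the namespace, metric and
    dimension names are invented for illustration, and the value would come from
    comparing the node's pod resource requests against its allocatable capacity:

    # Publish "% of allocatable CPU already reserved by pod requests"
    aws cloudwatch put-metric-data \
      --namespace "Custom/Kubernetes" \
      --metric-name CPUReservedPercent \
      --dimensions SpotFleet=worker-fleet \
      --unit Percent \
      --value 72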


  29. Auto-scaling spot fleet based on custom metrics


  30. Our dispatching algo runs on Akka
    It’s a stateful, actor-based framework for distributed and
    concurrent apps
    The volatility of spots forced us to fix
    very broken cluster formation and split-brain situations
    Case Study: beyond stateless apps, resiliency with stateful components


  31. Take it step by step: move your most stateless, fastest-booting
    pods over first, then continue one by one and monitor for noise.
    We migrated from on-demand to spot within 6 months.
    No need to rush it
    our current fleet composition


  32. Thanks! Questions?
    Vojtěch Vondra (@vvondra)
    tech.deliveryhero.com
