Slide 1

Slide 1 text

Kubernetes on Spot Instances: a cost-effective and reliable cloud setup

Slide 2

Slide 2 text

Logistics Tech in Delivery Hero. Own delivery fleets in 40 countries, delivering over a million orders a week. Workloads across 4 continents in 3 AWS regions. Typically a few hundred Kubernetes nodes running for web and worker workloads.

Slide 3

Slide 3 text

Logistics Tech in Delivery Hero. Rails and Spring for web workloads, Scala & Akka for real-time apps, Python, R and various solvers for batch tasks. Mostly Amazon RDS for PostgreSQL and Amazon DynamoDB for persistence. Kubernetes deployments currently transitioning from kops to Amazon EKS, usually 1-2 minor versions behind.

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

Dealing with peaks and spikiness. Cloud has been an obvious choice for us. Sample country highlighted: we open the floodgates at 11:30am.

Slide 6

Slide 6 text

Primary objective: save money on infrastructure … without spending time tuning to downscale.

Slide 7

Slide 7 text

Our typical monthly AWS bill structure: EC2 is typically the biggest component of our bills.

Slide 8

Slide 8 text

Focusing on big-impact savings first. Where are the largest marginal improvements? → With the biggest cost contributor: EC2.

Slide 9

Slide 9 text

Why not Reserved Instances? We could only reserve our base load without peaks, or be less elastic. The business is too volatile for capacity predictions. Our workloads change over time unpredictably (memory- vs. CPU-intensive). … We do use them for Kubernetes masters (at least until we've migrated to EKS).

Slide 10

Slide 10 text

Impact on our AWS bill. Charts: cost per order; order growth vs. cost growth.

Slide 11

Slide 11 text

Impact on our AWS bill. Charts: cost per order; order growth vs. cost growth. Annotation: spots introduced.

Slide 12

Slide 12 text

Example regional workload

Slide 13

Slide 13 text

Overview of Spot Instances: a refresher on how the Spot Instance market works.

Slide 14

Slide 14 text

Refresher on spot fleets. Instead of a fixed list price, there is a floating market price at which instance types are available. This price is typically less than 50% of the list price. The catch? 1. AWS can take the instance back from you anytime, with a 2-minute warning, if it's needed elsewhere. 2. Some of your preferred instance types can be sold out.
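To make the floating price concrete, here is a minimal Python sketch that fetches the latest spot price per pool, assuming boto3 and configured AWS credentials; the region and instance types are only examples:

```python
# Minimal sketch: fetch the latest spot price per pool with boto3.
# Assumes AWS credentials are configured; region and types are illustrative.
from datetime import datetime, timezone

import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")
resp = ec2.describe_spot_price_history(
    InstanceTypes=["m5.large", "c5.large"],
    ProductDescriptions=["Linux/UNIX"],
    StartTime=datetime.now(timezone.utc),  # "now" -> only the current price
    MaxResults=10,
)
for price in resp["SpotPriceHistory"]:
    print(price["AvailabilityZone"], price["InstanceType"], price["SpotPrice"])
```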

Slide 15

Slide 15 text

Refresher on spot fleets. Spot instance markets are defined by an AZ and an instance type. If you choose 2 instance types in 3 AZs, you are bidding in 6 different spot instance markets (pools). The more pools you specify, the lower the chance of not being able to maintain the target capacity of a fleet.
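A toy Python illustration of the pool arithmetic (the instance types and AZs are examples):

```python
# Each (instance type, AZ) combination is a separate spot market (pool).
from itertools import product

instance_types = ["m5.large", "m5.xlarge"]        # 2 instance types
azs = ["eu-west-1a", "eu-west-1b", "eu-west-1c"]  # 3 AZs

pools = list(product(instance_types, azs))
print(len(pools), "pools:", pools)  # 6 pools
```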

Slide 16

Slide 16 text

A lot of recent effort went into spot instance usability

Slide 17

Slide 17 text

Provisioning spot fleets today: good console experience, well supported by infra automation tooling.

Slide 18

Slide 18 text

General challenges with spots: adapting to spot fleets for a more resilient architecture.

Slide 19

Slide 19 text

Termination handling: close all connections, finish any long polling, stop in-progress worker jobs, terminate your pods, remove the node from the Load Balancer, re-scale to target capacity. Check for a pending termination with: curl http://169.254.169.254/latest/meta-data/spot/termination-time
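A minimal Python sketch of the polling side. The metadata endpoint is the one above; the drain command and the NODE_NAME environment variable are assumptions for illustration:

```python
# Poll the instance metadata endpoint and drain the node once a spot
# termination is scheduled. The endpoint returns 404 until a notice exists.
import os
import subprocess
import time
import urllib.error
import urllib.request

URL = "http://169.254.169.254/latest/meta-data/spot/termination-time"
NODE_NAME = os.environ["NODE_NAME"]  # assumed injected via the Downward API

def termination_pending() -> bool:
    try:
        with urllib.request.urlopen(URL, timeout=1) as resp:
            return resp.status == 200  # body holds the termination timestamp
    except (urllib.error.URLError, OSError):
        return False  # 404 (no notice yet) or endpoint unreachable

while not termination_pending():
    time.sleep(5)

# Roughly two minutes left: evict pods so they restart elsewhere.
subprocess.run(["kubectl", "drain", NODE_NAME, "--ignore-daemonsets"])
```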

Slide 20

Slide 20 text

Termination handling. - That's like Chaos Monkey, right? - Yes, but it happens 24/7, not just during business hours, and several instances might disappear at the same time. - My CTO's never going to risk that!

Slide 21

Slide 21 text

Actual issues arising from the volatility: Applications not terminating gracefully → abruptly terminated connections, stuck jobs. Too much of the target capacity collocated on terminated nodes → too many pods of one deployment affected at once. New capacity not starting fast enough → a lot of apps starting at the same time can cause CPU starvation.
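For the first issue, a minimal runnable Python sketch of graceful SIGTERM handling in a worker; the job function here is a hypothetical stand-in for real worker logic:

```python
# Stop taking new work on SIGTERM and finish the in-progress job,
# instead of dying mid-task when the pod is evicted.
import signal
import time

shutting_down = False

def on_sigterm(signum, frame):
    global shutting_down
    shutting_down = True  # Kubernetes sends SIGTERM before killing the pod

signal.signal(signal.SIGTERM, on_sigterm)

def process_one_job():
    # Hypothetical stand-in for the worker's real job logic.
    time.sleep(1)

while not shutting_down:
    process_one_job()  # current job completes; no new job is started

print("drained, exiting cleanly")  # e.g. close DB pools, finish long polling
```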

Slide 22

Slide 22 text

Case study: Spring Boot behavior on boot. We spent some time making Java pods w/ Spring boot up more efficiently. They have an ugly pattern of using 100% CPU until all classes are loaded and then sitting mostly idle under load. Some help: -XX:TieredStopAtLevel=1 and removing bytecode-instrumenting APM monitoring.

Slide 23

Slide 23 text

Kubernetes-native handling of spot instance terminations: how to approach all the challenges using Kubernetes.

Slide 24

Slide 24 text

Spot Termination Notice Handler DaemonSet. A DaemonSet running on all nodes which drains the node immediately upon seeing the notice. Optional Slack notification to give you a log to correlate monitoring noise / job disruptions with terminations. github.com/helm/charts/tree/master/incubator/kube-spot-termination-notice-handler (don't worry, you'll find all the links on the last slide)
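The Slack part boils down to a webhook POST. A sketch in Python; the webhook environment variable and message format are assumptions, not the chart's actual payload:

```python
# Post a spot termination notice to a Slack incoming webhook.
import json
import os
import urllib.request

def notify_slack(node_name: str) -> None:
    payload = {"text": f"Spot termination notice received on {node_name}"}
    req = urllib.request.Request(
        os.environ["SLACK_WEBHOOK_URL"],  # assumed configuration
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req, timeout=5)
```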

Slide 25

Slide 25 text

Slack Notifications

Slide 26

Slide 26 text

Descheduler. Spot instances can stick around for a long time (~1 year is no problem). Pods will pile up on those nodes; Kubernetes won't reschedule by itself.

Slide 27

Slide 27 text

Descheduler. Prevents pods of the same deployment running on the same node. Targets node CPU/memory utilization → redistributes pods to less utilized nodes. github.com/kubernetes-incubator/descheduler
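The real descheduler is a Go component; purely to illustrate the duplicate-removal idea, here is a simplified Python sketch using the official kubernetes client (not production logic):

```python
# Simplified illustration of duplicate removal: evict all but one pod per
# (node, owner) pair so the scheduler can spread them out again.
from collections import defaultdict

from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

by_node_and_owner = defaultdict(list)
for pod in v1.list_pod_for_all_namespaces().items:
    if pod.spec.node_name and pod.metadata.owner_references:
        owner = pod.metadata.owner_references[0].name  # e.g. the ReplicaSet
        by_node_and_owner[(pod.spec.node_name, owner)].append(pod)

for (node, owner), pods in by_node_and_owner.items():
    for duplicate in pods[1:]:
        v1.delete_namespaced_pod(duplicate.metadata.name,
                                 duplicate.metadata.namespace)
```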

Slide 28

Slide 28 text

Auto-scaling strategy. Goal: always have enough capacity to launch new pods. Multiple strategies in Delivery Hero: 1. Scaling the spot fleet based on CPU/RAM reservations, not usage. 2. Overprovisioning using PodPriority. github.com/helm/charts/tree/master/stable/cluster-overprovisioner

Slide 29

Slide 29 text

Auto-scaling the spot fleet based on custom metrics. Diagram: a DaemonSet reports the % of CPU/RAM reserved per node to AWS CloudWatch, which drives a Spot Fleet autoscaling policy (example nodes shown at 80% and 50% CPU reserved).
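A sketch of what such a DaemonSet pod could do, in Python; the metric and namespace names, the NODE_NAME variable, and the quantity parsing are simplified assumptions:

```python
# Publish the percentage of allocatable CPU that pods have reserved
# (requested) on this node as a custom CloudWatch metric.
import os

import boto3
from kubernetes import client, config

def millicores(quantity: str) -> int:
    # Handles the two common CPU quantity forms: "2" (cores) and "500m".
    return int(quantity[:-1]) if quantity.endswith("m") else int(quantity) * 1000

config.load_incluster_config()        # running inside the cluster
v1 = client.CoreV1Api()
node_name = os.environ["NODE_NAME"]   # assumed injected via the Downward API

allocatable = millicores(v1.read_node(node_name).status.allocatable["cpu"])

pods = v1.list_pod_for_all_namespaces(
    field_selector=f"spec.nodeName={node_name}").items
requested = sum(
    millicores(c.resources.requests["cpu"])
    for pod in pods
    for c in pod.spec.containers
    if c.resources and c.resources.requests and "cpu" in c.resources.requests
)

boto3.client("cloudwatch").put_metric_data(
    Namespace="Custom/Kubernetes",           # assumed namespace
    MetricData=[{
        "MetricName": "CPUReservedPercent",  # assumed metric name
        "Dimensions": [{"Name": "NodeName", "Value": node_name}],
        "Value": 100.0 * requested / allocatable,
        "Unit": "Percent",
    }],
)
```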

Slide 30

Slide 30 text

Auto-scaling spot fleet based on custom metrics

Slide 31

Slide 31 text

Case Study: beyond stateless apps, resiliency with stateful components. Our dispatching algorithm runs on Akka, a stateful, actor-based framework for distributed and concurrent apps. The volatility of spots forced us to fix very broken cluster formation and split-brain situations.

Slide 32

Slide 32 text

No need to rush it. Take it step by step: move your most stateless, fast-to-boot pods over first, then continue one by one and monitor for noise. We migrated from on-demand over to spot within 6 months. Chart: our current fleet composition.

Slide 33

Slide 33 text

Thanks! Questions? Vojtěch Vondra (@vvondra) tech.deliveryhero.com