Kubernetes on Spot Instances

How Delivery Hero uses Spot Instances in production. The caveats of running on spots can teach you a great deal about the resiliency of your applications.

Vojtěch Vondra

February 27, 2019

Transcript

  1. a cost-effective and reliable cloud setup
    Kubernetes on Spot Instances


  2. Own delivery fleets in 40 countries, delivering over a
    million orders a week
    Workloads across 4 continents in 3 AWS regions
    Typically a few hundred Kubernetes nodes running for
    web and worker workloads
    Logistics Tech in Delivery Hero


  3. Rails and Spring for web workloads, Scala & Akka for real-
    time apps, Python, R and various solvers for batch tasks
    Mostly Amazon RDS for PostgreSQL and Amazon
    DynamoDB for persistence
    Kubernetes deployments currently transitioning from
    kops to Amazon EKS, usually 1-2 minor versions behind
    Logistics Tech in Delivery Hero


  4. Cloud has been an obvious choice for us
    sample country highlighted:
    we open the floodgates at 11.30am
    dealing with peaks and spikiness


  5. …without spending time tuning to downscale
    Primary objective:
    Save money on infrastructure


  6. Our typical monthly AWS bill structure
    EC2 is typically the biggest component of our bills


  7. Focusing on big-impact savings first
    Where are the largest marginal improvements?
    → with the biggest cost contributor: EC2


  8. We could only reserve our base load without peaks, or be less elastic
    The business is too volatile for capacity predictions
    Our workloads change over time unpredictably (memory- vs. CPU-intensive)

    We do use them for Kubernetes masters (at least until we've migrated to EKS)
    Why not Reserved Instances?


  9. Impact on our AWS bill
    [chart: cost per order; order growth vs cost growth]


  10. Impact on our AWS bill
    [chart: cost per order; order growth vs cost growth, with the point where spots
    were introduced marked]


  11. Example regional workload


  12. Overview of Spot Instances
    Refresher on how the Spot Instance Market
    works


  13. Refresher on spot fleets
    Instead of a fixed list price, there is a floating market price at
    which instance types are available. This price is typically less than
    half of the list price.
    The catch?
    1. AWS can take the instance back from you at any time with a 2-minute
    warning, if the capacity is needed elsewhere.
    2. Some of your preferred instance types can be sold out.


  14. Refresher on spot fleets
    Spot instance markets are defined by an AZ and instance type.
    If you choose 2 instance types in 3 AZs, you are bidding in 6
    different instance spot markets (pools).
    The more pools you specify, the lower the chance of not being able
    to maintain the target capacity of the fleet (see the sketch below).
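
    As a rough illustration of those pools, here is a hedged sketch of a spot fleet
    request spanning 2 instance types and 3 subnets (one per AZ), i.e. 6 pools; the
    instance types, AMI, subnet IDs, role ARN and capacity are placeholders, not our
    production values:

    # 2 instance types x 3 subnets (AZs) = 6 spot pools to draw capacity from
    aws ec2 request-spot-fleet --spot-fleet-request-config '{
      "TargetCapacity": 10,
      "AllocationStrategy": "diversified",
      "IamFleetRole": "arn:aws:iam::111111111111:role/spot-fleet-role",
      "LaunchSpecifications": [
        {"InstanceType": "m5.xlarge", "SubnetId": "subnet-aaaa1111", "ImageId": "ami-00000000000000000"},
        {"InstanceType": "m5.xlarge", "SubnetId": "subnet-bbbb2222", "ImageId": "ami-00000000000000000"},
        {"InstanceType": "m5.xlarge", "SubnetId": "subnet-cccc3333", "ImageId": "ami-00000000000000000"},
        {"InstanceType": "c5.xlarge", "SubnetId": "subnet-aaaa1111", "ImageId": "ami-00000000000000000"},
        {"InstanceType": "c5.xlarge", "SubnetId": "subnet-bbbb2222", "ImageId": "ami-00000000000000000"},
        {"InstanceType": "c5.xlarge", "SubnetId": "subnet-cccc3333", "ImageId": "ami-00000000000000000"}
      ]
    }'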


  15. A lot of recent effort went into spot instance usability


  16. Provisioning spot fleets today
    Good console experience
    Well supported by infra automation tooling


  17. Adapting to spot fleets for a more resilient
    architecture
    General challenges with spots


  18. Termination handling
    Close all connections
    Finish any long polling
    Stop in-progress worker jobs
    Terminate your pods
    Remove from Load Balancer
    Re-scale to target capacity
    curl http://169.254.169.254/latest/meta-data/spot/termination-time
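
    A minimal sketch of what a node-local watcher for that endpoint can do; the node
    name lookup and the drain flags are assumptions, not our exact handler:

    #!/bin/sh
    # The endpoint returns 404 until the instance is marked for termination,
    # then 200 with the termination timestamp (roughly 2 minutes ahead).
    NODE_NAME=$(hostname)
    while true; do
      if curl -sf http://169.254.169.254/latest/meta-data/spot/termination-time >/dev/null; then
        # Cordon the node and evict pods so they reschedule onto surviving nodes
        kubectl drain "$NODE_NAME" --ignore-daemonsets --delete-local-data --force
        break
      fi
      sleep 5
    done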


  19. - That’s like Chaos Monkey, right?
    Yes, but it happens 24/7, not just business hours, and
    several instances might disappear at the same time.
    - My CTO’s never going to risk that!
    Termination handling


  20. Actual issues arising from the volatility
    Applications not terminating gracefully
    → abruptly terminated connections, stuck jobs
    Too much of the target capacity collocated on terminated nodes
    → too many pods of a deployment being affected at once
    New capacity not starting fast enough
    → a lot of apps starting at the same time can cause CPU starvation


  21. We spent some time making Java pods with Spring boot up more
    efficiently
    They have an ugly pattern of using 100% CPU until all classes
    are loaded, and then sitting mostly idle even under load
    What helped: -XX:TieredStopAtLevel=1 and removing
    bytecode-instrumenting APM agents
    Case study: Spring Boot behavior on boot
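
    A one-line sketch of wiring up that flag, assuming the JVM in the container picks
    up JAVA_TOOL_OPTIONS; the deployment name below is a placeholder:

    # Stop tiered compilation at C1 so startup burns less CPU on JIT work
    kubectl set env deployment/sample-spring-app JAVA_TOOL_OPTIONS="-XX:TieredStopAtLevel=1"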


  22. Kubernetes native handling
    of spot instance terminations
    How to approach all the challenges using
    Kubernetes


  23. DaemonSet running on all nodes which drains the node
    immediately upon seeing the notice.
    Optional Slack notification to give you a log to correlate
    monitoring noise / job disruptions with terminations.
    github.com/helm/charts/tree/master/incubator/kube-spot-termination-notice-handler
    (don’t worry, you’ll find all the links on the last slide)
    Spot Termination Notice Handler DaemonSet
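
    Roughly how the chart can be installed (Helm 2 syntax, current at the time); the
    release name and the Slack webhook value name are assumptions, so check the
    chart's values.yaml:

    helm repo add incubator https://kubernetes-charts-incubator.storage.googleapis.com/
    helm install incubator/kube-spot-termination-notice-handler \
      --name spot-termination-handler \
      --set slackUrl=https://hooks.slack.com/services/REPLACE_ME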


  24. Slack Notifications


  25. Spot instances can stick around
    for a long time (~1 year is no
    problem)
    Pods will pile up on those nodes;
    Kubernetes won't reschedule them by
    itself.
    Descheduler


  26. Prevent pods of the same deployment from running on the same node
    Target node CPU/memory utilization → redistribute pods to
    less utilized nodes
    github.com/kubernetes-incubator/descheduler
    Descheduler
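
    A hedged sketch of a descheduler policy file enabling the two strategies above,
    following the v1alpha1 schema from the descheduler repo; the thresholds are
    illustrative, not our production values:

    # policy.yaml, mounted into the descheduler pod (e.g. via a ConfigMap)
    apiVersion: descheduler/v1alpha1
    kind: DeschedulerPolicy
    strategies:
      RemoveDuplicates:
        enabled: true            # spread pods of one deployment across nodes
      LowNodeUtilization:
        enabled: true            # rebalance pods away from busy nodes
        params:
          nodeResourceUtilizationThresholds:
            thresholds:          # below this, a node counts as underutilized
              cpu: 20
              memory: 20
            targetThresholds:    # above this, a node is a source of evictions
              cpu: 50
              memory: 50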


  27. Goal: always have enough capacity to launch new pods
    Multiple strategies in Delivery Hero:
    1. Scaling spot fleet based on CPU/RAM reservations, not usage
    2. Overprovisioning using PodPriority
    github.com/helm/charts/tree/master/stable/cluster-overprovisioner
    Auto-scaling strategy
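
    A minimal sketch of the PodPriority overprovisioning idea (the same pattern the
    cluster-overprovisioner chart packages); the names, replica count and request
    sizes below are placeholders:

    # 1) A negative-priority class that any real workload preempts
    kubectl create priorityclass overprovisioning --value=-1 \
      --description="placeholder pods reserving spare capacity"
    # 2) Placeholder pods that only hold CPU/RAM requests (pause containers)
    kubectl create deployment overprovisioning --image=k8s.gcr.io/pause:3.1
    kubectl scale deployment/overprovisioning --replicas=2
    kubectl set resources deployment/overprovisioning --requests=cpu=1,memory=1Gi
    kubectl patch deployment overprovisioning \
      -p '{"spec":{"template":{"spec":{"priorityClassName":"overprovisioning"}}}}'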


  28. Auto-scaling spot fleet based on custom metrics
    [diagram: a DaemonSet publishes % CPU/RAM reserved per node to AWS CloudWatch;
    a Spot Fleet autoscaling policy scales the fleet between thresholds of 80% and
    50% CPU reserved]
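
    A hedged sketch of the publishing side of that loop; the namespace, metric and
    dimension names are invented for illustration, and the value would come from
    comparing the node's pod resource requests against its allocatable capacity:

    # Publish "% of allocatable CPU already reserved by pod requests"
    aws cloudwatch put-metric-data \
      --namespace "Custom/Kubernetes" \
      --metric-name CPUReservedPercent \
      --dimensions SpotFleet=worker-fleet \
      --unit Percent \
      --value 72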


  29. Auto-scaling spot fleet based on custom metrics


  30. Our dispatching algo runs on Akka
    It’s a stateful, actor-based framework for distributed and
    concurrent apps
    The volatility of spots forced us to fix
    very broken cluster formation and split-brain situations
    Case Study: beyond stateless apps, resiliency with stateful components


  31. Take it step by step: move your most stateless, fastest-booting
    pods over first, then continue one by one and monitor for noise.
    We migrated from on-demand to spot within 6 months.
    No need to rush it
    our current fleet composition


  32. Thanks! Questions?
    Vojtěch Vondra (@vvondra)
    tech.deliveryhero.com
