SwarmKit: Docker's Simplified Model for Complex Orchestration

Stephen Day
October 04, 2016

SwarmKit is a new framework by Docker for building orchestration systems, and it powers Docker Engine's orchestration capabilities. In this talk, we'll dive into the model-driven design and how the components fit together to build a user-friendly orchestration system. Solving problems such as reconciliation, convergence and consistency at the model level ensures the system can evolve to meet the modern use cases that orchestration applications need. This approach leads to a simplified model that can reliably orchestrate complex deployments. Show me your data structures and I'll show you your orchestration system.

From ContainerCon EU 2016.

Docker Swarm Mode
Docker SwarmKit
Docker SwarmKit Protobufs/GRPC
Borg Paper
Raft Consensus Algorithm
Control Theory



  1. SwarmKit: Docker’s Simple Model for Complex Orchestration. Stephen Day, Docker, Inc. ContainerCon+LinuxCon EU, October 2016, v1
  2. Stephen Day, Docker, Inc., @stevvooe

  3. SwarmKit: A new framework by Docker for building orchestration systems.

  4. What is orchestration?

  5. Orchestration: A Docker-oriented History
    - Orchestration Systems
    - Mostly use a service-based model
    - Mostly wrap Docker
    - Challenging to use
    - Standalone Swarm, July 2015
    - Scale containers
    - Docker API “native”
    - No higher-level abstraction
  6. Why orchestration?

  7. Example

        $ docker network create -d overlay backend
        31ue4lvbj4m301i7ef3x8022t
        $ docker service create -p 6379:6379 --network backend redis
        bhk0gw6f0bgrbhmedwt5lful6
        $ docker service scale serene_euler=3
        serene_euler scaled to 3
        $ docker service ls
        ID            NAME          REPLICAS  IMAGE  COMMAND
        dj0jh3bnojtm  serene_euler  3/3       redis

  8. Why services? What is wrong with containers?

  9. Nodes
    - Arbitrary cluster resources
    - Connected across a common network
    - Topology control
    - Cryptographic identity
    (Diagram: Node 1, Node 2, Node 3)
  10. Services
    - Express the desired state of the cluster
    - Abstraction to control a set of containers
    - Enumerate resources, network availability, placement
    - Leave the details of the runtime to the container process
    - Implement these services by distributing processes across a cluster
    (Diagram: Node 1, Node 2, Node 3)
  11. Networks
    - Define broadcast domains
    - Services can attach to networks
    - The routing mesh will route connections to an active service process
    (Diagram: Node 1, Node 2, Node 3)
  12. Simple systems can exhibit complex behavior

  13. Orchestration: A control system for your cluster
    (Diagram: control loop with labels D, O, Δ, Cluster, $S_n$)
    - D = Desired State
    - O = Orchestrator
    - C = Cluster
    - $S_t$ = State at time t
    - Δ = Operations to converge S to D
  14. Convergence: A functional view
    - D = Desired State
    - O = Orchestrator
    - C = Cluster
    - $S_t$ = State at time t

    $f(D, S_{n-1}, C) \to S_n \mid \min(S - D)$
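
The functional view on this slide can be sketched as a single reconciliation step in Go. This is a minimal sketch and not SwarmKit's implementation: the `State` type, the `diff` and `step` functions, and the per-step start limit that stands in for the cluster C are all assumptions made for illustration.

```go
package main

import "fmt"

// State is a hypothetical, simplified cluster state:
// service name -> number of running replicas.
type State map[string]int

// diff computes the operations (Δ) needed to move s toward d.
// Positive values mean "start this many", negative mean "stop this many".
func diff(d, s State) map[string]int {
	ops := map[string]int{}
	for name, want := range d {
		if delta := want - s[name]; delta != 0 {
			ops[name] = delta
		}
	}
	for name, have := range s {
		if _, ok := d[name]; !ok && have != 0 {
			ops[name] = -have
		}
	}
	return ops
}

// step plays the role of f(D, S_{n-1}, C) -> S_n. The limit argument is an
// assumed stand-in for C: each service can start at most limit replicas per step.
func step(d, prev State, limit int) State {
	next := State{}
	for name, n := range prev {
		next[name] = n
	}
	for name, delta := range diff(d, prev) {
		if delta > limit {
			delta = limit
		}
		next[name] += delta
		if next[name] == 0 {
			delete(next, name)
		}
	}
	return next
}

func main() {
	desired := State{"redis": 3}
	s := State{"redis": 0, "old": 2}
	// Keep applying f until the difference between S and D is empty.
	for n := 1; len(diff(desired, s)) > 0; n++ {
		s = step(desired, s, 2)
		fmt.Printf("S_%d = %v\n", n, s)
	}
}
```

Each call to `step` shrinks the difference between S and D, so repeating it converges even when the cluster cannot apply every operation at once.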
  15. Observability and Controllability: The Problem
    (Diagram: scale from Low Observability to High Observability, with labels Failure, Process State, User Input)
  16. Data Model Requirements
    - Represent difference in cluster state
    - Maximize Observability
    - Support Convergence
    - Do this while being Extensible and Reliable
  17. Show me your data structures and I’ll show you your orchestration system
  18. Declarative

  19. Declarative

        $ docker network create -d overlay backend
        31ue4lvbj4m301i7ef3x8022t
        $ docker service create --network backend redis
        bhk0gw6f0bgrbhmedwt5lful6

  20. Reconciliation: Spec → Object
    (Diagram labels: Object = Current State, Spec = Desired State)

  21. Task Model: Atomic Scheduling Unit of SwarmKit
    (Diagram labels: Object = Current State, Spec = Desired State, Orchestrator, Task 0, Task 1, …, Task n, Scheduler)
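
One way to read this slide is that the orchestrator expands a service's spec into tasks, and the scheduler later assigns each task to a node. The sketch below illustrates only the first half. `ServiceSpec`, `Task`, and `reconcile` are hypothetical, trimmed-down stand-ins, and the slot numbering from 1 to `Replicas` is an assumption of this example, not a statement about SwarmKit's code.

```go
package main

import "fmt"

// ServiceSpec is a hypothetical, reduced spec: the desired state.
type ServiceSpec struct {
	Image    string
	Replicas uint64
}

// Task is a hypothetical, reduced task: the atomic unit handed to a scheduler.
type Task struct {
	ServiceID string
	Slot      uint64
	Image     string
	NodeID    string // empty until a scheduler picks a node
}

// reconcile returns the tasks that must be created so that every slot
// 1..Replicas of the service has a task. Existing tasks are left untouched.
func reconcile(serviceID string, spec ServiceSpec, existing []Task) []Task {
	filled := map[uint64]bool{}
	for _, t := range existing {
		if t.ServiceID == serviceID {
			filled[t.Slot] = true
		}
	}
	var create []Task
	for slot := uint64(1); slot <= spec.Replicas; slot++ {
		if !filled[slot] {
			create = append(create, Task{ServiceID: serviceID, Slot: slot, Image: spec.Image})
		}
	}
	return create
}

func main() {
	spec := ServiceSpec{Image: "redis", Replicas: 3}
	existing := []Task{{ServiceID: "svc1", Slot: 2, Image: "redis", NodeID: "node-1"}}
	for _, t := range reconcile("svc1", spec, existing) {
		fmt.Printf("create task for slot %d\n", t.Slot)
	}
}
```

Because `reconcile` only looks at the spec and the current tasks, running it again after the new tasks exist returns nothing to create.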
  22. Consistency

  23. Versioned Updates (Consistency)

        service := getCurrentService()
        spec := service.Spec
        spec.Image = "my.serv/myimage:mytag"
        update(spec, service.Version)

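
The read, modify, write-with-version pattern on this slide is a form of optimistic concurrency control. The Go sketch below shows how such a check might behave. The `Store` type, its `Get` and `Update` methods, and `ErrStaleVersion` are hypothetical names for this example and do not come from the SwarmKit API.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Spec and Service are hypothetical, reduced versions of the real objects.
type Spec struct{ Image string }

type Service struct {
	Spec    Spec
	Version uint64
}

// ErrStaleVersion is returned when the caller's version is not the current one.
var ErrStaleVersion = errors.New("update out of sequence")

// Store holds a single service and rejects writes based on a stale read.
type Store struct {
	mu      sync.Mutex
	service Service
}

// Get returns a copy of the current service, including its version.
func (s *Store) Get() Service {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.service
}

// Update applies spec only if version matches the stored version,
// then bumps the version so concurrent writers with old reads fail.
func (s *Store) Update(spec Spec, version uint64) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if version != s.service.Version {
		return ErrStaleVersion
	}
	s.service.Spec = spec
	s.service.Version++
	return nil
}

func main() {
	store := &Store{}
	service := store.Get()
	spec := service.Spec
	spec.Image = "my.serv/myimage:mytag"
	fmt.Println("first update:", store.Update(spec, service.Version))
	// Reusing the old version simulates a writer that read before the first update.
	fmt.Println("second update:", store.Update(spec, service.Version))
}
```

A caller that receives the stale-version error would typically read the service again, reapply its change, and retry.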
  24. Field Ownership (Consistency): Only one component of the system can write to a field
  25. Task State
    - States: New, Allocated, Assigned, Preparing, Ready, Starting, Running, Complete, Shutdown, Failed, Rejected
    (Diagram group labels: Manager, Worker, Pre-Run, Terminal States)
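
The state list on this slide can be modelled as an ordered enumeration. The sketch below assumes that the states are ordered as listed, that a task only ever moves forward, and that the last four states are the terminal ones. Those rules, and the names `TaskState` and `advance`, are assumptions of this example and are not taken from the slides.

```go
package main

import "fmt"

// TaskState is a hypothetical ordered enumeration of the states on the slide.
type TaskState int

const (
	New TaskState = iota
	Allocated
	Assigned
	Preparing
	Ready
	Starting
	Running
	Complete
	Shutdown
	Failed
	Rejected
)

// terminal reports whether s is one of the assumed terminal states.
func (s TaskState) terminal() bool { return s >= Complete }

// advance moves a task forward only. Moving to an earlier or equal state,
// or leaving a terminal state, is refused and the current state is kept.
func advance(cur, next TaskState) (TaskState, error) {
	if cur.terminal() {
		return cur, fmt.Errorf("state %d is terminal", cur)
	}
	if next <= cur {
		return cur, fmt.Errorf("cannot move from state %d back to %d", cur, next)
	}
	return next, nil
}

func main() {
	s := New
	for _, next := range []TaskState{Assigned, Running, Ready, Complete, Running} {
		var err error
		if s, err = advance(s, next); err != nil {
			fmt.Println("rejected:", err)
		}
	}
	fmt.Println("final state:", s)
}
```

With a forward-only rule, a late or duplicated status report cannot drag a task back to an earlier state.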
  26. Extensible

  27. Task Model: Runtime
    - Prepare: set up resources
    - Start: start the task
    - Wait: wait until task exits
    - Shutdown: stop task, cleanly
    - Terminate: kill the task, forcefully
    - Update: update task metadata, without interruption
    - Remove: remove resources used by task
  28. Reliable

  29. SwarmKit doesn’t Quit

  30. Architecture Data Structures

  31. Service Spec (Protobuf Example)

        message ServiceSpec {
            // Task defines the task template this service will spawn.
            TaskSpec task = 2 [(gogoproto.nullable) = false];

            // UpdateConfig controls the rate and policy of updates.
            UpdateConfig update = 6;

            // Service endpoint specifies the user provided configuration
            // to properly discover and load balance a service.
            EndpointSpec endpoint = 8;
        }

  32. Service Object (Protobuf Example)

        message Service {
            ServiceSpec spec = 3;

            // Runtime state of service endpoint. This may be different
            // from the spec version because the user may not have entered
            // the optional fields like node_port or virtual_ip and it
            // could be auto allocated by the system.
            Endpoint endpoint = 4;

            // UpdateStatus contains the status of an update, if one is in
            // progress.
            UpdateStatus update_status = 5;
        }

  33. Task (Protobuf Example)

        // Task specifies the parameters for implementing a Spec. A task is effectively
        // immutable and idempotent. Once it is dispatched to a node, it will not be
        // dispatched to another node.
        message Task {
            TaskSpec spec = 3;
            string service_id = 4;
            uint64 slot = 5;
            string node_id = 6;
            TaskStatus status = 9;
            TaskState desired_state = 10;
            repeated NetworkAttachment networks = 11;
            Endpoint endpoint = 12;
            Driver log_driver = 13;
        }

  34. Blue Green Deployments (Applied)
    Sillyproxy
    - Uses rolling updates to set proxy backends
    - Desired state is encoded in environment variables
    - Rolling updates can control traffic between backends
  35. Distributed Periodic Scheduler (Applied)
    From a GitHub comment
    - Scheduling criteria set via environment variables
    - Can leverage something like Redis to do this, as well
    - Leverage restarts to dispatch to available nodes
  36. Future

  37. Links
    Documentation
    - Docker Swarm Mode
    Source Code
    - SwarmKit
    - SwarmKit Protobuf/GRPC
    Interesting Topics
    - Borg Paper
    - Raft Consensus Algorithm
    - Control Theory
  38. Booth D38 @ LinuxCon + ContainerCon
    Tues Oct 4th
    • Build Distributed Systems without Docker, using Docker Plumbing Projects - Patrick Chanezon, David Chung and Captain Phil Estes
    • Getting Started with Docker Services - Mike Goelzer
    • Swarmkit: Docker’s Simplified Model for Complex Orchestration - Stephen Day
    • User Namespace and Seccomp Support in Docker Engine - Paul Novarese
    • Build Efficient Parallel Testing Systems with Docker - Docker Captain Laura Frank
    Wed Oct 5th
    • How Secure is your Container? A Docker Engine Security Update - Phil Estes
    • Docker Orchestration: Beyond the Basics - Aaron Lehmann
    • When the Going gets Tough, get TUF Going - Riyaz Faizullabhoy and Lily Guo
    Thurs Oct 6th
    • Orchestrating Linux Containers while Tolerating Failures - Drew Erny
    • Unikernels: When you Should and When you Shouldn’t - Amir Chaudhry
    • Berlin Docker Meetup
    Friday Oct 7th
    • Tutorial: Comparing Container Orchestration Tools - Neependra Khare
    • Tutorial: Orchestrate Containers in Production at Scale with Docker Swarm - Jerome Petazzoni