Heart of the SwarmKit: Object Model

Stephen Day
October 07, 2016

The design of SwarmKit's object model minimizes problems that commonly occur in distributed orchestration systems. Slides from Docker's Distributed Systems Summit in Berlin.

Docker Swarm Mode (https://docs.docker.com/engine/swarm/)
Docker SwarmKit (https://github.com/docker/swarmkit)
Docker SwarmKit Protobufs/GRPC (https://github.com/docker/swarmkit/tree/master/api)
Borg Paper (http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43438.pdf)
Raft Consensus Algorithm (https://raft.github.io/)
Control Theory (https://en.wikipedia.org/wiki/Control_theory)


Transcript

  1. 1.

    Heart of the SwarmKit: Object Model
    Stephen Day, Docker, Inc.
    Docker Distributed Systems Summit, Berlin, October 2016 (v1)
  2. 4.

    Orchestration
    A control system for your cluster

    [Diagram: control loop with labels D, O, Δ, Cluster, S_t]

    D = Desired State
    O = Orchestrator
    C = Cluster
    S_t = State at time t
    Δ = Operations to converge S to D

    https://en.wikipedia.org/wiki/Control_theory
  3. 5.

    Convergence
    A functional view

    D = Desired State
    O = Orchestrator
    C = Cluster
    S_t = State at time t

    f(D, S_{n-1}, C) → S_n | min(S - D)
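The functional view above can be illustrated with a minimal Python sketch. It is not SwarmKit code: state is reduced to replica counts per service, the cluster C is reduced to a cap on how many operations can be applied per step, and the names `diff`, `step`, and `distance` are hypothetical.

```python
from typing import Dict, List, Tuple

State = Dict[str, int]          # service name -> replica count (assumed model)
Op = Tuple[str, str]            # ("create" | "remove", service name)


def distance(desired: State, current: State) -> int:
    """|S - D|: how many single-replica operations separate S from D."""
    services = set(desired) | set(current)
    return sum(abs(desired.get(s, 0) - current.get(s, 0)) for s in services)


def diff(desired: State, current: State) -> List[Op]:
    """Δ: the operations that would converge the current state to the desired one."""
    ops: List[Op] = []
    for service in sorted(set(desired) | set(current)):
        want, have = desired.get(service, 0), current.get(service, 0)
        kind = "create" if want > have else "remove"
        ops.extend((kind, service) for _ in range(abs(want - have)))
    return ops


def step(desired: State, current: State, max_ops: int = 1) -> State:
    """One application of f(D, S_{n-1}, C) -> S_n, applying at most max_ops operations."""
    nxt = dict(current)
    for kind, service in diff(desired, current)[:max_ops]:
        nxt[service] = nxt.get(service, 0) + (1 if kind == "create" else -1)
    return {s: n for s, n in nxt.items() if n > 0}
```

Calling `step` repeatedly with the same desired state shrinks `distance` until it reaches zero, after which further calls leave the state unchanged.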
  4. 7.

    Data Model Requirements

    - Represent difference in cluster state
    - Maximize Observability
    - Support Convergence
    - Do this while being Extensible and Reliable
  5. 9.

    Services

    - Express desired state of the cluster
    - Abstraction to control a set of containers
    - Enumerates resources, network availability, placement
    - Leave the details of runtime to container process
    - Implement these services by distributing processes across a cluster

    [Diagram: Node 1, Node 2, Node 3]
  6. 10.

    Declarative

        $ docker network create -d overlay backend
        31ue4lvbj4m301i7ef3x8022t
        $ docker service create -p 6379:6379 --network backend redis
        bhk0gw6f0bgrbhmedwt5lful6
        $ docker service scale serene_euler=3
        serene_euler scaled to 3
        $ docker service ls
        ID            NAME          REPLICAS  IMAGE  COMMAND
        dj0jh3bnojtm  serene_euler  3/3       redis
  7. 12.

    Task Model
    Atomic Scheduling Unit of SwarmKit

    [Diagram: Orchestrator; Object (Current State; Spec: Desired State); Task 0, Task 1, …, Task n; Scheduler]
  8. 13.

    Task Model (Runtime)

    - Prepare: set up resources
    - Start: start the task
    - Wait: wait until task exits
    - Shutdown: stop task, cleanly
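The four runtime phases above could be expressed as an interface along these lines. This is a hypothetical Python sketch, not SwarmKit's actual (Go) interface; the class and function names are assumptions, and calling shutdown in a `finally` block is a design choice of the sketch rather than something the slide states.

```python
from abc import ABC, abstractmethod
from typing import List


class TaskRuntime(ABC):
    """Hypothetical runtime interface mirroring the four phases on the slide."""

    @abstractmethod
    def prepare(self) -> None:
        """Set up resources."""

    @abstractmethod
    def start(self) -> None:
        """Start the task."""

    @abstractmethod
    def wait(self) -> int:
        """Block until the task exits and return its exit code."""

    @abstractmethod
    def shutdown(self) -> None:
        """Stop the task, cleanly."""


class RecordingRuntime(TaskRuntime):
    """Fake runtime that only records the order of calls; useful for demos."""

    def __init__(self, exit_code: int = 0) -> None:
        self.exit_code = exit_code
        self.calls: List[str] = []

    def prepare(self) -> None:
        self.calls.append("prepare")

    def start(self) -> None:
        self.calls.append("start")

    def wait(self) -> int:
        self.calls.append("wait")
        return self.exit_code

    def shutdown(self) -> None:
        self.calls.append("shutdown")


def run_task(runtime: TaskRuntime) -> int:
    """Drive a runtime through prepare, start, wait, then always shut down."""
    runtime.prepare()
    try:
        runtime.start()
        return runtime.wait()
    finally:
        runtime.shutdown()
```

Keeping the phases separate lets the caller report progress between them, while the runtime keeps the details of how a process is actually run.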
  9. 14.

    Service Spec (Protobuf Example)

        message ServiceSpec {
            // Task defines the task template this service will spawn.
            TaskSpec task = 2 [(gogoproto.nullable) = false];

            // UpdateConfig controls the rate and policy of updates.
            UpdateConfig update = 6;

            // Service endpoint specifies the user provided configuration
            // to properly discover and load balance a service.
            EndpointSpec endpoint = 8;
        }
  10. 15.

    Service Object (Protobuf Example)

        message Service {
            ServiceSpec spec = 3;

            // UpdateStatus contains the status of an update, if one is in
            // progress.
            UpdateStatus update_status = 5;

            // Runtime state of service endpoint. This may be different
            // from the spec version because the user may not have entered
            // the optional fields like node_port or virtual_ip and it
            // could be auto allocated by the system.
            Endpoint endpoint = 4;
        }
  11. 19.

    TaskSpec (Protobuf Examples)

        message TaskSpec {
            oneof runtime {
                NetworkAttachmentSpec attachment = 8;
                ContainerSpec container = 1;
            }

            // Resource requirements for the container.
            ResourceRequirements resources = 2;

            // RestartPolicy specifies what to do when a task fails or finishes.
            RestartPolicy restart = 4;

            // Placement specifies node selection constraints
            Placement placement = 5;

            // Networks specifies the list of network attachment
            // configurations (which specify the network and per-network
            // aliases) that this task spec is bound to.
            repeated NetworkAttachmentConfig networks = 7;
        }
  12. 20.

    Task (Protobuf Example)

        message Task {
            TaskSpec spec = 3;
            string service_id = 4;
            uint64 slot = 5;
            string node_id = 6;
            TaskStatus status = 9;
            TaskState desired_state = 10;
            repeated NetworkAttachment networks = 11;
            Endpoint endpoint = 12;
            Driver log_driver = 13;
        }

    [Legend: Owner: User, Orchestrator, Allocator, Scheduler, Shared]
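A rough Python sketch of how different components might fill in different Task fields, as the owner legend suggests. The dataclasses are simplified stand-ins for the protobuf messages, `orchestrate` and `schedule` are hypothetical names, slots numbered from 1 and round-robin placement are assumptions made only for illustration, and the real scheduler's placement logic is not shown in the slides.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class TaskSpec:
    image: str  # hypothetical stand-in for the ContainerSpec runtime


@dataclass(frozen=True)
class Task:
    spec: TaskSpec                    # copied from the service's task template
    service_id: str                   # set when the task is created
    slot: int                         # set when the task is created
    node_id: Optional[str] = None     # filled in later by scheduling
    desired_state: str = "RUNNING"


def orchestrate(service_id: str, spec: TaskSpec, replicas: int) -> List[Task]:
    """Expand a service's task template into one unassigned task per slot."""
    return [Task(spec=spec, service_id=service_id, slot=slot)
            for slot in range(1, replicas + 1)]


def schedule(tasks: List[Task], nodes: List[str]) -> List[Task]:
    """Assign a node to each task; round-robin is only a placeholder policy."""
    return [replace(task, node_id=nodes[i % len(nodes)])
            for i, task in enumerate(tasks)]
```

Each function writes only the fields it owns and returns new Task values, so no two components write to the same field.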
  13. 21.

    Task State

    [Diagram: states grouped under the labels Manager, Worker, Pre-Run, and Terminal States]

    States: New, Allocated, Assigned, Preparing, Ready, Starting, Running, Complete, Shutdown, Failed, Rejected
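One way to model the task states above is as an ordered enumeration whose values only move forward. The Python sketch below assumes that ordering, and assumes that Complete, Shutdown, Failed, and Rejected are the terminal states. The numeric values and the `can_advance` helper are hypothetical.

```python
from enum import IntEnum


class TaskState(IntEnum):
    # Values are arbitrary; only their relative order matters in this sketch.
    NEW = 0
    ALLOCATED = 1
    ASSIGNED = 2
    PREPARING = 3
    READY = 4
    STARTING = 5
    RUNNING = 6
    COMPLETE = 7
    SHUTDOWN = 8
    FAILED = 9
    REJECTED = 10


# Assumed terminal states: once reached, the task never changes state again.
TERMINAL = frozenset({
    TaskState.COMPLETE,
    TaskState.SHUTDOWN,
    TaskState.FAILED,
    TaskState.REJECTED,
})


def can_advance(current: TaskState, proposed: TaskState) -> bool:
    """Accept a transition only if it moves forward from a non-terminal state."""
    return current not in TERMINAL and proposed > current
```

With a forward-only rule, a stale or reordered status update can be rejected by comparing two integers.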
  14. 24.

    Orchestration
    A control system for your cluster

    [Diagram: control loop with labels D, O, Δ, Cluster, S_t]

    D = Desired State
    O = Orchestrator
    C = Cluster
    S_t = State at time t
    Δ = Operations to converge S to D

    https://en.wikipedia.org/wiki/Control_theory
  15. 25.

    Task Model
    Atomic Scheduling Unit of SwarmKit

    [Diagram: Orchestrator; Object (Current State; Spec: Desired State); Task 0, Task 1, …, Task n; Scheduler]
  16. 27.

    Links

    Documentation
    - Docker Swarm Mode

    Source Code
    - SwarmKit
    - SwarmKit Protobuf/GRPC

    Interesting Topics
    - Borg Paper
    - Raft Consensus Algorithm
    - Control Theory