Heart of the SwarmKit: Object Model

Stephen Day
October 07, 2016

The design of SwarmKit's object model minimizes problems that commonly occur in distributed orchestration systems. Slides from Docker's Distributed Systems Summit in Berlin.

Docker Swarm Mode (https://docs.docker.com/engine/swarm/)
Docker SwarmKit (https://github.com/docker/swarmkit)
Docker SwarmKit Protobufs/GRPC (https://github.com/docker/swarmkit/tree/master/api)
Borg Paper (http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43438.pdf)
Raft Consensus Algorithm (https://raft.github.io/)
Control Theory (https://en.wikipedia.org/wiki/Control_theory)

Transcript

  1. Heart of the SwarmKit: Object Model
    Stephen Day, Docker, Inc.
    Docker Distributed Systems Summit, Berlin, October 2016 (v1)
  2. Stephen Day Docker, Inc. stephen@docker.com github.com/stevvooe @stevvooe

  3. SwarmKit A new framework by Docker for building orchestration systems.

  4. Orchestration: A control system for your cluster
    [Diagram: control loop with labels D, O, Δ, Cluster, S_t]
    D = Desired State
    O = Orchestrator
    C = Cluster
    S_t = State at time t
    Δ = Operations to converge S to D
    https://en.wikipedia.org/wiki/Control_theory
  5. Convergence: A functional view
    D = Desired State
    O = Orchestrator
    C = Cluster
    S_t = State at time t
    f(D, S_{n-1}, C) → S_n | min(S - D)
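
The functional view on this slide can be read as a loop that repeatedly applies f until the gap between S and D stops shrinking. The following is only a minimal sketch under stated assumptions: state is reduced to a replica count, the cluster C is reduced to a capacity limit, and `max_ops` (a hypothetical parameter, not from the talk) caps how many operations Δ one step may apply.

```python
def converge_step(desired, state, capacity, max_ops=2):
    """One application of f(D, S_{n-1}, C) -> S_n on replica counts.

    Assumption: the cluster C only constrains us through `capacity`,
    and each step may apply at most `max_ops` operations.
    """
    target = min(desired, capacity)   # closest reachable state to D
    delta = target - state            # operations still needed
    ops = max(-max_ops, min(max_ops, delta))
    return state + ops


def converge(desired, state, capacity, max_steps=100):
    """Apply converge_step until the state stops changing."""
    history = [state]
    for _ in range(max_steps):
        next_state = converge_step(desired, state, capacity)
        if next_state == state:
            break                     # |S - D| can no longer shrink
        state = next_state
        history.append(state)
    return history
```

Note that the loop ends at the state that minimizes the difference from D, which is not necessarily D itself: with a desired count of 5 and room for only 4, it settles at 4.
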
  6. Observability and Controllability: The Problem
    [Diagram: scale from Low Observability to High Observability, with labels Failure, Process State, User Input]
  7. Data Model Requirements
    - Represent difference in cluster state
    - Maximize Observability
    - Support Convergence
    - Do this while being Extensible and Reliable
  8. Show me your data structures and I’ll show you your orchestration system
  9. Services
    - Express desired state of the cluster
    - Abstraction to control a set of containers
    - Enumerates resources, network availability, placement
    - Leave the details of runtime to container process
    - Implement these services by distributing processes across a cluster
    [Diagram: Node 1, Node 2, Node 3]
  10. Declarative
    $ docker network create -d overlay backend
    31ue4lvbj4m301i7ef3x8022t
    $ docker service create -p 6379:6379 --network backend redis
    bhk0gw6f0bgrbhmedwt5lful6
    $ docker service scale serene_euler=3
    serene_euler scaled to 3
    $ docker service ls
    ID            NAME          REPLICAS  IMAGE  COMMAND
    dj0jh3bnojtm  serene_euler  3/3       redis
  11. Reconciliation: Spec → Object
    [Diagram: Object (Current State), Spec (Desired State)]
  12. Task Model: Atomic Scheduling Unit of SwarmKit
    [Diagram: Orchestrator; Object (Current State), Spec (Desired State); Task 0, Task 1, …, Task n; Scheduler]
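
The reconciliation idea from the last two slides (a spec holds the desired state, and the orchestrator produces tasks until the object matches it) might be sketched as below. This is an illustration only, not SwarmKit's code: the `Service`, `Task`, and `TaskSpec` classes, the `replicas` field, and the `reconcile` function are simplified, hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskSpec:
    image: str


@dataclass
class Task:
    service_id: str
    slot: int
    spec: TaskSpec
    desired_state: str = "RUNNING"


@dataclass
class Service:
    id: str
    task_spec: TaskSpec
    replicas: int


def reconcile(service, tasks):
    """One orchestrator pass: make the live tasks match the service spec."""
    live = [t for t in tasks
            if t.service_id == service.id and t.desired_state == "RUNNING"]
    used_slots = {t.slot for t in live}

    # Too few tasks: create new ones in the lowest free slots.
    slot = 1
    while len(live) < service.replicas:
        if slot not in used_slots:
            task = Task(service.id, slot, service.task_spec)
            tasks.append(task)
            live.append(task)
            used_slots.add(slot)
        slot += 1

    # Too many tasks: ask the highest slots to shut down.
    surplus = len(live) - service.replicas
    for task in sorted(live, key=lambda t: t.slot, reverse=True)[:surplus]:
        task.desired_state = "SHUTDOWN"
    return tasks
```

The orchestrator here never starts or stops anything itself; it only records intent on task objects, leaving placement to a scheduler and execution to workers.
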
  13. Task Model: Runtime
    Prepare: set up resources
    Start: start the task
    Wait: wait until task exits
    Shutdown: stop task, cleanly
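
The four runtime steps on this slide can be pictured as an interface that a worker drives in order. The sketch below is hypothetical: the `Controller` class, its method names, and `run_task` are modeled on the slide's wording, not copied from SwarmKit's actual API.

```python
from abc import ABC, abstractmethod


class Controller(ABC):
    """Hypothetical runtime interface mirroring the four slide steps."""

    @abstractmethod
    def prepare(self): ...   # set up resources

    @abstractmethod
    def start(self): ...     # start the task

    @abstractmethod
    def wait(self): ...      # block until the task exits, return exit code

    @abstractmethod
    def shutdown(self): ...  # stop the task, cleanly


class RecordingController(Controller):
    """Test double that records which steps were called."""

    def __init__(self, exit_code=0):
        self.exit_code = exit_code
        self.calls = []

    def prepare(self):
        self.calls.append("prepare")

    def start(self):
        self.calls.append("start")

    def wait(self):
        self.calls.append("wait")
        return self.exit_code

    def shutdown(self):
        self.calls.append("shutdown")


def run_task(controller):
    """Drive a task through its lifecycle; always attempt a clean shutdown."""
    controller.prepare()
    controller.start()
    try:
        return controller.wait()
    finally:
        controller.shutdown()
```

Keeping the interface this small is what lets the orchestrator stay ignorant of runtime details: anything that can be prepared, started, waited on, and shut down can be a task.
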
  14. Service Spec (Protobuf Example)
    message ServiceSpec {
        // Task defines the task template this service will spawn.
        TaskSpec task = 2 [(gogoproto.nullable) = false];

        // UpdateConfig controls the rate and policy of updates.
        UpdateConfig update = 6;

        // Service endpoint specifies the user provided configuration
        // to properly discover and load balance a service.
        EndpointSpec endpoint = 8;
    }
  15. Service Object (Protobuf Example)
    message Service {
        ServiceSpec spec = 3;

        // UpdateStatus contains the status of an update, if one is in
        // progress.
        UpdateStatus update_status = 5;

        // Runtime state of service endpoint. This may be different
        // from the spec version because the user may not have entered
        // the optional fields like node_port or virtual_ip and it
        // could be auto allocated by the system.
        Endpoint endpoint = 4;
    }
  16. Task Data Flow
    [Diagram: Manager: ServiceSpec (TaskSpec), Service (ServiceSpec, TaskSpec), Task (TaskSpec); Worker: Task]
  17. Consistency

  18. Consistency: Field Ownership
    Only one component of the system can write to a field
  19. TaskSpec (Protobuf Examples)
    message TaskSpec {
        oneof runtime {
            NetworkAttachmentSpec attachment = 8;
            ContainerSpec container = 1;
        }

        // Resource requirements for the container.
        ResourceRequirements resources = 2;

        // RestartPolicy specifies what to do when a task fails or finishes.
        RestartPolicy restart = 4;

        // Placement specifies node selection constraints
        Placement placement = 5;

        // Networks specifies the list of network attachment
        // configurations (which specify the network and per-network
        // aliases) that this task spec is bound to.
        repeated NetworkAttachmentConfig networks = 7;
    }
  20. Task (Protobuf Example)
    message Task {
        TaskSpec spec = 3;
        string service_id = 4;
        uint64 slot = 5;
        string node_id = 6;
        TaskStatus status = 9;
        TaskState desired_state = 10;
        repeated NetworkAttachment networks = 11;
        Endpoint endpoint = 12;
        Driver log_driver = 13;
    }
    [Legend: Owner: User, Orchestrator, Allocator, Scheduler, Shared]
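
Field ownership, where only one component may write a given field, can be enforced with a small write guard. The sketch below is hypothetical throughout: the transcript does not preserve which owner the slide assigns to which field, so the `ASSUMED_OWNERS` mapping is an illustrative guess, and `OwnedRecord` and `OwnershipError` are invented names.

```python
# Assumption: this field-to-owner mapping is illustrative only; the slide's
# actual assignment is not recoverable from the transcript.
ASSUMED_OWNERS = {
    "spec": "user",
    "service_id": "orchestrator",
    "slot": "orchestrator",
    "desired_state": "orchestrator",
    "node_id": "scheduler",
    "networks": "allocator",
    "endpoint": "allocator",
}


class OwnershipError(Exception):
    """Raised when a component writes a field it does not own."""


class OwnedRecord:
    """A record in which each field has exactly one writer."""

    def __init__(self, owners):
        self._owners = dict(owners)
        self._values = {}

    def write(self, component, field, value):
        owner = self._owners.get(field)
        if owner is None:
            raise KeyError(f"unknown field: {field}")
        if owner != component:
            raise OwnershipError(
                f"{component} may not write {field}; owner is {owner}")
        self._values[field] = value

    def read(self, field):
        # Reads are unrestricted: every component may observe every field.
        return self._values.get(field)
```

With a single writer per field, two components can never race to set the same value, so concurrent updates to one object do not need to be reconciled field by field.
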
  21. Task State
    [Diagram: task states, grouped under the labels Manager, Worker, Pre-Run, Terminal States]
    New, Allocated, Assigned, Preparing, Ready, Starting, Running, Complete, Shutdown, Failed, Rejected
  22. Field Handoff: Task Status
    State        Owner
    < Assigned   Manager
    >= Assigned  Worker
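
The handoff rule in this table depends on task states being ordered, so that "< Assigned" is a meaningful comparison. A minimal sketch, assuming the states from the Task State slide can be modeled as an ordered enum; the numeric values, the position of `PREPARING`, and the `status_owner` helper are assumptions for illustration:

```python
from enum import IntEnum


class TaskState(IntEnum):
    # Assumption: numeric values are arbitrary; only the ordering matters.
    NEW = 0
    ALLOCATED = 1
    ASSIGNED = 2
    PREPARING = 3
    READY = 4
    STARTING = 5
    RUNNING = 6
    COMPLETE = 7
    SHUTDOWN = 8
    FAILED = 9
    REJECTED = 10


def status_owner(state):
    """Return which component may write the task status in this state."""
    return "manager" if state < TaskState.ASSIGNED else "worker"
```

Because ownership is a pure function of the current state, the manager and worker never need to negotiate who writes next: once a task reaches Assigned, the status field belongs to the worker.
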
  23. Observability and Controllability: The Problem
    [Diagram: scale from Low Observability to High Observability, with labels Failure, Process State, User Input]
  24. Orchestration: A control system for your cluster
    [Diagram: control loop with labels D, O, Δ, Cluster, S_t]
    D = Desired State
    O = Orchestrator
    C = Cluster
    S_t = State at time t
    Δ = Operations to converge S to D
    https://en.wikipedia.org/wiki/Control_theory
  25. Task Model: Atomic Scheduling Unit of SwarmKit
    [Diagram: Orchestrator; Object (Current State), Spec (Desired State); Task 0, Task 1, …, Task n; Scheduler]
  26. SwarmKit doesn’t Quit

  27. Links
    Documentation
    - Docker Swarm Mode
    Source Code
    - SwarmKit
    - SwarmKit Protobuf/GRPC
    Interesting Topics
    - Borg Paper
    - Raft Consensus Algorithm
    - Control Theory
  28. THANK YOU