Slide 1

Slide 1 text

Heart of the SwarmKit: Object Model
Stephen Day, Docker, Inc.
Docker Distributed Systems Summit, Berlin, October 2016
v1

Slide 2

Slide 2 text

Stephen Day
Docker, Inc.
github.com/stevvooe
@stevvooe

Slide 3

Slide 3 text

SwarmKit
A new framework by Docker for building orchestration systems.

Slide 4

Slide 4 text

Orchestration
A control system for your cluster

[Diagram: control loop connecting D, O, Δ, the Cluster, and S_t]

D = Desired State
O = Orchestrator
C = Cluster
S_t = State at time t
Δ = Operations to converge S to D

https://en.wikipedia.org/wiki/Control_theory
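The control-loop picture can be made concrete with a small sketch. This is not SwarmKit code: representing state as a map from service name to replica count, and the names `Operation` and `diff`, are assumptions made purely for illustration. Given a desired state D and an observed state S_t, the function computes a set of operations Δ that would move the cluster toward D.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    """One element of Δ (hypothetical type): start or stop replicas."""
    action: str   # "start" or "stop"
    service: str
    count: int


def diff(desired: dict[str, int], observed: dict[str, int]) -> list[Operation]:
    """Return operations Δ that converge the observed state S_t to D.

    Both arguments map a service name to a replica count.
    """
    ops = []
    # Visit every service mentioned in either state, in a stable order.
    for service in sorted(desired.keys() | observed.keys()):
        want = desired.get(service, 0)
        have = observed.get(service, 0)
        if want > have:
            ops.append(Operation("start", service, want - have))
        elif want < have:
            ops.append(Operation("stop", service, have - want))
    return ops


if __name__ == "__main__":
    D = {"redis": 3, "web": 2}
    S_t = {"redis": 1, "web": 2, "old-job": 1}
    for op in diff(D, S_t):
        print(op)
```

When S_t already equals D the list is empty, which is the steady state of the loop.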

Slide 5

Slide 5 text

Convergence
A functional view

D = Desired State
O = Orchestrator
C = Cluster
S_t = State at time t

f(D, S_{n-1}, C) → S_n | min(S - D)
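The functional view can be illustrated by iterating f until the gap between S and D disappears. The sketch below reduces state to a single replica count and lets a `capacity` number stand in for the cluster C (the most replicas that can change in one round). Both simplifications, and the names `step` and `converge`, are assumptions for illustration rather than anything taken from SwarmKit.

```python
def step(desired: int, previous: int, capacity: int) -> int:
    """One application of f(D, S_{n-1}, C) -> S_n.

    Moves the state toward `desired` by at most `capacity` replicas.
    """
    gap = desired - previous
    change = max(-capacity, min(capacity, gap))
    return previous + change


def converge(desired: int, initial: int, capacity: int) -> list[int]:
    """Apply `step` repeatedly until the state equals the desired state.

    Returns every intermediate state so the shrinking gap is visible.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    states = [initial]
    while states[-1] != desired:
        states.append(step(desired, states[-1], capacity))
    return states


if __name__ == "__main__":
    print(converge(desired=7, initial=0, capacity=3))
```

Each round strictly reduces |S - D|, so the loop terminates for any positive capacity.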

Slide 6

Slide 6 text

Observability and Controllability
The Problem

[Figure: scale from Low Observability to High Observability, with labels Failure, Process State, User Input]

Slide 7

Slide 7 text

Data Model Requirements
- Represent difference in cluster state
- Maximize Observability
- Support Convergence
- Do this while being Extensible and Reliable

Slide 8

Slide 8 text

Show me your data structures and I’ll show you your orchestration system

Slide 9

Slide 9 text

Services
- Express desired state of the cluster
- Abstraction to control a set of containers
- Enumerates resources, network availability, placement
- Leave the details of runtime to container process
- Implement these services by distributing processes across a cluster

[Figure: Node 1, Node 2, Node 3]

Slide 10

Slide 10 text

Declarative

$ docker network create -d overlay backend
31ue4lvbj4m301i7ef3x8022t
$ docker service create -p 6379:6379 --network backend redis
bhk0gw6f0bgrbhmedwt5lful6
$ docker service scale serene_euler=3
serene_euler scaled to 3
$ docker service ls
ID            NAME          REPLICAS  IMAGE  COMMAND
dj0jh3bnojtm  serene_euler  3/3       redis

Slide 11

Slide 11 text

Reconciliation
Spec → Object

Object: Current State
Spec: Desired State
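One way to picture the Spec/Object split is a record that wraps the user's spec and adds system-maintained current state. The sketch below is illustrative only: the `Spec` and `ServiceObject` types, their fields, and the three helper functions are hypothetical and much simpler than the real protobuf messages.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Spec:
    """Desired state, supplied by the user."""
    image: str
    replicas: int


@dataclass(frozen=True)
class ServiceObject:
    """Current state kept by the system, wrapping the spec it was built from."""
    id: str
    spec: Spec
    running: int = 0


def create(object_id: str, spec: Spec) -> ServiceObject:
    """Spec -> Object: a new object starts with nothing running."""
    return ServiceObject(id=object_id, spec=spec)


def update_spec(obj: ServiceObject, spec: Spec) -> ServiceObject:
    """Changing desired state swaps the spec and leaves current state alone."""
    return replace(obj, spec=spec)


def is_reconciled(obj: ServiceObject) -> bool:
    """True when current state matches the desired state in the spec."""
    return obj.running == obj.spec.replicas
```

Keeping the spec inside the object means a reader can always compare what was asked for with what exists, without consulting a second source.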

Slide 12

Slide 12 text

Task Model
Atomic Scheduling Unit of SwarmKit

Object: Current State
Spec: Desired State

[Diagram labels: Orchestrator, Task 0, Task 1, …, Task n, Scheduler]
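The division of labor between orchestrator and scheduler can be sketched as two small functions over a list of tasks. The field names `service_id`, `slot`, and `node_id` follow the Task message shown later in the deck; everything else (1-based slots, round-robin placement, the function names) is an assumption for illustration and not how SwarmKit actually places tasks.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Optional


@dataclass
class Task:
    service_id: str
    slot: int
    node_id: Optional[str] = None  # unset until the scheduler picks a node


def orchestrate(service_id: str, replicas: int, existing: list[Task]) -> list[Task]:
    """Create one task for each slot in 1..replicas that has no task yet."""
    used = {t.slot for t in existing if t.service_id == service_id}
    return [
        Task(service_id, slot)
        for slot in range(1, replicas + 1)
        if slot not in used
    ]


def schedule(tasks: list[Task], nodes: list[str]) -> None:
    """Assign every unassigned task to a node, round-robin (a stand-in policy)."""
    if not nodes:
        raise ValueError("no nodes available")
    node_iter = cycle(nodes)
    for task in tasks:
        if task.node_id is None:
            task.node_id = next(node_iter)
```

The orchestrator decides how many tasks should exist; the scheduler only decides where each one runs.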

Slide 13

Slide 13 text

Task Model
Runtime

Prepare: set up resources
Start: start the task
Wait: wait until task exits
Shutdown: stop task, cleanly
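The four runtime steps map naturally onto an interface that a worker drives in order. The sketch below is hypothetical: `TaskRuntime`, `RecordingRuntime`, and `run_task` are illustrative names, `wait` returning an exit code is an assumption, and always calling `shutdown` after `wait` is a choice made for this example rather than a statement about SwarmKit's behavior.

```python
from abc import ABC, abstractmethod


class TaskRuntime(ABC):
    """The four steps from the slide, as an abstract interface."""

    @abstractmethod
    def prepare(self) -> None:
        """Set up resources the task needs."""

    @abstractmethod
    def start(self) -> None:
        """Start the task."""

    @abstractmethod
    def wait(self) -> int:
        """Block until the task exits and return its exit code."""

    @abstractmethod
    def shutdown(self) -> None:
        """Stop the task cleanly."""


class RecordingRuntime(TaskRuntime):
    """A fake runtime that only records which steps were called."""

    def __init__(self, exit_code: int = 0) -> None:
        self.exit_code = exit_code
        self.calls: list[str] = []

    def prepare(self) -> None:
        self.calls.append("prepare")

    def start(self) -> None:
        self.calls.append("start")

    def wait(self) -> int:
        self.calls.append("wait")
        return self.exit_code

    def shutdown(self) -> None:
        self.calls.append("shutdown")


def run_task(runtime: TaskRuntime) -> int:
    """Drive prepare, start, and wait in order; shut down even on error."""
    runtime.prepare()
    try:
        runtime.start()
        return runtime.wait()
    finally:
        runtime.shutdown()
```

Because the driver depends only on the interface, a container runtime or any other process runner could be swapped in behind it.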

Slide 14

Slide 14 text

Service Spec
Protobuf Example

message ServiceSpec {
    // Task defines the task template this service will spawn.
    TaskSpec task = 2 [(gogoproto.nullable) = false];

    // UpdateConfig controls the rate and policy of updates.
    UpdateConfig update = 6;

    // Service endpoint specifies the user provided configuration
    // to properly discover and load balance a service.
    EndpointSpec endpoint = 8;
}

Slide 15

Slide 15 text

Service Object
Protobuf Example

message Service {
    ServiceSpec spec = 3;

    // UpdateStatus contains the status of an update, if one is in
    // progress.
    UpdateStatus update_status = 5;

    // Runtime state of service endpoint. This may be different
    // from the spec version because the user may not have entered
    // the optional fields like node_port or virtual_ip and it
    // could be auto allocated by the system.
    Endpoint endpoint = 4;
}

Slide 16

Slide 16 text

Data Flow

[Diagram: on the Manager, a ServiceSpec (containing a TaskSpec) is stored in a Service; a Task carrying the TaskSpec is derived from it and passed to the Worker]

Slide 17

Slide 17 text

Consistency

Slide 18

Slide 18 text

Field Ownership
Only one component of the system can write to a field.

Consistency
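The single-writer rule can be sketched as a record that checks the caller against a per-field owner table before accepting a write. This is an illustration of the idea only: `OwnedRecord`, `OwnershipError`, and the example owner table are hypothetical, and the talk does not say that ownership is enforced at runtime this way.

```python
class OwnershipError(Exception):
    """Raised when a component writes to a field it does not own."""


class OwnedRecord:
    """A record in which each field has exactly one writing component."""

    def __init__(self, owners: dict[str, str]) -> None:
        self._owners = dict(owners)   # field name -> owning component
        self._values: dict[str, object] = {}

    def write(self, component: str, field: str, value: object) -> None:
        """Store `value` in `field` if `component` owns that field."""
        if field not in self._owners:
            raise KeyError(field)
        owner = self._owners[field]
        if owner != component:
            raise OwnershipError(
                f"{component} cannot write {field}; it is owned by {owner}"
            )
        self._values[field] = value

    def read(self, field: str) -> object:
        """Any component may read any field; unset fields read as None."""
        return self._values.get(field)


if __name__ == "__main__":
    # The owner table here is an assumed example, not taken from the slides.
    task = OwnedRecord({"spec": "user", "node_id": "scheduler"})
    task.write("scheduler", "node_id", "node-1")
    print(task.read("node_id"))
```

With one writer per field, two components never race to update the same value, so conflicting writes are ruled out by construction.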

Slide 19

Slide 19 text

TaskSpec
Protobuf Examples

message TaskSpec {
    oneof runtime {
        NetworkAttachmentSpec attachment = 8;
        ContainerSpec container = 1;
    }

    // Resource requirements for the container.
    ResourceRequirements resources = 2;

    // RestartPolicy specifies what to do when a task fails or finishes.
    RestartPolicy restart = 4;

    // Placement specifies node selection constraints
    Placement placement = 5;

    // Networks specifies the list of network attachment
    // configurations (which specify the network and per-network
    // aliases) that this task spec is bound to.
    repeated NetworkAttachmentConfig networks = 7;
}

Slide 20

Slide 20 text

Task
Protobuf Example

message Task {
    TaskSpec spec = 3;
    string service_id = 4;
    uint64 slot = 5;
    string node_id = 6;
    TaskStatus status = 9;
    TaskState desired_state = 10;
    repeated NetworkAttachment networks = 11;
    Endpoint endpoint = 12;
    Driver log_driver = 13;
}

[Legend: Owner, one of User, Orchestrator, Allocator, Scheduler, Shared]

Slide 21

Slide 21 text

Task State

Groups: Manager, Worker, Pre-Run, Terminal States
States: New, Allocated, Assigned, Preparing, Ready, Starting, Running, Complete, Shutdown, Failed, Rejected

Slide 22

Slide 22 text

Field Handoff
Task Status

State        Owner
< Assigned   Manager
>= Assigned  Worker
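The handoff rule is a comparison against an ordered set of task states, which an integer enum expresses directly. In the sketch below the state names come from the Task State slide, but the numeric values, the position of Preparing in the order, and the function names are assumptions for illustration; only the relative order matters for the rule.

```python
from enum import IntEnum


class TaskState(IntEnum):
    """Task states in lifecycle order (numeric values are illustrative)."""
    NEW = 0
    ALLOCATED = 1
    ASSIGNED = 2
    PREPARING = 3
    READY = 4
    STARTING = 5
    RUNNING = 6
    COMPLETE = 7
    SHUTDOWN = 8
    FAILED = 9
    REJECTED = 10


def status_owner(state: TaskState) -> str:
    """Return which side owns the task status at the given state."""
    return "manager" if state < TaskState.ASSIGNED else "worker"


def can_write_status(component: str, current: TaskState) -> bool:
    """True if `component` is the owner of the status at `current`."""
    return status_owner(current) == component
```

Ownership of the same field changes exactly once, at Assigned, so the single-writer rule still holds at every point in the lifecycle.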

Slide 23

Slide 23 text

Observability and Controllability
The Problem

[Figure: scale from Low Observability to High Observability, with labels Failure, Process State, User Input]

Slide 24

Slide 24 text

Orchestration
A control system for your cluster

[Diagram: control loop connecting D, O, Δ, the Cluster, and S_t]

D = Desired State
O = Orchestrator
C = Cluster
S_t = State at time t
Δ = Operations to converge S to D

https://en.wikipedia.org/wiki/Control_theory

Slide 25

Slide 25 text

Task Model
Atomic Scheduling Unit of SwarmKit

Object: Current State
Spec: Desired State

[Diagram labels: Orchestrator, Task 0, Task 1, …, Task n, Scheduler]

Slide 26

Slide 26 text

SwarmKit doesn’t Quit

Slide 27

Slide 27 text

Links

Documentation
- Docker Swarm Mode

Source Code
- SwarmKit
- SwarmKit Protobuf/GRPC

Interesting Topics
- Borg Paper
- Raft Consensus Algorithm
- Control Theory

Slide 28

Slide 28 text

THANK YOU