
Orchestrating Linux Containers While Tolerating Failures

Drew Erny
October 06, 2016

A high-level overview of the concepts of failure tolerance when orchestrating containers with Docker's Swarmkit

Transcript

  1. About Me
    Drew Erny. B.S. from University of Alabama, May 2016. Working @ Docker since May on Swarmkit. @dperny on literally every social network ever.
  2. Idea of this Talk
    Introduce, at a high level, the concepts that Docker uses to tolerate failures.
  3. Contents of this talk
    • What is failure?
    • How do we narrow this problem?
    • How can an orchestrator manage failures?
    • How does orchestration work?
  4. [Diagram: services (Ruby on Rectangles, MangoDS, TriangleJS, IDK Maybe Redis) spread across individual hardware boxes in two availability zones.]
  5. Clustering
    [Diagram: the same services (Ruby on Rectangles, MangoDS, TriangleDB, IDK Maybe Redis) grouped into a single cluster.]
  6. Clustering
    Lets us treat many discrete units as one big virtual computer. Gives us a layer of abstraction that can handle failures for us.
  7. Desired State Reconciliation
    Declare what you WANT your application state to be and let the cluster do the heavy lifting. If a failure occurs, the cluster will compensate.
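In code, reconciliation reduces to a control loop: compare observed state with declared state and schedule corrective work. A minimal sketch in Go, with invented types (this is not Swarmkit's actual API):

```go
package main

import "fmt"

// Service declares desired state: which image to run and how many
// replicas of it should exist. Illustrative only, not Swarmkit's API.
type Service struct {
	Name     string
	Image    string
	Replicas int
}

// reconcile compares observed running tasks against the desired count
// and returns how many new tasks to schedule (positive) or how many
// to shut down (negative).
func reconcile(svc Service, running int) int {
	return svc.Replicas - running
}

func main() {
	svc := Service{Name: "rectangles", Image: "ruby-on-rectangles", Replicas: 2}

	// One instance has crashed, so only one is observed running.
	if delta := reconcile(svc, 1); delta > 0 {
		fmt.Printf("schedule %d new task(s) for %s\n", delta, svc.Name)
	}
}
```

Running this prints `schedule 1 new task(s) for rectangles`, which mirrors the walkthrough on the next slides.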
  8. What does this look like?
    Desired state: 2 instances of Ruby on Rectangles, 2 instances of MangoDS. [Diagram: three nodes, each running a Docker Engine, hosting all four containers.]
  9. What does this look like?
    Now Ruby on Rectangles crashes. [Diagram: one Ruby on Rectangles container fails.]
  10. What does this look like?
    Only one instance of Ruby on Rectangles is left running. [Diagram: the failed container is gone.]
  11. What does this look like?
    And a new one is spawned. [Diagram: a replacement Ruby on Rectangles container is scheduled.]
  12. What does this look like?
    The replacement is running; the desired state holds again. [Diagram: two instances of each service.]
  13. What does this look like?
    Now a node failure occurs. [Diagram: a node hosting one Ruby on Rectangles and one MangoDS container goes down.]
  14. What does this look like?
    Two containers down. [Diagram: one instance of each service remains.]
  15. What does this look like?
    Schedule 2 new ones. [Diagram: replacement containers are placed on the surviving nodes.]
  16. What does this look like?
    Problem solved! [Diagram: both services are back at two instances each.]
  17. What does this look like?
    Node comes back up. Nothing changes! We already have the desired state. [Diagram: the recovered node rejoins empty.]
  18. Some Vocabulary
    Node - an individual unit of available computing resources. One node is generally one Docker Engine.
    Task - an individual atomic scheduling unit, belonging to a service. One task is generally one container.
    Service - an individual unit of desired state. Defines what application to run and how many replicas.
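These three concepts map naturally onto simple data types. A hypothetical sketch in Go (all field names here are illustrative, not Swarmkit's real definitions):

```go
package swarm // hypothetical package name for this sketch

// Node is an individual unit of available computing resources;
// one node is generally one Docker Engine.
type Node struct {
	ID       string
	Hostname string
}

// Service is an individual unit of desired state: which application
// to run and how many replicas of it.
type Service struct {
	ID       string
	Image    string
	Replicas int
}

// Task is an individual atomic scheduling unit belonging to a service;
// one task is generally one container, placed on one node.
type Task struct {
	ID        string
	ServiceID string
	NodeID    string
	State     string // e.g. "running" or "failed"
}
```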
  19. [Image-only slide.]
  20. Managers
    Make the decisions about what, where, and how a service will run. Watch the workers for failures and adjust accordingly.
  21. [Image-only slide.]
  22. The Raft Algorithm
    • One manager is elected Leader; all others are Followers.
    • The Leader is the ultimate endpoint for all requests.
    • The Leader informs all Followers about log changes, and waits for acknowledgement from a quorum (more than half) of Followers before committing.
    • Followers proxy requests to the Leader (all managers are valid endpoints).
    • If the Leader dies or goes missing from the quorum, a new Leader is elected.
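The commit rule at the heart of this is just a majority count. A toy Go sketch of only that quorum check (a real Raft implementation also tracks terms, log indices, and elections):

```go
package main

import "fmt"

// quorum is the smallest number of managers that is strictly more
// than half of the total.
func quorum(total int) int {
	return total/2 + 1
}

// canCommit reports whether a log entry acknowledged by acks managers
// (the leader counts itself) has reached a quorum and may be committed.
func canCommit(acks, total int) bool {
	return acks >= quorum(total)
}

func main() {
	const managers = 5
	for acks := 1; acks <= managers; acks++ {
		fmt.Printf("%d of %d acks: commit=%v\n", acks, managers, canCommit(acks, managers))
	}
}
```

With 5 managers this prints commit=false for 1 and 2 acks, and commit=true from 3 acks onward.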
  23. The Raft Algorithm
    We can guarantee the correctness of state replication with Raft. Raft lets us proceed as long as more than half of the manager nodes are available.
  24. How many managers?

    Managers | Quorum (strictly greater than half) | Failures Tolerated
    ---------|-------------------------------------|-------------------
           1 | 1 (> 0.5)                           | 0
           2 | 2 (> 1)                             | 0
           3 | 2 (> 1.5)                           | 1
           4 | 3 (> 2)                             | 1
           5 | 3 (> 2.5)                           | 2
           6 | 4 (> 3)                             | 2
           7 | 4 (> 3.5)                           | 3
           n | FLOOR(n/2) + 1                      | CEILING(n/2) - 1
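The whole table follows from two formulas: quorum = FLOOR(n/2) + 1, and failures tolerated = n - quorum = CEILING(n/2) - 1. A quick arithmetic check in Go (not Swarmkit code):

```go
package main

import "fmt"

func main() {
	fmt.Println("Managers  Quorum  Failures tolerated")
	for n := 1; n <= 7; n++ {
		q := n/2 + 1 // smallest strict majority: FLOOR(n/2) + 1
		f := n - q   // how many managers can fail with a quorum left
		fmt.Printf("%8d  %6d  %18d\n", n, q, f)
	}
}
```

Because CEILING(n/2) - 1 is the same for n = 2k and n = 2k - 1, adding a manager to reach an even total buys no extra fault tolerance, which is why odd manager counts are recommended.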
  25. Workers
    • Connect to the Docker Engine
    • Actually spawn the containers
    • Report back container status
    • Do not participate in the decision-making process
    • Route requests internally among themselves
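A worker's job can be pictured as a simple loop: receive task assignments from the managers, start containers through the local Engine, and report status back. A hypothetical sketch in Go (the Engine interface and all types here are invented for illustration):

```go
package worker // hypothetical package; types invented for this sketch

// Engine abstracts the local Docker Engine's container API.
type Engine interface {
	StartContainer(image string) (id string, err error)
}

// Assignment is a task a manager has placed on this worker.
type Assignment struct {
	TaskID string
	Image  string
}

// Status is what the worker reports back to the managers.
type Status struct {
	TaskID string
	State  string // "running" or "failed"
}

// Run starts each assigned task on the local Engine and reports the
// resulting status upstream; it makes no scheduling decisions itself.
func Run(engine Engine, assignments <-chan Assignment, report chan<- Status) {
	for a := range assignments {
		state := "running"
		if _, err := engine.StartContainer(a.Image); err != nil {
			state = "failed"
		}
		report <- Status{TaskID: a.TaskID, State: state}
	}
}
```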
  26. Networking
    • Tasks need to find each other
    • Tasks need to know when other tasks fail
    • External requests need to be routed correctly
  27. Gossip Protocol
    Workers share information about which nodes are running which services. Every node must maintain a record of the services on every other node. This happens outside of Raft, so it is eventually consistent, not guaranteed consistent.
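The essential gossip mechanic is that each node periodically merges a peer's view of "who runs what" into its own, so records spread without a central coordinator. A minimal sketch in Go, assuming a made-up View type (real gossip protocols also handle versioning, deletions, and failure detection):

```go
package main

import "fmt"

// View maps a node ID to the set of services running on that node.
type View map[string]map[string]bool

// merge folds a peer's view into ours. Repeated pairwise exchanges
// spread every node's records to every other node, but only
// eventually; readers may see stale data in the meantime.
func merge(mine, peers View) {
	for node, services := range peers {
		if mine[node] == nil {
			mine[node] = map[string]bool{}
		}
		for svc := range services {
			mine[node][svc] = true
		}
	}
}

func main() {
	a := View{"node-1": {"rectangles": true}}
	b := View{"node-2": {"mangods": true}}

	merge(a, b) // node-1 learns which services node-2 runs
	fmt.Println(a)
}
```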
  28. Recap
    • Failures can occur at many different levels.
    • If we focus on container failures, we narrow the problem.
    • Orchestration does the heavy lifting for us.
    • Swarmkit uses lots of tricks to make orchestration itself failure-tolerant.