Slide 1

Slide 1 text

Consensus: An Introduction to Raft

Slide 2

Slide 2 text

con·sen·sus /kənˈsensəs/ Agreeing upon state across distributed processes even in the presence of failures.

Slide 3

Slide 3 text

Problem • Distributed System • Consistency • Partition tolerance

Slide 4

Slide 4 text

Solution • Quorum • Replicated State Machines
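As a rough illustration of the quorum idea, here is a minimal sketch in Python. The function names `quorum_size` and `tolerated_failures` are made up for this example. It assumes a quorum means a strict majority of the cluster, which is how Raft uses the term.

```python
def quorum_size(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """Number of nodes that can fail while a majority can still be formed."""
    return cluster_size - quorum_size(cluster_size)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

Note that going from 3 to 4 nodes raises the quorum without tolerating any more failures, which is one reason odd cluster sizes are common.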

Slide 5

Slide 5 text

Consensus Data —

Slide 6

Slide 6 text

We are sacrificing Availability

Slide 7

Slide 7 text

Why not Paxos? • Difficult to understand • Not practical enough to implement

Slide 8

Slide 8 text

Raft: A Practical Paxos

Slide 9

Slide 9 text

Components • Consensus Module • State Machine • Log

Slide 10

Slide 10 text

Consensus Module • Roles: Leader, Follower, and Candidate • Time is divided into Terms • Commands: RequestVote and AppendEntries
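The two commands could be modelled as plain message types. This is only a sketch: the field list follows the Raft paper, but the snake_case names, the `(term, command)` tuple used for log entries, and the `is_heartbeat` helper are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class RequestVote:
    """Sent by a candidate to ask for a vote."""
    term: int            # candidate's term
    candidate_id: str    # who is asking for the vote
    last_log_index: int  # index of the candidate's last log entry
    last_log_term: int   # term of the candidate's last log entry


@dataclass
class AppendEntries:
    """Sent by a leader to replicate entries; doubles as the heartbeat."""
    term: int            # leader's term
    leader_id: str
    prev_log_index: int  # index of the entry just before the new ones
    prev_log_term: int   # term of that entry
    leader_commit: int   # leader's commit index
    # Each entry is a hypothetical (term, command) pair.
    entries: List[Tuple[int, str]] = field(default_factory=list)

    def is_heartbeat(self) -> bool:
        # An AppendEntries with no entries serves as a heartbeat.
        return not self.entries
```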

Slide 11

Slide 11 text

Leader: Accept commands from clients, commit entries, and send heartbeats
Follower: Replicate state from leaders and vote for candidates
Candidate: Start and handle leader elections

Slide 12

Slide 12 text

Follower → Candidate: Times out, starts election
Candidate → Candidate: Times out, restarts election
Candidate → Leader: Wins election
Leader → Follower: Discovers new leader, steps down
Candidate → Follower: Discovers current leader or new leader, steps down
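The role changes on this slide can be written down as a small lookup table. This is a sketch only: the event names (`"timeout"`, `"won_election"`, `"discovered_leader"`) are invented for the example, and any unlisted combination is assumed to leave the role unchanged.

```python
from enum import Enum


class Role(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"


# (current role, event) -> next role. Event names are hypothetical.
TRANSITIONS = {
    (Role.FOLLOWER, "timeout"): Role.CANDIDATE,             # starts election
    (Role.CANDIDATE, "timeout"): Role.CANDIDATE,            # restarts election
    (Role.CANDIDATE, "won_election"): Role.LEADER,
    (Role.CANDIDATE, "discovered_leader"): Role.FOLLOWER,   # steps down
    (Role.LEADER, "discovered_leader"): Role.FOLLOWER,      # steps down
}


def next_role(role: Role, event: str) -> Role:
    """Return the role after the event; unlisted pairs keep the current role."""
    return TRANSITIONS.get((role, event), role)
```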

Slide 13

Slide 13 text

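Term: A number that only ever increases, acting as a logical clock. Higher terms are used to determine the current leader and to check log entries. A candidate increments its term each time it starts an election. Any command carrying a term older than the receiver's is ignored.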
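A minimal sketch of the term rule, assuming a hypothetical `Node` record and an `observe_term` helper (neither comes from a real library). An older term is rejected, and a newer term makes the receiver adopt it and fall back to follower.

```python
from dataclasses import dataclass


@dataclass
class Node:
    current_term: int = 0
    role: str = "follower"


def observe_term(node: Node, message_term: int) -> bool:
    """Return True if a message with this term should be processed."""
    if message_term < node.current_term:
        return False  # old term: ignore the command
    if message_term > node.current_term:
        # A newer term exists: adopt it and step down to follower.
        node.current_term = message_term
        node.role = "follower"
    return True
```

For example, a leader at term 1 that sees a message from term 2 ends up as a follower at term 2, while a message from term 0 is dropped without changing anything.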

Slide 14

Slide 14 text

Example: Happy Log Entry

Slide 15

Slide 15 text

A: Role: Leader, Term: 1, Commit Index: 0, Log: []
B: Role: Follower, Term: 1, Commit Index: 0, Log: []
C: Role: Follower, Term: 1, Commit Index: 0, Log: []

Slide 16

Slide 16 text

A: Role: Leader, Term: 1, Commit Index: 0, Log: [˒]
B: Role: Follower, Term: 1, Commit Index: 0, Log: []
C: Role: Follower, Term: 1, Commit Index: 0, Log: []
Leader receives command ˒

Slide 17

Slide 17 text

A: Role: Leader, Term: 1, Commit Index: 0, Log: [˒]
B: Role: Follower, Term: 1, Commit Index: 0, Log: [˒]
C: Role: Follower, Term: 1, Commit Index: 0, Log: [˒]
Leader sends log entries to followers

Slide 18

Slide 18 text

A: Role: Leader, Term: 1, Commit Index: 1, Log: [˒]
B: Role: Follower, Term: 1, Commit Index: 0, Log: [˒]
C: Role: Follower, Term: 1, Commit Index: 0, Log: [˒]
Majority of followers respond with success

Slide 19

Slide 19 text

A: Role: Leader, Term: 1, Commit Index: 1, Log: [˒]
B: Role: Follower, Term: 1, Commit Index: 1, Log: [˒]
C: Role: Follower, Term: 1, Commit Index: 1, Log: [˒]
Leader sends commit index to followers and responds to client
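One way to picture how the leader decides the commit index in this happy path: take the highest log index known to be stored on a majority of nodes. The sketch below is an assumption-laden simplification. `majority_match_index` is a made-up name, the leader's own log is counted toward the majority (as in the Raft paper), and the paper's extra rule that the entry must belong to the leader's current term is left out.

```python
from typing import Dict


def majority_match_index(match_index: Dict[str, int]) -> int:
    """Highest log index stored on a majority of nodes (leader included).

    match_index maps node name -> highest log index known to be stored there.
    """
    if not match_index:
        raise ValueError("match_index must not be empty")
    ordered = sorted(match_index.values(), reverse=True)
    quorum = len(ordered) // 2 + 1
    # The quorum-th largest value is held by at least `quorum` nodes.
    return ordered[quorum - 1]
```

With only the leader holding entry 1, `majority_match_index({"A": 1, "B": 0, "C": 0})` stays at 0. Once the followers have stored it, the result moves to 1.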

Slide 20

Slide 20 text

Example: Sad Log Entry

Slide 21

Slide 21 text

A: Role: Leader, Term: 1, Commit Index: 0, Log: []
B: Role: Follower, Term: 1, Commit Index: 0, Log: []
C: Role: Follower, Term: 1, Commit Index: 0, Log: []

Slide 22

Slide 22 text

A: Role: Leader, Term: 1, Commit Index: 0, Log: [˒]
B: Role: Follower, Term: 1, Commit Index: 0, Log: []
C: Role: Follower, Term: 1, Commit Index: 0, Log: []
Leader receives command ˒

Slide 23

Slide 23 text

A: Role: Leader, Term: 1, Commit Index: 0, Log: [˒]
B: Role: Follower, Term: 1, Commit Index: 0, Log: [˒]
C: Role: Follower, Term: 1, Commit Index: 0, Log: []
Leader sends log entries to followers

Slide 24

Slide 24 text

A: Role: Leader, Term: 1, Commit Index: 0, Log: [˒]
B: Role: Follower, Term: 1, Commit Index: 0, Log: [˒]
C: Role: Follower, Term: 1, Commit Index: 0, Log: []
Majority of followers do not respond

Slide 25

Slide 25 text

A: Role: Leader, Term: 1, Commit Index: 0, Log: [˒]
B: Role: Follower, Term: 1, Commit Index: 0, Log: [˒]
C: Role: Follower, Term: 1, Commit Index: 0, Log: [˒]
Leader continues to retry log entry
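The retry behaviour in this sad path might look roughly like the loop below. Everything here is hypothetical: `replicate_with_retry` and the `send` callback are invented for the example, `send(follower, round_number)` is assumed to return True only when the leader actually receives an acknowledgement, and `max_rounds` exists only so the sketch terminates (a real leader keeps retrying). The leader counts itself toward the majority.

```python
from typing import Callable, List, Optional


def replicate_with_retry(
    followers: List[str],
    send: Callable[[str, int], bool],
    max_rounds: int = 5,
) -> Optional[int]:
    """Resend an entry to unacknowledged followers until a majority holds it.

    Returns the round in which a majority was reached, or None if it never was.
    """
    acked = set()
    cluster_size = len(followers) + 1  # followers plus the leader
    quorum = cluster_size // 2 + 1
    for round_number in range(1, max_rounds + 1):
        for follower in followers:
            # Only retry followers that have not acknowledged yet.
            if follower not in acked and send(follower, round_number):
                acked.add(follower)
        if len(acked) + 1 >= quorum:  # +1 for the leader's own copy
            return round_number
    return None
```

If no acknowledgement arrives before round 3, `replicate_with_retry(["B", "C"], lambda f, r: r >= 3)` returns 3, and the entry stays uncommitted until then.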

Slide 26

Slide 26 text

Example: Leader Failure

Slide 27

Slide 27 text

A: Role: Leader, Term: 1, Commit Index: 0, Log: []
B: Role: Follower, Term: 1, Commit Index: 0, Log: []
C: Role: Follower, Term: 1, Commit Index: 0, Log: []
D: Role: Follower, Term: 1, Commit Index: 0, Log: []

Slide 28

Slide 28 text

A: Role: Leader, Term: 1, Commit Index: 0, Log: []
B: Role: Follower, Term: 1, Commit Index: 0, Log: []
C: Role: Follower, Term: 1, Commit Index: 0, Log: []
D: Role: Follower, Term: 1, Commit Index: 0, Log: []
Followers do not receive heartbeat

Slide 29

Slide 29 text

A: Role: Leader, Term: 1, Commit Index: 0, Log: []
B: Role: Candidate, Term: 2, Commit Index: 0, Log: []
C: Role: Follower, Term: 1, Commit Index: 0, Log: []
D: Role: Follower, Term: 1, Commit Index: 0, Log: []
First follower to time out becomes candidate

Slide 30

Slide 30 text

A: Role: Leader, Term: 1, Commit Index: 0, Log: []
B: Role: Candidate, Term: 2, Commit Index: 0, Log: []
C: Role: Follower, Term: 2, Commit Index: 0, Log: []
D: Role: Follower, Term: 2, Commit Index: 0, Log: []
Candidate starts election and requests votes

Slide 31

Slide 31 text

A: Role: Leader, Term: 1, Commit Index: 0, Log: []
B: Role: Leader, Term: 2, Commit Index: 0, Log: []
C: Role: Follower, Term: 2, Commit Index: 0, Log: []
D: Role: Follower, Term: 2, Commit Index: 0, Log: []
Followers respond with votes
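The election in this example can be replayed with a small simulation. It is a sketch under assumptions: `Node`, `request_vote`, and `run_election` are hypothetical names, messages are plain function calls, the failed leader is modelled with an `alive` flag, and the Raft check that the candidate's log is at least as up to date as the voter's is skipped because every log here is empty.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    term: int = 1
    role: str = "follower"
    voted_for: Optional[str] = None
    alive: bool = True


def request_vote(voter: Node, candidate_name: str, candidate_term: int) -> bool:
    """Return True if the voter grants its vote for this term."""
    if not voter.alive or candidate_term < voter.term:
        return False  # unreachable node, or stale candidate
    if candidate_term > voter.term:
        # Newer term: adopt it, forget any earlier vote, stay a follower.
        voter.term = candidate_term
        voter.voted_for = None
        voter.role = "follower"
    if voter.voted_for in (None, candidate_name):
        voter.voted_for = candidate_name  # at most one vote per term
        return True
    return False


def run_election(candidate: Node, peers: List[Node]) -> bool:
    """Candidate bumps its term, votes for itself, and asks every peer."""
    candidate.term += 1
    candidate.role = "candidate"
    candidate.voted_for = candidate.name
    votes = 1  # the candidate's own vote
    for peer in peers:
        if request_vote(peer, candidate.name, candidate.term):
            votes += 1
    cluster_size = len(peers) + 1
    if votes >= cluster_size // 2 + 1:
        candidate.role = "leader"
    return candidate.role == "leader"


if __name__ == "__main__":
    a = Node("A", role="leader", alive=False)  # failed leader
    b, c, d = Node("B"), Node("C"), Node("D")
    print(run_election(b, [a, c, d]), b, c, d)
```

In the four-node cluster, B collects its own vote plus C's and D's, which is 3 of 4 and enough for a majority even though A never answers. A still believes it is the leader at term 1 and would step down on seeing term 2.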

Slide 32

Slide 32 text

Extras • Log safety and compaction • Cluster changes

Slide 33

Slide 33 text

Real-life Application • Distributed lock server • Configuration management • Background job storage

Slide 34

Slide 34 text

Smart People • Raft Paper by Diego Ongaro and John Ousterhout • Raft Implementation • ThinkDistributed