Slide 1

Slide 1 text

RAFT
An adventure in keeping your data in sync
Joshua Thijssen, @JayTaph

Slide 2

Slide 2 text

Set X = 5
ok!
What is X?
5!

Slide 3

Slide 3 text

What is X?
5!
...
X ?!
3.14

Slide 4

Slide 4 text

Gunter, we need a distributed consensus algorithm!
Wenk!

Slide 5

Slide 5 text

How do we make sure that...
➡ Data stays consistent
➡ Servers can fail at any time
➡ We don't rely on unreliable means (time, IDs, network, etc.)

Slide 6

Slide 6 text

Replicated State Machine
Journal / Log:
#1 set Y=5
#2 inc Y
#3 set X = 3
State machine:
after #1: Y=5
after #2: Y=6
after #3: Y=6, X=3
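
A minimal sketch of the idea on this slide, assuming a toy command set with only `set` and `inc`. The function name `apply_log` and the tuple encoding of commands are illustrative choices, not part of the talk. Any server that replays the same log in the same order ends up with the same state.

```python
def apply_log(log):
    """Replay a command log in order; return the state after each entry."""
    state = {}
    history = []
    for op, key, *args in log:
        if op == "set":
            state[key] = args[0]
        elif op == "inc":
            state[key] = state.get(key, 0) + 1
        else:
            raise ValueError(f"unknown command: {op}")
        history.append(dict(state))  # copy, so later commands do not alter it
    return history


# The three journal entries from the slide.
log = [("set", "Y", 5), ("inc", "Y"), ("set", "X", 3)]
history = apply_log(log)

for index, state in enumerate(history, start=1):
    print(f"#{index}: {state}")
```

Because the state is derived only from the log, keeping servers in sync reduces to keeping their logs identical.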

Slide 7

Slide 7 text

Where?
➡ etcd / CoreOS
➡ ZooKeeper
➡ MongoDB
➡ Elasticsearch
➡ ...basically everything clustered

Slide 8

Slide 8 text

PAXOS
Paxos is like, way too complex!

Slide 9

Slide 9 text

Diego Ongaro, John Ousterhout
In Search of an Understandable Consensus Algorithm (Extended Version)

Slide 10

Slide 10 text

RAFT

Slide 11

Slide 11 text

➡ https://raft.github.io
➡ https://raft.github.io/raft.pdf

Slide 12

Slide 12 text

➡ Proven to be correct
➡ Designed for simplicity
➡ Relatively easy to implement
➡ "Good enough" for most cases

Slide 13

Slide 13 text

➡ Cluster of servers, usually 3 or 5.
➡ One single leader.
➡ Clients communicate with the leader only.
➡ Leader sends log entries to the other members.

Slide 14

Slide 14 text

Three main pillars
➡ Leader Election
➡ Log Replication
➡ Safety

Slide 15

Slide 15 text

Leader Election
"One leader to rule them all (without the orcs and stuff)"

Slide 16

Slide 16 text

Server states: Follower, Candidate, Leader

Slide 17

Slide 17 text

I'm the leader!

Slide 18

Slide 18 text

... Wenk!???

Slide 19

Slide 19 text

... Vote for me as leader!

Slide 20

Slide 20 text

... Wenk! Wenk!

Slide 21

Slide 21 text

... I'm the leader!

Slide 22

Slide 22 text

Hey guys, I'm back! I'm the leader now!

Slide 23

Slide 23 text

I'm the leader for term 1!
I'm the leader for term 2!
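
In Raft, the two competing claims on this slide are settled by comparing term numbers: a message from an older term is rejected, and a server that sees a newer term adopts it and falls back to follower. The sketch below illustrates only that rule. The `Server` class and its `observe_term` method are hypothetical names, and real implementations carry the term inside RPC requests and responses rather than passing it directly.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    current_term: int = 0
    role: str = "follower"

    def observe_term(self, term):
        """Handle the term carried by an incoming message.

        Returns True if the message may be processed, False if it is stale.
        """
        if term > self.current_term:
            # Someone is ahead of us: adopt the term and step down.
            self.current_term = term
            self.role = "follower"
            return True
        return term == self.current_term


old_leader = Server("old", current_term=1, role="leader")
new_leader = Server("new", current_term=2, role="leader")

# The returning term-1 leader contacts the term-2 leader: rejected as stale.
accepted = new_leader.observe_term(old_leader.current_term)

# The reply carries term 2, so the old leader steps down.
old_leader.observe_term(new_leader.current_term)

print(accepted, old_leader.role, old_leader.current_term)
```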

Slide 24

Slide 24 text


Slide 25

Slide 25 text


Slide 26

Slide 26 text

Servers  Quorum Size  Failure Tolerance
1        1            0
2        2            0
3        2            1
4        3            1
5        3            2
6        4            2
7        4            3
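
The table follows from simple arithmetic: a quorum is a strict majority, and the failure tolerance is whatever is left over. A small sketch that reproduces the rows (the function names are illustrative):

```python
def quorum_size(servers):
    """Smallest number of servers that forms a strict majority."""
    return servers // 2 + 1


def failure_tolerance(servers):
    """How many servers may fail while a quorum can still be formed."""
    return servers - quorum_size(servers)


rows = [(n, quorum_size(n), failure_tolerance(n)) for n in range(1, 8)]

for servers, quorum, tolerance in rows:
    print(f"{servers:<8} {quorum:<12} {tolerance}")
```

Note that going from an odd cluster size to the next even one raises the quorum without raising the failure tolerance, which is one reason clusters of 3 or 5 are common.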

Slide 27

Slide 27 text

Three main pillars
➡ Leader Election
➡ Log Replication
➡ Safety

Slide 28

Slide 28 text

Log Replication
"How do we make sure that stuff also happens somewhere else"

Slide 29

Slide 29 text

index:    1    2    3    4    5    6    7    8    9    10   11   12
term:     1    1    1    2    3    3    3    3
command:  x←3  x←1  y←9  x←2  x←0  y←7  x←5  x←4
Legend: committed / uncommitted entries

Slide 30

Slide 30 text

Leader receives commands and appends them to its log
Leader (index 1-8): 1 x←3, 1 x←1, 1 y←9, 2 x←2, 3 x←0, 3 y←7, 3 x←5, 3 x←4
Followers (4 shown): each holds index 1-6: 1 x←3, 1 x←1, 1 y←9, 2 x←2, 3 x←0, 3 y←7

Slide 31

Slide 31 text

Leader pushes entries to the cluster
Leader (index 1-8): 1 x←3, 1 x←1, 1 y←9, 2 x←2, 3 x←0, 3 y←7, 3 x←5, 3 x←4
Followers (4 shown): all hold index 1-6; two have also received 3 x←5 and 3 x←4

Slide 32

Slide 32 text

Leader commits after majority and OKs to client
Leader (index 1-8): 1 x←3, 1 x←1, 1 y←9, 2 x←2, 3 x←0, 3 y←7, 3 x←5, 3 x←4
Followers (4 shown): all hold index 1-6; two also hold 3 x←5 and 3 x←4

Slide 33

Slide 33 text

Followers receive log entries and the commitIndex from the leader
Leader (index 1-8): 1 x←3, 1 x←1, 1 y←9, 2 x←2, 3 x←0, 3 y←7, 3 x←5, 3 x←4
Followers (4 shown): all hold index 1-8, including 3 x←5 and 3 x←4
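
The commit step in the preceding slides can be sketched as follows. This is a simplified illustration, not the talk's code: `advance_commit_index` and its parameters are hypothetical names, the starting commit index of 6 is an assumption, and the extra condition that only entries from the leader's current term are committed by counting replicas comes from the Raft paper rather than from the slides.

```python
def advance_commit_index(log_terms, match_index, commit_index, current_term):
    """Return the highest index stored on a majority of servers.

    log_terms[i - 1] is the term of the leader's entry at index i.
    match_index holds, per server (leader included), the highest index
    known to be stored on that server.
    """
    majority = len(match_index) // 2 + 1
    for n in range(len(log_terms), commit_index, -1):
        replicated = sum(1 for m in match_index if m >= n)
        # Raft paper rule: only count replicas for current-term entries.
        if replicated >= majority and log_terms[n - 1] == current_term:
            return n
    return commit_index


# Terms of the leader's eight entries, as shown on the slides.
log_terms = [1, 1, 1, 2, 3, 3, 3, 3]

# Only the leader holds entries 7 and 8: nothing new can be committed.
before = advance_commit_index(log_terms, [8, 6, 6, 6, 6], 6, current_term=3)

# Two followers have acknowledged entries 7 and 8: 3 of 5 is a majority.
after = advance_commit_index(log_terms, [8, 8, 8, 6, 6], 6, current_term=3)

print(before, after)
```

The leader then includes the new commit index in its next messages, which is how the remaining followers learn that entries 7 and 8 are safe to apply.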

Slide 34

Slide 34 text

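Leader log compared with six possible follower logs (a-f); each cell is the term of the entry at that index.
index:   1  2  3  4  5  6  7  8  9  10 11 12
leader:  1  1  1  4  4  5  5  6  6  6
(a):     1  1  1  4  4  5  5  6  6
(b):     1  1  1  4
(c):     1  1  1  4  4  5  5  6  6  6  6
(d):     1  1  1  4  4  5  5  6  6  6  7  7
(e):     1  1  1  4  4  4  4
(f):     1  1  1  2  2  2  3  3  3  3  3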

Slide 35

Slide 35 text

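Snapshots
Log before compaction (term command):
index 1: 1 x←3, index 2: 1 x←1, index 3: 1 y←9, index 4: 2 x←2, ..., index 1006: 23 y←7
Snapshot:
last included index: 1006
last included term: 23
state: Y=7, X=4, Z="aap"
Log after the snapshot:
index 1007: 24 x←5, index 1008: 24 x←4, index 1009: 24 x←2, index 1010: 24 y←4 (1011 and 1012 empty)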
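
A rough sketch of the compaction shown on this slide: everything up to the last included index is replaced by the state machine's contents at that point, and only later entries stay in the log. The `Snapshot` class and `compact` function are illustrative names. The entries hidden behind the "..." on the slide are simply left out here, and the state values are copied from the slide rather than recomputed.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    last_included_index: int
    last_included_term: int
    state: dict


def compact(entries, state, upto_index):
    """Replace all entries up to upto_index with a snapshot of `state`.

    entries is a list of (index, term, command) tuples; `state` must be
    the state machine contents after applying the entry at upto_index.
    """
    last = next(e for e in entries if e[0] == upto_index)
    snapshot = Snapshot(upto_index, last[1], dict(state))
    remaining = [e for e in entries if e[0] > upto_index]
    return snapshot, remaining


# Only the entries visible on the slide; the elided ones are omitted.
entries = [
    (1, 1, "x←3"), (2, 1, "x←1"), (3, 1, "y←9"), (4, 2, "x←2"),
    (1006, 23, "y←7"),
    (1007, 24, "x←5"), (1008, 24, "x←4"), (1009, 24, "x←2"), (1010, 24, "y←4"),
]
state_at_1006 = {"Y": 7, "X": 4, "Z": "aap"}  # taken from the slide

snapshot, remaining = compact(entries, state_at_1006, 1006)
print(snapshot)
print([index for index, _, _ in remaining])
```

The last included index and term are kept so that the first entry after the snapshot can still be checked against its predecessor.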

Slide 36

Slide 36 text

Three main pillars
➡ Leader Election
➡ Log Replication
➡ Safety

Slide 37

Slide 37 text

Safety

Slide 38

Slide 38 text

➡ #1 For any given term, there can be at most one leader.
➡ #2 A server will vote for at most one candidate per term.
➡ #3 If two logs contain an entry with the same index and term, both logs are identical up to the given index.
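
Rule #2 is what makes rule #1 possible: if every server votes at most once per term, two candidates cannot both collect a majority in the same term. A minimal sketch of that bookkeeping, with the hypothetical names `Voter` and `request_vote`. It deliberately leaves out the log comparison a real server also performs before granting a vote.

```python
class Voter:
    def __init__(self):
        self.current_term = 0
        self.voted_for = None

    def request_vote(self, term, candidate_id):
        """Grant at most one vote per term."""
        if term < self.current_term:
            return False  # stale candidate
        if term > self.current_term:
            # A new term starts with a fresh, unused vote.
            self.current_term = term
            self.voted_for = None
        if self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True
        return False


voter = Voter()
results = [
    voter.request_vote(1, "A"),  # first request in term 1
    voter.request_vote(1, "B"),  # second candidate, same term
    voter.request_vote(1, "A"),  # repeated request from the same candidate
    voter.request_vote(2, "B"),  # new term
    voter.request_vote(1, "A"),  # request from an old term
]
print(results)
```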

Slide 39

Slide 39 text

➡ #4 The leader in a term contains all the entries committed in previous terms.
➡ #5 Leaders only ever append new entries to their logs. Entries in a leader's log are never overwritten or deleted.
➡ #6 If server S1 applies committed log entry 'e' at index 'i' to its state machine, then no other server can apply a different committed log entry at index 'i' to theirs.
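
Rule #4 is not spelled out further on the slide. In the Raft paper it is enforced during voting: a server refuses to vote for a candidate whose log is less up to date than its own. The sketch below shows that comparison only, and the function and parameter names are illustrative.

```python
def candidate_is_up_to_date(candidate_last_term, candidate_last_index,
                            voter_last_term, voter_last_index):
    """Compare logs by the term and index of their last entries.

    The log whose last entry has the higher term is more up to date;
    with equal last terms, the longer log is more up to date.
    """
    if candidate_last_term != voter_last_term:
        return candidate_last_term > voter_last_term
    return candidate_last_index >= voter_last_index


# Same last term, candidate has the shorter log: no vote.
print(candidate_is_up_to_date(3, 6, 3, 8))

# Candidate's last entry comes from a newer term: vote may be granted.
print(candidate_is_up_to_date(4, 5, 3, 8))
```

Since a committed entry is stored on a majority and a candidate needs votes from a majority, at least one voter holds every committed entry, and this check keeps a candidate that lacks one from winning.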

Slide 40

Slide 40 text

PHP?
➡ Actually, sort of
➡ https://github.com/Waqee/Raft-php
➡ https://github.com/JayTaph/raft

Slide 41

Slide 41 text

Questions?