
Consensus in Distributed Systems

Daniel Upton
September 25, 2018


We rely ever more heavily on distributed systems in our daily lives, from spending money on a debit card to posting a tweet to our followers (tweeple?). We'll dive into the challenges of building such systems, as identified by the CAP theorem, and take a look at a solution offered by "Raft", the consensus algorithm at the core of projects such as Consul, etcd and CockroachDB.

Image Credits:

Photo of L Peter Deutsch - Parma Recordings (source: https://parmarecordings-news.com/the-inside-story-coro-del-mundo-moto-bello-and-l-peter-deutsch/)

Photo of Eric Brewer - CC BY-SA 4.0 (source: https://en.wikipedia.org/wiki/Eric_Brewer_(scientist)#/media/File:TNW_Con_EU15_-_Eric_Brewer_(scientist)-2.jpg)

Raft Logo - CC 3.0 (source: https://raft.github.io/)

Rest of the Owl Meme - (source: https://www.reddit.com/r/funny/comments/eccj2/how_to_draw_an_owl/)


Transcript

  1. Consensus in Distributed Systems

  2. Brian

  3. Brian (an ideas guy)

  4. Brian (an ideas guy): Twitter for Alsatians?

  5. Brian (an ideas guy): Uber for unicycles?

  6. Brian (an ideas guy)

  7. MySQL Database, Ruby on Rails API

  8. MySQL Database, Ruby on Rails API

  9. Leader (master), Follower (slave): Replication

  10. Leader (master), Follower (slave): Replication

  11. Leader (master): Failover

  12. Leader (master), Follower (slave): Network

  13. Leader (master), Follower (slave): Network (actual physical cables and stuff)

  14. Leader (master), Follower (slave): Network (actual physical cables and stuff)

  15. Fallacies of distributed computing
    #1: The network is reliable.
    L Peter Deutsch (et al.)

  16. Leader (master), Follower (slave), Network
    Order #17623

  17. Leader (master), Follower (slave), Network
    Order #17623

  18. Leader (master), Follower (slave), Network
    Order #17623 / "Details of order #17623 please" / "Huh?"

  19. Options (you've got two of 'em)

  20. Leader (master), Follower (slave), Network
    Order #17623 / "Details of order #17623 please" / "Huh?"
    Option A

  21. Leader (master), Follower (slave), Network
    Order #17623 / "No"
    Option C (!)

  22. CAP theorem (paraphrased)
    Eric Brewer
    When operating in a catastrophically broken or unreliable network,
    a distributed system must choose to either risk returning stale/outdated data
    or refuse to accept writes/updates.

  23. CAP theorem (paraphrased)
    Eric Brewer
    When operating in a catastrophically broken or unreliable network (Partition Tolerance),
    a distributed system must choose to either risk returning stale/outdated data (Availability)
    or refuse to accept writes/updates (Consistency).
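    A minimal sketch of that choice in code (illustrative Go, not code from the talk): while a replica is cut off from the leader, it can either answer reads from possibly stale local data, or refuse until the partition heals.

    // Illustrative only: a replica choosing between Availability and Consistency
    // while partitioned from its leader.
    package replica

    import "errors"

    var ErrUnavailable = errors.New("partitioned from leader: refusing a possibly stale read")

    type Replica struct {
        data        map[string]string
        partitioned bool // true while this node cannot reach the leader
    }

    // ReadAP favours Availability: always answer, accepting the value may be stale.
    func (r *Replica) ReadAP(key string) (string, error) {
        return r.data[key], nil
    }

    // ReadCP favours Consistency: while partitioned, refuse rather than risk a stale answer.
    func (r *Replica) ReadCP(key string) (string, error) {
        if r.partitioned {
            return "", ErrUnavailable
        }
        return r.data[key], nil
    }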

  24. Trade-offs

  25. Raft Consensus Algorithm

  26. Strongly Consistent but also Highly Available

  27. Quorum (and you need an odd number of nodes)
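    A quorum is a strict majority of the cluster, which is why odd cluster sizes are preferred; a quick sketch of the arithmetic:

    package quorum

    // Size returns the number of nodes that must agree: a strict majority.
    func Size(clusterSize int) int {
        return clusterSize/2 + 1
    }

    // Size(3) == 2 and Size(5) == 3, tolerating 1 and 2 failures respectively.
    // Size(4) == 3 still tolerates only 1 failure, so an even-sized cluster
    // raises the quorum without buying any extra fault tolerance.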

  28. (image-only slide)

  29. (image-only slide)

  30. Distributed Log

  31. Distributed Log / State Machine
    State machine:
      best_programming_language = Ruby
      current_year = 2008
      linux_on_desktop = Maybe

  32. Log entry: SET current_year = 2018
    State machine:
      best_programming_language = Ruby
      current_year = 2018
      linux_on_desktop = Maybe

  33. Log entries: SET current_year = 2018, SET best_programming_language = Go
    State machine:
      best_programming_language = Go
      current_year = 2018
      linux_on_desktop = Maybe

  34. Log entries: SET current_year = 2018, SET best_programming_language = Go, DELETE linux_on_desktop
    State machine:
      best_programming_language = Go
      current_year = 2018

  35. (image-only slide)
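    A minimal sketch of the idea (my own illustration in Go, not code from the talk): every node applies the same SET/DELETE log entries, in the same order, to its own key-value state machine, so all nodes end up with identical state.

    package statemachine

    // Entry is one record in the replicated log.
    type Entry struct {
        Op    string // "SET" or "DELETE"
        Key   string
        Value string // ignored for DELETE
    }

    // StateMachine is the materialised view of the log: a simple key-value map.
    type StateMachine struct {
        data map[string]string
    }

    func New() *StateMachine {
        return &StateMachine{data: make(map[string]string)}
    }

    // Apply replays one committed log entry against the state machine.
    func (sm *StateMachine) Apply(e Entry) {
        switch e.Op {
        case "SET":
            sm.data[e.Key] = e.Value
        case "DELETE":
            delete(sm.data, e.Key)
        }
    }

    Replaying SET current_year = 2018, SET best_programming_language = Go and DELETE linux_on_desktop reproduces the state shown on slide 34.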

  36. Getting a majority of servers in a cluster to agree on what's in the log

  37. I like my leadership the same way I like my ☕
    Strong.
    — Raft

  38. (image-only slide)

  39. Leader Election

  40. Random Timers

  41. Monotonically Increasing Terms

  42. Every node starts off as a Follower.
    If a follower doesn't hear from a leader for a while (random timer), it becomes a Candidate.
    If the candidate receives votes from a majority of nodes, it becomes the Leader.

  43. In the case of a split vote, nodes will simply wait for another election.
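    A rough Go sketch of those transitions (illustrative types and an assumed requestVotes helper, not a real implementation):

    package election

    import (
        "math/rand"
        "time"
    )

    type State int

    const (
        Follower State = iota
        Candidate
        Leader
    )

    type Node struct {
        state        State
        currentTerm  int
        clusterSize  int
        heartbeatCh  chan struct{}      // signalled whenever a leader heartbeat arrives
        requestVotes func(term int) int // assumed helper: asks peers for votes, returns votes won
    }

    // randomTimeout staggers elections so nodes rarely time out at the same moment.
    func randomTimeout() time.Duration {
        return 150*time.Millisecond + time.Duration(rand.Intn(150))*time.Millisecond
    }

    // runFollower waits for heartbeats; if none arrive before the timeout fires,
    // the node becomes a candidate.
    func (n *Node) runFollower() {
        select {
        case <-n.heartbeatCh:
            // the leader is alive; remain a follower
        case <-time.After(randomTimeout()):
            n.state = Candidate
        }
    }

    // runCandidate starts a new term and asks peers for votes. A majority wins the
    // election; a split vote just means waiting out another random timeout and trying again.
    func (n *Node) runCandidate() {
        n.currentTerm++                            // terms only ever increase
        votes := 1 + n.requestVotes(n.currentTerm) // vote for ourselves, then ask the others
        if votes >= n.clusterSize/2+1 {
            n.state = Leader
            return
        }
        time.Sleep(randomTimeout()) // split vote: retry in a new term after a fresh timeout
    }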

  44. Leader Election

  45. Leader goes AWOL

  46. Log Replication

  47. 1. Client sends a command to the Leader.
    2. Leader appends an entry to its own log.
    3. Leader issues an RPC (AppendEntries) to each Follower.
    4. Each Follower appends the entry to its log and responds to the Leader to acknowledge it.
    5. Once the entry has been acknowledged by a majority of Followers, the Leader responds to the Client.
    6. Leader issues a heartbeat RPC (AppendEntries) to each Follower, which "commits" the entry and applies it to each Follower's state machine.
    (A rough code sketch of this flow follows below.)
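    A condensed sketch of the leader's side of that flow (illustrative Go; real implementations differ in the details):

    package replication

    // LogEntry is one replicated record.
    type LogEntry struct {
        Term    int
        Index   int
        Command []byte
    }

    // Follower stands in for the RPC interface to one follower node (assumed for illustration).
    type Follower interface {
        AppendEntries(term int, entries []LogEntry) bool
    }

    type Leader struct {
        term      int
        log       []LogEntry
        followers []Follower
    }

    // Propose handles a client command: append it locally (step 2), replicate it via
    // AppendEntries (steps 3-4), and acknowledge the client once a majority has the
    // entry (step 5). Committing on followers happens via the commit index carried
    // on the next heartbeat AppendEntries (step 6), which is omitted here.
    func (l *Leader) Propose(command []byte) bool {
        entry := LogEntry{Term: l.term, Index: len(l.log) + 1, Command: command}
        l.log = append(l.log, entry)

        acks := 1 // the leader's own copy counts towards the majority
        for _, f := range l.followers {
            if f.AppendEntries(l.term, []LogEntry{entry}) {
                acks++
            }
        }

        majority := (len(l.followers)+1)/2 + 1
        return acks >= majority
    }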

  48. Log Replication



  49. Handling Turbulent Network Conditions

  50. Safety Guarantees
    Election Safety: at most one leader will be elected in each term.
    Append-Only Leaders: the leader never deletes or overwrites entries in its own log.
    Log Matching: if two logs contain an entry with the same index and term, that entry stores the same value.
    Leader Completeness: an entry committed in an earlier term will be present in the logs of leaders in all later terms.
    State Machine Safety: if a log entry at a given index has been applied to a server's state machine, no other server will ever apply a different log entry at that index.
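    As an illustration of the Log Matching guarantee (a sketch of the property itself, not of how Raft enforces it): two entries at the same index with the same term must carry the same command.

    package raftcheck

    import "bytes"

    type Entry struct {
        Term    int
        Command []byte
    }

    // matchAt checks the Log Matching property at a single 1-based index: if both
    // logs have an entry there with the same term, the entries must be identical.
    // (Full Raft goes further: all preceding entries are then identical too.)
    func matchAt(a, b []Entry, index int) bool {
        if index > len(a) || index > len(b) {
            return true // one log doesn't have the entry yet, so nothing can conflict
        }
        ea, eb := a[index-1], b[index-1]
        if ea.Term != eb.Term {
            return true // different terms: the property makes no claim here
        }
        return bytes.Equal(ea.Command, eb.Command)
    }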

  51. Preventing Split-Brain
    Five nodes, all in term 1.

  52. Preventing Split-Brain
    Five nodes, all in term 1, with a network partition separating the leader and one follower from the other three nodes.

  53. Preventing Split-Brain
    The three-node majority elects a new leader and moves to term 2; the two partitioned nodes stay in term 1.

  54. Preventing Split-Brain
    Writes arrive on both sides: X=2 at the old leader (term 1) and X=1 at the new leader (term 2).

  55. Preventing Split-Brain
    X=1 is replicated to all three nodes in the majority; X=2 exists only on the two minority nodes and can never reach a quorum.

  56. Preventing Split-Brain
    Unchanged: terms 2/2/2/1/1, values X=1/X=1/X=1/X=2/X=2.

  57. Preventing Split-Brain
    The partition heals, but the two sides still disagree: terms 2/2/2/1/1, values X=1/X=1/X=1/X=2/X=2.

  58. Preventing Split-Brain
    The old leader sends AppendEntries { Term: 1, X = 2 }.

  59. Preventing Split-Brain
    The other nodes reject it: "NOPE. Term is 2 now."

  60. Preventing Split-Brain
    Seeing the higher term, the old leader steps down and adopts term 2.

  61. Preventing Split-Brain
    The new leader sends AppendEntries { Term: 2, X = 1 }, overwriting the old leader's uncommitted X=2.

  62. Preventing Split-Brain
    All five nodes are in term 2 with X=1; the cluster converges without losing a committed write.
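    The mechanism doing the work here is the term check on AppendEntries. A minimal sketch (illustrative names, not a specific library's API):

    package raft

    type LogEntry struct {
        Term    int
        Command []byte
    }

    type AppendEntriesRequest struct {
        Term    int
        Entries []LogEntry
    }

    type AppendEntriesResponse struct {
        Term    int
        Success bool
    }

    type Node struct {
        currentTerm int
        log         []LogEntry
    }

    // HandleAppendEntries rejects requests from stale terms. The response carries the
    // receiver's term, so a deposed leader learns it has been superseded and steps down.
    func (n *Node) HandleAppendEntries(req AppendEntriesRequest) AppendEntriesResponse {
        if req.Term < n.currentTerm {
            // "NOPE. Term is 2 now."
            return AppendEntriesResponse{Term: n.currentTerm, Success: false}
        }
        if req.Term > n.currentTerm {
            n.currentTerm = req.Term // adopt the newer term
        }
        // Simplified: real Raft also checks prevLogIndex/prevLogTerm before appending,
        // and truncates any conflicting (uncommitted) entries.
        n.log = append(n.log, req.Entries...)
        return AppendEntriesResponse{Term: n.currentTerm, Success: true}
    }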


  63. Snapshots / Log Compaction
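    The log can't grow forever, so nodes periodically snapshot the state machine and discard the entries the snapshot already covers. A minimal sketch (illustrative, not a particular library's API):

    package snapshot

    type Entry struct {
        Index   int
        Term    int
        Command []byte
    }

    type Snapshot struct {
        LastIncludedIndex int
        LastIncludedTerm  int
        State             map[string]string // the state machine's contents at that index
    }

    // compact captures the state machine as of lastApplied and truncates the log,
    // keeping only the entries the snapshot does not cover.
    func compact(log []Entry, lastApplied int, state map[string]string) (Snapshot, []Entry) {
        snap := Snapshot{State: state}
        remaining := log
        for i, e := range log {
            if e.Index == lastApplied {
                snap.LastIncludedIndex = e.Index
                snap.LastIncludedTerm = e.Term
                remaining = append([]Entry(nil), log[i+1:]...)
                break
            }
        }
        return snap, remaining
    }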


  64. Thanks!
    https://raft.github.io/raft.pdf
    http://thesecretlivesofdata.com/raft/
