"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable." - Leslie Lamport

CONSENSUS ALGORITHMS

Tolerate failures:
- Detect
- Handle
- Recover
Paxos:
- Basic Paxos: simple, proved
  - Leslie Lamport, "The Part-Time Parliament" (rejected, then published)
  - Lynch & Liskov
- Multi-Paxos: Paxos + complexity
  - No mathematical proof
Proposer:
- Choose a proposal number (n)
- Broadcast the number to all servers: Prepare(n)

Acceptor:
- If n > maxProposal: maxProposal = n
- Respond with a promise: won't accept a proposal with n' < n
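The acceptor side of Prepare(n) can be sketched in a few lines of Python. This is a minimal in-memory sketch, not a networked implementation: the class name `PrepareAcceptor` and the method `on_prepare` are hypothetical, and the reply is reduced to a boolean promise. In the full protocol the promise also carries any previously accepted proposal and value, which is left out here.

```python
from dataclasses import dataclass


@dataclass
class PrepareAcceptor:
    """Hypothetical acceptor that tracks only the Prepare-phase state."""

    max_proposal: int = 0  # highest proposal number promised so far

    def on_prepare(self, n: int) -> bool:
        """Handle Prepare(n); True means a promise was given."""
        if n > self.max_proposal:
            # Promise: proposals numbered below n will no longer be accepted.
            self.max_proposal = n
            return True
        # n is not larger than a number already promised, so no promise.
        return False
```

Once an acceptor has promised n, any later Prepare with a smaller or equal number is refused, which is what forces a competing proposer to pick a higher number.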
Proposer, after the Prepare(n) broadcast:
- If a majority of servers respond with a promise: send Accept(n, )
Proposer:
- Broadcast Accept(n, ) to all servers

Acceptor:
- If n >= maxProposal: acceptedProposal = maxProposal = n, acceptedValue = value
- Respond

Proposer, once a majority has responded:
- If any rejection: n is not the largest, repeat from the beginning
- Else: the value is chosen
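The Accept phase above can be sketched as follows. This is a single-process illustration under stated assumptions: `AcceptAcceptor`, `on_accept`, and `accept_round` are hypothetical names, every acceptor replies synchronously, and a reply larger than n is treated as a rejection because it shows that a higher proposal number is already known.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AcceptAcceptor:
    """Hypothetical acceptor holding the state named on the slides."""

    max_proposal: int = 0
    accepted_proposal: Optional[int] = None
    accepted_value: Optional[str] = None

    def on_accept(self, n: int, value: str) -> int:
        """Handle Accept(n, value) and reply with the current max_proposal."""
        if n >= self.max_proposal:
            self.accepted_proposal = self.max_proposal = n
            self.accepted_value = value
        return self.max_proposal


def accept_round(acceptors: List[AcceptAcceptor], n: int, value: str) -> bool:
    """Broadcast Accept(n, value); True means the value is chosen."""
    replies = [a.on_accept(n, value) for a in acceptors]
    majority = len(acceptors) // 2 + 1
    if len(replies) < majority:
        return False  # not enough responses to decide (cannot happen in this sketch)
    # Any reply above n is a rejection: n is not the largest, start over.
    return all(reply <= n for reply in replies)
```

A `False` result corresponds to "repeat from the beginning": the proposer would pick a larger proposal number and run Prepare again.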
Proposal number:
- server id: unique
- round number: incremented over time, shared, highest (maxRound)

Generate a new proposal number:
- increment maxRound
- concatenate with server id
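One way to realize "increment maxRound, concatenate with server id" is to pack both into a single integer, with the round in the high bits so that a higher round always compares larger. The sketch below assumes server ids fit in 16 bits; `ID_BITS` and `next_proposal_number` are hypothetical names, not part of the slides.

```python
from typing import Tuple

ID_BITS = 16  # assumption: server ids are smaller than 2**16


def next_proposal_number(max_round: int, server_id: int) -> Tuple[int, int]:
    """Return (new_max_round, proposal_number) for this server."""
    if not 0 <= server_id < (1 << ID_BITS):
        raise ValueError("server_id does not fit in ID_BITS")
    new_round = max_round + 1
    # Round in the high bits, server id in the low bits: numbers are unique
    # per server, and a later round always beats an earlier one.
    return new_round, (new_round << ID_BITS) | server_id
```

Because the server id breaks ties within a round, two servers that pick the same round still produce distinct, totally ordered proposal numbers.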
Leader / Distinguished Proposer:
- Server with the highest ID
- Heartbeat every T ms
- If a server didn't receive a heartbeat from a higher ID for >= 2T ms, it acts as leader:
  - Accept requests from clients
  - Act as proposer
Non-leader:
- Redirect client requests to the leader
- Act as acceptor
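The highest-ID heartbeat rule can be expressed as a small predicate. This is a sketch under assumptions: `should_act_as_leader` is a hypothetical helper, timestamps are plain millisecond integers, and `last_heartbeat_ms` maps each peer id to the time its most recent heartbeat was received.

```python
from typing import Dict


def should_act_as_leader(
    my_id: int,
    last_heartbeat_ms: Dict[int, int],
    now_ms: int,
    t_ms: int,
) -> bool:
    """True if no higher-ID server has been heard from within 2T ms."""
    for peer_id, seen_ms in last_heartbeat_ms.items():
        if peer_id > my_id and now_ms - seen_ms < 2 * t_ms:
            # A higher-ID server is still alive, so this server stays a
            # non-leader: redirect clients and act as acceptor.
            return False
    # No live higher-ID server: act as leader and proposer.
    return True
```

Heartbeats from lower-ID peers are ignored by the check, since only a higher-ID server can displace this one.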
Raft: designed by Diego Ongaro and John Ousterhout at Stanford
- Equivalent: performance & fault-tolerance
- Understandability
  - Why: implemented -> useful, extended / adapted to the environment
- Consistency, Conciseness, Correctness

@yifan_xing_e
Leader Election:
1. [...] to be a leader
2. Detect crashes, reelection

Log Replication:
1. Leader processes commands from clients
2. Replicates logs (consistency and consensus among servers)
At most one leader per term:
- Each server: one vote per term
- Receive a majority to win the election (floor(N / 2) + 1)
- Example: five servers S0, S1, S2, S3, S4 cast three votes for S0 and two for S1
- Example outcome: S0 holds 3 of the 5 votes, which is a majority, so S0 becomes leader
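The majority rule and the five-server example can be checked with a short sketch. The helper names `majority` and `election_winner` are hypothetical; votes are represented simply as the id of the candidate each server voted for.

```python
from collections import Counter
from typing import List, Optional


def majority(cluster_size: int) -> int:
    """Smallest number of votes that is more than half the cluster."""
    return cluster_size // 2 + 1


def election_winner(votes: List[str], cluster_size: int) -> Optional[str]:
    """Return the candidate holding a majority of votes, or None."""
    if not votes:
        return None
    candidate, count = Counter(votes).most_common(1)[0]
    return candidate if count >= majority(cluster_size) else None
```

Since each server casts at most one vote per term and two different candidates cannot both hold more than half of the votes, at most one candidate can win a given term.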
There will eventually be a leader:
- Random election timeout (range 100-300 ms)
- Usually, one server times out first and wins the majority of votes
- If two time out at the same time:
  - Split vote -> election timeout -> re-enter election state (increment term, gather votes)
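The effect of randomized timeouts can be illustrated with a small simulation. This is a sketch only: `election_timeout_ms` and `first_to_time_out` are hypothetical helpers, the 100-300 ms range is taken from the slide, and a `random.Random` instance is passed in so a run can be reproduced.

```python
import random
from typing import Dict, List, Tuple


def election_timeout_ms(rng: random.Random, low: float = 100.0, high: float = 300.0) -> float:
    """Draw one election timeout uniformly from the given range."""
    return rng.uniform(low, high)


def first_to_time_out(
    server_ids: List[str], rng: random.Random
) -> Tuple[str, Dict[str, float]]:
    """Give every server a random timeout and return the one that fires first."""
    timeouts = {s: election_timeout_ms(rng) for s in server_ids}
    # The server with the smallest timeout becomes a candidate first and
    # usually gathers a majority before any other server times out.
    first = min(timeouts, key=lambda s: timeouts[s])
    return first, timeouts
```

With continuous random timeouts an exact tie is unlikely, and when a split vote does happen the servers simply draw new timeouts for the next term.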
RequestVoteRPC:
- term: candidate’s term
- candidateId: candidate requesting vote
- lastLogIndex: index of candidate’s last log entry
- lastLogTerm: term of candidate’s last log entry

YIFAN XING - 2018
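The four RequestVoteRPC fields map naturally onto a small record type. The sketch below also includes a receiver-side check, which is not on the slide: it follows the vote-granting rules as described in the Raft paper (reject stale terms, one vote per term, candidate's log at least as up to date). The class names `RequestVote` and `Voter` and the method `handle` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RequestVote:
    term: int            # candidate's term
    candidate_id: str    # candidate requesting vote
    last_log_index: int  # index of candidate's last log entry
    last_log_term: int   # term of candidate's last log entry


@dataclass
class Voter:
    """Hypothetical receiver state; only what the vote decision needs."""

    current_term: int = 0
    voted_for: Optional[str] = None
    last_log_index: int = 0
    last_log_term: int = 0

    def handle(self, req: RequestVote) -> bool:
        """Return True if the vote is granted."""
        if req.term < self.current_term:
            return False  # stale candidate
        if req.term > self.current_term:
            # Newer term: adopt it and forget any vote from the old term.
            self.current_term = req.term
            self.voted_for = None
        # Candidate's log must be at least as up to date as ours:
        # compare last terms first, then last indexes.
        log_ok = (req.last_log_term, req.last_log_index) >= (
            self.last_log_term,
            self.last_log_index,
        )
        if self.voted_for in (None, req.candidate_id) and log_ok:
            self.voted_for = req.candidate_id
            return True
        return False
```

The `lastLogIndex` and `lastLogTerm` fields exist so that a server whose log is behind cannot collect votes from servers with more complete logs.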