Slide 1

Slide 1 text

Consensus Algorithms. Aman Garg, Fall 2019. "Is it better to be alive and wrong than be correct and dead?" - Jay Kreps

Slide 2

Slide 2 text

What is consensus? A consensus is all about general agreement: get everyone to agree about a resource or a decision, make decisions through a formal process, delay and analyse to reach harmony or else retry, and establish a shared opinion.

Slide 3

Slide 3 text

What is consensus? A consensus is all about general agreement: get everyone to agree about a resource or a decision, make decisions through a formal process, delay and analyse to reach harmony or else retry, and establish a shared opinion. The resulting consensus doesn't have to be unanimous: this person here is clearly unhappy but has consented to the majority decision.

Slide 4

Slide 4 text

But why is it important? Let us take a use case to understand.

Slide 5

Slide 5 text

Map Partition Owner Let's take a partitioned distributed map, say Hazelcast. Assume you are writing a client over the IMap and there's a PUT operation on a certain key K. Assume a simple MOD hash function exists to find the partition where the key resides. Since the map is distributed (over N nodes), we need to find the owner of the given partition. What ways exist for you, as a client, to figure out the owner?
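A minimal sketch of the MOD-style lookup described above. The partition count, the ownership table and the class itself are illustrative assumptions, not Hazelcast's real implementation.

```java
import java.util.Map;

// Sketch: locating the partition (and hence the owner) for a key using a
// simple MOD hash, as assumed on the slide.
public class PartitionLocator {
    private static final int PARTITION_COUNT = 271;      // assumed fixed partition count
    private final Map<Integer, String> partitionOwners;  // partitionId -> node address

    public PartitionLocator(Map<Integer, String> partitionOwners) {
        this.partitionOwners = partitionOwners;
    }

    // MOD hash: map the key's hash onto one of the partitions.
    public int partitionFor(Object key) {
        return Math.floorMod(key.hashCode(), PARTITION_COUNT);
    }

    // The consensus problem in a nutshell: every client and every node
    // must agree on the contents of this ownership table.
    public String ownerOf(Object key) {
        return partitionOwners.get(partitionFor(key));
    }
}
```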

Slide 6

Slide 6 text

Different ways of routing a request to the right node

Slide 7

Slide 7 text

Different ways of routing a request to the right node

Slide 8

Slide 8 text

Different ways of routing a request to the right node: the client sends the request to any node, which either forwards it to the correct node or returns the result itself.

Slide 9

Slide 9 text

Different ways of routing a request to the right node: the client sends the request to any node, which either forwards it to the correct node or returns the result itself.

Slide 10

Slide 10 text

Different ways of routing a request to the right node: the client sends the request to any node, which either forwards it to the correct node or returns the result itself. Or the client sends the request to a partition-aware routing tier, which knows the owner for "foo".

Slide 11

Slide 11 text

Different ways of routing a request to the right node: the client sends the request to any node, which either forwards it to the correct node or returns the result itself. Or the client sends the request to a partition-aware routing tier, which knows the owner for "foo".

Slide 12

Slide 12 text

Different ways of routing a request to the right node: the client sends the request to any node, which either forwards it to the correct node or returns the result itself. Or the client sends the request to a partition-aware routing tier, which knows the owner for "foo". Or the client itself knows the owner node for a particular partition.

Slide 13

Slide 13 text

Why is it a problem though?

Slide 14

Slide 14 text

Why is it a problem though? All participants have to agree on what the correct owner node is for a particular partition. Otherwise, there's no point in guaranteeing a consistent view of the map.

Slide 15

Slide 15 text

Why is it a problem though? All participants have to agree on what the correct owner node is for a particular partition. Otherwise, there's no point in guaranteeing a consistent view of the map. Regardless of whether this information lies with the routing tier, the nodes or the client, we need some sort of coordination here.

Slide 16

Slide 16 text

Why is it a problem though? All participants have to agree on what the correct owner node is for a particular partition. Otherwise, there's no point in guaranteeing a consistent view of the map. Regardless of whether this information lies with the routing tier, the nodes or the client, we need some sort of coordination here. Distributed consensus is a hard problem: easy to reason about, but unfathomably hard to implement, with a lot of edge cases to handle.

Slide 17

Slide 17 text

Why is it a problem though? All participants have to agree on what the correct owner node is for a particular partition. Otherwise, there's no point in guaranteeing a consistent view of the map. Regardless of whether this information lies with the routing tier, the nodes or the client, we need some sort of coordination here. Distributed consensus is a hard problem: easy to reason about, but unfathomably hard to implement, with a lot of edge cases to handle. Many such distributed systems rely on an external service that gives strong coordination and consensus guarantees on the cluster metadata.

Slide 18

Slide 18 text

Why is it a problem though? All participants have to agree on what the correct owner node is for a particular partition. Otherwise, there's no point in guaranteeing a consistent view of the map. Regardless of whether this information lies with the routing tier, the nodes or the client, we need some sort of coordination here. Distributed consensus is a hard problem: easy to reason about, but unfathomably hard to implement, with a lot of edge cases to handle. Many such distributed systems rely on an external service that gives strong coordination and consensus guarantees on the cluster metadata. Our favourite system enters to help us here. Wait for it...

Slide 19

Slide 19 text

Apache Zookeeper Zookeeper tracks information that is to be synchronised across the cluster. Each node registers itself in Zookeeper (ZK). ZK tracks the partitions against a particular set of nodes. Clients can subscribe to the above metadata: whenever nodes are added or removed, the client is notified.
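A hedged sketch of this pattern with the plain ZooKeeper Java client: a node registers itself as an ephemeral znode, and a client watches the parent path to be notified of membership changes. The connect string, timeouts and the /members path are assumptions for illustration.

```java
import java.util.List;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class MembershipExample {
    public static void main(String[] args) throws Exception {
        // Connect to ZK; connect string and session timeout are assumptions.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 10_000, event -> { });

        // A node registers itself: an ephemeral znode disappears when the
        // node's session dies, so ZK tracks liveness for us.
        // (Assumes the /members parent znode already exists.)
        zk.create("/members/node-1", "host1:5701".getBytes(),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);

        // A client subscribes to the membership metadata: the watch fires
        // once when children change (node added/removed) and must be re-set.
        Watcher membershipWatcher = new Watcher() {
            @Override
            public void process(WatchedEvent event) {
                System.out.println("Cluster membership changed: " + event);
            }
        };
        List<String> members = zk.getChildren("/members", membershipWatcher);
        System.out.println("Current members: " + members);
    }
}
```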

Slide 20

Slide 20 text

Consensus is the key Solving consensus is the key to solving at least the following problems in computer science:

Slide 21

Slide 21 text

Consensus is the key Solving consensus is the key to solving at least the following problems in computer science: Total order broadcast: used in ZAB, the Zookeeper Atomic Broadcast.

Slide 22

Slide 22 text

Consensus is the key Solving consensus is the key to solving at least the following problems in computer science: Total order broadcast: used in ZAB, the Zookeeper Atomic Broadcast. Atomic commit (databases): fulfilling the A and C in the ACID properties.

Slide 23

Slide 23 text

Consensus is the key Solving consensus is the key to solving at least the following problems in computer science: Total order broadcast: used in ZAB, the Zookeeper Atomic Broadcast. Atomic commit (databases): fulfilling the A and C in the ACID properties. Terminating reliable broadcast: sending messages to a list of processes, say in multiplayer gaming.

Slide 24

Slide 24 text

Consensus is the key Solving consensus is the key to solving at least the following problems in computer science: Total order broadcast: used in ZAB, the Zookeeper Atomic Broadcast. Atomic commit (databases): fulfilling the A and C in the ACID properties. Terminating reliable broadcast: sending messages to a list of processes, say in multiplayer gaming. Dynamic group membership: who is the master? Which workers are available? What task is assigned to which worker?

Slide 25

Slide 25 text

Consensus is the key Solving consensus is the key to solving at least the following problems in computer science: Total order broadcast: used in ZAB, the Zookeeper Atomic Broadcast. Atomic commit (databases): fulfilling the A and C in the ACID properties. Terminating reliable broadcast: sending messages to a list of processes, say in multiplayer gaming. Dynamic group membership: who is the master? Which workers are available? What task is assigned to which worker? Stronger shared storage models: like how a concurrent hash map helps concurrent threads reach an agreement.

Slide 26

Slide 26 text

So what is a Consensus Algorithm? Consensus algorithms allow a collection of machines to work as a coherent group that can survive failures of some of its members.

Slide 27

Slide 27 text

Properties of a consensus algorithm

Slide 28

Slide 28 text

Properties of a consensus algorithm Safety: never returns an incorrect result despite network partitions and delays, packet loss, duplication and reordering.

Slide 29

Slide 29 text

Properties of a consensus algorithm Safety: never returns an incorrect result despite network partitions and delays, packet loss, duplication and reordering. Fault tolerance: the system stays available and fully functional despite the failure of some nodes.

Slide 30

Slide 30 text

Properties of a consensus algorithm Safety: never returns an incorrect result despite network partitions and delays, packet loss, duplication and reordering. Fault tolerance: the system stays available and fully functional despite the failure of some nodes. Correctness: performance is not impacted by a minority of slow nodes, and correctness does not depend on timing (consistent clocks).

Slide 31

Slide 31 text

Properties of a consensus algorithm Safety: never returns an incorrect result despite network partitions and delays, packet loss, duplication and reordering. Fault tolerance: the system stays available and fully functional despite the failure of some nodes. Correctness: performance is not impacted by a minority of slow nodes, and correctness does not depend on timing (consistent clocks). Real world: the core algorithm should be understandable and intuitive, its internal workings should seem obvious, and implementing it shouldn't require a major overhaul of the existing architecture.

Slide 32

Slide 32 text

Replicated State Machine A collection of servers computing identical copies of the same state. They keep operating even if a minority of the servers are down.

Slide 33

Slide 33 text

Replicated State Machine A collection of servers computing identical copies of the same state. They keep operating even if a minority of the servers are down.

Slide 34

Slide 34 text

Replicated State Machine A collection of servers computing identical copies of the same state. They keep operating even if a minority of the servers are down.

Slide 35

Slide 35 text

Replicated State Machine A collection of servers computing identical copies of the same state. They keep operating even if a minority of the servers are down.

Slide 36

Slide 36 text

Replicated State Machine A collection of servers computing identical copies of the same state. They keep operating even if a minority of the servers are down.

Slide 37

Slide 37 text

Replicated State Machine A collection of servers computing identical copies of the same state. They keep operating even if a minority of the servers are down.

Slide 38

Slide 38 text

Replicated State Machine A collection of servers computing identical copies of the same state. They keep operating even if a minority of the servers are down.

Slide 39

Slide 39 text

Replicated State Machine A collection of servers computing identical copies of the same state. They keep operating even if a minority of the servers are down. Replicated log <=> replicated state machine: all servers execute the same commands in the same order, and the consensus module ensures proper replication.
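A minimal sketch of the "same commands, same order, same state" idea: a deterministic key-value state machine that applies commands from a replicated log. The command format ("SET key value") and the class are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of a deterministic state machine: if every server applies the same
// commands in the same order, every copy of this map ends up identical.
public class KeyValueStateMachine {
    private final Map<String, String> state = new HashMap<>();

    // A command is assumed to look like "SET key value" for illustration.
    public void apply(String command) {
        String[] parts = command.split(" ", 3);
        if (parts.length == 3 && parts[0].equals("SET")) {
            state.put(parts[1], parts[2]);
        }
    }

    // Replaying the replicated log in order rebuilds the exact same state.
    public void applyAll(List<String> replicatedLog) {
        replicatedLog.forEach(this::apply);
    }

    public String get(String key) {
        return state.get(key);
    }
}
```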

Slide 40

Slide 40 text

Introduction to Raft

Slide 41

Slide 41 text

Introduction to Raft A consensus algorithm for building fault-tolerant distributed systems using a replicated state machine approach.

Slide 42

Slide 42 text

Introduction to Raft A consensus algorithm for building fault-tolerant distributed systems using a replicated state machine approach. In Raft, each server runs a deterministic state machine that has a given state, takes commands as input, generates outputs, and moves to a new state after generating the output.

Slide 43

Slide 43 text

Introduction to Raft A consensus algorithm for building fault-tolerant distributed systems using a replicated state machine approach. In Raft, each server runs a deterministic state machine that has a given state, takes commands as input, generates outputs, and moves to a new state after generating the output. Built around a centralised topology: a leader-follower style architecture.

Slide 44

Slide 44 text

Introduction to Raft A consensus algorithm for building fault-tolerant distributed systems using a replicated state machine approach. In Raft, each server runs a deterministic state machine that has a given state, takes commands as input, generates outputs, and moves to a new state after generating the output. Built around a centralised topology: a leader-follower style architecture.

Slide 45

Slide 45 text

Raft Consensus Algorithm

Slide 46

Slide 46 text

Raft Consensus Algorithm At the core is a persistent log containing commands issued by clients. This log is local to each server.

Slide 47

Slide 47 text

Raft Consensus Algorithm At the core is a persistent log containing commands issued by clients. This log is local to each server. Each server runs an instance of the Raft consensus module.

Slide 48

Slide 48 text

Raft Consensus Algorithm At the core is a persistent log containing commands issued by clients. This log is local to each server. Each server runs an instance of the Raft consensus module. The consensus module receives commands, appends them to the local log, and communicates them to the modules on the other servers so that all logs store the commands in the same order.

Slide 49

Slide 49 text

Raft Consensus Algorithm At the core is a persistent log containing commands issued by clients. This log is local to each server. Each server runs an instance of the Raft consensus module. The consensus module receives commands, appends them to the local log, and communicates them to the modules on the other servers so that all logs store the commands in the same order. Each server then feeds the commands from its local log into its state machine, generates the same output, and the result is returned to the client.

Slide 50

Slide 50 text

How do we engineers solve problems in general? Problem decomposition: break the original problem into sub-problems, then try to solve and conquer them individually. Minimise the state space: handle multiple problems with a single mechanism, eliminate special cases, maximise coherence, minimise non-determinism.

Slide 51

Slide 51 text

Raft Decomposition

Slide 52

Slide 52 text

Raft Decomposition Leader election: majority voting to select one leader per term; heartbeats and timeouts to detect crashes and elect a new leader; randomized election timeouts to avoid split votes.

Slide 53

Slide 53 text

Raft Decomposition Leader election: majority voting to select one leader per term; heartbeats and timeouts to detect crashes and elect a new leader; randomized election timeouts to avoid split votes. Log replication: the leader accepts commands from clients and appends them to its log; the leader replicates its log to the other servers, forcing them to agree; log inconsistencies are overwritten using consistency checks.

Slide 54

Slide 54 text

Raft Decomposition Leader election: majority voting to select one leader per term; heartbeats and timeouts to detect crashes and elect a new leader; randomized election timeouts to avoid split votes. Log replication: the leader accepts commands from clients and appends them to its log; the leader replicates its log to the other servers, forcing them to agree; log inconsistencies are overwritten using consistency checks. Safety: only servers with up-to-date logs can become leader; a new leader causes conflicting uncommitted entries to be discarded; a leader is always correct.

Slide 55

Slide 55 text

Raft Decomposition Leader election: majority voting to select one leader per term; heartbeats and timeouts to detect crashes and elect a new leader; randomized election timeouts to avoid split votes. Log replication: the leader accepts commands from clients and appends them to its log; the leader replicates its log to the other servers, forcing them to agree; log inconsistencies are overwritten using consistency checks. Safety: only servers with up-to-date logs can become leader; a new leader causes conflicting uncommitted entries to be discarded; a leader is always correct.

Slide 56

Slide 56 text

Leader Election: Normal Operation Election process for the first term in a 5-node cluster. The election timeout is assumed to be >> the broadcast time.

Slide 57

Slide 57 text

Leader Crash Scenario As seen previously, S2 is the leader for term 2. Let us crash the current leader S2 and see what happens.

Slide 58

Slide 58 text

Raft Leader Election: Basics

Slide 59

Slide 59 text

Leader Election: Split Vote If many followers become candidates at the same time, the votes could be split such that no candidate gets a majority. Since there isn't a majority, the only possible course of action is to trigger a re-election; this happens automatically. For term 4, two candidates emerge: node C and node A. Both request votes, but neither can win, as a minimum of 3 votes is required here.

Slide 60

Slide 60 text

Split Vote Scenario

Slide 61

Slide 61 text

Split Vote Scenario Let's see what happens when S1 and S5 both emerge as candidates for the same term.

Slide 62

Slide 62 text

Split Vote Scenario Let's see what happens when S1 and S5 both emerge as candidates for the same term.

Slide 63

Slide 63 text

Split Vote Scenario Let's see what happens when S1 and S5 both emerge as candidates for the same term. Randomized election timeouts ensure that split votes are rare.

Slide 64

Slide 64 text

Split Vote Scenario Let's see what happens when S1 and S5 both emerge as candidates for the same term. Randomized election timeouts ensure that split votes are rare.

Slide 65

Slide 65 text

Leader Election: RequestVote RPC Only one RPC is required to solve leader election in Raft. It is invoked by candidates only. The term is used for safety checks, discussed ahead. The focus is on intuition and simplicity.
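For reference, the RequestVote arguments and result as defined in the Raft paper, written as a plain Java sketch (the field names follow the paper; the wrapper classes themselves are illustrative).

```java
// RequestVote RPC payloads, following Figure 2 of the Raft paper.
public class RequestVote {
    public static class Request {
        long term;           // candidate's term
        String candidateId;  // candidate requesting the vote
        long lastLogIndex;   // index of the candidate's last log entry
        long lastLogTerm;    // term of the candidate's last log entry
    }

    public static class Response {
        long term;           // voter's currentTerm, so the candidate can update itself
        boolean voteGranted; // true means this server granted its vote to the candidate
    }
}
```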

Slide 66

Slide 66 text

Leader Election: Safety For any given term, there can be at most one leader. The latest term number is exchanged between the servers: it is encapsulated in the requests sent by the leader to the follower nodes, and a node rejects any request that carries an old term number. A server will vote for at most one candidate for a particular term, on a first-come-first-served basis (assuming it hasn't already voted for itself), and this vote is persisted to disk. Once a candidate wins, it establishes its authority over the other nodes by broadcasting a message to all of them, letting everyone know who the new leader is.
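A sketch of how a server might apply these voting rules when handling a RequestVote: reject stale terms, grant at most one vote per term on a first-come-first-served basis, and persist the decision. Field and method names here are assumptions; the up-to-date log check is covered later.

```java
// Sketch of the voting rules above.
public class VoteHandler {
    private long currentTerm;
    private String votedFor; // null if we haven't voted in currentTerm

    public synchronized boolean handleRequestVote(long candidateTerm, String candidateId) {
        // Rule: reject any request carrying an old term number.
        if (candidateTerm < currentTerm) {
            return false;
        }
        // A higher term resets our vote for the new term.
        if (candidateTerm > currentTerm) {
            currentTerm = candidateTerm;
            votedFor = null;
        }
        // Rule: at most one vote per term, first come first served.
        if (votedFor == null || votedFor.equals(candidateId)) {
            votedFor = candidateId;
            persistTermAndVote(); // must hit disk before replying
            return true;
        }
        return false;
    }

    private void persistTermAndVote() {
        // Placeholder: real implementations fsync currentTerm and votedFor.
    }
}
```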

Slide 67

Slide 67 text

Logs and Log Replication But first, what really is the log we're talking about? Each entry contains the client command, an index to identify the position of the entry in the log, and a term number to logically identify when the entry was written. Entries must survive crashes, so they are persisted locally. They are replicated by the leader onto the other servers and executed in the state machine once committed.
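The log entry described above, as a small Java sketch (the fields mirror the slide; the class itself is illustrative).

```java
// A Raft log entry: the client command, its position in the log (index),
// and the term in which it was written.
public class LogEntry {
    private final long index;     // position of the entry in the log
    private final long term;      // term when the entry was created by the leader
    private final String command; // client command, applied once committed

    public LogEntry(long index, long term, String command) {
        this.index = index;
        this.term = term;
        this.command = command;
    }

    public long getIndex() { return index; }
    public long getTerm() { return term; }
    public String getCommand() { return command; }
}
```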

Slide 68

Slide 68 text

Log Replication: Happy Flow

Slide 69

Slide 69 text

Log Replication: Happy Flow Assume nodes S1 to S5, with S1 being the leader for term 2. The only other node active in the cluster is S2.

Slide 70

Slide 70 text

Log Replication: Happy Flow Assume nodes S1 to S5, with S1 being the leader for term 2. The only other node active in the cluster is S2. Now let us bring S3 back to life.

Slide 71

Slide 71 text

Log Replication: Repairing Inconsistencies

Slide 72

Slide 72 text

Log Replication: Repairing Inconsistencies Assume S1, the current leader, dies with some uncommitted entries. S2 gets elected and tries to restore consistency.

Slide 73

Slide 73 text

Log Replication: Repairing Inconsistencies Assume S1, the current leader, dies with some uncommitted entries. S2 gets elected and tries to restore consistency. S1 comes back from its hiatus and finds a new leader and a new entry.

Slide 74

Slide 74 text

Log Matching Property Goal: a high level of consistency between the logs on the servers. If log entries on two servers have the same index and term: they store the same command, and the logs are identical in all preceding entries. If a given entry is committed, all the preceding entries are also committed.

Slide 75

Slide 75 text

Log Matching Property Goal: a high level of consistency between the logs on the servers. If log entries on two servers have the same index and term: they store the same command, and the logs are identical in all preceding entries. If a given entry is committed, all the preceding entries are also committed.

Slide 76

Slide 76 text

Log Matching Property Goal: a high level of consistency between the logs on the servers. If log entries on two servers have the same index and term: they store the same command, and the logs are identical in all preceding entries. If a given entry is committed, all the preceding entries are also committed. In the above picture, the entries up to index 6 are considered committed.

Slide 77

Slide 77 text

Log Replication RPC: AppendEntries One RPC replicates log entries across the cluster. The same RPC is used to trigger heartbeats from the leader.
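The AppendEntries arguments and result from the Raft paper, as a Java sketch that reuses the LogEntry sketch shown earlier; an empty entries list turns the same call into a heartbeat.

```java
import java.util.List;

// AppendEntries RPC payloads, following Figure 2 of the Raft paper.
public class AppendEntries {
    public static class Request {
        long term;               // leader's term
        String leaderId;         // so followers can redirect clients
        long prevLogIndex;       // index of the entry immediately preceding the new ones
        long prevLogTerm;        // term of that preceding entry
        List<LogEntry> entries;  // new entries to store (empty for a heartbeat)
        long leaderCommit;       // leader's commit index
    }

    public static class Response {
        long term;       // follower's currentTerm, so the leader can update itself
        boolean success; // true if the follower matched prevLogIndex/prevLogTerm
    }
}
```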

Slide 78

Slide 78 text

AppendEntries RPC: Correctness Check AppendEntries RPCs include the index and term of the preceding entry. The follower must contain a matching entry, otherwise it rejects the request. The leader then has to retry with a lower index until the logs match.
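A sketch of that consistency check on the follower side, under the assumption that the log is held as an in-memory list of the LogEntry sketch from earlier; names and indexing conventions are illustrative.

```java
import java.util.List;

// Sketch of the follower-side AppendEntries consistency check: the request is
// rejected unless the follower has an entry at prevLogIndex with prevLogTerm.
// On rejection the leader retries with a lower prevLogIndex until a match.
public class ConsistencyCheck {

    // The log is 1-indexed conceptually: log.get(i - 1) is entry i.
    public static boolean matches(List<LogEntry> log, long prevLogIndex, long prevLogTerm) {
        if (prevLogIndex == 0) {
            return true; // nothing precedes the very first entry
        }
        if (prevLogIndex > log.size()) {
            return false; // follower's log is too short
        }
        return log.get((int) prevLogIndex - 1).getTerm() == prevLogTerm;
    }
}
```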

Slide 79

Slide 79 text

Raft log Trivia
[Figure: the logs of servers S1-S5, showing the term number (1, 2 or 3) of the entry at each log index 1-9]

Slide 80

Slide 80 text

Raft log Trivia
[Figure: the logs of servers S1-S5, showing the term number (1, 2 or 3) of the entry at each log index 1-9]
Who could be the leader for Term 1, Term 2 and Term 3?

Slide 81

Slide 81 text

Raft log Trivia
[Figure: the logs of servers S1-S5, showing the term number (1, 2 or 3) of the entry at each log index 1-9]
Who could be the leader for Term 1, Term 2 and Term 3? [S1 to S5] [S5] [S3]

Slide 82

Slide 82 text

Raft log Trivia
[Figure: the logs of servers S1-S5, showing the term number (1, 2 or 3) of the entry at each log index 1-9]
Who could be the leader for Term 1, Term 2 and Term 3? [S1 to S5] [S5] [S3]
Till what index is the log consistent across the nodes S1-S5?

Slide 83

Slide 83 text

Raft log Trivia
[Figure: the logs of servers S1-S5, showing the term number (1, 2 or 3) of the entry at each log index 1-9]
Who could be the leader for Term 1, Term 2 and Term 3? [S1 to S5] [S5] [S3]
Till what index is the log consistent across the nodes S1-S5? Index 8

Slide 84

Slide 84 text

Raft log Trivia
[Figure: the logs of servers S1-S5, showing the term number (1, 2 or 3) of the entry at each log index 1-9]
Who could be the leader for Term 1, Term 2 and Term 3? [S1 to S5] [S5] [S3]
Till what index is the log consistent across the nodes S1-S5? Index 8
If S1 becomes candidate for Term 4, who can vote for it?

Slide 85

Slide 85 text

Raft log Trivia
[Figure: the logs of servers S1-S5, showing the term number (1, 2 or 3) of the entry at each log index 1-9]
Who could be the leader for Term 1, Term 2 and Term 3? [S1 to S5] [S5] [S3]
Till what index is the log consistent across the nodes S1-S5? Index 8
If S1 becomes candidate for Term 4, who can vote for it? [S2] [S4] [S5]

Slide 86

Slide 86 text

Raft log Trivia
[Figure: the logs of servers S1-S5, showing the term number (1, 2 or 3) of the entry at each log index 1-9]
Who could be the leader for Term 1, Term 2 and Term 3? [S1 to S5] [S5] [S3]
Till what index is the log consistent across the nodes S1-S5? Index 8
If S1 becomes candidate for Term 4, who can vote for it? [S2] [S4] [S5]
If S2 becomes candidate for Term 4, who can vote for it?

Slide 87

Slide 87 text

Raft log Trivia
[Figure: the logs of servers S1-S5, showing the term number (1, 2 or 3) of the entry at each log index 1-9]
Who could be the leader for Term 1, Term 2 and Term 3? [S1 to S5] [S5] [S3]
Till what index is the log consistent across the nodes S1-S5? Index 8
If S1 becomes candidate for Term 4, who can vote for it? [S2] [S4] [S5]
If S2 becomes candidate for Term 4, who can vote for it? [S5]

Slide 88

Slide 88 text

Raft log Trivia
[Figure: the logs of servers S1-S5, showing the term number (1, 2 or 3) of the entry at each log index 1-9]
Who could be the leader for Term 1, Term 2 and Term 3? [S1 to S5] [S5] [S3]
Till what index is the log consistent across the nodes S1-S5? Index 8
If S1 becomes candidate for Term 4, who can vote for it? [S2] [S4] [S5]
If S2 becomes candidate for Term 4, who can vote for it? [S5]
If S3 becomes candidate for Term 4, who can vote for it?

Slide 89

Slide 89 text

Raft log Trivia
[Figure: the logs of servers S1-S5, showing the term number (1, 2 or 3) of the entry at each log index 1-9]
Who could be the leader for Term 1, Term 2 and Term 3? [S1 to S5] [S5] [S3]
Till what index is the log consistent across the nodes S1-S5? Index 8
If S1 becomes candidate for Term 4, who can vote for it? [S2] [S4] [S5]
If S2 becomes candidate for Term 4, who can vote for it? [S5]
If S3 becomes candidate for Term 4, who can vote for it? [S1] [S2] [S4] [S5]

Slide 90

Slide 90 text

Raft log Trivia
[Figure: the logs of servers S1-S5, showing the term number (1, 2 or 3) of the entry at each log index 1-9]
Who could be the leader for Term 1, Term 2 and Term 3? [S1 to S5] [S5] [S3]
Till what index is the log consistent across the nodes S1-S5? Index 8
If S1 becomes candidate for Term 4, who can vote for it? [S2] [S4] [S5]
If S2 becomes candidate for Term 4, who can vote for it? [S5]
If S3 becomes candidate for Term 4, who can vote for it? [S1] [S2] [S4] [S5]
If S4 becomes candidate for Term 4, who can vote for it?

Slide 91

Slide 91 text

Raft log Trivia
[Figure: the logs of servers S1-S5, showing the term number (1, 2 or 3) of the entry at each log index 1-9]
Who could be the leader for Term 1, Term 2 and Term 3? [S1 to S5] [S5] [S3]
Till what index is the log consistent across the nodes S1-S5? Index 8
If S1 becomes candidate for Term 4, who can vote for it? [S2] [S4] [S5]
If S2 becomes candidate for Term 4, who can vote for it? [S5]
If S3 becomes candidate for Term 4, who can vote for it? [S1] [S2] [S4] [S5]
If S4 becomes candidate for Term 4, who can vote for it? [S1] [S2] [S5]

Slide 92

Slide 92 text

Raft log Trivia
[Figure: the logs of servers S1-S5, showing the term number (1, 2 or 3) of the entry at each log index 1-9]
Who could be the leader for Term 1, Term 2 and Term 3? [S1 to S5] [S5] [S3]
Till what index is the log consistent across the nodes S1-S5? Index 8
If S1 becomes candidate for Term 4, who can vote for it? [S2] [S4] [S5]
If S2 becomes candidate for Term 4, who can vote for it? [S5]
If S3 becomes candidate for Term 4, who can vote for it? [S1] [S2] [S4] [S5]
If S4 becomes candidate for Term 4, who can vote for it? [S1] [S2] [S5]
If S5 becomes candidate for Term 4, who can vote for it?

Slide 93

Slide 93 text

Raft log Trivia
[Figure: the logs of servers S1-S5, showing the term number (1, 2 or 3) of the entry at each log index 1-9]
Who could be the leader for Term 1, Term 2 and Term 3? [S1 to S5] [S5] [S3]
Till what index is the log consistent across the nodes S1-S5? Index 8
If S1 becomes candidate for Term 4, who can vote for it? [S2] [S4] [S5]
If S2 becomes candidate for Term 4, who can vote for it? [S5]
If S3 becomes candidate for Term 4, who can vote for it? [S1] [S2] [S4] [S5]
If S4 becomes candidate for Term 4, who can vote for it? [S1] [S2] [S5]
If S5 becomes candidate for Term 4, who can vote for it? Φ (none)

Slide 94

Slide 94 text

Raft timing and availability Timing is critical in Raft to elect and maintain a steady leader over time and keep the cluster available. Stability is ensured by respecting the timing requirements of the algorithm.

Slide 95

Slide 95 text

Raft timing and availability Timing is critical in Raft to elect and maintain a steady leader over time and keep the cluster available. Stability is ensured by respecting the timing requirements of the algorithm: broadcastTime << electionTimeout << MTBF

Slide 96

Slide 96 text

Raft timing and availability Timing is critical in Raft to elect and maintain a steady leader over time and keep the cluster available. Stability is ensured by respecting the timing requirements of the algorithm: broadcastTime << electionTimeout << MTBF. broadcastTime is the average time it takes a server to send a request to every server in the cluster and receive responses. It depends on the infrastructure; typically 0.5-20 ms.

Slide 97

Slide 97 text

Raft timing and availability Timing is critical in Raft to elect and maintain a steady leader over time and keep the cluster available. Stability is ensured by respecting the timing requirements of the algorithm: broadcastTime << electionTimeout << MTBF. broadcastTime is the average time it takes a server to send a request to every server in the cluster and receive responses. It depends on the infrastructure; typically 0.5-20 ms. electionTimeout is the configurable time after which an election is triggered; typically 150-300 ms. A good value is around 10x the mean network latency.

Slide 98

Slide 98 text

Raft timing and availability Timing is critical in Raft to elect and maintain a steady leader over time and keep the cluster available. Stability is ensured by respecting the timing requirements of the algorithm: broadcastTime << electionTimeout << MTBF. broadcastTime is the average time it takes a server to send a request to every server in the cluster and receive responses. It depends on the infrastructure; typically 0.5-20 ms. electionTimeout is the configurable time after which an election is triggered; typically 150-300 ms. A good value is around 10x the mean network latency. MTBF (Mean Time Between Failures) is the average time between failures for a single server.
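A sketch of how the randomized election timeout is typically drawn and reset; the 150-300 ms range comes from the slide, everything else (class and method names) is illustrative.

```java
import java.util.concurrent.ThreadLocalRandom;

// Sketch: each follower picks a fresh random election timeout and resets it on
// every heartbeat, so broadcastTime << electionTimeout keeps elections rare.
public class ElectionTimer {
    private static final long MIN_TIMEOUT_MS = 150;
    private static final long MAX_TIMEOUT_MS = 300;

    private long deadline; // wall-clock time at which we start an election

    public void reset() {
        long timeout = ThreadLocalRandom.current()
                .nextLong(MIN_TIMEOUT_MS, MAX_TIMEOUT_MS + 1);
        deadline = System.currentTimeMillis() + timeout;
    }

    // Called when a valid AppendEntries (heartbeat) arrives from the leader.
    public void onHeartbeat() {
        reset();
    }

    // Polled periodically; true means "become candidate and start an election".
    public boolean expired() {
        return System.currentTimeMillis() >= deadline;
    }
}
```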

Slide 99

Slide 99 text

Raft System Safety Constraints
Election safety: only one leader will be elected per election term.
Leader append-only: a leader never overwrites or deletes entries in its log; it only appends.
Log matching: if two logs contain an entry with the same index and term, the logs are identical in all entries up to that index.
Leader completeness: if a log entry is committed in a term, then that entry will be present in the logs of the leaders for all higher-numbered terms.
State machine safety: if a server has applied a log entry at a particular index to its state machine, no other server will apply a different log entry at that index.

Slide 100

Slide 100 text

Where is Raft being used? CockroachDB: a scalable, survivable, strongly-consistent SQL database. dgraph: a scalable, distributed, low-latency, high-throughput graph database. etcd: a distributed, reliable key-value store for critical data, in Go. tikv: a distributed transactional key-value database powered by Rust and Raft. swarmkit: a toolkit for orchestrating distributed systems at any scale. chain core: software for operating permissioned, multi-asset blockchain networks.

Slide 101

Slide 101 text

Consensus in the wild [Legend for the comparison table: SR: Server Replication, LR: Log Replication, SS: Sync Service, BO: Barrier Orchestration, SD: Service Discovery, LE: Leader Election, MM: Metadata Management, MQ: Message Queues]

Slide 102

Slide 102 text

Summary

Slide 103

Slide 103 text

Summary Raft is divided into 3 parts: leader election, log replication and safety.

Slide 104

Slide 104 text

Summary Raft is divided into 3 parts: leader election, log replication and safety. A node can be in one of three states: Follower, Candidate or Leader.

Slide 105

Slide 105 text

Summary Raft is divided into 3 parts: leader election, log replication and safety. A node can be in one of three states: Follower, Candidate or Leader. Every node starts as a Follower and transitions to the Candidate state after an election timeout.

Slide 106

Slide 106 text

Summary Raft is divided into 3 parts: leader election, log replication and safety. A node can be in one of three states: Follower, Candidate or Leader. Every node starts as a Follower and transitions to the Candidate state after an election timeout. A Candidate will vote for itself and send RequestVote RPCs to all the other nodes.

Slide 107

Slide 107 text

Summary Raft is divided into 3 parts: leader election, log replication and safety. A node can be in one of three states: Follower, Candidate or Leader. Every node starts as a Follower and transitions to the Candidate state after an election timeout. A Candidate will vote for itself and send RequestVote RPCs to all the other nodes. If it gets votes from the majority of the nodes, it becomes the new Leader.

Slide 108

Slide 108 text

Summary Raft is divided into 3 parts: leader election, log replication and safety. A node can be in one of three states: Follower, Candidate or Leader. Every node starts as a Follower and transitions to the Candidate state after an election timeout. A Candidate will vote for itself and send RequestVote RPCs to all the other nodes. If it gets votes from the majority of the nodes, it becomes the new Leader. The leader is the only node responsible for managing the log; followers just add new entries to their logs in response to the leader's AppendEntries RPCs.
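The three states and the transitions listed in the summary, as a compact sketch; the enum and method names are illustrative assumptions.

```java
// The three Raft server states from the summary and their usual transitions.
public enum ServerState {
    FOLLOWER, CANDIDATE, LEADER;

    // Follower -> Candidate when the election timeout fires
    // (a Candidate that times out simply starts a new election).
    public ServerState onElectionTimeout() {
        return this == LEADER ? LEADER : CANDIDATE;
    }

    // Candidate -> Leader on winning a majority of votes.
    public ServerState onMajorityVotes() {
        return this == CANDIDATE ? LEADER : this;
    }

    // Any state -> Follower on seeing a higher term or another valid leader.
    public ServerState onHigherTerm() {
        return FOLLOWER;
    }
}
```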

Slide 109

Slide 109 text

Summary ...

Slide 110

Slide 110 text

Summary ... When the leader receives a command from the client, it first saves this uncommitted entry in its log, then sends it to every follower.

Slide 111

Slide 111 text

Summary ... When the leader receives a command from the client, it first saves this uncommitted entry in its log, then sends it to every follower. When it gets a successful response from the majority of nodes, the command is committed and the client gets a confirmation.

Slide 112

Slide 112 text

Summary ... When the leader receives a command from the client, it first saves this uncommitted entry in its log, then sends it to every follower. When it gets a successful response from the majority of nodes, the command is committed and the client gets a confirmation. In the next AppendEntries RPC sent to the follower (which can carry a new entry or just be a heartbeat), the follower also commits the entry.

Slide 113

Slide 113 text

Summary ... When the leader receives a command from the client, it first saves this uncommitted entry in its log, then sends it to every follower. When it gets a successful response from the majority of nodes, the command is committed and the client gets a confirmation. In the next AppendEntries RPC sent to the follower (which can carry a new entry or just be a heartbeat), the follower also commits the entry. The AppendEntries RPC implements a consistency check to guarantee that the follower's local log is consistent with the leader's log.

Slide 114

Slide 114 text

Summary ... When the leader receives a command from the client, it first saves this uncommitted entry in its log, then sends it to every follower. When it gets a successful response from the majority of nodes, the command is committed and the client gets a confirmation. In the next AppendEntries RPC sent to the follower (which can carry a new entry or just be a heartbeat), the follower also commits the entry. The AppendEntries RPC implements a consistency check to guarantee that the follower's local log is consistent with the leader's log. A follower will grant its vote to a candidate whose log is at least as up-to-date as its own.
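A sketch of that "at least as up-to-date" comparison: compare the last log terms first, and break ties on log length. The class and method names are assumptions.

```java
// Sketch of the up-to-date rule used when granting votes: the candidate's log
// qualifies if its last term is higher, or the terms are equal and its log is
// at least as long as the voter's.
public class LogComparison {
    public static boolean candidateIsUpToDate(long candidateLastTerm, long candidateLastIndex,
                                              long voterLastTerm, long voterLastIndex) {
        if (candidateLastTerm != voterLastTerm) {
            return candidateLastTerm > voterLastTerm;
        }
        return candidateLastIndex >= voterLastIndex;
    }
}
```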

Slide 115

Slide 115 text

Where to go next? The Raft paper should be the first place to start. Read about cluster membership in Raft. You can now read about the ZAB protocol in Zookeeper. Visualize Raft in even simpler terms. Look at sample implementations of Raft, particularly in Java. ZK logs should make even more sense now. Dive into Paxos and compare the models for yourself.