RAFT: A story on how clusters of computers keep your data in sync

Joshua Thijssen

January 25, 2020

Transcript

  1. RAFT
    An adventure in keeping your data in sync
    Joshua Thijssen
    @JayTaph

  2. What is X?
    Set X = 5
    ok!
    5!

  3. What is X?
    5! ...

    X ?!
    3.14

  4. Gunter, we need a distributed consensus algorithm!
    Wenk!

  5. How do we make sure that...
    ➡ Data stays consistent
    ➡ Servers can fail at any time
    ➡ Don’t rely on unreliable means (time, IDs, network, etc.)

  6. Replicated State Machine
    Journal / Log    State machine
    #1 set Y=5       Y=5
    #2 inc Y         Y=6
    #3 set X=3       Y=6, X=3
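
The journal-to-state idea on this slide can be sketched in a few lines of Python. This is only an illustration: the tuple-based command format and the names `apply` and `replay` are assumptions made for the example, not something from the talk.

```python
def apply(state, command):
    """Apply one journal command to a copy of the state and return it."""
    op, *args = command
    new_state = dict(state)
    if op == "set":
        key, value = args
        new_state[key] = value
    elif op == "inc":
        (key,) = args
        new_state[key] = new_state.get(key, 0) + 1
    else:
        raise ValueError(f"unknown command: {op}")
    return new_state


def replay(journal):
    """Replay the whole journal in order, starting from an empty state."""
    state = {}
    for command in journal:
        state = apply(state, command)
    return state


# The three journal entries from the slide (hypothetical encoding).
journal = [("set", "Y", 5), ("inc", "Y"), ("set", "X", 3)]
print(replay(journal))  # {'Y': 6, 'X': 3}
```

Any server that replays the same journal in the same order ends up in the same state, which is why the rest of the talk only has to worry about keeping the journal identical everywhere.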

  7. Where?
    ➡ etcd / CoreOS
    ➡ ZooKeeper
    ➡ MongoDB
    ➡ Elasticsearch
    ➡ ...basically everything clustered

  8. PAXOS
    Paxos is like, way too complex!

  9. Diego Ongaro
    John Ousterhout
    In Search of an Understandable Consensus Algorithm (Extended Version)

  10. RAFT

  11. ➡ https://raft.github.io
    ➡ https://raft.github.io/raft.pdf

  12. ➡ Proven to be correct
    ➡ Designed for simplicity
    ➡ Relatively easy to implement
    ➡ "Good enough" for most cases

  13. ➡ Cluster of servers, usually 3 or 5.
    ➡ One single leader.
    ➡ Clients communicate with leader only.
    ➡ Leader sends logs to other members.

  14. Three main pillars
    Leader Election
    Log Replication
    Safety

  15. Leader Election
    "One leader to rule them all (without the orcs and stuff)"

  16. Follower
    Candidate
    Leader

  17. I'm the leader!

  18. ...
    Wenk!???

  19. ...
    Vote for me as leader!

  20. ...
    Wenk!
    Wenk!

  21. ... I'm the leader!

  22. Hey guys, I'm back!
    I'm the leader now!

  23. I'm the leader for term 1!
    I'm the leader for term 2!
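
Terms are what settle the "two leaders" argument on this slide: the server with the lower term loses. A minimal sketch, assuming a hypothetical `Server` class and an `observe_term` helper that is called with the term carried by every incoming message (both names are invented for this example):

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    current_term: int = 0
    role: str = "follower"  # "follower", "candidate" or "leader"

    def observe_term(self, term):
        """Return True if a message with this term is current, False if stale."""
        if term > self.current_term:
            # Someone is ahead of us: adopt their term and step down.
            self.current_term = term
            self.role = "follower"
            return True
        return term == self.current_term


old_leader = Server("A", current_term=1, role="leader")
new_leader = Server("B", current_term=2, role="leader")

# B ignores A's stale term-1 message ...
print(new_leader.observe_term(old_leader.current_term))  # False
# ... and A steps down as soon as it hears about term 2.
old_leader.observe_term(new_leader.current_term)
print(old_leader.role, old_leader.current_term)  # follower 2
```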

  24. (no text on this slide)

  25. (no text on this slide)

  26. Servers  Quorum Size  Failure Tolerance
      1        1            0
      2        2            0
      3        2            1
      4        3            1
      5        3            2
      6        4            2
      7        4            3
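
The numbers in this table follow from "a quorum is a strict majority". A small sketch that reproduces them; the function names are made up for the example:

```python
def quorum_size(servers):
    """Smallest number of servers that forms a strict majority."""
    return servers // 2 + 1


def failure_tolerance(servers):
    """How many servers may fail while a quorum can still be formed."""
    return servers - quorum_size(servers)


for n in range(1, 8):
    print(n, quorum_size(n), failure_tolerance(n))
```

Going from 3 to 4 servers (or from 5 to 6) raises the quorum size without raising the failure tolerance, which is one reason clusters usually have an odd number of members.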

  27. Three main pillars
    Leader Election
    Log Replication
    Safety

  28. Log Replication
    "How do we make sure that stuff also happens somewhere else"

  29. Log entries: each has an index, a term and a command, and is either committed or uncommitted (slots 9 to 12 are still empty)
      index    1    2    3    4    5    6    7    8
      term     1    1    1    2    3    3    3    3
      command  x←3  x←1  y←9  x←2  x←0  y←7  x←5  x←4

  30. Leader receives commands and appends them to its log
    Leader log (index 1 to 8): x←3, x←1, y←9 (term 1); x←2 (term 2); x←0, y←7, x←5, x←4 (term 3)
    Four follower logs (index 1 to 6 each): x←3, x←1, y←9 (term 1); x←2 (term 2); x←0, y←7 (term 3)

  31. Leader pushes entries to the cluster
    Leader log (index 1 to 8): x←3, x←1, y←9 (term 1); x←2 (term 2); x←0, y←7, x←5, x←4 (term 3)
    Two follower logs now also hold entries 7 and 8 (x←5, x←4, term 3)
    Two follower logs still end at index 6 (y←7, term 3)

  32. Leader commits after a majority has the entries, then OKs to the client
    Leader log (index 1 to 8): x←3, x←1, y←9 (term 1); x←2 (term 2); x←0, y←7, x←5, x←4 (term 3)
    Two follower logs hold entries 1 to 8; together with the leader that is 3 out of 5 servers
    Two follower logs still end at index 6 (y←7, term 3)
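
"Commits after majority" can be computed from the highest index each server is known to hold. The sketch below is a simplification under stated assumptions: `majority_match_index` is a hypothetical helper, and it leaves out the extra rule from the Raft paper that a leader only commits this way for entries from its own current term.

```python
def majority_match_index(leader_last_index, follower_match_index):
    """Highest log index that is stored on a majority of the cluster."""
    indexes = sorted([leader_last_index, *follower_match_index], reverse=True)
    quorum = len(indexes) // 2 + 1
    # The quorum-th highest index is held by at least `quorum` servers.
    return indexes[quorum - 1]


# Leader holds 8 entries, two followers caught up, two are still at 6.
print(majority_match_index(8, [8, 8, 6, 6]))  # 8
# Only one follower has caught up: entries 7 and 8 cannot be committed yet.
print(majority_match_index(8, [8, 6, 6, 6]))  # 6
```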

  33. Followers receive log entries and commitIndex from the leader
    Leader log (index 1 to 8): x←3, x←1, y←9 (term 1); x←2 (term 2); x←0, y←7, x←5, x←4 (term 3)
    All four follower logs now hold entries 1 to 8
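
On the follower side, this step could look roughly like the sketch below. It is loosely modeled on the AppendEntries rules in the Raft paper, but the function signature, the `(term, command)` tuples and the return value are assumptions for this example, and term checks on the message itself are left out.

```python
def append_entries(log, commit_index, prev_index, prev_term, entries, leader_commit):
    """Simplified follower-side handling of entries pushed by the leader.

    `log` is a list of (term, command) tuples; log index 1 is log[0].
    Returns (success, new_commit_index) and extends `log` in place.
    """
    # Reject if we have no entry at prev_index with the expected term.
    if prev_index > 0:
        if len(log) < prev_index or log[prev_index - 1][0] != prev_term:
            return False, commit_index
    for offset, entry in enumerate(entries):
        index = prev_index + offset + 1
        # An existing entry with a different term conflicts: drop it and the rest.
        if len(log) >= index and log[index - 1][0] != entry[0]:
            del log[index - 1:]
        if len(log) < index:
            log.append(entry)
    # Follow the leader's commitIndex, but never past the last new entry.
    if leader_commit > commit_index:
        commit_index = min(leader_commit, prev_index + len(entries))
    return True, commit_index


follower = [(1, "x<-3"), (1, "x<-1"), (1, "y<-9"), (2, "x<-2"), (3, "x<-0"), (3, "y<-7")]
ok, commit = append_entries(follower, 6, 6, 3, [(3, "x<-5"), (3, "x<-4")], 8)
print(ok, commit, len(follower))  # True 8 8
```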

  34. (Diagram: log indexes 1 to 12 with the term number of each entry, shown for a set of logs labeled a to f. The per-log layout is not recoverable from this transcript.)

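  35. Snapshots
    Entries covered by the snapshot: index 1 (term 1, x←3), 2 (term 1, x←1), 3 (term 1, y←9), 4 (term 2, x←2), ..., 1006 (term 23, y←7)
    Snapshot:
      last included index: 1006
      last included term: 23
      state: Y=7, X=4, Z="aap"
    Entries after the snapshot: index 1007 (term 24, x←5), 1008 (term 24, x←4), 1009 (term 24, x←2), 1010 (term 24, y←4)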
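
A snapshot replaces a log prefix with the state it produced plus the index and term of the last entry it covers. A minimal sketch, where `Snapshot`, `take_snapshot` and the `first_index` bookkeeping are hypothetical names chosen for the example:

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    last_included_index: int
    last_included_term: int
    state: dict


def take_snapshot(log, first_index, state, applied_index):
    """Compact `log`, a list of (term, command) where log[0] has index `first_index`.

    Returns (snapshot, remaining_log, new_first_index).
    """
    offset = applied_index - first_index
    term = log[offset][0]
    snapshot = Snapshot(applied_index, term, dict(state))
    return snapshot, log[offset + 1:], applied_index + 1


# Tail of the log from the slide, starting at index 1006.
log = [(23, "y<-7"), (24, "x<-5"), (24, "x<-4"), (24, "x<-2"), (24, "y<-4")]
snap, rest, first = take_snapshot(log, 1006, {"Y": 7, "X": 4, "Z": "aap"}, 1006)
print(snap.last_included_index, snap.last_included_term, first, len(rest))  # 1006 23 1007 4
```

Keeping the last included term lets a server still check the "previous entry" of index 1007 after the entry at 1006 itself has been thrown away.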

  36. Three main pillars
    Leader Election
    Log Replication
    Safety

  37. Safety

  38. ➡ #1 For any given term, there can be at most one leader.
    ➡ #2 A server will vote for at most one candidate per term.
    ➡ #3 If two logs contain an entry with the same index and term, both logs are identical up to and including that index.
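
Rule #2 is what makes rule #1 work: if every server votes at most once per term, two candidates cannot both collect a majority in the same term. A sketch of the voting side, assuming a hypothetical `Voter` class and leaving out the log comparison a real implementation also performs:

```python
class Voter:
    """Tracks the single vote a server may hand out per term."""

    def __init__(self):
        self.current_term = 0
        self.voted_for = None

    def request_vote(self, term, candidate):
        if term < self.current_term:
            return False  # request from an old term
        if term > self.current_term:
            # New term: forget the vote from the previous one.
            self.current_term = term
            self.voted_for = None
        if self.voted_for in (None, candidate):
            self.voted_for = candidate
            return True
        return False  # already voted for someone else this term


v = Voter()
print(v.request_vote(1, "A"))  # True
print(v.request_vote(1, "B"))  # False
print(v.request_vote(2, "B"))  # True
```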

  39. ➡ #4 The leader of a term holds all entries committed in previous terms.
    ➡ #5 Leaders only ever append new entries to their logs. Entries in a leader's log are never overwritten or deleted.
    ➡ #6 If server S1 applies committed log entry 'e' at index 'i' to its state machine, then no other server can apply a different committed log entry at index 'i' to theirs.
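
Rule #4 is enforced in the Raft paper through an election restriction: a server only votes for a candidate whose log is at least as up to date as its own, comparing the term of the last entry first and the log length second. A sketch of that comparison (the function and parameter names are invented for this example):

```python
def candidate_is_up_to_date(cand_last_term, cand_last_index, my_last_term, my_last_index):
    """True if the candidate's log is at least as up to date as ours."""
    if cand_last_term != my_last_term:
        # The log whose last entry has the higher term wins.
        return cand_last_term > my_last_term
    # Same last term: the longer (or equally long) log wins.
    return cand_last_index >= my_last_index


print(candidate_is_up_to_date(3, 6, 3, 8))  # False: same term, shorter log
print(candidate_is_up_to_date(4, 2, 3, 8))  # True: higher last term
print(candidate_is_up_to_date(3, 8, 3, 8))  # True: identical position
```

Because a committed entry sits on a majority and a candidate needs votes from a majority, at least one voter holds every committed entry and will refuse a candidate that lacks it.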

  40. PHP?
    ➡ Actually, sort of
    ➡ https://github.com/Waqee/Raft-php
    ➡ https://github.com/JayTaph/raft

  41. Questions?