Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications

90-minute presentation for the Advanced Topics in Data Management course at KAUST on "Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications" by Stoica et al., published in SIGCOMM '01.

Emaad Manzoor

May 04, 2014

Transcript

  1. Chord. Stoica et al., SIGCOMM '01

    May 4, 2014 | Emaad Ahmed Manzoor | CS341: Advanced Topics in Data Management
  2. Problem: Motivation & metrics

    Solution: The Chord Protocol
    Protocol Analysis: Provable guarantees
    Experimental Results: Simulations
  5. Map a key to a node in the network containing the corresponding value

  7. Map a key to a node in the network containing the corresponding value

    Keys are distributed across decentralized nodes
  9. Map a key to a node in the network containing the corresponding value

    Keys are distributed across decentralized nodes
    Nodes may arrive or leave at any time
  12. Map a key to a node in the network containing the corresponding value

    Routing Tables
  13. Map a key to a node in the network containing the corresponding value

    Routing Tables, Communication Cost
  14. Map a key to a node in the network containing the corresponding value

    Routing Tables, Communication Cost, Maintenance Cost
  15. Map a key to a node in the network containing the corresponding value

    Scalability f(N): Routing Tables, Communication Cost, Maintenance Cost
  16. Keys are distributed across decentralized nodes

  18. Keys are distributed across decentralized nodes

    Load Balance f(|k|/N)
  20. Keys are distributed across decentralized nodes

    Load Balance f(|k|/N)
    Asynchronous
  21. Nodes may arrive or leave at any time

  22. Nodes may arrive or leave at any time

    Availability
  23. Nodes may arrive or leave at any time

    Availability + Asynchronous
  24. Nodes may arrive or leave at any time

    Availability + Asynchronous = Correctness
  25. Scalability, Load Balance, Availability, Asynchronous, Correctness

    - Map a key to a node in the network containing the corresponding value
    - Keys are distributed across decentralized nodes
    - Nodes may arrive or leave at any time
  26. Problem: Motivation & metrics

    Solution: The Chord Protocol
    Protocol Analysis: Provable guarantees
    Experimental Results: Simulations
  27. [Diagram: six nodes (N) and the Internet]

  28. [Diagram: six nodes (N) and the Internet]

    Publisher: key = k, value = v
  29. [Diagram: six nodes (N) and the Internet]

    Publisher: key = k, value = v
    Client: key = k, value = ?
  30. [Diagram: six nodes (N) and the Internet]

    Publisher: key = k, value = v
    Client: key = k, value = ?
    Assignment: Load Balancing, Asynchronous
  31. Consistent Hashing

    Distribute keys across nodes
  32. Consistent Hashing

    key_identifier = hash(key) = 110001
    m bits, m = 6
  33. Consistent Hashing

    key_identifier = hash(key) = 110001
    node_identifier = hash(node IP) = 100101
    m bits, m = 6
  34. Consistent Hashing

    key_identifier = hash(key) = 110001
    node_identifier = hash(node IP) = 100101
    m bits, m = 6
    Identifier ring: identifier mod 2^m = 0 to 63
  35. Consistent Hashing

    key_identifier = hash(key) = 110001
    node_identifier = hash(node IP) = 100101
    m bits, m = 6
    Identifier ring: identifier mod 2^m = 0 to 63
    Node for k: successor(key_identifier)
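The assignment rule on the slide above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: it assumes SHA-1 reduced modulo 2^m as the hash, and the function names `identifier` and `successor`, the node addresses, and the key string are all hypothetical.

```python
import hashlib
from bisect import bisect_left

M = 6  # identifier bits, as on the slide; the ring has 2**M = 64 positions


def identifier(name, m=M):
    """Hash a key or a node address to an m-bit identifier (SHA-1 mod 2^m)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(ident, node_ids):
    """Return the first node identifier equal to or following ident on the ring."""
    ring = sorted(node_ids)
    i = bisect_left(ring, ident)
    return ring[i % len(ring)]  # wrap around past the largest identifier


# Hypothetical node addresses and key, for illustration only.
nodes = [identifier(ip) for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3")]
key_id = identifier("some-key")
owner = successor(key_id, nodes)  # identifier of the node responsible for the key
```

With the slide's key identifier 110001 (49) and nodes at 5, 37 (100101) and 60, the key would be assigned to node 60, the first node at or after 49 on the ring.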
  36. Load Balancing with Consistent Hashing

    N nodes, K keys, with high probability:
    - Each node has at most (1 + ε)K/N keys
    - On arrival or departure of a node, O(K/N) keys change hands
  37. Load Balancing with Consistent Hashing

    [Figure panels: (a) N = 10,000 nodes; (b) K = 500,000 keys]
  38. Load Balancing with Consistent Hashing

    [Figure panels: (a) N = 10,000 nodes; (b) K = 500,000 keys]
    Large variation in the number of keys per node
  39. Load Balancing with Consistent Hashing

    [Figure panels: (a) N = 10,000 nodes; (b) K = 500,000 keys]
    Many nodes have no keys
  40. Reason for the Load Imbalance

  41. Reason for the Load Imbalance

    Node identifiers do not uniformly cover the identifier space
  42. Reason for the Load Imbalance

    Node identifiers do not uniformly cover the identifier space
    Solution: Virtual Nodes
  43. Load Balance with Virtual Nodes

  44. Load Balance with Virtual Nodes

    Load balance improves with more virtual nodes per real node
  45. Load Balance with Virtual Nodes

    Load balance improves with more virtual nodes per real node
    Cost of routing information per node also increases
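The effect of virtual nodes can be explored with a small simulation. The sketch below is an assumption-laden illustration, not the experiment from the paper: it uses SHA-1 modulo 2^32 as the hash, and the naming scheme for nodes, virtual nodes, and keys is made up. It prints the minimum and maximum number of keys per real node for several virtual-node counts; no particular output is claimed here.

```python
import hashlib
from bisect import bisect_left
from collections import Counter

M = 32  # identifier bits (an assumption; large enough that collisions are unlikely here)


def identifier(name, m=M):
    """Hash a name to an m-bit identifier (SHA-1 mod 2^m)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def keys_per_node(num_nodes, num_keys, virtual_per_node):
    """Count keys per real node when each real node hosts several virtual nodes."""
    # Map each virtual node identifier back to the real node that hosts it.
    owner_of = {}
    for n in range(num_nodes):
        for v in range(virtual_per_node):
            owner_of[identifier(f"node-{n}-virtual-{v}")] = n
    ring = sorted(owner_of)
    load = Counter({n: 0 for n in range(num_nodes)})
    for k in range(num_keys):
        i = bisect_left(ring, identifier(f"key-{k}"))
        load[owner_of[ring[i % len(ring)]]] += 1  # successor on the ring owns the key
    return load


for r in (1, 4, 16):
    load = keys_per_node(num_nodes=100, num_keys=10_000, virtual_per_node=r)
    print(r, min(load.values()), max(load.values()))
```

Each real node must keep routing state for every virtual node it hosts, which is the cost noted on the slide.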
  46. [Diagram: six nodes (N) and the Internet]

    Publisher: key = k, value = v
    Client: key = k, value = ?
    Assignment: Load Balancing, Asynchronous
  47. [Diagram: six nodes (N) and the Internet]

    Publisher: key = k, value = v
    Client: key = k, value = ?
    Assignment: Load Balancing, Asynchronous
    Lookup: Scalability, Asynchronous
  49. Routing information: O(1)

    Communication: O(N)
  51. m entries

  53. Routing information: log(N)

  54. Communication cost: ?

  55. Each node can forward a request at most halfway around the identifier circle

  56. Expected number of query steps if node n wants to find the predecessor p of key k = ?

  58. Expected number of query steps if node n wants to find the predecessor p of key k = ?

    0000 1000 1011
  59. Expected number of query steps if node n wants to find the predecessor p of key k = ?

    0000 1000 1011
    d_0 = 1011
  60. Expected number of query steps if node n wants to find the predecessor p of key k = ?

    0000 1000 1011
    d_1 = 0011
  61. Expected number of query steps if node n wants to find the predecessor p of key k

    = 0.5 x log(N)
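The 0.5 x log(N) figure can be motivated by reading the distance from n to p in binary, as the example slides do. The following is a sketch under simplifying assumptions that are not stated on the slides: the ring is fully populated (N = 2^m, so every finger lands exactly on n + 2^i), the distance d_0 is uniformly random, and log is base 2.

```latex
% d_i: clockwise distance from the current node to p after i hops.
% Forwarding through the finger for the highest set bit of d_i clears that bit:
d_{i+1} = d_i - 2^{\lfloor \log_2 d_i \rfloor}
% Slide example: d_0 = 1011_2 \to d_1 = 0011_2 \to d_2 = 0001_2 \to d_3 = 0 (three hops).
\text{hops}(d_0) = \#\{\text{1 bits in } d_0\}
% Each of the \log_2 N bits of a uniformly random distance is 1 with probability 1/2:
\mathbb{E}[\text{hops}] = \sum_{j=1}^{\log_2 N} \Pr[\text{bit } j \text{ of } d_0 = 1] = \tfrac{1}{2}\log_2 N
```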
  62. Routing information: O(log(N))

    Communication cost: O(log(N))
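A finger-table lookup with these costs can be sketched as follows. This is an illustrative, single-process model rather than the protocol's networked implementation: the helper names (`in_open`, `in_half_open`, `build_fingers`, `find_successor`) and the example node identifiers are assumptions, and finger tables are computed centrally from a sorted list of node identifiers instead of being maintained by each node.

```python
from bisect import bisect_left

M = 6  # identifier bits; the ring has 2**M positions and each node keeps M fingers


def in_open(x, a, b):
    """True if x lies in the ring interval (a, b), exclusive at both ends."""
    if a < b:
        return a < x < b
    return x > a or x < b  # interval wraps past zero


def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    return x == b or in_open(x, a, b)


def successor(ident, ring):
    """First node at or after ident; ring must be a sorted list of node identifiers."""
    i = bisect_left(ring, ident % 2 ** M)
    return ring[i % len(ring)]


def build_fingers(ring):
    """finger[n][i] = successor(n + 2^i) for i = 0 .. M-1 (m entries per node)."""
    return {n: [successor(n + 2 ** i, ring) for i in range(M)] for n in ring}


def find_successor(start, key_id, fingers):
    """Route from node start to the node responsible for key_id; return (node, hops)."""
    n, hops = start, 0
    while not in_half_open(key_id, n, fingers[n][0]):
        # Closest preceding finger: scan from the farthest finger inward.
        n = next((f for f in reversed(fingers[n]) if in_open(f, n, key_id)), fingers[n][0])
        hops += 1
    return fingers[n][0], hops


ring = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])  # example node identifiers
fingers = build_fingers(ring)
node, hops = find_successor(8, 54, fingers)  # forwards 8 -> 42 -> 51, answer is node 56
```

Each forwarding step uses a strictly smaller finger index than the previous one, which is why the hop count in this model never exceeds M.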
  63. Lookup Cost in Practice

    2^12 node network
  64. Lookup Cost in Practice

    Path length = query steps. Grows as log(N)
    2^12 node network
  65. Lookup Cost in Practice

    Path length = query steps. Grows as log(N)
    Mean path length = 0.5 x log(N)
    2^12 node network
  66. Lookup Cost in Practice

    Path length = query steps. Grows as log(N)
    Mean path length = 0.5 x log(N)
    High variance in path length?
  67. [Diagram: six nodes (N) and the Internet]

    Publisher: key = k, value = v
    Client: key = k, value = ?
    Assignment: Load Balancing, Asynchronous
    Lookup: Scalability, Asynchronous
  69. [Diagram: six nodes (N) and the Internet]

    Publisher: key = k, value = v
    Client: key = k, value = ?
    Assignment: Load Balancing, Asynchronous
    Lookup: Scalability, Asynchronous
    Joins/Failure: Availability, Correctness, Scalability
  73. Stabilize protocol

    - Stabilize: updates successors
    - Notify: updates predecessors
    - Fix fingers: updates finger tables
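The three periodic routines can be sketched as methods on an in-process node object. This is a simplified model under stated assumptions: nodes call each other directly instead of over the network, failures are ignored, and `find_successor` walks successor pointers only (the slow O(N) path) so the sketch stays short. The class and attribute names are illustrative.

```python
M = 6  # identifier bits


def between(x, a, b):
    """True if x is in the open ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b  # interval wraps past zero


class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self     # a lone node is its own successor
        self.predecessor = None
        self.finger = [self] * M  # finger[i] should converge to successor(id + 2^i)
        self.next_finger = 0

    def find_successor(self, ident):
        """Walk successor pointers until ident falls in (node, node.successor]."""
        node = self
        while not (ident == node.successor.id
                   or between(ident, node.id, node.successor.id)):
            node = node.successor
        return node.successor

    def join(self, known):
        """Join the ring through a known node; stabilization does the rest."""
        self.predecessor = None
        self.successor = known.find_successor(self.id)

    def stabilize(self):
        """Updates successors: adopt the successor's predecessor if it sits between."""
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        """Updates predecessors: candidate believes it may be our predecessor."""
        if self.predecessor is None or between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

    def fix_fingers(self):
        """Updates finger tables: refresh one entry per call, round-robin."""
        i = self.next_finger
        self.finger[i] = self.find_successor((self.id + 2 ** i) % 2 ** M)
        self.next_finger = (i + 1) % M


a, b, c = Node(8), Node(21), Node(42)  # example identifiers
b.join(a)
c.join(a)
for _ in range(5):            # a few stabilization rounds
    for node in (a, b, c):
        node.stabilize()
for _ in range(M):            # then refresh every finger entry once
    for node in (a, b, c):
        node.fix_fingers()
```

In this run the successor pointers settle into the ring 8 -> 21 -> 42 -> 8 after three rounds; the remaining rounds leave it unchanged.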
  75. What if a lookup happens now?
  76. Correctness of Lookups with Node Joins

    3 cases
    - Finger table entries are current, successors are correct
  77. Correctness of Lookups with Node Joins

    3 cases
    - Finger table entries are current, successors are correct: O(log(N)) lookup cost, correct lookup result
  78. Correctness of Lookups with Node Joins

    3 cases
    - Finger table entries are current, successors are correct: O(log(N)) lookup cost, correct lookup result
    - Finger table entries are stale, successors are correct
  79. Correctness of Lookups with Node Joins

    3 cases
    - Finger table entries are current, successors are correct: O(log(N)) lookup cost, correct lookup result
    - Finger table entries are stale, successors are correct: O(N) lookup cost
  80. Correctness of Lookups with Node Joins

    3 cases
    - Finger table entries are current, successors are correct: O(log(N)) lookup cost, correct lookup result
    - Finger table entries are stale, successors are correct: O(N) lookup cost, correct lookup result
  81. Correctness of Lookups with Node Joins

    3 cases
    - Finger table entries are current, successors are correct: O(log(N)) lookup cost, correct lookup result
    - Finger table entries are stale, successors are correct: O(N) lookup cost, correct lookup result
    - Successors are incorrect
  82. Correctness of Lookups with Node Joins

    3 cases
    - Finger table entries are current, successors are correct: O(log(N)) lookup cost, correct lookup result
    - Finger table entries are stale, successors are correct: O(N) lookup cost, correct lookup result
    - Successors are incorrect: lookup fails
  83. Correctness of Lookups with Node Joins

    3 cases
    - Finger table entries are current, successors are correct: O(log(N)) lookup cost, correct lookup result
    - Finger table entries are stale, successors are correct: O(N) lookup cost, correct lookup result
    - Successors are incorrect: lookup fails
    Application layer pauses and retries failed lookups
  84. Performance of Lookups with Node Joins

  85. Performance of Lookups with Node Joins

    With high probability, at most O(log(N)) new nodes will land between any two existing nodes
  86. Performance of Lookups with Node Joins

    With high probability, at most O(log(N)) new nodes will land between any two existing nodes
    Lookups remain O(log(N))
  87. Node Failure

  89. Node Failure

    Each node maintains a successor list of size r
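How a successor list is used when the immediate successor fails can be sketched as below. This is a minimal illustration: the function name `first_live_successor`, the `is_alive` callback (standing in for a timeout or ping), and the node identifiers are all assumptions.

```python
def first_live_successor(successor_list, is_alive):
    """Return the first live entry of a successor list, or None if all r entries failed."""
    for candidate in successor_list:
        if is_alive(candidate):
            return candidate
    return None


# Hypothetical example: r = 3 successors of some node, the nearest of which has failed.
failed = {14}
successors = [14, 21, 32]
replacement = first_live_successor(successors, lambda n: n not in failed)
```

Here the node would adopt 21 as its new successor; a lookup can only get stuck at this node if all r entries fail at the same time.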
  90. Performance of Lookups with Node Failure

    Mean path length with successor lists = 0.5 x log(N) - 0.5 x log(r) + 1
  91. Correctness of Lookups with Node Failure

  92. Performance of Lookups with Node Failure

    Mean path length with successor lists = 0.5 x log(N) - 0.5 x log(r) + 1
    Eliminates the last 0.5 x log(r) hops on average
  93. Performance of Lookups with Node Failure

    Mean path length with successor lists = 0.5 x log(N) - 0.5 x log(r) + 1
    Access the predecessor of the key k from the node n' retrieved from the successor list
  94. Scalability, Load Balance, Availability, Asynchronous, Correctness

    - Map a key to a node in the network containing the corresponding value
    - Keys are distributed across decentralized nodes
    - Nodes may arrive or leave at any time
    Chord