Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications

90-minute presentation for the Advanced Topics in Data Management course at KAUST, on "Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications" by Stoica et al., published in SIGCOMM '01.

Emaad Manzoor

May 04, 2014

Transcript

  1. Chord
    Stoica et al., SIGCOMM '01
    May 4, 2014 | Emaad Ahmed Manzoor
    CS341: Advanced Topics in Data Management
  2. Problem: Motivation & metrics
    Solution: The Chord Protocol
    Protocol Analysis: Provable guarantees
    Experimental Results: Simulations
  8. Map a key to a node in the network containing the corresponding value
    - Keys are distributed across decentralized nodes
    - Nodes may arrive or leave at any time
  14. Map a key to a node in the network containing the corresponding value
    Scalability f(N): Routing Tables, Communication Cost, Maintenance Cost
  17. Keys are distributed across decentralized nodes
    Load Balance f(|k|/N)
    Asynchronous
  20. Nodes may arrive or leave at any time
    Availability + Asynchronous = Correctness
  21. Scalability, Load Balance, Availability, Asynchronous, Correctness
    - Map a key to a node in the network containing the corresponding value
    - Keys are distributed across decentralized nodes
    - Nodes may arrive or leave at any time
  25. [Diagram: six nodes (N) on the Internet; Publisher: key = k, value = v; Client: key = k, value = ?]
    Assignment: Load Balancing, Asynchronous
  28. Consistent Hashing
    key_identifier = hash(key) = 110001
    node_identifier = hash(node IP) = 100101
    m bits, m = 6
    Identifier ring: identifier mod 2^m = 0 to 63
    Node for k: successor(key_identifier)
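
The mapping on the consistent hashing slide can be sketched in a few lines of Python. This is an illustration rather than the deck's own code: SHA-1 is the hash function named in the Chord paper, while the helper names `identifier` and `successor` and the two extra node identifiers (3 and 56) are made up for the example.

```python
import hashlib
from bisect import bisect_left

M = 6  # identifier length in bits, as on the slide (ring positions 0..63)


def identifier(name: str, m: int = M) -> int:
    """Hash a key or a node address onto the m-bit identifier ring."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(ident: int, node_ids: list[int]) -> int:
    """First node identifier equal to or following ident, wrapping around the ring."""
    ring = sorted(node_ids)
    i = bisect_left(ring, ident)
    return ring[i % len(ring)]


# Slide values: key identifier 110001 (49), node identifier 100101 (37).
nodes = [0b000011, 0b100101, 0b111000]  # 3, 37, 56; 3 and 56 are hypothetical
owner = successor(0b110001, nodes)      # 56: first node clockwise from 49
wrapped = successor(60, nodes)          # 3: the search wraps past 63 back to 0
```

A key whose identifier equals a node identifier is assigned to that node, which is why `bisect_left` rather than `bisect_right` is used.
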
  29. Load Balancing with Consistent Hashing
    N nodes, K keys, with high probability:
    - Each node has at most (1 + ε)K/N keys
    - On arrival or departure of a node, O(K/N) keys change hands
  32. Load Balancing with Consistent Hashing
    [Figures: (a) N = 10,000 nodes, (b) K = 500,000 keys]
    - Large variation in the number of keys per node
    - Many nodes have no keys
  34. Reason for the Load Imbalance
    Node identifiers do not uniformly cover the identifier space
    Solution: Virtual Nodes
  36. Load Balance with Virtual Nodes
    - Load balance improves with more virtual nodes per real node
    - Cost of routing information per node also increases
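
The effect of virtual nodes can be reproduced with a small simulation. The sketch below is an assumption-laden stand-in for the deck's experiment: identifiers are drawn uniformly at random instead of being hashed, the network is smaller than the one in the figures (1,000 nodes and 50,000 keys) so that it runs quickly, and 20 virtual nodes per real node is an arbitrary choice.

```python
import random
from bisect import bisect_left
from collections import Counter


def keys_per_node(n_nodes, n_keys, vnodes=1, m=32, seed=0):
    """Assign random key identifiers to successors; return per-real-node key counts."""
    rng = random.Random(seed)
    space = 2 ** m
    # Each real node owns `vnodes` random points on the ring.
    points = sorted((rng.randrange(space), node)
                    for node in range(n_nodes) for _ in range(vnodes))
    ids = [p for p, _ in points]
    counts = Counter()
    for _ in range(n_keys):
        i = bisect_left(ids, rng.randrange(space)) % len(ids)
        counts[points[i][1]] += 1
    return [counts[node] for node in range(n_nodes)]


def spread(counts):
    """Return (max load / mean load, number of nodes holding no keys)."""
    mean = sum(counts) / len(counts)
    return max(counts) / mean, sum(1 for c in counts if c == 0)


plain = keys_per_node(1000, 50_000, vnodes=1)
virtual = keys_per_node(1000, 50_000, vnodes=20)
print("1 point per node:   max/mean = %.2f, empty nodes = %d" % spread(plain))
print("20 points per node: max/mean = %.2f, empty nodes = %d" % spread(virtual))
```

Each real node now has 20 ring positions to maintain routing state for, which is the cost noted on the slide.
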
  38. [Diagram: six nodes (N) on the Internet; Publisher: key = k, value = v; Client: key = k, value = ?]
    Assignment: Load Balancing, Asynchronous
    Lookup: Scalability, Asynchronous
  39. Each node can forward a request at most halfway around the identifier circle
  45. Expected number of query steps if node n wants to find the predecessor p of key k = 0.5 x log(N)
    Example identifiers: 0000, 1000, 1011
    d_0 = 1011
    d_1 = 0011
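
One possible reading of the 4-bit example, written out as a worked equation. The roles are assumptions, not stated on the slide: 0000 is taken to be the querying node n, 1011 the key k, and 1000 the node reached by the first hop. The sketch also idealises by placing a node at every finger target, and the d_2 line is an extension of the slide's two steps.

```latex
\begin{align*}
d_0 &= (1011_2 - 0000_2) \bmod 2^4 = 1011_2 && \text{distance from } n \text{ to } k \\
d_1 &= (1011_2 - 1000_2) \bmod 2^4 = 0011_2 && \text{after a hop of } 2^3 \text{ to } 1000 \\
d_2 &= (1011_2 - 1010_2) \bmod 2^4 = 0001_2 && \text{after a hop of } 2^1 \text{ to } 1010 \\
\text{hops} &= \#\{\text{1 bits in } d_0\}, &
\mathbb{E}[\text{hops}] &\approx \tfrac{1}{2}\log_2 N
\end{align*}
```

Each hop clears the leading 1 bit of the remaining distance, so the hop count equals the number of 1 bits in d_0 (three in this example). A random distance has about half of its log(N) significant bits set, which gives the 0.5 x log(N) estimate.
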
  48. Lookup Cost in Practice
    Path length = query steps. Grows as log(N)
    Mean path length = 0.5 x log(N)
    2^12-node network
    High variance in path length?
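
The mean path length can be checked with a small in-memory simulation of finger-table routing. This is a sketch, not the deck's or the paper's simulator: node identifiers are drawn at random on a 32-bit ring, all finger tables are assumed to be fully correct, and a "hop" here means one forwarding step before the key's owner is identified, so the count may differ slightly from the paper's definition of path length.

```python
import math
import random
from bisect import bisect_left


def build_ring(n_nodes, m, seed=0):
    """Return sorted node ids and, per node, finger[i] = successor(n + 2^i)."""
    rng = random.Random(seed)
    ids = sorted(rng.sample(range(2 ** m), n_nodes))

    def succ(x):
        return ids[bisect_left(ids, x % (2 ** m)) % len(ids)]

    fingers = {n: [succ(n + 2 ** i) for i in range(m)] for n in ids}
    return ids, fingers


def in_open(x, a, b):
    """True if x lies strictly between a and b going clockwise on the ring."""
    return (a < x < b) if a < b else (x > a or x < b)


def lookup(start, key, fingers):
    """Route from `start` toward `key`; return (owner of key, hops taken)."""
    n, hops = start, 0
    while True:
        succ = fingers[n][0]
        # key in (n, successor]: the successor owns the key.
        if key == succ or in_open(key, n, succ):
            return succ, hops
        # Otherwise forward to the closest finger preceding the key.
        nxt = next((f for f in reversed(fingers[n]) if in_open(f, n, key)), succ)
        n, hops = nxt, hops + 1


ids, fingers = build_ring(n_nodes=2 ** 12, m=32)
rng = random.Random(1)
hops = [lookup(rng.choice(ids), rng.randrange(2 ** 32), fingers)[1]
        for _ in range(2000)]
mean_hops = sum(hops) / len(hops)
print(f"mean = {mean_hops:.2f}, 0.5 x log2(N) = {0.5 * math.log2(len(ids)):.2f}, "
      f"max = {max(hops)}")
```

Printing `min(hops)` and `max(hops)` alongside the mean gives a rough feel for the variance question raised on the slide.
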
  51. [Diagram: six nodes (N) on the Internet; Publisher: key = k, value = v; Client: key = k, value = ?]
    Assignment: Load Balancing, Asynchronous
    Lookup: Scalability, Asynchronous
    Joins/Failure: Availability, Correctness, Scalability
  52. Stabilize protocol
    - Stabilize: updates successors
    - Notify: updates predecessors
    - Fix fingers: updates finger tables
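
The three maintenance routines can be sketched as methods on an in-process node object. This loosely follows the pseudocode in the Chord paper; the `Node` class, the direct method calls standing in for remote procedure calls, and the successor-only `find_successor` (no finger shortcuts) are simplifications made for this example.

```python
M = 6  # identifier bits; the same m = 6 ring as in the consistent hashing example


def between(x, a, b):
    """True if x is in the open ring interval (a, b); (a, a) is the ring minus a."""
    return (a < x < b) if a < b else (x > a or x < b)


class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self      # a lone node is its own successor
        self.predecessor = None
        self.finger = [self] * M   # finger[i] should converge to successor(id + 2^i)
        self._next = 0             # index of the finger that fix_fingers refreshes next

    def find_successor(self, ident):
        # Slow but simple: walk successor pointers only.
        node = self
        while not (ident == node.successor.id
                   or between(ident, node.id, node.successor.id)):
            node = node.successor
        return node.successor

    def join(self, known):
        self.predecessor = None
        self.successor = known.find_successor(self.id)

    def stabilize(self):
        # Adopt a node that slipped in between us and our successor, then notify it.
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        # candidate believes it may be our predecessor.
        if self.predecessor is None or between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

    def fix_fingers(self):
        # Refresh one finger table entry per call, round-robin.
        target = (self.id + 2 ** self._next) % 2 ** M
        self.finger[self._next] = self.find_successor(target)
        self._next = (self._next + 1) % M


a, b, c = Node(8), Node(40), Node(20)
b.join(a)
c.join(a)
for _ in range(3):                 # a few rounds of periodic maintenance
    for node in (a, b, c):
        node.stabilize()
for node in (a, b, c):
    for _ in range(M):
        node.fix_fingers()
```

After the three rounds the successor pointers form the ring 8, 20, 40 and the finger tables have been filled in; in a deployment these routines would run periodically on every node rather than in a fixed loop.
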
  60. Correctness of Lookups with Node Joins
    3 cases
    - Finger table entries are current, successors are correct: O(log(N)) lookup cost, correct lookup result
    - Finger table entries are stale, successors are correct: O(N) lookup cost, correct lookup result
    - Successors are incorrect: lookup fails
    Application layer pauses and retries failed lookups
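
The pause-and-retry behaviour in the third case might look like the following at the application layer. All names here (`LookupFailed`, `lookup_with_retry`, the `attempts` and `pause` parameters) are hypothetical; the deck does not specify how long to pause or how many times to retry.

```python
import time


class LookupFailed(Exception):
    """Raised by a (hypothetical) lookup call when successor pointers are wrong."""


def lookup_with_retry(lookup, key, attempts=5, pause=0.1, sleep=time.sleep):
    """Call lookup(key); on failure pause, then retry, up to `attempts` tries."""
    for attempt in range(1, attempts + 1):
        try:
            return lookup(key)
        except LookupFailed:
            if attempt == attempts:
                raise
            sleep(pause)  # give stabilization time to repair successor pointers


# Example: a stand-in lookup that fails twice while the ring is still stabilizing.
calls = []


def flaky_lookup(key):
    calls.append(key)
    if len(calls) < 3:
        raise LookupFailed(key)
    return "node-for-%s" % key


pauses = []
result = lookup_with_retry(flaky_lookup, "k", sleep=pauses.append)
```

The `sleep` parameter is injectable only so the example can run without actually waiting.
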
  62. Performance of Lookups with Node Joins
    With high probability, at most O(log(N)) new nodes will land between any two consecutive existing nodes (when up to N nodes join a stable N-node network)
    Lookups remain O(log(N))
  63. Node Failure
    Each node maintains a successor list of size r
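
A minimal sketch of how a successor list can be used when the immediate successor fails. The function name and the `is_alive` callback (standing in for a ping or timeout check) are assumptions made for this example.

```python
def first_live_successor(successor_list, is_alive):
    """Return the first entry of the successor list that still responds, else None."""
    for node in successor_list:
        if is_alive(node):
            return node
    return None


# Example with r = 3: the two nearest successors have failed.
successors = [12, 20, 33]   # hypothetical node identifiers, nearest first
failed = {12, 20}
replacement = first_live_successor(successors, lambda n: n not in failed)  # 33
```

The lookup only breaks at this node if all r entries fail at once, which is why a longer list tolerates more simultaneous failures.
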
  66. Performance of Lookups with Node Failure
    Mean path length with successor lists = 0.5 x log(N) - 0.5 x log(r) + 1
    - Eliminates the last 0.5 x log(r) hops on average
    - Access the predecessor of the key k from the node n' retrieved from the successor list
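
To get a feel for the formula, it can be evaluated for concrete values. The sketch assumes base-2 logarithms, and the choice of r = 16 is arbitrary; neither is stated on the slide.

```python
import math


def mean_path_length(n_nodes, r):
    """Slide formula: 0.5 x log(N) - 0.5 x log(r) + 1, with log taken base 2."""
    return 0.5 * math.log2(n_nodes) - 0.5 * math.log2(r) + 1


# N = 2^12 nodes, successor list of size r = 16:
# 0.5 * 12 - 0.5 * 4 + 1 = 5.0
estimate = mean_path_length(2 ** 12, r=16)
```
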
  67. Scalability, Load Balance, Availability, Asynchronous, Correctness
    - Map a key to a node in the network containing the corresponding value
    - Keys are distributed across decentralized nodes
    - Nodes may arrive or leave at any time
    Chord