Harvest, Yield, and Scalable Tolerant Systems

Pedro Tavares

October 30, 2018


  1. Harvest, Yield, and Scalable Tolerant Systems

  2. @ordepdev

  3. 1999 https://users.ece.cmu.edu/~adrian/731-sp04/readings/FB-cap.pdf

  4. “We propose two strategies for improving overall availability using simple mechanisms that scale over large applications whose output behavior tolerates graceful degradation.”
  5. Tolerate partial failures and provide smoothly-degrading functionality.

  6. “We characterize this degradation in terms of harvest and yield.”

  7. 1975 https://tools.ietf.org/html/rfc677

  8. “[…] originally proposed as a rule of thumb, without precise definitions, with the goal of starting a discussion about trade-offs in databases.”
  9. The CAP Principle

  10. It was originally called the CAP Principle by Fox and

  11. After the principle was formalized by Gilbert and Lynch (2002), it became known as the CAP Theorem.
  12. Consistency (C): All nodes should see the same data at the same time.
  13. Consistency (C) x=1 x=1

  14. Availability (A): When a failure occurs, the system should keep going, switching over to a replica if required.
  15. Availability (A) ✅

  16. Partition resilience (P): The system should continue to operate despite arbitrary message loss or failure of part of the system.
  17. Partition resilience (P) ⚡ ✅ ✅

  18. ⚡ A network partition is a communication fault that splits the network into subsets of nodes that cannot communicate with each other. ⚡
  19. Partition resilience (P) ⚡ ✅ ✅

  20. Strong CAP Principle

  21. Strong Consistency, High Availability, Partition-resilience: Pick at most 2.

  22. x=? x=?

  23. ⚡ x=? x=?

  24. ⚡ 1. set(‘x’,1) x=? x=?

  25. ⚡ 1. set(‘x’,1) 2. send(‘x’) x=1 x=?

  26. ⚡ 1. set(‘x’,1) 2. send(‘x’) x=1 x=? ⚡

  27. ⚡ 1. set(‘x’,1) 2. send(‘x’) x=1 x=? ⚡ 1. set(‘x’,2)

  28. ⚡ 1. set(‘x’,1) 2. send(‘x’) x=1 x=2 ⚡ 1. set(‘x’,2) 2. send(‘x’)
  29. ⚡ 1. set(‘x’,1) 2. send(‘x’) x=1 x=2 ⚡ 1. set(‘x’,2) 2. send(‘x’) Both nodes are available, although there’s no consistency!
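
The partition walkthrough in the preceding slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class, its `set` method, and the `partitioned` flag are hypothetical names invented here, and replication is modeled as a direct copy to peers rather than a real network call.

```python
class Node:
    """A replica holding a small key-value store (hypothetical, for illustration)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.peers = []

    def set(self, key, value, partitioned=False):
        # The local write is always accepted, so the node stays available.
        self.data[key] = value
        # Replication (the "send" step) only succeeds while the link is up.
        if not partitioned:
            for peer in self.peers:
                peer.data[key] = value


a, b = Node("a"), Node("b")
a.peers, b.peers = [b], [a]

# During the partition both nodes accept writes but cannot replicate them.
a.set("x", 1, partitioned=True)
b.set("x", 2, partitioned=True)

print(a.data["x"], b.data["x"])  # 1 2
```

Both writes succeed, yet the two replicas now disagree on `x`, which is the availability-without-consistency outcome the last slide of the sequence describes.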
  30. It’s all about trade-offs

  31. Partition Tolerance Consistency Availability

  32. Partition tolerance is mandatory in distributed systems. You cannot not choose it!
  33. Partition Tolerance Consistency Availability

  34. CP without A: To achieve atomic reads and writes we must wait for a response from the partitioned node.
  35. Partition Tolerance Consistency Availability

  36. AP without C: To achieve maximum availability, a node should return the most recent version of the data it has, even if that version is stale.
  37. “The stronger the guarantees made about any two, the weaker the guarantees that can be made about the third.”
  38. Harvest & Yield

  39. Yield: the probability of completing a request.

  40. Harvest: the fraction of the data reflected in the response.

  41. “In the presence of faults there is typically a tradeoff between providing no answer and providing an imperfect answer.”
  42. ⚡ COUNT WHERE x = 1; ? x=1 x=1

  43. ⚡ COUNT WHERE x = 1; No answer. x=1 x=1

  44. ⚡ COUNT WHERE x = 1; 1. x=1 x=1
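
The `COUNT WHERE x = 1` slides can be read as a choice between giving up yield and giving up harvest. The sketch below is a minimal illustration, not the paper's implementation: `count_where`, the `shards` layout, and the `require_full_harvest` flag are hypothetical, and harvest is approximated as the fraction of shards reached, which assumes equally sized shards.

```python
def count_where(shards, reachable, predicate, require_full_harvest):
    """Count matching rows across shards.

    Returns None when no answer is given, otherwise (answer, harvest).
    """
    live = [rows for name, rows in shards.items() if name in reachable]
    # Assumption: every shard holds the same amount of data, so the fraction
    # of shards reached stands in for the fraction of data reflected.
    harvest = len(live) / len(shards)
    if require_full_harvest and harvest < 1.0:
        return None  # refuse to answer: the request is not completed
    answer = sum(1 for rows in live for row in rows if predicate(row))
    return answer, harvest


# Two nodes, each storing one row with x=1; the partition hides node "b".
shards = {"a": [{"x": 1}], "b": [{"x": 1}]}
reachable = {"a"}


def is_one(row):
    return row["x"] == 1


strict = count_where(shards, reachable, is_one, require_full_harvest=True)
degraded = count_where(shards, reachable, is_one, require_full_harvest=False)

print(strict)    # None
print(degraded)  # (1, 0.5)
```

The strict policy lowers yield because the request goes unanswered, while the degraded policy completes the request with an answer of 1 that reflects only half of the data.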

  45. “Instead of CAP, you should think about your availability in terms of yield and harvest and which of these two your system will sacrifice when failures happen.” https://codahale.com/you-cant-sacrifice-partition-tolerance
  46. Problems with CAP

  47. Asymmetry between A & C http://dbmsmusings.blogspot.com/2010/04/problems-with-cap-and-yahoos-little.html

  48. CP reads like “consistent and tolerant of network partitions, but not available”…
  49. Of course, it’s not the case… Availability is only sacrificed when there is a network partition.
  50. While an AP system… sacrifices consistency all the time, not just when there is a network partition.
  51. Lack of latency considerations

  52. “In its classic interpretation, CAP theorem ignores latency…”

  53. “… although in practice, latency and partitions are deeply related.”

  54. “Systems that tend to give up consistency for availability when there is a partition also tend to give up consistency for latency when there is no partition.” http://dbmsmusings.blogspot.com/2010/04/problems-with-cap-and-yahoos-little.html
  55. Guarantee            | Consistency | Performance | Availability
      Strong Consistency   | Excellent   | Poor        | Poor
      Eventual Consistency | Poor        | Excellent   | Excellent
  56. “When you go with an AP system, you choose latency over consistency.”
  57. “Pick 2 of 3” is misleading http://cs609.cs.ua.edu/CAP12.pdf

  58. “The CAP theorem asserts that any networked shared-data system can have only two of three desirable properties.” https://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed
  59. ⚡ Network faults: you don’t have a choice, they will happen whether you like it or not! ⚡
  60. ⚡ Consistency Availability ?

  61. “A better way of phrasing CAP would be either Consistent or Available when Partitioned.”
  62. CRDTs: we can write safely and consistently even when the cluster is totally partitioned.
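
As one concrete illustration of the CRDT idea, here is a minimal sketch of a grow-only counter (G-Counter). The slides do not name a specific CRDT, so this choice, the `GCounter` class, and its method names are assumptions made for the example. Each replica increments only its own slot, and merging takes the per-node maximum, so replicas converge once they can exchange state again.

```python
class GCounter:
    """Grow-only counter CRDT (illustrative sketch, not a library API)."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}  # node id -> increments observed from that node

    def increment(self, amount=1):
        # A replica only ever writes to its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Per-node maximum is commutative, associative, and idempotent,
        # so merges can arrive in any order and any number of times.
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)


a, b = GCounter("a"), GCounter("b")

# Writes accepted independently on each side of a partition.
a.increment()
a.increment()
b.increment()

# After the partition heals, the replicas exchange state and converge.
a.merge(b)
b.merge(a)

print(a.value(), b.value())  # 3 3
```

While partitioned, the replicas may report different values; the guarantee sketched here is that they agree after merging, without any write being lost.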
  63. “[…] by explicitly handling partitions, designers can optimize consistency and availability, thereby achieving some tradeoff of all three.” https://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed
  64. Improving Knowledge

  65. “Whatever way you choose to learn, I encourage you to be curious and patient – this stuff doesn’t come easy.” https://martin.kleppmann.com/2015/05/11/please-stop-calling-databases-cp-or-ap.html
  66. “But whatever you do, please stop talking about CP and AP, because they just don’t make any sense.” https://martin.kleppmann.com/2015/05/11/please-stop-calling-databases-cp-or-ap.html
  67. Harvest, Yield, and Scalable Tolerant Systems