Bolt-on Causal Consistency

pbailis
June 27, 2013

Transcript

  1. Bolt-on Causal Consistency. Peter Bailis, Ali Ghodsi, Joseph M. Hellerstein, Ion Stoica (UC Berkeley)
  2. Slides from the SIGMOD 2013 paper at http://bailis.org/papers/bolton-sigmod2013.pdf (pbailis@cs.berkeley.edu)

  5. July 2000: CAP Conjecture (now a theorem): a system facing network partitions must choose between availability and strong consistency
  11. NoSQL: strong consistency is out! “Partitions matter, and so does low latency” [cf. Abadi: PACELC] ...offer eventual consistency instead
  14. Eventual Consistency: eventually, all replicas agree on the same value. An extremely weak consistency model: any value can be returned at any given time, as long as it’s eventually the same everywhere. Provides liveness but no safety guarantees. Liveness: something good eventually happens; safety: nothing bad ever happens.
  19. Do we have to give up safety if we want availability? No! There’s a spectrum of models. UT Austin TR: no model stronger than causal consistency is achievable with HA.
  22. Why causal consistency? Highly available, low-latency operation; a long-identified, useful “session” model; a natural fit for many modern apps [Bayou Project, 1994-98] [UT Austin 2011 TR]
  25. Dilemma! Eventual consistency is the lowest common denominator across systems... yet eventual consistency is often insufficient for many applications... and no production-ready storage systems offer highly available causal consistency.
  28. In this talk... show how to upgrade existing stores to provide HA causal consistency. Approach: bolt on a narrow shim layer to upgrade eventual consistency. Outcome: architecturally separate safety and liveness properties.
  34. Separation of Concerns. The shim handles consistency/visibility (consistency-related safety: mostly algorithmic, small code base). The underlying store handles messaging/propagation, durability/persistence, and failure detection/handling (liveness and replication: lots of engineering; reuse existing efforts!). Guarantee the same (useful) semantics across systems! This allows portability, modularity, and comparisons.
  37. Bolt-on Architecture: a bolt-on shim layer upgrades the semantics of an eventually consistent data store. Clients only communicate with the shim; the shim communicates with one of many different eventually consistent stores (generic). Treat the EC store as the “storage manager” of a distributed DBMS. For now, an extreme: an unmodified EC store.
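
  A minimal Java sketch of this architecture, under assumed names (EventuallyConsistentStore, BoltOnShim, and InMemoryEcStore are illustrative, not from the paper): clients call only the shim, and the shim talks to a generic key-value store through the narrowest possible interface.

      import java.util.Map;
      import java.util.Set;
      import java.util.concurrent.ConcurrentHashMap;

      // Any off-the-shelf eventually consistent KV store (Cassandra, Riak, ...):
      // the shim assumes nothing beyond get/put with no ordering guarantees.
      interface EventuallyConsistentStore {
          byte[] get(String key);              // may return stale data at any time
          void put(String key, byte[] value);  // last writer wins
      }

      // Trivial in-memory stand-in so these sketches can run anywhere.
      class InMemoryEcStore implements EventuallyConsistentStore {
          private final Map<String, byte[]> data = new ConcurrentHashMap<>();
          public byte[] get(String key) { return data.get(key); }
          public void put(String key, byte[] value) { data.put(key, value); }
      }

      // Clients talk only to the shim; the shim layers causal metadata and
      // visibility checks (sketched later) on top of the EC store, which keeps
      // responsibility for replication, durability, and failure handling.
      class BoltOnShim {
          private final EventuallyConsistentStore store;

          BoltOnShim(EventuallyConsistentStore store) { this.store = store; }

          // Write with explicit causality: the application names the keys this write
          // depends on. A fuller sketch would serialize a causal cut summary with the value.
          public void put(String key, byte[] value, Set<String> explicitDependencies) {
              store.put(key, value);
          }

          // Read path: a fuller sketch checks dependencies before revealing the value.
          public byte[] get(String key) {
              return store.get(key);
          }
      }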
  42. Bolt-on causal consistency

  53. What is Causal Consistency? Timeline: First Tweet, then Reply to Alex.
  57. What is Causal Consistency? Reads obey: 1) writes-follow-reads (“happens-before”), 2) program order, 3) transitivity [Lamport 1978]. Here, applications explicitly define happens-before for each write (“explicit causality”) [Ladin et al. 1990, cf. Bailis et al. 2012]. Example: First Tweet happens-before Reply to Alex.
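
  As a usage example of the hypothetical shim interface sketched earlier, explicit causality for the tweet example might look like this (key names are made up for illustration):

      import java.util.Set;

      public class ExplicitCausalityExample {
          public static void main(String[] args) {
              BoltOnShim shim = new BoltOnShim(new InMemoryEcStore());
              // The application declares happens-before explicitly: the reply depends on the tweet.
              shim.put("tweet:1", "First Tweet".getBytes(), Set.of());
              shim.put("tweet:2", "Reply to Alex".getBytes(), Set.of("tweet:1"));
              // Under causal consistency, no reader may observe tweet:2 without tweet:1 also being visible.
              System.out.println(new String(shim.get("tweet:2")));
          }
      }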
  59. First Tweet happens-before Reply to Alex (see https://dev.twitter.com/docs/api/1.1/post/statuses/update)

  72. (Animation: First Tweet, happens-before, Reply to Alex, replicated across DC1 and DC2.)

  76. Two tasks: 1) Representing order: how do we efficiently store causal ordering in the EC system? 2) Controlling order: how do we control the visibility of new updates in the EC system?
  80. Representing order. Strawman: use vector clocks [e.g., Bayou, Causal Memory]. Problem? Given a missing dependency (from the vector), what key should we check? If I have <3,1>, where is <2,1>? <1,1>? A write to the same key? A write to a different key? Which?
  91. Representing order. Strawman: use dependency pointers [e.g., Lazy Replication, COPS]. First Tweet: A @ timestamp 1092, dependencies = {}; Reply-to-Alex: B @ timestamp 1109, dependencies = {A@1092}. Problem? In the chain A@1→B@2→C@3, if B@2 is overwritten by B@7, the pointer from C@3 to B@2 no longer resolves: single pointers can be overwritten!
  96. Representing order. Strawman: vector clocks (don’t know what items to check). Strawman: dependency pointers (single pointers can be overwritten: “overwritten histories”). Strawman: use N^2 items for messaging (highly inefficient!).
  102. Representing order. Solution: store metadata about causal cuts. Short answer: a consistent cut applied to data items, not quite the transitive closure. Example: for A@1→B@2→C@3, the causal cut for C@3 is {B@2, A@1}. Given A@6→B@17→C@20 and A@10→B@12, the causal cut for C@20 is {B@17, A@10}.
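
  A small Java sketch of one way to represent such a causal cut summary (the per-key map and the max-merge rule are assumptions for illustration; the paper's exact representation may differ):

      import java.util.Map;
      import java.util.TreeMap;

      // A causal cut summary as a per-key map to the newest required version.
      // Merging with per-key max is what makes the cut "not quite the transitive
      // closure": a newer version of a key subsumes an older transitive dependency.
      class CausalCutSummary {
          final Map<String, Long> requiredVersion = new TreeMap<>();  // key -> required timestamp

          void addDependency(String key, long timestamp) {
              requiredVersion.merge(key, timestamp, Math::max);
          }

          void mergeFrom(CausalCutSummary other) {
              other.requiredVersion.forEach((k, ts) -> requiredVersion.merge(k, ts, Math::max));
          }

          @Override public String toString() { return requiredVersion.toString(); }

          // Worked example from the slides (on one reading): the writer of C@20 has
          // observed B@17 (which requires A@6) and, from the concurrent chain, B@12
          // (which requires A@10); per-key max yields {A=10, B=17}, not {A=6, B=17}.
          public static void main(String[] args) {
              CausalCutSummary cutForC3 = new CausalCutSummary();
              cutForC3.addDependency("B", 2);
              cutForC3.addDependency("A", 1);                    // carried over from B@2's own cut
              System.out.println("cut for C@3:  " + cutForC3);   // {A=1, B=2}

              CausalCutSummary cutOfB17 = new CausalCutSummary();
              cutOfB17.addDependency("A", 6);
              CausalCutSummary cutOfB12 = new CausalCutSummary();
              cutOfB12.addDependency("A", 10);

              CausalCutSummary cutForC20 = new CausalCutSummary();
              cutForC20.addDependency("B", 17);
              cutForC20.mergeFrom(cutOfB17);
              cutForC20.mergeFrom(cutOfB12);
              System.out.println("cut for C@20: " + cutForC20);  // {A=10, B=17}
          }
      }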
  105. Two tasks. 1) Representing order: the shim stores a causal cut summary along with every key (needed due to overwrites and “unreliable” delivery). 2) Controlling order: how do we control the visibility of new updates in the EC system?
  111. Controlling order. Standard technique: reveal new writes to readers only when their dependencies have been revealed, inductively guaranteeing that clients read from a causal cut. In bolt-on causal consistency there are two challenges: (1) each shim has to check dependencies manually, because the underlying store doesn’t notify clients of new writes; (2) the EC store may overwrite a “stable” cut, so clients need to cache the relevant cut to prevent overwrites.
  126. Each shim has to check dependencies manually, and the EC store may overwrite a “stable” cut. Walkthrough: the client issues read(B) to the shim; the shim issues read(B) to the EC store and receives B@1109, deps={A@1092}; the shim then issues read(A), receives A@1092, deps={}, and only then returns B@1109 to the client. Cache this value for A! The EC store might overwrite it with an “unresolved” write.
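
  A Java sketch of this read path (the read-time dependency resolution called Strategy 1 later in the deck). It is a simplification under assumed types, not the paper's code: a complete check would also resolve a newer dependency's own cut before trusting it.

      import java.util.Map;
      import java.util.concurrent.ConcurrentHashMap;

      // Shim read path: before revealing key K, check that every entry in K's
      // causal cut summary is readable, and cache resolved versions so a later
      // overwrite in the EC store cannot hide them again.
      class SyncResolvingShim {
          // A stored version: timestamp, value, and causal cut summary (key -> required timestamp).
          record Versioned(long timestamp, byte[] value, Map<String, Long> requiredVersion) {}

          interface EcStore { Versioned get(String key); }   // generic EC store; stale reads allowed

          private final EcStore store;
          private final Map<String, Versioned> resolvedCache = new ConcurrentHashMap<>();

          SyncResolvingShim(EcStore store) { this.store = store; }

          public byte[] get(String key) {
              Versioned candidate = store.get(key);
              if (candidate == null) return null;
              for (Map.Entry<String, Long> dep : candidate.requiredVersion().entrySet()) {
                  if (!isCovered(dep.getKey(), dep.getValue())) {
                      // Dependency not visible yet: fall back to the newest resolved version.
                      Versioned cached = resolvedCache.get(key);
                      return cached == null ? null : cached.value();
                  }
              }
              // All dependencies covered: cache this version so the local cut stays stable.
              cache(key, candidate);
              return candidate.value();
          }

          private boolean isCovered(String depKey, long requiredTs) {
              Versioned cached = resolvedCache.get(depKey);
              if (cached != null && cached.timestamp() >= requiredTs) return true;
              Versioned fetched = store.get(depKey);
              if (fetched != null && fetched.timestamp() >= requiredTs) {
                  cache(depKey, fetched);   // the EC store might later overwrite this dependency
                  return true;
              }
              return false;
          }

          private void cache(String key, Versioned v) {
              resolvedCache.merge(key, v, (old, nu) -> nu.timestamp() >= old.timestamp() ? nu : old);
          }
      }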
  128. Two tasks. 1) Representing order: the shim stores a causal cut summary along with every key (due to overwrites and “unreliable” delivery). 2) Controlling order: the shim performs dependency checks for the client and caches dependencies.
  131. Upgraded Cassandra to causal consistency: 322 lines of Java for core safety, plus custom serialization and client-side caching.
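
  A rough Java sketch of the kind of custom serialization such a shim needs when the underlying store only accepts opaque bytes (the field layout here is an assumption, not the paper's wire format): pack the write's timestamp, its causal cut summary, and the application value into one blob.

      import java.io.ByteArrayOutputStream;
      import java.io.DataOutputStream;
      import java.io.IOException;
      import java.nio.charset.StandardCharsets;
      import java.util.LinkedHashMap;
      import java.util.Map;

      // Serialize (timestamp, causal cut summary, value) into a single byte[]
      // suitable for storing as one opaque column value in the EC store.
      final class ShimSerializer {
          static byte[] pack(long timestamp, Map<String, Long> cutSummary, byte[] value) throws IOException {
              ByteArrayOutputStream bos = new ByteArrayOutputStream();
              DataOutputStream out = new DataOutputStream(bos);
              out.writeLong(timestamp);
              out.writeInt(cutSummary.size());
              for (Map.Entry<String, Long> dep : cutSummary.entrySet()) {
                  byte[] k = dep.getKey().getBytes(StandardCharsets.UTF_8);
                  out.writeInt(k.length);
                  out.write(k);
                  out.writeLong(dep.getValue());
              }
              out.writeInt(value.length);
              out.write(value);
              return bos.toByteArray();
          }

          public static void main(String[] args) throws IOException {
              Map<String, Long> cut = new LinkedHashMap<>();
              cut.put("tweet:1", 1092L);  // Reply to Alex depends on the first tweet
              byte[] blob = pack(1109L, cut, "Reply to Alex".getBytes(StandardCharsets.UTF_8));
              System.out.println("serialized metadata + value: " + blob.length + " bytes");
          }
      }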
  144. Per-dataset measurements:

       Dataset        Chain Length   Message Depth   Serialized Size (bytes)
       -- Median --
       Twitter              2              4                  169
       Flickr               3              5                  201
       Metafilter           6             18                  525
       TUAW                13              8                  275
       -- 99th percentile --
       Twitter             40            230                 5407
       Flickr              44            100                 2447
       Metafilter         170            870                19375
       TUAW                62            100                 2438

       Most chains are small; metadata is often < 1 KB. Power laws mean some chains are difficult.
  149. Strategy 1: resolve dependencies at read time. Often (but not always) within 40% of eventual; long chains hurt throughput. N.B. locality in the YCSB workload greatly helps read performance: dependencies (or replacements) are often cached (we used 100x the default number of keys, but a concurrent write is still likely to be in the cache).
  150. A thought... causal consistency trades visibility for safety. How far can we push this visibility?
  158. What if we serve entirely from cache and fetch new data asynchronously? Walkthrough: the client’s read(B) is answered immediately from the shim’s cache (“B from cache”), while the shim asynchronously issues read(B) and read(A) against the EC store (receiving B@1109, deps=... and A@1092, deps={}) to refresh the cache. EC store reads are async.
  160. A thought... causal consistency trades visibility for safety. How far can we push this visibility? What if we serve reads entirely from cache and fetch new data asynchronously? There is a continuous trade-off space between dependency resolution depth and the fast-path latency hit.
  163. Strategy 2: fetch dependencies asynchronously. Throughput exceeds the eventual configuration; still causally consistent, but with more stale reads.
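
  A Java sketch of this strategy under the same assumed types as the earlier read-path sketch: reads are served from the local resolved cache, and a background task admits a newer version only once its causal cut is covered, so reads stay causally consistent but may be stale.

      import java.util.Map;
      import java.util.concurrent.ConcurrentHashMap;
      import java.util.concurrent.ExecutorService;
      import java.util.concurrent.Executors;

      // Strategy 2 sketch: answer reads from the resolved cache immediately and
      // refresh the cache asynchronously, installing a newer version only once
      // all of its dependencies are already resolved locally.
      class AsyncFetchingShim {
          record Versioned(long timestamp, byte[] value, Map<String, Long> requiredVersion) {}
          interface EcStore { Versioned get(String key); }

          private final EcStore store;
          private final Map<String, Versioned> resolvedCache = new ConcurrentHashMap<>();
          private final ExecutorService fetcher = Executors.newSingleThreadExecutor();

          AsyncFetchingShim(EcStore store) { this.store = store; }

          public byte[] get(String key) {
              Versioned cached = resolvedCache.get(key);   // fast path: already causally safe
              fetcher.submit(() -> refresh(key));          // background path: pull in newer data
              return cached == null ? null : cached.value();
          }

          private void refresh(String key) {
              Versioned candidate = store.get(key);
              if (candidate == null) return;
              for (Map.Entry<String, Long> dep : candidate.requiredVersion().entrySet()) {
                  Versioned have = resolvedCache.get(dep.getKey());
                  if (have == null || have.timestamp() < dep.getValue()) {
                      fetcher.submit(() -> refresh(dep.getKey()));  // chase the missing dependency
                      return;  // a fuller version would retry this key once the dependency lands
                  }
              }
              resolvedCache.merge(key, candidate,
                      (old, nu) -> nu.timestamp() >= old.timestamp() ? nu : old);
          }
      }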
  166. Sync reads vs. async reads: reading from cache is fast (a linear speedup)... but it is not reading the most recent data... in this case, effectively a straw-man.
  168. Lessons. Causal consistency is achievable without modifications to existing stores, and works well for workloads with small causal histories and good temporal locality. Represent and control ordering between updates; EC is “orderless” until convergence; there is a trade-off between visibility and ordering.
  169. Rethinking the EC API. Uncontrolled overwrites increased metadata and local storage requirements; clients had to check causal dependencies independently, with no aid from the EC store.
  171. Rethinking the EC API. What if we eliminated overwrites (via multi-versioning, conditional updates, or immutability)? No more overwritten histories and less metadata, but we still have to check for dependency arrivals.
  182. Rethinking the EC API. What if the EC store notified us when dependencies converged (arrived everywhere)? Wait to place writes in the shared EC store until their dependencies have converged: no need for metadata, no need for additional checks; ensure durability with client-local EC storage.
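
  A purely hypothetical Java sketch of what such an API could look like (no current store exposes it; the names onConverged and putAfter are invented for illustration): the shim buffers the write in client-local storage for durability and publishes it to the shared store only once its dependency has converged.

      import java.util.Map;
      import java.util.concurrent.ConcurrentHashMap;

      // Imagined EC store API: invoke a callback once key@timestamp has converged
      // (arrived at all replicas). With this primitive, the shim needs no causal
      // metadata and no read-time dependency checks.
      interface ConvergenceNotifyingStore {
          void put(String key, byte[] value);
          void onConverged(String key, long timestamp, Runnable onStable);
      }

      class DeferredPublishShim {
          private final ConvergenceNotifyingStore shared;                            // shared EC store
          private final Map<String, byte[]> localDurable = new ConcurrentHashMap<>(); // client-local EC storage

          DeferredPublishShim(ConvergenceNotifyingStore shared) { this.shared = shared; }

          // Publish `key` only after its dependency has converged; buffer locally until then.
          public void putAfter(String key, byte[] value, String depKey, long depTimestamp) {
              localDurable.put(key, value);  // ensure durability before deferring the shared write
              shared.onConverged(depKey, depTimestamp, () -> shared.put(key, value));
          }
      }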
  186. Property                 Multi-versioning or Conditional Update   Stable Callback
       Reduces metadata                         YES                           YES
       No dependency checks                     NO                            YES

       Data Store               Multi-versioning or Conditional Update   Stable Callback
       Amazon DynamoDB                          YES                           NO
       Amazon S3                                NO                            NO
       Amazon SimpleDB                          YES                           NO
       Amazon Dynamo                            YES                           NO
       Cloudant Data Layer                      YES                           NO
       Google App Engine                        YES                           NO
       Apache Cassandra                         NO                            NO
       Apache CouchDB                           YES                           NO
       Basho Riak                               YES                           NO
       LinkedIn Voldemort                       YES                           NO
       MongoDB                                  YES                           NO
       Yahoo! PNUTS                             YES                           NO

       ...not (yet) common to all stores
  187. Rethinking the EC API. Our extreme approach (an unmodified EC store) definitely impeded efficiency, but it is portable. There are opportunities to better define surgical improvements to the API for future stores and shims!
  189. Bolt-on Causal Consistency: a modular, “bolt-on” architecture cleanly separates safety and liveness; we upgraded EC (all liveness) to causal consistency while preserving HA, low latency, and liveness. Challenges: overwrites, managing causal order. Large design space: we took an extreme here, but there is room for exploration in the EC API. Bolt-on transactions?
  190. (Some) Related Work
    • S3 DB [SIGMOD 2008]: foundational prior work building on EC stores; not causally consistent, not HA (e.g., RYW implementation), AWS-dependent (e.g., assumes queues)
    • 28msec architecture [SIGMOD Record 2009]: like SIGMOD 2008, treats EC stores as cheap storage
    • Cloudy [VLDB 2010]: layered approach to data management, partitioning, load balancing, and messaging in middleware; larger focus: extensible query model, storage format, routing, etc.
    • G-Store [SoCC 2010]: provides a client and middleware implementation of entity-grouped linearizable transaction support
    • Bermbach et al. middleware [IC2E 2013]: provides read-your-writes guarantees with caching
    • Causal consistency: Bayou [SOSP 1997], Lazy Replication [TOCS 1992], COPS [SOSP 2011], Eiger [NSDI 2013], ChainReaction [EuroSys 2013], Swift [INRIA] are all custom solutions for causal memory [Ga Tech 1993] (inspired by Lamport [CACM 1978])