Slide 1

Slide 1 text

Chain Replication Deniz Altinbuken Cornell University

Slide 2

Slide 2 text

Chain Replication Robbert van Renesse and Fred B. Schneider. 2004. Chain replication for supporting high throughput and availability. In Proceedings of OSDI '04.

Slide 3

Slide 3 text

replication
failure models • fail-stop failure model • crash failure model • byzantine failure model
replication techniques • quorum replication • stake replication • broker replication • primary-backup replication • state machine replication • chain replication • etc.
consistency models • strong consistency • sequential consistency • eventual consistency • causal consistency • read-your-writes consistency • monotonic read consistency • etc.

Slide 4

Slide 4 text

replication
failure models • fail-stop failure model • crash failure model • byzantine failure model
replication techniques • quorum replication • stake replication • broker replication • primary-backup replication • state machine replication • chain replication • etc.
consistency models • strong consistency • sequential consistency • eventual consistency • causal consistency • read-your-writes consistency • monotonic read consistency • etc.

Slide 5

Slide 5 text

primary-backup replication client R1 R2 R3 Rprimary update

Slide 6

Slide 6 text

primary-backup replication client R1 R2 R3 Rprimary update

Slide 7

Slide 7 text

primary-backup replication client update reply R1 R2 R3 Rprimary

Slide 8

Slide 8 text

primary-backup replication client R1 R2 R3 Rprimary query

Slide 9

Slide 9 text

primary-backup replication client R1 R2 R3 Rprimary query reply

Slide 10

Slide 10 text

R2 R3 Rtail Rhead update chain replication client

Slide 11

Slide 11 text

R2 R3 Rtail Rhead update chain replication client

Slide 12

Slide 12 text

R2 R3 Rtail Rhead reply chain replication client update

Slide 13

Slide 13 text

query R2 R3 Rtail Rhead chain replication client client

Slide 14

Slide 14 text

query R2 R3 Rtail Rhead chain replication client reply client

Slide 15

Slide 15 text

R2 R3 Rtail Rhead chain replication client reply update query

Slide 16

Slide 16 text

primary-backup replication client update reply R1 R2 R3 Rprimary 1 2 3 4

Slide 17

Slide 17 text

R2 R3 Rtail Rhead chain replication client reply update 1 5 2 3 4 Higher latency!
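The step numbers on the two diagrams can be turned into a small calculation. The sketch below is not from the talk: the function names are hypothetical, and it assumes that the primary contacts all backups in parallel and that every message costs one delay.

```python
def primary_backup_update_delays() -> int:
    """Sequential message delays for one update under primary-backup.

    Assumed steps: client -> primary, primary -> backups (in parallel),
    backups -> primary, primary -> client.
    """
    return 4


def chain_update_delays(chain_length: int) -> int:
    """Sequential message delays for one update under chain replication.

    Steps: client -> head, one hop per link from head to tail,
    tail -> client.
    """
    if chain_length < 1:
        raise ValueError("a chain needs at least one node")
    return 1 + (chain_length - 1) + 1


if __name__ == "__main__":
    print(primary_backup_update_delays())  # 4
    print(chain_update_delays(4))          # 5
```

With the four-node chain on the slide an update takes five delays, and each extra replica adds one more, while the primary-backup count stays at four under the parallel assumption.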

Slide 18

Slide 18 text

primary-backup replication client query reply R1 R2 R3 Rprimary Primary has to make sure that all updates prior to this query are done!

Slide 19

Slide 19 text

R2 R3 Rtail Rhead chain replication client reply query Tail can respond directly!

Slide 20

Slide 20 text

R2 R3 Rtail Rhead chain replication client reply query Higher throughput! Tail can respond directly!

Slide 21

Slide 21 text

related work
• Sage A. Weil, Andrew W. Leung, Scott A. Brandt, and Carlos Maltzahn. 2007. RADOS: a scalable, reliable storage service for petabyte-scale storage clusters. In Proceedings of the 2nd international workshop on Petascale data storage: held in conjunction with Supercomputing '07 (PDSW '07). ACM, New York, NY, USA, 35-44.
• Jeff Terrace and Michael J. Freedman. 2009. Object storage on CRAQ: high-throughput chain replication for read-mostly workloads. In Proceedings of the 2009 conference on USENIX Annual technical conference (USENIX'09). USENIX Association, Berkeley, CA, USA, 11-11.
• David G. Andersen, Jason Franklin, Michael Kaminsky, Amar Phanishayee, Lawrence Tan, and Vijay Vasudevan. 2009. FAWN: a fast array of wimpy nodes. In Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles (SOSP '09). ACM, New York, NY, USA, 1-14.
• Scott Lystig Fritchie. 2010. Chain replication in theory and in practice. In Proceedings of the 9th ACM SIGPLAN workshop on Erlang (Erlang '10). ACM, New York, NY, USA, 33-44.
• Wyatt Lloyd, Michael J. Freedman, Michael Kaminsky, and David G. Andersen. 2011. Don't settle for eventual: scalable causal consistency for wide-area storage with COPS. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles (SOSP '11). ACM, New York, NY, USA, 401-416.
• Mahesh Balakrishnan, Dahlia Malkhi, Vijayan Prabhakaran, Ted Wobber, Michael Wei, and John D. Davis. 2012. CORFU: a shared log design for flash clusters. In Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation (NSDI'12). USENIX Association, Berkeley, CA, USA, 1-1.
• Guy Laden, Roie Melamed, and Ymir Vigfusson. 2012. Adaptive and dynamic funnel replication in clouds. SIGOPS Oper. Syst. Rev. 46, 1 (February 2012), 40-46.
• Sérgio Almeida, João Leitão, and Luís Rodrigues. 2013. ChainReaction: a causal+ consistent datastore based on chain replication. In Proceedings of the 8th ACM European Conference on Computer Systems (EuroSys '13). ACM, New York, NY, USA, 85-98.
• Hussam Abu-Libdeh, Robbert van Renesse, and Ymir Vigfusson. 2013. Leveraging sharding in the design of scalable replication protocols. In Proceedings of the 4th annual Symposium on Cloud Computing (SOCC '13). ACM, New York, NY …

Slide 22

Slide 22 text

chain replication limitations • tail is a bottleneck for queries. • CRAQ: read from “clean” nodes. • supports only strong consistency. • CRAQ: eventual consistency • Chain Reaction: causal consistency • requires a master to reconfigure.

Slide 23

Slide 23 text

motivation • explain why suggested improvements work. • find further improvements. • make reconfiguration easier and cleaner. • create complete specifications. • prove chain replication works.

Slide 24

Slide 24 text

outline •updates •queries •failures •reconfiguration •various consistency models

Slide 25

Slide 25 text

updates

Slide 26

Slide 26 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History

Slide 27

Slide 27 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History R2 is the predecessor of R3 R3 is the successor of R2

Slide 28

Slide 28 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History

Slide 29

Slide 29 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History update

Slide 30

Slide 30 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History

Slide 31

Slide 31 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History

Slide 32

Slide 32 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History propagation message

Slide 33

Slide 33 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History

Slide 34

Slide 34 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History

Slide 35

Slide 35 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History

Slide 36

Slide 36 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History

Slide 37

Slide 37 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History

Slide 38

Slide 38 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History

Slide 39

Slide 39 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History reply

Slide 40

Slide 40 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History reply

Slide 41

Slide 41 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History

Slide 42

Slide 42 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History acknowledgment message

Slide 43

Slide 43 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History

Slide 44

Slide 44 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History

Slide 45

Slide 45 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History

Slide 46

Slide 46 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History

Slide 47

Slide 47 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History

Slide 48

Slide 48 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History update update update Stable History Stable History Stable History Stable History reply reply

Slide 49

Slide 49 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History update update update Stable History Stable History Stable History Stable History reply reply Multiple updates are handled simultaneously.
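The update walkthrough on the preceding slides (propagation messages travel head to tail, the tail replies, acknowledgment messages travel tail to head) can be sketched as a toy simulation. Everything here is an illustrative assumption rather than the talk's implementation: the names `Node`, `Chain`, `submit` and `step` are hypothetical, messages are delivered in synchronous rounds, and no node fails.

```python
from collections import deque


class Node:
    def __init__(self, name):
        self.name = name
        self.speculative = []  # updates seen but not yet acknowledged
        self.stable = []       # updates known to have reached the tail


class Chain:
    def __init__(self, names):
        self.nodes = [Node(n) for n in names]   # ordered head to tail
        self.in_flight = deque()                # (kind, node index, update)
        self.replies = []                       # replies sent by the tail

    def submit(self, update):
        """A client sends an update to the head."""
        self.in_flight.append(("propagate", 0, update))

    def step(self):
        """Deliver every message that is currently in flight, once."""
        last = len(self.nodes) - 1
        for _ in range(len(self.in_flight)):
            kind, i, update = self.in_flight.popleft()
            node = self.nodes[i]
            if kind == "propagate":
                node.speculative.append(update)
                if i < last:
                    # Forward the update to the successor.
                    self.in_flight.append(("propagate", i + 1, update))
                else:
                    # The tail stabilizes the update and replies.
                    node.stable.append(update)
                    self.replies.append(update)
                    if i > 0:
                        self.in_flight.append(("ack", i - 1, update))
            else:
                # Acknowledgment: stabilize and pass it to the predecessor.
                node.stable.append(update)
                if i > 0:
                    self.in_flight.append(("ack", i - 1, update))
```

Submitting a second update before the first has reached the tail leaves both in flight at different nodes, which is the pipelining the slide points at.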

Slide 50

Slide 50 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History update update update reply reply R2 R3 Rtail Rhead

Slide 51

Slide 51 text

R2 R3 Rtail Speculative History ⊆ ⊇ ⊇ ⊆ ⊆ ⊆ ⊆ Stable History Speculative History Speculative History ⊇ Speculative History ⊆ Stable History ⊆ Stable History Stable History Rhead ⊆

Slide 52

Slide 52 text

⊆ Stable History Stable History Speculative History Speculative History ⊇ Speculative History ⊆ Stable History R2 R3 R2 R3 R2 R2 The speculative history of a node’s successor is a subset of that node’s speculative history. The speculative history of a node is a superset of its stable history. The stable history of a node’s successor is a superset of that node’s stable history.
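The three relations can be checked mechanically. The sketch below is an illustration under assumptions that are not in the slides: histories are modeled as ordered lists, so "subset" is checked as the stronger prefix relation, and the helper names `is_prefix` and `check_chain_invariants` are hypothetical.

```python
def is_prefix(shorter, longer):
    """True if `shorter` is an initial segment of `longer`."""
    shorter, longer = list(shorter), list(longer)
    return len(shorter) <= len(longer) and longer[:len(shorter)] == shorter


def check_chain_invariants(chain):
    """Check the history invariants on a chain.

    `chain` is a list of (speculative, stable) pairs ordered head to tail.
    """
    # A node's stable history is contained in its speculative history.
    for speculative, stable in chain:
        if not is_prefix(stable, speculative):
            return False
    for (spec, stable), (next_spec, next_stable) in zip(chain, chain[1:]):
        # The successor's speculative history is contained in the node's.
        if not is_prefix(next_spec, spec):
            return False
        # The node's stable history is contained in the successor's.
        if not is_prefix(stable, next_stable):
            return False
    return True
```

For example, a head holding speculative `["a", "b", "c"]` and stable `["a"]`, followed by a tail holding `["a", "b"]` for both, satisfies all three relations.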

Slide 53

Slide 53 text

queries

Slide 54

Slide 54 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History R2 R3 Rtail Rhead

Slide 55

Slide 55 text

Speculative History Speculative History Speculative History Speculative History query Stable History Stable History Stable History Stable History R2 R3 Rtail Rhead

Slide 56

Slide 56 text

Speculative History Speculative History Speculative History Speculative History query Stable History Stable History Stable History Stable History reply R2 R3 Rtail Rhead

Slide 57

Slide 57 text

Speculative History Speculative History Speculative History Speculative History reply query Stable History Stable History Stable History Stable History R2 R3 Rtail Rhead

Slide 58

Slide 58 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History update update update Stable History Stable History Stable History Stable History reply reply query reply

Slide 59

Slide 59 text

R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History update update update Stable History Stable History Stable History Stable History reply reply query reply The tail is the point of linearization!

Slide 60

Slide 60 text

failures

Slide 61

Slide 61 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History head failure R2 R3 Rtail Rhead

Slide 62

Slide 62 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History head failure R2 R3 Rtail Rhead

Slide 63

Slide 63 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History head failure R2 R3 Rtail Rhead update

Slide 64

Slide 64 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History head failure R2 R3 Rtail Rhead

Slide 65

Slide 65 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History head failure R2 R3 Rtail Rhead

Slide 66

Slide 66 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History middle node failure R2 R3 Rtail Rhead

Slide 67

Slide 67 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History middle node failure R2 R3 Rtail Rhead

Slide 68

Slide 68 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History middle node failure R2 R3 Rtail Rhead

Slide 69

Slide 69 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History middle node failure R2 R3 Rtail Rhead

Slide 70

Slide 70 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History middle node failure R2 R3 Rtail Rhead

Slide 71

Slide 71 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History tail failure R2 R3 Rtail Rhead

Slide 72

Slide 72 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History tail failure R2 R3 Rtail Rhead

Slide 73

Slide 73 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History tail failure R2 R3 Rtail Rhead

Slide 74

Slide 74 text

reconfiguration

Slide 75

Slide 75 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History adding a new node Rnew Speculative History Stable History tail R2 R3 Rtail Rhead

Slide 76

Slide 76 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History adding a new node Rnew Speculative History Stable History tail • new nodes are added to the chain with special configuration updates that are added to the history: add(nodeid) • by looking at the order of these updates, a node can determine the configuration of the chain add( ) new tail R2 R3 Rtail Rhead

Slide 77

Slide 77 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History adding a new node Rnew Speculative History Stable History tail • new nodes are added to the chain with special configuration updates that are added to the history: add(nodeid) • by looking at the order of these updates, a node can determine the configuration of the chain add( ) new tail R2 R3 Rtail Rhead
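A node can recover the chain membership by scanning its history for configuration updates. The following is a minimal sketch: the tuple encoding `("add", nodeid)` and the function name `chain_configuration` are assumptions made for illustration, and each added node is taken to join as the new tail, as the diagram suggests.

```python
def chain_configuration(history, initial_chain):
    """Replay add(nodeid) updates to obtain the chain, ordered head to tail.

    `history` is a list of updates; configuration updates are assumed to be
    encoded as ("add", nodeid) tuples and are applied in history order.
    """
    chain = list(initial_chain)
    for entry in history:
        if isinstance(entry, tuple) and len(entry) == 2 and entry[0] == "add":
            chain.append(entry[1])  # the added node becomes the new tail
    return chain
```

Because every node sees the same updates in the same order, two nodes that have applied the same history derive the same configuration, and the current tail is simply the last element of the result.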

Slide 78

Slide 78 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History adding a new node Rnew Speculative History Stable History tail add( ) new tail R2 R3 Rtail Rhead

Slide 79

Slide 79 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History adding a new node Speculative History Stable History • stable history of new tail should be a superset of the stable history of tail. • speculative history of new tail should be a superset of its stable history. • speculative and stable histories of new tail should be equal to the speculative history of tail • old tail should not answer to queries when the new tail should. add( ) new tail Rnew tail R2 R3 Rtail Rhead ⊆

Slide 80

Slide 80 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History adding a new node Speculative History Stable History • stable history of new tail should be a superset of the stable history of tail. • speculative history of new tail should be a superset of its stable history. • speculative and stable histories of new tail should be equal to the speculative history of tail • old tail should not answer to queries when the new tail should. add( ) new tail Rnew tail R2 R3 Rtail Rhead ⊆ ⊇

Slide 81

Slide 81 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History adding a new node Speculative History Stable History • stable history of new tail should be a superset of the stable history of tail. • speculative history of new tail should be a superset of its stable history. • speculative and stable histories of new tail should be equal to the speculative history of tail • old tail should not answer to queries when the new tail should. add( ) new tail Rnew tail R2 R3 Rtail Rhead ⊆ ⊇ ⊇

Slide 82

Slide 82 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History adding a new node Speculative History Stable History • stable history of new tail should be a superset of the stable history of tail. • speculative history of new tail should be a superset of its stable history. • speculative and stable histories of new tail should be equal to the speculative history of tail • old tail should not answer to queries when the new tail should. add( ) new tail Rnew tail R2 R3 Rtail Rhead = = ⊆

Slide 83

Slide 83 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History adding a new node Speculative History Stable History • stable history of new tail should be a superset of the stable history of tail. • speculative history of new tail should be a superset of its stable history. • speculative and stable histories of new tail should be equal to the speculative history of tail • old tail should not answer to queries when the new tail should. add( ) new tail Rnew tail R2 R3 Rtail Rhead
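The four conditions on the new tail can be sketched as a single hand-over step. This is an illustration only: the dictionary layout, the `answers_queries` flag and the function name `hand_over_tail` are hypothetical, and the sketch ignores how the histories are actually transferred between machines.

```python
def hand_over_tail(old_tail):
    """Initialize a new tail from the old tail's state.

    `old_tail` is a dict with "speculative", "stable" and "answers_queries".
    The new tail starts with both histories equal to the old tail's
    speculative history, and the old tail stops answering queries.
    """
    new_tail = {
        "speculative": list(old_tail["speculative"]),
        "stable": list(old_tail["speculative"]),
        "answers_queries": True,
    }
    old_tail["answers_queries"] = False
    return new_tail
```

Copying the old tail's speculative history into both histories satisfies the superset conditions at once, since the old tail's stable history is contained in its speculative history.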

Slide 84

Slide 84 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History adding a new node Speculative History Stable History add( ) new tail Rnew tail R2 R3 Rtail Rhead

Slide 85

Slide 85 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History adding a new node Speculative History Stable History add( ) new tail Rnew tail R2 R3 Rtail Rhead

Slide 86

Slide 86 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History adding a new node Speculative History Stable History add( ) new tail Rnew tail R2 R3 Rtail Rhead

Slide 87

Slide 87 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History adding a new node Speculative History Stable History add( ) new tail reply Rnew tail R2 R3 Rtail Rhead

Slide 88

Slide 88 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History adding a new node Speculative History Stable History add( ) new tail Rnew tail R2 R3 Rtail Rhead

Slide 89

Slide 89 text

Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History adding a new node Speculative History Stable History add( ) new tail Rnew tail R2 R3 Rtail Rhead

Slide 90

Slide 90 text

various consistency models

Slide 91

Slide 91 text

strong consistency after an update completes, any subsequent query by any client will return the updated value.

Slide 92

Slide 92 text

strong consistency • tail can reply to queries. • nodes that have their speculative and stable histories equal to each other can reply to queries. (clean vs dirty nodes at CRAQ) • a node can record the speculative history when it received a query and reply to the client when its stable history becomes equal to it. R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History reply query Stable History Stable History Stable History Stable History after an update completes, any subsequent query by any client will return the updated value.

Slide 93

Slide 93 text

strong consistency • tail can reply to queries. • any node can record its speculative history when it received a query and reply to the client when its stable history becomes equal to it. • nodes that have their speculative and stable histories equal to each other can reply to queries. (clean vs dirty nodes at CRAQ) R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History query after an update completes, any subsequent query by any client will return the updated value.

Slide 94

Slide 94 text

• tail can reply to queries. • any node can record its speculative history when it received a query and reply to the client when its stable history becomes equal to it. • nodes that have their speculative and stable histories equal to each other can reply to queries. (clean vs dirty nodes at CRAQ) strong consistency R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History query after an update completes, any subsequent query by any client will return the updated value.

Slide 95

Slide 95 text

strong consistency • tail can reply to queries. • any node can record its speculative history when it received a query and reply to the client when its stable history becomes equal to it. • nodes that have their speculative and stable histories equal to each other can reply to queries. (clean vs dirty nodes at CRAQ) R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History query after an update completes, any subsequent query by any client will return the updated value.

Slide 96

Slide 96 text

strong consistency • tail can reply to queries. • any node can record its speculative history when it received a query and reply to the client when its stable history becomes equal to it. • nodes that have their speculative and stable histories equal to each other can reply to queries. (clean vs dirty nodes at CRAQ) R2 Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History reply query R3 Rtail after an update completes, any subsequent query by any client will return the updated value.

Slide 97

Slide 97 text

strong consistency • tail can reply to queries. • any node can record its speculative history when it received a query and reply to the client when its stable history becomes equal to it. • nodes that have their speculative and stable histories equal to each other can reply to queries. (clean vs dirty nodes at CRAQ) R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History reply query after an update completes, any subsequent query by any client will return the updated value.
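The second and third bullets describe how a node other than the tail can answer a query without giving up strong consistency. A minimal sketch follows, assuming a hypothetical `Replica` class whose `on_ack` method is called once per acknowledgment message; the reply is recorded in a list instead of being sent over a network.

```python
class Replica:
    def __init__(self):
        self.speculative = []
        self.stable = []
        self.pending = []  # (client, recorded speculative history)
        self.sent = []     # (client, history) replies, in send order

    def on_query(self, client):
        if self.stable == self.speculative:
            # Clean node: reply immediately.
            self.sent.append((client, list(self.stable)))
        else:
            # Dirty node: remember what was speculative at query time.
            self.pending.append((client, list(self.speculative)))

    def on_ack(self, update):
        """An acknowledgment stabilizes one more update."""
        self.stable.append(update)
        still_waiting = []
        for client, recorded in self.pending:
            if self.stable[:len(recorded)] == recorded:
                # The stable history has caught up with the recorded one.
                self.sent.append((client, list(recorded)))
            else:
                still_waiting.append((client, recorded))
        self.pending = still_waiting
```

A query that arrives at a dirty node is therefore delayed, not rejected, which trades some latency at that node for taking load off the tail.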

Slide 98

Slide 98 text

sequential consistency queries might return stale values, as long as they are not reordered.

Slide 99

Slide 99 text

sequential consistency queries might return stale values, as long as they are not reordered. • any node can reply to query messages with their stable history. • the stable history of any node is a prefix of history at tail. R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History reply query Stable History Stable History Stable History Stable History reply query

Slide 100

Slide 100 text

eventual consistency if no new updates are made, eventually all queries will return a history including that last update.

Slide 101

Slide 101 text

eventual consistency • any node can reply to query messages with their speculative history • the speculative history includes the history at tail and a sequence of updates that have been invoked but not yet stabilized (used in CRAQ) R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History reply query Stable History Stable History Stable History Stable History reply query if no new updates are made, eventually all queries will return a history including that last update.
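The difference between this rule and the sequential-consistency rule on the earlier slide comes down to which history a node returns. The sketch below is illustrative; the function name `answer_query` and the string labels for the models are assumptions.

```python
def answer_query(speculative, stable, model):
    """Return the history a non-tail node replies with under each model."""
    if model == "sequential":
        # Possibly stale, but always a prefix of the history at the tail.
        return list(stable)
    if model == "eventual":
        # Includes updates that were invoked but are not yet stabilized.
        return list(speculative)
    raise ValueError(f"unknown consistency model: {model}")
```

With speculative `["a", "b", "c"]` and stable `["a"]` at some node, and `["a", "b"]` at the tail, the sequential reply `["a"]` is a prefix of the tail's history, while the eventual reply `["a", "b", "c"]` extends it.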

Slide 102

Slide 102 text

causal consistency if client A has communicated to client B that it has completed an update, a subsequent query by client B will return that completed update.

Slide 103

Slide 103 text

causal consistency • requires modeling communication between clients • if a client receives a query reply from a node, same client can only read from this node’s predecessors until all updates in the reply are stabilized (used in Chain Reaction) R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History query1 reply if client A has communicated to client B that it has completed an update, a subsequent query by client B will return that completed update.

Slide 104

Slide 104 text

causal consistency • requires modeling communication between clients • if a client receives a query reply from a node, same client can only read from this node’s predecessors until all updates in the reply are stabilized (used in Chain Reaction) R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History reply query2 Stable History Stable History Stable History Stable History query1 reply if client A has communicated to client B that it has completed an update, a subsequent query by client B will return that completed update.
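The read restriction in the second bullet can be sketched as a client-side rule for choosing the next node to query. Several assumptions are made here and are not in the slides: nodes reply with their speculative histories, an update counts as stabilized once it is in the tail's stable history, the node that served the last reply stays readable alongside its predecessors, and the function name `readable_nodes` is hypothetical.

```python
def readable_nodes(chain, last_read_index, last_reply, tail_stable):
    """Return the nodes a client may query next.

    `chain` lists node names head to tail, `last_read_index` is the position
    of the node that served `last_reply`, and `tail_stable` is the tail's
    stable history.
    """
    last_reply = list(last_reply)
    stabilized = list(tail_stable)[:len(last_reply)] == last_reply
    if stabilized:
        # Every node already holds the updates the client has seen.
        return list(chain)
    # Otherwise stay at or before the node that served the last reply
    # (including that node itself is an assumption of this sketch).
    return list(chain[:last_read_index + 1])
```

Nodes closer to the head hold larger speculative histories, so moving toward the head can never hide an update the client has already observed.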

Slide 105

Slide 105 text

read-your-writes consistency if a client’s update completes, that client will never see an older version of the history. this is a special case of the causal consistency model.

Slide 106

Slide 106 text

read-your-writes consistency • requires modeling client-side. • on the client-side, the proxy should ensure that a history returned by a query includes all updates that have been completed. • the proxy keeps track of updates that are invoked and completed. R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History query reply if a client’s update completes, that client will never see an older version of the history. this is a special case of the causal consistency model.
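The proxy's bookkeeping can be sketched in a few lines. The class and method names below are hypothetical, updates are identified by hashable ids, and the sketch only decides whether a returned history is acceptable; what the proxy does on rejection (retry, or query another node) is left open.

```python
class ReadYourWritesProxy:
    def __init__(self):
        self.invoked = set()    # updates sent but not yet completed
        self.completed = set()  # updates whose reply has been received

    def update_invoked(self, update_id):
        self.invoked.add(update_id)

    def update_completed(self, update_id):
        self.invoked.discard(update_id)
        self.completed.add(update_id)

    def accepts(self, history):
        """True if the history contains every completed update."""
        return self.completed.issubset(history)
```

A history that is missing a completed update would show the client an older state than one it has already written, so the proxy must not hand it to the application.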

Slide 107

Slide 107 text

monotonic read consistency if a client has issued a query and received h as a response, all following queries will receive a response with a history that has h as a prefix.

Slide 108

Slide 108 text

monotonic read consistency if a client has seen a particular update, any subsequent queries will never return any previous state.

Slide 109

Slide 109 text

monotonic read consistency • any given client only queries a single node (used in CRAQ) R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History query reply if a client has seen a particular update, any subsequent queries will never return any previous state.

Slide 110

Slide 110 text

monotonic read consistency • requires modeling client-side. • on the client-side, the proxy should ensure that a history returned by a query always has the histories returned by previous queries as a prefix. • the proxy keeps track of queries that are invoked and completed. R2 R3 Rtail Rhead Speculative History Speculative History Speculative History Speculative History Stable History Stable History Stable History Stable History query reply if a client has seen a particular update, any subsequent queries will never return any previous state.
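A proxy that enforces monotonic reads only needs to remember the last history it returned. The sketch below is an illustration with hypothetical names; it treats histories as ordered lists and accepts a new history only if the previously returned one is a prefix of it, matching the definition given with response h earlier in this section.

```python
class MonotonicReadProxy:
    def __init__(self):
        self.last_seen = []  # the most recent history returned to the client

    def accepts(self, history):
        """Accept a history only if it extends the last one returned."""
        history = list(history)
        if history[:len(self.last_seen)] != self.last_seen:
            return False  # would move the client back to an earlier state
        self.last_seen = history
        return True
```

Pinning each client to a single node, as in the CRAQ variant on the previous slide, gives the same guarantee without the check, because one node's history only grows.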

Slide 111

Slide 111 text

R2 R3 Rtail Rhead chain replication client reply update query

Slide 112

Slide 112 text

chain replication update reply client query

Slide 113

Slide 113 text

objective • A linearizable data store replicated with chain replication should look like a centralized data store to all clients. • A centralized data store has a single history. • Make sure the data store replicated with chain replication looks like it has a single history. • Prove it :) • Write the specification for the centralized data store. • Write the specification for the replicated data store. • Show the replicated specification refines the centralized specification.

Slide 114

Slide 114 text

conclusion • we have created a formal end-to-end specification of chain replication • through this specification we can reason about how chain replication works • chain replication is easy to understand and implement • it can support different consistency models • reconfiguration can be done without requiring a master

Slide 115

Slide 115 text

TODO • open-source chain replication implementations • java and python in progress • chain replication wikipedia page :) website: http://www.cs.cornell.edu/~deniz e-mail: [email protected] denizalti