Consistency Types for Replicated Data in a Higher-order Distributed Programming Language

Philipp Haller

March 15, 2023

Transcript

  1. Consistency Types for Replicated Data in a Higher-order Distributed Programming Language

    Xin Zhao and Philipp Haller
    KTH Royal Institute of Technology, Stockholm, Sweden
    International Conference on the Art, Science, and Engineering of Programming (‹Programming› 2023)
    Tokyo, Japan & Online, March 13-17, 2023
  2. Context: Scalable Distributed Services

    • Distributed applications
      – providing scalable services
      – that run on virtualized cloud infrastructures
      – in one or more datacenters
    • Examples: e-commerce platforms, communication services, social media platforms, game servers, etc.

    Support many concurrent clients!
  3. Fault-tolerant Distributed Systems

    Basic problem: multiple clients access shared distributed data concurrently

    [Diagram: Client 1, Client 2, and Client 3 issue read and write operations on a single Data node]

    Safety goal: Never return an incorrect result under non-Byzantine failure conditions, including
    • network delays, partitions, and
    • packet loss, duplication, and reordering
  4. Solution: Replication, Consensus

    Replicate data, distributed consensus ensures safety

    [Diagram: Client 1, Client 2, and Client 3 issue read and write operations on three Data replicas]

    Safety goal: Never return an incorrect result under non-Byzantine failure conditions, including
    • network delays, partitions, and
    • packet loss, duplication, and reordering
  5. Challenges

    Replicate data, distributed consensus ensures safety

    [Diagram: Client 1, Client 2, and Client 3 issue read and write operations on three Data replicas]

    • Each update requires distributed consensus
    • Leads to poor performance, poor scalability, and high latency, especially in geo-replicated systems
  6. Geo-Distribution Challenge

    • Operating a service in multiple datacenters can improve latency and availability for geographically distributed clients
    • Geo-distribution directly supported by today's cloud platforms
    • Challenge: round-trip latency
      – < 2 ms between servers within the same datacenter
      – up to two orders of magnitude higher between distant datacenters

    Naive reuse of single-datacenter application architectures and protocols leads to poor performance!
  7. (Partial) Remedy: Eventual Consistency

    Eventual consistency promises better availability and performance than strong consistency (= serializing updates in a global total order)
    • Each update executes at some replica (e.g., geographically closest) without synchronization
    • Each update is propagated asynchronously to the other replicas
    • All updates eventually take effect at all replicas, possibly in different orders
  8. Eventual Consistency

    • Updates are applied without synchronization
    • Updates/states propagated asynchronously to other replicas
    • All updates eventually take effect at all replicas, possibly in different orders

    Image source: Shapiro, Preguiça, Baquero, and Zawirski: Conflict-Free Replicated Data Types. SSS 2011
  9. Strong Eventual Consistency (SEC)

    • Strong Eventual Consistency (SEC):
      – leverages mathematical properties that ensure absence of conflict, i.e., commutativity of update merging
    • A Conflict-Free Replicated Datatype (CRDT) [1] provides SEC:
      – CRDT replicas provably converge to a correct common state
      – CRDTs remain available and scalable despite high network latency, failures, or network partitioning

    [1] Shapiro, Preguiça, Baquero, and Zawirski: Conflict-Free Replicated Data Types. SSS 2011
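
The convergence property on this slide can be illustrated with a small state-based CRDT. The following is a minimal sketch, not taken from the talk: it assumes a grow-only counter as the example data type, and the names `GCounterSketch`, `GCounter`, `increment`, and `merge` are chosen here for illustration only.

```scala
object GCounterSketch {
  // State-based grow-only counter: one non-negative slot per replica ID.
  final case class GCounter(slots: Map[String, Long]) {
    // A replica only ever increments its own slot.
    def increment(replica: String): GCounter =
      GCounter(slots.updated(replica, slots.getOrElse(replica, 0L) + 1L))

    // The observable value is the sum over all slots.
    def value: Long = slots.values.sum

    // Merge takes the pointwise maximum, which is commutative,
    // associative, and idempotent.
    def merge(that: GCounter): GCounter =
      GCounter((slots.keySet ++ that.slots.keySet).map { k =>
        k -> math.max(slots.getOrElse(k, 0L), that.slots.getOrElse(k, 0L))
      }.toMap)
  }

  val empty: GCounter = GCounter(Map.empty)

  // Two replicas update independently, without synchronization.
  val r1: GCounter = empty.increment("r1").increment("r1")
  val r2: GCounter = empty.increment("r2")
}
```

Because `merge` is commutative and idempotent, `r1.merge(r2)` and `r2.merge(r1)` yield the same state regardless of the order in which the replicas exchange their states.
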
  10. Consistency Types: Idea

    To satisfy a range of performance, scalability, and consistency requirements, provide two different kinds of replicated data types (RDTs):
    1. Consistent data types:
      – Serialize updates in a global total order: sequential consistency
      – Do not provide availability (in favor of partition tolerance [2])
    2. Available data types:
      – Guarantee availability and performance (and partition tolerance)
      – Weaken consistency: strong eventual consistency

    [2] Gilbert and Lynch: Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services. SIGACT News 33(2), 51-59 (2002)
  11. Generalization: Observable Atomic Consistency (OAC)

    • Provide an RDT storing values of a lattice (actually, a join-semilattice)
      – Example: lattice = non-negative integers where join(x, y) = max(x, y)
    • The RDT supports operations with different consistency levels:
      – a totally-ordered operation (“TOp”) atomically synchronizes the replicas upon its execution;
      – a convergent operation (“CvOp”) is commutative; it is processed asynchronously.

    Zhao and Haller: Replicated data types that unify eventual consistency and observable atomic consistency. J. Log. Algebraic Methods Program. 114: 100561 (2020)
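
The lattice example on this slide can be written down directly. This is a minimal sketch under the assumption that a join-semilattice is modelled as a Scala trait with a single `join` method; the names `LatticeSketch`, `JoinSemilattice`, `maxInt`, and `lawsHold` are hypothetical and do not come from the talk.

```scala
object LatticeSketch {
  // A join-semilattice: join must be commutative, associative, and idempotent.
  trait JoinSemilattice[A] {
    def join(x: A, y: A): A
  }

  // The slide's example: non-negative integers with join = max.
  val maxInt: JoinSemilattice[Int] = new JoinSemilattice[Int] {
    def join(x: Int, y: Int): Int = math.max(x, y)
  }

  // Checks the three laws on a finite sample of values.
  def lawsHold[A](l: JoinSemilattice[A], sample: Seq[A]): Boolean =
    sample.forall { x =>
      l.join(x, x) == x &&
      sample.forall { y =>
        l.join(x, y) == l.join(y, x) &&
        sample.forall { z =>
          l.join(l.join(x, y), z) == l.join(x, l.join(y, z))
        }
      }
    }
}
```

A call such as `LatticeSketch.lawsHold(LatticeSketch.maxInt, Seq(0, 1, 5, 20))` checks the laws on a sample only; it is a sanity check, not a proof.
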
  12. Observable Atomic Consistency: Example

    • Auction system:
      – RDT maintains highest bidder including bid and ID of bidder
      – State of RDT = (bid: Int, bidderID: Int)
      – Update of (local) state upon submission of new bid:

        def submit(out s: (Int, Int), bid: Int, bidderID: Int) =
          if (bid > s._1) s := (bid, bidderID)

    • submit is commutative:

        submit(s, 10, 1); submit(s, 20, 2)  ->  s == (20, 2)
        submit(s, 20, 2); submit(s, 10, 1)  ->  s == (20, 2)
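
The commutativity claim for the two bids on this slide can be checked in plain Scala. The sketch below assumes a pure variant of `submit` that returns the new state instead of updating an `out` parameter; `SubmitSketch`, `State`, `initial`, `order1`, and `order2` are names introduced here for illustration.

```scala
object SubmitSketch {
  // State of the RDT: (bid, bidderID), as on the slide.
  type State = (Int, Int)

  // Pure variant of submit: returns the new state instead of mutating `s`.
  def submit(s: State, bid: Int, bidderID: Int): State =
    if (bid > s._1) (bid, bidderID) else s

  val initial: State = (0, 0)

  // The two submission orders shown on the slide.
  val order1: State = submit(submit(initial, 10, 1), 20, 2)
  val order2: State = submit(submit(initial, 20, 2), 10, 1)
}
```

Both orders end in the state `(20, 2)`. The check covers distinct bid amounts; how ties between equal bids are resolved is not specified on the slide.
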
  13. Observable Atomic Consistency: Example (2)

    Since submitting a bid is commutative, submit can be executed as a CvOp:

    [Diagram: timelines of replicas R1, R2, and R3, each starting in state (0,0); CvOp(submit(20,2)) and CvOp(submit(10,1)) are applied and propagated, with intermediate states (20,2) and (10,1); annotation: "No update!"]
  14. Observable Atomic Consistency: Example (3)

    Assume R1 receives a request to close the auction and return the highest bid:

    [Diagram: timelines of replicas R1, R2, and R3, each starting in state (0,0); CvOp(submit(20,2)) and CvOp(submit(10,1)) are applied and propagated, with intermediate states (20,2) and (10,1); then Op(close())]

    Should not return (10,1): Replicas not consistent!
  15. Observable Atomic Consistency: Example (4)

    Assume R1 receives a request to close the auction and return the highest bid:

    [Diagram: timelines of replicas R1, R2, and R3, each starting in state (0,0); after CvOp(submit(20,2)) and CvOp(submit(10,1)), TOp(close()) triggers distributed consensus; all three replicas reach (20,2); annotation: "Return (20,2)"]

    Zhao and Haller: Replicated data types that unify eventual consistency and observable atomic consistency. J. Log. Algebraic Methods Program. 114: 100561 (2020)
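
The difference between a local read and a totally-ordered `close` can be simulated in memory. This is a rough sketch under strong assumptions: distributed consensus is replaced by a simple fold over all replica states, and which replica receives which bid is chosen here for illustration. `OacSketch`, `Cluster`, `cvSubmit`, and `tClose` are hypothetical names, not part of the OAC protocol.

```scala
object OacSketch {
  type State = (Int, Int) // (bid, bidderID)

  // Lattice join on auction states: keep the state with the higher bid.
  def join(a: State, b: State): State = if (b._1 > a._1) b else a

  // A toy cluster: each replica holds its own copy of the state.
  final case class Cluster(replicas: Vector[State]) {
    // CvOp: applied at one replica only; propagation would happen later.
    def cvSubmit(replica: Int, bid: Int, bidderID: Int): Cluster =
      Cluster(replicas.updated(replica, join(replicas(replica), (bid, bidderID))))

    // TOp: stand-in for distributed consensus. All replicas first agree
    // on the join of their states, then the operation reads that state.
    def tClose(): (Cluster, State) = {
      val agreed = replicas.reduce((a, b) => join(a, b))
      (Cluster(Vector.fill(replicas.size)(agreed)), agreed)
    }
  }

  val start: Cluster = Cluster(Vector.fill(3)((0, 0)))

  // Assumption: submit(20,2) arrives at R2 (index 1), submit(10,1) at R1 (index 0).
  val afterBids: Cluster = start.cvSubmit(1, 20, 2).cvSubmit(0, 10, 1)
  val (closed, winner) = afterBids.tClose()
}
```

Before `tClose`, a purely local read at R1 would see `(10, 1)`; after the synchronizing operation every replica holds `(20, 2)`, which is the value returned.
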
  16. A New CvOp

    • Now, we can use both CvOps and TOps with the same RDT
    • Example:
      – Add poll operation to retrieve the current highest bidder
      – In order to ensure high availability, implement poll as CvOp

        def getHighestBid(auctionID: Int): (Int, Int) =
          getRef(auctionID).poll()

        def updateDisplay(auctionID: Int) =
          show("Highest bid: " + getHighestBid(auctionID)._1)
  17. A Notification Service

    • Periodically, the auction service should send a message to all bidders to inform them about the current highest bid:

        def notifyAll(auctionID: Int, bidders: List[Int]) = {
          val (hBid, hBidder) = getHighestBid(auctionID)   // May be inconsistent!
          bidders.foreach { bidderID =>
            if (bidderID == hBidder)
              send(bidderID, "You have the highest bid!")
            else
              send(bidderID, "The highest bid is: " + hBid)
          }
        }

    Problem: notification based on inconsistent information!
  18. CTRD: Consistency Types for Replicated Data

    • Type system that distinguishes values according to their consistency
    • Consistency represented as labels attached to types and values
    • A label l can be loc (local), con (consistent), oac (OAC), or ava (available)
    • Labels are ordered; the label ordering expresses permitted data flow: loc → con → oac → ava
    • Labeled types are covariant in their labels
    • The reverse flow, ava → con, is not permitted
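
Covariance in the label can be mimicked with Scala's own subtyping. The following is a sketch of one possible encoding, not the CTRD type system itself: it assumes labels are modelled as phantom traits whose subtype chain mirrors loc → con → oac → ava, and `CovariantLabelsSketch`, `Labeled`, and the example values are hypothetical.

```scala
object CovariantLabelsSketch {
  // Labels as Scala types; subtyping mirrors loc → con → oac → ava.
  sealed trait Ava
  sealed trait Oac extends Ava
  sealed trait Con extends Oac
  sealed trait Loc extends Con

  // A value of type A tagged with label L; covariant in L.
  final case class Labeled[+L, A](value: A)

  val highestBidCon: Labeled[Con, Int] = Labeled[Con, Int](20)

  // Permitted: a con-labelled value may be used where ava is expected.
  val highestBidAva: Labeled[Ava, Int] = highestBidCon

  // Rejected by the Scala compiler if uncommented (ava does not flow to con):
  // val bad: Labeled[Con, Int] = Labeled[Ava, Int](10)
}
```

This encoding only captures explicit flows through assignments and arguments; it says nothing about flows through control structure.
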
  19. Select Typing Rules

    • Example 1: t1^con := t2^ava
      Illegal!
    • Example 2: if x^ava then t^con := 1^con else t^con := 0^con
      Illegal!
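
Both examples can be reproduced with a small label checker. The sketch below is an assumption about how such a check might be organised: it uses a program-counter label (`pc`), a standard device in information-flow type systems, to account for the branch condition in Example 2. The slides do not show the actual typing rules, and `FlowCheckSketch`, `canFlow`, `lub`, and `assignOk` are hypothetical names.

```scala
object FlowCheckSketch {
  sealed abstract class Label(val rank: Int)
  case object Loc extends Label(0)
  case object Con extends Label(1)
  case object Oac extends Label(2)
  case object Ava extends Label(3)

  // Data may flow only along the order loc → con → oac → ava.
  def canFlow(from: Label, to: Label): Boolean = from.rank <= to.rank

  // Least upper bound of two labels in the chain.
  def lub(a: Label, b: Label): Label = if (a.rank >= b.rank) a else b

  // An assignment `target := source` under program-counter label `pc`.
  // `pc` is the label of the enclosing branch conditions (Loc at top level).
  def assignOk(target: Label, source: Label, pc: Label = Loc): Boolean =
    canFlow(lub(source, pc), target)

  // Example 1: t1^con := t2^ava (explicit flow from ava to con)
  val example1: Boolean = assignOk(target = Con, source = Ava)

  // Example 2: if x^ava then t^con := 1^con else t^con := 0^con
  // (implicit flow: the branch taken depends on an ava value)
  val example2: Boolean = assignOk(target = Con, source = Con, pc = Ava)

  // A legal counterpart: t^ava := 1^con at top level.
  val legal: Boolean = assignOk(target = Ava, source = Con)
}
```

Under these assumptions, both slide examples are rejected, while assigning a con value to an ava target is accepted.
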
  20. Attempted “Fix” 1

        def send(ID: Int@con, msg: String): Unit = ...

        def getHighestBid(auctionID: Int): (Int, Int)@ava =
          getRef(auctionID).poll()

        def notifyAll(auctionID: Int, bidders: List[Int]) = {
          val (hBid, hBidder): (Int, Int)@con = getHighestBid(auctionID)
          bidders.foreach { bidderID =>
            if (bidderID == hBidder)
              send(bidderID, "You have the highest bid!")
            else
              send(bidderID, "The highest bid is: " + hBid)
          }
        }

    The annotated binding would require (Int,Int)@ava <: (Int,Int)@con, which does not hold.
  21. Attempted “Fix” 2

        def send(ID: Int@con, msg: String): Unit = ...

        def getHighestBid(auctionID: Int): (Int, Int)@ava =
          getRef(auctionID).poll()

        def notifyAll(auctionID: Int, bidders: List[Int]) = {
          val (hBid, hBidder): (Int, Int)@ava = getHighestBid(auctionID)
          bidders.foreach { bidderID =>
            if (bidderID == hBidder)
              send(bidderID, "You have the highest bid!")
            else
              send(bidderID, "The highest bid is: " + hBid)
          }
        }

    This would require Int@ava <: Int@con, which does not hold.
    Condition has label ava → bidderID cannot have con label in the branches!
    Implicit information flow
  22. The Real Fix

        def send(ID: Int@con, msg: String): Unit = ...

        def getHighestBid(auctionID: Int): (Int, Int)@con =
          getRef(auctionID).consistentRead()

        def notifyAll(auctionID: Int, bidders: List[Int]) = {
          val (hBid, hBidder): (Int, Int)@con = getHighestBid(auctionID)
          bidders.foreach { bidderID =>
            if (bidderID == hBidder)
              send(bidderID, "You have the highest bid!")
            else
              send(bidderID, "The highest bid is: " + hBid)
          }
        }

    Must strengthen consistency!
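
The shape of the real fix can be approximated in ordinary Scala with wrapper types. This is only a sketch: `Con` and `Ava` stand in for the `@con` and `@ava` labels, `AuctionRef` is a hypothetical in-memory handle, and messages are collected in a list instead of being sent. Unlike CTRD, this encoding rejects only explicit flows, not implicit ones.

```scala
object RealFixSketch {
  // Wrapper types standing in for the @con and @ava labels.
  final case class Con[A](value: A)
  final case class Ava[A](value: A)

  // Hypothetical handle to the replicated auction state.
  final class AuctionRef(state: (Int, Int)) {
    def poll(): Ava[(Int, Int)] = Ava(state)            // available read
    def consistentRead(): Con[(Int, Int)] = Con(state)  // consistent read
  }

  // Rejected by the Scala compiler if uncommented (Ava is not Con):
  // def broken(ref: AuctionRef): Con[(Int, Int)] = ref.poll()

  // Messages are collected instead of being sent over a network.
  def notifyAll(ref: AuctionRef, bidders: List[Int]): List[(Int, String)] = {
    val (hBid, hBidder) = ref.consistentRead().value
    bidders.map { bidderID =>
      if (bidderID == hBidder) bidderID -> "You have the highest bid!"
      else bidderID -> ("The highest bid is: " + hBid)
    }
  }
}
```

For an auction in state `(20, 2)` and bidders `List(1, 2)`, bidder 1 is told the highest bid and bidder 2 is told they hold it.
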
  23. Results

    • Distributed small-step operational semantics
    • Formalizes RDTs including observable atomic consistency; operations via message passing
    • Proofs of correctness properties:
      – Type soundness (preservation + progress) → no run-time label violations!
      – Noninterference
        E.g., mutation of ava-labelled references cannot be observed via con-labelled values
    • Proofs of consistency properties:
      – Theorem: For con operations, CTRD ensures sequential consistency
      – Theorem: For ava operations, CTRD ensures eventual consistency
  24. Selected Related Work (1)

    • Inconsistent, Performance-bound, Approximate (IPA) [3] storage system
      – Goals: consistency safety and error-bounded consistency
      – Limitations:
        • Only direct invalid information flows prevented
        • No proof of type soundness
      – Our work:
        • Also prevents implicit invalid information flows
        • Provides proofs of correctness and consistency properties

    [3] Holt, Bornholt, Zhang, Ports, Oskin, Ceze: Disciplined Inconsistency with Consistency Types. SoCC 2016: 279-293
  25. Selected Related Work (2)

    • ConSysT [4] language for distributed systems
      – Integrates consistency and availability with an object-oriented programming model
      – Provides correctness proof for an OO core calculus
      – Provides an implementation as a Java extension and middleware
    • Our work:
      – ML-style higher-order functional language
      – Integrates observable atomic consistency (OAC) for increased flexibility
      – First published proofs of type soundness and noninterference (LCPC ’19)

    [4] Köhler, Eskandani, Weisenburger, Margara, Salvaneschi: Rethinking safe consistency in distributed object-oriented programming. Proc. ACM Program. Lang. 4(OOPSLA): 188:1-188:30 (2020)
  26. Conclusion

    CTRD: Consistency Types for Replicated Data
    • A distributed, higher-order language with replicated types and consistency labels
    • Enables safe mixing of strongly consistent and available (weakly consistent) data
    • Proofs of type soundness and noninterference, and consistency properties
    • Integrates observable atomic consistency which provides high availability through convergent operations and strong consistency through totally-ordered operations

    Future work:
    • Practical implementation integrated with OACP
    • Going beyond the RDT abstraction

    Thanks!