Consistency Types for Replicated Data in a Higher-order Distributed Programming Language

Philipp Haller
March 15, 2023

Transcript

  1. Consistency Types for Replicated Data in a Higher-order Distributed Programming Language

    Xin Zhao and Philipp Haller
    KTH Royal Institute of Technology, Stockholm, Sweden
    International Conference on the Art, Science, and Engineering of Programming (‹Programming› 2023)
    Tokyo, Japan & Online, March 13-17, 2023
  2. Context: Scalable Distributed Services

    • Distributed applications
      – providing scalable services
      – that run on virtualized cloud infrastructures
      – in one or more datacenters
    • Examples: e-commerce platforms, communication services, social media platforms, game servers, etc.
    Support many concurrent clients!
  3. Fault-tolerant Distributed Systems

    Basic problem: multiple clients access shared distributed data concurrently
    [Diagram: Client 1, Client 2, Client 3; one Data node; arrows labeled read, write, read]
    Safety goal: Never return incorrect result under non-Byzantine failure conditions, including
    • network delays, partitions, and
    • packet loss, duplication, and reordering
  4. Solution: Replication, Consensus

    Replicate data; distributed consensus ensures safety
    [Diagram: Client 1, Client 2, Client 3; three Data replicas; arrows labeled read, write, read]
    Safety goal: Never return incorrect result under non-Byzantine failure conditions, including
    • network delays, partitions, and
    • packet loss, duplication, and reordering
  5. Challenges

    Replicate data; distributed consensus ensures safety
    [Diagram: Client 1, Client 2, Client 3; three Data replicas; arrows labeled read, write, read]
    • Each update requires distributed consensus
    • Leads to poor performance, poor scalability, and high latency, especially in geo-replicated systems
  6. Geo-Distribution Challenge

    • Operating a service in multiple datacenters can improve latency and availability for geographically distributed clients
    • Geo-distribution directly supported by today's cloud platforms
    • Challenge: round-trip latency
      – < 2 ms between servers within the same datacenter
      – up to two orders of magnitude higher between distant datacenters
    Naive reuse of single-datacenter application architectures and protocols leads to poor performance!
  7. (Partial) Remedy: Eventual Consistency

    Eventual consistency promises better availability and performance than strong consistency (= serializing updates in a global total order)
    • Each update executes at some replica (e.g., geographically closest) without synchronization
    • Each update is propagated asynchronously to the other replicas
    • All updates eventually take effect at all replicas, possibly in different orders
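Why "possibly in different orders" matters can be seen in a small Scala sketch. It is not taken from the talk: the names `applyAll`, `setTo`, and `raiseTo` are hypothetical, and a replica state is assumed to be a single `Int`. Two replicas receive the same two updates in opposite orders; the non-commutative updates diverge, the commutative ones do not.

```scala
object DeliveryOrder {
  // Assumption of this sketch: an update is a pure function on an Int-valued state.
  type Update = Int => Int

  // Apply updates in the order in which a replica happens to receive them.
  def applyAll(initial: Int, updates: List[Update]): Int =
    updates.foldLeft(initial)((state, update) => update(state))

  // Non-commutative: the last write wins, so delivery order decides the result.
  def setTo(v: Int): Update = _ => v

  // Commutative: taking the maximum gives the same result in any order.
  def raiseTo(v: Int): Update = state => math.max(state, v)

  def main(args: Array[String]): Unit = {
    println(applyAll(0, List(setTo(10), setTo(20))))     // replica A
    println(applyAll(0, List(setTo(20), setTo(10))))     // replica B, other order
    println(applyAll(0, List(raiseTo(10), raiseTo(20)))) // replica A
    println(applyAll(0, List(raiseTo(20), raiseTo(10)))) // replica B, other order
  }
}
```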
  8. Eventual Consistency

    • Updates are applied without synchronization
    • Updates/states propagated asynchronously to other replicas
    • All updates eventually take effect at all replicas, possibly in different orders
    Image source: Shapiro, Preguiça, Baquero, and Zawirski: Conflict-Free Replicated Data Types. SSS 2011
  9. Strong Eventual Consistency (SEC)

    • Strong Eventual Consistency (SEC):
      – leverages mathematical properties that ensure absence of conflict, i.e., commutativity of update merging
    • A Conflict-Free Replicated Datatype (CRDT) [1] provides SEC:
      – CRDT replicas provably converge to a correct common state
      – CRDTs remain available and scalable despite high network latency, failures, or network partitioning
    [1] Shapiro, Preguiça, Baquero, and Zawirski: Conflict-Free Replicated Data Types. SSS 2011
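As an illustration of conflict-free merging, here is a minimal sketch of a state-based grow-only counter, a standard CRDT example. It is not shown in the talk; `GCounter`, `increment`, and `merge` are names chosen for this sketch, and replica IDs are assumed to be strings.

```scala
object GCounterSketch {
  // State-based grow-only counter: one non-negative count per replica ID.
  final case class GCounter(counts: Map[String, Int]) {
    // Local update: a replica only ever increments its own entry.
    def increment(replica: String): GCounter =
      GCounter(counts.updated(replica, counts.getOrElse(replica, 0) + 1))

    // Merge takes the pointwise maximum, which is commutative,
    // associative, and idempotent, so replicas converge in any merge order.
    def merge(other: GCounter): GCounter =
      GCounter((counts.keySet ++ other.counts.keySet).map { id =>
        id -> math.max(counts.getOrElse(id, 0), other.counts.getOrElse(id, 0))
      }.toMap)

    // The counter value is the sum over all replicas.
    def value: Int = counts.values.sum
  }

  val empty: GCounter = GCounter(Map.empty)
}
```

Two replicas that increment independently, for example `empty.increment("r1")` and `empty.increment("r2")`, reach the same state whichever side calls `merge`.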
  10. Consistency Types: Idea

    To satisfy a range of performance, scalability, and consistency requirements, provide two different kinds of replicated data types (RDTs):
    1. Consistent data types:
      – Serialize updates in a global total order: sequential consistency
      – Do not provide availability (in favor of partition tolerance [2])
    2. Available data types:
      – Guarantee availability and performance (and partition tolerance)
      – Weaken consistency: strong eventual consistency
    [2] Gilbert and Lynch: Brewer's conjecture and the feasibility of consistent, available, partition-tolerant web services. SIGACT News 33(2), 51-59 (2002)
  11. Generalization: Observable Atomic Consistency (OAC)

    • Provide an RDT storing values of a lattice (actually, a join-semilattice)
      – Example: lattice = non-negative integers where join(x, y) = max(x, y)
    • The RDT supports operations with different consistency levels:
      – a totally-ordered operation (“TOp”) atomically synchronizes the replicas upon its execution;
      – a convergent operation (“CvOp”) is commutative; it is processed asynchronously.
    Zhao and Haller: Replicated data types that unify eventual consistency and observable atomic consistency. J. Log. Algebraic Methods Program. 114: 100561 (2020)
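The lattice example can be written down as a short Scala sketch. The trait `JoinSemilattice` and the helper `joinAll` are hypothetical names for illustration, not the interface of the authors' RDT.

```scala
object LatticeSketch {
  // A join-semilattice: join must be commutative, associative, and idempotent.
  trait JoinSemilattice[A] {
    def join(x: A, y: A): A
  }

  // The example from the slide: non-negative integers with join = max.
  val maxInt: JoinSemilattice[Int] = new JoinSemilattice[Int] {
    def join(x: Int, y: Int): Int = math.max(x, y)
  }

  // Combining any number of replica states is a fold over join,
  // starting from the least element (0 for non-negative integers).
  def joinAll[A](lattice: JoinSemilattice[A], bottom: A, states: List[A]): A =
    states.foldLeft(bottom)(lattice.join)
}
```

Because `join` is commutative and idempotent, the order and multiplicity in which states arrive do not affect the result of `joinAll`.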
  12. Observable Atomic Consistency: Example

    • Auction system:
      – RDT maintains highest bidder including bid and ID of bidder
      – State of RDT = (bid: Int, bidderID: Int)
      – Update of (local) state upon submission of new bid:

        def submit(out s: (Int, Int), bid: Int, bidderID: Int) =
          if (bid > s._1) s := (bid, bidderID)

    • submit is commutative:

        submit(s, 10, 1); submit(s, 20, 2)  ->  s == (20, 2)
        submit(s, 20, 2); submit(s, 10, 1)  ->  s == (20, 2)
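The slide's `submit` uses an `out` parameter and `:=`, which are notation of the talk's language rather than plain Scala. A runnable approximation, assuming a pure function that returns the new state instead of mutating it, could look like this:

```scala
object AuctionSubmit {
  // State of the RDT as on the slide: (bid, bidderID).
  type State = (Int, Int)

  // Pure variant of the slide's submit: the state changes only
  // if the new bid is strictly higher than the stored one.
  def submit(s: State, bid: Int, bidderID: Int): State =
    if (bid > s._1) (bid, bidderID) else s

  def main(args: Array[String]): Unit = {
    val start: State = (0, 0)
    println(submit(submit(start, 10, 1), 20, 2)) // bid 10 first, then 20
    println(submit(submit(start, 20, 2), 10, 1)) // bid 20 first, then 10
  }
}
```

The sketch exercises only the two distinct bid amounts used on the slide.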
  13. Observable Atomic Consistency: Example (2)

    Since submitting a bid is commutative, submit can be executed as a CvOp:
    [Diagram: timelines of replicas R1, R2, R3, each starting at (0,0), with operations CvOp(submit(20,2)) and CvOp(submit(10,1)), intermediate states (20,2) and (10,1), and the annotation "No update!"]
  14. Observable Atomic Consistency: Example (3)

    Assume R1 receives a request to close the auction and return the highest bid:
    [Diagram: timelines of replicas R1, R2, R3, each starting at (0,0), with operations CvOp(submit(20,2)), CvOp(submit(10,1)), and Op(close()), and intermediate states (20,2) and (10,1)]
    Should not return (10,1): Replicas not consistent!
  15. Observable Atomic Consistency: Example (4)

    Assume R1 receives a request to close the auction and return the highest bid:
    [Diagram: timelines of replicas R1, R2, R3, each starting at (0,0), with operations CvOp(submit(20,2)), CvOp(submit(10,1)), and TOp(close()); after distributed consensus all three replicas hold (20,2)]
    Return (20,2)
    Zhao and Haller: Replicated data types that unify eventual consistency and observable atomic consistency. J. Log. Algebraic Methods Program. 114: 100561 (2020)
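The difference between the two operation kinds can be simulated in a few lines. This is a single-process sketch under strong assumptions: `Replica`, `cvOpSubmit`, and `tOpClose` are hypothetical names, which replica receives which bid is chosen arbitrarily, and distributed consensus is replaced by simply joining all replica states.

```scala
object OacAuctionSketch {
  type State = (Int, Int) // (bid, bidderID)

  // Join on auction states: keep the state with the higher bid.
  def join(a: State, b: State): State = if (b._1 > a._1) b else a

  final class Replica {
    var state: State = (0, 0)
  }

  // CvOp: applied at one replica only; propagation to the others
  // would happen asynchronously and is left out of this sketch.
  def cvOpSubmit(target: Replica, bid: Int, bidderID: Int): Unit =
    target.state = join(target.state, (bid, bidderID))

  // TOp: stand-in for distributed consensus. All replicas are brought
  // to the joined state before the operation returns its result.
  def tOpClose(replicas: List[Replica]): State = {
    val agreed = replicas.map(_.state).foldLeft((0, 0))(join)
    replicas.foreach(r => r.state = agreed)
    agreed
  }
}
```

If one replica has only seen the bid (10,1) while another holds (20,2), reading the first replica locally would report (10,1), whereas `tOpClose` returns (20,2) and leaves every replica in that state.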
  16. A New CvOp

    • Now, we can use both CvOps and TOps with the same RDT
    • Example:
      – Add poll operation to retrieve the current highest bidder
      – In order to ensure high availability, implement poll as CvOp

        def getHighestBid(auctionID: Int): (Int, Int) =
          getRef(auctionID).poll()

        def updateDisplay(auctionID: Int) =
          show("Highest bid: " + getHighestBid(auctionID)._1)
  17. A Notification Service

    • Periodically, the auction service should send a message to all bidders to inform them about the current highest bid:

        def notifyAll(auctionID: Int, bidders: List[Int]) = {
          val (hBid, hBidder) = getHighestBid(auctionID)
          bidders.foreach { bidderID =>
            if (bidderID == hBidder) send(bidderID, "You have the highest bid!")
            else send(bidderID, "The highest bid is: " + hBid)
          }
        }

    Annotation: May be inconsistent!
    Problem: notification based on inconsistent information!
  18. CTRD: Consistency Types for Replicated Data

    • Type system that distinguishes values according to their consistency
    • Consistency represented as labels attached to types and values
    • A label l can be loc (local), con (consistent), oac (OAC), or ava (available)
    • Labels are ordered; the label ordering expresses permitted data flow: loc → con → oac → ava
    • Labeled types are covariant in their labels
    [Figure residue: ava → con]
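The label ordering can be modeled as a simple rank comparison. The sketch below assumes the ordering lists the labels as loc, con, oac, ava and that data may only flow toward ava, which agrees with the later slides where assigning an ava value to a con target is illegal. `Label`, `rank`, and `mayFlow` are hypothetical names, not part of CTRD.

```scala
object LabelSketch {
  // The four labels from the slide, ranked in the order loc, con, oac, ava.
  sealed abstract class Label(val rank: Int)
  case object Loc extends Label(0)
  case object Con extends Label(1)
  case object Oac extends Label(2)
  case object Ava extends Label(3)

  // Data may flow from `from` to `to` only if `from` does not
  // come after `to` in the ordering.
  def mayFlow(from: Label, to: Label): Boolean = from.rank <= to.rank
}
```

Under this reading, `mayFlow(Con, Ava)` holds while `mayFlow(Ava, Con)` does not.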
  19. Select Typing Rules

    • Example 1: t1^con := t2^ava   (Illegal!)
    • Example 2: if x^ava then t^con := 1^con else t^con := 0^con   (Illegal!)
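Both examples can be rejected by a small checker that tracks a program-counter label, a standard technique from information-flow type systems. This is a sketch of that general technique, not the paper's typing rules; `Stmt`, `Assign`, `If`, and `wellLabeled` are hypothetical names, and statements are reduced to the labels involved.

```scala
object FlowCheckSketch {
  sealed abstract class Label(val rank: Int)
  case object Loc extends Label(0)
  case object Con extends Label(1)
  case object Oac extends Label(2)
  case object Ava extends Label(3)

  // Least upper bound of two labels in the linear ordering.
  def join(a: Label, b: Label): Label = if (a.rank >= b.rank) a else b

  sealed trait Stmt
  // target := source, reduced to the labels of both sides.
  final case class Assign(target: Label, source: Label) extends Stmt
  // if cond then thenBranch else elseBranch, with the condition's label.
  final case class If(cond: Label, thenBranch: Stmt, elseBranch: Stmt) extends Stmt

  // pc is the join of the labels of all enclosing conditions.
  def wellLabeled(stmt: Stmt, pc: Label): Boolean = stmt match {
    case Assign(target, source) =>
      // Explicit flow (source) and implicit flow (pc) must both be allowed.
      join(source, pc).rank <= target.rank
    case If(cond, thenBranch, elseBranch) =>
      val inner = join(pc, cond)
      wellLabeled(thenBranch, inner) && wellLabeled(elseBranch, inner)
  }
}
```

Example 1 corresponds to `Assign(Con, Ava)`, an explicit flow. Example 2 corresponds to `If(Ava, Assign(Con, Con), Assign(Con, Con))`: each assignment is fine on its own, but the ava-labeled condition raises `pc`, so the check fails because of the implicit flow.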
  20. Attempted “Fix” 1

        def send(ID: Int@con, msg: String): Unit = ...

        def getHighestBid(auctionID: Int): (Int, Int)@ava =
          getRef(auctionID).poll()

        def notifyAll(auctionID: Int, bidders: List[Int]) = {
          val (hBid, hBidder): (Int, Int)@con = getHighestBid(auctionID)
          bidders.foreach { bidderID =>
            if (bidderID == hBidder) send(bidderID, "You have the highest bid!")
            else send(bidderID, "The highest bid is: " + hBid)
          }
        }

    Annotation on the val declaration: (Int,Int)@ava <: (Int,Int)@con (the subtyping this attempted fix would require)
  21. Attempted “Fix” 2

        def send(ID: Int@con, msg: String): Unit = ...

        def getHighestBid(auctionID: Int): (Int, Int)@ava =
          getRef(auctionID).poll()

        def notifyAll(auctionID: Int, bidders: List[Int]) = {
          val (hBid, hBidder): (Int, Int)@ava = getHighestBid(auctionID)
          bidders.foreach { bidderID =>
            if (bidderID == hBidder) send(bidderID, "You have the highest bid!")
            else send(bidderID, "The highest bid is: " + hBid)
          }
        }

    Annotations:
    • Int@ava <: Int@con
    • Condition has label ava → bidderID cannot have con label in the branches!
    • Implicit information flow
  22. The Real Fix

        def send(ID: Int@con, msg: String): Unit = ...

        def getHighestBid(auctionID: Int): (Int, Int)@con =
          getRef(auctionID).consistentRead()

        def notifyAll(auctionID: Int, bidders: List[Int]) = {
          val (hBid, hBidder): (Int, Int)@con = getHighestBid(auctionID)
          bidders.foreach { bidderID =>
            if (bidderID == hBidder) send(bidderID, "You have the highest bid!")
            else send(bidderID, "The highest bid is: " + hBid)
          }
        }

    Must strengthen consistency!
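Part of this discipline can be approximated in plain Scala with phantom label types. The sketch below is an encoding invented for illustration, not CTRD itself: `Labeled`, `poll`, `consistentRead`, and the returned values are hypothetical, `send` returns the message instead of sending it, and only explicit flows are checked (implicit flows through conditions are out of reach for this encoding).

```scala
object LabeledSketch {
  // Labels as types: Con is a subtype of Ava, so Scala subtyping
  // follows the permitted data flow from con to ava.
  sealed trait Ava
  sealed trait Con extends Ava

  // Covariant in the label: Labeled[A, Con] is accepted where
  // Labeled[A, Ava] is expected, but not the other way around.
  final case class Labeled[+A, +L](value: A) {
    // Transforming the value keeps its label.
    def map[B](f: A => B): Labeled[B, L] = Labeled(f(value))
  }

  // Hypothetical reads returning fixed (bid, bidderID) pairs.
  def poll(): Labeled[(Int, Int), Ava] = Labeled((10, 1))
  def consistentRead(): Labeled[(Int, Int), Con] = Labeled((20, 2))

  // send insists on a con-labeled recipient ID.
  def send(id: Labeled[Int, Con], msg: String): String =
    s"to ${id.value}: $msg"

  // Accepted: the bidder ID comes from a consistent read.
  def notifyHighest(): String =
    send(consistentRead().map(_._2), "You have the highest bid!")
}
```

Replacing `consistentRead()` with `poll()` in `notifyHighest` is rejected by the Scala compiler, because `Labeled[Int, Ava]` is not a subtype of `Labeled[Int, Con]`; this mirrors why the read has to be strengthened rather than the annotation changed.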
  23. Results

    • Distributed small-step operational semantics
      – Formalizes RDTs including observable atomic consistency; operations via message passing
    • Proofs of correctness properties:
      – Type soundness (preservation + progress) → no run-time label violations!
      – Noninterference. E.g., mutation of ava-labelled references cannot be observed via con-labelled values
    • Proofs of consistency properties:
      – Theorem: For con operations, CTRD ensures sequential consistency
      – Theorem: For ava operations, CTRD ensures eventual consistency
  24. Selected Related Work (1)

    • Inconsistent, Performance-bound, Approximate (IPA) [3] storage system
      – Goals: consistency safety and error-bounded consistency
      – Limitations:
        • Only direct invalid information flows prevented
        • No proof of type soundness
      – Our work:
        • Also prevents implicit invalid information flows
        • Provides proofs of correctness and consistency properties
    [3] Holt, Bornholt, Zhang, Ports, Oskin, Ceze: Disciplined Inconsistency with Consistency Types. SoCC 2016: 279-293
  25. Selected Related Work (2)

    • ConSysT [4] language for distributed systems
      – Integrates consistency and availability with an object-oriented programming model
      – Provides correctness proof for an OO core calculus
      – Provides an implementation as a Java extension and middleware
    • Our work:
      – ML-style higher-order functional language
      – Integrates observable atomic consistency (OAC) for increased flexibility
      – First published proofs of type soundness and noninterference (LCPC ’19)
    [4] Köhler, Eskandani, Weisenburger, Margara, Salvaneschi: Rethinking safe consistency in distributed object-oriented programming. Proc. ACM Program. Lang. 4(OOPSLA): 188:1-188:30 (2020)
  26. Conclusion

    CTRD: Consistency Types for Replicated Data
    • A distributed, higher-order language with replicated types and consistency labels
    • Enables safe mixing of strongly consistent and available (weakly consistent) data
    • Proofs of type soundness and noninterference, and consistency properties
    • Integrates observable atomic consistency, which provides high availability through convergent operations and strong consistency through totally-ordered operations
    Future work:
    • Practical implementation integrated with OACP
    • Going beyond the RDT abstraction
    Thanks!