Certified Mergeable Replicated Data Types

Replicated data types (RDTs) are data structures that permit concurrent modification of multiple, potentially geo-distributed, replicas without coordination between them. RDTs are designed so that conflicting operations are eventually and deterministically reconciled, ensuring convergence. Constructing correct RDTs remains difficult because of the complexity of reasoning about the independently evolving states of the replicas. Because existing approaches focus on correctness (and rightly so), the resulting RDTs are less efficient than their sequential counterparts in both time and space complexity. This is unfortunate, since RDTs are often used in a local-first setting where local operations far outnumber remote communication.

In this paper, we present Peepul, a pragmatic approach to building and verifying efficient RDTs. To make reasoning about correctness easier, we cast RDTs in the mould of a distributed version control system and equip them with a three-way merge function for reconciling conflicting versions. We go beyond verifying convergence and provide a methodology for verifying arbitrarily complex specifications. We develop a technique based on a replication-aware simulation relation to relate RDT specifications to their efficient, purely functional implementations. We have developed Peepul as an F* library that discharges proof obligations to an SMT solver. The verified efficient RDTs are extracted as OCaml code and used in Irmin, a Git-like distributed database.

KC Sivaramakrishnan

April 27, 2022

Transcript

  1. Certified Mergeable Replicated Data Types
    “KC” Sivaramakrishnan
    joint work with
    Vimala Soundarapandian, Adharsh Kamath and Kartik Nagar

  3. INTERNET

  6. INTERNET
    • Serializability
    • Linearizability
    • Weak Consistency & Isolation

  7. Even simple data structures attract enormous complexity when made distributed

  8. Sequential Counter
    module Counter : sig
      type t
      val read : t -> int
      val add : t -> int -> t
      val sub : t -> int -> t
    end = struct
      type t = int
      let read x = x
      let add x d = x + d
      let sub x d = x - d
    end

  9. Sequential Counter
    • Written in idiomatic style
    • Composable
    type counter_list = Counter.t list

  10. Replicated Counter
    [Diagram: four replicas connected through the internet, each holding 0]

  17. Replicated Counter
    [Diagram: one replica applies +2 locally and holds 2, another applies +3 and holds 3; the other two still hold 0]

  19. Replicated Counter
    • Idea: Apply the local operations at all replicas
    [Diagram: with +2 and +3 applied everywhere, all four replicas hold 5]

  20. module Counter : sig
      type t
      val read : t -> int
      val add : t -> int -> t
      val sub : t -> int -> t
      val mult : t -> int -> t
    end = struct
      type t = int
      let read x = x
      let add x d = x + d
      let sub x d = x - d
      let mult x n = x * n
    end

  27. [Diagram: two replicas start at 7. One applies +1 locally (8), the other applies *3 (21). After each applies the other's operation, the first holds 8 * 3 = 24 and the second holds 21 + 1 = 22.]
    Diverges
    Addition and multiplication do not commute
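The divergence above can be replayed directly. This is a minimal sketch, assuming each replica applies its own operation first and the remote operation second; the names `start`, `replica_a` and `replica_b` are illustrative and do not appear in the talk.

```ocaml
let add x d = x + d
let mult x n = x * n

(* Both replicas start from the same counter value. *)
let start = 7

(* Replica A applies +1 locally, then replays the remote *3. *)
let replica_a = mult (add start 1) 3

(* Replica B applies *3 locally, then replays the remote +1. *)
let replica_b = add (mult start 3) 1

(* The two replicas end in different states. *)
let diverged = replica_a <> replica_b

(* Two additions, by contrast, give the same result in either order. *)
let adds_commute = add (add 0 2) 3 = add (add 0 3) 2
```

Replaying operations is only safe when every pair of operations commutes, which is why adding `mult` breaks the scheme that worked for `add` alone.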

  36. • Idea: Capture the effect of multiplication through the commutative addition operation
    [Diagram: two replicas start at 7. One applies +1 locally (8), the other applies *3 (21). The multiplication is propagated as its effect, +14, so the first replica reaches 8 + 14 = 22 and the second reaches 21 + 1 = 22.]
    Converges
    • CRDTs
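One way to read the idea above is that a replica ships the increment its multiplication caused locally, rather than the multiplication itself. The sketch below assumes a helper `mult_effect`, which is hypothetical and not from the talk, and replays the slide's example values.

```ocaml
let add x d = x + d

(* Hypothetical helper: the increment that a local [x * n] amounts to. *)
let mult_effect x n = (x * n) - x

let start = 7

(* Replica A performs +1; replica B performs *3 on 7, whose effect is +14. *)
let delta_a = 1
let delta_b = mult_effect start 3

(* Each replica applies its own effect first and the remote effect second. *)
let replica_a = add (add start delta_a) delta_b
let replica_b = add (add start delta_b) delta_a
```

Both replicas now apply only additions, so the order of delivery no longer matters.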

  39. Convergent Replicated Data Types (CRDT)
    • CRDT is guaranteed to ensure strong eventual consistency (SEC)
    ★ G-counters, PN-counters, OR-Sets, Graphs, Ropes, docs, sheets
    ★ Simple interface for the clients of CRDTs
    • Need to reengineer every datatype to ensure SEC (commutativity)
    ★ Do not mirror sequential counterparts => implementation & proof burden
    ★ Do not compose!
    ✦ counter set is not a composition of counter and set CRDTs

  41. Can we program & reason about replicated data types as an extension of their sequential counterparts?
    MRDT

  42. module Counter : sig
      type t
      val read : t -> int
      val add : t -> int -> t
      val sub : t -> int -> t
      val mult : t -> int -> t
      val merge : lca:t -> v1:t -> v2:t -> t
    end = struct
      type t = int
      let read x = x
      let add x d = x + d
      let sub x d = x - d
      let mult x n = x * n
      let merge ~lca ~v1 ~v2 =
        lca + (v1 - lca) + (v2 - lca)
    end

  50. [Diagram: two replicas start at 7. One applies +1 (8), the other applies *3 (21). Merging with lca 7 gives 22 on both replicas.]
    22 = 7 + (8 - 7) + (21 - 7)
    • 3-way merge function makes the counter suitable for distribution
    • Does not appeal to individual operations => independently extend data-type
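The merge arithmetic can be checked by running the slide's `merge` on the example values. The bindings `merged`, `symmetric` and `unchanged` are illustrative names; the last case assumes one branch has not moved since the lowest common ancestor.

```ocaml
(* Three-way merge for the counter, as defined on the slide. *)
let merge ~lca ~v1 ~v2 = lca + (v1 - lca) + (v2 - lca)

(* lca = 7, one branch did +1 (8), the other did *3 (21). *)
let merged = merge ~lca:7 ~v1:8 ~v2:21

(* Swapping the two branches does not change the result. *)
let symmetric = merge ~lca:7 ~v1:21 ~v2:8 = merged

(* If one branch still equals the lca, merge returns the other branch. *)
let unchanged = merge ~lca:21 ~v1:21 ~v2:22
```

Note that `merge` only looks at three states and never at the operations that produced them, which is what lets `mult` be added without touching it.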

  56. Systems ➞ PL
    • CRDTs need to take care of systems level concerns such as message loss, duplication and reordering
    • 3-way merge is oblivious to these
    ✦ By leaving those concerns to MRDT middleware
    [Diagram: the merge history from the previous slides (7; then 8 after +1 and 21 after *3; then 22 and 22), with a further merge marked "??" that resolves to 22]
    22 = 21 + (21 - 21) + (22 - 21)

  58. Does the 3-way merge idea generalise?
    Sort of

  66. Observed-Removed Set
    • OR-set: add wins when there is a concurrent add and remove of the same element
    let merge ~lca ~v1 ~v2 =
      (lca ∩ v1 ∩ v2) (* unmodified elements *)
      ∪ (v1 - lca) (* added in v1 *)
      ∪ (v2 - lca) (* added in v2 *)
    Kaki et al. “Mergeable Replicated Data Types”, OOPSLA 2019
    [Diagram: both replicas start from {1}. One performs add(1) and stays at {1}; the other performs rem(1) and becomes { }. Both merge to { }.]
    { } ∪ ({1} - {1}) ∪ ({ } - {1})
    = { } ∪ { } ∪ { }
    = { } (expected {1})
    • Convergence is not sufficient; Intent is not preserved
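The failure above can be reproduced with the standard library's sets. This is a sketch that transcribes the slide's merge formula onto `Set.Make (Int)`; the module name `IntSet` and the example bindings are assumptions made for illustration.

```ocaml
module IntSet = Set.Make (Int)

(* (lca ∩ v1 ∩ v2) ∪ (v1 - lca) ∪ (v2 - lca), as on the slide. *)
let merge ~lca ~v1 ~v2 =
  IntSet.union
    (IntSet.inter lca (IntSet.inter v1 v2))
    (IntSet.union (IntSet.diff v1 lca) (IntSet.diff v2 lca))

let lca = IntSet.singleton 1

(* add(1) on a set that already contains 1 leaves no visible trace. *)
let v1 = IntSet.add 1 lca

(* rem(1) on the other replica. *)
let v2 = IntSet.remove 1 lca

(* The concurrent add is lost: the merge is empty, not {1}. *)
let merged = merge ~lca ~v1 ~v2
```

The merge still converges, since both replicas compute the same set, but the add-wins intent is not met because the state cannot tell a repeated add from no operation.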

  68. Concretising Intent
    • Intent is a woolly term
    ★ How can we formalise the intent of operations on a data structure?
    • We need
    ★ A formal language to specify the intent of an RDT
    ★ Mechanization to bridge the air gap between specification and implementation due to distributed system complexity
    [Diagram: a three-way merge with lca l, branches v1 and v2, and merged version v]

  74. Peepul: Certified MRDTs
    • An F* library implementing and proving MRDTs
    ★ https://github.com/prismlab/peepul
    • Specification language is event-based
    ★ Burckhardt et al. “Replicated Data Types: Specification, Verification and Optimality”, POPL 2014
    • Replication-aware simulation to connect specification with implementation
    • Composition of MRDTs and their proofs!
    • Extracted RDTs are compatible with Irmin, a Git-like distributed database

  82. Fixing OR-Set
    • Discriminate duplicate additions by associating a unique id
    • MRDT implementation
    [Diagram: both replicas start from { (a,1) }. One performs add(a) and becomes { (a,1); (a,2) }; the other performs rem(a) and becomes { }. Both merge to { (a,2) }.]
    { } ∪ ( { (a,1); (a,2) } - { (a,1) }) ∪ ( { } - { (a,1) } )
    = { } ∪ { (a,2) } ∪ { }
    = { (a,2) }
    Unique Lamport Timestamps
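The fix can be sketched by storing (element, id) pairs and reusing the same merge formula. The sketch assumes the caller supplies a fresh id for every add (the slide suggests unique Lamport timestamps); the module and function names are illustrative and this is not Peepul's verified implementation.

```ocaml
module Pair = struct
  type t = char * int
  let compare = compare
end

module PairSet = Set.Make (Pair)

(* Every add carries a fresh id, so a repeated add is visible in the state. *)
let add x id s = PairSet.add (x, id) s

(* Remove drops every pair for the element that this replica has observed. *)
let remove x s = PairSet.filter (fun (y, _) -> y <> x) s

(* Reading forgets the ids. *)
let read s = PairSet.elements s |> List.map fst |> List.sort_uniq compare

let merge ~lca ~v1 ~v2 =
  PairSet.union
    (PairSet.inter lca (PairSet.inter v1 v2))
    (PairSet.union (PairSet.diff v1 lca) (PairSet.diff v2 lca))

let lca = PairSet.singleton ('a', 1)
let v1 = add 'a' 2 lca
let v2 = remove 'a' lca
let merged = merge ~lca ~v1 ~v2
```

The pair `('a', 2)` was never observed by the removing replica, so it survives the merge and the element is still present when read.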

  86. Specifying OR-Set
    Abstract state
    [Diagram: an add(a) event, followed by a concurrent add(a) and rem(a), followed by rd, linked by vis edges; rd returns { a }]
    [Diagram: the corresponding concrete states: { (a,1) }; then { (a,1); (a,2) } after add(a) and { } after rem(a); both merge to { (a,2) }]
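An event-based specification of this kind can be sketched as a function over events and a visibility relation. The encoding below is an assumption made for illustration (integer event ids, `vis` as a list of pairs, and the usual add-wins reading that an element is present if some add of it is not visible to any remove of it); it is not the F* specification used in Peepul.

```ocaml
type op = Add of char | Rem of char

type event = { id : int; op : op }

(* Abstract state: the events so far and the visibility relation,
   where (e, f) means event e is visible to event f. *)
type abstract = { events : event list; vis : (int * int) list }

let visible st e f = List.mem (e.id, f.id) st.vis

(* An element is in the set if some add of it is not visible to any remove of it. *)
let spec_read st =
  st.events
  |> List.filter_map (fun e ->
         match e.op with
         | Add x ->
             let removed =
               List.exists (fun f -> f.op = Rem x && visible st e f) st.events
             in
             if removed then None else Some x
         | Rem _ -> None)
  |> List.sort_uniq compare

(* The slide's execution: add(a), then a concurrent add(a) and rem(a). *)
let example =
  { events =
      [ { id = 1; op = Add 'a' }; { id = 2; op = Add 'a' }; { id = 3; op = Rem 'a' } ];
    vis = [ (1, 2); (1, 3) ] }
```

The second add is concurrent with the remove, so no remove has seen it and the read returns the element.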

  87. Simulation Relation
    • Connects the abstract execution with the concrete state
    • For the OR-set,

  92. Verifying Operations
    1. Show that the simulation holds for operations
    [Formula; annotations: to prove, operation definition, defined once-and-for-all, simulation relation]
    2. Show that the simulation holds for merge
    [Formula; annotations: merge definition, assume, defined once-and-for-all, to prove]

  96. Verifying Operations
    3. Show that the specification and the implementation agree on the return values of operations
    4. Convergence
    ✦ Permits the different replicas to converge to states that are observationally equal but not structurally equal
    ✤ Example: differently balanced BSTs
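The distinction between structural and observational equality can be illustrated with two binary search trees that hold the same elements in different shapes. The `tree` type and the use of an in-order traversal as the only observation are assumptions for this sketch.

```ocaml
type tree = Leaf | Node of tree * int * tree

(* In-order traversal: the only observation clients are assumed to make. *)
let rec to_list = function
  | Leaf -> []
  | Node (l, x, r) -> to_list l @ (x :: to_list r)

(* A balanced tree and a right-leaning tree over the same elements. *)
let t1 = Node (Node (Leaf, 1, Leaf), 2, Node (Leaf, 3, Leaf))
let t2 = Node (Leaf, 1, Node (Leaf, 2, Node (Leaf, 3, Leaf)))

let structurally_equal = t1 = t2
let observationally_equal = to_list t1 = to_list t2
```

Two replicas holding `t1` and `t2` answer every query identically, so treating them as converged is sound even though the trees differ.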

  99. Space-efficient OR-Set
    • Recall that the OR-set has duplicates
    • How can we remove them?
    • Idea
    ★ On addition, replace existing element’s timestamp with the new timestamp
    ★ On merge, pick the larger timestamp
    [Diagram: the earlier OR-set execution: { (a,1) }; then { (a,1); (a,2) } after add(a) and { } after rem(a); both merge to { (a,2) }]
    Correctness argument is tricky

  102. Space-efficient OR-Set
    [Diagram, original OR-set: { (a,1) }; then { (a,1); (a,2) } after add(a) and { } after rem(a); both merge to { (a,2) }]
    [Diagram, space-efficient OR-set: { (a,1) }; then { (a,2) } after add(a) and { } after rem(a); both merge to { (a,2) }]
    Simulation relation is more intricate, as one would expect
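The two rules (replace the timestamp on add, keep the larger timestamp on merge) can be sketched over an association list with at most one entry per element. This is an illustrative reading of the slides, not the verified Peepul implementation; the helper `dedup` and the list representation are assumptions, and timestamps are assumed to be unique and increasing.

```ocaml
(* At most one (element, timestamp) entry per element. *)
type 'a t = ('a * int) list

(* On addition, the new timestamp replaces any existing one. *)
let add x ts (s : 'a t) : 'a t = (x, ts) :: List.remove_assoc x s

let remove x (s : 'a t) : 'a t = List.remove_assoc x s

let read (s : 'a t) = List.sort compare (List.map fst s)

(* Keep one entry per element, preferring the larger timestamp. *)
let dedup (s : 'a t) : 'a t =
  List.fold_left
    (fun acc (x, ts) ->
      match List.assoc_opt x acc with
      | Some ts' when ts' >= ts -> acc
      | _ -> (x, ts) :: List.remove_assoc x acc)
    [] s

let merge ~lca ~v1 ~v2 =
  let unmodified = List.filter (fun p -> List.mem p v1 && List.mem p v2) lca in
  let added v = List.filter (fun p -> not (List.mem p lca)) v in
  dedup (unmodified @ added v1 @ added v2)

(* The slide's execution: add(a) on one replica, rem(a) on the other. *)
let lca = [ ('a', 1) ]
let v1 = add 'a' 2 lca
let v2 = remove 'a' lca
let merged = merge ~lca ~v1 ~v2
```

Because an add now overwrites the old pair, the state no longer records every add event, which is one reason the simulation relation for this variant is harder to state.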

  103. Verification effort

  104. Composing CRDTs is HARD!

  105. Composing IRC-style chat
    • Build IRC-style group chat
    ★ Send and read messages in channels
    ★ For simplicity, channels and messages cannot be deleted
    • Represent application state as a grow-only map with string (channel name) keys and mergeable logs as values
    • Goal:
    ★ Map and log proved correct separately
    ★ Use the proofs of the underlying RDTs to prove chat application correctness

  109. Generic Map MRDT
    • Specification
    [Formula, followed by a "where" clause]
    • Project filters the abstract state of the map on the key k and returns an abstract state of the underlying data type
    ★ Provided by the user once for a generic MRDT
    [Diagram: events set (“general”, append (“hello”)), set (“compiler”, append (“error”)) and set (“general”, append (“world”)), linked by vis edges; get (“general”, rd) returns [“world”; “hello”]]

  110. 28
    Generic Map MRDT
    Implementation
    Simulation Relation

    View Slide

  111. 28
    Generic Map MRDT
    Implementation
    Simulation Relation
    Get applies given operation on the
    value at key k and returns the value

    View Slide

  112. 28
    Generic Map MRDT
    Implementation
    Simulation Relation
    Get applies given operation on the
    value at key k and returns the value
    Set is Get + update the map
    with the new state

    View Slide

  113. 28
    Generic Map MRDT
    Implementation
    Simulation Relation
    Get applies given operation on the
    value at key k and returns the value
    Set is Get + update the map
    with the new state
    Merge uses the merge of the
    underlying value type!

    View Slide

  114. Generic Map MRDT
    Implementation
    • Get applies given operation on the value at key k and returns the value
    • Set is Get + update the map with the new state
    • Merge uses the merge of the underlying value type!
    Simulation Relation
    • Simulation relation appeals to the value type’s simulation relation!
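
A minimal Python sketch of the idea, assuming a mergeable log stored as (timestamp, message) pairs with the newest entry first and unique timestamps. All function names here are hypothetical and the code is not the verified F* implementation. It only illustrates how the map's three-way merge can delegate to the value type's merge.

```python
from typing import Callable, Dict, List, Tuple

Log = List[Tuple[int, str]]  # (timestamp, message), newest first (assumption)

def log_append(log: Log, ts: int, msg: str) -> Log:
    """Return a new log with the message added at the front."""
    return [(ts, msg)] + log

def log_read(log: Log) -> List[str]:
    """Return the messages, newest first."""
    return [msg for _, msg in log]

def log_merge(lca: Log, a: Log, b: Log) -> Log:
    """Three-way merge for a grow-only log: both versions extend the LCA at the front."""
    new_a = a[: len(a) - len(lca)]
    new_b = b[: len(b) - len(lca)]
    return sorted(new_a + new_b, reverse=True) + lca

def chat_set(m: Dict[str, Log], key: str, ts: int, msg: str) -> Dict[str, Log]:
    """Set: apply the operation to the value at `key` and store the new state."""
    updated = dict(m)
    updated[key] = log_append(m.get(key, []), ts, msg)
    return updated

def chat_get(m: Dict[str, Log], key: str) -> List[str]:
    """Get: apply a read to the value at `key` and return the result."""
    return log_read(m.get(key, []))

def map_merge(lca: dict, a: dict, b: dict, value_merge: Callable, init) -> dict:
    """Merge key by key, using the merge of the underlying value type."""
    keys = set(a) | set(b)
    return {
        k: value_merge(lca.get(k, init), a.get(k, init), b.get(k, init))
        for k in keys
    }
```

With an LCA holding “hello” in “general”, one branch appending “world” to “general” and the other appending “error” to “compiler”, `map_merge(lca, a, b, log_merge, [])` yields a map where `chat_get(m, "general")` returns `["world", "hello"]`.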

  115. Composing IRC-style chat
    • Program state is constructed by instantiating generic map with
    mergeable log
    ★ The proof of correctness of the chat application directly follows from the
    composition!

  118. Mergeable Queues
    • Replicated queue with at-least-once dequeue semantics
    ★ First verified queue RDT!
    • Our aim is to have O(1) enqueue and dequeue and O(n) merge

  121. Mergeable Queues
    • Implementation
    ★ Uses two-list functional queue implementation
    ✦ amortised O(1) enqueue and dequeue operations
    ★ Merge uses longest common contiguous subsequence
    algorithm: O(n)
    • Specification (M is the merge of versions A and B, whose lowest common ancestor is LCA)
    1. Any element popped in either A or B does not
    remain in M
    2. Any element pushed into either A or B appears in M
    3. An element that remains untouched in LCA, A, B
    remains in M
    4. Order of pairs of elements in LCA, A, B must be
    preserved in M, if those elements are present in M.
    Implementation far removed from the
    specification!
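
The two-list queue and the four specification clauses can be sketched in Python as below. This rests on two assumptions: every element is a (timestamp, value) pair with a unique timestamp, and timestamps increase with enqueue order. `merge_spec` is a direct, specification-level merge that sorts the new elements. It is not the O(n) longest-common-contiguous-subsequence merge named on the slide, and all names are hypothetical.

```python
# Immutable linked lists are encoded as nested pairs: (head, rest) or None.
# A queue is a pair (front, back); `back` holds recent enqueues, newest first.
EMPTY = (None, None)

def enqueue(q, item):
    """O(1): push onto the back list."""
    front, back = q
    return (front, (item, back))

def _reverse(lst):
    out = None
    while lst is not None:
        head, lst = lst
        out = (head, out)
    return out

def dequeue(q):
    """Amortised O(1): reverse `back` into `front` only when `front` runs out."""
    front, back = q
    if front is None:
        front, back = _reverse(back), None
    if front is None:
        return None, q          # empty queue: nothing to dequeue
    head, rest = front
    return head, (rest, back)

def to_list(q):
    """Elements in dequeue order."""
    front, back = q
    items, tail = [], []
    while front is not None:
        head, front = front
        items.append(head)
    while back is not None:
        head, back = back
        tail.append(head)
    return items + tail[::-1]

def merge_spec(lca, a, b):
    """Specification-level three-way merge on lists of (timestamp, value)."""
    in_lca, in_a, in_b = set(lca), set(a), set(b)
    # Clauses 1 and 3: an LCA element survives only if neither side popped it.
    survivors = [x for x in lca if x in in_a and x in in_b]
    # Clauses 2 and 4: new elements from both sides, in timestamp order.
    added = sorted([x for x in a if x not in in_lca] +
                   [x for x in b if x not in in_lca])
    return survivors + added
```

For instance, if the LCA holds timestamps 1, 2, 3, version A dequeues 1 and enqueues 4, and version B dequeues 1 and 2 and enqueues 5, the merged queue holds 3, 4, 5 in that order.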

  122. Verification effort

  127. Summary
    • Programming and proving with RDTs is complicated due to
    concurrency and the lack of suitable programming abstractions
    • MRDTs simplify RDTs by implementing them as extensions of
    sequential data types
    ★ Reasoning about correctness is still hard
    • Peepul is an F* library for certified MRDTs
    ★ Replication-aware simulation for proving complex MRDTs
    ★ Complex MRDTs can be constructed and proved using simpler MRDTs
    • F* allows us to strike a balance between automated and
    interactive proofs
    ★ Extract to OCaml and run on Irmin!

  128. Backup Slides

  129. Queue Performance