Verifying Strong Eventual Consistency in Distributed Systems

Victor B. F. Gomes, Martin Kleppmann, Dominic P. Mulligan, Alastair R. Beresford
Read By: Raghav Roy

whoami

What I will be covering
• Motivation
• Foundation
• Implementation

What I will be covering
• {Strong, Eventual, Strong Eventual} Consistency
• The role real-world networks play
• Prerequisite Information
• Why other algorithms failed - Brief
• Proof Strategy and General Purpose Models
• Implementation of CRDTs
• Final Remarks/Conclusions

{Strong, Eventual, Strong Eventual} Consistency

Strong Consistency
• Replicas update in the same order
• Consensus
  ◦ Serialisation Bottleneck
  ◦ Tolerates fewer than n/2 faults
• Sequential, Linearisable

Eventual Consistency
• Update locally, then propagate
  ◦ No foreground sync
  ◦ Eventual, reliable delivery
• On conflict
  ◦ Arbitrate
  ◦ Roll-back
• Consensus moved to background

Diverge

Conflict!

Strong Eventual Consistency
• Update locally, then propagate
  ◦ No foreground sync
  ◦ Eventual, reliable delivery
• No conflict
  ◦ Unique outcome of concurrent updates
• No consensus: tolerates up to n-1 faults
• Solves CAP theorem? (A critique of the CAP theorem)
• Does not satisfy Strong Consistency conditions (concurrent ops can happen in any order)

Op-Based Commute
Necessary conditions for convergence
• Liveness: All replicas execute all operations in delivery order
• Safety: Concurrent operations commute

Updates a and b should commute

Now Add Networks!

Networks Do Not Spark Joy
• Replication algorithm must operate across computer networks
• These may arbitrarily delay, drop, or re-order messages
• Experience temporary partitions of the nodes
• Suffer node failures
• Making false assumptions about this execution environment -> incorrect models

Why Other Algorithms Failed

Why They Failed
• Wrong assumptions about the Network Infrastructure
• The requirement for a central server for some of these increases the risk of faults
• Their informal reasoning has produced plausible-looking, but incorrect algorithms
(The next two slides are borrowed from Martin Kleppmann’s talk)

Isabelle/HOL

Relevant Semantics
What is a Formal Proof?
• A derivation in formal calculus
  ◦ For example: A ∧ B → B ∧ A
  ◦ Left as an exercise for the reader: (check it out here)
What is a Theorem Prover?
• In the context of Isabelle - Automated proofs, and interactive
• Based on rules and axioms
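
For readers who want to see what such a derivation looks like once a machine checks it, here is a minimal sketch in Lean 4 rather than Isabelle (a deliberate swap, since the talk itself uses Isabelle/HOL). The theorem name `and_swap_sketch` is made up for this note.

```lean
-- Conjunction is commutative: from a proof of A ∧ B, build a proof of B ∧ A.
-- The anonymous-constructor pattern ⟨ha, hb⟩ splits the pair of proofs,
-- and ⟨hb, ha⟩ reassembles them in the opposite order.
theorem and_swap_sketch (A B : Prop) : A ∧ B → B ∧ A :=
  fun ⟨ha, hb⟩ => ⟨hb, ha⟩
```

The proof term is just a function on evidence, which is the sense in which a formal proof is a derivation in a calculus.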

Relevant Semantics
• Other verification tools: Model checking, static analysis (do not deliver proofs)
• Analyse systems thoroughly
• Find design and specification errors early
• High assurance, etc

Relevant Semantics
Logical Implications
• A ⟹ B ⟹ C or ⟦A; B⟧ ⟹ C
  ◦ Read: A and B imply C
• Used to write rules, theorems and proof states
• t → s for logical implication between formulae

Relevant Semantics
• λx. t
• Anonymous function mapping an argument x to t(x)

Relevant Semantics
Lists
• [] or ‘nil’ - empty list
• # - “cons” - prepends an element to an existing list
• @ - concatenation/appending

Relevant Semantics
Sets
• {} - empty set
• t ∪ u, t ∩ u, and x ∈ t have usual meanings

Relevant Semantics
Definitions and theorems
• Inductive relations are defined with the inductive keyword.
    inductive only-fives :: nat list ⇒ bool where
      only-fives [] |
      ⟦ only-fives xs ⟧ ⟹ only-fives (5#xs)

Relevant Semantics
Definitions and theorems
• Lemmas, theorems, and corollaries can be asserted using the lemma, theorem, and corollary keywords
    theorem only-fives-concat:
      assumes only-fives xs and only-fives ys
      shows only-fives (xs @ ys)
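
To make the only-fives relation and its concatenation theorem concrete, the following is a small Python sketch. It is an illustration and not part of the Isabelle development: `only_fives` mirrors the two inductive rules, and `check_concat_up_to` (a hypothetical helper) only tests the theorem on small lists, which is much weaker than the proof Isabelle provides.

```python
from itertools import product


def only_fives(xs):
    """Mirror of the two inductive rules for only-fives."""
    if not xs:                       # rule 1: only-fives []
        return True
    head, *tail = xs                 # rule 2: only-fives xs ==> only-fives (5 # xs)
    return head == 5 and only_fives(tail)


def check_concat_up_to(max_len):
    """Test the concat property on all lists of 4s and 5s up to max_len elements."""
    lists = [list(p) for n in range(max_len + 1) for p in product([4, 5], repeat=n)]
    return all(
        only_fives(xs + ys)
        for xs in lists
        for ys in lists
        if only_fives(xs) and only_fives(ys)
    )
```

Exhaustive testing like this covers finitely many cases; the theorem statement in Isabelle covers every pair of lists.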

Relevant Semantics
Definitions and theorems
• Locales: May be thought of as an interface with associated laws that implementations must obey
    locale semigroup =
      fixes f :: 'a ⇒ 'a ⇒ 'a
      assumes f x (f y z) = f (f x y) z
• Introduces a locale, with a fixed, typed constant f, and a law asserting that f is associative.
• Functions and constants may now be defined, and theorems conjectured and proved
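
One way to build intuition for a locale is to treat its assumption as a law that any candidate implementation must pass. The Python sketch below is an analogy only: `satisfies_semigroup_law` is a made-up helper that samples the associativity law on a finite set of values, whereas an Isabelle locale interpretation requires a proof for all values.

```python
from itertools import product


def satisfies_semigroup_law(f, samples):
    """Check f x (f y z) = f (f x y) z on every triple drawn from samples."""
    return all(
        f(x, f(y, z)) == f(f(x, y), z)
        for x, y, z in product(samples, repeat=3)
    )


# Addition passes the law on these samples; subtraction does not.
addition_ok = satisfies_semigroup_law(lambda a, b: a + b, [0, 1, 2, 3])
subtraction_ok = satisfies_semigroup_law(lambda a, b: a - b, [0, 1, 2, 3])
```

Anything proved inside the locale using only associativity would then apply to addition, but not to subtraction.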

Proof Strategy

Proof Strategy Breather?

Proof Strategy
• The approach here breaks the proof into simple modules or Locales
• More than half the code goes into constructing a general-purpose model of consistency and an axiomatic network model
• The remainder - Formalisation of three CRDTs and their proofs of correctness
• By keeping the general-purpose modules abstract and independent
• They are able to create a reusable library of specifications and theorems

Proof Strategy
• Formalisation of Strong Eventual Consistency (SEC)
  ◦ What they mean by convergence - prove an abstract convergence theorem
• This is independent of networks or any particular CRDT
• Describing an axiomatic model of asynchronous networks
  ◦ Only part of the proof with any axiomatic assumptions
• Prove that the network satisfies the ordering properties required by the abstract convergence theorem
• Use these two models to prove SEC for concrete algorithms (RGA, Counter, OR-Set)

Abstract Convergence

Abstract Convergence
• SEC is stronger than Eventual Consistency
  ◦ Whenever two nodes have received the same set of updates, they must be in the same state
  ◦ Constrains the value ‘read’ can return at any time
• To formalise this using Isabelle, no assumptions are made about the network or the data structures
  ◦ Abstract model of operations - can be reordered

You are here

Happens Before and Causality
• Simplest way to achieve convergence - All operations commute
  ◦ Too strong to be useful
• Better - Only “concurrent” operations commute
• Any other operations that “knew about” each other, i.e. have a “happens-before” relationship (causal dependency), need not commute
  ◦ x ≺ y to indicate that operation x happened before y
  ◦ Type - 'oper ⇒ 'oper ⇒ bool
  ◦ ≺ can be applied to two operations of some abstract type 'oper, returning either True or False
• Must be a strict partial order - irreflexive, transitive, antisymmetric
• x and y are “concurrent”, written x ∥ y, whenever neither happens before the other: ¬(x ≺ y) and ¬(y ≺ x)

Happens Before and Causality
• Inductive definition of operations being consistent with happens-before (or simply hb-consistent)
    inductive hb-consistent :: 'oper list ⇒ bool where
      hb-consistent [] |
      ⟦ hb-consistent xs; ∀ x ∈ set xs. ¬ y ≺ x ⟧ ⟹ hb-consistent (xs @ [y])
• The empty list is hb-consistent
• Furthermore, given an hb-consistent list xs, we can append an operation y
  ◦ Provided that y does not happen-before any existing operation x in xs
• If x ≺ y and both are in the list, then x must appear before y in the list.
• However, if x ∥ y, the operations can appear in the list in either order.
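
The inductive definition can be read as a left-to-right scan, which the Python sketch below spells out. It is an illustration under assumptions: `hb` is a hypothetical predicate standing in for ≺, and the three operations "a", "b", "c" with a ≺ b and a ≺ c are invented for the example.

```python
def hb_consistent(ops, hb):
    """True if the list can be built by the two hb-consistent rules.

    hb(x, y) is a hypothetical predicate standing in for x ≺ y.
    """
    seen = []
    for y in ops:
        # Appending y is allowed only if y does not happen-before
        # any operation already in the list.
        if any(hb(y, x) for x in seen):
            return False
        seen.append(y)
    return True


def concurrent(x, y, hb):
    """x ∥ y: neither operation happens before the other."""
    return not hb(x, y) and not hb(y, x)


# Invented example: a ≺ b and a ≺ c, while b and c are concurrent.
HB_PAIRS = {("a", "b"), ("a", "c")}


def example_hb(x, y):
    return (x, y) in HB_PAIRS
```

With this relation, both ["a", "b", "c"] and ["a", "c", "b"] are accepted, while any list that places "b" or "c" before "a" is rejected.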

Interpretation of operations
• Modeling state changes - “interpretation function” of type
  ◦ interp :: 'oper ⇒ 'state ⇒ 'state option
• This can be looked at as a “state transformer” - a function that maps an old state to a new state, or fails by returning “None”
• Capturing this in a Locale -
    locale happens-before = preorder hb-weak hb
      for hb-weak :: 'oper ⇒ 'oper ⇒ bool
      and hb :: 'oper ⇒ 'oper ⇒ bool +
      fixes interp :: 'oper ⇒ 'state ⇒ 'state option
• This locale extends the “preorder” locale - useful lemmas
• Constants under this: hb-weak and hb, providing a preorder and a strict partial order respectively
• Fixes the “interp” function with the type signature (no implementation yet)

Interpretation of operations
• Given two operations x and y, we can now define the composition of state transformers
• 〈x〉 |> 〈y〉 denotes the state transformer that first applies the effect of x to some state, and then applies the effect of y to the result.
• If either 〈x〉 or 〈y〉 fails, the combined state transformer also fails

Interpretation of operations
• Let’s define apply-operations
• This will compose an arbitrary list of operations into a state transformer
    definition apply-operations :: 'oper list ⇒ 'state ⇒ 'state option where
      apply-operations ops ≡ foldl (op |>) Some (map interp ops)
• The result: state transformer that applies the interpretation of each of the operations in the list in left to right order to some initial state
• Any failed operation causes the entire composition to return None
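
The fold in apply-operations can be mimicked in Python with `functools.reduce`. This is a sketch under assumptions: Python's `None` plays the role of Isabelle's None (so `None` cannot also be a valid state), the identity function plays the role of Some, and `interp` here is a hypothetical interpretation for a counter that refuses to go negative.

```python
from functools import reduce


def kleisli(f, g):
    """Compose two state transformers; a None from either makes the result None."""
    def composed(state):
        mid = f(state)
        return None if mid is None else g(mid)
    return composed


def apply_operations(ops, interp):
    """Fold the interpretations of ops, left to right, starting from identity."""
    def identity(state):             # plays the role of `Some`
        return state
    return reduce(kleisli, (interp(op) for op in ops), identity)


def interp(op):
    """Hypothetical interpretation: ("inc", n) adds n, ("dec", n) fails below zero."""
    kind, n = op
    if kind == "inc":
        return lambda s: s + n
    return lambda s: s - n if s >= n else None
```

Applying [("inc", 2), ("dec", 1)] to state 0 succeeds, while [("dec", 1), ("inc", 2)] fails at the first step and the failure propagates through the rest of the fold.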

Commutativity and Convergence

Commutativity and Convergence
• Operations x and y commute when 〈x〉 |> 〈y〉 = 〈y〉 |> 〈x〉
  ◦ We can swap the order of the interpretation without changing the resulting state
• Too strong to have this hold for all operations
• Only required to hold for operations that are concurrent: this is the concurrent-ops-commute property
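
The slide's definition is not reproduced in this transcript. As a hedged sketch of what a predicate such as concurrent-ops-commute (named on a later slide) could check, the Python below compares both application orders for every concurrent pair. The assumptions are mine: equality of state transformers is only sampled on a finite collection of `states`, and the `hb` predicate, `interp` function, and "add"/"set" operations are invented for the example.

```python
from itertools import combinations


def concurrent_ops_commute(ops, hb, interp, states):
    """For each concurrent pair in ops, both orders must agree on every sample state."""
    def step(op, state):
        # Propagate failure: once a state is None it stays None.
        return None if state is None else interp(op, state)

    for x, y in combinations(ops, 2):
        if hb(x, y) or hb(y, x):
            continue                 # causally ordered pairs need not commute
        for s in states:
            if step(y, step(x, s)) != step(x, step(y, s)):
                return False
    return True


def interp(op, state):
    """Hypothetical register: ("add", n) adds n, ("set", n) overwrites with n."""
    kind, n = op
    return state + n if kind == "add" else n


def no_hb(x, y):
    """Every pair of operations is concurrent."""
    return False


def add_before_set(x, y):
    """Only ("add", 1) ≺ ("set", 5) holds."""
    return x == ("add", 1) and y == ("set", 5)
```

Two concurrent additions pass the check. A concurrent add and set do not, but the same pair is acceptable once happens-before orders them, which is exactly why the requirement is restricted to concurrent operations.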

Commutativity and Convergence
• Let’s show Convergence!
• Two “hb-consistent” lists of “distinct” operations - Let these be permutations of each other
• If concurrent operations commute then they have the same interpretation
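
A small brute-force experiment shows the statement in action. This Python sketch is illustrative only: the three set operations, the single happens-before pair (the add of "x" precedes its removal), and the helper names are assumptions made for the example, and enumerating permutations is no substitute for the Isabelle proof.

```python
from itertools import permutations

ADD_X, ADD_Y, REM_X = ("add", "x"), ("add", "y"), ("rem", "x")


def hb(p, q):
    """Invented relation: only ADD_X ≺ REM_X; every other pair is concurrent."""
    return (p, q) == (ADD_X, REM_X)


def hb_consistent(ops):
    """No operation may appear after one that it happens-before."""
    seen = []
    for y in ops:
        if any(hb(y, x) for x in seen):
            return False
        seen.append(y)
    return True


def run(ops, state=frozenset()):
    """Apply add/remove operations to a set, left to right."""
    for kind, elem in ops:
        state = state | {elem} if kind == "add" else state - {elem}
    return state


# All hb-consistent orderings of the same three distinct operations.
valid_orders = [p for p in permutations([ADD_X, ADD_Y, REM_X]) if hb_consistent(p)]
final_states = {run(p) for p in valid_orders}
```

Every hb-consistent ordering ends in the same state, while an ordering that violates causality (removing "x" before adding it) ends somewhere else.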

Formalising Strong Eventual Consistency
• The only thing left to consider is “progress”
• Valid operations should not become stuck in an error state
    locale strong-eventual-consistency = happens-before +
      fixes op-history :: 'oper list ⇒ bool
      and initial-state :: 'state
      assumes causality: ⟦ op-history xs ⟧ ⟹ hb-consistent xs
      and distinctness: ⟦ op-history xs ⟧ ⟹ distinct xs
      and trunc-history: ⟦ op-history (xs @ [x]) ⟧ ⟹ op-history xs
      and commutativity: ⟦ op-history xs ⟧ ⟹ concurrent-ops-commute xs
      and no-failure: ⟦ op-history (xs @ [x]); apply-operations xs initial-state = Some state ⟧ ⟹ 〈x〉 state ≠ None

Formalising Strong Eventual Consistency
• Some details on that,
• Concise summary of the properties that we require in order to achieve SEC
• op-history is an abstract predicate describing any valid operation history of some algorithm following: (concurrent-ops-commute, distinct, hb-consistent)
• We can use this to prove the two safety properties of SEC as theorems

Formalising Strong Eventual Consistency
• Operations convergence

Formalising Strong Eventual Consistency
• Progress (no failure for valid ops)

Formalising Strong Eventual Consistency
• First three assumptions are satisfied by the network model (no algorithm-specific proofs required)
• For individual algorithms we only need to prove commutativity and no-failure
• Note: trunc-history assumption requires that every prefix of a valid operation history is also valid
  ◦ => Convergence theorem holds at every step of the execution.
  ◦ Not at some unspecified time in the future (eventual consistency)
• Making SEC stronger than EC

You are here

Axiomatic Network Model

Axiomatic Network Model
• In this section, we develop a formal definition of an asynchronous unreliable causal broadcast network
• This model will then satisfy the causal delivery requirements of many Op-based CRDTs
• Also, this makes it suitable for use in decentralised settings without the need of a central server, or a quorum of nodes (Stronger consistency models do not have this property).
• The asynchronous aspect means that we make no timing assumptions
  ◦ Messages sent over the network may suffer unbounded delays before they are delivered
  ◦ Nodes may pause their execution for unbounded periods of time
• Unreliable means that messages may never arrive at all
  ◦ Nodes may fail permanently
• Networks are shown to act this way in practice!

Modeling a Distributed System
• Aim: Model as an unbounded number of nodes
• No assumptions of the communication pattern
• We assume that each node is uniquely identified by a natural number (totally ordered)
• Every node’s history has every event (execution step) stored in it - Standard
• History of node i is obtained by “history” -> list of events
• “distinct” is an Isabelle library function that asserts that a list contains no duplicates
• Note: No assumptions made about the number of nodes (can model dynamic nodes, they can leave, join, and fail)

You are here

Modeling a Distributed System
• Node’s history is finite; at the end of the node history the node could have failed or successfully terminated
• Node failures are treated as permanent - Crash Stop abstraction
• x ⊏i y means that event x comes before event y in the node history of i
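
The node-history model is small enough to sketch directly. In the Python below, which is an illustration and not the Isabelle locale, a `history` is assumed to be a dict from node number to a list of hashable events, `histories_distinct` checks the no-duplicates requirement, and `comes_before` is a hypothetical stand-in for x ⊏i y.

```python
def histories_distinct(history):
    """Every node's event list must contain no duplicates."""
    return all(len(events) == len(set(events)) for events in history.values())


def comes_before(history, i, x, y):
    """x ⊏i y: both events occur at node i, and x occurs earlier than y."""
    events = history.get(i, [])
    return x in events and y in events and events.index(x) < events.index(y)


# Invented example with two nodes; node 1 has seen only one event.
example_history = {0: ["e1", "e2", "e3"], 1: ["e2"]}
```

Because histories are duplicate-free, comparing the first index of each event is enough to decide the ordering. A node that is absent from the dict simply has an empty history.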

Asynchronous Broadcast Network
• We extend the node-histories locale
• We need to define how nodes communicate - Broadcast or Deliver
  ◦ Deliver refers to a message being received from the network and “delivered” to an application
    datatype 'msg event = Broadcast 'msg | Deliver 'msg
• Can be thought of as a Deterministic State Machine where each transition corresponds to a broadcast or a deliver event.
• Broadcast abstraction is the standard for op-based CRDTs because it best fits the replication pattern
• Any node can accept writes and propagate them to other nodes
• More Locales!

Asynchronous Broadcast Network
• Now we can start formally specifying the properties of a broadcast network
• Three axioms: Delivery-Has-A-Cause, Deliver-Locally and Msg-Id-Unique
    locale network = node-histories history for history :: nat ⇒ 'msg event list +
      fixes msg-id :: 'msg ⇒ 'msgid
      assumes delivery-has-a-cause: ⟦ Deliver m ∈ set (history i) ⟧ ⟹ ∃ j. Broadcast m ∈ set (history j)
      and deliver-locally: ⟦ Broadcast m ∈ set (history i) ⟧ ⟹ Broadcast m ⊏ⁱ Deliver m
      and msg-id-unique: ⟦ Broadcast m1 ∈ set (history i); Broadcast m2 ∈ set (history j); msg-id m1 = msg-id m2 ⟧ ⟹ i = j ∧ m1 = m2

Asynchronous Broadcast Network
• Delivery Has a Cause: No "out of thin air" values; if m was delivered at some node, then there exists a node where m was broadcast
• Deliver Locally: Every broadcast message is also delivered to the node that broadcast it
• Msg ID Unique: We assume the existence of msg-id :: 'msg ⇒ 'msgid that maps every message to some global identifier (unique node IDs, sequence numbers, timestamps)
• Network locale inherits "histories-distinct" from node-histories
  ◦ For every message that is delivered on some node, there is exactly one broadcast event that created this message
  ◦ The same message is not delivered more than once to each node
• No assumptions are made about the reliability of the network (delays, reordering)
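To make the three axioms and histories-distinct concrete, here is a minimal Python sketch that checks them over finite, already-recorded histories. It is only an illustration, not the Isabelle locale: the names Event, History and the checker functions are hypothetical, and messages are assumed to be hashable (msg_id, payload) tuples.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List


@dataclass(frozen=True)
class Event:
    kind: str       # "Broadcast" or "Deliver"
    msg: Hashable   # assumption: a (msg_id, payload) tuple


def broadcast(m: Hashable) -> Event:
    return Event("Broadcast", m)


def deliver(m: Hashable) -> Event:
    return Event("Deliver", m)


# Hypothetical representation: node id -> events in local order.
History = Dict[int, List[Event]]


def histories_distinct(h: History) -> bool:
    """No event occurs twice in the history of a single node."""
    return all(len(evs) == len(set(evs)) for evs in h.values())


def delivery_has_a_cause(h: History) -> bool:
    """Every delivered message was broadcast by some node."""
    sent = {e.msg for evs in h.values() for e in evs if e.kind == "Broadcast"}
    return all(e.msg in sent
               for evs in h.values() for e in evs if e.kind == "Deliver")


def deliver_locally(h: History) -> bool:
    """A broadcast is followed by a delivery of the same message at that node."""
    for evs in h.values():
        for pos, e in enumerate(evs):
            if e.kind == "Broadcast" and deliver(e.msg) not in evs[pos + 1:]:
                return False
    return True


def msg_id_unique(h: History, msg_id: Callable[[Hashable], Hashable]) -> bool:
    """Two broadcasts with the same ID are the same message at the same node."""
    seen = {}
    for node, evs in h.items():
        for e in evs:
            if e.kind == "Broadcast":
                if seen.setdefault(msg_id(e.msg), (node, e.msg)) != (node, e.msg):
                    return False
    return True
```

Note that nothing here checks that a broadcast message eventually reaches other nodes, which matches the slide: reliability is not assumed.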

You are here

Causally Ordered Delivery
• We need to define an instance of the ordering relation ≺ on messages, and prove that it satisfies the properties of a strict partial order
• m1 happens before m2 if the node that generated m2 "knew about" m1 when m2 was generated

Causally Ordered Delivery
Verbal definition: m1 ≺ m2 if any of the following holds
• m1 and m2 were broadcast by the same node, and m1 was broadcast before m2.
• The node that broadcast m2 had delivered m1 before it broadcast m2.
• There exists some operation m3 such that m1 ≺ m3 and m3 ≺ m2
• Even more locales!
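The verbal definition can be sketched in Python as a relation computed from recorded histories. This is an illustrative sketch under assumptions: events are plain ("Broadcast" | "Deliver", msg) tuples, and happens_before is a hypothetical helper name, not something from the Isabelle development.

```python
from itertools import product


def happens_before(history):
    """Return the set of pairs (m1, m2) such that m1 ≺ m2.

    history: dict mapping node id -> list of (kind, msg) in local order.
    """
    hb = set()
    # Rules 1 and 2: anything a node broadcast or delivered before
    # broadcasting m2 is something it "knew about" when m2 was generated.
    for events in history.values():
        for pos, (kind, m2) in enumerate(events):
            if kind != "Broadcast":
                continue
            for _, m1 in events[:pos]:
                if m1 != m2:
                    hb.add((m1, m2))
    # Rule 3: close the relation under transitivity.
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(hb), repeat=2):
            if b == c and a != d and (a, d) not in hb:
                hb.add((a, d))
                changed = True
    return hb


def concurrent(hb, m1, m2):
    """Two distinct messages are concurrent if neither happens before the other."""
    return m1 != m2 and (m1, m2) not in hb and (m2, m1) not in hb
```

For example, if node 1 delivers m1 and then broadcasts m2, the pair (m1, m2) is in the relation, while a message m3 broadcast elsewhere without knowledge of m1 is concurrent with m1.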

Causally Ordered Delivery
• With this, we can create a restricted variant of our broadcast network model by extending the network locale
Assumptions
• If there are any happens-before dependencies between messages, they must be delivered in that order.
• Concurrent messages may be delivered in any order.
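The causal-delivery assumption can be illustrated with a small Python check. This is a sketch only: the happens-before relation is assumed to be given as a set of (m1, m2) pairs, events are assumed to be ("Broadcast" | "Deliver", msg) tuples, and causally_ordered is a hypothetical name.

```python
def causally_ordered(history, hb):
    """Check causal delivery on finite histories.

    history: dict node id -> list of (kind, msg) in local order.
    hb:      set of pairs (m1, m2) meaning m1 happens before m2.
    Returns True if, at every node, each delivered m2 is preceded by
    the delivery of every m1 with m1 ≺ m2.
    """
    for events in history.values():
        delivered = [m for kind, m in events if kind == "Deliver"]
        for pos, m2 in enumerate(delivered):
            earlier = delivered[:pos]
            for m1, target in hb:
                if target == m2 and m1 not in earlier:
                    return False
    return True
```

Concurrent messages place no constraint on each other here: with an empty relation, every delivery order passes the check.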

You are here

Using Operations in the Network
• So far we have only talked about the order of "messages", but we need to attach operations to these messages (Spoiler: New Locale!)
• We can carry our convergence theorem over into our network model by extending the causal-network locale
• All we need to do is specialise the message type variable 'msg to be a pair 'msgid × 'oper, and fix the msg-id function in this locale
• "fst" is used to return the first component (the message ID) of this pair

Using Operations in the Network

Using Operations in the Network
• Since this extends network, it also meets the requirements of the happens-before locale
• So the lemmas and definitions of this locale can use the happens-before relation ≺
• We can prove that the sequence of message deliveries at any node is consistent with happens-before (stated with the prefixed predicate "hb.hb-consistent")

Using Operations in the Network
    theorem hb.hb-consistent (node-deliver-messages (history i))
• Here, node-deliver-messages filters the history of events at some node to return only the messages that were delivered, in order
• When a message is delivered, we can take the operation 'oper from it and use its "interpretation" to update the state of that node

Using Operations in the Network
• We can then define the state of some node by defining "apply-operations" (with msg-id)
Remember this definition!
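A rough Python analogue of this idea, as a sketch under assumptions: events are ("Broadcast" | "Deliver", msg) tuples, each message is a (msg_id, oper) pair, and the interpretation function returns None to signal failure, mirroring an option type. The function names follow the slide, but the bodies are illustrative and not the Isabelle definitions.

```python
def node_deliver_messages(events):
    """Keep only the delivered messages of one node, in delivery order."""
    return [msg for kind, msg in events if kind == "Deliver"]


def apply_operations(events, initial_state, interpret):
    """Fold the interpretation function over the delivered operations.

    interpret(oper, state) returns the new state, or None on failure
    (assumption: None is never a legitimate state).
    """
    state = initial_state
    for _msg_id, oper in node_deliver_messages(events):
        state = interpret(oper, state)
        if state is None:
            return None   # a failed operation makes the whole run fail
    return state
```

Broadcast events are skipped on purpose: a node's state only changes when a message is delivered, including the local delivery of its own broadcasts.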

You are here

Only Valid Messages
• An algorithm may place restrictions on the messages that can be broadcast; we need a general-purpose way of modelling this
• Such restrictions may not be expressible in Isabelle's type system - New Locale!

Only Valid Messages
• Broadcast Only Valid Messages is the final axiom; all it requires is that
  ◦ If a node broadcasts a message, it must be valid according to "valid-msg"
• Algorithms embedded in this locale are the ones that define this predicate for valid messages.

You are here

Replication Algorithms
Breather?

Replicated Growable Array
• Replicated ordered list - supports insert and delete operations
• Here, every insert and delete must identify the position at which the modification should take place
• This is because, unlike in the non-replicated case, the index of a list element can change if there are concurrent inserts or deletes
• Insertion - after an existing list element (with a given ID), or at the head of the list if there is no ID
• Deletion - it is not safe to remove a list element outright, since concurrent insertions would not be able to locate it
  ◦ Retains a tombstone - deletion merely sets a flag to mark the element as deleted
  ◦ Later garbage collection can purge tombstones

Replicated Growable Array

Replicated Growable Array
• RGA state at each node - list of elements
• Each element is a triple
• Unique ID of the list element, value to be inserted, flag that indicates whether the element has been deleted
    type-synonym ('id, 'v) elt = 'id × 'v × bool
• Insert takes three params - previous state of the list, new element to insert, ID of the existing element after which the value has to be inserted
• Insert returns None if there is no existing element with the given ID

Replicated Growable Array
• The function iterates over the list and compares the ID of each element
• When the insertion position is found, "insert-body" is invoked to perform the actual insertion

Replicated Growable Array
• In a replicated datatype, several nodes could be inserting at the same location concurrently
• These insertions may be processed in a different order by different nodes
• How do we make it converge?
• Sort any concurrent insertions at the same position

Replicated Growable Array
• The insert-body function skips over elements with an ID greater than that of the newly added element
• IDs are totally ordered (specified above as 'id::{linorder})

Replicated Growable Array
• Implementing delete with the same idea
• It searches for the element with a given ID and sets its flag to True to mark it as deleted
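The insert, insert-body and delete functions described on the last few slides can be sketched in Python as follows. This is an illustrative translation rather than the Isabelle code: elements are (id, value, deleted) tuples with integer IDs assumed for simplicity, and None stands for both "insert at the head" (as a position) and "operation failed" (as a result).

```python
def insert_body(xs, e):
    """Place e before the first element whose ID is smaller than e's ID,
    skipping over elements with a greater ID."""
    for pos, x in enumerate(xs):
        if x[0] < e[0]:
            return xs[:pos] + [e] + xs[pos:]
    return xs + [e]


def insert(xs, e, after):
    """Insert e after the element with ID `after` (None means the head).
    Returns None if no element has that ID."""
    if after is None:
        return insert_body(xs, e)
    for pos, x in enumerate(xs):
        if x[0] == after:
            return xs[:pos + 1] + insert_body(xs[pos + 1:], e)
    return None


def delete(xs, target):
    """Set the tombstone flag of the element with ID `target`.
    Returns None if no element has that ID."""
    for pos, (ident, value, _flag) in enumerate(xs):
        if ident == target:
            return xs[:pos] + [(ident, value, True)] + xs[pos + 1:]
    return None


def visible(xs):
    """Values of the elements that have not been tombstoned."""
    return [value for _ident, value, flag in xs if not flag]
```

Two concurrent insertions after the same element end up in the same relative order whichever one is applied first, because insert_body orders them by ID; deleting an element leaves it in the list so that later insertions can still refer to it.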

Are We Forgetting Something?

Reasoning Commutativity
• We discussed earlier that the only thing we need to show for a particular algorithm is that all concurrent operations commute.
• Easy to show that delete always commutes with itself

Reasoning Commutativity
• To show insert commutes with itself, we need to make a few assumptions
• e1 and e2 are of type 'id × 'v × bool
• i1 :: 'id is the position after which e1 should be inserted
• Similarly, i2 is the position after which e2 should be inserted
• i1 can't refer to e2 and vice versa
• IDs of the two insertions must be distinct

Reasoning Commutativity
• Finally, we need to show that insert and delete commute
• The only constraint is that the element to be deleted is not the same as the element to be inserted (insert wins strategy)

Let’s Add This to The Network Model!

Embedding RGA in the network model
• To be able to prove SEC for RGA, we need to embed the insert and delete operations in the network model
• We need to define a datatype for these operations, and an interpretation function (we saw this earlier)

Embedding RGA in the network model
• When are these operations valid?
  ◦ IDs of the operations must be unique
  ◦ Whenever an element is referred to by an insert or delete, the element must exist
• We can now define the "valid-rga-msg" predicate (remember how we introduced this kind of type signature in the network-with-constrained-ops locale)

Embedding RGA in the network model
• With these definitions, we can simply define the rga locale by extending network-with-constrained-ops
• Initial state is the empty list
• Validity predicate is the one we described above

Embedding RGA in the network model
• We also need to show that whenever an insert or delete refers to an existing element, there is always a prior insertion operation that created this element

Embedding RGA in the network model
• Since the network ensures causally ordered delivery, all nodes must deliver the insertion op1 before the dependent operation op2
• Therefore, in all cases where operations do not commute, one happens before the other
• Or, when they are concurrent, we show that they commute

Embedding RGA in the network model
• Finally, we need to show that the failure case of the interpretation function is never reached.
• With this, it's easy to show (formally) that rga satisfies all the requirements of SEC

You are here

Two Other (Simpler) CRDTs

Increment-Decrement Counter
• We can now show that the proof framework provides reusable components that simplify the proofs for new algorithms
• Let's start with the simplest CRDT
• Increment and decrement a shared integer counter

Increment-Decrement Counter
• Interpretation function!

Increment-Decrement Counter
• It becomes an easy exercise to show commutativity of the operations
• We don't need to extend the network-with-constrained-ops locale
  ◦ The operations need not even be causally delivered
• Just with this, we can show that the counter is a sublocale of strong-eventual-consistency (from which we can obtain the convergence and progress theorems)
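As a sketch of why the counter is the easy case, here is a Python version of an interpretation function with a brute-force convergence check over every delivery order. The names Op, counter_op and final_states are assumptions for illustration; the check tests small examples and proves nothing in general.

```python
from enum import Enum
from itertools import permutations


class Op(Enum):
    INCREMENT = 1
    DECREMENT = 2


def counter_op(oper, state):
    """Interpretation function: never fails, so it always returns an int."""
    return state + 1 if oper is Op.INCREMENT else state - 1


def final_states(opers, initial=0):
    """Apply the operations in every possible order; collect the results."""
    results = set()
    for order in permutations(opers):
        state = initial
        for oper in order:
            state = counter_op(oper, state)
        results.add(state)
    return results
```

If final_states returns a single value, every delivery order of those operations converges to the same counter, which is why no causal ordering is needed here.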

Increment-Decrement Counter

You are here

Observed Removed Set
• ORSet is a well-known CRDT for implementing replicated sets
• Supports two operations - adding and removing arbitrary elements in the set
• Let's define the datatype
• 'id is the abstract type of message identifiers and 'a is the type of the values that the application wants to add to the set
• When element e needs to be added, Add i e is tagged with "i" to distinguish it from other operations that may add the same element to the set
• When element e needs to be removed, Rem is e is called
• It contains a set of identifiers "is" identifying all the additions of this element that causally happened-before this removal ("Observed" Remove)

Observed Removed Set
• Let's define this using its datatype
• The name comes from the fact that the algorithm "observes" the state of the node when removing an element
• The state at each node is a function that maps each element 'a to the set of IDs of operations that have added that element
• 'a is part of the ORSet if its set of IDs is non-empty; initial state - λx. {}
  ◦ The function that maps every possible element 'a to the empty set of IDs

Observed Removed Set
• When interpreting Add - add the identifier of that operation to the node state
• When interpreting Remove - update the node state to remove all causally prior Add identifiers

Observed Removed Set

Observed Removed Set
• Here, state((op-elem oper) := after) is Isabelle's syntax for pointwise function update.
• A remove operation effectively undoes the prior additions of that element to the set,
• while leaving any concurrent or later additions of the same element unaffected
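The ORSet interpretation can be sketched in Python with a dictionary standing in for the state function. This is an illustration under assumptions: Add, Rem and interpret_op are modelled on the slide's description, elements with an empty ID set are dropped from the dictionary so that equal states compare equal, and a missing key plays the role of the empty set.

```python
from dataclasses import dataclass
from typing import FrozenSet, Hashable


@dataclass(frozen=True)
class Add:
    id: Hashable      # unique identifier of this addition
    elem: Hashable


@dataclass(frozen=True)
class Rem:
    ids: FrozenSet    # addition identifiers observed when removing
    elem: Hashable


def interpret_op(oper, state):
    """Return the state after applying oper; the input state is not mutated."""
    before = state.get(oper.elem, frozenset())
    if isinstance(oper, Add):
        after = before | {oper.id}
    else:
        after = before - oper.ids
    new_state = dict(state)          # pointwise update of a copy
    if after:
        new_state[oper.elem] = after
    else:
        new_state.pop(oper.elem, None)
    return new_state


def contains(state, elem):
    """An element is in the set if its ID set is non-empty."""
    return bool(state.get(elem))
```

A removal that observed only the addition with ID 1 leaves a concurrent addition with ID 2 untouched, so the element stays in the set whichever of the two is applied first.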

Observed Removed Set
• Finally, to specify the ORSet locale, we need to show that Add and Rem use identifiers correctly.
• The identifier of each Add operation should be globally unique (the unique ID of the message)
• A Rem operation must contain the set of addition identifiers held in the node state
  ◦ At the moment the Rem operation was issued

Observed Removed Set
• With this, we can extend the network-with-constrained-ops locale to define the orset locale
• Now, for Strong Eventual Consistency, we must show that "apply-operations" never fails
  ◦ Easy here since the interpretation function never returns None

Observed Removed Set
• Finally, we need to show that concurrent operations commute (we're almost there!)
• For two concurrent adds or two removes, this is easy to verify

Observed Removed Set
• But add and remove operations commute only if the identifier of the addition is not one of the identifiers affected by the removal
• Showing that this holds for all concurrent Add and Rem operations is a bit more work

Observed Removed Set
• We define added-ids to be the identifiers of all Add operations in a list of delivery events (even if these are subsequently removed)
• Then, we can show that the set of identifiers in the node state is a subset of added-ids
  ◦ Add only ever adds IDs to the node state, and Rem only ever removes IDs

Observed Removed Set
• From this, we can show that if Add and Rem are concurrent, the identifier of the Add cannot be in the set of identifiers removed by Rem

Observed Removed Set
• Now that we have proved that the assumption of add-rem-commute holds for all concurrent operations,
• let's deduce that all concurrent operations commute:
• With these two results (apply-operations-never-fails and concurrent-operations-commute), we can immediately prove that orset is a sublocale of strong-eventual-consistency.

You are here

Whew! Some Final Remarks

Final Remarks
• When different nodes perform updates concurrently, without coordinating with each other (as with SEC),
• we need conflict resolution for concurrent updates at a single node
• User-defined conflict resolution - leave it for manual resolution by the user
• Last Write Wins - pick the version with the highest timestamp (discard other versions)
• Arbitrarily choose which operation wins over the other (Add over Remove or Insert over Delete)

Final Remarks
• Informal reasoning has repeatedly produced approaches that fail to converge in certain scenarios - several proofs turned out to be false
• Formal verification of distributed systems is an active area of research
• This is a very interesting paper

Additional Papers That Can Be Read
• Formal design and verification of operational transformation algorithms for copies convergence
• Failed Operational Transform Models
  ◦ Concurrency control in groupware systems
  ◦ An Integrating, Transformation-Oriented Approach to Concurrency Control and Undo in Group Editors
• Tutorial to Locales and Locale Interpretation
• Detecting causal relationships in distributed computations: In search of the holy grail

References
• Verifying Strong Eventual Consistency in Distributed Systems
• GitHub link to all the proofs
• CRDTs and the Quest for Distributed Consistency - Martin Kleppmann
• Strong Eventual Consistency and Conflict-free Replicated Data Types - Marc Shapiro
• Intro to formal verification using traffic signal controllers
• Course slides from TUM - For Isabelle
• A critique of the CAP theorem - Martin Kleppmann
• Operation-based CRDTs: arrays (part 2)

Thank you!
