Problem: identifying and resolving concurrent operations Even worse in multi-DC settings Semantic resolution Need to be provided by the application server / client Dynamo; deterministic, but not intuitive
together Ad-hoc solutions are error prone Map CRDT allows to compose CRDT objects via embedding Guarantees atomic update But: Deep embedding can lead to large objects
causal consistency Snapshot reads allow for consistent observation of objects Allows for atomic and dynamic combinations of updates across many objects Needs careful engineering to have well-behaved metadata while being fault-tolerant
not spend more money than she has on her account Balance checking and reducing would require global synchronization operations such as 2PC Impossible under network partitions!
limited number of times to users in a certain area / country Keeping track of how often it is displayed requires counters to deal with high contention Estimated count of delivered ads should not diverge too much from actual number But exact number is not necessary
item with leases / reservations / escrow Pro-actively distribute them among the replicas Fast, local operations possible when reservation is locally available Allocation of leases in the background using strongly consistent operations Precise on bound
of bounded CRDTs Adaptive CRDTs restrict divergence by reducing the number of replicas Adapting replication schemes probabilistically, over time, according to usage patterns, ... Changing the number of replicas or moving requires coordination Reduces divergence, impacts availability
the top 10 by score Matchmaking between cohorts by rank Aggregate data from all replicas of all objects across multiple DCs Current approaches are ad hoc
which preserves their strong properties Eventual consistency applied to computations Different evaluation strategies (previously discussed in Chris Meiklejohn’s talk)
employ CRDTs in applications State-of-art: Simple operational interface for updates and queries But CRDT semantics can lead to much more powerful programming methodology Account for (static) analyses and tools (correct by construction)
about distributed systems Analyses to test and/or verify the correctness of applications Verifying applications on top of architectures with replication is challenging Models on different levels - from core libraries to full applications
causal history Building on states, operations and merges System abstraction: Reliable messages between replicating nodes All specifications encoded in theorem proverIsabelle/HOL and proved correct
use cases (TLA+ = model checker for Temporal Logic of Actions) Virtual wallet Ad counter Leaderboard Encode application invariants in the specification No loss of money Positive balance Verify invariants hold by using a model checker Verifies both individual CRDTs and interaction between CRDTs
algorithms for update propagation Network architecture: Replicated DCs + app servers to which clients connect Core features: Fault-tolerance, scalability, modularity Written in Erlang with riak core
cases at Rovio, Trifork, ... Future evaluation will be based on (1 -2 years): Ease of development of correct applications Understandability of developed programs Testing applications at scale with real traffic