SyncFree: Large Scale Computation without Synchronization

RICON 2014, Annette Bieniusa and Christopher Meiklejohn

Christopher Meiklejohn

October 29, 2014

Transcript

  1. SyncFree: Large-Scale Computation Without Synchronization

    Annette Bieniusa and Christopher Meiklejohn. University of Kaiserslautern and Basho Technologies, Inc. October 28, 2014.
  2. Outline

    Motivation. CRDTs - a success story. Challenges. Atomic updates. Divergence control. Computations. Optimizations. Programming models. Provably correct! Up and running! Conclusion - Where to go from here?
  3. Replicated Data Types

    Typically, key-value stores operate with opaque objects. Problem: identifying and resolving concurrent operations. Even worse in multi-DC settings. Semantic resolution: needs to be provided by the application server / client. Dynamo: deterministic, but not intuitive.
  4. CRDTs in Riak 2.0

    Conflict-Free/Confluent/Commutative/Convergent Replicated Data Types. The library riak_dt implements different types of state-based CRDTs: Counters (G-Counter, PN-Counter), Sets (G-Set, 2P-Set, OR-Set, ORSWOT), Flags, Registers (LWW-Register, MV-Register), Maps. A subset is exposed in Riak 2.0. Concurrent updates are merged following principled techniques. All problems solved?!
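As an illustration of what "state-based" means for the counter types listed above, here is a minimal Python sketch. It is not the riak_dt implementation (which is written in Erlang); the class and method names are assumptions made for this example. Each replica keeps one entry per replica id, and merging takes the entry-wise maximum.

```python
from dataclasses import dataclass, field


@dataclass
class GCounter:
    """Grow-only counter: one monotonically increasing entry per replica."""
    counts: dict = field(default_factory=dict)

    def increment(self, replica_id, amount=1):
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[replica_id] = self.counts.get(replica_id, 0) + amount

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Entry-wise maximum: commutative, associative, and idempotent.
        keys = self.counts.keys() | other.counts.keys()
        return GCounter({k: max(self.counts.get(k, 0), other.counts.get(k, 0))
                         for k in keys})


@dataclass
class PNCounter:
    """Counter with decrements: a pair of G-Counters (increments, decrements)."""
    incs: GCounter = field(default_factory=GCounter)
    decs: GCounter = field(default_factory=GCounter)

    def increment(self, replica_id, amount=1):
        self.incs.increment(replica_id, amount)

    def decrement(self, replica_id, amount=1):
        self.decs.increment(replica_id, amount)

    def value(self):
        return self.incs.value() - self.decs.value()

    def merge(self, other):
        return PNCounter(self.incs.merge(other.incs), self.decs.merge(other.decs))


# Two replicas update concurrently, then exchange state in either order.
a, b = PNCounter(), PNCounter()
a.increment("a", 5)
b.increment("b", 2)
b.decrement("b", 3)
merged_ab = a.merge(b)
merged_ba = b.merge(a)
```

Because the merge is an entry-wise maximum, both merge orders give the same state, and merging a state with itself changes nothing.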
  5. The SyncFree Project

    EU project on large-scale computation without synchronization. Project consortium of academic and industry partners.
  6. Purchasing items

    Scenario: Virtual wallet. User can exchange (virtual) currency for vouchers, game items, ... Operation should be atomic. No money lost! No voucher used twice! How can we achieve this under eventual consistency?
  7. Technology: CRDT Composition

    Compose CRDTs that are to be updated together. Ad-hoc solutions are error prone. The Map CRDT allows composing CRDT objects via embedding. Guarantees atomic update. But: Deep embedding can lead to large objects.
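The embedding idea can be sketched as follows, under simplifying assumptions: a grow-only map (no key removal) whose values are themselves CRDTs, merged key by key. The names `CRDTMap`, `GSet`, `GCounter`, and `purchase` are hypothetical and chosen for this example only. A purchase replaces both embedded fields in a single new map state, so a replica that receives the state sees both changes or neither.

```python
from dataclasses import dataclass, field


@dataclass
class GSet:
    items: frozenset = frozenset()

    def add(self, item):
        return GSet(self.items | {item})

    def merge(self, other):
        return GSet(self.items | other.items)


@dataclass
class GCounter:
    counts: dict = field(default_factory=dict)

    def increment(self, replica_id, amount=1):
        new = dict(self.counts)
        new[replica_id] = new.get(replica_id, 0) + amount
        return GCounter(new)

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        keys = self.counts.keys() | other.counts.keys()
        return GCounter({k: max(self.counts.get(k, 0), other.counts.get(k, 0))
                         for k in keys})


@dataclass
class CRDTMap:
    """Grow-only map of embedded CRDTs; merge is applied per key."""
    fields: dict = field(default_factory=dict)

    def update(self, **new_fields):
        # All given fields change together in one new map state.
        return CRDTMap({**self.fields, **new_fields})

    def merge(self, other):
        merged = dict(self.fields)
        for key, value in other.fields.items():
            merged[key] = merged[key].merge(value) if key in merged else value
        return CRDTMap(merged)


def purchase(wallet, replica_id, price, voucher):
    """Record the spent amount and the voucher in one map update."""
    return wallet.update(
        spent=wallet.fields["spent"].increment(replica_id, price),
        vouchers=wallet.fields["vouchers"].add(voucher),
    )


empty = CRDTMap({"spent": GCounter(), "vouchers": GSet()})
at_a = purchase(empty, "a", 10, "v1")   # concurrent purchase at replica a
at_b = purchase(empty, "b", 5, "v2")    # concurrent purchase at replica b
merged = at_a.merge(at_b)
```

The whole wallet travels as one object, which is also why deep embedding can lead to large objects.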
  8. Technology: Transactions

    Transactions with weak, yet helpful guarantees such as causal consistency. Snapshot reads allow for consistent observation of objects. Allows for atomic and dynamic combinations of updates across many objects. Needs careful engineering to have well-behaved metadata while being fault-tolerant.
  9. Handing out limited resources

    Scenario: (Shared) Virtual wallet. User should not spend more money than she has in her account. Checking and reducing the balance would require global synchronization operations such as 2PC. Impossible under network partitions!
  10. Bounded divergence

    Scenario: Ad counting. An advertisement should be displayed a limited number of times to users in a certain area / country. Keeping track of how often it is displayed requires counters that deal with high contention. The estimated count of delivered ads should not diverge too much from the actual number. But the exact number is not necessary.
  11. Technology: Bounded CRDTs

    Idea: Extend replicas of the shared data item with leases / reservations / escrow. Pro-actively distribute them among the replicas. Fast, local operations possible when reservation is locally available. Allocation of leases in the background using strongly consistent operations. Precise on bound.
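The reservation idea can be illustrated with a small Python sketch for the ad-counting scenario. This is a simplified model, not the SyncFree design: the class `AdCounterReplica` and the function `transfer` are hypothetical, and `transfer` merely stands in for the background allocation step that would use strongly consistent operations.

```python
LIMIT = 5  # total number of times the ad may be displayed (example value)


class AdCounterReplica:
    def __init__(self, replica_id, reservation):
        self.replica_id = replica_id
        self.reservation = reservation  # displays this replica may still serve
        self.served = 0

    def try_display(self):
        """Fast local operation: succeeds only while a reservation is held."""
        if self.reservation == 0:
            return False
        self.reservation -= 1
        self.served += 1
        return True


def transfer(donor, receiver, amount):
    """Stand-in for the coordinated, background reallocation of reservations."""
    amount = min(amount, donor.reservation)
    donor.reservation -= amount
    receiver.reservation += amount
    return amount


# The limit is split among the replicas up front.
eu = AdCounterReplica("eu", 3)
us = AdCounterReplica("us", 2)

results = [eu.try_display() for _ in range(4)]  # the fourth attempt is refused
moved = transfer(us, eu, 1)                     # background reallocation
after_transfer = eu.try_display()

total_served = eu.served + us.served
total_remaining = eu.reservation + us.reservation
```

Served displays plus remaining reservations always add up to the limit, so the bound holds without coordinating on each display.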
  12. Technology: Adaptive CRDTs

    Orthogonal technique. Applied as an optimization on top of bounded CRDTs. Adaptive CRDTs restrict divergence by reducing the number of replicas. Adapting replication schemes probabilistically, over time, according to usage patterns, ... Changing the number of replicas or moving them requires coordination. Reduces divergence, impacts availability.
  13. Computation

    Scenario: Leaderboard. Database of users playing a game; compute the top 10 by score. Matchmaking between cohorts by rank. Aggregate data from all replicas of all objects across multiple DCs. Current approaches are ad hoc.
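A minimal Python sketch of the leaderboard scenario, under the assumption (not stated on the slide) that each user's entry is a best score that only grows, so per-user maximum is a valid merge. The function names and the sample data are made up for this example.

```python
def merge_scores(a, b):
    """Merge two replicas' score maps by taking the per-user maximum."""
    return {user: max(a.get(user, 0), b.get(user, 0))
            for user in a.keys() | b.keys()}


def top_k(scores, k):
    """Highest scores first; ties are broken by user name for determinism."""
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Hypothetical per-DC views of the same leaderboard.
dc1 = {"ann": 120, "bob": 80}
dc2 = {"bob": 95, "cat": 110}

merged = merge_scores(dc1, dc2)
leaders = top_k(merged, 2)
```

The top-k query runs on the merged state, so every DC that has seen the same updates reports the same ranking.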
  14. Technology: Deterministic Dataflow

    Idea: Connect CRDTs together in a mechanism which preserves their strong properties. Eventual consistency applied to computations. Different evaluation strategies (previously discussed in Chris Meiklejohn’s talk).
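One way to picture "connecting CRDTs" is a pipeline of functions over grow-only sets, where running the pipeline on merged inputs gives the same result as merging the pipeline outputs of each replica. The Python sketch below shows only this property for map and filter; it is an illustration with made-up function names, not the evaluation strategies from the talk.

```python
def map_gset(fn, source):
    """Output G-Set: the image of the input G-Set under fn."""
    return frozenset(fn(x) for x in source)


def filter_gset(pred, source):
    """Output G-Set: the elements of the input G-Set that satisfy pred."""
    return frozenset(x for x in source if pred(x))


def pipeline(source):
    # Squares of the even elements.
    evens = filter_gset(lambda x: x % 2 == 0, source)
    return map_gset(lambda x: x * x, evens)


# Two replicas of the input set that have diverged.
replica_a = frozenset({1, 2, 4})
replica_b = frozenset({2, 3, 6})

out_of_merge = pipeline(replica_a | replica_b)
merge_of_outs = pipeline(replica_a) | pipeline(replica_b)
```

Since both orders agree, each replica can evaluate the pipeline locally and the outputs still converge once the inputs do.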
  15. Reducing bandwidth

    Carlos Baquero’s talk tomorrow. Forward only operations and replay them at the other replicas (POLog). State-based optimizations (δ-CRDT). Keep metadata size small and well-behaved.
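The state-based optimization can be sketched for a grow-only counter: instead of shipping the full state, a mutation returns a small delta that is joined into remote replicas with the ordinary merge. This is a minimal Python illustration with hypothetical names (`join`, `increment_delta`); it leaves out the causal delivery and buffering a real δ-CRDT needs.

```python
def join(a, b):
    """Entry-wise maximum of two G-Counter states (dict: replica id -> count)."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}


def increment_delta(state, replica_id, amount=1):
    """Delta-mutator: returns only the entry that changed, not the full state."""
    return {replica_id: state.get(replica_id, 0) + amount}


local = {"a": 4, "b": 7, "c": 1}
remote = dict(local)

delta = increment_delta(local, "a")
local = join(local, delta)               # apply the delta locally

remote_via_delta = join(remote, delta)   # ship one entry
remote_via_full = join(remote, local)    # ship the whole state
```

Both paths leave the remote replica in the same state, but the delta carries one entry instead of three.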
  16. How to make use of CRDTs

    Need some way to employ CRDTs in applications. State of the art: Simple operational interface for updates and queries. But CRDT semantics can lead to a much more powerful programming methodology. Account for (static) analyses and tools (correct by construction).
  17. Deterministic dataflow programming

    Methodology for programming with CRDTs. Fault-tolerant, replicated application code. Applications should be correct under any execution.
  18. Theoretical Models

    Abstracting from real-world mess. Supporting programmers in reasoning about distributed systems. Analyses to test and/or verify the correctness of applications. Verifying applications on top of architectures with replication is challenging. Models on different levels - from core libraries to full applications.
  19. Example: Observed-Remove Set

    Specification: Remove operation deletes only elements from the set that have been observed at the replica issuing the remove op. When concurrently adding the element (again), it will remain in the set.
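The specification above can be sketched in Python as a naive state-based OR-Set. This version keeps removed tags as tombstones for simplicity and is only an illustration; the class layout and the tag format are assumptions, not the riak_dt implementation.

```python
import itertools


class ORSet:
    def __init__(self, replica_id):
        self.replica_id = replica_id
        self._counter = itertools.count()
        self.adds = set()      # (element, tag) pairs ever added
        self.removes = set()   # (element, tag) pairs observed and then removed

    def add(self, element):
        # Every add gets a fresh, globally unique tag.
        tag = (self.replica_id, next(self._counter))
        self.adds.add((element, tag))

    def remove(self, element):
        # Remove only the tags this replica has observed so far.
        self.removes |= {(e, t) for (e, t) in self.adds if e == element}

    def merge(self, other):
        self.adds |= other.adds
        self.removes |= other.removes

    def value(self):
        return {e for (e, _tag) in self.adds - self.removes}


a, b = ORSet("a"), ORSet("b")
a.add("x")
b.merge(a)             # b observes the first add of "x"
b.remove("x")          # removes only the tag b has seen
b_after_remove = b.value()
a.add("x")             # concurrent re-add at a, with a new tag

a.merge(b)
b.merge(a)
```

After the replicas exchange state, the concurrent add wins because its tag was never observed by the remove.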
  20. A Formal Model for CRDTs

    Semantics for CRDTs based on causal history. Building on states, operations and merges. System abstraction: Reliable messages between replicating nodes. All specifications encoded in the theorem prover Isabelle/HOL and proved correct.
  21. A Formal Model for Applications

    Build specifications for applications and use cases (TLA+: specification language based on the Temporal Logic of Actions, with a model checker): virtual wallet, ad counter, leaderboard. Encode application invariants in the specification: no loss of money, positive balance. Verify that the invariants hold by using a model checker. Verifies both individual CRDTs and interaction between CRDTs.
  22. The SyncFree Platform

    Experimental platform. Development, verification, and evaluation of algorithms for update propagation. Network architecture: Replicated DCs + app servers to which clients connect. Core features: Fault-tolerance, scalability, modularity. Written in Erlang with riak_core.
  23. Riak

    Build prototypes which can operate on top of Riak. Compare new approaches to existing ad hoc approaches. Leverage industrial use cases to drive academic research.
  24. Real-world Evaluation

    Existing use cases are drawn from actual use cases at Rovio, Trifork, ... Future evaluation (1-2 years) will be based on: ease of development of correct applications, understandability of developed programs, testing applications at scale with real traffic.
  25. Outlook

    Extending to the mobile world: Support for offline mode in clients. Reduce bandwidth by employing partial replication, sharding, δ-mutations, ... Moving out of the cloud: Gossiping updates in P2P scenarios.
  26. The SyncFree Project

    SyncFree on GitHub: https://github.com/syncfree
    Project overview: https://syncfree.lip6.fr
    Research publications available: https://syncfree.lip6.fr/index.php/publications