Slide 1

Slide 1 text

PRACTICAL DEMYSTIFICATION OF CRDTS
AMSTERDAM.SCALA MEETUP, 27TH AUGUST 2015
Nami Nasserazad (@nami4552), Didier Liauw, Dmitry Ivanov (@idajantis)

Slide 2

Slide 2 text

DISCLAIMER & WARNING We are neither distributed systems experts, nor hardcore academia guys. There is no Scala-specific stuff in the talk.

Slide 3

Slide 3 text

WHO ARE WE? Full stack developers*: Server; Mobile / SDKs for different platforms; Infrastructure / AWS. * - sorry for the buzzword :)

Slide 4

Slide 4 text

WHAT IS NAVCLOUD? A cloud-based storage service that lets users seamlessly synchronize trip information (destinations, favorite locations, community points of interest, routes, etc.) between devices as well as the MyDrive website. NavCloud aims to be scalable and reactive while ensuring privacy and security.

Slide 5

Slide 5 text

DEVELOPMENT STACK Scala Akka Spray RabbitMQ Riak AWS

Slide 6

Slide 6 text

SDKS Java stateless SDK Encryption/Decryption Android and iOS stateful SDK Seamlessly working in offline mode Re-establishing push notification channel if connection drops Refreshing session upon token expiration Resumable and bandwidth optimized download/upload for large contents

Slide 7

Slide 7 text

CHARACTERISTICS Devices are not always available Edit/View should work in offline mode: No Strong Consistency Data should eventually converge to a correct state Order is not guaranteed Bandwidth is limited: Only changes should be transmitted

Slide 8

Slide 8 text

WHAT IS CRDT? DT: Data Type. "Bad programmers worry about the code. Good programmers worry about data structures and their relationships."

Slide 9

Slide 9 text

WHAT IS CRDT? R: Replicated. CRDTs are a family of data structures designed to be distributed.

Slide 10

Slide 10 text

WHAT IS CRDT? C: Conflict-Free. Conflicts are resolved automatically.

Slide 11

Slide 11 text

WHAT DOES IT BRING IN PRACTICE? local updates without needing remote synchronization local merge upon receiving data from other nodes guaranteeing that all local merges converge

Slide 12

Slide 12 text

HOW? MERGE

Slide 13

Slide 13 text

WHAT IS MERGE? Binary operation on two CRDTs Commutative: x • y = y • x Associative: ( x • y ) • z = x • ( y • z ) Idempotent: x • x = x
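The three laws above can be spot-checked on sample values. The sketch below is only an illustration: the `Merge` trait and the `lawsHold` helper are hypothetical names, not part of the talk, and checking a few values is not a proof.

```scala
object MergeLaws {
  // Hypothetical interface: a merge is a binary operation on states of type A.
  trait Merge[A] {
    def merge(x: A, y: A): A
  }

  // Spot-checks commutativity, associativity and idempotence on three sample values.
  def lawsHold[A](m: Merge[A], x: A, y: A, z: A): Boolean = {
    val commutative = m.merge(x, y) == m.merge(y, x)
    val associative = m.merge(m.merge(x, y), z) == m.merge(x, m.merge(y, z))
    val idempotent  = m.merge(x, x) == x
    commutative && associative && idempotent
  }

  // Integer max satisfies all three laws.
  val maxMerge: Merge[Int] = new Merge[Int] {
    def merge(x: Int, y: Int): Int = math.max(x, y)
  }

  // Integer addition is commutative and associative, but not idempotent.
  val plusMerge: Merge[Int] = new Merge[Int] {
    def merge(x: Int, y: Int): Int = x + y
  }
}
```

Addition fails only the idempotence check, which is why a counter that merges by summing would double-count redelivered updates.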

Slide 14

Slide 14 text

HOW DOES IT HELP? In distributed systems: Order is not guaranteed. No problem: merge is commutative and associative. Events can be delivered more than once. No problem: merge is idempotent.
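A small sketch of why reordering and redelivery are harmless, assuming set union as the merge. The names `receiveAll` and `updates` are made up for this illustration.

```scala
object DeliveryDemo {
  // Assumed merge for this illustration: set union.
  def merge(x: Set[String], y: Set[String]): Set[String] = x union y

  // Applies incoming states one by one, in whatever order they arrive.
  def receiveAll(messages: Seq[Set[String]]): Set[String] =
    messages.foldLeft(Set.empty[String])(merge)

  val updates: List[Set[String]] = List(Set("a"), Set("b"), Set("a", "c"))

  // Try every delivery order, and deliver the first message of each order twice.
  val outcomes: Set[Set[String]] =
    updates.permutations.map(order => receiveAll(order :+ order.head)).toSet
}
```

All six delivery orders, each with one duplicate, collapse to a single final state.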

Slide 15

Slide 15 text

EXAMPLE G-COUNTER

Slide 16

Slide 16 text

G-COUNTER Each node has a counter Each node should only increase its own counter G-Counter Data Type: An array of counters where each element belongs to a node

Slide 17

Slide 17 text

G-COUNTER
Machine A: A:6 B:0 C:0
Machine B: A:0 B:3 C:0
Machine C: A:0 B:0 C:9
Merge: Max on corresponding elements: A:6 B:3 C:9
Total value: Sum of all elements: 6 + 3 + 9 = 18
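The example above could be written roughly as follows. This is a sketch, not the NavCloud code: it assumes a `Map` keyed by node name instead of a fixed-size array, and the `GCounter` class is hypothetical.

```scala
object GCounterDemo {
  // One slot per node; a node only ever increments its own slot.
  final case class GCounter(counts: Map[String, Long] = Map.empty) {
    def increment(node: String, by: Long = 1L): GCounter = {
      require(by >= 0, "a G-Counter can only grow")
      GCounter(counts.updated(node, counts.getOrElse(node, 0L) + by))
    }

    // Merge: max on corresponding elements (a missing slot counts as 0).
    def merge(other: GCounter): GCounter =
      GCounter((counts.keySet ++ other.counts.keySet).map { node =>
        node -> math.max(counts.getOrElse(node, 0L), other.counts.getOrElse(node, 0L))
      }.toMap)

    // Total value: sum of all elements.
    def value: Long = counts.values.sum
  }

  val a: GCounter = GCounter().increment("A", 6)
  val b: GCounter = GCounter().increment("B", 3)
  val c: GCounter = GCounter().increment("C", 9)
  val merged: GCounter = a.merge(b).merge(c)
}
```

Here `GCounterDemo.merged.value` is 18, matching the slide, and merging any replica in again leaves the result unchanged.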

Slide 18

Slide 18 text

MAX FUNCTION Binary operation on two CRDTs Commutative: x max y = y max x Associative: ( x max y ) max z = x max ( y max z ) Idempotent: x max x = x

Slide 19

Slide 19 text

EXAMPLE G-SET

Slide 20

Slide 20 text

G-SET Each node has a set Each node should add element to its own set G-Set Data Type: An array of sets where each set belongs to a node

Slide 21

Slide 21 text

G-SET
Machine A: A:{x, y} B:{} C:{}
Machine B: A:{} B:{z} C:{}
Machine C: A:{} B:{} C:{a, b, c}
Merge: Union on corresponding sets: A:{x, y} B:{z} C:{a, b, c}
Total value: Union of all sets: {x, y, z, a, b, c}
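A sketch of the same G-Set, following the per-node layout used above. The `GSet` class and its method names are assumptions for illustration; a `Map` keyed by node name stands in for the array.

```scala
object GSetDemo {
  // One grow-only set per node; a node only adds to its own set.
  final case class GSet[A](perNode: Map[String, Set[A]] = Map.empty[String, Set[A]]) {
    def add(node: String, elem: A): GSet[A] =
      GSet(perNode.updated(node, perNode.getOrElse(node, Set.empty[A]) + elem))

    // Merge: union on corresponding sets.
    def merge(other: GSet[A]): GSet[A] =
      GSet((perNode.keySet ++ other.perNode.keySet).map { node =>
        val mine   = perNode.getOrElse(node, Set.empty[A])
        val theirs = other.perNode.getOrElse(node, Set.empty[A])
        node -> (mine union theirs)
      }.toMap)

    // Total value: union of all sets.
    def value: Set[A] = perNode.values.flatten.toSet
  }

  val a: GSet[String] = GSet[String]().add("A", "x").add("A", "y")
  val b: GSet[String] = GSet[String]().add("B", "z")
  val c: GSet[String] = GSet[String]().add("C", "a").add("C", "b").add("C", "c")
  val merged: GSet[String] = a.merge(b).merge(c)
}
```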

Slide 22

Slide 22 text

UNION FUNCTION Binary operation on two CRDTs Commutative: x ∪ y = y ∪ x Associative: ( x ∪ y ) ∪ z = x ∪ ( y ∪ z ) Idempotent: x ∪ x = x

Slide 23

Slide 23 text

HOW DID CRDT HELP IN NAVCLOUD?

Slide 24

Slide 24 text

SYNCHRONIZING FAVORITES SET OF FAVORITES Name Latitude/Longitude ...

Slide 25

Slide 25 text

SYNCHRONIZING FAVORITES USE CASES

Slide 26

Slide 26 text

SYNCHRONIZING FAVORITES USE CASES Users can add, delete or modify Replicas are spread over multiple devices: Client devices might not be connected (yet) Modifications have to be done without synchronization with remote replicas

Slide 27

Slide 27 text

SYNCHRONIZING FAVORITES NAIVE APPROACH AND PROBLEMS Whenever a client connects to the server, its local state is sent to the server. Synchronization is done on the server using a Last Write Wins strategy. This can result in inconsistencies, due to: the time when updates are sent to the server differing from the update time; network latency; unreliable clocks on the server.

Slide 28

Slide 28 text

SYNCHRONIZING FAVORITES CRDTS MATCH VERY WELL WITH OUR SITUATION: CLIENTS Synchronization can be done locally: Changes are made instantly No connection is needed for proper synchronization Synchronization is decentralized Order is not important Latency, failed requests etc. have no effect Implementation becomes easier, because it is contained within a single component

Slide 29

Slide 29 text

SYNCHRONIZING FAVORITES CRDTS MATCH VERY WELL WITH OUR SITUATION: SERVER We use CRDTs everywhere, from the clients down to the individual database nodes. We use Riak, which supports CRDTs by storing siblings and letting users resolve them themselves. This means that our database is fully partition tolerant.

Slide 30

Slide 30 text

CRDT WHICH ONE: 2P-SET 2 Sets: Add Set Remove Set: Also known as tombstone set Merge: Take the union of the add-sets and remove-sets Lookup: Contains an element if it is in the add-set and not in the remove-set Doesn't work for us: 1. Once removed you cannot add again 2. Mutating values (updates) is not possible

Slide 31

Slide 31 text

CRDT WHICH ONE: 2P-SET
                    A                        B
Add-Set             {"cat", "dog"}           {"cat", "ape"}
Remove-Set          {"cat"}                  {}
Merge Add-Set       {"cat", "dog", "ape"}
Merge Remove-Set    {"cat"}
Lookup              {"dog", "ape"}
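The 2P-Set example above can be reproduced with a short sketch. The `TwoPSet` class is a hypothetical illustration, not the authors' implementation.

```scala
object TwoPSetDemo {
  final case class TwoPSet[A](added: Set[A] = Set.empty[A], removed: Set[A] = Set.empty[A]) {
    def add(elem: A): TwoPSet[A] = copy(added = added + elem)

    // Removing only records a tombstone; nothing is ever deleted from either set.
    def remove(elem: A): TwoPSet[A] = copy(removed = removed + elem)

    // Merge: union of the add-sets and union of the remove-sets.
    def merge(other: TwoPSet[A]): TwoPSet[A] =
      TwoPSet(added union other.added, removed union other.removed)

    // Lookup: elements in the add-set that are not in the remove-set.
    def lookup: Set[A] = added diff removed
  }

  val a: TwoPSet[String] = TwoPSet(Set("cat", "dog"), Set("cat"))
  val b: TwoPSet[String] = TwoPSet(Set("cat", "ape"), Set.empty[String])
  val merged: TwoPSet[String] = a.merge(b)
}
```

Calling `merged.add("cat")` does not bring "cat" back, because the tombstone stays in the remove-set. That is limitation 1 from the previous slide.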

Slide 32

Slide 32 text

CRDT WHICH ONE: LWW-ELEMENT-SET Attaches a timestamp to each element You can add again by adding the element with a higher timestamp than the one in the remove-set Merge: Take the union of the add-sets and remove-sets Lookup: Contains the element if it is in the add-set and not in the remove-set with a higher timestamp Still doesn't work for us, because mutating is not possible

Slide 33

Slide 33 text

CRDT WHICH ONE: LWW-ELEMENT-SET
                    A                              B
Add-Set             {(1,"cat"), (1,"dog")}         {(5,"cat"), (1,"ape")}
Remove-Set          {(3,"cat")}                    {(1,"cat")}
Merge Add-Set       {(1,"cat"), (5,"cat"), (1,"dog"), (1,"ape")}
Merge Remove-Set    {(1,"cat"), (3,"cat")}
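A sketch of the LWW-Element-Set using the data above. The `LwwSet` class is hypothetical, and it assumes that a remove wins when an add and a remove carry the same timestamp; the slides do not specify the tie-break.

```scala
object LwwSetDemo {
  final case class LwwSet[A](adds: Set[(Long, A)] = Set.empty[(Long, A)],
                             removes: Set[(Long, A)] = Set.empty[(Long, A)]) {
    def add(ts: Long, elem: A): LwwSet[A] = copy(adds = adds + ((ts, elem)))
    def remove(ts: Long, elem: A): LwwSet[A] = copy(removes = removes + ((ts, elem)))

    // Merge: union of the add-sets and union of the remove-sets.
    def merge(other: LwwSet[A]): LwwSet[A] =
      LwwSet(adds union other.adds, removes union other.removes)

    // Highest timestamp recorded for an element, if any.
    private def latest(entries: Set[(Long, A)], elem: A): Option[Long] =
      entries.collect { case (ts, e) if e == elem => ts }.reduceOption(_ max _)

    // Lookup: present if the latest add is strictly newer than every remove.
    def lookup: Set[A] = adds.map(_._2).filter { elem =>
      val lastAdd = latest(adds, elem).getOrElse(Long.MinValue)
      latest(removes, elem).forall(_ < lastAdd)
    }
  }

  val a: LwwSet[String] = LwwSet(Set((1L, "cat"), (1L, "dog")), Set((3L, "cat")))
  val b: LwwSet[String] = LwwSet(Set((5L, "cat"), (1L, "ape")), Set((1L, "cat")))
  val merged: LwwSet[String] = a.merge(b)
}
```

On replica A alone "cat" is absent (removed at 3, added at 1). After the merge it is present again, because the add at timestamp 5 is newer than both removes.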

Slide 34

Slide 34 text

CRDT WHICH ONE: OR-SET You can add again Store elements with a unique identifier Deleting an element adds every (element, id) pair for that element in the add-set to the remove-set Merge: Take the union of the add-sets and remove-sets Lookup: Contains the element if there is an element in the add-set with an identifier that is not in the remove-set Doesn't work for us, because it doesn't support updates

Slide 35

Slide 35 text

CRDT WHICH ONE: OR-SET
                    A                              B
Add-Set             {(#a,"cat"), (#b,"dog")}       {(#c,"cat"), (#d,"ape")}
Remove-Set          {(#a,"cat")}                   {(#a,"cat")}
Merge Add-Set       {(#a,"cat"), (#c,"cat"), (#b,"dog"), (#d,"ape")}
Merge Remove-Set    {(#a,"cat")}
Lookup              {"cat", "dog", "ape"}
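A sketch of the OR-Set using the data above. The `OrSet` class is hypothetical, and the caller passes the unique identifier explicitly to keep the example deterministic; a real implementation would more likely generate it (for example a UUID).

```scala
object OrSetDemo {
  final case class OrSet[A](adds: Set[(String, A)] = Set.empty[(String, A)],
                            removes: Set[(String, A)] = Set.empty[(String, A)]) {
    // Every add carries its own unique identifier.
    def add(id: String, elem: A): OrSet[A] = copy(adds = adds + ((id, elem)))

    // Removing tombstones every (id, element) pair currently seen in the add-set.
    def remove(elem: A): OrSet[A] =
      copy(removes = removes union adds.filter(_._2 == elem))

    // Merge: union of the add-sets and union of the remove-sets.
    def merge(other: OrSet[A]): OrSet[A] =
      OrSet(adds union other.adds, removes union other.removes)

    // Lookup: elements with at least one identifier that has not been removed.
    def lookup: Set[A] = (adds diff removes).map(_._2)
  }

  val a: OrSet[String] = OrSet[String]().add("#a", "cat").add("#b", "dog").remove("cat")
  val b: OrSet[String] = OrSet(Set("#c" -> "cat", "#d" -> "ape"), Set("#a" -> "cat"))
  val merged: OrSet[String] = a.merge(b)
}
```

"cat" survives the merge because the add tagged `#c` was never observed by the remove, which only covered `#a`.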

Slide 36

Slide 36 text

CRDT OUR-SET Combination of all the sets Store elements with a unique identifier This identifier is actually used to identify an element Element can be changed if identifier remains the same Updates are possible! Store elements with a timestamp, which is updated on any change

Slide 37

Slide 37 text

CRDT OUR-SET Single Set No add-set and remove-set Replaced by a removed flag Set can only contain one element with a particular id Element with highest timestamp wins Merge: Take union of the two sets and for every element with the same identifier take only the one with the highest timestamp Lookup: Contains the element if there is an element with the same id and the removed flag is not set

Slide 38

Slide 38 text

CRDT OUR-SET
             A                                                B
Set          {(#a,1,"cat",removed), (#b,2,"dog",removed)}     {(#a,5,"tiger"), (#c,1,"ape"), (#b,1,"dog")}
Merge Set    {(#a,5,"tiger"), (#b,2,"dog",removed), (#c,1,"ape")}
Lookup       {"tiger", "ape"}
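A sketch of how such a set might be merged, using the data above. The `Entry` and `OurSet` classes are hypothetical stand-ins, and breaking timestamp ties by the removed flag and then by value is an assumption made here so that the merge stays commutative.

```scala
object OurSetDemo {
  final case class Entry(id: String, timestamp: Long, value: String, removed: Boolean = false)

  // A single set, keyed by id, with at most one entry per id.
  final case class OurSet(entries: Map[String, Entry] = Map.empty) {
    def put(e: Entry): OurSet = merge(OurSet(Map(e.id -> e)))

    // Merge: union of both sets; per id keep the entry with the highest timestamp.
    // Ties fall back to the removed flag, then the value (an assumed tie-break).
    def merge(other: OurSet): OurSet =
      OurSet((entries.values ++ other.entries.values).groupBy(_.id).map {
        case (id, candidates) =>
          id -> candidates.maxBy(e => (e.timestamp, e.removed, e.value))
      })

    // Lookup: values of entries whose removed flag is not set.
    def lookup: Set[String] = entries.values.filterNot(_.removed).map(_.value).toSet
  }

  val a: OurSet = OurSet()
    .put(Entry("#a", 1, "cat", removed = true))
    .put(Entry("#b", 2, "dog", removed = true))
  val b: OurSet = OurSet()
    .put(Entry("#a", 5, "tiger"))
    .put(Entry("#c", 1, "ape"))
    .put(Entry("#b", 1, "dog"))
  val merged: OurSet = a.merge(b)
}
```

The update of `#a` from "cat" to "tiger" wins over the older tombstone, while the newer tombstone for `#b` wins over the older "dog".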

Slide 39

Slide 39 text

IMPLEMENTATION CRDT MODEL: FAVORITE ID to uniquely identify a favorite Timestamp to indicate when the last change was made Removed Flag to indicate that the favorite has been removed Name Latitude/Longitude ...

Slide 40

Slide 40 text

IMPLEMENTATION METHODS Add a compare function, which compares all the fields in order of priority: Timestamp Removed flag ... Add an equals and hash function
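One way the compare function could look, as a sketch. The `Favorite` fields follow the model slide, but the exact field types, the priority of the fields after the removed flag, and the `winner` helper are assumptions. A case class provides `equals` and `hashCode` for free.

```scala
object FavoriteOrderingDemo {
  final case class Favorite(id: String, timestamp: Long, removed: Boolean,
                            name: String, latitude: Double, longitude: Double)

  // Compares all fields in order of priority; the first non-zero result decides.
  val byPriority: Ordering[Favorite] = new Ordering[Favorite] {
    def compare(x: Favorite, y: Favorite): Int =
      Seq(
        java.lang.Long.compare(x.timestamp, y.timestamp),
        java.lang.Boolean.compare(x.removed, y.removed),
        x.name.compareTo(y.name),
        java.lang.Double.compare(x.latitude, y.latitude),
        java.lang.Double.compare(x.longitude, y.longitude)
      ).find(_ != 0).getOrElse(0)
  }

  // The entry that survives a merge of two versions of the same favorite.
  def winner(x: Favorite, y: Favorite): Favorite = byPriority.max(x, y)

  val old: Favorite     = Favorite("#a", 1, removed = false, "cat", 52.37, 4.89)
  val renamed: Favorite = Favorite("#a", 5, removed = false, "tiger", 52.37, 4.89)
  val deleted: Favorite = Favorite("#a", 5, removed = true, "tiger", 52.37, 4.89)
}
```

Comparing every field means two different entries never compare as equal, so all replicas pick the same winner regardless of argument order.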

Slide 41

Slide 41 text

IMPLEMENTATION USING THE CRDT Use the same algorithm everywhere As simple as calling the merge function

Slide 42

Slide 42 text

IMPLEMENTATION USING THE CRDT
    // Synchronize doesn't return anything because the client is already synced.
    // This is purely for the server and database.
    def synchronize(fromClient: CRDTSet, database: CRDTcomponent): Unit = {
      val changedSet = fromClient
      val currentSet = database.crdtset
      val newSet = currentSet.merge(changedSet)
      database.push(newSet) // This is fire and forget
    }

Slide 43

Slide 43 text

CONSIDERATIONS & LIMITATIONS

Slide 44

Slide 44 text

"WHAT ABOUT GARBAGE?" CRDTs tend to grow because of tombstones Deleted Favorite in the Set == Tombstone A potentially unbounded growth. Case: MyDrive user with ~3000 deleted favorites and 5 non-deleted ones. -> 1Mb Favorites.json

Slide 45

Slide 45 text

"WHAT ABOUT GARBAGE?" Solution #1: Prune deleted favorites But when? Requirement: all nodes holding a Favorites set should have seen a deleted element before it can be pruned. Otherwise deleted elements can be resurrected.

Slide 46

Slide 46 text

"WHAT ABOUT GARBAGE?" Client-awareness: capturing a timestamp of the last sync between a client and the service. if (clients.forAll(_.lastSyncTimestamp > deletedFavorite.lastUpdat edTimestamp)) { favorites.drop(deletedFavorite) }

Slide 47

Slide 47 text

"WHAT ABOUT GARBAGE?" Solution #2: Sending only diff upon any update. Client has a set of [A', B', C']; Server has a set of [A'', B''', C']. Client modifies and sends [B''] Before: responding with a full merged set [A'', B''', C']. We introduced a scoped diff: Now: responding with a diff set [B'''] as B'' update from the client has lost to B''' on the server.

Slide 48

Slide 48 text

TROUBLE WITH TIME "There is no such thing as reliable time." (c) Jonas Bonér, "Life Beyond the Illusion of Present". Important: causality and ordering of events. "Tracking time is actually tracking causality."

Slide 49

Slide 49 text

TROUBLE WITH TIME A time that is just good enough. Ordering updates between different nodes: If GPS clock is available -> use it (PND case). Prefer the server time to a client local time. WARN: conflicts may happen if two or more devices are modifying the same Favorite element concurrently.

Slide 50

Slide 50 text

TROUBLE WITH TIME Ordering updates within a node boundary: Timestamp field as a logical clock. Timestamp should always grow monotonically. "+1 Strategy":
    def getFavoriteTimestamp(favorite: Favorite): Long = {
      Math.max(client.retrieveServerTimestamp(), favorite.lastModified + 1)
    }
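A standalone variant of the "+1 strategy" that can be run without a client object. `nextTimestamp` is a hypothetical name; it takes the server time as a plain argument instead of calling `retrieveServerTimestamp()`.

```scala
object TimestampDemo {
  // The new timestamp is never lower than the server time and always
  // strictly greater than the favorite's previous timestamp.
  def nextTimestamp(serverTimestamp: Long, lastModified: Long): Long =
    math.max(serverTimestamp, lastModified + 1)
}
```

If the server clock is ahead, the server time is used; if it is equal to or behind the last modification, the timestamp still advances by one, so successive edits on one node stay ordered.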

Slide 51

Slide 51 text

ONE 'MERGE' TO RULE THEM ALL Client and server should behave the same way when merging Favorites CRDT states. == When given the same input, their merge functions should emit the same results. WARN: divergence can lead to endless synchronisation loops!

Slide 52

Slide 52 text

ONE 'MERGE' TO RULE THEM ALL Sharing common CRDT-related code (classes, merge/diff/equals/compare logic) FTW. Case #1: Scala.JS client with Web Server in Scala (see "Towards Browser and Server Utopia with Scala.JS", Richard Dallaway). Case #2: Using a TCK library to verify client compatibility.

Slide 53

Slide 53 text

RIAK & CRDTS Data Agnostic vs Data Awareness: Counters, Sets, Maps, Flags*, Registers*. * - Embedded within a Map only

Slide 54

Slide 54 text

RIAK & CRDTS Pros Simplicity. No 'Read -> Merge -> Write' code is needed on the server. Composability: Most of the data can be modelled by combining supported primitive types. Proven and tested*. * - Basho is serious about testing their stuff: "Distributed data structures with Coq", Christopher Meiklejohn.

Slide 55

Slide 55 text

RIAK & CRDTS Cons No fine-grained merge: lack of merge strategy control on the server. Client complexity: clients have to carry a Data Type context (a-la 'causal context' with Vector Clocks). Riak 2.0+ only*. * - For those who are still on Riak 1.4. ^_^

Slide 56

Slide 56 text

CONCLUSIONS Academia sometimes is not as scary as it seems to pragmatic devs. Look for the best & simplest solutions. Understand the limitations of your solution. Analyse and monitor real usage. Always look for ways to tune & improve your solutions.

Slide 57

Slide 57 text

USEFUL REFERENCES
"CRDTs for fun and eventual profit" - Noel Welsh, 2013
"Readings in conflict-free replicated data types" - Christopher Meiklejohn
"A comprehensive study of Convergent and Commutative Replicated Data Types" - Marc Shapiro, Nuno Preguiça, Carlos Baquero, Marek Zawirski, 2011
"Lasp: A language for distributed, coordination-free programming" - Meiklejohn & Van Roy, 2015

Slide 58

Slide 58 text

Image credit: Dex Media

Slide 59

Slide 59 text

WE ARE HIRING! Interested in hacking on this stuff? http://www.tomtom.jobs/