Slide 1

Slide 1 text

Scaling Data Lucas Dohmen Senior Consultant @ INNOQ 1

Slide 2

Slide 2 text

2 / Motivation Why? Scaling Reads Scaling Writes Geographical Distribution Big Data Sets Failure Resistance

Slide 3

Slide 3 text

Lucas Dohmen • Senior Consultant at INNOQ • Everything Web & Databases • Previously worked at ArangoDB • http://faucet-pipeline.org 3

Slide 4

Slide 4 text

Structure 1. Consistency 2. Scaling 3. Trouble 4

Slide 5

Slide 5 text

5 Part 1: Consistency

Slide 6

Slide 6 text

6 / Consistency

Slide 7

Slide 7 text

Linearizable 7 / Consistency a b w w’

Slide 8

Slide 8 text

Linearizable 7 / Consistency a b w w’ r r’

Slide 9

Slide 9 text

Linearizable 7 / Consistency a b w w’ r r’ r r’

Slide 10

Slide 10 text

Linearizable 7 / Consistency a b w w’ r r’ r r’ r r’

Slide 12

Slide 12 text

Consistency Models: Which histories are valid? 8 / Consistency
Read() => a, Write(b), Read() => b, Read() => b
Read() => a, Write(b), Read() => a, Read() => a
Read() => b, Write(b), Read() => b, Read() => b
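A consistency model can be read as a predicate over histories. The sketch below is a minimal illustration, assuming a single register, operations that never overlap in time, and the strictest rule (every read returns the most recent write). The function name, the tuple encoding, the initial value "a", and the grouping of the slide's operations into three histories of four are assumptions for illustration, not something the talk specifies.

```python
from typing import List, Tuple

# ("read", value_returned) or ("write", value_written)
Op = Tuple[str, str]


def is_valid_history(history: List[Op], initial: str) -> bool:
    """Return True if every read sees the latest written value.

    Only meaningful for sequential (non-overlapping) operations
    on a single register.
    """
    current = initial
    for kind, value in history:
        if kind == "write":
            current = value
        elif kind == "read":
            if value != current:
                return False
        else:
            raise ValueError(f"unknown operation: {kind}")
    return True


# Assumption: the register starts out holding "a".
fresh_reads = [("read", "a"), ("write", "b"), ("read", "b"), ("read", "b")]
stale_reads = [("read", "a"), ("write", "b"), ("read", "a"), ("read", "a")]
early_read = [("read", "b"), ("write", "b"), ("read", "b"), ("read", "b")]

for name, history in [("fresh", fresh_reads), ("stale", stale_reads), ("early", early_read)]:
    print(name, is_valid_history(history, initial="a"))
```

Weaker models would accept more of these histories; for example, a model that tolerates stale reads might accept the second one.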

Slide 13

Slide 13 text

https://aphyr.com/posts/313-strong-consistency-models 9 / Consistency

Slide 14

Slide 14 text

10 / Consistency strictly serializable serializable linearizable sequential repeatable read SI causal PRAM RL RV Highly Available Transactions: Virtues and Limitations – Bailis et al.

Slide 15

Slide 15 text

11 Part 2: Scaling

Slide 16

Slide 16 text

How do we scale web applications? • Share nothing between application servers • Put behind a load balancer • Add servers 12 / Scaling Applications Load Balancer App App App App Database

Slide 18

Slide 18 text

Share Nothing for Databases? • Possible & Underused • Separate databases for separate data • If we need to join data, we need to join in the application 13 / Scaling / Sharding MySQL Redis

Slide 19

Slide 19 text

Replication 14

Slide 20

Slide 20 text

Replication = Same data on multiple nodes 15 / Scaling / Replication

Slide 21

Slide 21 text

Single Leader • Failover • Read scaling • No write scaling 16 / Scaling / Replication Leader Follower

Slide 22

Slide 22 text

Sync or Async Replication? • Trade-off between consistency & speed • Sync: Every follower we add decreases performance • Async: If our leader dies and the replication is not done, we have lost acknowledged data 17 / Scaling / Replication
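The async failure mode can be sketched with a toy in-memory leader and one follower. This is a simulation under stated assumptions, not how any particular database replicates: the class names, the `pending` queue, and `value_after_failover` are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class Follower:
    def __init__(self) -> None:
        self.data: Dict[str, str] = {}


class Leader:
    def __init__(self, followers: List[Follower], synchronous: bool) -> None:
        self.data: Dict[str, str] = {}
        self.followers = followers
        self.synchronous = synchronous
        # Writes acknowledged to the client but not yet sent to followers.
        self.pending: List[Tuple[str, str]] = []

    def write(self, key: str, value: str) -> str:
        self.data[key] = value
        if self.synchronous:
            # Sync: every follower is updated before we acknowledge,
            # so each extra follower adds latency to every write.
            for follower in self.followers:
                follower.data[key] = value
        else:
            # Async: acknowledge immediately, replicate later.
            self.pending.append((key, value))
        return "ack"

    def replicate_pending(self) -> None:
        for key, value in self.pending:
            for follower in self.followers:
                follower.data[key] = value
        self.pending.clear()


def value_after_failover(synchronous: bool) -> Optional[str]:
    follower = Follower()
    leader = Leader([follower], synchronous)
    leader.write("a", "1")  # the client receives "ack"
    # The leader crashes here, before replicate_pending() ever runs.
    # The follower is promoted; what does it know about "a"?
    return follower.data.get("a")


print(value_after_failover(synchronous=True))
print(value_after_failover(synchronous=False))
```

With synchronous replication the promoted follower has the write; with asynchronous replication the acknowledged write is gone.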

Slide 23

Slide 23 text

Examples • Redis • MariaDB • PostgreSQL • MongoDB 18

Slide 24

Slide 24 text

Multi Leader • Failover • Read & write scaling 19 / Scaling / Replication Leader Leader

Slide 25

Slide 25 text

Write Conflicts • Two leaders can accept a conflicting write • We usually resolve them when reading • Do we have all information we need to resolve a conflict at read time? 20 / Scaling / Replication

Slide 26

Slide 26 text

Examples • CouchDB • Percona Server for MySQL • ArangoDB 21

Slide 27

Slide 27 text

Leaderless • Failover • Read & write scaling 22 / Scaling / Replication

Slide 28

Slide 28 text

Quorum • Clients write to multiple nodes at once • When at least n nodes have acknowledged the write, the write is successful (n is the write quorum) • When we read, we read from m nodes (m is the read quorum) 23 / Scaling / Replication
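A rough sketch of quorum reads and writes, assuming three in-memory replicas and caller-supplied integer versions (real systems derive versions differently). All names here are hypothetical. It illustrates the well-known overlap rule: if write quorum plus read quorum exceeds the number of replicas, every read contacts at least one replica that saw the latest successful write.

```python
from typing import Dict, List, Optional, Tuple


class Replica:
    def __init__(self) -> None:
        # key -> (version, value)
        self.store: Dict[str, Tuple[int, str]] = {}


def quorum_write(replicas: List[Replica], reachable: List[int], key: str,
                 version: int, value: str, write_quorum: int) -> bool:
    """Send the write to every reachable replica; succeed on enough acks."""
    acks = 0
    for index in reachable:
        replicas[index].store[key] = (version, value)
        acks += 1
    return acks >= write_quorum


def quorum_read(replicas: List[Replica], contacted: List[int], key: str,
                read_quorum: int) -> Optional[str]:
    """Read from the contacted replicas and return the highest-versioned value."""
    if len(contacted) < read_quorum:
        raise RuntimeError("read quorum not reached")
    replies = [replicas[i].store[key] for i in contacted if key in replicas[i].store]
    return max(replies)[1] if replies else None


replicas = [Replica() for _ in range(3)]
first_ok = quorum_write(replicas, [0, 1, 2], "a", 1, "old", write_quorum=2)
# Replica 2 is unreachable for the second write and keeps the old value.
second_ok = quorum_write(replicas, [0, 1], "a", 2, "new", write_quorum=2)

# 2 + 2 > 3: the read set must overlap the write set.
overlapping_read = quorum_read(replicas, [1, 2], "a", read_quorum=2)
# 2 + 1 = 3: a read may hit only the stale replica.
stale_read = quorum_read(replicas, [2], "a", read_quorum=1)
print(overlapping_read, stale_read)
```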

Slide 29

Slide 29 text

Examples • Riak • Cassandra • Aerospike 24

Slide 30

Slide 30 text

Sharding 25 / Scaling / Sharding

Slide 31

Slide 31 text

Sharding = Each node only has part of the data 26 / Scaling / Sharding

Slide 32

Slide 32 text

Sharding by Primary Key 27 / Scaling / Sharding A-G H-L M-Z
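Range-based routing for the three ranges on this slide can be sketched with a binary search over the upper bounds. The function name and the choice to route on the upper-cased first character are assumptions for illustration.

```python
from bisect import bisect_left

# Inclusive upper bound of each range, in sorted order.
UPPER_BOUNDS = ["G", "L", "Z"]
SHARD_NAMES = ["A-G", "H-L", "M-Z"]


def shard_for(key: str) -> str:
    """Pick the shard whose range contains the key's first letter."""
    if not key:
        raise ValueError("empty key")
    first = key[0].upper()
    if not "A" <= first <= "Z":
        raise ValueError(f"key outside the sharded range: {key!r}")
    # bisect_left finds the first upper bound that is >= the letter.
    return SHARD_NAMES[bisect_left(UPPER_BOUNDS, first)]


for name in ["Alice", "george", "Hank", "Lucas", "Moon", "zoe"]:
    print(name, shard_for(name))
```

Range sharding keeps neighboring keys together, which helps range scans, but the load follows the key distribution: if most names start with M to Z, that shard carries most of the data.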

Slide 33

Slide 33 text

Sharding by Hashed Primary Key • Equal distribution to all shards 28 / Scaling / Sharding
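A minimal sketch of hashed sharding, assuming SHA-256 as the hash and a simple modulo over a fixed shard count. Both are illustrative choices, and the function name and the `user-N` keys are hypothetical. A stable hash from `hashlib` is used because Python's built-in `hash()` for strings is randomized per process by default.

```python
import hashlib


def shard_for_key(key: str, num_shards: int) -> int:
    """Map a primary key to a shard index via a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Even sequential keys spread roughly evenly over the shards.
counts = [0, 0, 0]
for i in range(3000):
    counts[shard_for_key(f"user-{i}", 3)] += 1
print(counts)
```

The price of the even spread is that range queries now touch every shard, and with plain modulo, changing the shard count moves most keys to a different shard.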

Slide 34

Slide 34 text

Combining Replication & Sharding 29 / Scaling
[Diagram: Replicas × Shards grid; labels: Shard A, Shard B, Shard C]

Slide 35

Slide 35 text

30 Part 3: Trouble

Slide 36

Slide 36 text

31 / Trouble SHARED MUTABLE STATE IS EVIL

Slide 37

Slide 37 text

Clocks are monotonic & synchronized 32 / Trouble

Slide 38

Slide 38 text

Clocks are monotonic & synchronized 32 / Trouble leap seconds

Slide 39

Slide 39 text

Clocks are monotonic & synchronized 32 / Trouble leap seconds NTP fails

Slide 40

Slide 40 text

Clocks are monotonic & synchronized 32 / Trouble leap seconds NTP fails NTP Sync ⇒ Going back in time

Slide 41

Slide 41 text

Clocks are monotonic & synchronized 32 / Trouble leap seconds NTP fails NTP Sync ⇒ Going back in time NTP is an estimation

Slide 43

Slide 43 text

33 / Trouble DO NOT USE WALL CLOCKS FOR ORDERING

Slide 44

Slide 44 text

Solution: Vector Clocks 34 / Trouble
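A minimal vector clock sketch, assuming clocks are plain dictionaries from node name to counter and that a missing entry counts as 0. The function names are illustrative. The point is that ordering comes from counters each node controls, not from wall-clock time, and that two writes can be detected as concurrent.

```python
from typing import Dict

VectorClock = Dict[str, int]


def tick(clock: VectorClock, node: str) -> VectorClock:
    """Return a copy of the clock with the node's own counter advanced."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a: VectorClock, b: VectorClock) -> VectorClock:
    """Combine two clocks by taking the per-node maximum."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


def compare(a: VectorClock, b: VectorClock) -> str:
    """Return 'before', 'after', 'equal', or 'concurrent' for a relative to b."""
    nodes = a.keys() | b.keys()
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


a1 = tick({}, "A")             # node A writes
b1 = tick(merge({}, a1), "B")  # node B has seen a1, then writes
a2 = tick(a1, "A")             # node A writes again without seeing b1

print(compare(a1, b1))
print(compare(a2, b1))
```

Here `a1` is ordered before `b1`, while `a2` and `b1` are concurrent: neither saw the other, so this is a write conflict that has to be resolved rather than silently ordered.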

Slide 45

Slide 45 text

The network is reliable 35 / Trouble

Slide 46

Slide 46 text

36 / Trouble
Problem | IP | TCP
Reordered Messages | ✘ | ✔ (Sequence Numbers)
Lost Messages | ✘ | ✔ (ack)
Duplicated Messages | ✘ | ✔ (Sequence Numbers)
Delayed Messages | ✘ | ✘

Slide 47

Slide 47 text

The network is reliable 37 / Trouble

Slide 49

Slide 49 text

The network is reliable 37 / Trouble packets can take a looooong time

Slide 50

Slide 50 text

The network is reliable 37 / Trouble packets can take a looooong time the network can fail partially/entirely

Slide 51

Slide 51 text

38 / Trouble Node Failure Node Recovery Crash Amnesia … ? ?

Slide 52

Slide 52 text

Availability 39 / Trouble

Slide 53

Slide 53 text

Availability vs. Consistency 40 / Trouble

Slide 54

Slide 54 text

41 / Trouble YOUR NODES WILL FAIL YOUR NETWORK WILL FAIL

Slide 55

Slide 55 text

42 / Trouble A B C D

Slide 57

Slide 57 text

You have two choices 43 / Trouble
CP • Stop taking requests • Not available, but consistent (A B C D: Sorry we’re CLOSED)
AP • Continue taking requests • Available, but not consistent (A B C D: a=1, a=2)
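The two choices can be sketched as a toy four-node cluster. Assumptions for this illustration: the partition splits the cluster into {A, B} and {C, D}, and the CP variant decides to stop by checking whether it can still reach a strict majority. The class and function names are hypothetical.

```python
from typing import Dict, Set

CLUSTER = {"A", "B", "C", "D"}


def has_majority(reachable: Set[str]) -> bool:
    """True if the reachable nodes form a strict majority of the cluster."""
    return len(reachable) > len(CLUSTER) // 2


class Node:
    def __init__(self, name: str, mode: str) -> None:
        self.name = name
        self.mode = mode  # "CP" or "AP"
        self.data: Dict[str, int] = {}

    def write(self, key: str, value: int, reachable: Set[str]) -> bool:
        """Try to accept a write; `reachable` includes this node itself."""
        if self.mode == "CP" and not has_majority(reachable):
            return False  # "Sorry we're CLOSED"
        self.data[key] = value
        return True


left, right = {"A", "B"}, {"C", "D"}

# CP: neither half has a majority, so both refuse. Consistent, not available.
cp_a, cp_c = Node("A", "CP"), Node("C", "CP")
cp_results = (cp_a.write("a", 1, left), cp_c.write("a", 2, right))

# AP: both halves accept. Available, but the halves now disagree.
ap_a, ap_c = Node("A", "AP"), Node("C", "AP")
ap_results = (ap_a.write("a", 1, left), ap_c.write("a", 2, right))

print(cp_results, ap_results, ap_a.data, ap_c.data)
```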

Slide 58

Slide 58 text

44 / Trouble not possible with total availability possible with total or sticky availability strictly serializable serializable linearizable sequential repeatable read SI causal PRAM RL RV Highly Available Transactions: Virtues and Limitations – Bailis et al.

Slide 59

Slide 59 text

Wrap-Up 45

Slide 60

Slide 60 text

Remember! • Nodes will fail • The network will fail • Clocks aren’t reliable 46 / Wrap Up

Slide 61

Slide 61 text

What are your requirements? 47 / Wrap Up Scaling Reads Scaling Writes Geographical Distribution Big Data Sets Failure Resistance Inconsistency

Slide 62

Slide 62 text

Thank you! • @moonbeamlabs on Twitter • Photo Credit • Slide 5: Shoot N' Design on Unsplash • Slide 11: Andy Hall on Unsplash • Slide 30: Hermes Rivera on Unsplash 48