Scaling Data @ Munich Data Engineering Meetup

Let's look at how databases can be distributed to multiple nodes. We will start by covering consistency models like strong consistency and eventual consistency. We will then talk about different ways of distributing databases and the reasons for doing so: scaling reads and/or writes, reliability, big data sets, and geographical distribution. Distributed databases have disadvantages too: we will talk about the influence of networks and clocks on our database. And yes, we will cover the CAP theorem and what it means.

Lucas Dohmen

March 15, 2018

Transcript

  1. Scaling Data Lucas Dohmen Senior Consultant @ INNOQ 1

  2. 2 / Motivation Why? Scaling Reads Scaling Writes Geographical Distribution

    Big Data Sets Failure Resistance
  3. Lucas Dohmen • Senior Consultant at INNOQ • Everything Web

    & Databases • Previously worked at ArangoDB • http://faucet-pipeline.org 3
  4. Structure 1. Consistency 2. Scaling 3. Trouble 4

  5. 5 Part 1: Consistency

  6. 6 / Consistency 

  7. Linearizable 7 / Consistency a b w w’

  8. Linearizable 7 / Consistency a b w w’ r r’

  9. Linearizable 7 / Consistency a b w w’ r r’

    r r’
  10. Linearizable 7 / Consistency a b w w’ r r’

    r r’ r r’
  11. Linearizable 7 / Consistency a b w w’ r r’

    r r’ r r’
  12. Consistency Models: Which histories are valid? 8 / Consistency

    History 1: Read() => a, Write(b), Read() => b, Read() => b
    History 2: Read() => a, Write(b), Read() => a, Read() => a
    History 3: Read() => b, Write(b), Read() => b, Read() => b
  13. https://aphyr.com/posts/313-strong-consistency-models 9 / Consistency

  14. 10 / Consistency strictly serializable serializable linearizable sequential repeatable read

    SI causal PRAM RL RV Highly Available Transactions: Virtues and Limitations – Bailis et al.
  15. 11 Part 2: Scaling

  16. How do we scale web applications? • Share nothing between

    application servers • Put behind a load balancer • Add servers 12 / Scaling Applications Load Balancer App App App App Database
  17. How do we scale web applications? • Share nothing between

    application servers • Put behind a load balancer • Add servers 12 / Scaling Applications Load Balancer App App App App Database
  18. Share Nothing for Databases? • Possible & Underused • Separate

    databases for separate data • If we need to join data, we need to join in the application 13 / Scaling / Sharding MySQL Redis
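
A minimal sketch (not from the talk) of what an application-side join could look like; the table, key names, and connection details are all made up:

    import pymysql
    import redis

    # Hypothetical setup: user records live in MySQL, session data lives in Redis.
    users_db = pymysql.connect(host="users-db", user="app", password="secret", database="app")
    sessions = redis.Redis(host="sessions-db")

    def user_with_last_seen(user_id):
        # Fetch the user from one database ...
        with users_db.cursor() as cur:
            cur.execute("SELECT id, name FROM users WHERE id = %s", (user_id,))
            row = cur.fetchone()
        if row is None:
            return None
        # ... and join it with data from the other database inside the application.
        last_seen = sessions.get(f"session:{row[0]}:last_seen")
        return {"id": row[0], "name": row[1], "last_seen": last_seen}
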
  19. Replication 14

  20. Replication = Same data on multiple nodes 15 / Scaling

    / Replication
  21. Single Leader • Failover • Read scaling • No write

    scaling 16 / Scaling / Replication Leader Follower
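
A rough sketch of how an application might use this: writes go to the leader, reads are spread over the followers. The node handles and their execute() method are hypothetical:

    import random

    class SingleLeaderRouter:
        def __init__(self, leader, followers):
            self.leader = leader        # handle to the leader node
            self.followers = followers  # handles to the follower nodes

        def write(self, statement, params=()):
            # All writes have to go through the leader; followers only replicate.
            return self.leader.execute(statement, params)

        def read(self, statement, params=()):
            # Reads can go to any follower, which is where the read scaling
            # comes from. A follower may lag behind if replication is async.
            return random.choice(self.followers).execute(statement, params)
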
  22. Sync or Async Replication? • Trade-off between consistency & speed

    • Sync: Every follower we add decreases performance • Async: If our leader dies and the replication is not done, we have lost acknowledged data 17 / Scaling / Replication
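
A toy illustration of the trade-off, assuming a hypothetical apply() call on each follower:

    from concurrent.futures import ThreadPoolExecutor

    _background = ThreadPoolExecutor(max_workers=4)

    def write_sync(leader_log, followers, record):
        # Synchronous: only acknowledge once every follower has the record.
        # Each follower we add makes the write slower, but an acknowledged
        # write survives the loss of the leader.
        leader_log.append(record)
        for follower in followers:
            follower.apply(record)        # blocks until the follower confirms
        return "ack"

    def write_async(leader_log, followers, record):
        # Asynchronous: acknowledge immediately, replicate in the background.
        # Fast, but if the leader dies before replication catches up,
        # acknowledged data is lost.
        leader_log.append(record)
        for follower in followers:
            _background.submit(follower.apply, record)   # fire and forget
        return "ack"
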
  23. Examples • Redis • MariaDB • PostgreSQL • MongoDB 18

  24. Multi Leader • Failover • Read & write scaling 19

    / Scaling / Replication Leader Leader
  25. Write Conflicts • Two leaders can accept a conflicting write

    • We usually resolve them when reading • Do we have all the information we need to resolve a conflict at read time? 20 / Scaling / Replication
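
One way to picture read-time resolution is the classic shopping-cart example (not from the talk): the database returns all conflicting versions and the application merges them.

    def resolve_cart(conflicting_versions):
        # conflicting_versions: every version the replicas still hold,
        # e.g. [{"beer", "chips"}, {"beer", "salsa"}] after two leaders
        # accepted concurrent writes.
        merged = set()
        for version in conflicting_versions:
            merged |= version
        # A plain union never loses an added item, but it cannot distinguish
        # "removed on one leader" from "never added" -- whether we have enough
        # information at read time depends on what we stored with each write.
        return merged

    print(resolve_cart([{"beer", "chips"}, {"beer", "salsa"}]))
    # {'beer', 'chips', 'salsa'}
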
  26. Examples • CouchDB • Percona Server for MySQL • ArangoDB

    21
  27. Leaderless • Failover • Read & write scaling 22 /

    Scaling / Replication
  28. Quorum • Clients write to multiple nodes at once •

    When more than n nodes have acknowledged the write, the write is successful (n is the write quorum) • When we read, we read from m nodes (m is the read quorum) 23 / Scaling / Replication
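
A sketch of the idea with made-up node objects; put() and get() stand in for whatever the database's replica protocol actually looks like:

    def quorum_write(nodes, write_quorum, key, value, version):
        # Send the write to every replica; it succeeds once enough of them ack.
        acks = 0
        for node in nodes:
            try:
                node.put(key, value, version)     # hypothetical replica API
                acks += 1
            except ConnectionError:
                pass                              # replica down or unreachable
        return acks >= write_quorum

    def quorum_read(nodes, read_quorum, key):
        # Ask read_quorum replicas and keep the newest version we see.
        # If write quorum + read quorum > number of replicas, at least one
        # answer is guaranteed to contain the latest successful write.
        answers = [node.get(key) for node in nodes[:read_quorum]]   # (value, version)
        return max(answers, key=lambda pair: pair[1])
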
  29. Examples • riak • Cassandra • aerospike 24

  30. Sharding 25 / Scaling / Sharding

  31. Sharding = Each node only has part of the data

    26 / Scaling / Sharding
  32. Sharding by Primary Key 27 / Scaling / Sharding A-G

    H-L M-Z
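
A tiny sketch of that range-based routing, with the slide's A-G / H-L / M-Z ranges hard-coded:

    def shard_for_range(primary_key):
        first = primary_key[0].upper()
        if "A" <= first <= "G":
            return "shard-1"    # A-G
        if "H" <= first <= "L":
            return "shard-2"    # H-L
        return "shard-3"        # M-Z

    print(shard_for_range("Dohmen"))   # shard-1

Ranges keep neighbouring keys together, but a popular range can leave one shard much hotter than the others; that is what hashing the key, shown on the next slide, evens out.
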
  33. Sharding by Hashed Primary Key • Equal distribution to all

    shards 28 / Scaling / Sharding
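
A minimal sketch of hashing the primary key before routing. The modulo scheme is the simplest variant; real systems often use consistent hashing so that adding shards moves less data:

    import hashlib

    def shard_for_hash(primary_key, num_shards=3):
        # Hashing spreads keys evenly over the shards regardless of how
        # the keys themselves are distributed.
        digest = hashlib.md5(primary_key.encode("utf-8")).hexdigest()
        return int(digest, 16) % num_shards

    print(shard_for_hash("Dohmen"))   # a stable shard number between 0 and 2
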
  34. Combining Replication & Sharding 29 / Scaling

    [diagram: a grid with shards A, B, and C along one axis and their replicas along the other; each shard is stored on several replica nodes]
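
Putting the two together, a routing sketch (reusing shard_for_hash from the previous sketch; all node handles are hypothetical): the key selects the shard, and the shard's replica set decides which node serves the request.

    import random

    def route(clusters, primary_key, is_write):
        # clusters maps a shard number to its replica set, e.g.
        # {0: {"leader": node_a, "followers": [node_b, node_c]}, ...}
        replica_set = clusters[shard_for_hash(primary_key)]
        if is_write:
            return replica_set["leader"]                  # writes go to the shard's leader
        return random.choice(replica_set["followers"])    # reads can use any replica
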
  35. 30 Part 3: Trouble

  36. 31 / Trouble SHARED MUTABLE STATE IS EVIL

  37. Clocks are monotonic & synchronized 32 / Trouble

  38. Clocks are monotonic & synchronized 32 / Trouble leap seconds

  39. Clocks are monotonic & synchronized 32 / Trouble leap seconds

    NTP fails
  40. Clocks are monotonic & synchronized 32 / Trouble leap seconds

    NTP fails NTP Sync ⇒ Going back in time
  41. Clocks are monotonic & synchronized 32 / Trouble leap seconds

    NTP fails NTP Sync ⇒ Going back in time NTP is an estimation
  42. Clocks are monotonic & synchronized 32 / Trouble leap seconds

    NTP fails NTP Sync ⇒ Going back in time NTP is an estimation
  43. 33 / Trouble DO NOT USE WALL CLOCKS FOR ORDERING
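
Within a single process, Python already offers a clock that never jumps backwards; ordering events across nodes is what the vector clocks on the next slide are for.

    import time

    start_wall = time.time()       # wall clock: NTP may step it backwards at any moment
    start_mono = time.monotonic()  # monotonic clock: only ever moves forward

    time.sleep(0.1)                # stand-in for the work being timed

    elapsed = time.monotonic() - start_mono    # always >= 0
    # time.time() - start_wall can come out negative if the clock was adjusted
    # in between, so wall-clock timestamps must not be used to order events.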

  44. Solution: Vector Clocks 34 / Trouble
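
A minimal vector clock sketch: each clock is a dict from node id to counter, with increment on local events, merge on message receipt, and a "happened before" comparison.

    def increment(clock, node_id):
        # A node bumps its own counter for every local event.
        clock = dict(clock)
        clock[node_id] = clock.get(node_id, 0) + 1
        return clock

    def merge(a, b):
        # On receiving a message, take the element-wise maximum of both clocks.
        return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}

    def happened_before(a, b):
        # a happened before b if a is <= b everywhere and strictly smaller somewhere.
        return all(a.get(n, 0) <= b.get(n, 0) for n in a) and a != b

    v1 = increment({}, "node-a")                  # {'node-a': 1}
    v2 = increment(merge(v1, {}), "node-b")       # {'node-a': 1, 'node-b': 1}
    v3 = increment(v1, "node-c")                  # concurrent with v2
    print(happened_before(v1, v2))                # True
    print(happened_before(v2, v3), happened_before(v3, v2))  # False False -> conflict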

  45. The network is reliable 35 / Trouble

  46. 36 / Trouble

    Problem              IP   TCP
    Reordered Messages   ✘    ✔ (Sequence Numbers)
    Lost Messages        ✘    ✔ (ACK)
    Duplicated Messages  ✘    ✔ (Sequence Numbers)
    Delayed Messages     ✘    ✘
  47. The network is reliable 37 / Trouble

  48. The network is reliable 37 / Trouble

  49. The network is reliable 37 / Trouble packets can take

    a looooong time
  50. The network is reliable 37 / Trouble packets can take

    a looooong time the network can fail partially/entirely
  51. 38 / Trouble Node Failure Node Recovery Crash Amnesia …

    ? ?
  52. Availability 39 / Trouble

  53. Availability vs. Consistency 40 / Trouble

  54. 41 / Trouble YOUR NODES WILL FAIL YOUR NETWORK WILL

    FAIL
  55. 42 / Trouble A B C D

  56. 42 / Trouble A B C D

  57. You have two choices 43 / Trouble

    • Stop taking requests: not available, but consistent (CP) • Continue taking requests: available, but not consistent (AP) [diagram: partitioned nodes A B C D, one side with a=1 and the other with a=2, next to a “Sorry we’re CLOSED” sign]
  58. 44 / Trouble not possible with total availability possible with

    total or sticky availability strictly serializable serializable linearizable sequential repeatable read SI causal PRAM RL RV Highly Available Transactions: Virtues and Limitations – Bailis et al.
  59. Wrap-Up 45

  60. Remember! • Nodes will fail • The network will fail

    • Clocks aren’t reliable 46 / Wrap Up
  61. What are your requirements? 47 / Wrap Up Scaling Reads

    Scaling Writes Geographical Distribution Big Data Sets Failure Resistance Inconsistency
  62. Thank you! • @moonbeamlabs on Twitter • Photo Credit •

    Slide 5: Shoot N' Design on Unsplash • Slide 11: Andy Hall on Unsplash • Slide 30: Hermes Rivera on Unsplash 48