Modelling prototypes to critical systems with Cassandra

Cassandra provides highly scalable, resilient data storage for a variety of needs, but its effectiveness is often limited by the way we model our data, design schemas, and determine usage patterns. In this talk Matt will look at some of the design patterns Monzo have adopted and developed, from the prototype stage to their highly scalable core banking systems.

mattheath

March 07, 2019

Transcript

  1. “construct a highly agile and highly available service from ephemeral and assumed broken components” - Adrian Cockcroft
  2. How does Monzo operate? - AWS, GCP, and physical data centres - Cloud Native technologies: Kubernetes, Docker, Calico, Cassandra, Kafka, NSQ, Etcd, Prometheus, Jaeger, Envoy, Elasticsearch… - Go-based microservices ⛅
  3. [Diagram: a row key with columns A, B, and C, each cell holding a value and a timestamp]
  4. CREATE TABLE IF NOT EXISTS account (
         id text,
         userid text,
         created timestamp,
         currency text,
         country text,
         description text,
         type text,
         PRIMARY KEY ((id))
     );
  5. [Diagram: a row key with columns A, B, and C, each cell holding a value and a timestamp]
  6. CREATE TABLE IF NOT EXISTS accounts_by_userid (
         id text,
         userid text,
         created timestamp,
         currency text,
         country text,
         description text,
         type text,
         PRIMARY KEY ((userid), id)
     );
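A minimal sketch of how the accounts_by_userid table serves the accounts-for-a-user query, using the gocql driver; the contact point, keyspace name, and user ID below are placeholders, not Monzo's real values:

    package main

    import (
        "fmt"
        "log"

        "github.com/gocql/gocql"
    )

    func main() {
        cluster := gocql.NewCluster("127.0.0.1") // placeholder contact point
        cluster.Keyspace = "bank"                // hypothetical keyspace
        session, err := cluster.CreateSession()
        if err != nil {
            log.Fatal(err)
        }
        defer session.Close()

        // All of a user's accounts live in one partition keyed by userid,
        // so this is a single-partition read.
        iter := session.Query(
            `SELECT id, currency, type FROM accounts_by_userid WHERE userid = ?`,
            "usr_00009QmNVcXmPT2pPBK5A1", // hypothetical user ID
        ).Iter()
        var id, currency, accountType string
        for iter.Scan(&id, &currency, &accountType) {
            fmt.Println(id, currency, accountType)
        }
        if err := iter.Close(); err != nil {
            log.Fatal(err)
        }
    }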
  7. CREATE TABLE IF NOT EXISTS transaction (
         id text,
         accountid text,
         created timestamp,
         currency text,
         amount bigint,
         description text,
         PRIMARY KEY ((id))
     );
  8. CREATE TABLE IF NOT EXISTS transaction_by_account (
         id text,
         accountid text,
         created timestamp,
         currency text,
         amount bigint,
         description text,
         PRIMARY KEY ((accountid), id)
     );
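Duplicating each transaction into both tables means every write must go to both. One way to keep them in sync is a logged batch, which guarantees both inserts eventually apply; a sketch (the deck doesn't say whether Monzo batch these writes):

    package store

    import (
        "time"

        "github.com/gocql/gocql"
    )

    // writeTransaction inserts the same row into both query-specific
    // tables via a logged batch.
    func writeTransaction(s *gocql.Session, id, accountID, currency, desc string, amount int64, created time.Time) error {
        batch := s.NewBatch(gocql.LoggedBatch)
        batch.Query(
            `INSERT INTO transaction (id, accountid, created, currency, amount, description)
             VALUES (?, ?, ?, ?, ?, ?)`,
            id, accountID, created, currency, amount, desc)
        batch.Query(
            `INSERT INTO transaction_by_account (id, accountid, created, currency, amount, description)
             VALUES (?, ?, ?, ?, ?, ?)`,
            id, accountID, created, currency, amount, desc)
        return s.ExecuteBatch(batch)
    }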
  9. [Diagram: a single time bucket holding rows 2 3 4 5 7] Composite partition key (time range and account ID): PRIMARY KEY ((accountid, timebucket), created, id)
  10. [Diagram: the bucket labelled “Day 1”, holding rows 2 3 4 5 7] Composite partition key (time range and account ID): PRIMARY KEY ((accountid, timebucket), created, id)
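With the composite partition key, a read targets exactly one (accountid, timebucket) partition. A sketch assuming day-granularity buckets encoded as dates; the real bucket encoding isn't shown in the deck:

    package store

    import (
        "time"

        "github.com/gocql/gocql"
    )

    // transactionsInBucket reads one account-day partition, newest first.
    func transactionsInBucket(s *gocql.Session, accountID string, day time.Time) ([]string, error) {
        bucket := day.UTC().Format("2006-01-02") // assumed bucket encoding
        iter := s.Query(
            `SELECT id FROM transaction_by_account
             WHERE accountid = ? AND timebucket = ?
             ORDER BY created DESC`,
            accountID, bucket,
        ).Iter()
        var ids []string
        var id string
        for iter.Scan(&id) {
            ids = append(ids, id)
        }
        return ids, iter.Close()
    }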
  11. [Diagram: a “Day 1” bucket holding rows 2 3 4 5 7 9 12 15 16 17 18 19 20, and a “Day 2” bucket holding rows 21 22 24 23]
  12. [Diagram: the same “Day 1” and “Day 2” buckets] “Return last 100 transactions”
  13. [Diagram: the scan continues back through “Day 3”, “Day 2”, and “Day 1” buckets] No data in this time period
  14. [Diagram: the scan now spans “Day 1000”, “Day 999”, “Day 998” … back to “Day 1”] No data in this time period
  15. [Diagram: the same scan across “Day 1000” back to “Day 1”] Did the data ever exist?! When do we stop…?
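One answer to “when do we stop?” is to bound the scan with something already known, such as the account's creation date, so buckets from before the account existed are never scanned. A sketch (not Monzo's actual code) of paging backwards until 100 transactions are found:

    package store

    import (
        "time"

        "github.com/gocql/gocql"
    )

    // lastTransactions walks day buckets from today back to the account's
    // creation date, stopping as soon as n transaction IDs are collected.
    func lastTransactions(s *gocql.Session, accountID string, accountCreated time.Time, n int) ([]string, error) {
        var ids []string
        for day := time.Now().UTC(); len(ids) < n && !day.Before(accountCreated); day = day.AddDate(0, 0, -1) {
            bucket := day.Format("2006-01-02") // assumed day-bucket encoding
            iter := s.Query(
                `SELECT id FROM transaction_by_account
                 WHERE accountid = ? AND timebucket = ?
                 ORDER BY created DESC LIMIT ?`,
                accountID, bucket, n-len(ids),
            ).Iter()
            var id string
            for iter.Scan(&id) {
                ids = append(ids, id)
            }
            if err := iter.Close(); err != nil {
                return nil, err
            }
        }
        return ids, nil
    }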
  16. [Diagram: the “Day 1” bucket (rows 2 3 4 5 7 9 12 15 16 17 18 19 20) and the “Day 2” bucket (rows 21 22 24 23) again]
  17. [Diagram: the same two buckets] Hot Partition Key
  18. Is your data predictable? How do you choose your bucket size? Ensure correct partitioning!
  19. [Diagram: a single time bucket holding rows 2 3 4 5 7] Composite partition key (time range and account ID): PRIMARY KEY ((accountid, timebucket), created, id)
  20. [Diagram: the same bucket] Time range: we need to know the timestamp to read. PRIMARY KEY ((accountid, timebucket), created, id)
  21. [Diagram: the same bucket] Time range: we need to know the timestamp to read. PRIMARY KEY ((accountid, timebucket), created, id)
      accountid = acc_00009Wd3Yeh2O329bFTVHF
  22. accountid = acc_00009Wd3Yeh2O329bFTVHF
      Flake IDs = time-based, lexically sortable IDs: a base62-encoded 128-bit int, e.g. 26341991268378369512474991263748
  23. accountid = acc_00009Wd3Yeh2O329bFTVHF
      Flake IDs = time-based, lexically sortable IDs: a base62-encoded 128-bit int, e.g. 26341991268378369512474991263748
      64 bits - time in ms since epoch; 48 bits - worker ID; 16 bits - sequence ID
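A sketch of that layout: 64 bits of milliseconds, 48 bits of worker ID, and 16 bits of sequence packed into a 128-bit integer, then base62 encoded. The encoding details below (math/big's alphabet, fixed-width padding, the "tx_" prefix styling) are assumptions, not Monzo's implementation:

    package main

    import (
        "fmt"
        "math/big"
        "time"
    )

    // newFlakeID packs time (most significant), worker ID, and sequence
    // into a 128-bit integer and base62 encodes it.
    func newFlakeID(workerID uint64, seq uint16) string {
        id := new(big.Int).SetUint64(uint64(time.Now().UnixMilli()))
        id.Lsh(id, 48).Or(id, new(big.Int).SetUint64(workerID&(1<<48-1)))
        id.Lsh(id, 16).Or(id, new(big.Int).SetUint64(uint64(seq)))
        s := id.Text(62)
        for len(s) < 22 { // pad to fixed width: 62^22 > 2^128, so 22 chars suffice
            s = "0" + s
        }
        // Note: math/big's base-62 digits are 0-9a-zA-Z, which is not ASCII
        // order; a production encoder would use 0-9A-Za-z so that byte-wise
        // comparison matches numeric (i.e. time) order.
        return s
    }

    func main() {
        fmt.Println("tx_" + newFlakeID(1, 0)) // prefix styling is an assumption
    }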
  24. accountid = acc_00009Wd3Yeh2O329bFTVHF
      transactionid = tx_00009gEBzyFoAtFYllr9Qf
      PRIMARY KEY ((bucket), flake_created, transactionid)
      PRIMARY KEY ((accountid, bucket), flake_created, transactionid)
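Since the top 64 bits of the ID are its creation time, the bucket can be derived from the transaction ID itself; a sketch under the same encoding assumptions as above (the "tx_" prefix is assumed stripped first):

    package store

    import (
        "fmt"
        "math/big"
        "time"
    )

    // bucketForFlake recovers the millisecond timestamp embedded in a
    // base62 flake ID and maps it to a day bucket, so reads need no
    // separate timestamp lookup.
    func bucketForFlake(flake string) (string, error) {
        n, ok := new(big.Int).SetString(flake, 62)
        if !ok {
            return "", fmt.Errorf("not a base62 flake ID: %q", flake)
        }
        ms := new(big.Int).Rsh(n, 64).Int64() // top 64 bits: ms since epoch
        return time.UnixMilli(ms).UTC().Format("2006-01-02"), nil
    }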
  25. [Diagram: two partitions, one holding rows 2 3 4 5 7 9 12 15 16 17 18 19 20 and one holding rows 21 22 24 23]
  26. [Diagram: the same two partitions] Mixed TTLs across same data set
  27. [Diagram: the same two partitions] Mixed TTLs across same data set, compacted into the same sstables
  28. [Diagram: the partitions now hold rows 2 4 5 7 9 12 15 17 18 19 20 and 21 24 23] First set of rows expire
  29. [Diagram: the partitions now hold rows 2 4 7 9 12 15 17 and 21 24 23] Second set of rows expire
  30. No data is removed until compaction occurs. [Diagram: on disk the sstables still contain all rows, 2 3 4 5 7 9 12 15 16 17 18 19 20 and 21 22 24 23]
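For context on expiry: a TTL is set per write, and once it elapses the row disappears from query results, but as the diagram shows the expired cells stay on disk until compaction rewrites the sstables that hold them. A sketch with a hypothetical 30-day TTL:

    package store

    import (
        "time"

        "github.com/gocql/gocql"
    )

    // insertWithTTL writes a row that expires after 30 days. Expiry only
    // hides the row from queries; the on-disk data is reclaimed when the
    // sstables containing it are compacted.
    func insertWithTTL(s *gocql.Session, accountID, bucket, id string, amount int64) error {
        return s.Query(
            `INSERT INTO transaction_by_account (accountid, timebucket, created, id, amount)
             VALUES (?, ?, ?, ?, ?) USING TTL 2592000`, // 30 days in seconds
            accountID, bucket, time.Now(), id, amount,
        ).Exec()
    }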
  31. Time Window Compaction Strategy: create sstables per time range. [Diagram: rows split into Bucket 1 (2 3 4 5 7), Bucket 2 (9 12 15 16 17 18 19 20), and Bucket 3 (21 22 24 23)]
  32. Time Window Compaction Strategy: sstables dropped once all data has expired. [Diagram: Bucket 1 (2 3 4 5 7), Bucket 2 (9 12 15 16 17 18 19 20), Bucket 3 (21 22 24 23)]
  33. Time Window Compaction Strategy: sstables dropped once all data has expired. [Diagram: as slide 32]
  34. Time Window Compaction Strategy: sstables dropped once all data has expired. [Diagram: as slide 32]
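TWCS is enabled per table; a sketch of the property change, with the window unit and size chosen to match the day buckets above (illustrative values, not Monzo's configuration):

    package store

    import "github.com/gocql/gocql"

    // enableTWCS switches the table to Time Window Compaction so each
    // sstable covers one day and can be dropped whole once fully expired.
    func enableTWCS(s *gocql.Session) error {
        return s.Query(`ALTER TABLE transaction_by_account WITH compaction = {
            'class': 'TimeWindowCompactionStrategy',
            'compaction_window_unit': 'DAYS',
            'compaction_window_size': '1'
        }`).Exec()
    }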
  35. Data modelling takeaways:
      - Correct data modelling is incredibly important!
      - Wide rows are OK to a point
      - Repairs on wide rows are problematic
      - Make timeseries buckets predictable
      - Watch for hot keys!
      - TTLs don’t always mean your data is deleted