
Modelling prototypes to critical systems with Cassandra

Cassandra provides highly scalable, resilient data storage for a variety of needs, but its effectiveness is often limited by the way we model our data, design schemas, and determine usage patterns. In this talk Matt will look at some of the design patterns Monzo have adopted and developed, from the prototype stage through to their highly scalable core banking systems.


mattheath

March 07, 2019

Transcript

  1. Mistakes were made… Modelling prototypes to critical systems with Cassandra. Matt Heath, Monzo
  2. Hi, I’m Matt

  3. @mattheath

  4. None
  5. None
  6. None
  7. Nov 2015 Mar 2019

  8. Nov 2015 Mar 2019

  9. Nov 2015 Mar 2019

  10. Nov 2015 Mar 2019

  11. Nov 2015 Mar 2019 Fastest growing UK Bank 1.6M Customers

  12. ☁ ?

  13. “construct a highly agile and highly available service from ephemeral and assumed broken components” - Adrian Cockcroft
  14. How does Monzo operate? - AWS, GCP, and physical data centres - Cloud Native technologies: Kubernetes, Docker, Calico, Cassandra, Kafka, NSQ, Etcd, Prometheus, Jaeger, Envoy, Elasticsearch… - Go based micro services ⛅
  15. Feb 2015 Mar 2019 1000 services

  16. None
  17. None
  18. None
  19. None
  20. None
  21. None
  22. Why Cassandra? - High Availability - Tuneable Consistency - Linear Scalability - Geographic Replication

  23. eu-west-1a eu-west-1b eu-west-1c

  24. CREATE KEYSPACE account WITH replication = { 'class': 'NetworkTopologyStrategy', 'eu-west-1': '3' }
  25. eu-west-1a eu-west-1b eu-west-1c

  26. eu-west-1a eu-west-1b eu-west-1c

  27. eu-west-1a eu-west-1b eu-west-1c

  28. Data Modelling

  29. @jrecursive

  30. Column B Column C Key Column A

  31. Column B Column C ID Column A

  32. Key Value Value Value Column A Column B Column C Timestamp Timestamp Timestamp
  33. CREATE TABLE IF NOT EXISTS account ( id text, userid text, created timestamp, currency text, country text, description text, type text, PRIMARY KEY ((id)) );
  34. Key Value Value Value Column A Column B Column C Timestamp Timestamp Timestamp
  35. acc_as8d… 2018-xx-xx Matt’s account user_007xUi8… “created” “description” “userid” 1526917782000 1526917782000 1526917782000
  36. CREATE TABLE IF NOT EXISTS accounts_by_userid ( id text, userid text, created timestamp, currency text, country text, description text, type text, PRIMARY KEY ((userid), id) );
  37. Partition Key Row 1 Row 2 Row 3

  38. Partition Key Row 1 Row 2 Row 3

  39. None
  40. Partition Key

  41. Partition Key Row 1 Row 2 Row 3

  42. Partition Key Row 1 Row infinity

  43. User ID Account 1 Account 2 Account 3 No-one has infinite accounts!
  44. CREATE TABLE IF NOT EXISTS transaction ( id text, accountid text, created timestamp, currency text, amount bigint, description text, PRIMARY KEY ((id)) );
  45. Column B Column C Transaction ID Column A

  46. Column B Column C Transaction ID Column A Must know primary key — can’t iterate
  47. CREATE TABLE IF NOT EXISTS transaction_by_account ( id text, accountid text, created timestamp, currency text, amount bigint, description text, PRIMARY KEY ((accountid), id) );
  48. Account ID Transaction 1 Transaction infinity

  49. Timeseries: Partition by Time ⏱

  50. Partition 2 3 4 5 7

  51. Bucket 2 3 4 5 7 Composite Partition Key (Time range and Account ID) PRIMARY KEY ((accountid, timebucket), created, id)

  52. Day 1 2 3 4 5 7 Composite Partition Key (Time range and Account ID) PRIMARY KEY ((accountid, timebucket), created, id)
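A minimal sketch of how the timebucket component of that composite partition key might be derived. The `time_bucket` helper and the day-granularity bucket width are illustrative assumptions, not Monzo's implementation:

```python
from datetime import datetime, timezone

def time_bucket(ts: datetime, bucket_days: int = 1) -> int:
    """Map a timestamp to a time bucket (days since epoch / bucket width).

    Every row written in the same window shares a partition, so a read
    for one (accountid, timebucket) pair touches exactly one partition.
    """
    return int(ts.timestamp()) // 86400 // bucket_days

# Two transactions on the same day land in the same partition...
a = time_bucket(datetime(2019, 3, 7, 9, 0, tzinfo=timezone.utc))
b = time_bucket(datetime(2019, 3, 7, 23, 59, tzinfo=timezone.utc))
# ...while the next day starts a new one.
c = time_bucket(datetime(2019, 3, 8, 0, 30, tzinfo=timezone.utc))
```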
  53. Day 1 2 3 4 5 7 9 12 15 16 17 18 19 20 Day 2 21 22 24 23
  54. Day 1 2 3 4 5 7 9 12 15 16 17 18 19 20 Day 2 21 22 24 23 “Return last 100 transactions”
  55. 9 12 15 16 17 18 19 20 Day 3 21 22 24 23 Day 2 Day 1 2 3 4 5 7 No data in this time period
  56. 9 12 15 16 17 18 19 20 Day 1000 21 22 24 23 Day 999 Day 1 2 3 4 5 7 Day 998 Day 3 Day 2 … No data in this time period
  57. 9 12 15 16 17 18 19 20 Day 1000 21 22 24 23 Day 999 Day 1 Day 998 Day 3 Day 2 … Did the data ever exist?! When do we stop…?
  58. Is your data predictable? How do you choose your bucket size?
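One way to see the "when do we stop?" problem: returning the last N transactions means walking partitions backwards bucket by bucket, and sparse data forces a hard lookback limit. A sketch under assumed names (`fetch_bucket` stands in for a per-partition Cassandra query, roughly SELECT ... WHERE accountid = ? AND timebucket = ?):

```python
def last_n(fetch_bucket, newest_bucket: int, n: int, max_lookback: int):
    """Collect up to n rows, scanning at most max_lookback buckets.

    fetch_bucket(b) returns that partition's rows, newest first.
    Without max_lookback, empty buckets give no way to tell "no data
    in this window" from "no data ever" -- we could walk back forever.
    """
    rows, bucket = [], newest_bucket
    for _ in range(max_lookback):
        if len(rows) >= n:
            break
        rows.extend(fetch_bucket(bucket))  # may be empty: sparse data
        bucket -= 1
    return rows[:n]

# Sparse history: day 5 and day 3 have data, day 4 is empty.
history = {5: ["tx_20", "tx_19"], 3: ["tx_12"]}
recent = last_n(lambda b: history.get(b, []), newest_bucket=5, n=3, max_lookback=10)
```

The lookback cap trades completeness for a bounded number of partition reads on accounts with long quiet periods.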
  59. Day 1 2 3 4 5 7 9 12 15 16 17 18 19 20 Day 2 21 22 24 23
  60. Day 1 2 3 4 5 7 9 12 15 16 17 18 19 20 Day 2 21 22 24 23 Hot Partition Key
  61. eu-west-1a eu-west-1b eu-west-1c Partition Key = Day 1

  62. eu-west-1a eu-west-1b eu-west-1c Partition Key = Day 2

  63. Is your data predictable? How do you choose your bucket size? Ensure correct partitioning!
  64. Flakeseries: Partition by Time ⏱ Retrieve by ID

  65. Bucket 2 3 4 5 7 Composite Partition Key (Time range and Account ID) PRIMARY KEY ((accountid, timebucket), created, id)
  66. Bucket 2 3 4 5 7 Time range: We need to know the timestamp to read PRIMARY KEY ((accountid, timebucket), created, id)
  67. Bucket 2 3 4 5 7 Time range: We need to know the timestamp to read PRIMARY KEY ((accountid, timebucket), created, id) accountid = acc_00009Wd3Yeh2O329bFTVHF
  68. accountid = acc_00009Wd3Yeh2O329bFTVHF

  69. accountid = acc_00009Wd3Yeh2O329bFTVHF Flake IDs = Time based lexically sortable IDs
  70. accountid = acc_00009Wd3Yeh2O329bFTVHF Flake IDs = Time based lexically sortable IDs Base62 encoded 128bit Int eg 26341991268378369512474991263748
  71. accountid = acc_00009Wd3Yeh2O329bFTVHF Flake IDs = Time based lexically sortable IDs Base62 encoded 128bit Int eg 26341991268378369512474991263748 64 bits - Time in ms since epoch 48 bits - Worker ID 16 bits - Sequence ID
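The Flake ID layout on that slide can be sketched as bit-packing plus a fixed-width base62 encoding: because the millisecond timestamp occupies the top 64 bits and the zero-padded 0-9A-Za-z alphabet matches ASCII order, lexical order on the encoded strings matches creation order. A sketch of the idea, not Monzo's actual implementation:

```python
import string

# 0-9 < A-Z < a-z in ASCII, so fixed-width strings sort like the integers.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase

def make_flake(ms: int, worker: int, seq: int) -> int:
    """Pack 64 bits of ms-since-epoch, 48 bits of worker ID, 16 bits of sequence."""
    assert worker < 1 << 48 and seq < 1 << 16
    return (ms << 64) | (worker << 16) | seq

def encode(flake: int, width: int = 22) -> str:
    """Base62-encode a 128-bit int, zero-padded so IDs sort lexically."""
    out = ""
    while flake:
        flake, r = divmod(flake, 62)
        out = ALPHABET[r] + out
    return out.rjust(width, "0")

def decode_ms(flake: int) -> int:
    return flake >> 64  # creation time lives in the top 64 bits
```

22 base62 characters cover the full 128-bit range, which matches the length of the suffix in IDs like tx_00009gEBzyFoAtFYllr9Qf.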
  72. accountid = acc_00009Wd3Yeh2O329bFTVHF

  73. accountid = acc_00009Wd3Yeh2O329bFTVHF transactionid = tx_00009gEBzyFoAtFYllr9Qf

  74. accountid = acc_00009Wd3Yeh2O329bFTVHF transactionid = tx_00009gEBzyFoAtFYllr9Qf PRIMARY KEY ((bucket), flake_created, transactionid)
  75. accountid = acc_00009Wd3Yeh2O329bFTVHF transactionid = tx_00009gEBzyFoAtFYllr9Qf PRIMARY KEY ((bucket), flake_created, transactionid) PRIMARY KEY ((accountid, bucket), flake_created, transactionid)
  76. TTLing your data ⏱

  77. TTLing your data

  78. Key 2 3 4 5 7 9 12 15 16 17 18 19 20 Key 21 22 24 23
  79. Key 2 3 4 5 7 9 12 15 16 17 18 19 20 Key 21 22 24 23 Mixed TTLs across same data set

  80. Key 2 3 4 5 7 9 12 15 16 17 18 19 20 Key 21 22 24 23 Mixed TTLs across same data set Compacted into same sstables
  81. Key 2 4 5 7 9 12 15 17 18 19 20 Key 21 24 23 First set of rows expire
  82. Key 2 4 7 9 12 15 17 Key 21 24 23 Second set of rows expire
  83. No data is removed until compaction occurs Key 2 3 4 5 7 9 12 15 16 17 18 19 20 Key 21 22 24 23
  84. Time Window Compaction Strategy Create sstables per time range Bucket 1 Bucket 2 9 12 15 16 17 18 19 20 Bucket 3 21 22 24 23 2 3 4 5 7 9 12 15 16 17 18 19 20 21 22 24 23
  85. Time Window Compaction Strategy sstables dropped once all data has expired Bucket 1 Bucket 2 9 12 15 16 17 18 19 20 Bucket 3 21 22 24 23 2 3 4 5 7 9 12 15 16 17 18 19 20 21 22 24 23
  86. Time Window Compaction Strategy sstables dropped once all data has expired Bucket 1 Bucket 2 9 12 15 16 17 18 19 20 Bucket 3 21 22 24 23 2 3 4 5 7 9 12 15 16 17 18 19 20 21 22 24 23
  87. Time Window Compaction Strategy sstables dropped once all data has expired Bucket 1 Bucket 2 9 12 15 16 17 18 19 20 Bucket 3 21 22 24 23 2 3 4 5 7 9 12 15 16 17 18 19 20 21 22 24 23
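The point of Time Window Compaction Strategy (enabled per table via the compaction option class 'TimeWindowCompactionStrategy') is that sstables are grouped by time window, so an entire file can be dropped once its last row has expired, with no per-row tombstone compaction. A toy model of that drop rule, with illustrative names rather than Cassandra internals:

```python
def droppable_windows(sstables: dict, now: int) -> list:
    """Return time windows whose sstable can be deleted wholesale.

    sstables maps window -> [(write_time, ttl), ...]; a file is only
    droppable once *every* row in it has expired, which is why mixing
    long and short TTLs in one window delays reclaiming disk space.
    """
    return sorted(w for w, rows in sstables.items()
                  if all(t + ttl <= now for t, ttl in rows))

# Window 1's rows have all expired by t=16; window 2 still holds a live row.
tables = {1: [(0, 10), (5, 10)], 2: [(12, 10)]}
```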
  88. Data modelling takeaways - Correct data modelling is incredibly important! - Wide rows are ok to a point - Repairs on wide rows are problematic - Make Timeseries buckets predictable - Watch for Hot Keys! - TTLs don’t always mean your data is deleted
  89. Nov 2015 Mar 2019 What works here Might not work here
  90. None
  91. monzo.com/careers