Modelling prototypes to critical systems with Cassandra

Cassandra provides highly scalable, resilient data storage for a variety of needs, but its effectiveness is often limited by how we model our data, design schemas, and determine usage patterns. In this talk, Matt looks at some of the design patterns Monzo have adopted and developed, from the prototype stage to their highly scalable core banking systems.

mattheath

March 07, 2019

Transcript

  1. Mistakes were made…
    Modelling prototypes to critical
    systems with Cassandra
    Matt Heath, Monzo

  2. Hi, I’m Matt

  3. @mattheath

  7. Nov 2015 to Mar 2019

  11. Nov 2015 to Mar 2019
    Fastest growing UK Bank
    1.6M Customers

  12. ?

  13. “construct a highly agile and
    highly available service from
    ephemeral and assumed
    broken components”
    - Adrian Cockcroft

  14. How does Monzo operate?
    - AWS, GCP, and physical data centres
    - Cloud Native technologies: Kubernetes, Docker, Calico, Cassandra, Kafka, NSQ,
      etcd, Prometheus, Jaeger, Envoy, Elasticsearch…
    - Go-based microservices

  15. Feb 2015 to Mar 2019
    1000 services

  22. Why Cassandra?
    - High Availability
    - Tuneable Consistency
    - Linear Scalability
    - Geographic Replication


  23. eu-west-1a eu-west-1b eu-west-1c

  24. CREATE KEYSPACE account WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'eu-west-1': '3'
    };
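
A minimal sketch of what tuneable consistency means for the replication factor of 3 in the keyspace above. It assumes the standard quorum rule (a majority of replicas) and the usual overlap condition R + W > RF; the function names are illustrative, not part of any Cassandra driver.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond at QUORUM: a strict majority of RF."""
    return replication_factor // 2 + 1


def read_sees_latest_write(rf: int, write_replicas: int, read_replicas: int) -> bool:
    """A read overlaps the latest acknowledged write when R + W > RF."""
    return read_replicas + write_replicas > rf


RF = 3  # replication factor taken from the keyspace definition
q = quorum(RF)

print("replicas needed for quorum:", q)
print("quorum write + quorum read overlap:", read_sees_latest_write(RF, q, q))
print("ONE write + ONE read overlap:", read_sees_latest_write(RF, 1, 1))
```

With three replicas and quorum reads and writes, one replica can be unavailable without failing requests, which is the trade-off the availability-zone diagrams illustrate (assuming one replica per zone).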

  28. Data Modelling

  29. @jrecursive

  30. Key | Column A | Column B | Column C

  31. ID | Column A | Column B | Column C

  32. Key | Value     | Value     | Value
          | Column A  | Column B  | Column C
          | Timestamp | Timestamp | Timestamp

  33. CREATE TABLE IF NOT EXISTS account (
    id text,
    userid text,
    created timestamp,
    currency text,
    country text,
    description text,
    type text,
    PRIMARY KEY ((id))
    );

  35. acc_as8d… | 2018-xx-xx    | Matt’s account | user_007xUi8…
                | “created”     | “description”  | “userid”
                | 1526917782000 | 1526917782000  | 1526917782000
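
A minimal sketch of why every cell carries its own timestamp: when two versions of a row are reconciled, the newest cell wins column by column. The types and the `merge_rows` helper are hypothetical, the values are made up, and tie-breaking on equal timestamps is ignored.

```python
from typing import Dict, Tuple

Cell = Tuple[object, int]  # (value, write timestamp)
Row = Dict[str, Cell]      # column name -> cell


def merge_rows(a: Row, b: Row) -> Row:
    """Merge two versions of a row, keeping the newest cell for each column."""
    merged = dict(a)
    for column, cell in b.items():
        if column not in merged or cell[1] > merged[column][1]:
            merged[column] = cell
    return merged


# Two replicas hold different versions of the same account row.
replica_1: Row = {
    "description": ("Matt's account", 1526917782000),
    "userid": ("user_1", 1526917782000),
}
replica_2: Row = {"description": ("Joint account", 1526917790000)}

merged = merge_rows(replica_1, replica_2)
print(merged["description"][0], "/", merged["userid"][0])
```

The later write to `description` replaces the older one, while `userid` survives untouched because it is a separate cell.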

  36. CREATE TABLE IF NOT EXISTS accounts_by_userid (
    id text,
    userid text,
    created timestamp,
    currency text,
    country text,
    description text,
    type text,
    PRIMARY KEY ((userid), id)
    );
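
A rough in-memory model of the `accounts_by_userid` layout, assuming nothing beyond the primary key shown above: `userid` selects a partition and `id` orders the rows inside it. The class is a teaching sketch, not a Cassandra client.

```python
from collections import defaultdict
from typing import Dict, List


class AccountsByUserID:
    """Toy model of PRIMARY KEY ((userid), id)."""

    def __init__(self) -> None:
        # partition key (userid) -> clustering key (id) -> row
        self._partitions: Dict[str, Dict[str, dict]] = defaultdict(dict)

    def upsert(self, userid: str, account_id: str, **columns: object) -> None:
        """Writing the same primary key twice overwrites the row."""
        self._partitions[userid][account_id] = {
            "id": account_id, "userid": userid, **columns,
        }

    def select_by_userid(self, userid: str) -> List[dict]:
        """One partition read returns every account, ordered by clustering key."""
        partition = self._partitions.get(userid, {})
        return [partition[key] for key in sorted(partition)]


table = AccountsByUserID()
table.upsert("user_1", "acc_b", currency="GBP")
table.upsert("user_1", "acc_a", currency="EUR")
table.upsert("user_2", "acc_c", currency="GBP")

for row in table.select_by_userid("user_1"):
    print(row["id"], row["currency"])
```

All accounts for a user live in one partition, so listing them is a single-partition read; the cost is that the partition grows with every row added under that key.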

  37. Partition Key | Row 1 | Row 2 | Row 3

  40. Partition Key

  42. Partition Key | Row 1 | Row infinity

  43. User ID | Account 1 | Account 2 | Account 3
    No-one has infinite accounts!

  44. CREATE TABLE IF NOT EXISTS transaction (
    id text,
    accountid text,
    created timestamp,
    currency text,
    amount bigint,
    description text,
    PRIMARY KEY ((id))
    );

  45. Transaction ID | Column A | Column B | Column C

  46. Transaction ID | Column A | Column B | Column C
    Must know primary key: can’t iterate

  47. CREATE TABLE IF NOT EXISTS transaction_by_account (
    id text,
    accountid text,
    created timestamp,
    currency text,
    amount bigint,
    description text,
    PRIMARY KEY ((accountid), id)
    );

  48. Account ID | Transaction 1 | Transaction infinity

  49. Timeseries:
    Partition by Time ⏱

  50. Partition 2 3 4 5 7

  51. Bucket 2 3 4 5 7
    Composite Partition Key
    (Time range and Account ID)
    PRIMARY KEY ((accountid, timebucket), created, id)
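
A minimal sketch of how the `timebucket` part of the composite partition key could be computed. The one-day bucket size and the `YYYY-MM-DD` string format are assumptions for illustration; the slides do not state which bucket size or encoding is used.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple


def day_bucket(created: datetime) -> str:
    """Map a timezone-aware timestamp to a one-day bucket label in UTC."""
    return created.astimezone(timezone.utc).strftime("%Y-%m-%d")


def partition_key(accountid: str, created: datetime) -> Tuple[str, str]:
    """Composite partition key: (accountid, timebucket)."""
    return (accountid, day_bucket(created))


late_evening = datetime(2019, 3, 7, 23, 59, tzinfo=timezone.utc)
just_after_midnight_cet = datetime(2019, 3, 8, 0, 30, tzinfo=timezone(timedelta(hours=1)))

print(partition_key("acc_example", late_evening))
print(partition_key("acc_example", just_after_midnight_cet))
```

Normalising to UTC before bucketing keeps writers in different time zones agreeing on which partition a row belongs to.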

  52. Day 1: 2 3 4 5 7
    Composite Partition Key
    (Time range and Account ID)
    PRIMARY KEY ((accountid, timebucket), created, id)

  53. Day 1: 2 3 4 5 7
    Day 2: 9 12 15 16 17 18 19 20 21 22 23 24

  54. Day 1: 2 3 4 5 7
    Day 2: 9 12 15 16 17 18 19 20 21 22 23 24
    “Return last 100 transactions”

  55. Day 3: 9 12 15 16 17 18 19 20 21 22 23 24
    Day 2: No data in this time period
    Day 1: 2 3 4 5 7

  56. Day 1000: 9 12 15 16 17 18 19 20 21 22 23 24
    Day 999, Day 998, …, Day 3, Day 2: No data in this time period
    Day 1: 2 3 4 5 7

  57. Day 1000: 9 12 15 16 17 18 19 20 21 22 23 24
    Day 999
    Day 998
    Day 3
    Day 2
    Day 1
    Did the data ever exist?! When do we stop…?
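
A minimal sketch of the "return last 100 transactions" read against day buckets, to show where the cost comes from. The in-memory `Store` and the `oldest_day` lower bound (for example, the account's creation date) are assumptions; without some such bound the loop has no principled place to stop.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple

# Hypothetical store: (accountid, day bucket) -> transaction IDs, oldest first.
Store = Dict[Tuple[str, date], List[str]]


def last_n(store: Store, accountid: str, n: int,
           today: date, oldest_day: date) -> Tuple[List[str], int]:
    """Walk buckets backwards from today until n rows are found or the
    lower bound is reached. Returns (rows newest first, partitions read)."""
    results: List[str] = []
    partitions_read = 0
    day = today
    while len(results) < n and day >= oldest_day:
        rows = store.get((accountid, day), [])  # one partition read, possibly empty
        partitions_read += 1
        results.extend(reversed(rows))
        day -= timedelta(days=1)
    return results[:n], partitions_read


first_day = date(2016, 6, 12)
today = first_day + timedelta(days=999)
store: Store = {
    ("acc_example", first_day): ["tx_2", "tx_3"],
    ("acc_example", today): ["tx_9", "tx_12"],
}

print(last_n(store, "acc_example", 2, today, first_day))
print(last_n(store, "acc_example", 100, today, first_day))
```

When the newest bucket satisfies the request, one partition is read. When it does not, every empty bucket in between costs a read, which is the situation the Day 1 to Day 1000 slides describe.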

  58. Is your data predictable?
    How do you choose your bucket size?

  60. Day 1: 2 3 4 5 7
    Day 2: 9 12 15 16 17 18 19 20 21 22 23 24
    Hot Partition Key

  61. eu-west-1a eu-west-1b eu-west-1c
    Partition Key = Day 1

  62. eu-west-1a eu-west-1b eu-west-1c
    Partition Key = Day 2

  63. Is your data predictable?
    How do you choose your bucket size?

    Ensure correct partitioning!
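
A small simulation of the hot partition key problem. It assumes a six-node cluster with made-up node names and uses an MD5 hash modulo the node count as a stand-in for Cassandra's partitioner and token ring, so only the shape of the result is meaningful.

```python
import hashlib
from collections import Counter
from typing import Iterable

NODES = ["node-a", "node-b", "node-c", "node-d", "node-e", "node-f"]  # hypothetical


def owner(partition_key: tuple) -> str:
    """Pick the node that owns a partition key (simplified placement)."""
    digest = hashlib.md5(repr(partition_key).encode()).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


def write_distribution(keys: Iterable[tuple]) -> Counter:
    """Count how many writes each node receives."""
    return Counter(owner(key) for key in keys)


accounts = [f"acc_{i}" for i in range(1000)]

# Partitioning by day only: every write today targets the same partition.
by_day_only = write_distribution(("2019-03-07",) for _ in accounts)

# Partitioning by (accountid, day): writes spread across the cluster.
by_account_and_day = write_distribution((a, "2019-03-07") for a in accounts)

print("day only:", dict(by_day_only))
print("account and day:", dict(by_account_and_day))
```

With the day as the only partition key, a single node (and its replicas) absorbs all of the current writes and then goes idle when the day rolls over; including the account ID spreads the same load over the ring.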

  64. Flakeseries:
    Partition by Time ⏱
    Retrieve by ID

  66. Bucket 2 3 4 5 7
    Time range:
    We need to know the timestamp to read
    PRIMARY KEY ((accountid, timebucket), created, id)

  67. Bucket 2 3 4 5 7
    Time range:
    We need to know the timestamp to read
    PRIMARY KEY ((accountid, timebucket), created, id)
    accountid = acc_00009Wd3Yeh2O329bFTVHF

  68. accountid = acc_00009Wd3Yeh2O329bFTVHF

  69. accountid = acc_00009Wd3Yeh2O329bFTVHF
    Flake IDs = Time based lexically sortable IDs

  70. accountid = acc_00009Wd3Yeh2O329bFTVHF
    Flake IDs = Time based lexically sortable IDs
    Base62 encoded 128bit Int
    e.g. 26341991268378369512474991263748

  71. accountid = acc_00009Wd3Yeh2O329bFTVHF
    Flake IDs = Time based lexically sortable IDs
    Base62 encoded 128bit Int
    e.g. 26341991268378369512474991263748
    64 bits - Time in ms since epoch
    48 bits - Worker ID
    16 bits - Sequence ID
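
A minimal sketch of a flake ID built from the bit layout above: 64 bits of time, 48 bits of worker ID and 16 bits of sequence, Base62 encoded. The alphabet order (digits, then uppercase, then lowercase), the fixed width of 22 characters and the function names are assumptions chosen so that string order matches numeric order; the slides do not specify them.

```python
# ASCII-ordered alphabet, so fixed-width strings sort like the numbers they encode.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
WIDTH = 22  # 62**22 exceeds 2**128, so 22 characters always suffice


def base62_encode(n: int) -> str:
    """Encode a non-negative integer as zero-padded Base62."""
    chars = []
    while n:
        n, rem = divmod(n, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars)).rjust(WIDTH, "0")


def base62_decode(s: str) -> int:
    """Inverse of base62_encode."""
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n


def make_flake(prefix: str, time_ms: int, worker_id: int, sequence: int) -> str:
    """Pack time (64 bits), worker (48 bits) and sequence (16 bits) into an ID."""
    if not (0 <= time_ms < 1 << 64 and 0 <= worker_id < 1 << 48 and 0 <= sequence < 1 << 16):
        raise ValueError("field out of range")
    value = (time_ms << 64) | (worker_id << 16) | sequence
    return f"{prefix}_{base62_encode(value)}"


def flake_time_ms(flake_id: str) -> int:
    """Recover the creation time from the top 64 bits of the ID."""
    return base62_decode(flake_id.split("_", 1)[1]) >> 64


earlier = make_flake("tx", 1551916800000, worker_id=1, sequence=0)
later = make_flake("tx", 1551916800001, worker_id=1, sequence=0)
print(earlier, later, earlier < later)
```

Because the timestamp occupies the most significant bits, sorting IDs as strings sorts them by creation time, and the time can be recovered from the ID alone.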

  73. accountid = acc_00009Wd3Yeh2O329bFTVHF
    transactionid = tx_00009gEBzyFoAtFYllr9Qf

  74. accountid = acc_00009Wd3Yeh2O329bFTVHF
    transactionid = tx_00009gEBzyFoAtFYllr9Qf
    PRIMARY KEY ((bucket), flake_created, transactionid)

  75. accountid = acc_00009Wd3Yeh2O329bFTVHF
    transactionid = tx_00009gEBzyFoAtFYllr9Qf
    PRIMARY KEY ((bucket), flake_created, transactionid)
    PRIMARY KEY ((accountid, bucket), flake_created, transactionid)
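
A minimal sketch of "retrieve by ID": because a flake ID embeds its creation time, the bucket (and so the full partition key) can be derived from the transaction ID without knowing the timestamp separately. The Base62 alphabet, the one-day bucket and the helper names are assumptions, as is the top-64-bits time layout being applied to transaction IDs.

```python
from datetime import datetime, timezone
from typing import Tuple

# Assumed ASCII-ordered Base62 alphabet.
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def flake_time_ms(flake_id: str) -> int:
    """Decode the Base62 body and return the top 64 bits (ms since epoch)."""
    value = 0
    for ch in flake_id.split("_", 1)[1]:
        value = value * 62 + ALPHABET.index(ch)
    return value >> 64


def bucket_for(flake_id: str) -> str:
    """Derive a one-day bucket label (UTC) from the ID's embedded time."""
    seconds = flake_time_ms(flake_id) // 1000
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime("%Y-%m-%d")


def partition_for(accountid: str, transactionid: str) -> Tuple[str, str]:
    """Partition key (accountid, bucket) needed to read one transaction."""
    return (accountid, bucket_for(transactionid))


# An all-zero ID decodes to time 0, i.e. the Unix epoch.
print(partition_for("acc_example", "tx_" + "0" * 22))
```

A reader holding only the account ID and transaction ID can therefore go straight to the right partition instead of scanning buckets.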

  76. TTLing your data ⏱

  78. Key: 2 3 4 5 7
    Key: 9 12 15 16 17 18 19 20 21 22 23 24

  79. Key: 2 3 4 5 7
    Key: 9 12 15 16 17 18 19 20 21 22 23 24
    Mixed TTLs across same data set

  80. Key: 2 3 4 5 7
    Key: 9 12 15 16 17 18 19 20 21 22 23 24
    Mixed TTLs across same data set
    Compacted into same sstables

  81. Key: 2 4 5 7
    Key: 9 12 15 17 18 19 20 21 23 24
    First set of rows expire

  82. Key: 2 4 7
    Key: 9 12 15 17 21 23 24
    Second set of rows expire

  83. No data is removed until compaction occurs
    Key: 2 3 4 5 7
    Key: 9 12 15 16 17 18 19 20 21 22 23 24
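
A simplified model of the point above: expired rows stop being returned by reads, but they still occupy the sstable until a compaction rewrites it. The `Row` type and helpers are hypothetical, and the model ignores tombstones and `gc_grace_seconds`, which delay physical removal further in a real cluster.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Row:
    key: int
    expires_at: int  # absolute expiry time in seconds (write time + TTL)


def visible_rows(sstable: List[Row], now: int) -> List[Row]:
    """What a read returns: expired rows are filtered out at read time."""
    return [row for row in sstable if row.expires_at > now]


def compact(sstables: List[List[Row]], now: int) -> List[Row]:
    """Rewrite sstables into one, physically dropping expired rows."""
    return [row for sstable in sstables for row in sstable if row.expires_at > now]


# Rows with mixed TTLs end up in the same sstable.
sstable = [Row(2, 500), Row(3, 100), Row(4, 500), Row(5, 200)]
now = 150

print("rows returned by a read:", len(visible_rows(sstable, now)))
print("rows still on disk:", len(sstable))
print("rows on disk after compaction:", len(compact([sstable], now)))
```

With mixed TTLs in one sstable there is always some live data left, so the file can only shrink through a rewrite.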

  84. Time Window Compaction Strategy
    Create sstables per time range
    (diagram: rows grouped into sstables for Bucket 1, Bucket 2 and Bucket 3)

  85. Time Window Compaction Strategy
    sstables dropped once all data has expired
    (diagram: rows grouped into sstables for Bucket 1, Bucket 2 and Bucket 3)
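
A toy model of the idea behind Time Window Compaction Strategy: rows are grouped into sstables by the time window in which they were written, and a whole sstable can be dropped once every row in it has expired. The tuple layout, window size and function names are assumptions; real TWCS compacts flushed sstables within each window rather than writing them pre-grouped.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

RowTimes = Tuple[int, int]  # (written_at, expires_at), both in seconds


def flush_by_window(rows: Iterable[RowTimes], window_seconds: int) -> Dict[int, List[RowTimes]]:
    """Group rows into one sstable per write-time window."""
    sstables: Dict[int, List[RowTimes]] = defaultdict(list)
    for written_at, expires_at in rows:
        sstables[written_at // window_seconds].append((written_at, expires_at))
    return dict(sstables)


def drop_fully_expired(sstables: Dict[int, List[RowTimes]], now: int) -> Dict[int, List[RowTimes]]:
    """Keep only sstables that still contain at least one live row."""
    return {
        window: rows
        for window, rows in sstables.items()
        if any(expires_at > now for _, expires_at in rows)
    }


# Every row uses the same TTL of 100 seconds; windows are 60 seconds wide.
rows = [(0, 100), (10, 110), (70, 170), (130, 230)]
sstables = flush_by_window(rows, window_seconds=60)

print(sorted(drop_fully_expired(sstables, now=105)))  # oldest window still has a live row
print(sorted(drop_fully_expired(sstables, now=120)))  # oldest window fully expired
```

With a uniform TTL, rows written together expire together, so old sstables become entirely dead and can be deleted without rewriting anything.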

  88. Data modelling takeaways
    - Correct data modelling is incredibly important!
    - Wide rows are OK, up to a point
    - Repairs on wide rows are problematic
    - Make Timeseries buckets predictable
    - Watch for Hot Keys!
    - TTLs don’t always mean your data is deleted

  89. Nov 2015 to Mar 2019
    What works here
    Might not work here

  91. monzo.com/careers
