Modelling prototypes to critical systems with Cassandra

Cassandra provides highly scalable, resilient data storage for a variety of needs, but its effectiveness is often limited by the way we model our data, design schemas, and determine usage patterns. In this talk, Matt looks at some of the design patterns Monzo have adopted and developed, from the prototype stage to their highly scalable core banking systems.

mattheath

March 07, 2019

Transcript

  1. Mistakes were made…
    Modelling prototypes to critical
    systems with Cassandra
    Matt Heath, Monzo

  2. Hi, I’m Matt

  3. Nov
    2015
    Mar
    2019

  4. Nov
    2015
    Mar
    2019

  5. Nov
    2015
    Mar
    2019

  6. Nov
    2015
    Mar
    2019

  7. Nov
    2015
    Mar
    2019
    Fastest growing UK Bank
    1.6M Customers

  8. “construct a highly agile and
    highly available service from
    ephemeral and assumed
    broken components”
    - Adrian Cockcroft

  9. How does Monzo operate?
    - AWS, GCP, and physical data centres
    - Cloud Native technologies:
      Kubernetes, Docker, Calico, Cassandra, Kafka, NSQ,
      Etcd, Prometheus, Jaeger, Envoy, Elasticsearch…
    - Go-based microservices
 ⛅

  10. Feb
    2015
    Mar
    2019
    1000
    services

  11. Why Cassandra?
    - High Availability
    - Tuneable Consistency
    - Linear Scalability
    - Geographic Replication


  12. eu-west-1a eu-west-1b eu-west-1c

  13. CREATE KEYSPACE account WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'eu-west-1': '3'
    }
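
A replication factor of 3 in a region with three availability zones is what makes the availability and tuneable consistency claims concrete. The sketch below works through the quorum arithmetic; it assumes QUORUM reads and writes and assumes the snitch places one replica in each zone, neither of which is stated on the slide.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a QUORUM read or write."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be unavailable while QUORUM still succeeds."""
    return replication_factor - quorum(replication_factor)


rf = 3  # 'eu-west-1': '3' in the keyspace definition
print(quorum(rf))              # 2
print(tolerated_failures(rf))  # 1
```

Under those assumptions, losing a single zone leaves two replicas of every partition, which is still enough for a quorum.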

  14. eu-west-1a eu-west-1b eu-west-1c

  15. eu-west-1a eu-west-1b eu-west-1c


  16. eu-west-1a eu-west-1b eu-west-1c












  17. Data Modelling

  18. Key Column A Column B Column C

  19. ID Column A Column B Column C

  20. Key Value Value Value
    Column A Column B Column C
    Timestamp Timestamp Timestamp

  21. CREATE TABLE IF NOT EXISTS account (
    id text,
    userid text,
    created timestamp,
    currency text,
    country text,
    description text,
    type text,
    PRIMARY KEY ((id))
    );

  22. Key Value Value Value
    Column A Column B Column C
    Timestamp Timestamp Timestamp

  23. acc_as8d… 2018-xx-xx Matt’s account user_007xUi8…
    “created” “description” “userid”
    1526917782000 1526917782000 1526917782000

  24. CREATE TABLE IF NOT EXISTS accounts_by_userid (
    id text,
    userid text,
    created timestamp,
    currency text,
    country text,
    description text,
    type text,
    PRIMARY KEY ((userid), id)
    );
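
With `PRIMARY KEY ((userid), id)`, `userid` is the partition key and `id` is the clustering column, so all of a user's accounts live in one partition ordered by `id`. The sketch below imitates that layout with an in-memory dictionary; the sample rows are hypothetical and nothing here talks to Cassandra.

```python
from collections import defaultdict

# Hypothetical rows for illustration only.
rows = [
    {"userid": "user_a", "id": "acc_2", "description": "Joint account"},
    {"userid": "user_b", "id": "acc_3", "description": "Current account"},
    {"userid": "user_a", "id": "acc_1", "description": "Current account"},
]


def group_by_partition(rows):
    """Group rows by partition key (userid), sorted by clustering column (id)."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["userid"]].append(row)
    for partition in partitions.values():
        partition.sort(key=lambda r: r["id"])
    return dict(partitions)


partitions = group_by_partition(rows)
print([r["id"] for r in partitions["user_a"]])  # ['acc_1', 'acc_2']
```

Reading every account for a user then touches a single partition, which is the access pattern this table exists to serve.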

  25. Partition Key Row 1 Row 2 Row 3

  26. Partition Key Row 1 Row 2 Row 3

  27. Partition Key

  28. Partition Key Row 1 Row 2 Row 3

  29. Partition Key Row 1 Row infinity

  30. User ID Account 1 Account 2 Account 3

    No-one has infinite accounts!

  31. CREATE TABLE IF NOT EXISTS transaction (
    id text,
    accountid text,
    created timestamp,
    currency text,
    amount bigint,
    description text,
    PRIMARY KEY ((id))
    );

  32. Transaction ID Column A Column B Column C

  33. Transaction ID Column A Column B Column C
    Must know primary key — can’t iterate

  34. CREATE TABLE IF NOT EXISTS transaction_by_account (
    id text,
    accountid text,
    created timestamp,
    currency text,
    amount bigint,
    description text,
    PRIMARY KEY ((accountid), id)
    );

  35. Account ID Transaction 1 Transaction infinity

  36. Timeseries:
    Partition by Time ⏱

  37. Partition 2 3 4 5 7

  38. Bucket 2 3 4 5 7
    Composite Partition Key
    (Time range and Account ID)
    PRIMARY KEY ((accountid, timebucket), created, id)

  39. Day 1 2 3 4 5 7
    Composite Partition Key
    (Time range and Account ID)
    PRIMARY KEY ((accountid, timebucket), created, id)
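
The composite partition key needs a `timebucket` value computed from each row's `created` timestamp at write time. A minimal sketch of that derivation follows, assuming one bucket per UTC day and a `YYYY-MM-DD` string format; both are illustrative choices, and the slides only label the buckets as days.

```python
from datetime import datetime, timezone


def day_bucket(created: datetime) -> str:
    """Map a timezone-aware timestamp to its UTC day bucket."""
    return created.astimezone(timezone.utc).strftime("%Y-%m-%d")


def partition_key(accountid: str, created: datetime) -> tuple:
    """Composite partition key: (accountid, timebucket)."""
    return (accountid, day_bucket(created))


late = datetime(2019, 3, 7, 23, 59, tzinfo=timezone.utc)
early = datetime(2019, 3, 8, 0, 1, tzinfo=timezone.utc)
print(partition_key("acc_example", late))   # ('acc_example', '2019-03-07')
print(partition_key("acc_example", early))  # ('acc_example', '2019-03-08')
```

Two transactions a couple of minutes apart can land in different partitions, so every reader has to compute the bucket in exactly the same way as the writer.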

  40. Day 1 2 3 4 5 7
    9 12 15 16 17 18 19 20
    Day 2 21 22 24
    23

  41. Day 1 2 3 4 5 7
    9 12 15 16 17 18 19 20
    Day 2 21 22 24
    23
    “Return last 100 transactions”

  42. 9 12 15 16 17 18 19 20
    Day 3 21 22 24
    23
    Day 2
    Day 1 2 3 4 5 7
    No data in this time period

  43. 9 12 15 16 17 18 19 20
    Day 1000 21 22 24
    23
    Day 999
    Day 1 2 3 4 5 7
    Day 998
    Day 3
    Day 2
    … No data in this time period

  44. 9 12 15 16 17 18 19 20
    Day 1000 21 22 24
    23
    Day 999
    Day 1
    Day 998
    Day 3
    Day 2

    Did the data ever exist?! When do we stop…?
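
One way to answer "when do we stop?" is to give the backwards walk an explicit lower bound. The sketch below uses the account's creation date as that bound, which is an assumption rather than something the slides prescribe. A plain dictionary keyed by `(accountid, bucket)` stands in for the table, and `last_n_transactions` is a hypothetical helper.

```python
from datetime import date, timedelta


def last_n_transactions(store, accountid, today, account_created, limit=100):
    """Walk day buckets from newest to oldest until `limit` rows are found
    or the account's creation date is passed."""
    results = []
    day = today
    while day >= account_created and len(results) < limit:
        rows = store.get((accountid, day.isoformat()), [])
        # Newest first within the bucket; take only what is still needed.
        results.extend(sorted(rows, reverse=True)[: limit - len(results)])
        day -= timedelta(days=1)
    return results


# Hypothetical data: (created, id) tuples per (accountid, bucket) partition.
store = {
    ("acc_x", "2019-03-07"): [("2019-03-07T09:00", "tx_3"), ("2019-03-07T10:00", "tx_4")],
    ("acc_x", "2019-03-02"): [("2019-03-02T12:00", "tx_1"), ("2019-03-02T13:00", "tx_2")],
}
recent = last_n_transactions(store, "acc_x", date(2019, 3, 7), date(2019, 3, 1), limit=3)
print([tx_id for _, tx_id in recent])  # ['tx_4', 'tx_3', 'tx_2']
```

Each empty day still costs a read, so sparse accounts with small buckets make this loop expensive even when it terminates.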

  45. Is your data predictable?
    How do you choose your bucket size?

  46. Day 1 2 3 4 5 7
    9 12 15 16 17 18 19 20
    Day 2 21 22 24
    23

  47. Day 1 2 3 4 5 7
    9 12 15 16 17 18 19 20
    Day 2 21 22 24
    23
    Hot Partition Key

  48. eu-west-1a eu-west-1b eu-west-1c



    Partition Key = Day 1

  49. eu-west-1a eu-west-1b eu-west-1c


    Partition Key = Day 2

  50. Is your data predictable?
    How do you choose your bucket size?

    Ensure correct partitioning!
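
The hot partition problem can be seen by counting writes per partition key. The sketch below compares a key made of the day alone with a composite of account and day, using a handful of hypothetical writes that all happen on the same day.

```python
from collections import Counter

# Hypothetical writes: (accountid, day bucket), all on one day.
writes = [
    ("acc_a", "2019-03-07"),
    ("acc_b", "2019-03-07"),
    ("acc_c", "2019-03-07"),
    ("acc_a", "2019-03-07"),
]

# Partition key = day only: every write lands in the same partition.
by_day_only = Counter(day for _, day in writes)

# Partition key = (accountid, day): writes spread across accounts.
by_account_and_day = Counter(writes)

print(max(by_day_only.values()))         # 4
print(max(by_account_and_day.values()))  # 2
```

With the day-only key, today's partition and its replicas absorb all current traffic while the rest of the cluster sits idle.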

  51. Flakeseries:
    Partition by Time ⏱
    Retrieve by ID

  52. Bucket 2 3 4 5 7
    Composite Partition Key
    (Time range and Account ID)
    PRIMARY KEY ((accountid, timebucket), created, id)

  53. Bucket 2 3 4 5 7
    Time range:
    We need to know the timestamp to read
    PRIMARY KEY ((accountid, timebucket), created, id)

  54. Bucket 2 3 4 5 7
    Time range:
    We need to know the timestamp to read
    PRIMARY KEY ((accountid, timebucket), created, id)
    accountid = acc_00009Wd3Yeh2O329bFTVHF

  55. accountid = acc_00009Wd3Yeh2O329bFTVHF

  56. accountid = acc_00009Wd3Yeh2O329bFTVHF
    Flake IDs = Time based lexically sortable IDs

  57. accountid = acc_00009Wd3Yeh2O329bFTVHF
    Flake IDs = Time based lexically sortable IDs
    Base62 encoded 128bit Int

    eg 26341991268378369512474991263748

  58. accountid = acc_00009Wd3Yeh2O329bFTVHF
    Flake IDs = Time based lexically sortable IDs
    Base62 encoded 128bit Int

    eg 26341991268378369512474991263748
    64 bits - Time in ms since epoch

    48 bits - Worker ID
    16 bits - Sequence ID
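
The bit layout above can be sketched as follows. This is an illustration of the layout on the slide, not Monzo's implementation: the alphabet ordering (digits, then upper case, then lower case) and the fixed width of 22 characters are assumptions chosen so that string order matches numeric order, and `make_flake` is a hypothetical helper.

```python
import string

# Assumed alphabet in ASCII order, so lexical order equals numeric order.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def make_flake(time_ms: int, worker_id: int, sequence: int) -> int:
    """Pack 64 bits of time, 48 bits of worker ID, 16 bits of sequence."""
    assert 0 <= time_ms < 2**64
    assert 0 <= worker_id < 2**48
    assert 0 <= sequence < 2**16
    return (time_ms << 64) | (worker_id << 16) | sequence


def base62_encode(value: int, width: int = 22) -> str:
    """Encode as fixed-width base62; 22 characters cover any 128-bit value."""
    digits = []
    while value:
        value, remainder = divmod(value, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits)).rjust(width, "0")


# An arbitrary millisecond timestamp and the one immediately after it.
older = base62_encode(make_flake(1_551_916_800_000, 42, 0))
newer = base62_encode(make_flake(1_551_916_800_001, 7, 0))
print(older < newer)  # True
```

Because the timestamp occupies the most significant bits, a later ID sorts after an earlier one regardless of which worker generated it.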

  59. accountid = acc_00009Wd3Yeh2O329bFTVHF

  60. accountid = acc_00009Wd3Yeh2O329bFTVHF
    transactionid = tx_00009gEBzyFoAtFYllr9Qf

  61. accountid = acc_00009Wd3Yeh2O329bFTVHF
    transactionid = tx_00009gEBzyFoAtFYllr9Qf
    PRIMARY KEY ((bucket), flake_created, transactionid)

  62. accountid = acc_00009Wd3Yeh2O329bFTVHF
    transactionid = tx_00009gEBzyFoAtFYllr9Qf
    PRIMARY KEY ((bucket), flake_created, transactionid)
    PRIMARY KEY ((accountid, bucket), flake_created, transactionid)
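
Since a flake ID carries its own creation time, the bucket for a row can be recomputed from the ID alone, which is what allows retrieval by ID from a time-partitioned table. The sketch below assumes a base62 alphabet in ASCII order, a one-day bucket, and a `prefix_` naming scheme; the example ID is generated in the code from a known timestamp rather than taken from the slides.

```python
import string
from datetime import datetime, timezone

# Assumed alphabet: digits, then upper case, then lower case (ASCII order).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62_decode(text: str) -> int:
    value = 0
    for char in text:
        value = value * 62 + ALPHABET.index(char)
    return value


def base62_encode(value: int, width: int = 22) -> str:
    digits = []
    while value:
        value, remainder = divmod(value, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits)).rjust(width, "0")


def bucket_for(prefixed_id: str) -> str:
    """Derive a UTC day bucket from an ID of the form <prefix>_<flake>."""
    _, _, encoded = prefixed_id.partition("_")
    time_ms = base62_decode(encoded) >> 64  # top 64 bits hold the timestamp
    created = datetime.fromtimestamp(time_ms // 1000, tz=timezone.utc)
    return created.strftime("%Y-%m-%d")


# Hypothetical ID built from a known timestamp so the round trip can be checked.
example_id = "tx_" + base62_encode((1_551_916_800_000 << 64) | (42 << 16) | 1)
print(bucket_for(example_id))  # 2019-03-07
```

A lookup by transaction ID can then compute the bucket, combine it with the account ID, and read exactly one partition.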

  63. TTLing your data ⏱

  64. TTLing your data

  65. Key 2 3 4 5 7
    9 12 15 16 17 18 19 20
    Key 21 22 24
    23

  66. Key 2 3 4 5 7
    9 12 15 16 17 18 19 20
    Key 21 22 24
    23
    Mixed TTLs across same data set


  67. Key 2 3 4 5 7
    9 12 15 16 17 18 19 20
    Key 21 22 24
    23
    Mixed TTLs across same data set

    Compacted into same sstables

  68. Key 2 4 5 7
    9 12 15 17 18 19 20
    Key 21 24
    23
    First set of rows expire

  69. Key 2 4 7
    9 12 15 17
    Key 21 24
    23
    Second set of rows expire

  70. No data is removed until compaction occurs
    Key 2 3 4 5 7
    9 12 15 16 17 18 19 20
    Key 21 22 24
    23

  71. Time Window Compaction Strategy
    Create sstables per time range
    Bucket 1
    Bucket 2
    9 12 15 16 17 18 19 20
    Bucket 3 21 22 24
    23
    2 3 4 5 7
    9 12 15 16 17 18 19 20 21 22 24
    23

  72. Time Window Compaction Strategy
    sstables dropped once all data has expired
    Bucket 1
    Bucket 2
    9 12 15 16 17 18 19 20
    Bucket 3 21 22 24
    23
    2 3 4 5 7
    9 12 15 16 17 18 19 20 21 22 24
    23

  73. Time Window Compaction Strategy
    sstables dropped once all data has expired
    Bucket 1
    Bucket 2
    9 12 15 16 17 18 19 20
    Bucket 3 21 22 24
    23
    2 3 4 5 7
    9 12 15 16 17 18 19 20 21 22 24
    23

  74. Time Window Compaction Strategy
    sstables dropped once all data has expired
    Bucket 1
    Bucket 2
    9 12 15 16 17 18 19 20
    Bucket 3 21 22 24
    23
    2 3 4 5 7
    9 12 15 16 17 18 19 20 21 22 24
    23
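
The effect of mixed TTLs on time-windowed sstables can be shown with a toy model. It is a simplification, not how Cassandra implements the strategy: it assumes a one-day window, treats each window as a single sstable, and ignores `gc_grace_seconds`, tombstones, and overlap checks.

```python
from collections import defaultdict

WINDOW = 86_400  # assumed window size: one day, in seconds


def build_windows(rows):
    """Group (write_time, ttl) rows into windows, keeping each row's expiry."""
    windows = defaultdict(list)
    for write_time, ttl in rows:
        windows[write_time // WINDOW].append(write_time + ttl)
    return windows


def droppable_windows(windows, now):
    """A window can be dropped whole only once its latest expiry has passed."""
    return sorted(w for w, expiries in windows.items() if max(expiries) <= now)


rows = [
    (0, 3_600),                     # window 0, one-hour TTL
    (100, 3_600),                   # window 0, one-hour TTL
    (86_400 + 10, 3_600),           # window 1, one-hour TTL
    (86_400 + 20, 10 * 86_400),     # window 1, ten-day TTL
]
print(droppable_windows(build_windows(rows), now=2 * 86_400))  # [0]
```

In this model a single long-lived row keeps window 1 on disk after its short-lived neighbour has expired, which is why mixing TTLs within one data set undermines the benefit.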

  75. Data modelling takeaways
    - Correct data modelling is incredibly important!
    - Wide rows are ok to a point
    - Repairs on wide rows are problematic
    - Make Timeseries buckets predictable
    - Watch for Hot Keys!
    - TTLs don’t always mean your data is deleted

  76. Nov
    2015
    Mar
    2019
    What works here
    Might not work here

  77. monzo.com/careers
