Upgrade to Pro — share decks privately, control downloads, hide ads and more …

[第2回 Azure Cosmos DB 勉強会] Data modelling and partitioning in Azure Cosmos DB (Azure Cosmos DB でのデータモデリングとパーティション分割)

[第2回 Azure Cosmos DB 勉強会] Data modelling and partitioning in Azure Cosmos DB (Azure Cosmos DB でのデータモデリングとパーティション分割)

SATO Naoki (Neo)

September 13, 2020
Tweet

More Decks by SATO Naoki (Neo)

Other Decks in Technology

Transcript

  1. Data modelling and partitioning in Azure
    Cosmos DB
    (Azure Cosmos DB でのデータ モデリングとパーティション分割)

    View Slide

  2. Session's objectives

    View Slide

  3. What is Azure Cosmos DB?
    Non-relational and horizontally scalable

    View Slide

  4. What is Azure Cosmos DB?
    horizontally scalable

    View Slide

  5. What is Azure Cosmos DB?
    non-relational

    View Slide

  6. What is Azure Cosmos DB?
    non-relational
    and
    horizontally scalable

    View Slide

  7. So is Azure Cosmos DB suitable for relational
    workloads?

    View Slide

  8. Let's look at a concrete example

    View Slide

  9. Identifying the operations we have to serve

    View Slide

  10. Now let's implement this model on Azure Cosmos DB!

    View Slide

  11. Starting with the Customer entity

    View Slide

  12. Starting with the Customer entity

    View Slide

  13. To embed or to reference?

    View Slide

  14. To embed or to reference?
    -
    -
    -
    -
    -
    -

    View Slide

  15. Our first entity: Customer

    View Slide

  16. Customer
    customers
    PK: ?

    View Slide

  17. What is partitioning?

    View Slide

  18. What is partitioning?
    logical partitions

    View Slide

  19. What is partitioning?
    Andrew
    Theo
    Mark
    Tim
    Deborah Luis

    View Slide

  20. What is partitioning?
    Max size: 20 GB
    Max size: 2 MB

    View Slide

  21. What is partitioning?

    View Slide

  22. What is partitioning?

    View Slide

  23. What is partitioning?

    View Slide

  24. What is partitioning?
    Andrew Theo
    Mark
    Tim
    Deborah Luis
    SELECT * FROM c WHERE c.username = 'Mark'
    our partition key

    View Slide

  25. What is partitioning?
    Andrew Theo
    Mark
    Tim
    Deborah Luis
    SELECT * FROM c WHERE c.favoriteColor = 'orange'
    ?

    View Slide

  26. Choosing a partition key for customers
    customers
    PK: ?

    View Slide

  27. Choosing a partition key for customers
    customers
    PK: ?

    View Slide

  28. Choosing a partition key for customers
    customers
    PK: id

    View Slide

  29. Choosing a partition key for customers
    customers
    PK: id

    View Slide

  30. Next: product categories

    View Slide

  31. Product categories

    View Slide

  32. Product categories
    productCategories
    PK: ?

    View Slide

  33. Product categories
    productCategories
    PK: ?
    SELECT * FROM c

    View Slide

  34. Product categories
    productCategories
    PK: type

    View Slide

  35. Next: product tags

    View Slide

  36. Product tags

    View Slide

  37. Product tags
    productTags
    PK: ?

    View Slide

  38. Product tags
    productTags
    PK: ?

    View Slide

  39. Product tags
    productTags
    PK: type

    View Slide

  40. Next: products

    View Slide

  41. Products

    View Slide

  42. Products

    View Slide

  43. Products
    products
    PK: ?

    View Slide

  44. Products
    products
    PK: ?
    CategoryA CategoryC
    CategoryB
    SELECT * FROM c WHERE c.categoryId = 'CategoryA'

    View Slide

  45. Products
    products
    PK: categoryId
    category name?
    tag names?

    View Slide

  46. Products: how to return category and tag names?
    products
    SELECT * FROM c WHERE c.categoryId = 'CategoryA'
    productCategories
    SELECT c.name FROM c WHERE c.id = 'CategoryA'
    productTags
    SELECT * FROM c
    WHERE c.id IN ('', '', '')

    View Slide

  47. Introducing denormalization

    View Slide

  48. Products: denormalizing category and tag names
    products
    PK: categoryId

    View Slide

  49. Products: keeping everything in sync
    productCategories
    productTags
    products

    View Slide

  50. Cosmos DB's change feed

    View Slide

  51. Products: keeping everything in sync
    productCategories
    productTags
    products

    View Slide

  52. Next: sales orders

    View Slide

  53. Sales orders

    View Slide

  54. Sales orders

    View Slide

  55. Sales orders
    salesOrders
    PK: ?

    View Slide

  56. Sales orders
    salesOrders
    PK: ?

    View Slide

  57. Sales orders
    salesOrders
    PK: ?
    CustomerA CustomerC
    CustomerB
    SELECT * FROM c WHERE c.customerId = 'CustomerA'

    View Slide

  58. Sales orders
    salesOrders
    PK: customerId

    View Slide

  59. Sales orders
    salesOrders
    PK: customerId
    customers
    PK: id

    View Slide

  60. Mixing entities in the same container?

    View Slide

  61. Sales orders
    salesOrders
    PK: customerId
    customers
    PK: id

    View Slide

  62. Sales orders: mixing with customers
    customers
    PK: id

    View Slide

  63. Sales orders: mixing with customers
    customers
    PK: customerId

    View Slide

  64. Sales orders: mixing with customers
    customers
    PK: customerId

    View Slide

  65. Sales orders: mixing with customers
    CustomerA
    CustomerC
    CustomerB
    customer sales orders
    customers
    PK: customerId

    View Slide

  66. Sales orders
    customers
    PK: customerId
    SELECT * FROM c WHERE c.customerId = 'CustomerA'
    AND c.type = 'salesOrder'

    View Slide

  67. Sales orders
    customers
    PK: customerId

    View Slide

  68. Denormalizing the count of sales orders per customer

    View Slide

  69. Denormalizing the count of sales orders per customer

    View Slide

  70. Denormalizing the count of sales orders per customer
    CustomerA
    CustomerC
    CustomerB
    customer sales orders
    customers
    PK: customerId

    View Slide

  71. Denormalizing the count of sales orders per customer
    CustomerA
    CustomerC
    CustomerB
    update the customer add a sales order
    customers
    PK: customerId

    View Slide

  72. Denormalizing the count of sales orders per customer
    CustomerA
    CustomerC
    CustomerB
    update the customer add a sales order

    View Slide

  73. Sales orders
    customers
    PK: customerId
    SELECT * FROM c WHERE c.type = 'customer'
    ORDER BY c.salesOrderCount DESC

    View Slide

  74. Our final design
    customers
    PK: customerId
    productCategories
    PK: type
    productTags
    PK: type
    products
    PK: categoryId

    View Slide

  75. Our final design, optimized!
    customers
    PK: customerId
    productMeta
    PK: type
    products
    PK: categoryId

    View Slide

  76. Key takeaways

    View Slide

  77. Going further
    https://docs.microsoft.com/azure/cosmos-db/modeling-data
    https://docs.microsoft.com/azure/cosmos-db/how-to-model-partition-example
    https://devblogs.microsoft.com/cosmosdb/data-modeling-and-partitioning-for-relational-workloads/
    https://github.com/AzureCosmosDB/labs/blob/master/readme.md
    https://github.com/AzureCosmosDB/labs/blob/master/decks/Data-Modeling.pptx

    View Slide