Slide 1

Slide 1 text

Data modelling and partitioning in Azure Cosmos DB (Azure Cosmos DB でのデータ モデリングとパーティション分割)

Slide 2

Slide 2 text

Session's objectives

Slide 3

Slide 3 text

What is Azure Cosmos DB? Non-relational and horizontally scalable

Slide 4

Slide 4 text

What is Azure Cosmos DB? horizontally scalable

Slide 5

Slide 5 text

What is Azure Cosmos DB? non-relational

Slide 6

Slide 6 text

What is Azure Cosmos DB? non-relational and horizontally scalable

Slide 7

Slide 7 text

So is Azure Cosmos DB suitable for relational workloads?

Slide 8

Slide 8 text

Let's look at a concrete example

Slide 9

Slide 9 text

Identifying the operations we have to serve

Slide 10

Slide 10 text

Now let's implement this model on Azure Cosmos DB!

Slide 11

Slide 11 text

Starting with the Customer entity

Slide 12

Slide 12 text

Starting with the Customer entity

Slide 13

Slide 13 text

To embed or to reference?

Slide 14

Slide 14 text

To embed or to reference? - - - - - -

Slide 15

Slide 15 text

Our first entity: Customer

Slide 16

Slide 16 text

Customer customers PK: ?

Slide 17

Slide 17 text

What is partitioning?

Slide 18

Slide 18 text

What is partitioning? logical partitions

Slide 19

Slide 19 text

What is partitioning? Andrew Theo Mark Tim Deborah Luis

Slide 20

Slide 20 text

What is partitioning? Max size: 20 GB Max size: 2 MB

Slide 21

Slide 21 text

What is partitioning?

Slide 22

Slide 22 text

What is partitioning?

Slide 23

Slide 23 text

What is partitioning?

Slide 24

Slide 24 text

What is partitioning? Andrew Theo Mark Tim Deborah Luis SELECT * FROM c WHERE c.username = 'Mark' our partition key

Slide 25

Slide 25 text

What is partitioning? Andrew Theo Mark Tim Deborah Luis SELECT * FROM c WHERE c.favoriteColor = 'orange' ?

Slide 26

Slide 26 text

Choosing a partition key for customers customers PK: ?

Slide 27

Slide 27 text

Choosing a partition key for customers customers PK: ?

Slide 28

Slide 28 text

Choosing a partition key for customers customers PK: id

Slide 29

Slide 29 text

Choosing a partition key for customers customers PK: id

Slide 30

Slide 30 text

Next: product categories

Slide 31

Slide 31 text

Product categories

Slide 32

Slide 32 text

Product categories productCategories PK: ?

Slide 33

Slide 33 text

Product categories productCategories PK: ? SELECT * FROM c

Slide 34

Slide 34 text

Product categories productCategories PK: type

Slide 35

Slide 35 text

Next: product tags

Slide 36

Slide 36 text

Product tags

Slide 37

Slide 37 text

Product tags productTags PK: ?

Slide 38

Slide 38 text

Product tags productTags PK: ?

Slide 39

Slide 39 text

Product tags productTags PK: type

Slide 40

Slide 40 text

Next: products

Slide 41

Slide 41 text

Products

Slide 42

Slide 42 text

Products

Slide 43

Slide 43 text

Products products PK: ?

Slide 44

Slide 44 text

Products products PK: ? CategoryA CategoryC CategoryB SELECT * FROM c WHERE c.categoryId = 'CategoryA'

Slide 45

Slide 45 text

Products products PK: categoryId category name? tag names?

Slide 46

Slide 46 text

Products: how to return category and tag names? products SELECT * FROM c WHERE c.categoryId = 'CategoryA' productCategories SELECT c.name FROM c WHERE c.id = 'CategoryA' productTags SELECT * FROM c WHERE c.id IN ('', '', '')

Slide 47

Slide 47 text

Introducing denormalization

Slide 48

Slide 48 text

Products: denormalizing category and tag names products PK: categoryId

Slide 49

Slide 49 text

Products: keeping everything in sync productCategories productTags products

Slide 50

Slide 50 text

Cosmos DB's change feed

Slide 51

Slide 51 text

Products: keeping everything in sync productCategories productTags products

Slide 52

Slide 52 text

Next: sales orders

Slide 53

Slide 53 text

Sales orders

Slide 54

Slide 54 text

Sales orders

Slide 55

Slide 55 text

Sales orders salesOrders PK: ?

Slide 56

Slide 56 text

Sales orders salesOrders PK: ?

Slide 57

Slide 57 text

Sales orders salesOrders PK: ? CustomerA CustomerC CustomerB SELECT * FROM c WHERE c.customerId = 'CustomerA'

Slide 58

Slide 58 text

Sales orders salesOrders PK: customerId

Slide 59

Slide 59 text

Sales orders salesOrders PK: customerId customers PK: id

Slide 60

Slide 60 text

Mixing entities in the same container?

Slide 61

Slide 61 text

Sales orders salesOrders PK: customerId customers PK: id

Slide 62

Slide 62 text

Sales orders: mixing with customers customers PK: id

Slide 63

Slide 63 text

Sales orders: mixing with customers customers PK: customerId

Slide 64

Slide 64 text

Sales orders: mixing with customers customers PK: customerId

Slide 65

Slide 65 text

Sales orders: mixing with customers CustomerA CustomerC CustomerB customer sales orders customers PK: customerId

Slide 66

Slide 66 text

Sales orders customers PK: customerId SELECT * FROM c WHERE c.customerId = 'CustomerA' AND c.type = 'salesOrder'

Slide 67

Slide 67 text

Sales orders customers PK: customerId

Slide 68

Slide 68 text

Denormalizing the count of sales orders per customer

Slide 69

Slide 69 text

Denormalizing the count of sales orders per customer

Slide 70

Slide 70 text

Denormalizing the count of sales orders per customer CustomerA CustomerC CustomerB customer sales orders customers PK: customerId

Slide 71

Slide 71 text

Denormalizing the count of sales orders per customer CustomerA CustomerC CustomerB update the customer add a sales order customers PK: customerId

Slide 72

Slide 72 text

Denormalizing the count of sales orders per customer CustomerA CustomerC CustomerB update the customer add a sales order

Slide 73

Slide 73 text

Sales orders customers PK: customerId SELECT * FROM c WHERE c.type = 'customer' ORDER BY c.salesOrderCount DESC

Slide 74

Slide 74 text

Our final design customers PK: customerId productCategories PK: type productTags PK: type products PK: categoryId

Slide 75

Slide 75 text

Our final design, optimized! customers PK: customerId productMeta PK: type products PK: categoryId

Slide 76

Slide 76 text

Key takeaways

Slide 77

Slide 77 text

Going further https://docs.microsoft.com/azure/cosmos-db/modeling-data https://docs.microsoft.com/azure/cosmos-db/how-to-model-partition-example https://devblogs.microsoft.com/cosmosdb/data-modeling-and-partitioning-for-relational-workloads/ https://github.com/AzureCosmosDB/labs/blob/master/readme.md https://github.com/AzureCosmosDB/labs/blob/master/decks/Data-Modeling.pptx