Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
[第2回 Azure Cosmos DB 勉強会] Data modelling and partitioning in Azure Cosmos DB (Azure Cosmos DB でのデータモデリングとパーティション分割)
Search
SATO Naoki (Neo)
September 13, 2020
Technology
0
780
[第2回 Azure Cosmos DB 勉強会] Data modelling and partitioning in Azure Cosmos DB (Azure Cosmos DB でのデータモデリングとパーティション分割)
https://satonaoki.wordpress.com/2020/09/13/jcdug-cosmos-db-data-modeling/
SATO Naoki (Neo)
September 13, 2020
Tweet
Share
More Decks by SATO Naoki (Neo)
See All by SATO Naoki (Neo)
LLMOps with Azure Machine Learning prompt flow
satonaoki
1
190
マルチクラウド時代の企業における生成AIとデータベースの関係 (Oracle Technology Day)
satonaoki
0
420
Microsoft Copilot, your everyday AI companion (Machine Learning 15minutes! Broadcast #82)
satonaoki
0
810
Microsoft Build 2023 Updates – Copilot Stack and Azure OpenAI Service (Machine Learning 15minutes! Broadcast #78)
satonaoki
2
780
Microsoft + OpenAI: Recent Updates (Machine Learning 15minutes! Broadcast #74)
satonaoki
1
910
30分でわかるマイクロサービスアーキテクチャ 第2版
satonaoki
9
5.4k
[Machine Learning 15minutes! Broadcast #67] Azure AI - Build 2022 Updates and more...
satonaoki
0
290
[Machine Learning 15minutes! #61] Azure OpenAI Service
satonaoki
0
410
[第50回 Machine Learning 15minutes! Broadcast] Azure Machine Learning - Ignite 2020 Updates
satonaoki
0
700
Other Decks in Technology
See All in Technology
GitHub Actions Runner Controller
takesection
0
110
テストだけで品質は上がらない?! エセ自己組織化した品質組織からの脱却 / JaSST'24 Tokyo
visional_engineering_and_design
9
3k
Webエンジニアのためのデータエンジニアリング概説
mtoriyama000
5
400
PHPerKaigi 2024 - PHP 本体のバグを見つけたら適切に報告しよう
zeriyoshi
0
770
事業部を超えた 開発生産性向上に挑戦する
kentakozuka
2
220
The Disturbing Truth: Why Do Most Software Projects Suck?
lemiorhan
0
110
Command-line interface tool design / PHPerKaigi 2024
k1low
4
1k
あなたの知らないバグバウンティの世界
eurekaberry
1
1.4k
プロダクト開発ゼロイチの分類とロジックス事業がイチに至るまで
niwatakeru
0
100
【Cyber-sec+】ログの森で出会ったCloudTrail との奇妙な旅
hssh2_bin
1
230
How to Build a Strong Engineering Culture
alperhankendi
0
120
二刀流でWinActorを活用してみた話
tamai_63
0
120
Featured
See All Featured
GraphQLとの向き合い方2022年版
quramy
28
12k
The Illustrated Children's Guide to Kubernetes
chrisshort
28
46k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
28
5.9k
Fantastic passwords and where to find them - at NoRuKo
philnash
35
2.4k
Bootstrapping a Software Product
garrettdimon
PRO
302
110k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
56
13k
Producing Creativity
orderedlist
PRO
335
39k
CSS Pre-Processors: Stylus, Less & Sass
bermonpainter
352
28k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
272
12k
What's new in Ruby 2.0
geeforr
335
31k
Making Projects Easy
brettharned
106
5.4k
Into the Great Unknown - MozCon
thekraken
10
810
Transcript
Data modelling and partitioning in Azure Cosmos DB (Azure Cosmos
DB でのデータ モデリングとパーティション分割)
Session's objectives
What is Azure Cosmos DB? Non-relational and horizontally scalable
What is Azure Cosmos DB? horizontally scalable
What is Azure Cosmos DB? non-relational
What is Azure Cosmos DB? non-relational and horizontally scalable
So is Azure Cosmos DB suitable for relational workloads?
Let's look at a concrete example
Identifying the operations we have to serve
Now let's implement this model on Azure Cosmos DB!
Starting with the Customer entity
Starting with the Customer entity
To embed or to reference?
To embed or to reference? - - - - -
-
Our first entity: Customer
Customer customers PK: ?
What is partitioning?
What is partitioning? logical partitions
What is partitioning? Andrew Theo Mark Tim Deborah Luis
What is partitioning? Max size: 20 GB Max size: 2
MB
What is partitioning?
What is partitioning?
What is partitioning?
What is partitioning? Andrew Theo Mark Tim Deborah Luis SELECT
* FROM c WHERE c.username = 'Mark' our partition key
What is partitioning? Andrew Theo Mark Tim Deborah Luis SELECT
* FROM c WHERE c.favoriteColor = 'orange' ?
Choosing a partition key for customers customers PK: ?
Choosing a partition key for customers customers PK: ?
Choosing a partition key for customers customers PK: id
Choosing a partition key for customers customers PK: id
Next: product categories
Product categories
Product categories productCategories PK: ?
Product categories productCategories PK: ? SELECT * FROM c
Product categories productCategories PK: type
Next: product tags
Product tags
Product tags productTags PK: ?
Product tags productTags PK: ?
Product tags productTags PK: type
Next: products
Products
Products
Products products PK: ?
Products products PK: ? CategoryA CategoryC CategoryB SELECT * FROM
c WHERE c.categoryId = 'CategoryA'
Products products PK: categoryId category name? tag names?
Products: how to return category and tag names? products SELECT
* FROM c WHERE c.categoryId = 'CategoryA' productCategories SELECT c.name FROM c WHERE c.id = 'CategoryA' productTags SELECT * FROM c WHERE c.id IN ('<tagId1>', '<tagId2>', '<tagId3>')
Introducing denormalization
Products: denormalizing category and tag names products PK: categoryId
Products: keeping everything in sync productCategories productTags products
Cosmos DB's change feed
Products: keeping everything in sync productCategories productTags products
Next: sales orders
Sales orders
Sales orders
Sales orders salesOrders PK: ?
Sales orders salesOrders PK: ?
Sales orders salesOrders PK: ? CustomerA CustomerC CustomerB SELECT *
FROM c WHERE c.customerId = 'CustomerA'
Sales orders salesOrders PK: customerId
Sales orders salesOrders PK: customerId customers PK: id
Mixing entities in the same container?
Sales orders salesOrders PK: customerId customers PK: id
Sales orders: mixing with customers customers PK: id
Sales orders: mixing with customers customers PK: customerId
Sales orders: mixing with customers customers PK: customerId
Sales orders: mixing with customers CustomerA CustomerC CustomerB customer sales
orders customers PK: customerId
Sales orders customers PK: customerId SELECT * FROM c WHERE
c.customerId = 'CustomerA' AND c.type = 'salesOrder'
Sales orders customers PK: customerId
Denormalizing the count of sales orders per customer
Denormalizing the count of sales orders per customer
Denormalizing the count of sales orders per customer CustomerA CustomerC
CustomerB customer sales orders customers PK: customerId
Denormalizing the count of sales orders per customer CustomerA CustomerC
CustomerB update the customer add a sales order customers PK: customerId
Denormalizing the count of sales orders per customer CustomerA CustomerC
CustomerB update the customer add a sales order
Sales orders customers PK: customerId SELECT * FROM c WHERE
c.type = 'customer' ORDER BY c.salesOrderCount DESC
Our final design customers PK: customerId productCategories PK: type productTags
PK: type products PK: categoryId
Our final design, optimized! customers PK: customerId productMeta PK: type
products PK: categoryId
Key takeaways
Going further https://docs.microsoft.com/azure/cosmos-db/modeling-data https://docs.microsoft.com/azure/cosmos-db/how-to-model-partition-example https://devblogs.microsoft.com/cosmosdb/data-modeling-and-partitioning-for-relational-workloads/ https://github.com/AzureCosmosDB/labs/blob/master/readme.md https://github.com/AzureCosmosDB/labs/blob/master/decks/Data-Modeling.pptx