Slide 1

Slide 1 text

Azure #CosmosDB
Lessons learnt from building a globally distributed database from the ground up
Dharma Shukla (@dharmashukla), Distinguished Engineer, Microsoft

Slide 2

Slide 2 text

Cosmos DB

Slide 3

Slide 3 text

2010: Project Florence
2017: Cosmos DB
Blood, sweat and tears
Requirements (2010)
• Turnkey global distribution
• Low latency at the 99th percentile, worldwide
• Guaranteed high availability
• Programmable consistency
• Elastically scale throughput and storage, globally, on demand
• Operate at the lowest possible cost

Slide 4

Slide 4 text

Database hosted in the cloud != Database born in the cloud

Slide 5

Slide 5 text

Global Distribution
Exploiting the cloud properties to the extreme
IaaS hosted managed database offerings cannot match this!
• Millions of transactions/sec
• Petabytes of data
• Elastic and unlimited scalability
• Cost efficiencies with fine grained multi-tenancy

Slide 6

Slide 6 text

Typical activity of an application

Slide 7

Slide 7 text

Elastically scaling throughput, anywhere, anytime
• Elastically scaling throughput from 10 to 100s of millions of transactions/sec across multiple regions
• Fully resource governed stack
• Highly responsive partition management
• Modular, resource governed nested consensus
• Multiple granularities of throughput (e.g. sec, min, hour) at different price points
[Figure: regions needing less or more throughput at 9 PM PST and at 11 PM PST]
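To make "resource governed" throughput that can be scaled on demand more concrete, here is a minimal sketch of a per-second request budget. It is a toy token-bucket model, not Cosmos DB's actual mechanism; the class `ThroughputBudget`, its fields, and the notion of a per-request "cost" are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ThroughputBudget:
    """Toy per-second request budget (hypothetical; not Cosmos DB's real design)."""

    provisioned_per_sec: float  # units per second the tenant has provisioned
    available: float = 0.0      # units left in the budget right now
    last_refill: float = 0.0    # time of the last refill, in seconds

    def __post_init__(self) -> None:
        # Start with a full one-second budget.
        self.available = self.provisioned_per_sec

    def scale(self, new_rate: float) -> None:
        """Change provisioned throughput; it applies from the next refill on."""
        self.provisioned_per_sec = new_rate

    def try_consume(self, cost: float, now: float) -> bool:
        """Admit a request of the given cost at time `now`, or reject it."""
        elapsed = max(0.0, now - self.last_refill)
        # Refill in proportion to elapsed time, capped at one second's worth.
        self.available = min(
            self.provisioned_per_sec,
            self.available + elapsed * self.provisioned_per_sec,
        )
        self.last_refill = now
        if cost <= self.available:
            self.available -= cost
            return True
        return False  # over budget: the caller should back off and retry
```

Passing `now` explicitly keeps the sketch deterministic; a real governor would read a clock and would also have to account for CPU, memory and IO, which this model ignores.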

Slide 8

Slide 8 text

At global scale CREATE INDEX, DROP INDEX, ALTER TABLE

Slide 9

Slide 9 text

Schema agnostic database engine
• Logical index layouts (inverted, tree, columnar, …)
• Automatic and synchronous indexing of all ingested content
• No schemas or secondary indices ever needed
• Resource governed, write optimized database engine with latch free and log structured techniques
Example document:
{
  "locations": [
    { "country": "Germany", "city": "Berlin" },
    { "country": "France", "city": "Paris" }
  ],
  "headquarter": "Belgium",
  "exports": [ { "city": "Moscow" }, { "city": "Athens" } ]
}
[Figure: the same document drawn as a tree. The root branches into locations, headquarter and exports; array indices (0, 1) and property names (country, city) are intermediate nodes; the values (Germany, Berlin, France, Paris, Belgium, Moscow, Athens) are leaves.]
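The tree view of a document suggests how content can be indexed without a schema: every root-to-leaf path becomes an index term. The sketch below illustrates that idea with a simple inverted index from paths to document ids. The functions `paths` and `build_index` and the id `"doc1"` are hypothetical names for illustration; this is not the engine's actual index layout.

```python
from collections import defaultdict


def paths(node, prefix=()):
    """Yield every root-to-leaf path of a JSON-like value as a tuple."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from paths(child, prefix + (key,))
    elif isinstance(node, list):
        for i, child in enumerate(node):
            yield from paths(child, prefix + (i,))
    else:
        # Leaf: the scalar value is the last path segment.
        yield prefix + (node,)


def build_index(docs):
    """Map each path to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for path in paths(doc):
            index[path].add(doc_id)
    return index


doc = {
    "locations": [
        {"country": "Germany", "city": "Berlin"},
        {"country": "France", "city": "Paris"},
    ],
    "headquarter": "Belgium",
    "exports": [{"city": "Moscow"}, {"city": "Athens"}],
}

index = build_index({"doc1": doc})
# A lookup such as index[("headquarter", "Belgium")] returns the matching ids.
```

Because terms are derived from the document itself, a new property or a new nesting level simply produces new paths, with no CREATE INDEX or ALTER TABLE step.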

Slide 10

Slide 10 text

Real world consistency is not a binary choice

Slide 11

Slide 11 text

The wild west of consistency models…

Slide 12

Slide 12 text

The state of commercial databases
Strong consistency, high latency
Eventual consistency, low latency

Slide 13

Slide 13 text

Consistency models in Cosmos DB
5 well-defined consistency levels with clear tradeoffs
• Strong
• Bounded staleness
• Session
• Consistent prefix
• Eventual
Most real-life applications do not fall into the two extremes (strong and eventual)
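One way to see the tradeoff between the five levels is to ask which replicas may answer a read. The sketch below is a toy model under loud assumptions: each replica holds an in-order prefix of the write log identified by a log sequence number (LSN), and a session remembers the LSN of its last write. The function `replica_can_serve` and its parameters are hypothetical and are not part of any Cosmos DB API.

```python
from enum import Enum


class Consistency(Enum):
    STRONG = "strong"
    BOUNDED_STALENESS = "bounded staleness"
    SESSION = "session"
    CONSISTENT_PREFIX = "consistent prefix"
    EVENTUAL = "eventual"


def replica_can_serve(level, replica_lsn, latest_lsn, session_lsn=0, max_lag=0):
    """Return True if a replica at `replica_lsn` may answer a read (toy model)."""
    if level is Consistency.STRONG:
        # Only a fully caught-up replica may answer.
        return replica_lsn == latest_lsn
    if level is Consistency.BOUNDED_STALENESS:
        # The replica may lag by at most `max_lag` writes.
        return latest_lsn - replica_lsn <= max_lag
    if level is Consistency.SESSION:
        # The replica must have seen this session's own writes.
        return replica_lsn >= session_lsn
    # Consistent prefix and eventual: any replica may answer. This model
    # cannot tell them apart because replicas always hold an ordered prefix.
    return True
```

The weaker the level, the more replicas qualify, which is why the intermediate levels can offer lower latency than strong consistency while still giving guarantees that eventual consistency does not.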

Slide 14

Slide 14 text

Insights from production workloads
Consistency distribution among customers, usage (%):
Strong: 4
Bounded Staleness: 18
Session: 73
Consistent Prefix: 2
Eventual: 3
[Figure: Consistency vs. Throughput, with a throughput axis from 0 to 1.2]

Slide 15

Slide 15 text

High availability SLA is not good enough

Slide 16

Slide 16 text

Microsoft Azure

Slide 17

Slide 17 text

Lessons learned
• Global distribution, horizontal partitioning and fine-grained multi-tenancy cannot be an afterthought while building a cloud database
• Schema agnostic database engine design is crucial for a globally distributed database
• Intermediate consistency models are extremely useful
• A globally distributed database must provide comprehensive SLAs beyond just high availability
  • Throughput, latency at 99th percentile, consistency and high availability

Slide 18

Slide 18 text

#Azure #CosmosDB We are Hiring