Cosmos DB at Fifth Elephant 2017

Presented at the Fifth Elephant conference 7/27/2017

Dharma Shukla

July 28, 2017

Transcript

  1. #Azure #CosmosDB: Lessons learnt from building a globally distributed database from the ground up

    Dharma Shukla, @dharmashukla, Distinguished Engineer, Microsoft
  2. Project Florence (2010) to Cosmos DB (2017): Blood, Sweat and Tears

    Requirements (circa 2010)
    • Turnkey global distribution
    • Low latency at the 99th percentile, worldwide
    • Guaranteed high availability
    • Programmable consistency
    • Elastically scale throughput and storage, globally, on demand
    • Operate at low cost
  3. Cosmos DB

  4. Global Distribution: Exploiting the cloud properties to the extreme

    IaaS-hosted managed database offerings cannot match this!
    • Millions of transactions/sec
    • Petabytes of data
    • Elastic and unlimited scalability
    • Cost efficiencies with fine-grained multi-tenancy
  5. Global distribution from the ground up

    • Cosmos DB as a foundational Azure service
      – Available in all Azure regions by default, including sovereign/government clouds
    • Automatic multi-region replication
      – Associate any number of regions with your database account
      – Policy-based geo-fencing
    • Multi-homing APIs
      – Apps don’t need to be redeployed during regional failover
    • Allows for dynamically setting priorities to regions
      – Simulate regional disaster via API
      – Test the end-to-end availability for the entire app (beyond just the database)
    • First to offer comprehensive SLA for latency, throughput, availability and consistency
  6. Guaranteed low latency @ P99

    • Globally distributed with reads and writes served from local region
    • Write optimized, latch-free database engine designed for SSDs and low latency access
    • Synchronous and automatic indexing at sustained ingestion rates
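
To make the P99 figure concrete, here is a minimal sketch of how a 99th-percentile latency can be computed from a batch of request latencies using the nearest-rank method. The function name `percentile_nearest_rank` and the sample latencies are illustrative assumptions; the slides do not say how Cosmos DB itself measures P99.

```python
import math

def percentile_nearest_rank(samples, p):
    """Return the p-th percentile (0 < p <= 100) using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least p% of the samples at or below it.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]

# Hypothetical latencies in milliseconds: 98 fast requests and 2 slow ones.
latencies_ms = [2.0] * 98 + [9.0, 40.0]
p50 = percentile_nearest_rank(latencies_ms, 50)
p99 = percentile_nearest_rank(latencies_ms, 99)
```

With this made-up sample the median hides the slow requests entirely, while the P99 value reflects them, which is why a latency guarantee stated at P99 is much stricter than one stated as an average.
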
  7. Elastically scalable storage

    • System designed to independently scale storage and throughput
    • Transparent server side partition management and routing
    • Automatically indexed SSD storage
    • Automatic global distribution of data across any number of Azure regions
    • Optionally evict old data using built-in support for TTL
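
The idea of evicting old data with a time-to-live (TTL) can be sketched with a toy in-memory store. This is only an illustration of the concept, not how Cosmos DB implements TTL; the class `TtlStore`, its methods and the injected clock are all hypothetical names chosen for this example.

```python
import time

class TtlStore:
    """Toy key-value store that evicts items once their time-to-live has passed."""

    def __init__(self, default_ttl_seconds, clock=time.monotonic):
        self._default_ttl = default_ttl_seconds
        self._clock = clock   # injectable so expiry can be exercised without sleeping
        self._items = {}      # key -> (value, expires_at)

    def put(self, key, value, ttl_seconds=None):
        # A per-item TTL overrides the store-wide default.
        ttl = self._default_ttl if ttl_seconds is None else ttl_seconds
        self._items[key] = (value, self._clock() + ttl)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]   # lazy eviction on read
            return None
        return value

# Usage with a fake clock so the example is deterministic.
now = [0.0]
store = TtlStore(default_ttl_seconds=60, clock=lambda: now[0])
store.put("session:1", {"user": "a"})
store.put("session:2", {"user": "b"}, ttl_seconds=5)
now[0] = 10.0
fresh = store.get("session:1")   # still within its 60 s TTL
stale = store.get("session:2")   # past its 5 s TTL, so it is evicted
```
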
  8. Typical activity of an application

  9. Elastically scaling throughput, anywhere, anytime

    • Elastically scaling throughput from 10 to 100s of millions of transactions/sec across multiple regions
    • Fully resource governed stack
    • Highly responsive partition management
    • Modular, resource governed nested consensus
    • Multiple granularities of throughput (e.g. sec, min, hour) at different price points
    [Figure labels: 9 PM PST, 11 PM PST; Less throughput, More throughput]
  10. Scaling throughput at different granularities

  11. US Open!

  12. Real world consistency is not a binary choice

  13. The wild west of consistency models…

  14. The state of commercial databases

    • Strong consistency, high latency
    • Eventual consistency, low latency
  15. Consistency models in Cosmos DB

    5 well-defined consistency levels with clear tradeoffs: Strong, Bounded staleness, Session, Consistent prefix, Eventual
    Most real-life applications do not fall into the two extremes of strong and eventual consistency
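
One way to picture the tradeoff between the five levels is a toy model in which every write bumps a version number and a replica has applied some prefix of those writes. The sketch below asks, for each level, whether a replica at a given version may serve a read. It is a deliberate simplification under stated assumptions, not Cosmos DB's protocol: the names `Level`, `min_replica_version` and `can_serve` are hypothetical, staleness is bounded by version count only, and because the toy replica always holds an in-order prefix, consistent prefix and eventual behave the same here.

```python
from enum import Enum

class Level(Enum):
    STRONG = "strong"
    BOUNDED_STALENESS = "bounded staleness"
    SESSION = "session"
    CONSISTENT_PREFIX = "consistent prefix"
    EVENTUAL = "eventual"

def min_replica_version(level, latest_version, session_version=0, max_lag=0):
    """Lowest replica version that may serve a read at the given level (toy model)."""
    if level is Level.STRONG:
        return latest_version                      # must have seen every write
    if level is Level.BOUNDED_STALENESS:
        return max(0, latest_version - max_lag)    # may trail by at most max_lag versions
    if level is Level.SESSION:
        return session_version                     # must include this client's own writes
    return 0                                       # any in-order prefix is acceptable

def can_serve(level, replica_version, latest_version, session_version=0, max_lag=0):
    required = min_replica_version(level, latest_version, session_version, max_lag)
    return replica_version >= required

# Hypothetical state: 10 writes exist, the replica has applied 7 of them,
# the client's last write was version 6, and the staleness bound is 2 versions.
results = {
    level.value: can_serve(level, replica_version=7, latest_version=10,
                           session_version=6, max_lag=2)
    for level in Level
}
```

In this made-up state the lagging replica cannot serve strong or bounded-staleness reads but can serve session, consistent-prefix and eventual reads, which illustrates why the weaker levels can answer locally with lower latency.
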
  16. Insights from production workloads

    [Chart: Consistency distribution among customers. Usage (%): 4, 18, 73, 2, 3 for Strong, Bounded Staleness, Session, Consistent Prefix, Eventual]
    [Chart: Consistency vs. Throughput. Throughput axis: 0 to 1.5]
  17. High availability SLA is not good enough

  18. Microsoft Azure

  19. Retailer - Black Friday/Cyber Monday (11/18-11/30) 2016

    [Chart: Throughput (transactions/sec), axis 0 to 12,000,000, from 10/27/2016 to 1/5/2017]
  20. Retailer - Black Friday/Cyber Monday (11/18-11/30) 2016

    [Chart: Total Requests v/s Availability. Availability axis: 99.96 to 100; TotalRequests axis: 0 to 3,000,000,000]
  21. Retailer - Black Friday/Cyber Monday (11/18-11/30) 2016

    [Chart: Latency (ms), axis 0 to 14, from 11/1/2016 to 12/31/2016. Series: P99 latency Read, P99 latency Write]
  22. At global scale CREATE INDEX, DROP INDEX, ALTER TABLE

  23. Schema agnostic indexing

    • Logical index layouts (inverted, tree, columnar, …)
    • Automatic and synchronous indexing of all ingested content
    • No schemas or secondary indices ever needed
    • Resource governed, write optimized database engine with latch free and log structured techniques
    [Figure: the document below drawn as a tree, with top-level nodes locations, headquarter and exports, array positions 0 and 1, country/city labels, and leaf values]
    { "locations": [ { "country": "Germany", "city": "Berlin" }, { "country": "France", "city": "Paris" } ], "headquarter": "Belgium", "exports": [ { "city": "Moscow" }, { "city": "Athens" } ] }
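
The tree view of a document suggests a simple way to index without a schema: treat every root-to-leaf path as an index term. The sketch below flattens the slide's example document into such paths and builds a small inverted index from path to document ids. It is a simplified illustration, not the engine's actual index format; the function names and the `"doc1"` id are assumptions made for this example.

```python
from collections import defaultdict

def iter_paths(node, prefix=()):
    """Yield one root-to-leaf path (as a tuple) per scalar value in a JSON-like document."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from iter_paths(child, prefix + (key,))
    elif isinstance(node, list):
        # Array positions become path segments, like the 0 and 1 nodes in the tree.
        for position, child in enumerate(node):
            yield from iter_paths(child, prefix + (position,))
    else:
        yield prefix + (node,)

def build_inverted_index(documents):
    """Map every path to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for path in iter_paths(doc):
            index[path].add(doc_id)
    return index

doc = {
    "locations": [
        {"country": "Germany", "city": "Berlin"},
        {"country": "France", "city": "Paris"},
    ],
    "headquarter": "Belgium",
    "exports": [{"city": "Moscow"}, {"city": "Athens"}],
}

index = build_inverted_index({"doc1": doc})
path_count = len(index)
hq_matches = index[("headquarter", "Belgium")]
```

Because the paths are derived from whatever structure each document happens to have, no schema or secondary index definition is needed up front, which is the point the slide makes.
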
  24. Meet developers where they are

  25. Native support for multiple data models

    • Database engine operates on atom-record-sequence (ARS) based type system
    • All data models are translated to ARS
    • API and wire protocols are supported via extensible modules
    • Instance of a given data model can be materialized as trees
    • Graph, documents, key-value, column-family, … more to come
    [Figure label: SQL]
  26. Azure Cosmos DB: Microsoft’s globally distributed, multi-model database service

    • Global distribution, Elastic scale out, Guaranteed low latency, Comprehensive SLAs, Five consistency models
    • SQL, Key-Value, Column-family, Graph, Documents
  27. Specify, Verify, Test

  28. Running the Service

    • Weekly deployments of the entire stack worldwide
    • Quality gates
    • Chaos, component and functional test coverage
    • Automated performance, RG and consistency runs every 4 hours
    • 16+ hours of stress run every day
    • Full stack upgrades with customer workloads
    • Chaos tests
    • Automated linearizability checker and Jepsen tests
    • Invariant checks
    • All invariant violations are traced
    • SEV2 alerts on any invariant violation, either pre or post production
    • Hot fix all invariant violations within 5 days
    • Transparently making all important metrics available to customers
    • SLA violations, workload metrics, PBS etc.
  29. Summary

    • Global distribution, horizontal partitioning and fine-grained multi-tenancy cannot be an afterthought while building a cloud database
    • Schema agnostic database engine design is crucial for a globally distributed database
    • Intermediate consistency models are extremely useful
    • A globally distributed database must provide comprehensive SLAs beyond just high availability
    • Throughput, latency at 99th percentile, consistency and high availability
  30. References

    • Getting started with Cosmos DB
    • cosmosdb.com
    • portal.azure.com
    • aka.ms/cosmosdb
    • Downloadable service emulator (aka.ms/CosmosDB-emulator)
    • Technical Overview -> https://azure.microsoft.com/en-us/blog/a-technical-overview-of-azure-cosmos-db/
    • Schema Agnostic Indexing, VLDB 2015 -> http://www.vldb.org/pvldb/vol8/p1668-shukla.pdf
    • Follow #CosmosDB on Twitter
    • @azurecosmosdb
    • @dharmashukla
    • @rimmanehme
  31. Azure Cosmos DB: We are just getting started…

    We are hiring: Bangalore, Redmond