Cosmos DB at Fifth Elephant 2017

Presented at the Fifth Elephant conference 7/27/2017

Dharma Shukla

July 28, 2017

Transcript

  1. Lessons learnt from building a globally distributed database from the ground up
    Dharma Shukla, @dharmashukla, Distinguished Engineer, Microsoft
    #Azure #CosmosDB
  2. 2010: Project Florence. 2017: Cosmos DB. Blood, Sweat and Tears
    Requirements (circa 2010)
    • Turnkey global distribution
    • Low latency at the 99th percentile, worldwide
    • Guaranteed high availability
    • Programmable consistency
    • Elastically scale throughput and storage, globally, on demand
    • Operate at low cost
  3. Global Distribution: exploiting the cloud properties to the extreme
    IaaS-hosted managed database offerings cannot match this!
    • Millions of transactions/sec
    • Petabytes of data
    • Elastic and unlimited scalability
    • Cost efficiencies with fine-grained multi-tenancy
  4. Global distribution from the ground up
    • Cosmos DB as a foundational Azure service
      – Available in all Azure regions by default, including sovereign/government clouds
    • Automatic multi-region replication
      – Associate any number of regions with your database account
      – Policy-based geo-fencing
    • Multi-homing APIs
      – Apps don’t need to be redeployed during regional failover
    • Allows for dynamically setting priorities to regions
      – Simulate regional disaster via API
      – Test the end-to-end availability for the entire app (beyond just the database)
    • First to offer comprehensive SLA for latency, throughput, availability and consistency
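The region-priority bullet above can be illustrated with a minimal sketch. It is not the Cosmos DB API: the `Region` class, the `pick_region` helper and the "lower number means preferred" convention are assumptions made for illustration only. It shows how an application that reads a priority list can keep working after a simulated regional outage without being redeployed.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical description of one region associated with an account."""
    name: str
    priority: int          # assumption: lower number = more preferred
    available: bool = True


def pick_region(regions):
    """Return the name of the most preferred available region, or None."""
    candidates = [r for r in regions if r.available]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.priority).name


regions = [
    Region("West US", 0),
    Region("North Europe", 1),
    Region("Southeast Asia", 2),
]

primary = pick_region(regions)      # preferred region while everything is healthy
regions[0].available = False        # simulate a regional disaster
fallback = pick_region(regions)     # next region in priority order
```

Changing a `priority` value at runtime changes the outcome of `pick_region` on the next call, which is the kind of dynamic reprioritization the slide describes.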
  5. Guaranteed low latency @ P99
    • Globally distributed with reads and writes served from local region
    • Write optimized, latch-free database engine designed for SSDs and low latency access
    • Synchronous and automatic indexing at sustained ingestion rates
  6. Elastically scalable storage
    • System designed to independently scale storage and throughput
    • Transparent server side partition management and routing
    • Automatically indexed SSD storage
    • Automatic global distribution of data across any number of Azure regions
    • Optionally evict old data using built-in support for TTL
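The TTL bullet above can be sketched as a simple expiry rule. This is an illustration of time-to-live eviction in general, not the service's implementation: the field names `last_modified` and `ttl`, the helper names, and the choice to treat an item as expired once exactly `ttl` seconds have elapsed are all assumptions.

```python
def is_expired(last_modified, ttl_seconds, now):
    """True when an item with a TTL has outlived it (timestamps in seconds)."""
    if ttl_seconds is None:
        return False                      # no TTL set: the item never expires
    return now - last_modified >= ttl_seconds


def evict(items, now):
    """Return only the items that are still live at time `now`."""
    return [
        item for item in items
        if not is_expired(item["last_modified"], item.get("ttl"), now)
    ]


items = [
    {"id": "a", "last_modified": 1000, "ttl": 60},   # expires at t = 1060
    {"id": "b", "last_modified": 1000},              # no TTL
    {"id": "c", "last_modified": 1050, "ttl": 60},   # expires at t = 1110
]

live = [item["id"] for item in evict(items, now=1070)]
```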
  7. Elastically scaling throughput, anywhere, anytime
    • Elastically scaling throughput from 10 to 100s of millions of transactions/sec across multiple regions
    • Fully resource governed stack
    • Highly responsive partition management
    • Modular, resource governed nested consensus
    • Multiple granularities of throughput (e.g. sec, min, hour) at different price points
    [Figure: regions marked "Less throughput" and "More throughput" at 9 PM PST and 11 PM PST]
  8. Consistency models in Cosmos DB
    5 well-defined consistency levels with clear tradeoffs
    • Strong
    • Bounded staleness
    • Session
    • Consistent prefix
    • Eventual
    Most real-life applications do not fall into the two extremes (Strong and Eventual)
  9. Insights from production workloads
    [Chart: Consistency distribution among customers, usage (%): Strong 4, Bounded Staleness 18, Session 73, Consistent Prefix 2, Eventual 3]
    [Chart: Consistency vs. Throughput, throughput axis from 0 to 1.5, by consistency level]
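As a quick worked check of the usage figures on this slide, the sketch below adds them up. It assumes the five percentages (4, 18, 73, 2, 3) map to the five consistency levels in the order both are listed on the slide; the variable names are illustrative.

```python
# Usage (%) per consistency level, assuming the values and the level names
# appear on the slide in the same order.
usage = {
    "Strong": 4,
    "Bounded Staleness": 18,
    "Session": 73,
    "Consistent Prefix": 2,
    "Eventual": 3,
}

total = sum(usage.values())

# Share of customers on one of the three intermediate levels, i.e. neither
# of the two extremes (Strong, Eventual).
intermediate = total - usage["Strong"] - usage["Eventual"]
```

Under that assumption the shares sum to 100% and 93% of customers sit on an intermediate level, consistent with the observation that most real-life applications do not fall into the two extremes.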
  10. Retailer - Black Friday/Cyber Monday (11/18-11/30) 2016
    [Chart: Throughput (transactions/sec), y-axis 0 to 12,000,000, dates 10/27/2016 to 1/5/2017]
  11. Retailer - Black Friday/Cyber Monday (11/18-11/30) 2016
    [Chart: Total Requests vs. Availability; TotalRequests axis 0 to 3,000,000,000, Availability axis 99.96 to 100]
  12. Retailer - Black Friday/Cyber Monday (11/18-11/30) 2016
    [Chart: Latency (ms), y-axis 0 to 14, dates 11/1/2016 to 12/31/2016; series: P99 latency Read, P99 latency Write]
  13. Schema agnostic indexing
    • Logical index layouts (inverted, tree, columnar, …)
    • Automatic and synchronous indexing of all ingested content
    • No schemas or secondary indices ever needed
    • Resource governed, write optimized database engine with latch free and log structured techniques
    Example document:
    { "locations": [ { "country": "Germany", "city": "Berlin" }, { "country": "France", "city": "Paris" } ], "headquarter": "Belgium", "exports": [ { "city": "Moscow" }, { "city": "Athens" } ] }
    [Figure: the document drawn as a tree. The root has children locations, headquarter and exports; array elements are numbered 0 and 1; leaves are country Germany, city Berlin, country France, city Paris, Belgium, city Moscow, city Athens]
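The tree view of the example document on this slide can be sketched in a few lines. This is a minimal illustration of the general idea (treat a JSON document as a tree and index every root-to-leaf path without a schema), not the engine's actual index format; the `paths` and `build_index` helpers and the `doc1` identifier are hypothetical.

```python
import json
from collections import defaultdict

# The example document from the slide.
DOC = json.loads("""
{ "locations": [ { "country": "Germany", "city": "Berlin" },
                 { "country": "France", "city": "Paris" } ],
  "headquarter": "Belgium",
  "exports": [ { "city": "Moscow" }, { "city": "Athens" } ] }
""")


def paths(node, prefix=()):
    """Yield (path, leaf_value) for every leaf of a JSON tree."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from paths(child, prefix + (key,))
    elif isinstance(node, list):
        for position, child in enumerate(node):
            yield from paths(child, prefix + (position,))
    else:
        yield prefix, node


def build_index(docs):
    """Inverted index: (path, value) term -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for path, value in paths(doc):
            index[(path, value)].add(doc_id)
    return index


index = build_index({"doc1": DOC})

# Look up documents whose first location has country "Germany".
hits = index[(("locations", 0, "country"), "Germany")]
```

No schema is declared anywhere: every path that occurs in a document becomes an index term, which is why new fields become queryable as soon as they are ingested.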
  14. Native support for multiple data models
    • Database engine operates on atom-record-sequence (ARS) based type system
    • All data models are translated to ARS
    • API and wire protocols are supported via extensible modules
    • Instance of a given data model can be materialized as trees
    • Graph, documents, key-value, column-family, … more to come
    [Figure: diagram labeled "SQL"]
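To make the atom-record-sequence idea concrete, here is a small sketch that translates ordinary Python values into tagged ARS-style trees. The slides do not describe the engine's internal representation, so the tagging scheme and the `ars_kind` and `to_ars` helpers are assumptions for illustration: dictionaries become records, lists become sequences, and everything else is an atom.

```python
def ars_kind(value):
    """Classify a value as an atom, a record or a sequence."""
    if isinstance(value, dict):
        return "record"
    if isinstance(value, (list, tuple)):
        return "sequence"
    return "atom"


def to_ars(value):
    """Translate a nested Python value into a tagged ARS-style tree."""
    kind = ars_kind(value)
    if kind == "record":
        return ("record", {key: to_ars(child) for key, child in value.items()})
    if kind == "sequence":
        return ("sequence", [to_ars(child) for child in value])
    return ("atom", value)


# A key-value pair and a document both reduce to the same three building blocks.
kv_pair = to_ars({"key": "user:42", "value": "Ada"})
document = to_ars({"headquarter": "Belgium", "exports": [{"city": "Moscow"}]})
```

Because both inputs end up as trees built from the same three node kinds, a single engine can store and index them uniformly, which is the point the slide makes about translating all data models to ARS.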
  15. Azure Cosmos DB: Microsoft’s globally distributed, multi-model database service
    • Global distribution
    • Elastic scale out
    • Guaranteed low latency
    • Comprehensive SLAs
    • Five consistency models
    SQL, Key-Value, Column-family, Graph, Documents
  16. Running the Service
    • Weekly deployments of the entire stack worldwide
    • Quality gates
      – Chaos, component and functional test coverage
      – Automated performance, RG and consistency runs every 4 hours
      – 16+ hours of stress run every day
      – Full stack upgrades with customer workloads
    • Chaos tests
      – Automated linearizability checker and Jepsen tests
    • Invariant checks
      – All invariant violations are traced
      – SEV2 alerts on any invariant violation, either pre- or post-production
      – Hot fix all invariant violations within 5 days
    • Transparently making all important metrics available to customers
      – SLA violations, workload metrics, PBS etc.
  17. Summary
    • Global distribution, horizontal partitioning and fine-grained multi-tenancy cannot be an afterthought while building a cloud database
    • Schema agnostic database engine design is crucial for a globally distributed database
    • Intermediate consistency models are extremely useful
    • A globally distributed database must provide comprehensive SLAs beyond just high availability
      – Throughput, latency at 99th percentile, consistency and high availability
  18. References
    • Getting started with Cosmos DB
      – cosmosdb.com
      – portal.azure.com
      – aka.ms/cosmosdb
    • Downloadable service emulator (aka.ms/CosmosDB-emulator)
    • Technical Overview -> https://azure.microsoft.com/en-us/blog/a-technical-overview-of-azure-cosmos-db/
    • Schema Agnostic Indexing, VLDB 2015 -> http://www.vldb.org/pvldb/vol8/p1668-shukla.pdf
    • Follow #CosmosDB on Twitter
      – @azurecosmosdb
      – @dharmashukla
      – @rimmanehme