Azure Cosmos DB

This is a presentation I borrowed from Rimma Nehme and customized for a Cosmos DB webinar. Enjoy.

Daron Yondem

June 15, 2017

Transcript

  1. Azure Cosmos DB
    Microsoft’s globally-distributed database service
    @daronyondem

  2. This is What a Typical Activity of an App Looks Like

  3. Lowest Cost
    Cosmos DB: Deeply Exploits Cloud Core Properties and Economies of Scale
    IaaS-hosted managed offerings cannot beat this
    • Millions of transactions/sec
    • Petabytes of data
    • Scale-out architecture
    • Global distribution from the ground up
    • Fully managed and secure

  4. Azure Cosmos DB: Value to Customer
    • Become more productive
    • Save money
    • Become more flexible
    • Become more responsive
    • Become more innovative
    [Diagram labels: Global Business, Store, Supplier, Partner]

  5. Turnkey global distribution
    1

  6. Globally distribute data around the world
    Turn-key global distribution
    Automatically replicate all your data around the world, across more regions than Amazon and Google combined

    View Slide

  7. Global Distribution From The Ground-Up
    • Cosmos DB is a Foundational (Ring 0) Azure service
    – Available in all Azure regions by default, including sovereign/government clouds
    • Transparent and automatic multi-region replication
    – Associate any number of regions with your database account, at any time
    – Policy based geo-fencing
    • Multi-homing APIs
    – All endpoints are logical, by default
    – Apps don’t need to be redeployed during regional failover
    – Apps can also access physical endpoints if needed
    • Support for both manual and automatic failover
    • Designed for high availability
    – Allows for dynamically setting priorities to regions
    – Simulate regional disasters via API
    – Test the end-to-end availability for the entire app (beyond just the database)
    • Comprehensive SLAs
    – First and only to offer comprehensive SLA for latency, throughput, availability and consistency
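
The failover behaviour listed above (logical endpoints, dynamically set region priorities, simulated regional disasters) can be sketched in a few lines. This is a plain-Python illustration, not the Cosmos DB SDK: the `Account` class, its method names and the region names are all hypothetical.

```python
class Account:
    """Hypothetical model of a database account with prioritized regions."""

    def __init__(self, regions):
        # Regions in failover-priority order (index 0 = highest priority).
        self.regions = list(regions)
        # Regions currently marked unavailable, e.g. by a simulated disaster.
        self.down = set()

    def set_priorities(self, regions):
        """Reorder the failover priorities at runtime."""
        if set(regions) != set(self.regions):
            raise ValueError("priorities must list exactly the associated regions")
        self.regions = list(regions)

    def simulate_outage(self, region):
        """Mark a region as failed, as a disaster drill would."""
        self.down.add(region)

    def restore(self, region):
        self.down.discard(region)

    def resolve(self):
        """Map the logical endpoint to the highest-priority healthy region."""
        for region in self.regions:
            if region not in self.down:
                return region
        raise RuntimeError("no region available")


account = Account(["West US", "North Europe", "Southeast Asia"])
print(account.resolve())            # highest-priority region
account.simulate_outage("West US")
print(account.resolve())            # next region in the priority list
```

Because the application only ever calls `resolve()` on the logical endpoint, nothing in the application has to be redeployed when the region behind it changes.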

  8. Guaranteed Single Digit ms Latency at the 99th Percentile
    2

  9. Guaranteed Low Latency
    • Reads and writes served from local region
    • Guaranteed millisecond latency worldwide
    • Write optimized, latch-free database engine
    • Automatically indexed SSD storage
    • Synchronous and automatic indexing at sustained ingestion rates
    • No schema or index management needed
    • No schema versioning needed
    • No schema migration needed
    • All of this is highly relevant for rapidly evolving apps in a globally distributed setup

    Percentile   Reads (1 KB)   Indexed writes (1 KB)
    50%          < 2 ms         < 6 ms
    99%          < 10 ms        < 15 ms
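
One way to check measured latencies against the percentile figures on this slide is a nearest-rank percentile. The sketch below is illustrative only: the helper names and the sample data are made up, and only the millisecond limits come from the slide.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample covering p% of the data."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Limits in milliseconds, taken from the slide (1 KB reads and indexed writes).
READ_LIMITS_MS = {50: 2.0, 99: 10.0}
WRITE_LIMITS_MS = {50: 6.0, 99: 15.0}


def meets_limits(samples_ms, limits_ms):
    """True if every listed percentile is strictly below its limit."""
    return all(percentile(samples_ms, p) < limit for p, limit in limits_ms.items())


# Hypothetical measurements: one slow outlier does not break the 99th percentile.
read_samples = [1.0] * 98 + [9.0, 40.0]
print(meets_limits(read_samples, READ_LIMITS_MS))
```

A percentile-based guarantee tolerates a small fraction of slow requests, which is why the single 40 ms outlier above does not violate the 99th-percentile limit.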

  10. Customer Application Example

  11. Customer Application Example

  12. Elastic Scaleout
    STORAGE
    THROUGHPUT
    3

  13. Cosmos DB: Elastically Scalable Storage
    [Figure: axes "Data volume" and "Data variety/complexity" (log10 scale, 1 to 15, starting at KB), showing transaction data, web/content data, and social/machine-generated data]
    • Single machine is never a bottleneck
    • A single table can scale from gigabytes to petabytes, across many machines and regions
    • Transparent server-side partition management and routing
    • Optionally evict old data using built-in support for TTL
    • Policy-based, automatic tiering to any HDFS-compatible data lake (e.g. ADLS or Azure Storage)
    • Customers pay only for the throughput and storage they need
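
Server-side partition routing and TTL-based eviction can be illustrated with a small sketch. The slide does not say how routing is implemented, so the hash-based scheme, the function names and the partition count below are assumptions made for illustration.

```python
import hashlib


def partition_for(partition_key: str, partition_count: int) -> int:
    """Route a key to a partition by hashing it (assumed scheme, for illustration)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def is_expired(last_modified: float, ttl_seconds: float, now: float) -> bool:
    """An item is evictable once its TTL has elapsed since the last write."""
    return now - last_modified >= ttl_seconds


# The same key always lands on the same partition, so clients never need to
# know which machine holds the data.
print(partition_for("user-42", 8) == partition_for("user-42", 8))
print(is_expired(last_modified=100.0, ttl_seconds=60, now=160.0))
```

Hashing spreads keys across partitions so that no single machine becomes the bottleneck, at the cost of making range scans over the key impractical.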

  14. Cosmos DB: Elastically Scalable Throughput
    • Elastically scale throughput from 10 to 100s of millions of requests/sec across multiple regions
    • Support for requests/sec and requests/min for different workloads
    – This ensures that you never have to provision for the peak
    • Customers pay only for the throughput and storage they need
    • Customers pay by the hour for the provisioned throughput

  15. Cosmos DB Total Cost of Ownership (TCO)
    4

  16. Cosmos DB – Lowest TCO
    Deeply exploits cloud core properties and economies of scale
    Cosmos DB: 5-10X more cost-effective
    Customers save 60-73% in provisioning cost!
    • Commodity hardware
    • Fine-grained multi-tenancy
    • End to end resource governance
    • Optimal utilization of resources
    [Chart: RU/m - Predictable Performance For Unpredictable Needs. Series: RU Consumed, RU/sec, RU/min. X-axis: seconds 1 to 88. Y-axis: 0 to 120,000. Data labels: 46,920; 10,000; 100,000; 98,990; 92,323; 55,403; 100,000]
    Second 29: 36,920 RUs consumed above provisioned RU/sec (10k). Remaining RU/min budget: 55,403
    Second 61: RU/min budget reset to 100,000

  17. RU/m - Predictable Performance For Unpredictable Needs
    [Chart: DocumentDB RU Consumption and Provisioning. Series: RU Consumed, RU/sec, RU/min. X-axis: seconds 1 to 89. Y-axis: 0 to 120,000. Data labels: 46,920; 10,000; 100,000; 98,990; 92,323; 55,403; 100,000]
    Second 29: 36,920 RUs consumed above provisioned RU/sec (10k). Remaining RU/min budget: 55,403
    Second 61: RU/min budget reset to 100,000
    Customers save 60-73% in provisioning cost!
    Guaranteed low latency for spiky workloads
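
The numbers on this chart follow from simple bookkeeping: consumption above the provisioned RU/sec is drawn from a per-minute budget that resets every 60 seconds. The sketch below reproduces the second-29 figures from the slide. The function names and the error raised when the budget runs out are assumptions, as is the 5,000 RU background load in the usage example.

```python
def charge_second(consumed, provisioned_per_sec, minute_budget_left):
    """Deduct one second's overage from the remaining RU/min budget."""
    overage = max(0, consumed - provisioned_per_sec)
    if overage > minute_budget_left:
        # Assumption: a real service would throttle the excess requests instead.
        raise RuntimeError("RU/min budget exhausted")
    return minute_budget_left - overage


def simulate(consumed_by_second, provisioned_per_sec=10_000, minute_budget=100_000):
    """Return the remaining RU/min budget after each second."""
    remaining = minute_budget
    history = []
    for index, consumed in enumerate(consumed_by_second):
        if index % 60 == 0:
            remaining = minute_budget  # a new minute starts: budget resets
        remaining = charge_second(consumed, provisioned_per_sec, remaining)
        history.append(remaining)
    return history


# Second 29 on the slide: 46,920 RUs consumed against 10,000 provisioned,
# with 92,323 RUs of minute budget left beforehand.
print(charge_second(46_920, 10_000, 92_323))  # 55403
```

The overage is 46,920 - 10,000 = 36,920 RUs, and 92,323 - 36,920 = 55,403, which matches the remaining budget shown on the chart.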

  18. Programmable Data Consistencies
    5
    Navigating CAP theorem
    Consistent data worldwide

  19. Programmable Data Consistency
    Strong consistency, high latency
    Eventual consistency, low latency

  20. Programmable Data Consistency
    • Databases are divided into two categories
    – Provide extreme choices: strong vs. eventual consistency (e.g., DynamoDB)
    – Leave everything for developers to configure (e.g., Cassandra)
    • Read repair, hinted handoff, quorum sizes, replication topologies, etc.
    • Developers have to make precise tradeoffs between
    – Consistency and availability (during failures)
    – Consistency and latency (during steady state)
    – Consistency and throughput (this is important for TCO reasons)

  21. Choices of Consistency
    5 well-defined consistency levels for low latency and high availability
    Strong, Bounded staleness, Session, Consistent prefix, Eventual
    Most real-life applications do not fall into the two extremes (strong or eventual)

  22. Azure Cosmos DB
    5 well-defined consistency models
    Strong, Bounded Staleness, Session, Consistent Prefix, Eventual
    Clear Tradeoffs
    • Latency
    • Availability
    • Throughput
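
To make the five levels concrete, here is a toy model of when a replica may answer a read under each one. It assumes every replica applies writes in order and tracks a single version counter. The function name, parameters and that simplification are illustrative assumptions, not how the service is implemented.

```python
from enum import Enum


class Consistency(Enum):
    STRONG = "strong"
    BOUNDED_STALENESS = "bounded staleness"
    SESSION = "session"
    CONSISTENT_PREFIX = "consistent prefix"
    EVENTUAL = "eventual"


def can_serve_read(level, replica_version, latest_version,
                   session_version=0, max_lag=0):
    """Decide whether a replica is fresh enough to answer a read.

    Versions count the writes applied, in order (toy assumption).
    """
    if level is Consistency.STRONG:
        # Must reflect every committed write.
        return replica_version == latest_version
    if level is Consistency.BOUNDED_STALENESS:
        # May lag, but by no more than max_lag versions.
        return latest_version - replica_version <= max_lag
    if level is Consistency.SESSION:
        # The client must at least see its own writes.
        return replica_version >= session_version
    # Consistent prefix and eventual: any replica may answer. In this toy
    # model in-order application already gives a consistent prefix.
    return True


# A replica 3 writes behind, asked by a client whose last write was version 95.
print(can_serve_read(Consistency.STRONG, 97, 100))
print(can_serve_read(Consistency.BOUNDED_STALENESS, 97, 100, max_lag=5))
print(can_serve_read(Consistency.SESSION, 97, 100, session_version=95))
```

The weaker the level, the more replicas qualify to answer, which is where the latency, availability and throughput tradeoffs come from.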

  23. Industry-Leading, Comprehensive SLAs
    6
    • Latency @ 99th percentile SLA
    • Throughput SLA
    • Consistency SLA
    • Availability SLA
    [Diagram callouts: 2, 4, 3, 1]

  24. Comprehensive SLAs
    A globally distributed database needs to tackle
    1. latency vs. consistency tradeoffs (in steady state)
    2. availability vs. consistency tradeoffs (during failures)
    3. throughput vs. consistency tradeoffs at all times
    4. throughput vs. latency tradeoffs at all times
    Simply offering high availability SLAs is not sufficient!
    Cosmos DB:
    – 99.99% HA within a single region
    – 99.999% across regions
    – 99.99% SLA for throughput, latency, and consistency, all at the 99th percentile
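
As a worked example of what these availability percentages mean, the snippet below converts them into allowed downtime per year. The helper name is hypothetical and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def allowed_downtime_minutes(availability_percent, period_minutes=MINUTES_PER_YEAR):
    """Downtime budget implied by an availability percentage over a period."""
    return period_minutes * (1 - availability_percent / 100)


for sla in (99.99, 99.999):
    print(f"{sla}% -> {allowed_downtime_minutes(sla):.2f} minutes/year")
```

99.99% leaves about 52.6 minutes of downtime per year, and 99.999% leaves about 5.3 minutes.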

  25. High Availability
    Performance Latency
    Performance Throughput
    Data Consistency
    Only database with comprehensive SLAs across 4 dimensions
    Microsoft Azure

  26. Schema-agnostic, automatic indexing
    7

  27. Schema-agnostic, automatic indexing
    • At global scale, schema/index management is hard
    • Automatic and synchronous indexing of all ingested content: hash, range, geospatial, and columnar
    – No schemas or secondary indices ever needed
    • Resource-governed, write-optimized database engine with latch-free and log-structured techniques
    • Online and in-situ index transformations
    • While the database is fully schema-agnostic, schema extraction is built in
    – Customers can get Avro schemas from the database
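
The idea of indexing every document without a declared schema can be sketched as an inverted index keyed by (path, value). This is a simplified illustration, not the engine's actual index structure: the `PathIndex` class, the `/[]` notation for array elements and the sample documents are assumptions.

```python
from collections import defaultdict


def iter_paths(value, prefix=""):
    """Yield (path, leaf value) pairs for every leaf in a JSON-like document."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from iter_paths(child, f"{prefix}/{key}")
    elif isinstance(value, list):
        for child in value:
            # Array positions are collapsed so any element can be matched.
            yield from iter_paths(child, f"{prefix}/[]")
    else:
        yield prefix, value


class PathIndex:
    """Inverted index from (path, value) to the ids of matching documents."""

    def __init__(self):
        self._postings = defaultdict(set)

    def insert(self, doc_id, document):
        # Indexing happens on every write; no schema is declared up front.
        for path, leaf in iter_paths(document):
            self._postings[(path, leaf)].add(doc_id)

    def lookup(self, path, value):
        return set(self._postings.get((path, value), ()))


index = PathIndex()
index.insert("a", {"name": "Ada", "tags": ["x", "y"]})
index.insert("b", {"name": "Bob", "address": {"city": "Paris"}})
print(index.lookup("/address/city", "Paris"))
```

The two documents have different shapes, yet both become queryable on every path as soon as they are written, which is the property the slide describes.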

  28. Native Multi-Model
    8

  29. Why Multi-Model?
    [Figure: axes "Data volume" and "Data variety/complexity" (log10 scale, 1 to 15, starting at KB), showing transaction data, web/content data, and social/machine-generated data]
    Who Wants to Have 3-5 Different Backend Databases?

  30. Global Distribution from the ground-up
    Limitless Scale, Extremely Low Latency, Multiple Consistency Levels, ARS model, Comprehensive SLAs
    Planet-Scale
    Multi-Model
    Multi-API
    Versatile Workloads
    Operational Workloads
    Analytical Workloads
    Key-Value, Tabular, Graph, Documents
    Azure Cosmos DB
    Relational
    ANSI SQL

  31. Native Support for Multiple Data Models
    • Database engine operates on atom-record-sequence (ARS) based type system
    – All data models are efficiently translated to ARS
    • API and wire protocols are supported via extensible modules
    • Instances of a given data model can be materialized as trees
    • Graph, documents, key-value, column-family, … more to come
    KEY-VALUE COLUMN-FAMILY DOCUMENT GRAPH
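
The translation of different data models into one tree-shaped representation can be sketched as follows. The slide does not define atoms, records and sequences, so this assumes the usual reading: atoms are primitive values, records are structs of named fields, and sequences are ordered arrays. The function names and sample data are illustrative.

```python
def ars_kind(value):
    """Classify a JSON-like value as an atom, record or sequence."""
    if isinstance(value, dict):
        return "record"
    if isinstance(value, (list, tuple)):
        return "sequence"
    if value is None or isinstance(value, (str, bool, int, float)):
        return "atom"
    raise TypeError(f"unsupported value: {value!r}")


def to_tree(value):
    """Materialize a value as a tree of (kind, payload) nodes."""
    kind = ars_kind(value)
    if kind == "record":
        return ("record", {name: to_tree(child) for name, child in value.items()})
    if kind == "sequence":
        return ("sequence", [to_tree(child) for child in value])
    return ("atom", value)


# Instances of different data models, all expressed with the same three kinds.
key_value = {"key": "user-42", "value": "Ada"}
document = {"id": 1, "tags": ["a", "b"]}
graph_edge = {"from": "ada", "label": "knows", "to": "bob"}

for item in (key_value, document, graph_edge):
    print(to_tree(item))
```

Once every model reduces to the same three node kinds, storage and indexing only have to be implemented once, with each API layered on top as a module.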

  32. [No text on slide]

  33. Tables API in Azure Cosmos DB
    ✓ Premium experience (low latency, well-defined consistency)
    ✓ Globally Distributed
    ✓ Secondary Indexes for user-defined queries
    ✓ Millisecond latency, Guaranteed throughput
    ✓ We heard you – “Top user voice asks”
    Azure Cosmos DB: Table API
    Azure Storage: Standard Table API
    Azure Storage SDKs
    100% Backwards compatible, Seamless experience
    Azure Cosmos DB: Table API
    Azure Storage SDKs
    Coming Soon: Update for standard Tables, optimized for storage
    Seamless migration

  34. Gremlin API in Azure Cosmos DB
    • Model the real world
    • Relationships as first-class entities
    • Optimized for graph storage & traversal
    • Gremlin standard
    Azure Cosmos DB: Graph API
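
"Relationships as first-class entities" means an edge is stored as its own object with a label and properties, rather than as a foreign key buried in a record. The in-memory sketch below illustrates that idea together with a one-hop traversal comparable to the Gremlin step `g.V('ada').out('knows')`. The `Graph` and `Edge` classes and the sample data are hypothetical and unrelated to the service's storage format.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    """A relationship stored as its own entity, with a label and properties."""
    source: str
    label: str
    target: str
    properties: dict = field(default_factory=dict)


class Graph:
    def __init__(self):
        self.vertices = {}  # vertex id -> properties
        self.edges = []     # list of Edge objects

    def add_vertex(self, vertex_id, **properties):
        self.vertices[vertex_id] = properties

    def add_edge(self, source, label, target, **properties):
        if source not in self.vertices or target not in self.vertices:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append(Edge(source, label, target, properties))

    def out(self, vertex_id, label):
        """Follow outgoing edges with the given label (one traversal hop)."""
        return [e.target for e in self.edges
                if e.source == vertex_id and e.label == label]


graph = Graph()
graph.add_vertex("ada", kind="person")
graph.add_vertex("bob", kind="person")
graph.add_vertex("cosmos", kind="product")
graph.add_edge("ada", "knows", "bob", since=2017)
graph.add_edge("ada", "uses", "cosmos")
print(graph.out("ada", "knows"))
```

Because the edge carries its own properties (`since=2017` here), questions about the relationship itself can be answered without touching either vertex.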

  35. Native Graph Processing
    Globally distributed, elastically scalable, low latency, auto-indexed service
    Independently scalable graph engine (using TinkerPop framework)
    Gremlin and SQL query languages

  36. Security, Encryption, Compliance
    9

  37. Security & Compliance
    Enterprise grade security
    Encryption at Rest
    • Always encrypted at rest and in motion
    • Data, index, backups, and attachments encrypted
    Encryption is enabled automatically by default
    • No impact on performance, throughput or availability
    • Transparent to your application
    Comprehensive Azure compliance certification
    • ISO 27001, ISO 27018, EUMC, HIPAA, PCI
    • SOC1 and SOC2 (Audit complete, Certification in Q2 2017)
    • FedRAMP, IRS 1075, UK Official (IL2) (Q2 2017)
    • HITRUST (H2 2017)

  38. Thank You

  39. Getting Started
    • Web
    – cosmosdb.com
    – portal.azure.com
    – aka.ms/cosmosdb
    – aka.ms/cosmosdb-Tables
    – aka.ms/cosmosdb-Graph
    – aka.ms/cosmosdb-MongoDB
    – aka.ms/cosmosdb-DocumentDB
    – cosmosdb.com/capacityplanner
    • Download
    – aka.ms/CosmosDB-emulator
    • Re-visit Build session recordings on Channel 9.
    • Continue your education at Microsoft Virtual Academy online.
