CosmosDB: Jack Of All Trades, Master Of Many

Daron Yondem
November 04, 2017

This is the slide deck I used at the SQL Saturday event in Istanbul.
Last updated on 12/21/2017

Transcript

  1. 1-Click Global Replication
     • Ring 0 service
     • Multi-homing
     • Priorities for regions
     • Manual or automatic failover
  2. 99.99% SLA for Low-Latency Reads + Writes
     • Reads and writes served from the local region
     • Guaranteed millisecond latency worldwide
     • Write-optimized, latch-free database engine
     • Automatically indexed SSD storage

     Percentile   Reads (1 KB)   Indexed writes (1 KB)
     50%          < 2 ms         < 6 ms
     99%          < 10 ms        < 15 ms

     • Synchronous and automatic indexing at sustained ingestion rates
     • No schema or index management needed
     • No schema versioning needed
     • No schema migration needed
  3. What is RU?
     • RU stands for Request Unit.
     • Not all requests are equal.
     • An RU is a normalized measure of the computation (CPU, memory, and IOPS) required to serve a request.
  4. How to calculate?

     Item size   Reads/second   Writes/second   Request units
     1 KB        500            100             (500 * 1) + (100 * 5) = 1,000 RU/s
     1 KB        500            500             (500 * 1) + (500 * 5) = 3,000 RU/s
     4 KB        500            100             (500 * 1.3) + (100 * 7) = 1,350 RU/s
     4 KB        500            500             (500 * 1.3) + (500 * 7) = 4,150 RU/s
     64 KB       500            100             (500 * 10) + (100 * 48) = 9,800 RU/s
     64 KB       500            500             (500 * 10) + (500 * 48) = 29,000 RU/s

     See: https://www.documentdb.com/capacityplanner
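The arithmetic on this slide can be reproduced with a small helper. This is only a sketch: the per-operation charges are copied from the slide's figures, while the names `RU_PER_OPERATION` and `estimate_throughput` are hypothetical and not part of any Cosmos DB SDK. Real charges depend on indexing, consistency, and query shape, so the capacity planner linked above remains the better source for actual sizing.

```python
# Per-operation RU charges copied from the slide, keyed by item size in KB.
# Sizes the slide does not list are rejected rather than guessed.
RU_PER_OPERATION = {
    1: {"read": 1.0, "write": 5.0},
    4: {"read": 1.3, "write": 7.0},
    64: {"read": 10.0, "write": 48.0},
}


def estimate_throughput(item_size_kb, reads_per_second, writes_per_second):
    """Return the estimated RU/s for a steady read/write workload."""
    if item_size_kb not in RU_PER_OPERATION:
        raise ValueError(f"no RU charges known for {item_size_kb} KB items")
    charges = RU_PER_OPERATION[item_size_kb]
    # Total = reads * charge per read + writes * charge per write.
    return (reads_per_second * charges["read"]
            + writes_per_second * charges["write"])


if __name__ == "__main__":
    # Print one line per workload from the slide.
    for size, reads, writes in [(1, 500, 100), (4, 500, 500), (64, 500, 100)]:
        print(size, reads, writes, round(estimate_throughput(size, reads, writes)))
```

Keeping the charges in a lookup table makes it obvious which numbers come from the slide and which sizes are simply not covered.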
  5. For example
     • SELECT * FROM c (2.87 RU)
     • SELECT * FROM c WHERE CONTAINS(c.Name, "Sample") (2.45 RU)
  6. public static async Task<IEnumerable<T>> GetItemsAsync(Expression<Func<T, bool>> predicate)
     {
         // Accumulates the request charge (in RUs) across all result pages.
         double queryCost = 0;

         IDocumentQuery<T> query = client.CreateDocumentQuery<T>(
                 UriFactory.CreateDocumentCollectionUri(DatabaseId, CollectionId),
                 new FeedOptions { MaxItemCount = -1 })
             .Where(predicate)
             .AsDocumentQuery();

         List<T> results = new List<T>();
         while (query.HasMoreResults)
         {
             // Each page reports its own charge; add it to the running total.
             var response = await query.ExecuteNextAsync<T>();
             queryCost += response.RequestCharge;
             results.AddRange(response);
         }

         Debug.WriteLine(queryCost.ToString());
         return results;
     }
  7. Request Unit Management

                          Single Partition Container   Partitioned Container
     Minimum throughput   400 RU/sec                   1,000 RU/sec
     Maximum throughput   10,000 RU/sec                Unlimited

     // Look up the collection's current offer and replace it with 12,000 RU/sec.
     Offer offer = client.CreateOfferQuery()
         .Where(r => r.ResourceLink == collection.SelfLink)
         .AsEnumerable().SingleOrDefault();
     offer = new OfferV2(offer, 12000);
     await client.ReplaceOfferAsync(offer);

     A partition key is required to scale your collection's throughput beyond 2,500 request units in the future.
  8. Factors affecting RUs:
     • Item size
     • Item property count (indexing)
     • Data consistency (Strong or Bounded Staleness)
     • Indexed properties (lazy indexing can help)
     • Document indexing (disable it if you don't need it)
     • Query patterns (predicates, UDFs, data source size)
     • Script usage (stored procedures, triggers)
  9. Choose Your Consistency Level
     Strong → Bounded Staleness → Session → Consistent Prefix → Eventual
     Clear tradeoffs:
     • Latency
     • Availability
     • Throughput
     Moving toward Eventual: lower latency, higher availability, better read scalability.
  10. Bounded Staleness
      Strong → Bounded Staleness → Session → Consistent Prefix → Eventual
      With bounded staleness, the "staleness" can be configured in two ways: the number of versions K of an item by which reads lag behind writes, and the time interval t.
      Moving toward Eventual: lower latency, higher availability, better read scalability.
  11. Consistent Prefix
      Strong → Bounded Staleness → Session → Consistent Prefix → Eventual
      Consistent prefix guarantees that reads never see out-of-order writes. If writes were performed in the order A, B, C, then a client sees either A; A, B; or A, B, C, but never an out-of-order sequence such as A, C or B, A, C.
      Moving toward Eventual: lower latency, higher availability, better read scalability.
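The consistent prefix rule from this slide can be stated as a simple check: whatever a reader has observed must equal the beginning of the write order. The sketch below is illustrative only; `is_consistent_prefix` is a hypothetical helper, not a Cosmos DB API, and it assumes that a reader who has not seen any writes yet (an empty observation) counts as consistent.

```python
def is_consistent_prefix(writes, observed):
    """Return True if `observed` is a prefix of the ordered `writes`."""
    writes = list(writes)
    observed = list(observed)
    # A reader can never have seen more items than were written.
    if len(observed) > len(writes):
        return False
    # The observation must match the write order from the start, with no gaps.
    return writes[:len(observed)] == observed


if __name__ == "__main__":
    writes = ["A", "B", "C"]
    for seen in (["A"], ["A", "B"], ["A", "B", "C"], ["A", "C"], ["B", "A", "C"]):
        print(seen, is_consistent_prefix(writes, seen))
```

The first three observations are the ones the slide allows; the last two are the out-of-order cases it rules out.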
  12. Native Support for Multiple Data Models
      • Database engine operates on an atom-record-sequence (ARS) based type system
      • All data models are efficiently translated to ARS
      • API and wire protocols are supported via extensible modules
      • An instance of a given data model can be materialized as trees
      • Graph, documents, key-value, column-family, … more to come
  13. Auto Indexing
      • IndexingMode
        • Consistent (collection consistency applies)
        • Lazy (ingest now, query later)
        • None (EnableScanInQuery)
      • Data types
        • String, Number, Point, Polygon
      • Index types
        • Hash (joins)
        • Range (<, >)
        • Spatial
      • Precision can be defined.
  14. 4 Axis SLA
      • Latency SLA (at the 99th percentile)
      • Throughput SLA
      • Consistency SLA
      • Availability SLA
      Cosmos DB: 99.99% HA within a single region, 99.999% across regions; 99.99% SLA for throughput, latency, and consistency, all at the 99th percentile.
  15. Security
      • Documents and backups are encrypted at rest
      • IP-based access controls
      • Role-based access controls
      • Automated online backups
      • Attack monitoring
      • Geo-fencing
  16. Disclaimer
      • Cosmos DB is not a SQL database; it has no complex table joins (if you are trying to use them, you are doing it wrong).
      • Other NoSQL databases do one or two things really well but are not native to the cloud.