Databases, Databases Everywhere. Which One Do I Choose?!?

Taylor Riggan, Sr. Specialist Solutions Architect for Graph and In-Memory Databases, outlines a purpose-built strategy for databases, where you choose the right tool for the job. We explain why your application should drive the requirements of a database, not the other way around. We introduce AWS databases that are purpose-built for your application use cases. Learn why you should select different database services to solve different aspects of an application, and watch a demonstration in which application use cases lend themselves well to specific data services. If you're a developer building modern applications that require high performance, scale, and functional databases, and you're trying to determine which relational and non-relational data services to use, this tech talk is for you.

Felipe Garcia

May 30, 2019

Transcript

  1. Databases, Databases Everywhere. Which One Do I Choose?!?

    Taylor Riggan, Sr. Solutions Architect, Graph and In-Memory Databases. May 30th, 2019. Summit. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. Amazon Confidential and Trademark.
  2. Agenda

    1. Common data categories and how we got here
    2. Discuss the purpose and use cases of each data model
    3. Demo
    4. Where to learn more
  3. Data categories and common use cases

    Relational: Referential integrity, ACID transactions, schema-on-write. Use cases: lift and shift, EMR, CRM, finance.
    Key value: Low-latency key lookups with high throughput and fast ingestion of data. Use cases: real-time bidding, shopping cart, social.
    Document: Indexing and storing documents with support for query on any attribute. Use cases: content management, personalization, mobile.
    In-memory: Microsecond latency, key-based queries, and specialized data structures. Use cases: leaderboards, real-time analytics, caching.
    Graph: Creating and navigating data relations easily and quickly. Use cases: fraud detection, social networking, recommendation engine.
    Search: Indexing and searching semistructured logs and data. Use cases: product catalog, help and FAQs, full text.
    Time series: Collect, store, and process data sequenced by time. Use cases: IoT applications, event tracking.
    Ledger: Complete, immutable, and verifiable history of all changes to application data. Use cases: systems of record, supply chain, healthcare, registrations, financial.
  4. AWS: Purpose-built databases

    Relational: Amazon RDS (Aurora, commercial, and community engines)
    Key value: Amazon DynamoDB
    Document: Amazon DocumentDB
    In-memory: Amazon ElastiCache (Redis, Memcached)
    Graph: Amazon Neptune
    Search: Amazon Elasticsearch Service
    Time series: Amazon Timestream
    Ledger: Amazon Quantum Ledger Database
  5. 1970 1980 1990 2000 Oracle DB2 SQL Server MySQL PostgreSQL

    DynamoDB Redis MongoDB Elasticsearch Neptune Cassandra Access Aurora 2010 Timestream QLDB Amazon DocumentDB
  6. Two fundamental areas of focus “Lift and shift” existing apps

    to the cloud Quickly build new apps in the cloud
  7. AWS Database Migration Service (AWS DMS)

    Migrating databases to AWS
    Migrate between on-premises and AWS
    Migrate between databases
    Automated schema conversion
    Data replication for migration with zero downtime
    100,000+ databases migrated
  8. Modern apps create new requirements Users: 1 million+ Data volume:

    TB–PB–EB Locality: Global Performance: Milliseconds–microseconds Request rate: Millions Access: Web, mobile, IoT, devices Scale: Up-down, Out-in Economics: Pay for what you use Developer access: No assembly required Social media Ride hailing Media streaming Dating
  9. Airbnb uses different databases based on the purpose User search

    history: Amazon DynamoDB • Massive data volume • Need quick lookups for personalized search Session state: Amazon ElastiCache • In-memory store for submillisecond site rendering Relational data: Amazon RDS • Referential integrity • Primary transactional database
  10. Duolingo

    Challenge: Wanted to enable anyone to learn a language for free.
    Solution: Purpose-built databases from AWS:
    • DynamoDB: 31B items tracking which language exercises were completed
    • Aurora: Primary transactional database for user data
    • ElastiCache: Instant access to common words and phrases
    Result: More people learning a language on Duolingo than in the entire US school system. 300M total users. 7B exercises per month.
  11. Relational
  12. Relational model Data model • Data is stored in rows

    and tables • Data is normalized • Strict schema • Relationships established via keys enforced by the system • Data accuracy and consistency • Complex queries Patient * Patient ID First Name Last Name Gender DOB * Doctor ID Visit * Visit ID * Patient ID * Hospital ID Date * Treatment ID Medical Treatment * Treatment ID Procedure How Performed Adverse Outcome Contraindication Doctor * Doctor ID First Name Last Name Medical Specialty * Hospital Affiliation Hospital * Hospital ID Name Address Rating
  13. Relational model

    Patient: * Patient ID, First Name, Last Name, Gender, DOB, * Doctor ID
    Visit: * Visit ID, * Patient ID, * Hospital ID, Date, * Treatment ID
    Medical Treatment: * Treatment ID, Procedure, How Performed, Adverse Outcome, Contraindication
    Doctor: * Doctor ID, First Name, Last Name, Medical Specialty, * Hospital Affiliation
    Hospital: * Hospital ID, Name, Address, Rating

    Query model: SQL

    SELECT d.first_name, d.last_name, count(*)
    FROM visit AS v, hospital AS h, doctor AS d
    WHERE v.hospital_id = h.hospital_id
      AND h.hospital_id = d.hospital
      AND v.t_date > date_trunc('week', CURRENT_TIMESTAMP - interval '1 week')
    GROUP BY d.first_name, d.last_name;
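To make the relational slide concrete, here is a minimal sketch that runs a query of the same shape against an in-memory SQLite database. SQLite stands in for a relational engine purely for illustration; the trimmed-down tables, the sample rows, and the `visits_per_doctor` helper are assumptions, and the PostgreSQL-style `date_trunc` expression from the slide is replaced by a plain cutoff-date parameter.

```python
import sqlite3

# In-memory SQLite stands in for a relational engine (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
CREATE TABLE hospital (
    hospital_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL
);
CREATE TABLE doctor (
    doctor_id  INTEGER PRIMARY KEY,
    first_name TEXT NOT NULL,
    last_name  TEXT NOT NULL,
    hospital   INTEGER NOT NULL REFERENCES hospital(hospital_id)
);
CREATE TABLE visit (
    visit_id    INTEGER PRIMARY KEY,
    hospital_id INTEGER NOT NULL REFERENCES hospital(hospital_id),
    t_date      TEXT NOT NULL  -- ISO 8601 dates compare correctly as strings
);
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO hospital VALUES (?, ?)", [(1, "General"), (2, "Mercy")])
conn.executemany(
    "INSERT INTO doctor VALUES (?, ?, ?, ?)",
    [(1, "Ana", "Lopez", 1), (2, "Ben", "Chu", 2)],
)
conn.executemany(
    "INSERT INTO visit VALUES (?, ?, ?)",
    [(1, 1, "2019-05-27"), (2, 1, "2019-05-28"), (3, 2, "2019-05-01")],
)


def visits_per_doctor(since):
    """Count visits after `since` at each doctor's hospital, as in the slide's query."""
    return conn.execute(
        """
        SELECT d.first_name, d.last_name, count(*)
        FROM visit AS v
        JOIN hospital AS h ON v.hospital_id = h.hospital_id
        JOIN doctor AS d ON h.hospital_id = d.hospital
        WHERE v.t_date > ?
        GROUP BY d.first_name, d.last_name
        ORDER BY d.last_name
        """,
        (since,),
    ).fetchall()


rows = visits_per_doctor("2019-05-20")
print(rows)
```

Because the keys are enforced by the system, inserting a visit that points at a nonexistent hospital raises `sqlite3.IntegrityError` in this sketch, which is the "relationships established via keys" property the slides describe.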
  14. Use cases Amazon.com is the world’s leading online retailer. The

    Amazon Transaction Risk Management Services (TRMS) team migrated more than 100 on-premises Oracle databases to Amazon Aurora. Enterprise applications Software as a service (SaaS) applications Web and mobile gaming “We are excited to announce the availability of Remedy ITSM on AWS Cloud. Our customers can now benefit from best-in-class cloud service, installation time that’s three times faster, and lower cost of ownership by supporting migration to Aurora PostgreSQL." “At the UN, we operate multiple websites with global reach that require mission-critical reliability and consistent performance. We were able to achieve superb performance, even with Amazon Aurora’s smallest database engine.”
  15. Amazon Aurora MySQL and PostgreSQL-compatible relational database built for the

    cloud Performance and availability of commercial-grade databases at 1/10 the cost Performance and scalability Availability and durability Highly secure Fully managed 5x throughput of standard MySQL and 3x of standard PostgreSQL; scale-out up to 15 read replicas Fault-tolerant, self-healing storage; six copies of data across three Availability Zones; continuous backup to Amazon S3 Network isolation, encryption at rest/transit Managed by Amazon RDS: No hardware provisioning, software patching, setup, configuration, or backups
  16. Amazon Relational Database Service (Amazon RDS) Managed relational database service

    with a choice of six popular database engines Easy to administer Available and durable Highly scalable Fast and secure No need for infrastructure provisioning, installing, and maintaining DB software Automatic Multi-AZ data replication; automated backup, snapshots, failover Scale database compute and storage with a few clicks with no app downtime SSD storage and guaranteed provisioned I/O; data encryption at rest and in transit
  17. Key value
  18. Key-value data • Simple key-value pairs • Partitioned by keys

    • Resilient to failure • High throughput, low- latency reads and writes • Consistent performance at scale Table 1 … … Partitions … Highly partitionable data
  19. Key-value data

    Table: Gamers. Primary key: Gamer Tag, Type. The remaining columns are attributes.

    Gamer Tag    Type    Attributes
    Hammer57     Rank    Level: 87, Points: 4050, Tier: Elite
    Hammer57     Status  Health: 90, Progress: 30
    Hammer57     Weapon  Class: Taser, Damage: 87%, Range: 50
    FluffyDuffy  Rank    Level: 5, Points: 1072, Tier: Trainee
    FluffyDuffy  Status  Health: 37, Progress: 8

    // Status of Hammer57
    GET { TableName: "Gamers", Key: { "GamerTag": "Hammer57", "Type": "Status" } }

    // Return all Hammer57
    Gamers: GamerTag = :a (:a = Hammer57)
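The two access patterns on the Gamers slide (fetch one item by its full key, or fetch every item that shares a gamer tag) can be sketched with nested dictionaries. This is only an illustration of the data model: the `KeyValueTable` class and its method names are hypothetical and are not the DynamoDB API.

```python
from typing import Dict, Optional

Item = Dict[str, object]


class KeyValueTable:
    """Toy table keyed by (partition key, sort key); names are hypothetical."""

    def __init__(self) -> None:
        # partition key -> sort key -> attributes
        self._partitions: Dict[str, Dict[str, Item]] = {}

    def put_item(self, partition_key: str, sort_key: str, attributes: Item) -> None:
        self._partitions.setdefault(partition_key, {})[sort_key] = dict(attributes)

    def get_item(self, partition_key: str, sort_key: str) -> Optional[Item]:
        """Point lookup by the full primary key; None on a miss."""
        return self._partitions.get(partition_key, {}).get(sort_key)

    def query(self, partition_key: str) -> Dict[str, Item]:
        """All items in one partition, ordered by sort key."""
        return dict(sorted(self._partitions.get(partition_key, {}).items()))


gamers = KeyValueTable()
gamers.put_item("Hammer57", "Rank", {"Level": 87, "Points": 4050, "Tier": "Elite"})
gamers.put_item("Hammer57", "Status", {"Health": 90, "Progress": 30})
gamers.put_item("Hammer57", "Weapon", {"Class": "Taser", "Damage": "87%", "Range": 50})
gamers.put_item("FluffyDuffy", "Rank", {"Level": 5, "Points": 1072, "Tier": "Trainee"})
gamers.put_item("FluffyDuffy", "Status", {"Health": 37, "Progress": 8})

print(gamers.get_item("Hammer57", "Status"))  # status of Hammer57
print(list(gamers.query("Hammer57")))         # all item types for Hammer57
```

Both operations touch a single partition, which is why this model partitions well: no request needs to scan or join across keys.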
  20. Use cases for key-value data Social media Capital One uses

    DynamoDB to reduce latency for its mobile applications by moving its mainframe transactions to a serverless architecture for unbound scale Lyft leverages the scalability of DynamoDB for multiple data stores, including a ride-tracking system that stores GPS coordinates for all rides Snap migrated its largest storage workload, Snapchat Stories, to DynamoDB and improved performance while reducing costs Mobile IoT
  21. Amazon DynamoDB

    Fast and flexible key-value database service for any scale
    Performance at scale: Consistent, single-digit millisecond response times at any scale; build applications with virtually unlimited throughput
    Serverless: No server provisioning, software patching, or upgrades; scales up or down automatically; continuously backs up your data
    Comprehensive security: Encrypts all data by default and fully integrates with AWS Identity and Access Management for robust security
    Global database for global users and apps: Build global applications with fast access to local data by easily replicating tables across multiple AWS Regions
  22. Document
  23. Document databases • Data is stored in JSON-like documents •

    Documents map naturally to how humans model data • Flexible schema and indexing • Expressive query language built for documents (ad hoc queries and aggregations) JSON documents are first-class objects of the database { id: 1, name: "sue", age: 26, email: "[email protected]", promotions: ["new user", "5%", "dog lover"], memberDate: 2018-2-22, shoppingCart: [ {product:"abc", quantity:2, cost:19.99}, {product:"edf", quantity:3, cost: 2.99} ] }
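"Query on any attribute" can be sketched in a few lines over plain dictionaries. The `find` helper below is hypothetical: it supports equality on top-level fields, dotted paths into nested objects, and membership in arrays. It is not the MongoDB or Amazon DocumentDB query language, and the second sample document borrows the user-profile shape shown later in the deck.

```python
from typing import Any, Dict, List

Document = Dict[str, Any]


def get_path(doc: Document, path: str) -> Any:
    """Follow a dotted path such as 'name.first'; None if any step is missing."""
    current: Any = doc
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def matches(value: Any, expected: Any) -> bool:
    """Arrays match if they contain the expected value; scalars match on equality."""
    if isinstance(value, list):
        return expected in value
    return value == expected


def find(collection: List[Document], criteria: Dict[str, Any]) -> List[Document]:
    """Return documents for which every criterion matches."""
    return [
        doc
        for doc in collection
        if all(matches(get_path(doc, path), want) for path, want in criteria.items())
    ]


users = [
    {
        "id": 1,
        "name": "sue",
        "age": 26,
        "promotions": ["new user", "5%", "dog lover"],
        "shoppingCart": [
            {"product": "abc", "quantity": 2, "cost": 19.99},
            {"product": "edf", "quantity": 3, "cost": 2.99},
        ],
    },
    {"id": 181276, "username": "sue1942", "name": {"first": "Susan", "last": "Benoit"}},
]

print([d["id"] for d in find(users, {"promotions": "dog lover"})])
print([d["id"] for d in find(users, {"name.first": "Susan"})])
```

Note that the two documents have different shapes and neither query needed a schema change, which is the flexible-schema point of the slide.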
  24. Evolution of document databases JSON became the de facto data

    interchange format Friction when converting JSON to the relational model Object-relational mappings (ORMs) were created to help with this friction Document databases solved the problem (Client) (App) (Database) JSON != Relational JSON
  25. Use cases for document data User profiles { id: 181276,

    username: "sue1942", name: {first: "Susan", last: "Benoit"} } { id: 181276, username: "sue1942", name: {first: "Susan", last: "Benoit"} } { id: 181276, username: "sue1942", name: {first: "Susan", last: "Benoit"}, ExploidingSnails: { hi_score: 3185400, global_rank: 5139, bonus_levels: true }, promotions: ["new user","5%","snail lover"] } { id: 181276, username: "sue1942", name: {first: "Susan", last: "Benoit"}, ExploidingSnails: { hi_score: 3185400, global_rank: 5139, bonus_levels: true } }
  26. Use cases for document data Mobile Retail and marketing User

    profiles Catalog Content management Personalization
  27. Amazon DocumentDB Fast, scalable, and fully managed MongoDB-compatible database service

    Fast Scalable Fully managed MongoDB compatible Millions of requests per second with millisecond latency; twice the throughput of MongoDB Separation of compute and storage enables both layers to scale independently; scale out to 15 read replicas in minutes Managed by AWS: no hardware provisioning; auto patching, quick setup, secure, and automatic backups Compatible with MongoDB 3.6; use the same SDKs, tools, and applications with Amazon DocumentDB
  28. In-memory
  29. In-memory • No persistence, in-memory • Microsecond performance • Simple

    commands for manipulating in memory data structures • Strings, hashes, lists, sets, and sorted sets Database Memory (buffer pool) Disk Query processor Get/Put APIs Memory Milliseconds to microseconds (10x faster) Storage engine
  30. In-memory

    set a "hello"             // Set key "a" with a string value and no expiration
    OK
    get a                     // Get value for key "a"
    "hello"
    get b                     // Get value for key "b" results in miss
    (nil)
    set b "Good-bye" EX 5     // Set key "b" with a string value and a 5 second expiration
    OK
    get b                     // Get value for key "b"
    "Good-bye"
    // wait >= 5 seconds
    get b                     // key has expired, nothing returned
    (nil)
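The set/get/expire behavior in the session above can be sketched as a small in-process cache. The `TtlCache` class is hypothetical and only mimics the semantics of `SET key value EX seconds` and `GET`; it is not a Redis client. The clock is injectable so that expiry can be shown without actually sleeping.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TtlCache:
    """Toy key-value cache with optional per-key expiration (hypothetical class)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (value, absolute expiry time or None for "never expires")
        self._data: Dict[str, Tuple[str, Optional[float]]] = {}

    def set(self, key: str, value: str, ex: Optional[float] = None) -> str:
        expires_at = None if ex is None else self._clock() + ex
        self._data[key] = (value, expires_at)
        return "OK"

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None  # miss
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazily evict the expired key
            return None
        return value


# A fake clock we can advance by hand instead of waiting five seconds.
now = [0.0]
cache = TtlCache(clock=lambda: now[0])

print(cache.set("a", "hello"))           # OK
print(cache.get("a"))                    # hello
print(cache.get("b"))                    # None (miss)
print(cache.set("b", "Good-bye", ex=5))  # OK
print(cache.get("b"))                    # Good-bye
now[0] += 5                              # "wait" 5 seconds
print(cache.get("b"))                    # None (expired)
```

Expired keys are evicted lazily on read here to keep the sketch short; a real cache also has to reclaim keys that are never read again.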
  31. Use cases for in-memory data Caching Reducing our database queries

    up to 95% with simple caching McDonald’s uses a number of AWS services, including Amazon EC2, Elastic Load Balancing, Amazon EBS, and Amazon ElastiCache, to support its global POS system, including 200,000 registers and 300,000 POS devices Airbnb uses ElastiCache for site-wide caching Real-time bidding Real time
  32. Amazon ElastiCache

    Redis and Memcached compatible, in-memory data store and cache
    Redis and Memcached compatible: Fully compatible with open source Redis and Memcached
    Extreme performance: In-memory data store and cache for microsecond response times
    Secure and reliable: Network isolation, encryption at rest and in transit, HIPAA, PCI, FedRAMP, Multi-AZ, and automatic failover
    Easily scalable: Scales writes and reads with sharding and replicas
  33. Search
  34. Search

    Document 1: The bright blue butterfly hangs on the breeze
    Document 2: It’s best to forget the great sky and to retire from every wind.
    Document 3: Under blue sky, in bright sunlight, one need search around.

    Stopword list: a, and, around, every, for, from, in, is, it, not, on, one, the, to, under

    Inverted index:
    ID  Term       Document
    1   best       2
    2   blue       1, 3
    3   bright     1, 3
    4   breeze     1
    5   butterfly  1
    6   forget     2
    7   great      2
    8   hangs      1
    9   need       3
    10  retire     2
    11  search     3
    12  sky        2, 3
    13  wind       2
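The inverted index on this slide can be rebuilt with a short sketch: tokenize each document, drop stopwords, and record which documents contain each remaining term. The tokenizer rules and function names are assumptions made for illustration, not how Elasticsearch analyzes text. One difference to expect: this sketch also indexes "sunlight" from Document 3, which the slide's table does not list.

```python
import re
from collections import defaultdict
from typing import Dict, List

STOPWORDS = set("a and around every for from in is it not on one the to under".split())

DOCS = {
    1: "The bright blue butterfly hangs on the breeze",
    2: "It\u2019s best to forget the great sky and to retire from every wind.",
    3: "Under blue sky, in bright sunlight, one need search around.",
}


def tokenize(text: str) -> List[str]:
    """Lowercase, split into words, cut contractions ("it's" -> "it"), drop stopwords."""
    normalized = text.lower().replace("\u2019", "'")
    words = (token.split("'")[0] for token in re.findall(r"[a-z']+", normalized))
    return [word for word in words if word and word not in STOPWORDS]


def build_index(docs: Dict[int, str]) -> Dict[str, List[int]]:
    """Map each term to the sorted list of document IDs that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search(index: Dict[str, List[int]], query: str) -> List[int]:
    """Return IDs of documents containing every non-stopword term in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


index = build_index(DOCS)
print(index["blue"], index["sky"])
print(search(index, "blue sky"))
```

A query is answered by intersecting posting lists rather than scanning the documents, which is what makes full-text lookups fast.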
  35. Search _search?q=house "hits": { "total": 85, "max_score": 6.6137657, "hits": [{

    "_index": "movies", "_type": "movie", "_id": "tt0077975", "_score": 6.6137657, "_source": { "directors": [ "John Landis" ], "release_date": "1978-07-27T00:00:00Z", "rating": 7.5, "genres": [ "Comedy", "Romance" ], "image_url": "http://ia.jpg "plot": "At a 1962 College, Dean Vernon Wormer…", "title": "Animal House", "rank": 527, "running_time_secs": 6540, "actors": [ "John Belushi","Karen Allen","Tom Hulce" ], "year": 1978, "id": "tt0077975" } },
  36. Search use cases

    Log analytics: Adobe uses Amazon Elasticsearch Service to cost-effectively analyze and visualize large amounts of log data for its developer platform, which at peak receives over 200K API calls per second.
    Full text search: MirrorWeb uses Amazon Elasticsearch Service to make the UK Government and UK Parliament’s web archives searchable. With Amazon Elasticsearch Service, MirrorWeb indexed 1.4B documents for just $337 and indexed 146M docs per hour, 14x faster than the previously used technology.
    Clickstream analytics: Hearst Corporation built a clickstream analytics platform using Amazon Elasticsearch Service, Amazon Kinesis Data Streams, and Amazon Kinesis Data Firehose to transmit and process 30 terabytes of data a day from 300+ Hearst websites worldwide.
  37. Amazon Elasticsearch Service

    Fully managed, reliable, and scalable Elasticsearch service
    Easy to use: Deploy a production-ready Elasticsearch cluster in minutes
    Scalable: Resize your cluster with a few clicks or a single API call
    Highly available: Replicate across Availability Zones, with monitoring and automated self-healing
    Secure: Deploy into your VPC and restrict access using security groups and IAM policies
  38. Graph
  39. Graph data

    • Relationships are first-class objects
    • Data is modeled and queried as a graph
    • Vertices connected by edges
    • Creating and navigating relations between data easily and quickly
    Diagram labels: Purchased, Follows, Knows, Product, Sport
  40. Graph Social networking Life sciences Network & IT operations Fraud

    detection Recommendations Knowledge graphs
  41. Graph use case

    // Product recommendation to a user
    Diagram: a KNOWS edge between users, and PURCHASED edges from users to BOOK #1, BOOK #2, and BOOK #3
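The recommendation pattern in the diagram (suggest books bought by people the user knows, excluding books the user already owns) is a two-hop traversal. The sketch below uses a small adjacency-list graph; the `Graph` class, the user names, and the ranking rule are hypothetical, and a graph database would express the same walk in a query language such as Gremlin or SPARQL.

```python
from collections import Counter, defaultdict
from typing import List, Set


class Graph:
    """Toy labeled, directed graph stored as adjacency sets (hypothetical class)."""

    def __init__(self) -> None:
        # source vertex -> edge label -> set of target vertices
        self._out = defaultdict(lambda: defaultdict(set))

    def add_edge(self, source: str, label: str, target: str) -> None:
        self._out[source][label].add(target)

    def out(self, vertex: str, label: str) -> Set[str]:
        """Vertices reachable from `vertex` over one edge with the given label."""
        return set(self._out.get(vertex, {}).get(label, set()))


def recommend(graph: Graph, user: str) -> List[str]:
    """Books purchased by people the user KNOWS, minus books the user already has.

    Ranked by how many acquaintances bought each book, then by title.
    """
    owned = graph.out(user, "PURCHASED")
    counts = Counter()
    for friend in graph.out(user, "KNOWS"):
        for book in graph.out(friend, "PURCHASED") - owned:
            counts[book] += 1
    return [book for book, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


g = Graph()
g.add_edge("alice", "KNOWS", "bob")
g.add_edge("alice", "KNOWS", "carol")
g.add_edge("alice", "PURCHASED", "Book #1")
g.add_edge("bob", "PURCHASED", "Book #1")
g.add_edge("bob", "PURCHASED", "Book #2")
g.add_edge("carol", "PURCHASED", "Book #2")
g.add_edge("carol", "PURCHASED", "Book #3")

print(recommend(g, "alice"))
```

Each hop is a direct lookup of a vertex's edges, so the cost depends on how many neighbors are visited and not on the total size of the data set.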
  42. Amazon Neptune

    Fully managed graph database
    Fast: Query billions of relationships with millisecond latency
    Reliable: Six replicas of your data across three AZs with full backup and restore
    Easy: Build powerful queries easily with Gremlin and SPARQL
    Open: Supports Apache TinkerPop and W3C RDF graph models
  43. Demo
  44. Retail demo application

    Demo application:
    1. Available today
    2. On GitHub: /aws-samples/aws-bookstore-demo-app
    3. One-click AWS CloudFormation deployment

    Search (Amazon Elasticsearch Service): Indexing and searching semistructured logs and data. Used for product search.
    Key-value (DynamoDB): High throughput, low-latency reads and writes, endless scale. Used for shopping cart and user profile.
    Graph (Amazon Neptune): Quickly and easily create and navigate relationships between data. Used for product recommendation.
    In-memory (ElastiCache): Query by key with microsecond latency. Used for product leaderboard.
  45. Time series
  46. Time series data What is time series data? What is

    special about a time series database? A sequence of data points recorded over a time interval Time is the single primary axis of the data model t
  47. Time series use case Application events IoT sensor readings DevOps

    data Humidity % Water vapor 91.0 94.0 86.0 93.0
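Since time is the single primary axis of this data model, the core operations are "give me the points in a time window" and "estimate the value at a moment between two readings". The sketch below keeps points sorted by timestamp and uses binary search for both. The `TimeSeries` class and the 60-second spacing of the humidity readings are assumptions; only the four readings themselves come from the slide.

```python
from bisect import bisect_left, bisect_right, insort
from typing import List, Tuple

Point = Tuple[int, float]  # (timestamp in seconds, value)


class TimeSeries:
    """Toy series kept sorted by timestamp (hypothetical class)."""

    def __init__(self) -> None:
        self._points: List[Point] = []

    def append(self, timestamp: int, value: float) -> None:
        insort(self._points, (timestamp, value))  # keeps time order on insert

    def between(self, start: int, end: int) -> List[Point]:
        """All points with start <= timestamp <= end, located by binary search."""
        lo = bisect_left(self._points, (start, float("-inf")))
        hi = bisect_right(self._points, (end, float("inf")))
        return self._points[lo:hi]

    def interpolate(self, timestamp: int) -> float:
        """Linear interpolation between the two readings around `timestamp`."""
        i = bisect_left(self._points, (timestamp, float("-inf")))
        if i < len(self._points) and self._points[i][0] == timestamp:
            return self._points[i][1]  # exact reading exists
        if i == 0 or i == len(self._points):
            raise ValueError("timestamp is outside the recorded range")
        (t0, v0), (t1, v1) = self._points[i - 1], self._points[i]
        return v0 + (v1 - v0) * (timestamp - t0) / (t1 - t0)


humidity = TimeSeries()
for ts, reading in [(0, 91.0), (60, 94.0), (120, 86.0), (180, 93.0)]:
    humidity.append(ts, reading)

print(humidity.between(60, 120))
print(humidity.interpolate(90))
```

Keeping the data ordered by time is what makes window queries cheap; a general-purpose table without that ordering has to scan or maintain a separate index.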
  48. Existing time-series databases Relational databases Difficult to maintain high availability

    Difficult to scale Limited data lifecycle management Inefficient time series data processing Unnatural for time series data Rigid schema inflexible for fast moving time series data Building with time series data is challenging
  49. Amazon Timestream (sign up for the preview) Fast, scalable, fully

    managed time-series database 1,000x faster and 1/10 the cost of relational databases Collect data at the rate of millions of inserts per second (10M/second) Trillions of daily events Adaptive query processing engine maintains steady, predictable performance Time-series analytics Built-in functions for interpolation, smoothing, and approximation Serverless Automated setup, configuration, server provisioning, software patching
  50. Ledger
  51. Common customer use cases Ledgers with centralized control Healthcare Verify

    and track hospital equipment inventory Manufacturers Track distribution of a recalled product HR & payroll Track changes to an individual’s profile Government Track vehicle title history
  52. Challenges with building ledgers Adds unnecessary complexity Blockchain RDBMS –

    audit tables Difficult to maintain Hard to use and slow Hard to build Custom audit functionality using triggers or stored procedures Impossible to verify No way to verify changes made to data by sys admins
  53. Ledger database concepts C | H J Journal C |

    H Current | History Current | History Journal Ledger comprises J L Ledger database L Journal determines Current | History
  54. How it works

    Statements applied to the cars table, in order:

    INSERT INTO cars << { 'Manufacturer': 'Tesla', 'Model': 'Model S', 'Year': '2012', 'VIN': '123456789', 'Owner': 'Traci Russell' } >>
    FROM cars WHERE VIN = '123456789' UPDATE owner = 'Ronnie Nash'
    FROM cars WHERE VIN = '123456789' UPDATE owner = 'Elmer Hubbard'

    Journal (J):
    INSERT cars ID: 1, Manufacturer: Tesla, Model: Model S, Year: 2012, VIN: 123456789, Owner: Traci Russell, Metadata: { Date: 07/16/2012 }, H(x)
    UPDATE cars ID: 1, Owner: Ronnie Nash, Metadata: { Date: 08/03/2013 }, H(x)
    UPDATE cars ID: 1, Owner: Elmer Hubbard, Metadata: { Date: 09/02/2016 }, H(x)

    current.cars (C), after the final update (the owner was Traci Russell after the insert and Ronnie Nash after the first update):
    ID  Manufacturer  Model    Year  VIN        Owner
    1   Tesla         Model S  2012  123456789  Elmer Hubbard

    history.cars (H), after the final update (one version is added per statement):
    ID  Version  Start       End         Manufacturer  Model    Year  VIN        Owner
    1   1        07/16/2012  08/03/2013  Tesla         Model S  2012  123456789  Traci Russell
    1   2        08/03/2013  09/02/2016  Tesla         Model S  2012  123456789  Ronnie Nash
    1   3        09/02/2016  NULL        Tesla         Model S  2012  123456789  Elmer Hubbard
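The relationship on this slide (an append-only journal from which the current and history views are derived) can be sketched in a few lines. The `Ledger` class is hypothetical, and the SHA-256 hash chain is an illustrative assumption about what the slide's H(x) marker could mean; it is not a description of how Amazon QLDB works internally.

```python
import hashlib
import json
from typing import Dict, List, Optional


class Ledger:
    """Toy ledger: an append-only journal plus derived views (hypothetical class)."""

    def __init__(self) -> None:
        self.journal: List[dict] = []

    @staticmethod
    def _digest(previous_hash: str, body: dict) -> str:
        # Each entry's hash covers the previous hash, chaining the journal together.
        payload = previous_hash + json.dumps(body, sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

    def append(self, doc_id: int, changes: Dict[str, str], date: str) -> None:
        """Record an insert or update; existing entries are never modified."""
        previous_hash = self.journal[-1]["hash"] if self.journal else ""
        body = {"id": doc_id, "data": dict(changes), "date": date}
        self.journal.append(dict(body, hash=self._digest(previous_hash, body)))

    def history(self, doc_id: int) -> List[dict]:
        """Replay the journal into versions with start and end dates."""
        versions: List[dict] = []
        state: Dict[str, str] = {}
        for entry in self.journal:
            if entry["id"] != doc_id:
                continue
            state = {**state, **entry["data"]}
            if versions:
                versions[-1]["end"] = entry["date"]  # close the previous version
            versions.append(
                {"version": len(versions) + 1, "start": entry["date"], "end": None, **state}
            )
        return versions

    def current(self, doc_id: int) -> Optional[dict]:
        versions = self.history(doc_id)
        return versions[-1] if versions else None

    def verify(self) -> bool:
        """Recompute the hash chain; any edit to a past entry breaks it."""
        previous_hash = ""
        for entry in self.journal:
            body = {key: entry[key] for key in ("id", "data", "date")}
            if self._digest(previous_hash, body) != entry["hash"]:
                return False
            previous_hash = entry["hash"]
        return True


ledger = Ledger()
ledger.append(
    1,
    {"Manufacturer": "Tesla", "Model": "Model S", "Year": "2012",
     "VIN": "123456789", "Owner": "Traci Russell"},
    "07/16/2012",
)
ledger.append(1, {"Owner": "Ronnie Nash"}, "08/03/2013")
ledger.append(1, {"Owner": "Elmer Hubbard"}, "09/02/2016")

print(ledger.current(1)["Owner"])
print([(v["version"], v["start"], v["end"], v["Owner"]) for v in ledger.history(1)])
print(ledger.verify())
```

Because both views are computed from the journal, they cannot drift from the recorded history, and because each hash depends on the one before it, altering an old entry makes `verify()` return `False`.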
  55. Amazon Quantum Ledger Database (Amazon QLDB) (preview)

    Fully managed ledger database. Track and verify history of all changes made to your application’s data
    Immutable: Maintains a sequenced record of all changes to your data, which cannot be deleted or modified; you can query and analyze the full history
    Cryptographically verifiable: Uses cryptography to generate a secure output file of your data’s history
    Easy to use: Lets you use familiar database capabilities like SQL APIs for querying the data
    Highly scalable: Executes 2–3x as many transactions as ledgers in common blockchain frameworks
  56. Purpose-built
  57. Team Internet

    AWS re:Invent 2017: Running Lean Architectures: How to Optimize for Cost Efficiency (ARC303)
    “For every single purpose within our application, we are now using different databases”
    “…now we can pick the right tool for every job we have”
  58. AWS: Purpose-built databases Relational Key-value Document In-memory Graph Search Amazon

    DynamoDB Amazon Neptune Amazon RDS Aurora Commercial Community Amazon ElastiCache Amazon Elasticsearch Service Amazon DocumentDB Time series Ledger Amazon Timestream Amazon QLDB Memcached Redis
  59. More information
  60. Additional resources

    Andy Jassy’s re:Invent 2017/2018 keynote: Databases
    2017: https://www.youtube.com/watch?v=1IxDLeFQKPk&feature=youtu.be&t=37m47s
    2018: https://youtu.be/ZOIkOnW640A?t=3238
    Werner Vogels’ blog: A one size fits all database doesn't fit anyone
    https://www.allthingsdistributed.com/2018/06/purpose-built-databases-in-aws.html
    AWS Databases: https://aws.amazon.com/products/databases/
  61. Additional resources https://www.youtube.com/watch?v=hwnNbLXN4vA https://www.youtube.com/watch?v=-pb-DkD6cWg AWS re:Invent 2018: Databases on AWS:

    The Right Tool for the Right Job (DAT205-R1) AWS re:Invent 2018: Building with AWS Databases: Match Your Workload to the Right Database (DAT301)
  62. Thank you!

    Taylor Riggan
    triggan@