Amazon Aurora - Let's Talk About Performance

Presentation from AWS User Group (2015.09.24) in Warsaw by Danilo Poccia from Amazon Web Services.

AWS User Group Poland

September 24, 2015

Transcript

  1. ©2015, Amazon Web Services, Inc. or its affiliates. All rights reserved.
     Amazon Aurora: Let's Talk About Performance
     Danilo Poccia, AWS Technical Evangelist @danilop danilop
  2. Current DB architectures are monolithic
     Multiple layers of functionality all on a single box
     [Diagram: one stack of SQL, Transactions, Caching, Logging]
  3. Current DB architectures are monolithic
     Even when you scale it out, you’re still replicating the same stack
     [Diagram: Application; two stacks of SQL, Transactions, Caching, Logging]
  4. Current DB architectures are monolithic
     Even when you scale it out, you’re still replicating the same stack
     [Diagram: Application; two stacks of SQL, Transactions, Caching, Logging]
  5. Current DB architectures are monolithic
     Even when you scale it out, you’re still replicating the same stack
     [Diagram: Application; two stacks of SQL, Transactions, Caching, Logging; Storage]
  6. Reimagining the relational database
     What if you were inventing the database today? You wouldn’t design it the way we did in 1970. At least not entirely. You’d build something that can scale out, that is self-healing, and that leverages existing AWS services.
  7. Relational databases reimagined for the cloud
     • Speed and availability of high-end commercial databases
     • Simplicity and cost-effectiveness of open source databases
     • Drop-in compatibility with MySQL
     • Simple pay as you go pricing
     Delivered as a managed service
  8. A service-oriented architecture applied to the database
     • Moved the logging and storage layer into a multi-tenant, scale-out database-optimized storage service
     • Integrated with other AWS services like Amazon EC2, Amazon VPC, Amazon DynamoDB, Amazon SWF, and Amazon Route 53 for control plane operations
     • Integrated with Amazon S3 for continuous backup with 99.999999999% durability
     [Diagram labels: Control Plane, Data Plane, Amazon DynamoDB, Amazon SWF, Amazon Route 53, Logging + Storage, SQL, Transactions, Caching, Amazon S3]
  9. Simple pricing
     • No licenses
     • No lock-in
     • Pay only for what you use
     Discounts
     • Up to 45% with a 1-year RI
     • Up to 66% with a 3-year RI
     Instance         vCPU   Mem (GiB)   Hourly price
     db.r3.large         2       15.25          $0.29
     db.r3.xlarge        4       30.5           $0.58
     db.r3.2xlarge       8       61             $1.16
     db.r3.4xlarge      16      122             $2.32
     db.r3.8xlarge      32      244             $4.64
     • Storage consumed, up to 64 TB, is $0.10/GB-month
     • IOs consumed are billed at $0.20 per million I/O
     • Prices are for Virginia
     Enterprise grade, open source pricing
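The pricing rules on this slide can be turned into a rough monthly estimate. The sketch below uses only the 2015 Virginia figures quoted above; the function name `monthly_cost` and the 730-hour month are assumptions made for illustration, and Reserved Instance discounts are not modeled.

```python
# On-demand hourly prices from the slide (Virginia, 2015).
HOURLY_PRICE_USD = {
    "db.r3.large": 0.29,
    "db.r3.xlarge": 0.58,
    "db.r3.2xlarge": 1.16,
    "db.r3.4xlarge": 2.32,
    "db.r3.8xlarge": 4.64,
}
STORAGE_USD_PER_GB_MONTH = 0.10   # from the slide
IO_USD_PER_MILLION = 0.20         # from the slide
HOURS_PER_MONTH = 730             # assumption: average month length in hours


def monthly_cost(instance, storage_gb, io_requests, hours=HOURS_PER_MONTH):
    """Estimate one month's bill in USD for a single instance (hypothetical helper)."""
    compute = HOURLY_PRICE_USD[instance] * hours
    storage = storage_gb * STORAGE_USD_PER_GB_MONTH
    io = io_requests / 1_000_000 * IO_USD_PER_MILLION
    return round(compute + storage + io, 2)


if __name__ == "__main__":
    # db.r3.large, 100 GB stored, 50 million I/Os in the month
    print(monthly_cost("db.r3.large", 100, 50_000_000))
```

For the example call, compute is 0.29 × 730, storage is 100 × 0.10, and I/O is 50 × 0.20, so the three terms can be checked by hand against the slide.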
  10. Establishing our ecosystem
      Business Intelligence, Data Integration, Query and Monitoring, SI and Consulting
      “It is great to see Amazon Aurora remains MySQL compatible; MariaDB connectors work with Aurora seamlessly. Today, customers can take MariaDB Enterprise with MariaDB MaxScale drivers and connect to Aurora, MariaDB, or MySQL without worrying about compatibility. We look forward to working with the Aurora team in the future to further accelerate innovation within the MySQL ecosystem.”
      Roger Levy, VP Products, MariaDB
  11. Achieving near zero downtime migration to Aurora
      1 – Establish baseline
          a) MySQL dump/import
          b) RDS MySQL to Aurora DB snapshot migration
      2 – Catch-up changes
          a) Binlog replication
          b) Tungsten Replicator
      [Diagram: MySQL and Aurora, labeled 1 - Baseline and 2 - Replication]
  12. Aurora storage
      • Highly available by default
        – 6-way replication across 3 AZs
        – 4 of 6 write quorum
          • Automatic fallback to 3 of 4 if an Availability Zone (AZ) is unavailable
        – 3 of 6 read quorum
      • SSD, scale-out, multi-tenant storage
        – Seamless storage scalability
        – Up to 64 TB database size
        – Only pay for what you use
      • Log-structured storage
        – Many small segments, each with their own redo logs
        – Log pages used to generate data pages
        – Eliminates chatter between database and storage
      [Diagram labels: SQL, Transactions, Caching, AZ 1, AZ 2, AZ 3, Amazon S3]
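The quorum sizes on this slide (6 copies, 4 to write, 3 to read) satisfy the two standard quorum conditions: every read set overlaps every write set, and two write sets always overlap. The slide does not state these conditions itself, so the check below is a sketch of the general rule, with a hypothetical function name.

```python
def quorums_are_consistent(copies, write_quorum, read_quorum):
    """Check the standard quorum overlap rules (illustrative helper, not an Aurora API)."""
    # Any read quorum must share at least one copy with any write quorum.
    reads_see_latest_write = read_quorum + write_quorum > copies
    # Any two write quorums must share at least one copy.
    writes_cannot_conflict = 2 * write_quorum > copies
    return reads_see_latest_write and writes_cannot_conflict


if __name__ == "__main__":
    print(quorums_are_consistent(6, 4, 3))  # the configuration from the slide
    print(quorums_are_consistent(6, 3, 3))  # a weaker write quorum, for contrast
```

With 6 copies, dropping the write quorum to 3 would let a 3-copy read miss a 3-copy write entirely, which is why the second call fails the check.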
  13. Self-healing, fault-tolerant
      • Losing two copies, or an entire AZ, has no impact on read or write availability
      • Losing three copies has no impact on read availability
      • Automatic detection, replication, and repair
      [Diagrams: "Read and write availability" and "Read availability", each showing SQL, Transaction, Caching over AZ 1, AZ 2, AZ 3]
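The availability claims on this slide follow from the quorum sizes given on the previous one. The sketch below is a simplified model, assuming the six copies are spread evenly so that one AZ holds two of them; it ignores the automatic fallback to a 3 of 4 write quorum, and the function name is hypothetical.

```python
COPIES = 6         # from the storage slide
WRITE_QUORUM = 4   # from the storage slide
READ_QUORUM = 3    # from the storage slide
COPIES_PER_AZ = 2  # assumption: 6 copies spread evenly across 3 AZs


def availability(lost_copies):
    """Classify what the volume can still do after losing some copies."""
    healthy = COPIES - lost_copies
    if healthy >= WRITE_QUORUM:
        return "read and write"
    if healthy >= READ_QUORUM:
        return "read only"
    return "unavailable"


if __name__ == "__main__":
    print(availability(COPIES_PER_AZ))  # a whole AZ is lost
    print(availability(3))              # three copies are lost
```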
  14. Instant crash recovery
      Traditional databases
      • Have to replay logs since the last checkpoint
      • Single-threaded in MySQL; requires a large number of disk accesses
      Amazon Aurora
      • Underlying storage replays redo records on demand as part of a disk read
      • Parallel, distributed, asynchronous
      [Diagram, traditional: Checkpointed Data, Redo Log. Crash at T0 requires a re-application of the SQL in the redo log since last checkpoint]
      [Diagram, Aurora: Crash at T0 will result in redo logs being applied to each segment on demand, in parallel, asynchronously]
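The idea of replaying redo records on demand, rather than all at once after a crash, can be illustrated with a toy model. This is not how Aurora's storage is implemented; the `Segment` class and its fields are invented for the sketch, and it only shows that a read can apply a segment's outstanding redo records just before returning data.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Toy storage segment with its own redo log (hypothetical, for illustration)."""
    page: dict = field(default_factory=dict)          # materialized key -> value
    pending_redo: list = field(default_factory=list)  # (key, value) records not yet applied

    def log(self, key, value):
        # A write only appends a redo record; the page is not touched yet.
        self.pending_redo.append((key, value))

    def read(self, key):
        # Redo records are applied lazily, in log order, as part of the read.
        for k, v in self.pending_redo:
            self.page[k] = v
        self.pending_redo.clear()
        return self.page.get(key)


if __name__ == "__main__":
    segment = Segment()
    segment.log("row1", "v1")
    segment.log("row1", "v2")
    # After a simulated crash nothing is replayed up front; the first read does it.
    print(segment.read("row1"))
```

Because each segment carries its own log, many such segments could be brought up to date independently, which is the property the slide describes as parallel and asynchronous.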
  15. Survivable caches
      • We moved the cache out of the database process
      • Cache remains warm in the event of a database restart
      • Lets you resume fully loaded operations much faster
      • Instant crash recovery + survivable cache = quick and easy recovery from DB failures
      [Diagram: SQL, Transactions, Caching. Caching process is outside the DB process and remains warm across a database restart]
  16. Multiple failover targets: no data loss
      [Diagram, Aurora: Aurora Master (30% Read, 70% Write), Aurora Replica (100% New Reads), Shared Multi-AZ Storage, Page cache invalidation]
      [Diagram, MySQL: MySQL Master (30% Read, 70% Write), MySQL Replica (30% New Reads, 70% Write), Data Volume, Data Volume, Single-threaded binlog apply]
      MySQL read scaling
      • Replicas must replay logs
      • Replicas place additional load on master
      • Replica lag can grow indefinitely
      • Failover results in data loss
  17. Faster, more predictable failover
      [Diagram: two failover timelines with the stages DB Failure, Failure Detection, DNS Propagation, Recovery, and App running]
  18. Simulate failures using SQL
      • To cause the failure of a component at the database node:
        ALTER SYSTEM CRASH [{INSTANCE | DISPATCHER | NODE}]
      • To simulate the failure of disks:
        ALTER SYSTEM SIMULATE percent_failure DISK failure_type IN [DISK index | NODE index] FOR INTERVAL interval
      • To simulate the failure of networking:
        ALTER SYSTEM SIMULATE percent_failure NETWORK failure_type [TO {ALL | read_replica | availability_zone}] FOR INTERVAL interval
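A test harness might assemble these statements programmatically before sending them over a MySQL client connection. The helper below is a hypothetical sketch that covers only the `ALTER SYSTEM CRASH` form exactly as written on the slide; it builds the text and does not connect to or execute anything against a database.

```python
# The three optional targets listed on the slide.
CRASH_TARGETS = {"INSTANCE", "DISPATCHER", "NODE"}


def crash_statement(target=None):
    """Build an ALTER SYSTEM CRASH statement (hypothetical helper, string only)."""
    if target is None:
        # The target is optional in the slide's syntax.
        return "ALTER SYSTEM CRASH"
    target = target.upper()
    if target not in CRASH_TARGETS:
        raise ValueError(f"unknown crash target: {target}")
    return f"ALTER SYSTEM CRASH {target}"


if __name__ == "__main__":
    print(crash_statement("dispatcher"))
```

Restricting the target to a fixed set keeps arbitrary input out of the statement text, which matters if the target name ever comes from test configuration.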
  19. Write performance (console screenshot)
      • MySQL Sysbench
      • R3.8XL with 32 cores and 244 GB RAM
      • 4 client machines with 1,000 threads each
  20. Read performance (console screenshot)
      • MySQL Sysbench
      • R3.8XL with 32 cores and 244 GB RAM
      • Single client with 1,000 threads
  21. Read replica lag (console screenshot)
      • Aurora Replica with 7.27 ms replica lag at 13.8 K updates per second
      • MySQL 5.6 on the same hardware has ~2 s lag at 2 K updates per second
  22. Writes scale with table count
      [Chart: Write performance and table count; x-axis: number of tables; y-axis: thousands of writes per second]
      Writes per second:
      Tables   Amazon Aurora   MySQL I2.8XL local SSD   MySQL I2.8XL RAM disk   RDS MySQL 30K IOPS (single AZ)
      10       60,000          18,000                   22,000                  25,000
      100      66,000          19,000                   24,000                  23,000
      1,000    64,000           7,000                   18,000                   8,000
      10,000   54,000           4,000                    8,000                   5,000
      Write-only workload; 1,000 connections; query cache (default on for Amazon Aurora, off for MySQL)
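The gap in this table is easier to see as a ratio. The sketch below copies the slide's figures and, for each table count, divides Aurora's write rate by the best of the three MySQL configurations; the dictionary keys and the function name are labels chosen for illustration.

```python
# Writes per second, copied from the slide's table.
WRITES_PER_SECOND = {
    10:     {"aurora": 60_000, "mysql_local_ssd": 18_000, "mysql_ram_disk": 22_000, "rds_mysql": 25_000},
    100:    {"aurora": 66_000, "mysql_local_ssd": 19_000, "mysql_ram_disk": 24_000, "rds_mysql": 23_000},
    1_000:  {"aurora": 64_000, "mysql_local_ssd": 7_000,  "mysql_ram_disk": 18_000, "rds_mysql": 8_000},
    10_000: {"aurora": 54_000, "mysql_local_ssd": 4_000,  "mysql_ram_disk": 8_000,  "rds_mysql": 5_000},
}


def aurora_speedup(tables):
    """Ratio of Aurora's write rate to the fastest non-Aurora configuration."""
    row = WRITES_PER_SECOND[tables]
    best_other = max(rate for name, rate in row.items() if name != "aurora")
    return row["aurora"] / best_other


if __name__ == "__main__":
    for tables in WRITES_PER_SECOND:
        print(tables, round(aurora_speedup(tables), 2))
```

On these figures the ratio grows with the table count, because Aurora's write rate stays roughly flat while each MySQL configuration falls off.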
  23. Better concurrency
      [Chart: Write performance and concurrency; x-axis: concurrent connections; y-axis: thousands of writes per second]
      Writes per second:
      Connections   Amazon Aurora   RDS MySQL 30K IOPS (single AZ)
      50             40,000         10,000
      500            71,000         21,000
      5,000         110,000         13,000
      OLTP workload; variable connection count; 250 tables; query cache (default on for Amazon Aurora, off for MySQL)
  24. Replicas have up to 400 times less lag
      [Chart: Read replica lag; x-axis: updates per second; y-axis: read replica lag in milliseconds]
      Updates per second   Amazon Aurora   RDS MySQL 30K IOPS (single AZ)
      1,000                2.62 ms           0 s
      2,000                3.42 ms           1 s
      5,000                3.94 ms          60 s
      10,000               5.38 ms         300 s
      Write workload; 250 tables; query cache on for Amazon Aurora, off for MySQL (best settings)
  25. Simplify database management
      • Create a database in minutes
      • Automated patching
      • Push-button scale compute
      • Continuous backups to Amazon S3
      • Automatic failure detection and failover
      Amazon RDS
  26. Simplify storage management
      • Read replicas are available as failover targets (no data loss)
      • Instantly create user snapshots (no performance impact)
      • Continuous, incremental backups to Amazon S3
      • Automatic storage scaling up to 64 TB (no performance or availability impact)
      • Automatic restriping, mirror repair, hot spot management, encryption
  27. Simplify data security
      • Encryption to secure data at rest available soon
        – AES-256; hardware accelerated
        – All blocks on disk and in Amazon S3 are encrypted
        – Key management via AWS KMS
      • SSL to secure data in transit
      • Network isolation via Amazon VPC by default
      • No direct access to nodes
      • Supports industry standard security and data protection certifications
      [Diagram labels: Application, SQL, Transactions, Caching, Storage, Amazon S3]
  28. Business Value
      • Enables new applications and features
      • Improved developer and IT productivity
      • Simple, easy to manage infrastructure with less downtime
      • Quick recovery from failures
      • Low cost (1/10th of commercial databases)
  29. Use Cases
      • New applications
      • All MySQL applications
      • High traffic web applications using relational databases
      • Read/write intensive databases
      • SaaS applications and multitenant products
      • Existing RDS customers with volume size limitations (up to 64 TB with Aurora)
      • Commercial database applications that do not heavily rely on vendor-specific functionality (PL/SQL)