Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Why Databases Do Not 'Cloud' Well

David Stokes
September 05, 2023

Why Databases Do Not 'Cloud' Well

Containers are great for ephemeral applications where you do not need to save state. Databases, as a general rule, are not useful if ephemeral. So try to set up persistent storage for these databases, where you can 'claim' disk storage that is hopefully somewhat independent of the pod lifecycle.

But what about those cases where you are starting with no active database instance? Or do you need to bring in a new database server into a database cluster and make sure it is populated with the latest and greatest data? How do you accelerate these new systems into being productive? There are ways to seed caches so that there is a minimal warm-up period that can greatly reduce the time needed to have a productive system. And there are tools like MySQL's InnoDB Clone plug-in that can rapidly populate a barebones system.

And you need a way to observe the performance of these database instances. What factors do you need to watch such as cache hits and disk throughput are obvious performance factors. But there are other factors such as queries not using indexes that are performance killers or redundant data structures that consume memory excessively.

This is a database in the cloud talk for folks who are not database folks. You will discover the wild and wooly world of databases where they can be tamed to work in the cloud. There are a lot of factors that need to be considered for efficient database operations if you want smooth and fast performance from your database instance

David Stokes

September 05, 2023
Tweet

More Decks by David Stokes

Other Decks in Technology

Transcript

  1. © Copyright 2023 Percona® LLC. All rights reserved Why Databases

    Don't "cloud" Well and What To Do About It! Containers are great for ephemeral applications where you do not need to save state. Databases, as a general rule, are not useful if ephemeral. So try to set up persistent storage for these databases, where you can 'claim' disk storage that is hopefully somewhat independent of the pod lifecycle. But what about those cases where you are starting with no active database instance? Or do you need to bring in a new database server into a database cluster and make sure it is populated with the latest and greatest data? How do you accelerate these new systems into being productive? There are ways to seed caches so that there is a minimal warm-up period that can greatly reduce the time needed to have a productive system. And there are tools like MySQL's InnoDB Clone plug-in that can rapidly populate a barebones system. And you need a way to observe the performance of these database instances. What factors do you need to watch such as cache hits and disk throughput are obvious performance factors. But there are other factors such as queries not using indexes that are performance killers or redundant data structures that consume memory excessively. This is a database in the cloud talk for folks who are not database folks. You will discover the wild and wooly world of databases where they can be tamed to work in the cloud. There are a lot of factors that need to be considered for efficient database operations if you want smooth and fast performance from your database instance 3
  2. © Copyright 2023 Percona® LLC. All rights reserved About Me

    and About Percona 4 [email protected] Technology Evangelist Percona is an open source database software, support, and services company that helps make databases and applications run better.
  3. © Copyright 2023 Percona® LLC. All rights reserved You will

    discover the wild and wooly world of databases where they can be tamed to work in the cloud. There are a lot of factors that need to be considered for efficient database operations if you want smooth and fast performance from your database instance This is a database in the cloud talk for folks who are not database folks. 5
  4. © Copyright 2023 Percona® LLC. All rights reserved Audience Participation

    Time! What is the FIRST answer if you ask Wikipedia about cloud computing???????
  5. © Copyright 2023 Percona® LLC. All rights reserved 10 Cloud

    Computing: What does it mean? If we look up the term on Wikipedia we find this -> (after the horse) Cloud computing[1] is the on-demand availability of computer system resources, especially data storage (cloud storage) and computing power, without direct active management by the user.[2] Large clouds often have functions distributed over multiple locations, each of which is a data center. Cloud computing relies on sharing of resources to achieve coherence and typically uses a pay-as-you-go model, which can help in reducing capital expenses but may also lead to unexpected operating expenses for users.[3] https://en.wikipedia.org/wiki/Cloud_computing
  6. © Copyright 2023 Percona® LLC. All rights reserved 11 Cloud

    architecture,[86] the systems architecture of the software systems involved in the delivery of cloud computing, typically involves multiple cloud components communicating with each other over a loose coupling mechanism such as a messaging queue. Elastic provision implies intelligence in the use of tight or loose coupling as applied to mechanisms such as these and others.
  7. © Copyright 2023 Percona® LLC. All rights reserved 12 In

    software engineering, containerization is operating system-level virtualization or application-level virtualization over multiple network resources so that software applications can run in isolated user spaces called containers in any cloud or non-cloud environment, regardless of type or vendor.[1] Kubernetes (/ˌk(j)uːbərˈnɛtɪs, -ˈneɪtɪs, -ˈneɪtiːz, -ˈnɛtiːz/, commonly abbreviated K8s[3]) is an open-source container orchestration system for automating software deployment, scaling, and management.[4][5] Originally designed by Google, the project is now maintained by the Cloud Native Computing Foundation. Containers and Kubernetes
  8. © Copyright 2023 Percona® LLC. All rights reserved Databases are

    not supposed to be stateless, ephemeral Use either a) Persistent storage or b) DBaaS or c) A Really Good Operator & Persistent Storage 14 Databases do not cloud container well
  9. © Copyright 2023 Percona® LLC. All rights reserved 16 Originated

    in the 1970s Permanent store Data broken down into relations - customer info, sale info, product info Very complex systems Transactional Relational Databases
  10. © Copyright 2023 Percona® LLC. All rights reserved Permanence -

    You want your data to ‘stick around’ Need to warm up, seed caches ‘Query Plans’, how to most efficient way to retrieve data, needs statistics on location • This can drastically change as data churns • Outdated statistics lead to bad query plans • Improper indexing of rows can swell execution times Making new copies of instances not instantaneous Sharding, replication need reliable infrastructure Database issues 17
  11. © Copyright 2023 Percona® LLC. All rights reserved 18 Lorry

    / Database Analogy Heavy Load handled easily Takes some time to accelerate to speed limit Needs room to slow down non-catastrophically Changing metadata at speed is very tricky
  12. © Copyright 2023 Percona® LLC. All rights reserved 19 HA

    = Continuous operation of database systems != 100% uptime • No SPOF, Redundancy, load balancing, Failure detection, & Failover FT = No Downtime • Redundant redundants, alternative environments, EXPENSIVE High Availability OR Fault Tolerance
  13. © Copyright 2023 Percona® LLC. All rights reserved 20 This

    is MySQL’s InnoDB cluster it uses a two-phase commit - this takes time If a primary fails, a secondary can take over Your applications need to check return codes to make sure that the data was uploaded.
  14. © Copyright 2023 Percona® LLC. All rights reserved 22 [mysqld]

    innodb_buffer_pool_dump_pct=40 innodb_buffer_pool_load_at_startup is enabled by default Save at shutdown Load at startup Caches
  15. © Copyright 2023 Percona® LLC. All rights reserved The clone

    plugin, introduced in MySQL 8.0.17, permits cloning data locally or from a remote MySQL server instance. Cloned data is a physical snapshot of data stored in InnoDB that includes schemas, tables, tablespaces, and data dictionary metadata. The cloned data comprises a fully functional data directory, which permits using the clone plugin for MySQL server provisioning. 23 MySQL’s Clone Plug-in
  16. © Copyright 2023 Percona® LLC. All rights reserved 24 Joe

    16 Mary 22 Wu 12 Taha 7 Mac 99 Taha 7 Wu 12 Joe 16 Mary 22 Mac 99 Logical representation Two shards
  17. © Copyright 2023 Percona® LLC. All rights reserved 25 As

    the complexity grows … The harder it is to keep track of all the pieces!
  18. © Copyright 2023 Percona® LLC. All rights reserved Quiz Time!

    Which query runs faster? Find all customers purchases, sort by descending order select customer_id, sum(price) as sales from customer group by customer_id order by sales desc; select customer_id, sum(price) as sales from customer group by customer_id order by sales desc LIMIT 10; Find the top 10 customer purchases, sort by descending order 28
  19. © Copyright 2023 Percona® LLC. All rights reserved Any aggregate

    work takes come clock time Use KEYs Does an Ad Hoc query during the middle of the day make sense? (read only replicas?) 29 1. Be reasonable about your queries
  20. © Copyright 2023 Percona® LLC. All rights reserved Slow Chews

    up disk space If you need to do it - do it! 30 2. Schema changes are ex-pen-sive!
  21. © Copyright 2023 Percona® LLC. All rights reserved Remove redundant

    indexes Check for popular queries w/o indexes Check query plans from time-to-time Have great backups 3. Housekeeping 31
  22. © Copyright 2023 Percona® LLC. All rights reserved Rebuild statistics

    Minor system impact 3. Run ANALYZE on your tables 32
  23. © Copyright 2023 Percona® LLC. All rights reserved Lock records

    for as short as possible time Look for long running transactions and eliminate them 4. Transactions 33
  24. © Copyright 2023 Percona® LLC. All rights reserved You can

    not watch your database instances 7x24x265 by yourself. A tool like Percona Monitoring and Management lets you look back in time, will send alerts, and provide insights into the future. 5. Monitor 34