Upgrade to Pro — share decks privately, control downloads, hide ads and more …

When Less is More: Scaling Postgres Without Bre...

Richard Yen
December 08, 2021

When Less is More: Scaling Postgres Without Breaking the Bank

When an application scales up, the underlying database often needs to be tweaked to keep up with the incoming demand. Intuitively sysadmins and database admins scale up the hardware and tune configuration parameters to meet higher demand. Unfortunately, it's not always the case that more hardware and resources will lead to better performance. In this talk, we'll cover why scaling up isn't just about putting in more money and proportionally allocating more resources.

Richard Yen

December 08, 2021
Tweet

More Decks by Richard Yen

Other Decks in Technology

Transcript

  1. 2021 Copyright © EnterpriseDB Corporation All Rights Reserved • Support

    Engineer at EDB • Working with a range of customers • Financial institutions • Government organizations • High-tech corporations • Previously worked as a Web developer and PostgreSQL DBA About Me 2
  2. 2021 Copyright © EnterpriseDB Corporation All Rights Reserved Effects of

    Scaling Up • CPU concurrency • Disk I/O • Memory Usage • Network card 3
  3. 2021 Copyright © EnterpriseDB Corporation All Rights Reserved Natural solution:

    Level up 4 • Bigger CPU • More RAM • Better storage
  4. 2021 Copyright © EnterpriseDB Corporation All Rights Reserved Limits of

    CPU Scaling • Xeon Cooper Lake Platinum 8380H has 28 cores for up to 8 sockets (224 cores): $13K each • Cannot expect 200 or even 400 CPU cores to handle 10K concurrent queries • EDB’s Rule of Thumb: 4 clients per CPU • Beware of NUMA cache line contention 5
  5. 2021 Copyright © EnterpriseDB Corporation All Rights Reserved Limits of

    Disk I/O Scaling • Even the fastest disks will still suffer from lock contention on high-concurrency tables • Limited performance improvement on scans of very large tables and indexes • Performance will suffer at large checkpoints and vacuums 6
  6. 2021 Copyright © EnterpriseDB Corporation All Rights Reserved Limits of

    Memory Scaling • OS cache and Postgres cache contention • Large working sets will continue to be affected by invalidation • Large amounts of cached memory will affect performance at checkpoint time 7
  7. 2021 Copyright © EnterpriseDB Corporation All Rights Reserved A Bigger

    Server Isn’t Always the Best Idea • NUMA cache-line contention • OS scheduler and Postmaster overhead • Database lock-up due to transactionID wraparound • Lock contention for writes 8 What are the Risks?
  8. 2021 Copyright © EnterpriseDB Corporation All Rights Reserved Horizontal Scaling:

    Distribute & Diversify 10 • Architecture level (replication) • Systems level (Physical volumes, connection pooling) • Database level (FDW, loose keying) • Data level (partitioning)
  9. 2021 Copyright © EnterpriseDB Corporation All Rights Reserved Tip #1:

    Use replication to your advantage 11 • Read-only replicas • HA Proxy, pgpool-II, or some other kind of multiplexer service • Also serves as safeguard against accidental writes
  10. 2021 Copyright © EnterpriseDB Corporation All Rights Reserved Tip #2:

    Get a connection/transaction pooler 12 • Idle connections are not weightless • pgbouncer transaction pooling • Set max_connections lower • More work_mem per session
  11. 2021 Copyright © EnterpriseDB Corporation All Rights Reserved Tip #3:

    Use multiple physical volumes 13 • pg_wal • log_directory • Tablespaces
  12. 2021 Copyright © EnterpriseDB Corporation All Rights Reserved Tip #4:

    Partition your data 14 • Smaller indexes • Partition pruning • More concurrency • Faster backups • Faster vacuums • Faster sequential scans
  13. 2021 Copyright © EnterpriseDB Corporation All Rights Reserved Tip #5:

    De-normalize with Foreign Data Wrappers / dblink 15 • Distributes processing to other machines • Reduces size of a single database • Faster backups • Possible performance gains for scans • Can use other technologies to improve scan performance (i.e., Redis) • Performance dependent upon network speed • Distribution of data can make autovacuum-for-wrapround less of a risk • zheap may make this tip moot FDW = FTW!
  14. 2021 Copyright © EnterpriseDB Corporation All Rights Reserved Summary 16

    • Big databases sound impressive, but come withddfdffffffff inherent risks • Scale horizontally: Bigger machines for bigger databases doesn’t always lead to better performance • Use partitioning to your advantage • Partition tables • Partition databases (i.e., FDW/dblink, replication) • Split off into tablespaces where possible • Set up transaction pooling where possible