When Less is More: Scaling Postgres Without Breaking the Bank

Slide 1

Slide 1 text

When Less is More Scaling Postgres without Breaking the Bank Richard Yen | 2021 Postgres Build

Slide 2

Slide 2 text

2021 Copyright © EnterpriseDB Corporation All Rights Reserved • Support Engineer at EDB • Working with a range of customers • Financial institutions • Government organizations • High-tech corporations • Previously worked as a Web developer and PostgreSQL DBA About Me 2

Slide 3

Slide 3 text

Slide 4

Slide 4 text

Slide 5

Slide 5 text

2021 Copyright © EnterpriseDB Corporation All Rights Reserved Limits of CPU Scaling • Xeon Cooper Lake Platinum 8380H has 28 cores for up to 8 sockets (224 cores): $13K each • Cannot expect 200 or even 400 CPU cores to handle 10K concurrent queries • EDB’s Rule of Thumb: 4 clients per CPU • Beware of NUMA cache line contention 5

Slide 6

Slide 6 text

2021 Copyright © EnterpriseDB Corporation All Rights Reserved Limits of Disk I/O Scaling • Even the fastest disks will still suffer from lock contention on high-concurrency tables • Limited performance improvement on scans of very large tables and indexes • Performance will suffer at large checkpoints and vacuums 6

Slide 7

Slide 7 text

2021 Copyright © EnterpriseDB Corporation All Rights Reserved Limits of Memory Scaling • OS cache and Postgres cache contention • Large working sets will continue to be affected by invalidation • Large amounts of cached memory will affect performance at checkpoint time 7

Slide 8

Slide 8 text

2021 Copyright © EnterpriseDB Corporation All Rights Reserved A Bigger Server Isn’t Always the Best Idea • NUMA cache-line contention • OS scheduler and Postmaster overhead • Database lock-up due to transactionID wraparound • Lock contention for writes 8 What are the Risks?

Slide 9

Slide 9 text

"Don't put all your eggs in one basket"

Slide 10

Slide 10 text

2021 Copyright © EnterpriseDB Corporation All Rights Reserved Horizontal Scaling: Distribute & Diversify 10 ● Architecture level (replication) ● Systems level (Physical volumes, connection pooling) ● Database level (FDW, loose keying) ● Data level (partitioning)

Slide 11

Slide 11 text

2021 Copyright © EnterpriseDB Corporation All Rights Reserved Tip #1: Use replication to your advantage 11 ● Read-only replicas ● HA Proxy, pgpool-II, or some other kind of multiplexer service ● Also serves as safeguard against accidental writes

Slide 12

Slide 12 text

2021 Copyright © EnterpriseDB Corporation All Rights Reserved Tip #2: Get a connection/transaction pooler 12 ● Idle connections are not weightless ● pgbouncer transaction pooling ● Set max_connections lower ● More work_mem per session

Slide 13

Slide 13 text

Slide 14

Slide 14 text

2021 Copyright © EnterpriseDB Corporation All Rights Reserved Tip #4: Partition your data 14 ● Smaller indexes ● Partition pruning ● More concurrency ● Faster backups ● Faster vacuums ● Faster sequential scans

Slide 15

Slide 15 text

2021 Copyright © EnterpriseDB Corporation All Rights Reserved Tip #5: De-normalize with Foreign Data Wrappers / dblink 15 ● Distributes processing to other machines ● Reduces size of a single database ● Faster backups ● Possible performance gains for scans ● Can use other technologies to improve scan performance (i.e., Redis) ● Performance dependent upon network speed ● Distribution of data can make autovacuum-for-wrapround less of a risk ● zheap may make this tip moot FDW = FTW!

Slide 16

Slide 16 text

2021 Copyright © EnterpriseDB Corporation All Rights Reserved Summary 16 ● Big databases sound impressive, but come withddfdffffffff inherent risks ● Scale horizontally: Bigger machines for bigger databases doesn’t always lead to better performance ● Use partitioning to your advantage ● Partition tables ● Partition databases (i.e., FDW/dblink, replication) ● Split off into tablespaces where possible ● Set up transaction pooling where possible