Visualizing Postgres I/O Performance for Development

Melanie Plageman

June 12, 2023
Transcript

  1. Visualizing Postgres I/O
    Performance for
    Development
    Melanie Plageman

  2. Total TPS != User Experience
    Total TPS: 22,029 vs. 21,861

  3. View Performance Metrics over Time

  4. Use Multiple Systems and Tools
    to Gather Information

  5. Storage Stack Layers

  6. Metrics Sources
    - Postgres
    - pg_stat_io
    - pg_buffercache_summary
    - pg_stat_wal
    - pg_stat_activity waits
    - pg_total_relation_size()
    - Operating System
    - /proc/meminfo
    - pidstat
    - iostat
    - Benchmark
    - pgbench latency
    - pgbench TPS
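
    One way to sample several of these sources together is a small shell loop (a rough sketch; the database name, the 5-second interval, and the pg_buffercache extension being installed are assumptions):

      while true; do
        psql -d postgres -c "SELECT now(), * FROM pg_stat_io WHERE backend_type = 'client backend'"
        psql -d postgres -c "SELECT * FROM pg_buffercache_summary()"     # requires the pg_buffercache extension
        psql -d postgres -c "SELECT wal_records, wal_fpi, wal_bytes, wal_buffers_full, wal_sync FROM pg_stat_wal"
        psql -d postgres -c "SELECT wait_event_type, wait_event, count(*) FROM pg_stat_activity GROUP BY 1, 2"
        grep -E 'MemFree|Dirty' /proc/meminfo    # OS memory state
        iostat -x 1 1                            # per-device request size, queue depth, utilization
        pidstat -d 1 1                           # per-process I/O
        sleep 5
      done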

  7. Benchmark Setup For Scenarios
    • 16 core, 32 thread AMD CPU
    • Linux 5.19
    • Sabrent Rocket NVMe 4.0 2TB (seq r/w 5000/4400 MBps, random r/w
    750000 IOPS)
    • ext4 with noatime,data=writeback
    • 64 GB RAM
    • 2 MB huge pages
    • Postgres compiled from source with -O2
    • pgbench
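
    A sketch of how a machine and build like this might be prepared (the device, paths, and huge page count are illustrative placeholders):

      sudo sysctl vm.nr_hugepages=11000                            # 2 MB huge pages for shared buffers
      sudo mount -o noatime,data=writeback /dev/nvme0n1p1 /mnt/pg  # ext4 mount options from above
      ./configure --prefix=$HOME/pg && make -j32 && make install   # default optimization level is -O2
      $HOME/pg/bin/initdb -D /mnt/pg/data
      echo "huge_pages = on" >> /mnt/pg/data/postgresql.conf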

  8. Using Metrics Together to
    Understand the Why

  9. backend_flush_after

  10. backend_flush_after 1MB finishes faster
    pgbench, 10 MB file COPY
    16 clients
    700 transactions
    20 GB shared buffers
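
    A sketch of driving this scenario; the table name, CSV path, and script name are made up, and the flag values mirror the numbers above:

      psql -c "ALTER SYSTEM SET backend_flush_after = '1MB'"    # compare against the default of 0
      psql -c "SELECT pg_reload_conf()"
      echo "COPY copy_target FROM '/tmp/data_10mb.csv' (FORMAT csv);" > copy_10mb.sql   # copy_target must already exist
      pgbench -n -c 16 -j 16 -t 700 -f copy_10mb.sql -P 5 postgres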

  11. More backend writebacks

  12. Latency spikes without backend_flush_after
    as queue fills up

  13. Kernel writing out dirty data

  14. Initial TPS dip likely caused by memory
    pressure. Free memory hits 0

  15. Second dip coincides with checkpoint

  16. Using Metrics to Clarify other
    Metrics

  17. wal_compression

  18. Fewer Transactions without wal_compression
    pgbench, TPCB-like built-in, mode=prepared
    data scale 4000
    16 clients
    600 seconds
    20 GB shared buffers
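
    Roughly, the two runs could differ only in wal_compression (flag values mirror the slide); checking pg_stat_wal afterwards ties into the full-page-image and WAL-volume slides that follow:

      psql -c "ALTER SYSTEM SET wal_compression = on"      # compare against 'off'
      psql -c "SELECT pg_reload_conf()"
      pgbench -i -s 4000 postgres                          # data scale 4000
      pgbench -M prepared -c 16 -j 16 -T 600 -P 5 postgres
      psql -c "SELECT wal_fpi, wal_bytes, wal_sync FROM pg_stat_wal"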

  19. Higher latency and lower TPS without WAL
    compression

  20. Fewer Full Page Writes without
    wal_compression because there are fewer transactions

  21. Fewer writes/second without WAL
    compression

  22. Higher write throughput without WAL
    compression, so the writes are larger

  23. WAL bytes higher without WAL compression,
    so the increased writes were WAL I/O

  24. WAL syncs much higher without compression,
    so additional flush requests are WAL

  25. Backends doing fewer writes and reads without
    compression. Bottlenecked on WAL I/O

  26. Benchmark Setup For Correct
    Comparisons

  27. initdb before every benchmark

  28. Without doing initdb first, 2000 COPY FROMs
    complete sooner
    pgbench, 1 MB file COPY
    16 clients
    2000 transactions
    20 GB shared buffers
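
    One way to make runs comparable is to recreate the cluster before each one (paths and the setup/COPY scripts are placeholders):

      pg_ctl -D "$PGDATA" stop
      rm -rf "$PGDATA" && initdb -D "$PGDATA"
      pg_ctl -D "$PGDATA" start
      psql -f setup.sql postgres                         # recreate the schema / load fixtures
      pgbench -n -c 16 -t 2000 -f copy_1mb.sql postgres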

  29. More flush requests issued after just having
    done initdb

  30. Waiting for WAL Init Sync after having done
    initdb

  31. TPS dip at 40 seconds corresponds with
    running out of system memory

  32. Increase wal_segment_size to 1GB, COPY FROMs
    take much longer, TPS is very spiky after initdb
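
    wal_segment_size is set when the cluster is created, so this variant is configured through initdb (the size is in megabytes); the WAL-related waits can then be watched in pg_stat_activity (a sketch, $PGDATA is a placeholder):

      initdb -D "$PGDATA" --wal-segsize=1024
      # WALInitWrite / WALInitSync waits show up here while new segments are allocated
      psql -c "SELECT wait_event, count(*) FROM pg_stat_activity WHERE wait_event_type = 'IO' GROUP BY 1"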

  33. Fewer flush requests because each one takes
    longer and WAL file allocation takes longer with
    bigger WAL segment size

  34. With reduced min_wal_size and pause after
    loading data, performance without initdb is similar
    to with initdb

  35. The number of flush requests is the same as
    with initdb

  36. WAL Init Sync Waits with pause and
    decreased min_wal_size

  37. Benchmark Configuration Choices

  38. prepared vs simple
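
    The comparison is just the pgbench query-mode flag; in simple mode each statement is parsed and planned on every execution, which is where the extra CPU in the later slide comes from (other flag values are illustrative):

      pgbench -M prepared -c 16 -j 16 -T 600 -P 5 postgres
      pgbench -M simple   -c 16 -j 16 -T 600 -P 5 postgres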

  39. Higher TPS with prepared query mode vs
    simple

  40. Additional CPU usage with simple query
    mode

  41. Benchmark Choice and Reflecting
    Customer Workloads

  42. data access distribution

  43. Gaussian data access distribution often performs
    better than uniform random access and is similar
    to real workloads
    pgbench, TPCB-like built-in and custom, mode=prepared, sync commit = off
    data scale 4200
    16 clients
    500 seconds
    20 GB shared buffers
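
    A sketch of a custom script using pgbench's gaussian random function against a pgbench-initialized database (the shape parameter 2.5 and the read/write mix are illustrative; the built-in TPC-B-like script uses uniform random access):

      printf '%s\n' \
        '\set aid random_gaussian(1, 100000 * :scale, 2.5)' \
        '\set delta random(-5000, 5000)' \
        'BEGIN;' \
        'UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;' \
        'SELECT abalance FROM pgbench_accounts WHERE aid = :aid;' \
        'END;' > gauss.sql
      PGOPTIONS='-c synchronous_commit=off' pgbench -n -M prepared -c 16 -j 16 -T 500 -f gauss.sql postgres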

  44. Uniform random access does more reads and
    writes because working set doesn’t fit in memory

  45. Usage count is low for random data access
    distribution

  46. Backend cache hit ratio is worse for uniform
    random access
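
    A hit ratio and usage count distribution like these can be pulled from pg_stat_io and pg_buffercache (a sketch; shared-buffer I/O on ordinary relations by client backends only):

      psql -c "SELECT hits, reads, evictions, round(hits::numeric / nullif(hits + reads, 0), 4) AS hit_ratio
               FROM pg_stat_io
               WHERE backend_type = 'client backend' AND object = 'relation' AND context = 'normal'"
      psql -c "SELECT usagecount, count(*) FROM pg_buffercache GROUP BY 1 ORDER BY 1"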

  47. More evictions of shared buffers

  48. Backends are doing more reads and writes

  49. Determine when System
    Configurations Matter

  50. readahead

  51. read_ahead_kb
    target readahead = sequential BW * latency
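
    As a rough worked example with made-up numbers: 5000 MB/s of sequential bandwidth at ~0.1 ms of latency gives a target of about 500 KB, and with 1 ms of added latency it grows to about 5 MB. Readahead is set per block device (the device name is a placeholder):

      cat /sys/block/nvme0n1/queue/read_ahead_kb                 # current value, in KB
      echo 2048 | sudo tee /sys/block/nvme0n1/queue/read_ahead_kb
      sudo blockdev --setra 4096 /dev/nvme0n1                    # equivalent; --setra counts 512-byte sectors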

  52. Larger read_ahead_kb finishes slightly sooner
    pgbench, SELECT * FROM large_table
    5 GB table
    1 client
    3 transactions
    8 GB shared buffers

  53. Read request size is much larger

  54. With 1ms added latency via dmsetup delay, run
    with read_ahead_kb 2048 finishes in 30 seconds
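
    A sketch of adding the latency with the dm-delay target (the device name is a placeholder; the last field is the delay in milliseconds):

      SECTORS=$(sudo blockdev --getsz /dev/nvme0n1)
      echo "0 $SECTORS delay /dev/nvme0n1 0 1" | sudo dmsetup create delayed
      # the slowed device is then available as /dev/mapper/delayed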

  55. Large request size and large read throughput

  56. Questioning Your Assumptions

  57. autovacuum_vacuum_cost_delay

  58. TPS starts high and gradually goes down with
    autovacuum_vacuum_cost_delay > 0
    pgbench, TPCB-like@1 + INSERT/DELETE@9 , mode=prepared
    data scale 4300
    32 clients
    600 seconds
    16 GB shared buffers
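
    Roughly, the runs differ only in the cost delay, and the mixed workload can be expressed with pgbench script weights (the INSERT/DELETE script name is a placeholder):

      psql -c "ALTER SYSTEM SET autovacuum_vacuum_cost_delay = 0"   # compare against the 2ms default
      psql -c "SELECT pg_reload_conf()"
      pgbench -n -M prepared -c 32 -j 32 -T 600 -b tpcb-like@1 -f insert_delete.sql@9 postgres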

  59. Latency increases proportionally

  60. % time I/O requests being issued is much
    lower with higher cost delay

  61. Autovacuum mostly waiting

  62. Spike in reads not from autovacuum

  63. Size of the relations being thrashed is increasing
    and backend cache hit ratio is plummeting

  64. System CPU usage is increasing. Potentially
    caused by swapping

  65. Comparing only autovacuum_vacuum_cost_delay
    2ms (default) vs 0

  66. Relation size relatively constant for delay = 0

  67. More autovacuum cache hits and fewer reads
    with cost delay 0

  68. More shared buffer evictions by autovacuum
    with default cost delay

  69. Autovacuum cleaning buffers and putting them on
    the freelist, so there are more unused buffers

  70. No backend flushes required because there
    are clean buffers

  71. Finding the Real Root Cause

  72. wal_buffers

  73. COPY FROMs with larger wal_buffers finish
    faster
    pgbench, 20MB file, COPY FROM
    16 clients
    100 transactions
    10 GB shared buffers
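
    A sketch of the comparison (the value and script name are illustrative; wal_buffers can only change at server start, and pg_stat_wal shows how often it filled up):

      psql -c "ALTER SYSTEM SET wal_buffers = '256MB'"     # compare against a small value such as '1MB'
      pg_ctl -D "$PGDATA" restart
      pgbench -n -c 16 -j 16 -t 100 -f copy_20mb.sql -P 5 postgres
      psql -c "SELECT wal_buffers_full, wal_write, wal_sync FROM pg_stat_wal"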

  74. wal_buffers are full less often

  75. Smaller wal_buffers end up contending on the
    WALInsert lock, meaning backends are waiting much
    more often

  76. Smaller wal_buffers cause those runs to do
    less I/O overall

  77. Smaller wal_buffers fill up and then cause
    waiting for WAL Sync

  78. Much higher throughput with larger
    wal_buffers, but how can the dips be explained?

  79. At 20 seconds, the runs start doing more,
    smaller writes

  80. Fewer write merges and more requests in the
    queue

  81. Dirty data has built up, then it starts being
    flushed by kernel before second slowdown

  82. Shared buffers fills up around 20 seconds,
    faster with larger wal_buffers

  83. System memory fills up at 40 seconds
    explaining the second dip

  84. Needed pages are being swapped out and
    have to be read back in

  85. COPY FROM workload is impacted by wal_buffers,
    but a transactional workload would not be

  86. Benchmarking as a Developer
    • Not just configuring databases but identifying bottlenecks that can be
    addressed with code
    • Understanding system interactions when designing new features and
    performance enhancements
    • Designing scenarios that put the right things under test
