Postgres at any scale

Postgres at any scale Craig Kerstiens @ Citus Data

Who am I? • Craig Kerstiens • Run fully managed
database-as-a-service • Curate Postgres Weekly • Blog at craigkerstiens.com • Previously Heroku, Truviso, Accenture

Itty bitty little data

Database is just a hash

Postgres • Datatypes • GIS • Index types • Full
text search • JSONB

Datatypes • Varchar(255) • Timestamp vs. Timestamptz • JSON vs.
JSONB

Constraints • Django/SQL Alchemy good by default • Is this
string really under 255 characters? • Is the IP address really an IP address? • Does the foreign key exist? • What about if you delete a user, do you clean up the rest of the records?

Backups Backups let you 1. recover data, but also 2.
verify quality of your data Backups don’t exist unless you test them

Medium

Conﬁgure your system • Setup your logs • Conﬁgure your
memory • Tweak vacuum • Adjust checkpoints

Don’t learn it all http://pgtune.leopard.in.ua/

Cache rules everything around me

Cache hit rate SELECT 'index hit rate' as name, (sum(idx_blks_hit)
- sum(idx_blks_read)) / sum(idx_blks_hit + idx_blks_read) as ratio FROM pg_statio_user_indexes union all SELECT 'cache hit rate' as name, case sum(idx_blks_hit) when 0 then 'NaN'::numeric else to_char((sum(idx_blks_hit) - sum(idx_blks_read)) / sum(idx_blks_hit + idx_blks_read), '99.99')::numeric end as ratio FROM pg_statio_user_indexes)

Cache hit rate name | ratio ----------------+------------------------ cache hit rate
| 0.99

Index hit rate SELECT relname, 100 * idx_scan / (seq_scan
+ idx_scan), n_live_tup FROM pg_stat_user_tables ORDER BY n_live_tup DESC;

Index hit rate relname | percent_of_times_index_used | rows_in_table ---------------------+-----------------------------+--------------- events
| 0 | 669917 app_infos_user_info | 0 | 198218 app_infos | 50 | 175640 user_info | 3 | 46718 rollouts | 0 | 34078 favorites | 0 | 3059

Rough guidelines Cache hit rate >= 99% Index hit rate
>= 95% where on > 10,000 rows

Understanding Speciﬁc Query Performance

Understanding Speciﬁc Query Performance SELECT last_name FROM employees WHERE salary
>= 50000;

Understanding Speciﬁc Query Performance EXPLAIN SELECT last_name FROM employees WHERE
salary >= 50000; QUERY PLAN -------------------------------------------- ------ Seq Scan on employees (cost=0.00..35811.00 rows=1 width=6) Filter: (salary >= 50000) (3 rows)

Understanding Speciﬁc Query Performance EXPLAIN ANALYZE SELECT last_name FROM employees
WHERE salary >= 50000; QUERY PLAN -------------------------------------------------- Seq Scan on employees (cost=0.00..35811.00 rows=1 width=6) (actual time=2.401..295.247 rows=1428 loops=1) Filter: (salary >= 50000) Total runtime: 295.379 (3 rows)

Rough guidelines Page response times < 100 ms Common queries
< 10ms Rare queries < 100ms

Understanding Speciﬁc Query Performance EXPLAIN ANALYZE SELECT last_name FROM employees
WHERE salary >= 50000; QUERY PLAN -------------------------------------------------- Seq Scan on employees (cost=0.00..35811.00 rows=1 width=6) (actual time=2.401..295.247 rows=1428 loops=1) Filter: (salary >= 50000) Total runtime: 295.379 (3 rows)

Indexes CREATE INDEX idx_emps ON employees (salary);

Indexes EXPLAIN ANALYZE SELECT last_name FROM employees WHERE salary >=
50000; QUERY PLAN -------------------------------------------------- Index Scan using idx_emps on employees (cost=0.00..8.49 rows=1 width=6) (actual time = 0.047..1.603 rows=1428 loops=1) Index Cond: (salary >= 50000) Total runtime: 1.771 ms (3 rows)

pg_stat_statements $ select * from pg_stat_statements where query ~ 'from
users where email'; userid â”‚ 16384 dbid â”‚ 16388 query â”‚ select * from users where email = ?; calls â”‚ 2 total_time â”‚ 0.000268 rows â”‚ 2 shared_blks_hit â”‚ 16 shared_blks_read â”‚ 0 shared_blks_dirtied â”‚ 0 shared_blks_written â”‚ 0 local_blks_hit â”‚ 0 local_blks_read â”‚ 0 local_blks_dirtied â”‚ 0 local_blks_written â”‚ 0 ...

pg_stat_statements SELECT (total_time / 1000 / 60) as total, (total_time/calls)
as avg, query FROM pg_stat_statements ORDER BY 1 DESC LIMIT 100;

pg_stat_statements total | avg | query --------+--------+------------------------- 295.76 | 10.13
| SELECT id FROM users... 219.13 | 80.24 | SELECT * FROM ... (2 rows)

Indexes

Indexes B-Tree Generalized Inverted Index (GIN) Generalized Search Tree (GIST)
K Nearest Neighbors (KNN) Space Partitioned GIST (SP-GIST) Block Range Index (BRIN)

Which do you use?

BTree You usually want this

Generalized Inverted Index (GIN) Use with multiple values in 1
column Array/hStore

Generalized Search Tree (GIST) Full text search Shapes

Indexes B-Tree Generalized Inverted Index (GIN) Generalized Search Tree (GIST)
K Nearest Neighbors (KNN) Space Partitioned GIST (SP-GIST) Block Range Index (BRIN)

Other index tips

Be speciﬁc • Composite indexes • Functional indexes • Conditional
indexes

Faster writes? Faster reads?

Every write you make 1. Figure out query plan 2.
Write to disk 3. Wait for ack 4. Return

Write to disk 3. Wait for ack 4. Update the index 5. Return

Write to disk 3. Wait for ack 4. Update the index 5. Update the other index 6. Return

Write to disk 3. Wait for ack 4. Update the index 5. Update the other index 6. Update the other other index 7. Return

Unused indexes? SELECT schemaname || '.' || relname AS table,
indexrelname AS index, pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size, idx_scan as index_scans FROM pg_stat_user_indexes ui JOIN pg_index i ON ui.indexrelid = i.indexrelid WHERE NOT indisunique AND idx_scan < 50 AND pg_relation_size(relid) > 5 * 8192 ORDER BY pg_relation_size(i.indexrelid) / nullif(idx_scan, 0) DESC NULLS FIRST, pg_relation_size(i.indexrelid) DESC;

Migrations not null is good except in a migration

Proper migrations 1 becomes 3: 1. Allow nulls, set a
default value though 2. Gradually backﬁll old data 3. Add your constraint

Large scale

Index creation CREATE INDEX CONCURRENTLY ... roughly 2-3x slower Doesn’t
lock table

Connection pooling Application/Framework layer Stand alone daemon

Pooler options pgBouncer pgpool

Scaling cache

Ofﬂoad read trafﬁc to a replica

Replication slony londiste bucardo pgpool wal-e barman

Replication slony londiste bucardo pgpool wal-e/wal-g barman

Sharding • Split up large tables • Split up data
by customer • 1 database per customer • 1 schema per customer • Shard within your application

Sharding • Scalable options • Split out large tables •
Multi-tenant application level sharding

Logical backup pg_dump can be human readable, is portable

Physical backup The bytes on disk Base backup

Logical Good across architectures Good for portability Has load on
DB Works < 50 GB More initial setup Less portability Limited load on system Use above 50 GB Physical

Small • Leverage datatypes, it’s not a dumb hash •
Test your backups • Take time to

Medium • Tune Postgres well • Watch cache hit ratio
• Be generous with indexes

Large • Move away from pg_dump • Setup connection pooling
• If you need to shard, invest in the right way or use a tool to help

Thanks! Questions?

Postgres at any scale

Postgres at any scale

More Decks by Craig Kerstiens

Other Decks in Technology

Featured

Transcript