Upgrade to Pro — share decks privately, control downloads, hide ads and more …

How to Scale Postgres - Automation, Tuning, & S...

How to Scale Postgres - Automation, Tuning, & Sharding | Postgres Vision 2020 | Lukas Fittl

the right action at the right time. In this talk we'll look at three core areas:

- Automation: How to avoid doing manual efforts when you manage many databases. We'll look at how to solve the common challenge of automating failover for high availability purposes, using open-source tools.

- Tuning: What are the most important tuning aspects of Postgres to look at. We'll review best practices for config settings, how to identify problematic queries, as well as talk about what's new in Postgres 13 for performance tuning.

- Sharding: How to scale beyond a single Postgres server. At a certain scale it's no longer sufficient to simply create more read replicas. We'll take a practical look at sharding using the Citus extension, as well as review which workloads can shard more easily than others.

This session is aimed at the intermediate user of Postgres who understands the basics and is curious how to push the boundaries of Postgres.

More Decks by Azure Database for PostgreSQL

Other Decks in Technology

Transcript

  1. Cloud PaaS Synchronized Configuration Terraform Parameter Groups Access Control (Roles,

    pg_hba.conf) Roles: Terraform HBA: Built-in Automatic Failover (for HA & Planned Updates) Built-in
  2. Cloud PaaS Synchronized Configuration Terraform Parameter Groups Access Control (Roles,

    pg_hba.conf) Roles: Terraform HBA: Built-in Automatic Failover (for HA & Planned Updates) Built-in Read Replicas Built-in
  3. Cloud PaaS Synchronized Configuration Terraform Parameter Groups Access Control (Roles,

    pg_hba.conf) Roles: Terraform HBA: Built-in Automatic Failover (for HA & Planned Updates) Built-in Read Replicas Built-in Backups Built-in
  4. Cloud PaaS Synchronized Configuration Terraform Parameter Groups Access Control (Roles,

    pg_hba.conf) Roles: Terraform HBA: Built-in Automatic Failover (for HA & Planned Updates) Built-in Read Replicas Built-in Backups Built-in Connection Pooling Manual Setup
  5. Cloud PaaS Self-Managed VM Synchronized Configuration Terraform Parameter Groups ?

    Access Control (Roles, pg_hba.conf) Roles: Terraform HBA: Built-in ? Automatic Failover (for HA & Planned Updates) Built-in ? Read Replicas Built-in ? Backups Built-in ? Connection Pooling Manual Setup ?
  6. Cloud PaaS Self-Managed VM Synchronized Configuration Terraform Parameter Groups ?

    Access Control (Roles, pg_hba.conf) Roles: Terraform HBA: Built-in ? Automatic Failover (for HA & Planned Updates) Built-in pg_auto_failover Read Replicas Built-in ? Backups Built-in ? Connection Pooling Manual Setup ?
  7. Temporary Files Written (Per Query) log_temp_files = 0 Jan 20

    09:18:58pm PST 28847 LOG: temporary file: path "base/pgsql_ pgsql_tmp28847.9", size 50658332 Jan 20 09:18:58pm PST 28847 STATEMENT: WITH servers AS ( SELECT …
  8. autovacuum => SELECT pid, query FROM pg_stat_activity WHERE query LIKE

    'autovacuum: %'; 10469 | autovacuum: VACUUM ANALYZE public.schema_columns 12848 | autovacuum: VACUUM public.replication_follower_stats 28626 | autovacuum: VACUUM public.schema_index_stats | (to prevent wraparound) (3 rows) pg_stat_activity
  9. autovacuum pg_stat_progress_vacuum relid: OID of the table phase: current VACUUM

    phase heap_blks_total: Heap Blocks Total heap_blks_scanned: Heap Blocks Scanned heap_blks_vacuumed: Heap Blocks Vacuumed …
  10. Reduce autovacuum_vacuum_cost_delay To Increase VACUUM Speed 80 MB/s 8 MB/s

    (20ms) (2ms) PG 12+ Older PG Default OS / Disk Reads
  11. 16688 LOG: checkpoint starting: xlog xlog = WAL exceeded max_wal_size,

    checkpoint has to happen quickly time = checkpoint_timeout reached, checkpoint impact spread over time
  12. Checkpoint Statistics pg_stat_bgwriter checkpoints_timed: # of scheduled checkpoints checkpoints_req: #

    of requested checkpoints 1. Time Between Checkpoints 2. % of Timed Checkpoints
  13. Tune checkpoint_completion_target To Control I/O Impact of Timed Checkpoints (Often

    0.9 is a good value, but depends on I/O Subsystem & Workload)
  14. Select from table Coordinator Table metadata Select from table_1001 Select

    from table_1003 Select from table_1002 Select from table_1004 Data node N Data node 2 Data node 1 Table_1001 Table_1003 Table_1002 Table_1004 Each node PostgreSQL with Citus installed 1 shard = 1 PostgreSQL table Sharding data across multiple nodes