How to Scale Postgres - Automation, Tuning, & Sharding | Postgres Vision 2020 | Lukas Fittl

How to Scale Postgres - Automation, Tuning, & Sharding | Postgres Vision 2020 | Lukas Fittl

the right action at the right time. In this talk we'll look at three core areas:

- Automation: How to avoid doing manual efforts when you manage many databases. We'll look at how to solve the common challenge of automating failover for high availability purposes, using open-source tools.

- Tuning: What are the most important tuning aspects of Postgres to look at. We'll review best practices for config settings, how to identify problematic queries, as well as talk about what's new in Postgres 13 for performance tuning.

- Sharding: How to scale beyond a single Postgres server. At a certain scale it's no longer sufficient to simply create more read replicas. We'll take a practical look at sharding using the Citus extension, as well as review which workloads can shard more easily than others.

This session is aimed at the intermediate user of Postgres who understands the basics and is curious how to push the boundaries of Postgres.

143117954187136b825331f24da0e201?s=128

Azure Postgres

June 23, 2020
Tweet

Transcript

  1. @LukasFittl How to Scale Postgres: Automation, Tuning & Sharding

  2. @LukasFittl

  3. Scaling Postgres

  4. Scaling Postgres

  5. Automation Handling 100s of database servers

  6. Consistency is key

  7. Infrastructure as Code

  8. Postgres Infrastructure as Code

  9. Demo: Managing Configuration using Terraform

  10. Cloud PaaS Synchronized Configuration Terraform

  11. Cloud PaaS Synchronized Configuration Terraform Access Control (Roles, pg_hba.conf) Terraform

  12. Cloud PaaS Synchronized Configuration Terraform Parameter Groups Access Control (Roles,

    pg_hba.conf) Roles: Terraform HBA: Built-in Automatic Failover (for HA & Planned Updates) Built-in
  13. Cloud PaaS Synchronized Configuration Terraform Parameter Groups Access Control (Roles,

    pg_hba.conf) Roles: Terraform HBA: Built-in Automatic Failover (for HA & Planned Updates) Built-in Read Replicas Built-in
  14. Cloud PaaS Synchronized Configuration Terraform Parameter Groups Access Control (Roles,

    pg_hba.conf) Roles: Terraform HBA: Built-in Automatic Failover (for HA & Planned Updates) Built-in Read Replicas Built-in Backups Built-in
  15. Cloud PaaS Synchronized Configuration Terraform Parameter Groups Access Control (Roles,

    pg_hba.conf) Roles: Terraform HBA: Built-in Automatic Failover (for HA & Planned Updates) Built-in Read Replicas Built-in Backups Built-in Connection Pooling Manual Setup
  16. Cloud PaaS Self-Managed VM Synchronized Configuration Terraform Parameter Groups ?

    Access Control (Roles, pg_hba.conf) Roles: Terraform HBA: Built-in ? Automatic Failover (for HA & Planned Updates) Built-in ? Read Replicas Built-in ? Backups Built-in ? Connection Pooling Manual Setup ?
  17. Cloud PaaS Self-Managed VM Synchronized Configuration Terraform Parameter Groups ?

    Access Control (Roles, pg_hba.conf) Roles: Terraform HBA: Built-in ? Automatic Failover (for HA & Planned Updates) Built-in pg_auto_failover Read Replicas Built-in ? Backups Built-in ? Connection Pooling Manual Setup ?
  18. pg_auto_failover: Simple, automated failover

  19. pg_auto_failover

  20. Demo: Postgres HA using pg_auto_failover

  21. Tuning Making The Most Of Your Database Server

  22. work_mem tuning

  23. Out Of Memory vs Operations Spill To Disk

  24. Temporary Files Written pg_stat_statements.temp_blks_written pg_stat_database.temp_bytes

  25. Temporary Files Written (Per Query) log_temp_files = 0 Jan 20

    09:18:58pm PST 28847 LOG: temporary file: path "base/pgsql_ pgsql_tmp28847.9", size 50658332 Jan 20 09:18:58pm PST 28847 STATEMENT: WITH servers AS ( SELECT …
  26. When Sorts Spill To Disk, Increase work_mem However, be aware

    of OOMs!
  27. When you get a lot of Out of Memory Errors

    Reduce work_mem!
  28. VACUUM

  29. autovacuum => SELECT pid, query FROM pg_stat_activity WHERE query LIKE

    'autovacuum: %'; 10469 | autovacuum: VACUUM ANALYZE public.schema_columns 12848 | autovacuum: VACUUM public.replication_follower_stats 28626 | autovacuum: VACUUM public.schema_index_stats | (to prevent wraparound) (3 rows) pg_stat_activity
  30. autovacuum pg_stat_progress_vacuum relid: OID of the table phase: current VACUUM

    phase heap_blks_total: Heap Blocks Total heap_blks_scanned: Heap Blocks Scanned heap_blks_vacuumed: Heap Blocks Vacuumed …
  31. Reduce autovacuum_vacuum_cost_delay To Increase VACUUM Speed 80 MB/s 8 MB/s

    (20ms) (2ms) PG 12+ Older PG Default OS / Disk Reads
  32. Use Table Partitioning For Append-Only + Delete Workloads (e.g. Timeseries)

  33. Checkpoints

  34. Data Directory WAL WAL WAL Buffer Cache Checkpointer WAL Checkpoints

    Are Important For I/O Tuning
  35. 16688 LOG: checkpoint starting: xlog xlog = WAL exceeded max_wal_size,

    checkpoint has to happen quickly time = checkpoint_timeout reached, checkpoint impact spread over time
  36. Checkpoint Statistics pg_stat_bgwriter checkpoints_timed: # of scheduled checkpoints checkpoints_req: #

    of requested checkpoints 1. Time Between Checkpoints 2. % of Timed Checkpoints
  37. Increase max_wal_size / Reduce checkpoint_timeout To Have More Timed Checkpoints

    (but be careful with recovery times)
  38. Tune checkpoint_completion_target To Control I/O Impact of Timed Checkpoints (Often

    0.9 is a good value, but depends on I/O Subsystem & Workload)
  39. Demo: Postgres 13 WAL Monitoring

  40. Sharding Scaling Beyond The Limits of a Single Server

  41. Citus: Extension for Sharding Postgres

  42. Select from table Coordinator Table metadata Select from table_1001 Select

    from table_1003 Select from table_1002 Select from table_1004 Data node N Data node 2 Data node 1 Table_1001 Table_1003 Table_1002 Table_1004 Each node PostgreSQL with Citus installed 1 shard = 1 PostgreSQL table Sharding data across multiple nodes
  43. Demo: Hyperscale (Citus) on Kubernetes with Azure Arc

  44. Thank you! lukas@fittl.com @LukasFittl