Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
How to Scale Postgres - Automation, Tuning & Sh...
Search
Lukas Fittl
June 23, 2020
Technology
0
600
How to Scale Postgres - Automation, Tuning & Sharding
Talk at Postgres Vision 2020
Lukas Fittl
June 23, 2020
Tweet
Share
More Decks by Lukas Fittl
See All by Lukas Fittl
What's Missing for Postgres Monitoring
lfittl
0
210
A Map For Monitoring PostgreSQL
lfittl
2
370
Monitoring Postgres at Scale
lfittl
1
390
Monitoring PostgreSQL at Scale
lfittl
4
230
Postgres Performance for App Developers
lfittl
2
270
GraphQL ❤ PostgreSQL -- P.S. aka BeatQL
lfittl
1
570
Hacking PostgreSQL to Gain SQL Parsing Superpowers
lfittl
1
540
PostgreSQL at a Web Startup
lfittl
3
580
Advanced pg_stat_statements: Filtering, Regression Testing & more
lfittl
4
740
Other Decks in Technology
See All in Technology
Active Directory攻防
cryptopeg
PRO
8
5k
Snowflakeの開発・運用コストをApache Icebergで効率化しよう!~機能と活用例のご紹介~
sagara
1
280
Oracle Database Technology Night #87-1 : Exadata Database Service on Exascale Infrastructure(ExaDB-XS)サービス詳細
oracle4engineer
PRO
1
100
IAMポリシーのAllow/Denyについて、改めて理解する
smt7174
2
180
【内製開発Summit 2025】イオンスマートテクノロジーの内製化組織の作り方/In-house-development-summit-AST
aeonpeople
1
480
あれは良かった、あれは苦労したB2B2C型SaaSの新規開発におけるCloud Spanner
hirohito1108
2
890
Swiftの “private” を テストする / Testing Swift "private"
yutailang0119
0
140
「正しく」失敗できる チームの作り方 〜リアルな事例から紐解く失敗を恐れない組織とは〜 / A team that can fail correctly
i35_267
2
700
クラウドサービス事業者におけるOSS
tagomoris
3
970
デスクトップだけじゃないUbuntu
mtyshibata
0
600
次世代KYC活動報告 / 20250219-BizDay17-KYC-nextgen
oidfj
0
460
プロダクトエンジニア 360°フィードバックを実施した話
hacomono
PRO
0
130
Featured
See All Featured
Why Our Code Smells
bkeepers
PRO
336
57k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
46
2.3k
YesSQL, Process and Tooling at Scale
rocio
172
14k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
133
33k
How to train your dragon (web standard)
notwaldorf
91
5.9k
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
330
21k
The Pragmatic Product Professional
lauravandoore
32
6.4k
The Language of Interfaces
destraynor
156
24k
Reflections from 52 weeks, 52 projects
jeffersonlam
348
20k
Imperfection Machines: The Place of Print at Facebook
scottboms
267
13k
Large-scale JavaScript Application Architecture
addyosmani
511
110k
Statistics for Hackers
jakevdp
797
220k
Transcript
@LukasFittl How to Scale Postgres: Automation, Tuning & Sharding
@LukasFittl
Scaling Postgres
Scaling Postgres
Automation Handling 100s of database servers
Consistency is key
Infrastructure as Code
Postgres Infrastructure as Code
Demo: Managing Configuration using Terraform
Cloud PaaS Synchronized Configuration Terraform
Cloud PaaS Synchronized Configuration Terraform Access Control (Roles, pg_hba.conf) Terraform
Cloud PaaS Synchronized Configuration Terraform Parameter Groups Access Control (Roles,
pg_hba.conf) Roles: Terraform HBA: Built-in Automatic Failover (for HA & Planned Updates) Built-in
Cloud PaaS Synchronized Configuration Terraform Parameter Groups Access Control (Roles,
pg_hba.conf) Roles: Terraform HBA: Built-in Automatic Failover (for HA & Planned Updates) Built-in Read Replicas Built-in
Cloud PaaS Synchronized Configuration Terraform Parameter Groups Access Control (Roles,
pg_hba.conf) Roles: Terraform HBA: Built-in Automatic Failover (for HA & Planned Updates) Built-in Read Replicas Built-in Backups Built-in
Cloud PaaS Synchronized Configuration Terraform Parameter Groups Access Control (Roles,
pg_hba.conf) Roles: Terraform HBA: Built-in Automatic Failover (for HA & Planned Updates) Built-in Read Replicas Built-in Backups Built-in Connection Pooling Manual Setup
Cloud PaaS Self-Managed VM Synchronized Configuration Terraform Parameter Groups ?
Access Control (Roles, pg_hba.conf) Roles: Terraform HBA: Built-in ? Automatic Failover (for HA & Planned Updates) Built-in ? Read Replicas Built-in ? Backups Built-in ? Connection Pooling Manual Setup ?
Cloud PaaS Self-Managed VM Synchronized Configuration Terraform Parameter Groups ?
Access Control (Roles, pg_hba.conf) Roles: Terraform HBA: Built-in ? Automatic Failover (for HA & Planned Updates) Built-in pg_auto_failover Read Replicas Built-in ? Backups Built-in ? Connection Pooling Manual Setup ?
pg_auto_failover: Simple, automated failover
pg_auto_failover
Demo: Postgres HA using pg_auto_failover
Tuning Making The Most Of Your Database Server
work_mem tuning
Out Of Memory vs Operations Spill To Disk
Temporary Files Written pg_stat_statements.temp_blks_written pg_stat_database.temp_bytes
Temporary Files Written (Per Query) log_temp_files = 0 Jan 20
09:18:58pm PST 28847 LOG: temporary file: path "base/pgsql_ pgsql_tmp28847.9", size 50658332 Jan 20 09:18:58pm PST 28847 STATEMENT: WITH servers AS ( SELECT …
When Sorts Spill To Disk, Increase work_mem However, be aware
of OOMs!
When you get a lot of Out of Memory Errors
Reduce work_mem!
VACUUM
autovacuum => SELECT pid, query FROM pg_stat_activity WHERE query LIKE
'autovacuum: %'; 10469 | autovacuum: VACUUM ANALYZE public.schema_columns 12848 | autovacuum: VACUUM public.replication_follower_stats 28626 | autovacuum: VACUUM public.schema_index_stats | (to prevent wraparound) (3 rows) pg_stat_activity
autovacuum pg_stat_progress_vacuum relid: OID of the table phase: current VACUUM
phase heap_blks_total: Heap Blocks Total heap_blks_scanned: Heap Blocks Scanned heap_blks_vacuumed: Heap Blocks Vacuumed …
Reduce autovacuum_vacuum_cost_delay To Increase VACUUM Speed 80 MB/s 8 MB/s
(20ms) (2ms) PG 12+ Older PG Default OS / Disk Reads
Use Table Partitioning For Append-Only + Delete Workloads (e.g. Timeseries)
Checkpoints
Data Directory WAL WAL WAL Buffer Cache Checkpointer WAL Checkpoints
Are Important For I/O Tuning
16688 LOG: checkpoint starting: xlog xlog = WAL exceeded max_wal_size,
checkpoint has to happen quickly time = checkpoint_timeout reached, checkpoint impact spread over time
Checkpoint Statistics pg_stat_bgwriter checkpoints_timed: # of scheduled checkpoints checkpoints_req: #
of requested checkpoints 1. Time Between Checkpoints 2. % of Timed Checkpoints
Increase max_wal_size / Reduce checkpoint_timeout To Have More Timed Checkpoints
(but be careful with recovery times)
Tune checkpoint_completion_target To Control I/O Impact of Timed Checkpoints (Often
0.9 is a good value, but depends on I/O Subsystem & Workload)
Demo: Postgres 13 WAL Monitoring
Sharding Scaling Beyond The Limits of a Single Server
Citus: Extension for Sharding Postgres
Select from table Coordinator Table metadata Select from table_1001 Select
from table_1003 Select from table_1002 Select from table_1004 Data node N Data node 2 Data node 1 Table_1001 Table_1003 Table_1002 Table_1004 Each node PostgreSQL with Citus installed 1 shard = 1 PostgreSQL table Sharding data across multiple nodes
Demo: Hyperscale (Citus) on Kubernetes with Azure Arc
Thank you! lukas@fittl.com @LukasFittl