Postgres Performance for App Developers

Postgres Performance for App Developers

Talk at PGConf US 2017

27b304f67c0cadfa2f37a19f01af8f89?s=128

Lukas Fittl

March 30, 2017
Tweet

Transcript

  1. PostgreSQL Performance for Application Developers @LukasFittl

  2. Introduction

  3. Skilled Developer Amateur Hacker @LukasFittl

  4. None
  5. Hosted Postgres Query Monitoring 350 million unique queries tracked pganalyze.com

  6. 1. Query Optimization 2. App <-> DB 3. Scaling Out

  7. Identifying slow queries EXPLAINing slow queries Indexing Query Optimization

  8. Identifying slow queries

  9. LOG: duration: 32378.502 ms execute <unnamed>: SELECT * FROM x

    WHERE y = 123 AND z IN (‘a’, ‘b’, ‘c’) log_min_duration_statement = 30s
  10. userid | 10 dbid | 1397527 query | SELECT *

    FROM x WHERE y = ? calls | 5 total_time | 15.249 rows | 0 shared_blks_hit | 451 shared_blks_read | 41 shared_blks_dirtied | 26 shared_blks_written | 0 local_blks_hit | 0 local_blks_read | 0 local_blks_dirtied | 0 local_blks_written | 0 temp_blks_read | 0 temp_blks_written | 0 blk_read_time | 0 blk_write_time | 0 pg_stat_statements
  11. Supported on cloud platforms

  12. SELECT * FROM pg_stat_statements LIMIT 1; userid | 10 dbid

    | 17025 queryid | 1720234670 query | SELECT * FROM x WHERE y = ? calls | 14 total_time | 0.151 rows | 28 shared_blks_hit | 14 shared_blks_read | 0 shared_blks_dirtied | 0 shared_blks_written | 0 local_blks_hit | 0 local_blks_read | 0 local_blks_dirtied | 0 local_blks_written | 0 temp_blks_read | 0 temp_blks_written | 0 blk_read_time | 0 blk_write_time | 0
  13. queryid | 1720234670 query | SELECT * FROM x WHERE

    y = ? calls | 5 total_time | 15.249 Query + No. of Calls + Avg Time
  14. shared_blks_hit | 2447215 shared_blks_read | 55335 Avg. Shared Buffer Hit

    Rate: 97% hit_rate = shared_blks_hit / (shared_blks_hit + shared_blks_read)
  15. blk_read_time | 14.594 blk_write_time | 465.661 Time spent reading/writing to

    disk track_io_timing = on
  16. EXPLAINing slow queries

  17. psql EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM snapshots WHERE collected_at

    BETWEEN now() - '24 hours'::interval AND now(); QUERY PLAN -------------------------------------------------------------------------------- ------------- Index Scan using index_snapshots_on_collected_at on snapshots (cost=0.30..8.32 rows=1 width=116) (actual time=0.078..0.078 rows=0 loops=1) Index Cond: ((collected_at >= (now() - '24:00:00'::interval)) AND (collected_at <= now())) Buffers: shared hit=5 Planning time: 0.286 ms Execution time: 0.115 ms (5 rows)
  18. explain.depesz.com

  19. Pev http://tatiyants.com/postgres-query-plan-visualization/

  20. Indexing

  21. Explain Analyze # EXPLAIN ANALYZE SELECT last_name FROM employees WHERE

    salary >= 50000; QUERY PLAN -------------------------------------------------- Seq Scan on employees (cost=0.00..35811.00 rows=1 width=6) (actual time=2.401..295.247 rows=1428 loops=1) Filter: (salary >= 50000) Total runtime: 295.379 (3 rows) Filter: (salary >= 50000) (3 rows) startup time max time rows return actual time 2.401..295.247 rows=1428 295.379
  22. # CREATE INDEX idx_emps ON employees (salary); Indexes!

  23. Indexes! EXPLAIN ANALYZE SELECT last_name FROM employees WHERE salary >=

    50000; QUERY PLAN -------------------------------------------------- Index Scan using idx_emps on employees (cost=0.00..8.49 rows=1 width=6) (actual time = 0.047..1.603 rows=1428 loops=1) Index Cond: (salary >= 50000) Total runtime: 1.771 ms (3 rows)
  24. Indexes B-Tree Generalized Inverted Index (GIN) Generalized Search Tree (GIST)

    K Nearest Neighbors (KNN) Space Partitioned GIST (SP-GIST)
  25. Indexes Which do I use?

  26. BTree This is what you usually want

  27. Generalized Inverted Index (GIN) Use with multiple values in 1

    column Array/hStore
  28. Generalized Search Tree (GIST) Full text search Shapes

  29. Indexes Conditional Functional Concurrent creation

  30. Conditional > SELECT * FROM places WHERE population > 10000;

    name | population ----------------------------------- ARAB | 13650
  31. Conditional > CREATE INDEX idx_large_population ON places(name) where population >

    10000;
  32. Functional > SELECT * FROM places; data ----------------------------------- {"city": "ACMAR",

    "pop": 6055} {"city": "ARAB", "pop": 13650}
  33. > SELECT * FROM places WHERE get_numeric('pop', data) > 10000;

    data ----------------------------------- {"city": "ARAB", "pop": 13650} Functional
  34. > CREATE INDEX idx_large_population ON places(get_numeric('pop', data)); Functional

  35. CREATE INDEX CONCURRENTLY ... roughly 2-3x slower Doesn’t lock table

    One more thing
  36. Lock Contention Schema Changes Connection Limits App <-> DB

  37. Lock Contention

  38. Long Held Locks in Transactions Rails Counter Cache & Timestamps

    BEGIN SELECT 1 AS one FROM "post_votes" WHERE (…) LIMIT 1 SELECT "posts".* FROM "posts" WHERE "posts"."id" = $1 LIMIT 1 INSERT INTO "notifications" (…) VALUES (…) RETURNING "id" SELECT "users".* FROM "users" WHERE "users"."id" = $1 LIMIT 1 UPDATE "users" SET "updated_at" = ? WHERE "users"."id" = ? INSERT INTO "post_votes" (…) VALUES (…) RETURNING "id" UPDATE "posts" SET "votes" = COALESCE("votes", 0) + 1 WHERE "posts"."id" = ? UPDATE "posts" SET "credible_post_votes_count" = … WHERE "posts"."id" = ? UPDATE "users" SET "updated_at" = ? WHERE "users"."id" = ? UPDATE "posts" SET "updated_at" = ? WHERE "posts"."id" = ? COMMIT
  39. log_lock_waits = on deadlock_timeout = 1000 ms process 20679 still

    waiting for ExclusiveLock on tuple (566,1) of relation 16421 after 1000.115 ms process 20678 still waiting for ExclusiveLock on tuple (566,1) of relation 16421 after 1000.126 ms process 15533 still waiting for ExclusiveLock on tuple (566,1) of relation 16421 1000.129 ms process 20663 still waiting for ExclusiveLock on tuple (566,1) of relation 16421 1000.100 ms process 15537 still waiting for ExclusiveLock on tuple (566,1) of relation 16421 1000.130 ms process 15536 still waiting for ExclusiveLock on tuple (566,1) of relation 16421 1000.222 ms process 20734 still waiting for ExclusiveLock on tuple (566,1) of relation 16421 1000.130 ms process 15538 still waiting for ExclusiveLock on tuple (566,1) of relation 16421 1000.136 ms process 15758 still waiting for ShareLock on transaction 250175899 after 1000.073 ms
  40. pg_stat_activity lock information https://github.com/postgrespro/pg_wait_sampling

  41. Idle Transactions

  42. RED: Long Queries Pid Duration Query ----- --------------- ----------------------------------------------------------------------------------------- 31868

    00:10:52.165883 UPDATE "posts" SET "link_visits" = COALESCE("link_visits", 0) + 1 WHERE "posts"."id" = $1 28835 01:06:16.505554 UPDATE "posts" SET "link_visits" = COALESCE("link_visits", 0) + 1 WHERE "posts"."id" = $1 28836 01:06:16.486394 UPDATE "posts" SET "link_visits" = COALESCE("link_visits", 0) + 1 WHERE "posts"."id" = $1 28837 01:06:16.41853 UPDATE "posts" SET "link_visits" = COALESCE("link_visits", 0) + 1 WHERE "posts"."id" = $1 28838 01:06:16.380474 UPDATE "posts" SET "link_visits" = COALESCE("link_visits", 0) + 1 WHERE "posts"."id" = $1 31869 00:10:26.068636 UPDATE "posts" SET "link_visits" = COALESCE("link_visits", 0) + 1 WHERE "posts"."id" = $1 31870 00:08:47.790245 UPDATE "posts" SET "link_visits" = COALESCE("link_visits", 0) + 1 WHERE "posts"."id" = $1 23781 01:12:14.603475 UPDATE "posts" SET "link_visits" = COALESCE("link_visits", 0) + 1 WHERE "posts"."id" = $1 31862 00:10:47.98641 UPDATE "posts" SET "link_visits" = COALESCE("link_visits", 0) + 1 WHERE "posts"."id" = $1 31863 00:11:05.921372 UPDATE "posts" SET "link_visits" = COALESCE("link_visits", 0) + 1 WHERE "posts"."id" = $1 27648 01:17:38.773909 UPDATE "posts" SET "link_visits" = COALESCE("link_visits", 0) + 1 WHERE "posts"."id" = $1 27098 01:28:51.216489 UPDATE "posts" SET "link_visits" = COALESCE("link_visits", 0) + 1 WHERE "posts"."id" = $1 27106 01:27:18.455351 UPDATE "posts" SET "link_visits" = COALESCE("link_visits", 0) + 1 WHERE "posts"."id" = $1 31880 00:10:52.877779 UPDATE "posts" SET "link_visits" = COALESCE("link_visits", 0) + 1 WHERE "posts"."id" = $1 31881 00:10:52.168877 UPDATE "posts" SET "link_visits" = COALESCE("link_visits", 0) + 1 WHERE "posts"."id" = $1 RED: Idle in Transaction 23729 01:37:12.566497 UPDATE "posts" SET "link_unique_visits" = COALESCE("link_unique_visits", 0) + 1 WHERE "posts"."id" = $1 RED: Blocking Queries 23760 UPDATE "posts" SET "link_unique_visits" = COALESCE("link_unique_visits", 0) + 1 WHERE "posts"."id" = $1 01:37:12.646868 23729 UPDATE "posts" SET "link_visits" = COALESCE("link_visits", 0) + 1 WHERE "posts"."id" = $1 01:31:59.088208 Dead Background Worker caused silent queue back-up for 1hr+
  43. Schema Changes

  44. 1. Don’t remove columns on large tables 2. Don’t rename

    columns 3. Always index CONCURRENTLY 4. Carefully change column type 5. Carefully add columns with a DEFAULT 6. Carefully add NOT NULL columns
  45. 1. Don’t remove columns on large tables 2. Don’t rename

    columns 3. Always index CONCURRENTLY 4. Carefully change column type 5. Carefully add columns with a DEFAULT 6. Carefully add NOT NULL columns
  46. 1. Don’t remove columns on large tables 2. Don’t rename

    columns 3. Always index CONCURRENTLY 4. Carefully change column type 5. Carefully add columns with a DEFAULT 6. Carefully add NOT NULL columns
  47. 1. Don’t remove columns on large tables 2. Don’t rename

    columns 3. Always index CONCURRENTLY 4. Carefully change column type 5. Carefully add columns with a DEFAULT 6. Carefully add NOT NULL columns
  48. 1. Don’t remove columns on large tables 2. Don’t rename

    columns 3. Always index CONCURRENTLY 4. Carefully change column type 5. Carefully add columns with a DEFAULT 6. Carefully add NOT NULL columns
  49. 1. Don’t remove columns on large tables 2. Don’t rename

    columns 3. Always index CONCURRENTLY 4. Carefully change column type 5. Carefully add columns with a DEFAULT 6. Carefully add NOT NULL columns
  50. 1. Don’t remove columns on large tables 2. Don’t rename

    columns 3. Always index CONCURRENTLY 4. Carefully change column type 5. Carefully add columns with a DEFAULT 6. Carefully add NOT NULL columns
  51. Watch out for slow queries conflicting with DDL!

  52. Connection Limits

  53. Threaded Application Servers

  54. Rolling Deploys

  55. e.g. 200 max_connections = 10 connections / Server, 10 App

    Servers, Rolling Deploy (2x)
  56. Pooling!

  57. PgBouncer

  58. Cache & Index Hit Rate Optimizing Your Cache Scaling Out

  59. Cache & Index Hit Rate

  60. SELECT sum(heap_blks_hit) / nullif(sum(heap_blks_hit + heap_blks_read),0) FROM pg_statio_user_tables Table Cache

    Hit Rate Target: >= 99%
  61. SELECT relname, n_live_tup, seq_scan + idx_scan, 100 * idx_scan /

    (seq_scan + idx_scan) FROM pg_stat_user_tables ORDER BY n_live_tup DESC Index Hit Rate Target: >= 95% on large, active tables
  62. relname | n_live_tup | scans | index_hit_rate ---------------------------------+------------+------------+---------------- query_fingerprints |

    347746140 | 513262821 | 99 queries | 346575911 | 22379253 | 99 schema_table_events | 100746488 | 1459 | 99 queries_schema_tables | 62194571 | 7754 | 99 log_lines | 46629937 | 2 | 0 issue_states | 31861134 | 3 | 0 schema_columns | 31849719 | 6688381553 | 99 query_overview_stats | 26029247 | 13831 | 99 schema_index_stats_2d_20170329 | 18274023 | 1592 | 99 schema_index_stats_2d_20170328 | 18164132 | 6917 | 99 snapshot_benchmarks | 13094945 | 2315069 | 99 schema_index_stats_60d_20170329 | 9818030 | 69 | 20 schema_index_stats_60d_20170328 | 9749146 | 110 | 30 schema_index_stats_60d_20170323 | 9709723 | 103 | 40 schema_index_stats_60d_20170327 | 9702565 | 103 | 33 schema_index_stats_60d_20170324 | 9672853 | 64 | 48 schema_index_stats_60d_20170322 | 9651125 | 141 | 46 schema_index_stats_60d_20170325 | 9647832 | 23 | 69 schema_index_stats_60d_20170326 | 9636532 | 39 | 53 schema_index_stats_60d_20170303 | 9538898 | 174 | 63 schema_index_stats_60d_20170321 | 9522712 | 170 | 49 schema_index_stats_60d_20170309 | 9492844 | 126 | 57 schema_index_stats_60d_20170304 | 9491850 | 64 | 82 schema_index_stats_60d_20170320 | 9486945 | 104 | 56 schema_index_stats_60d_20170319 | 9466378 | 47 | 74 Index Hit Rate
  63. relname | n_live_tup | scans | index_hit_rate ---------------------------------+------------+------------+---------------- schema_columns |

    31849719 | 6688670158 | 99 schema_indices | 1164034 | 1341718091 | 99 schema_constraints | 4474681 | 969363158 | 99 schema_tables | 560726 | 596941518 | 99 query_fingerprints | 347747524 | 513267842 | 99 system_snapshots | 1085113 | 41108836 | 99 servers | 4549 | 38061571 | 99 postgres_roles | 9088 | 25771530 | 0 postgres_settings | 455750 | 24006216 | 99 queries | 346577295 | 22379253 | 99 databases | 8846 | 22026619 | 99 issues | 109618 | 18192187 | 99 api_keys | 8005 | 18096925 | 0 reports | 5 | 11530280 | 0 users | 3452 | 10079069 | 99 snapshots | 1245372 | 9182566 | 99 system_infos | 1623 | 2373577 | 99 postgres_versions | 2291 | 2336714 | 99 snapshot_benchmarks | 13094977 | 2315100 | 99 schema_functions | 902307 | 2090925 | 99 database_collaborators | 1159 | 260977 | 0 query_stats_2d_20170328 | 7835103 | 249022 | 99 query_stats_2d_20170329 | 7713757 | 247589 | 99 report_runs | 299 | 213526 | 4 query_stats_60d_20170328 | 7430331 | 121121 | 99 Index Hit Rate
  64. Optimizing Your Cache

  65. pg_buffercache

  66. Scaling Up r3.xlarge r3.4xlarge

  67. Scaling Out Shard in your application Use a sharded database

    (e.g. Citus)
  68. Thanks!
 @LukasFittl