Slide 1

Slide 1 text

P!"#r$% P$rf&r'()*$ f&r H+'()% @*r(,#-$r%",$)%

Slide 2

Slide 2 text

S.('$/$%% p/+#% http://www.postgresweekly.com http://www.craigkerstiens.com http://www.postgresguide.com http://www.postgresapp.com http://postgres.heroku.com

Slide 3

Slide 3 text

Postgres - TLDR

Slide 4

Slide 4 text

Postgres - TLDR Datatypes Conditional Indexes Transactional DDL Foreign Data Wrappers Concurrent Index Creation Extensions Common Table Expressions Fast Column Addition Listen/Notify Table Inheritance Per Transaction sync replication Window functions NoSQL inside SQL Momentum

Slide 5

Slide 5 text

OLTP vs OLAP

Slide 6

Slide 6 text

OLTP W$b(pp%

Slide 7

Slide 7 text

Postgres Setup/Config

Slide 8

Slide 8 text

Postgres Setup/Config On Amazon

Slide 9

Slide 9 text

Postgres Setup/Config On Amazon Use Heroku OR ‘postgresql when its not your dayjob’

Slide 10

Slide 10 text

Postgres Setup/Config On Amazon Use Heroku OR ‘postgresql when its not your dayjob’ Other clouds

Slide 11

Slide 11 text

Postgres Setup/Config On Amazon Use Heroku OR ‘postgresql when its not your dayjob’ Other clouds ‘postgresql when its not your dayjob’

Slide 12

Slide 12 text

Postgres Setup/Config On Amazon Use Heroku OR ‘postgresql when its not your dayjob’ Other clouds ‘postgresql when its not your dayjob’ Real hardware

Slide 13

Slide 13 text

Postgres Setup/Config On Amazon Use Heroku OR ‘postgresql when its not your dayjob’ Other clouds ‘postgresql when its not your dayjob’ Real hardware High performance PostgreSQL

Slide 14

Slide 14 text

Postgres Setup/Config On Amazon Use Heroku OR ‘postgresql when its not your dayjob’ Other clouds ‘postgresql when its not your dayjob’ Real hardware High performance PostgreSQL http://thebuild.com/blog/2012/06/04/postgresql-when-its-not-your-job-at-djangocon-europe/

Slide 15

Slide 15 text

C(*.$

Slide 16

Slide 16 text

80/20 r+/$

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

Cache Hit Rate SELECT 'index hit rate' as name, (sum(idx_blks_hit) - sum(idx_blks_read)) / sum(idx_blks_hit + idx_blks_read) as ratio FROM pg_statio_user_indexes union all SELECT 'cache hit rate' as name, case sum(idx_blks_hit) when 0 then 'NaN'::numeric else to_char((sum(idx_blks_hit) - sum(idx_blks_read)) / sum(idx_blks_hit + idx_blks_read), '99.99')::numeric end as ratio FROM pg_statio_user_indexes)

Slide 19

Slide 19 text

Cache Hit Rate name | ratio ----------------+------------------------ cache hit rate | 0.99

Slide 20

Slide 20 text

Index Hit Rate SELECT relname, 100 * idx_scan / (seq_scan + idx_scan), n_live_tup FROM pg_stat_user_tables ORDER BY n_live_tup DESC;

Slide 21

Slide 21 text

Index Hit Rate relname | percent_of_times_index_used | rows_in_table ---------------------+-----------------------------+--------------- events | 0 | 669917 app_infos_user_info | 0 | 198218 app_infos | 50 | 175640 user_info | 3 | 46718 rollouts | 0 | 34078 favorites | 0 | 3059 schema_migrations | 0 | 2 authorizations | 0 | 0 delayed_jobs | 23 | 0

Slide 22

Slide 22 text

S.&r"*+"%

Slide 23

Slide 23 text

$ cat ~/.psqlrc \set ON_ERROR_ROLLBACK interactive -- automatically switch between extended and normal \x auto -- always show how long a query takes \timing \set show_slow_queries 'SELECT (total_time / 1000 / 60) as total_minutes, (total_time/calls) as average_time, query FROM pg_stat_statements ORDER BY 1 DESC LIMIT 100;' psql

Slide 24

Slide 24 text

$ cat ~/.psqlrc \set ON_ERROR_ROLLBACK interactive -- automatically switch between extended and normal \x auto -- always show how long a query takes \timing \set show_slow_queries 'SELECT (total_time / 1000 / 60) as total_minutes, (total_time/calls) as average_time, query FROM pg_stat_statements ORDER BY 1 DESC LIMIT 100;' psql

Slide 25

Slide 25 text

datascope https://github.com/will/datascope

Slide 26

Slide 26 text

U)0$r%"()0,)# Sp$*,1* Q+$r2 P$rf&r'()*$

Slide 27

Slide 27 text

Understanding Query Performance

Slide 28

Slide 28 text

Understanding Query Performance SELECT last_name FROM employees WHERE salary >= 50000;

Slide 29

Slide 29 text

Explain # EXPLAIN SELECT last_name FROM employees WHERE salary >= 50000; QUERY PLAN -------------------------------------------------- Seq Scan on employees (cost=0.00..35811.00 rows=1 width=6) Filter: (salary >= 50000) (3 rows)

Slide 30

Slide 30 text

Explain # EXPLAIN SELECT last_name FROM employees WHERE salary >= 50000; QUERY PLAN -------------------------------------------------- Seq Scan on employees width=6) Filter: (salary >= 50000) (3 rows) startup time max time rows return (cost=0.00..35811.00 rows=1

Slide 31

Slide 31 text

Explain Analyze # EXPLAIN ANALYZE SELECT last_name FROM employees WHERE salary >= 50000; QUERY PLAN -------------------------------------------------- Seq Scan on employees (cost=0.00..35811.00 rows=1 width=6) (actual time=2.401..295.247 rows=1428 loops=1) Filter: (salary >= 50000) Total runtime: 295.379 (3 rows) Filter: (salary >= 50000) (3 rows) startup time max time rows return actual time 2.401..295.247 rows=1428 295.379

Slide 32

Slide 32 text

Rough guidelines Page response times < 100 ms Common queries < 10ms Rare queries < 100ms

Slide 33

Slide 33 text

Explain Analyze # EXPLAIN ANALYZE SELECT last_name FROM employees WHERE salary >= 50000; QUERY PLAN -------------------------------------------------- Seq Scan on employees (cost=0.00..35811.00 rows=1 width=6) (actual time=2.401..295.247 rows=1428 loops=1) Filter: (salary >= 50000) Total runtime: 295.379 (3 rows) Filter: (salary >= 50000) (3 rows) startup time max time rows return actual time 2.401..295.247 rows=1428 295.379

Slide 34

Slide 34 text

# CREATE INDEX idx_emps ON employees (salary); Indexes!

Slide 35

Slide 35 text

Indexes! EXPLAIN ANALYZE SELECT last_name FROM employees WHERE salary >= 50000; QUERY PLAN -------------------------------------------------- Index Scan using idx_emps on employees (cost=0.00..8.49 rows=1 width=6) (actual time = 0.047..1.603 rows=1428 loops=1) Index Cond: (salary >= 50000) Total runtime: 1.771 ms (3 rows)

Slide 36

Slide 36 text

Indexes! EXPLAIN ANALYZE SELECT last_name FROM employees WHERE salary >= 50000; QUERY PLAN -------------------------------------------------- Index Scan using idx_emps on employees (cost=0.00..8.49 rows=1 width=6) (actual time = 0.047..1.603 rows=1428 loops=1) Index Cond: (salary >= 50000) Total runtime: 1.771 ms (3 rows)

Slide 37

Slide 37 text

Indexes! EXPLAIN ANALYZE SELECT last_name FROM employees WHERE salary >= 50000; QUERY PLAN -------------------------------------------------- Index Scan using idx_emps on employees (cost=0.00..8.49 rows=1 width=6) (actual time = 0.047..1.603 rows=1428 loops=1) Index Cond: (salary >= 50000) Total runtime: 1.771 ms (3 rows)

Slide 38

Slide 38 text

pg_stat_statements

Slide 39

Slide 39 text

pg_stat_statements $ select * from pg_stat_statements where query ~ 'from users where email'; userid │ 16384 dbid │ 16388 query │ select * from users where email = ?; calls │ 2 total_time │ 0.000268 rows │ 2 shared_blks_hit │ 16 shared_blks_read │ 0 shared_blks_dirtied │ 0 shared_blks_written │ 0 local_blks_hit │ 0 local_blks_read │ 0 local_blks_dirtied │ 0 local_blks_written │ 0 ...

Slide 40

Slide 40 text

pg_stat_statements

Slide 41

Slide 41 text

SELECT (total_time / 1000 / 60) as total, (total_time/calls) as avg, query FROM pg_stat_statements ORDER BY 1 DESC LIMIT 100; pg_stat_statements

Slide 42

Slide 42 text

pg_stat_statements

Slide 43

Slide 43 text

total | avg | query --------+--------+------------------------- 295.76 | 10.13 | SELECT id FROM users... 219.13 | 80.24 | SELECT * FROM ... (2 rows) pg_stat_statements

Slide 44

Slide 44 text

I)03$%

Slide 45

Slide 45 text

Indexes B-Tree Generalized Inverted Index (GIN) Generalized Search Tree (GIST) K Nearest Neighbors (KNN) Space Partitioned GIST (SP-GIST)

Slide 46

Slide 46 text

Indexes Which do I use?

Slide 47

Slide 47 text

BTree This is what you usually want

Slide 48

Slide 48 text

Generalized Inverted Index (GIN) Use with multiple values in 1 column Array/hStore

Slide 49

Slide 49 text

Generalized Search Tree (GIST) Full text search Shapes

Slide 50

Slide 50 text

Indexes B-Tree Generalized Inverted Index (GIN) Generalized Search Tree (GIST) K Nearest Neighbors (KNN) Space Partitioned GIST (SP-GIST)

Slide 51

Slide 51 text

M&r$ ,)03$%

Slide 52

Slide 52 text

Indexes Conditional Functional Concurrent creation

Slide 53

Slide 53 text

Conditional > SELECT * FROM places; name | population ----------------------------------- ACMAR | 6055 ARAB | 13650

Slide 54

Slide 54 text

Conditional > SELECT * FROM places WHERE population > 10000; name | population ----------------------------------- ARAB | 13650

Slide 55

Slide 55 text

Conditional > SELECT * FROM places WHERE population > 10000; name | population ----------------------------------- ARAB | 13650

Slide 56

Slide 56 text

Conditional > CREATE INDEX idx_large_population ON places(name) where population > 10000;

Slide 57

Slide 57 text

Conditional > CREATE INDEX idx_large_population ON places(name) where population > 10000;

Slide 58

Slide 58 text

Functional > SELECT * FROM places; data ----------------------------------- {"city": "ACMAR", "pop": 6055} {"city": "ARAB", "pop": 13650}

Slide 59

Slide 59 text

> SELECT * FROM places WHERE get_numeric('pop', data) > 10000; data ----------------------------------- {"city": "ARAB", "pop": 13650} Functional

Slide 60

Slide 60 text

> SELECT * FROM places WHERE get_numeric('pop', data) > 10000; data ----------------------------------- {"city": "ARAB", "pop": 13650} Functional

Slide 61

Slide 61 text

> CREATE INDEX idx_large_population ON places(get_numeric('pop', data)); Functional

Slide 62

Slide 62 text

> CREATE INDEX idx_large_population ON places(get_numeric('pop', data)); Functional

Slide 63

Slide 63 text

Conditional and Functional > CREATE INDEX idx_large_population ON places(data) WHERE get_numeric('pop', data) > 10000;

Slide 64

Slide 64 text

Conditional and Functional > CREATE INDEX idx_large_population ON places(data) WHERE get_numeric('pop', data) > 10000;

Slide 65

Slide 65 text

Conditional and Functional > CREATE INDEX idx_large_population ON places(data) WHERE get_numeric('pop', data) > 10000;

Slide 66

Slide 66 text

CREATE INDEX CONCURRENTLY ... roughly 2-3x slower Doesn’t lock table One more thing

Slide 67

Slide 67 text

D4()#&

Slide 68

Slide 68 text

Connections

Slide 69

Slide 69 text

Connections django-postgrespool djorm-ext-pool django-db-pool

Slide 70

Slide 70 text

django-postgrespool import dj_database_url import django_postgrespool DATABASE = { 'default': dj_database_url.config() } DATABASES['default']['ENGINE'] = 'django_postgrespool' SOUTH_DATABASE_ADAPTERS = { 'default': 'south.db.postgresql_psycopg2' }

Slide 71

Slide 71 text

django-postgrespool import dj_database_url import django_postgrespool DATABASE = { 'default': dj_database_url.config() } DATABASES['default']['ENGINE'] = 'django_postgrespool' SOUTH_DATABASE_ADAPTERS = { 'default': 'south.db.postgresql_psycopg2' }

Slide 72

Slide 72 text

pgbouncer pgpool Other options

Slide 73

Slide 73 text

A00,)# C(*.$

Slide 74

Slide 74 text

Replication options

Slide 75

Slide 75 text

Replication options slony bucardo pgpool

Slide 76

Slide 76 text

Replication options slony bucardo pgpool wal-e barman

Slide 77

Slide 77 text

Replication options slony bucardo pgpool wal-e barman

Slide 78

Slide 78 text

Update settings

Slide 79

Slide 79 text

effective_cache_size shared_buffers work_mem maintenance_work_mem Update settings

Slide 80

Slide 80 text

B(*-+p%

Slide 81

Slide 81 text

Logical pg_dump can be human readable, is portable

Slide 82

Slide 82 text

Physical The bytes on disk Base backup

Slide 83

Slide 83 text

Logical Good across architectures Good for portability Has load on DB Works < 50 GB Physical More initial setup Less portability Limited load on system Use above 50 GB

Slide 84

Slide 84 text

R$*(p

Slide 85

Slide 85 text

OLAP Whole other talk Disk IO is important Order on disk is helpful (pg-reorg) MPP solutions on top of Postgres Recap

Slide 86

Slide 86 text

OLTP (webapps) Ensure bulk of data is cache Efficient use of indexes When cache sucks, throw more at it Recap

Slide 87

Slide 87 text

Q+$%",&)%