Healthy Database

C43af8c5615e17a1d534b7fbfec46656?s=47 Keiko Oda
October 04, 2017

Healthy Database

From Grailed (https://www.grailed.com/) Engineering Hangout
Talking about how to keep the database healthy. Postgres, Heroku Postgres, tools.

C43af8c5615e17a1d534b7fbfec46656?s=128

Keiko Oda

October 04, 2017
Tweet

Transcript

  1. HEALTHY DATABASE Keiko Oda Heroku, Salesforce

  2. HI! I’M KEIKO ODA ➤ Heroku Data team (DoD) ➤

    Joined Heroku Apr 2013 ➤ Heroku Postgres/Redis team ➤ Former support engineer (Platform, Data embedded) ➤ San Francisco (from Japan) ➤ @keiko713
  3. TABLE OF CONTENTS ➤ Healthy Database ➤ Dealing with Expensive

    Queries ➤ Understanding Locks ➤ Let the database focus on its main job ➤ Other tips ➤ Heroku Postgres what’s new & what’s coming ➤ Heroku PGSettings ➤ Heroku Postgres Credentials ➤ Server Side PGBouncer
  4. HEALTHY DATABASE? WHY MY DATABASE CAN’T SCALE? WHY MY DATABASE

    IS SLOW?
  5. EXPENSIVE QUERIES

  6. THE POWER OF INDEX ➤ Lack of good indexes can

    cause expensive queries ➤ Expensive queries are very good at making other healthy/simple queries running slow (e.g. high resource usage, locks, etc.) ➤ Always think about adding/revisiting indexes when you create new tables, introducing and/or modifying new columns, queries ➤ Watch out unused indexes, but also keep it in your mind that these indexes may be used in followers ➤ the cost of maintaining the index v.s. the benefits the index provides for queries ➤ I/O is expensive (make sure cache hit rate is high!), good indexes will help reducing disk I/O and keep good cache
  7. REAL LIFE EXAMPLE 19:30 Deploy, no schema change but this

    changed one look up query (add one select condition) 20:15 Added one index, run create index concurrently Connection counts Database load HTTP response time HTTP status code
  8. HOW TO FIND EXPENSIVE QUERIES? ➤ Check outliers with heroku

    pg:outliers or in the log, or Heroku Expensive Queries dashboard ➤ high sync_io_time is a good culprit (likely seq scan) ➤ Outside of Heroku ➤ install pg_stat_statements extension and run a query ➤ tweak log_min_duration_statement setting to see long running queries ➤ EXPLAIN the query (EXPLAIN (ANALYZE, BUFFERS)) ➤ Check out any avoidable seq scan, non ideal index usage ➤ Run ANALYZE to update the query plan
  9. EXAMPLES OF FINDING A CULPRIT ➤ heroku pg:outliers ➤ Takes

    indeed a lot of time to run 3mins+ ~2mins Second time is faster because of cache
  10. EXAMPLES OF FINDING A CULPRIT (CONT) ➤ EXPLAIN (ANALYZE, BUFFERS)

    I/O Timings is so high Using Seq Scan (Thanks to Postgres, it’s at least Parallel)
  11. EXAMPLES OF FINDING A CULPRIT (CONT) ➤ CREATE INDEX CONCURRENTLY

    towns_code_index ON towns USING btree (code); Using Index Scan! 1.7ms to run (even faster second time, because it’s easily put on the cache and most of 1.7ms was spent on I/O)
  12. ➤ What kind of indexes will be a good fit?

    ➤ https://devcenter.heroku.com/articles/postgresql-indexes ➤ Partial Indexes ➤ covers just a subset of a table’s data (e.g. deleted_at = NULL, flagged = TRUE) ➤ smaller size, easier to maintain, faster to scan ➤ Expression Indexes ➤ matches on some function of the data (e.g. lower(email)) ➤ Multi-column Indexes ➤ e.g. WHERE a = x AND b = y ➤ covers WHERE a = x, but doesn’t cover WHERE b = y (if you use both of them, create two individual indexes) ADDING INDEXES TO FIX EXPENSIVE QUERIES
  13. LOCKS

  14. UNDERSTANDING LOCKS ➤ Locks can make the database in very

    bad shape ➤ Simple queries can take forever if they’re waiting for acquiring a lock ➤ Typical lock issues ➤ ALTER TABLE is taking forever ➤ It may be just waiting for a lock (needs ACCESS EXCLUSIVE lock) ➤ ALTER TABLE itself becomes a lock, causing other queries to wait for a lock (ACCESS EXCLUSIVE lock conflicts with ALL queries) ➤ https://www.postgresql.org/docs/current/static/explicit-locking.html ➤ How to find locks? ➤ pg_locks table, heroku pg:locks or heroku pg:blocking
  15. EXAMPLE OF LOCKS (DEMO) > \d users Table "public.users" Column

    │ Type │ Modifiers ────────────┼───────────────────────┼─────────── id │ integer │ first_name │ character varying(50) │ last_name │ character varying(50) │ email │ character varying(50) │ BEGIN; SELECT * FROM users where id = 1; ALTER TABLE users ADD COLUMN nickname varchar(50); SELECT count(*) FROM users; ACCESS SHARE (holding a lock because of open transaction) ACCESS EXCLUSIVE (waiting for ACCESS SHARE to release the lock) ACCESS SHARE (waiting for ACCESS EXCLUSIVE) waiting queue SELECT count(*) FROM users; ACCESS SHARE ACCESS SHARE doesn’t conflict with ACCESS SHARE
  16. FOCUSING ON MAIN JOB

  17. DATABASE IS BUSY FOR SOMETHING ELSE? ➤ Are you running

    analytics/dataclips with it? ➤ Use followers ➤ Are you running a logical backup (pg_dump) with it? ➤ Nope, stop it ➤ When you need to take a backup, use fork database ➤ Are you making connections super often? Do you have many open connections? ➤ Use connection pooler, PGBouncer ➤ Are you having lots of logs? ➤ Reduce logs
  18. OTHER TIPS

  19. OTHER TIPS ➤ ALWAYS consider using the latest version of

    Postgres! ➤ Free performance improvement! ➤ Use the “right size” instances (plans) ➤ Each instance has its own limit (cpu core, memory size, IOPS) ➤ Heroku Postgres lower plans have “burstable performance” ➤ https://devcenter.heroku.com/articles/heroku-postgres- production-tier-technical-characterization ➤ Tuning tips ➤ https://devcenter.heroku.com/articles/heroku-postgres- database-tuning
  20. USE PG:DIAGNOSE TO DIAGNOSE YOUR DATABASE ➤ Heroku CLI tool

    to show the things like… ➤ Connection Count ➤ Long Queries ➤ Idle in Transaction: very bad (unnecessary locks), do not have ➤ Indexes ➤ Bloat: haven’t touched in this talk but another important thing to monitor ➤ Hit Rate ➤ Blocking Queries ➤ Load ➤ Sequences: don’t run out of integer! ➤ https://devcenter.heroku.com/articles/heroku-postgresql#pg-diagnose
  21. HEROKU POSTGRES WHAT’S NEW & WHAT’S COMING

  22. HEROKU PGSETTINGS (NEW!) ➤ Heroku PGSettings ➤ https://devcenter.heroku.com/articles/heroku-postgres-settings ➤ logging

    setting tweaks ➤ log-lock-waits (default: on for 1000ms+ lock wait) ➤ reduce noisy logs temporary ➤ log-min-duration-statement (default: 2000ms+ to execute) ➤ get more “long running queries” like 1000ms+, or reduce noisy logs ➤ log-statement (default: ddl e.g. CREATE, ALTER and DROP) ➤ all/mod for debugging, auditing ➤ none for reducing noisy logs (e.g. create temp tables) ➤ More to come (also feature requests are welcome!)
  23. HEROKU POSTGRES CREDENTIALS (NEW!) ➤ Heroku Postgres Credentials ➤ https://devcenter.heroku.com/articles/heroku-postgresql-credentials

    ➤ Allow you to create Postgres roles so that you can control access permissions ➤ Common use case ➤ read-only user for analytics team ➤ with a shared database between micro services, give different users (roles) for each app db User table name, password User view name role (credential) analytics: read-only + only has an access to User view
  24. SERVER SIDE PGBOUNCER (COMING) ➤ PGBouncer: connection pooler for Postgres,

    helps get around database connection limits ➤ Have been helping lots of customers with connection limits Postgres app app app dyno dyno dyno Heroku Postgres Server Clients Client side (in-dyno) PGBouncer
  25. SERVER SIDE PGBOUNCER (COMING) ➤ Introducing Postgres server-side PGBouncer ➤

    Uniform across dynos/apps, less app side setup ➤ Simple concept (one bouncer needs to think about the connection limit) Postgres app app app dyno dyno dyno Heroku Postgres Server Clients Server side PGBouncer NEW!
  26. THANK YOU!