Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Top developer's mistakes when working with PostgreSQL

Top developer's mistakes when working with PostgreSQL

Slides from my talk about most frequent mistakes that developers do when working with PostgreSQL. Highload++ 2018 Moscow, Russia

Alexey Lesovsky

November 08, 2018
Tweet

More Decks by Alexey Lesovsky

Other Decks in Programming

Transcript

  1. 3

  2. 4

  3. Строка 1 Строка 2 Строка 3 Строка 4 0 2

    4 6 8 10 12 Столбец 1 Столбец 2 Столбец 3 5 Проблемы есть всегда
  4. 9 plproxy pgq jsonb foreign tables procedural languages extensions hstore

    plpgsql parallel queries postgresql CTE LEFT JOIN RIGHT JOIN LATERAL window functions aggregate functions subqueries arrays enums intarray
  5. 10 full text search plproxy pgq postgis jsonb listen/notify partitioning

    inheritance streaming replication logical replication foreign tables data wrappers procedural languages extensions WAL archiving hstore xml2 plpgsql parallel queries postgresql CTE LEFT JOIN RIGHT JOIN LATERAL window functions aggregate functions subqueries arrays custom types enums intarray
  6. 11 full text search plproxy pgq londiste postgis jsonb listen/notify

    partitioning inheritance streaming replication logical replication foreign tables data wrappers procedural languages extensions pgbouncer pgpool-II patroni stolon repmgr pgbarman pg_basebackup WAL archiving hstore xml2 plpgsql parallel queries postgresql pg_prewarm CTE LEFT JOIN RIGHT JOIN LATERAL window functions aggregate functions subqueries arrays custom types enums intarray
  7. 12 full text search plproxy pgq londiste postgis jsonb listen/notify

    partitioning inheritance streaming replication logical replication foreign tables data wrappers procedural languages extensions pgbouncer pgpool-II patroni stolon repmgr pgbarman pg_basebackup WAL archiving hstore xml2 plpgsql parallel queries postgresql pg_prewarm CTE LEFT JOIN RIGHT JOIN LATERAL window functions aggregate functions subqueries arrays custom types enums intarray tablespaces pgadmin pgBackRest row level security SSL temporary tables serializable savepoints prepared transactions publications subscriptions OLAP analytics audit logging OLTP ltree bloom gist gin dblink uuid cube pgcrypto isn tcn tablefunc citext pg_repack triggers pg_upgrade slony BDR bucardo
  8. 13

  9. 15

  10. 16

  11. 17

  12. 33

  13. List of relations Schema | Name | Type | Owner

    | Size | Description --------+------------------+-------+----------+---------+------------- public | history_log | table | postgres | 61 GB | public | profiling | table | postgres | 785 MB | public | symbols | table | postgres | 580 MB | public | module_deps | table | postgres | 505 MB | 36
  14. List of relations Schema | Name | Type | Owner

    | Size | Description --------+------------------+-------+----------+---------+------------- public | history_log | table | postgres | 61 GB | public | profiling | table | postgres | 785 MB | public | symbols | table | postgres | 580 MB | public | module_deps | table | postgres | 505 MB | 39
  15. SELECT MAX(loc_pp.PriceStd) FROM C_BPartner_Location bpl, M_PriceList_Vers WHERE bpl.C_BPartner_Location_ID = p_bp_location_id

    AND loc_plv.M_PriceList_ID = bpl.M_PriceList_ID AND loc_plv.ValidFrom = (SELECT MAX(plv2.ValidFrom) FROM M_PriceList_Version plv2, M_ProductPrice loc_pp2 WHERE plv2.M_PriceList_ID = bpl.M_PriceList_ID AND plv2.IsActive=? AND plv2.ValidFrom <= NOW() AND loc_pp2.M_Product_ID = loc_pp.M_Product_ID AND loc_pp2.M_PriceList_Version_ID = plv2.M_PriceL AND loc_pp2.IsActive=?) AND loc_pp.M_PriceList_Version_ID = loc_plv.M_PriceList_ AND loc_pp.M_Product_ID = p_product_id 42 Разработчик составил запрос Пример 3.
  16. SELECT MAX(loc_pp.PriceStd) FROM C_BPartner_Location bpl, M_PriceList_Vers WHERE bpl.C_BPartner_Location_ID = p_bp_location_id

    AND loc_plv.M_PriceList_ID = bpl.M_PriceList_ID AND loc_plv.ValidFrom = (SELECT MAX(plv2.ValidFrom) FROM M_PriceList_Version plv2, M_ProductPrice loc_pp2 WHERE plv2.M_PriceList_ID = bpl.M_PriceList_ID AND plv2.IsActive=? AND plv2.ValidFrom <= NOW() AND loc_pp2.M_Product_ID = loc_pp.M_Product_ID AND loc_pp2.M_PriceList_Version_ID = plv2.M_PriceL AND loc_pp2.IsActive=?) AND loc_pp.M_PriceList_Version_ID = loc_plv.M_PriceList_ AND loc_pp.M_Product_ID = p_product_id 43 Вычитывает M таблиц Параллельно в N потоков X тысяч раз в секунду ОРМ Разработчик составил запрос
  17. 44

  18. Можно вспомнить про 45 Тормоза фоновых процессов СУБД Накладные расходы

    от виртуализации Хранилище от китайского noname Дефолтные конфигурации ...
  19. 50

  20. 54 =# \d+ beautiful_logs Table "public.beautiful_logs" Column | Type |

    ------------+-----------------------------+ id | integer | md5 | character varying(255) | data | json | created_at | timestamp without time zone | lessons | integer[] | =# SELECT length(data::text) -# FROM beautiful_logs -# ORDER BY beautiful_logs.id -# DESC LIMIT 1; length --------- 8818942
  21. 57 =# DELETE FROM history_log -# WHERE created_at < '2018-10-01';

    DELETE 165517399 Time: 585478.451 ms =# DROP TABLE history_log_2018_10; DROP TABLE Time: 4.225 ms
  22. 59 Мониторинг Storage -- Latency, Utilization PostgreSQL Highload++ Siberia: PostgreSQL

    Monitoring https://youtu.be/Hbi2AFhd4nY PostgreSQL Monitoring Wiki https://wiki.postgresql.org/wiki/Monitoring Подключенные клиенты Ошибки Запросы (statemestatements)
  23. – Дайте мне адрес базы, дальше я сам. (С) Anonymous

    developer 63 Неинтересно как работает база База в виде строки в конфиге Единственная точка входа в базу Фантазии при написании запросов, дают ошеломляющие эффекты
  24. 65 OLTP – транзакции • Быстрые • Короткие • Легкие

    OLAP – аналитика • Медленные • Долгие • Тяжелые
  25. 66 OLTP – транзакции • Быстрые • Короткие • Легкие

    OLAP – аналитика • Медленные • Долгие • Тяжелые
  26. 85

  27. Idle transactions 87 Снижение производительности Блокировки и дедлоки Источник HTTP

    50* Все становится интереснее, когда появляются очереди
  28. 92 MVCC – движок СУБД Необходим сборщик мусора Долгие транзакции

    – источник мусора Дольше транзакции – больше мусора, ниже производительность
  29. 94 Давайте сходим во внешний источник Нет обработки ошибок «Человеческий

    фактор» Что делать? Откуда они берутся? Алерты в мониторинге pg_terminate_backend(pid) Рефакторинг приложения
  30. 99 Самописные очереди Table "queues.background_jobs" Column | Type | -----------------------------+-----------------------------+-

    id | bigint | created_at | timestamp without time zone | updated_at | timestamp without time zone | run_at | timestamp without time zone | priority | integer | queue | character varying(256) | manager | character varying(256) | n_attempts | integer | last_error | character varying(1024) | error_at | timestamp without time zone | locked_at | timestamp without time zone | locked_by | character varying(256) |
  31. 100

  32. 102 pos:1 avg_time: 24656.41ms query: SELECT COUNT(*) AS count_all, priority

    AS priority FROM background_jobs WHERE (run_at <= $1 and error_at is NULL) GROUP BY priority pos:2 avg_time: 21241.07ms query: SELECT COUNT(*) FROM background_jobs WHERE (error_at is not NULL) pos:3 avg_time: 19573.19ms query: SELECT COUNT(*) FROM background_jobs WHERE (locked_by is not NULL) pos:4 avg_time: 17396.60ms query: SELECT COUNT(*) AS count_all, queue AS queue FROM background_jobs WHERE (run_at <= '2018-10-20 21:00:00.597019' and error_at is NULL) GROUP BY queue pos:5 avg_time: 46.81ms query: UPDATE background_jobs SET locked_at = $1, locked_by = $2 WHERE id IN (SELECT background_jobs.id FROM background_jobs WHERE ... pos:6 avg_time: 21931.48ms query: UPDATE background_jobs SET locked_by = NULL, locked_at = NULL WHERE background_jobs.locked_by = 'host:6f342155 pid:1'
  33. 103 pos:1 avg_time: 24656.41ms query: SELECT COUNT(*) AS count_all, priority

    AS priority FROM background_jobs WHERE (run_at <= $1 and error_at is NULL) GROUP BY priority pos:2 avg_time: 21241.07ms query: SELECT COUNT(*) FROM background_jobs WHERE (error_at is not NULL) pos:3 avg_time: 19573.19ms query: SELECT COUNT(*) FROM background_jobs WHERE (locked_by is not NULL) pos:4 avg_time: 17396.60ms query: SELECT COUNT(*) AS count_all, queue AS queue FROM background_jobs WHERE (run_at <= '2018-10-20 21:00:00.597019' and error_at is NULL) GROUP BY queue pos:5 avg_time: 46.81ms query: UPDATE background_jobs SET locked_at = $1, locked_by = $2 WHERE id IN (SELECT background_jobs.id FROM background_jobs WHERE ... pos:6 avg_time: 21931.48ms query: UPDATE background_jobs SET locked_by = NULL, locked_at = NULL WHERE background_jobs.locked_by = 'host:6f342155 pid:1'
  34. 104 pos:1 avg_time: 24656.41ms query: SELECT COUNT(*) AS count_all, priority

    AS priority FROM background_jobs WHERE (run_at <= $1 and error_at is NULL) GROUP BY priority pos:2 avg_time: 21241.07ms query: SELECT COUNT(*) FROM background_jobs WHERE (error_at is not NULL) pos:3 avg_time: 19573.19ms query: SELECT COUNT(*) FROM background_jobs WHERE (locked_by is not NULL) pos:4 avg_time: 17396.60ms query: SELECT COUNT(*) AS count_all, queue AS queue FROM background_jobs WHERE (run_at <= '2018-10-20 21:00:00.597019' and error_at is NULL) GROUP BY queue pos:5 avg_time: 46.81ms query: UPDATE background_jobs SET locked_at = $1, locked_by = $2 WHERE id IN (SELECT background_jobs.id FROM background_jobs WHERE ... pos:6 avg_time: 21931.48ms query: UPDATE background_jobs SET locked_by = NULL, locked_at = NULL WHERE background_jobs.locked_by = 'host:6f342155 pid:1'
  35. 111

  36. 122 Stateful Нужно где-то хранить Open Source: CEPH, GlusterFS, DRBD,

    ... Оно не будет быстро Дополнительный саппорт кластерной FS
  37. 124 Если сильно хочется Local volumes Streaming replication PostgreSQL операторы

    Zalando https://github.com/zalando-incubator/postgres-operator Crunchy https://github.com/CrunchyData/postgres-operator
  38. 125

  39. 127 Планирование и мониторинг • Не жмитесь на SSD •

    Не пишите в базу всё подряд • Мониторинг нужен даже консалтерам • https://youtu.be/Hbi2AFhd4nY
  40. 128 Масштабирование • Разносите нагрузку – Postgres is ready •

    Streaming replication • Publications, subscriptions • Foreign Tables, Declarative partitioning