Upgrade to Pro — share decks privately, control downloads, hide ads and more …

The Sad State of PostgreSQL Monitoring

Sponsored · Your Podcast. Everywhere. Effortlessly. Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
Avatar for deric deric
March 26, 2018

The Sad State of PostgreSQL Monitoring

PostgreSQL monitoring tools talks given at https://www.meetup.com/Prague-PostgreSQL-Meetup/events/tbrslpyxfbjc/ on 26th March 2018

Avatar for deric

deric

March 26, 2018
Tweet

Other Decks in Technology

Transcript

  1. PostgreSQL monitoring • PostgreSQL is an awesome database! We try

    to focus on premise open-source solutions.
  2. PostgreSQL monitoring • PostgreSQL is an awesome database! • PostgreSQL

    monitoring tools are not so awesome We try to focus on premise open-source solutions.
  3. PostgreSQL monitoring • PostgreSQL is an awesome database! • PostgreSQL

    monitoring tools are not so awesome • Database developers are usually not good with UX (or developing frontend) We try to focus on premise open-source solutions.
  4. PostgreSQL monitoring • PostgreSQL is an awesome database! • PostgreSQL

    monitoring tools are not so awesome • Database developers are usually not good with UX (or developing frontend) • (Too) Many monitoring tools are available We try to focus on premise open-source solutions.
  5. From PostgreSQL wiki: monitoring solutions • EnterpriseDB Postgres Enterprise Manager

    • pganalyze • pgwatch2 • pg_statsinfo & pg_stats_reporter • PGObserver • pgCluu • PoWA • OPM: Open PostgreSQL Monitoring Source: https://wiki.postgresql.org/wiki/Monitoring
  6. 1. Basic OS metrics • CPU • RAM • Disk

    I/O 2. PostgreSQL statistics • Database stats • Table stats • Index stats • Query stats 3. Configurable dashboards • Simple UI 4. Configuration hints • Query optimization • Index/Table bloat 5. Alerting Requirements
  7. Common metrics • most interesting metrics can be collected from

    build-in tables: • pg_stat_database • pg_stat_bgwriter
  8. Supported functions function/version 9.2+ 9.1 8.3-9.0 8.1-8.2 7.4-8.0 datid x

    x x x datname x x x x numbackends x x x x x xact_commit x x x x x xact_rollback x x x x x blks_read x x x x x blks_hit x x x x x tup_returned x x x tup_fetched x x x
  9. Supported functions (continue) function/version 9.2+ 9.1 8.3-9.0 8.1-8.2 7.4-8.0 tup_returned

    x x x tup_fetched x x x tup_updated x x x tup_deleted x x x conflicts x x temp_files x temp_bytes x deadlocks x blk_read_time x
  10. PGObserver • From Zalando • Components: • Gatherer (Java) •

    Frontend (Python + JavaScript) • Backend DB (PostgreSQL) • Optional: • Aggregation crons • Blocking monitor • InfluxDB exporter • Timeline analysis
  11. Interesting features • Unused indexes • Duplicate indexes • Bloated

    tables Disadvantages • Not actively developed • Old-school interface • Complicated installation PGOBserver - summary
  12. OPM (Open PostgreSQL Monitoring) • Components: • opm-core (JavaScript/Perl) •

    check_pgactivity • wh_nagios (PostgreSQL extension) • PostgreSQL (data storage) • nagios / icinga (check execution)
  13. • connections • bgwriter • btree_bloat • commit_ratio • database_size

    • hit_ratio • invalid_indexes • last_analyze • last_vacuum • longest_query • max_freeze_age • streaming_delta • temp_files • wal_files OPM - many checks few examples (not a complete list):
  14. OPM - example • check_streaming_delta • reflects changes in PostgreSQL

    versions q{SELECT application_name, client_addr, pid, sent_lsn, write_lsn, flush_lsn, replay_lsn, CASE pg_is_in_recovery() WHEN true THEN pg_last_wal_receive_lsn() ELSE pg_current_wal_lsn() END FROM pg_stat_replication WHERE state NOT IN ('startup', 'backup')}, 01. 02. 03. 04. 05. 06.
  15. Interesting features • Well maintained nagios check script • No

    need to install agent (if you're using Nagios) Disadvantages • Clumsy UI • Can't compare multiple databases OPM - summary
  16. PostreSQL Workload Analyzer (PoWA) • Components: • powa-archivist (PostgreSQL extension)

    • powa-web (Python/JavaScript) • Optional: • pg_stat_statements - providing data about queries being executed • pg_qualstats - providing data about predicates, or where clauses • pg_stat_kcache - providing data about operating-system level cache • HypoPG - allowing you to create hypothetical indexes
  17. Interesting features • Query hints • Resources used per query

    • Package for Debian/Redhat Disadvantages • Clumsy UI • Can't compare multiple databases • PoWA is storing data in DB we're monitoring PoWA - summary
  18. pgCluu • Statically generated reports • Components: • pgcluu_collectd (Perl)

    • pgcluu.service - generate reports (Perl/JavaScript)
  19. Interesting features • System/database statistics in one place Disadvantages •

    Clumsy UI • Can't compare multiple databases • Can't change date range pgCluu - summary
  20. Interesting features • Query hints • Table stats • Package

    for Debian/Redhat • Easy installation Disadvantages • SaaS • No free trial (with full features) • No open-source backend pganalyze - summary
  21. What we're getting? • Various attempts on frontend UI •

    Different techniques of collecting metrics
  22. What we're getting? • Various attempts on frontend UI •

    Different techniques of collecting metrics • Incompatible agents/storage
  23. What we're getting? • Various attempts on frontend UI •

    Different techniques of collecting metrics • Incompatible agents/storage • Hardly customizable solutions
  24. What we're getting? • Various attempts on frontend UI •

    Different techniques of collecting metrics • Incompatible agents/storage • Hardly customizable solutions • Similar functionality reimplemented many times
  25. What do we need? • Unobtrusive monitoring • Common protocols

    • Modular components / customizable features
  26. What do we need? • Unobtrusive monitoring • Common protocols

    • Modular components / customizable features • Usable UI
  27. pgwatch2 • Components: • Metrics collector (go) • Web UI

    (Grafana) • Timeseries DB (InfluxDB) • Config DB (PostgreSQL)
  28. Interesting features • Great UI (thx to Grafana!) • Timeseries

    storage • Default data retention 90 days ( PW2_IRETENTIONDAYS ) Disadvantages • High coupling (single container) • Persistence? • Scalability? pgwatch2 - summary
  29. Collectd • Metric collector (written in C) • Includes predefined

    queries for PostgreSQL • Customizable queries • Supports OS metrics (CPU, RAM, disk plugins) • Multiple supported outputs: • Graphite • Prometheus • Riemann • OpenTSDB
  30. Collectd example <Query table_states> Statement "SELECT sum(n_live_tup) AS live,\ sum(n_dead_tup)

    AS dead FROM pg_stat_user_tables;" <Result> Type "pg_n_tup_g" InstancePrefix "live" ValuesFrom "live" </Result> MinVersion 80300 </Query> 01. 02. 03. 04. 05. 06. 07. 08. 09. 10.
  31. Telegraf • Metric collector (go) • Reads data from pg_stat_database

    and pg_stat_bgwriter • Part of TICK stack • Supports OS metrics (CPU, RAM, disk plugins) • Customizable queries [[inputs.postgresql]] address = "postgres://telegraf@localhost/someDB" databases = ["app_production"] 01. 02. 03.
  32. Datadog • Basic metrics description • Collecting metrics from pg_stat*

    Open-source • Similar dashboards can be displayed in Grafana • Most problematic is storing high- cardinality data in timeseries DB Examples of SaaS solutions
  33. Conclusion • DB clusters might require different approaches to monitoring

    • Metric collection daemons ( collectd , telegraf ) works well for common metrics • Grafana is currently best (open-source) UI for monitoring dashboards • Expensive monitoring queries should be executed in less frequent intervals
  34. Future? • Monitoring PostgreSQL from container (no DB restart needed)

    • Unix-like components (single purpose) rather than complex monitoring systems • Advanced query analysis and hints
  35. • Sad Elephant by gawrifort • You're being monitored •

    Kitchen Stories (2003) • Business vector created by Dooder - Freepik.com • Icons made by Webalys Freebies from www.flaticon.com • Building software by Manu