Upgrade to Pro — share decks privately, control downloads, hide ads and more …

PostgreSQL Unused Indexes

PostgreSQL Unused Indexes

Presented as an internal engineering team talk January 2021 based on some recent work to remove unused database indexes

Andrew Atkinson

February 02, 2021
Tweet

More Decks by Andrew Atkinson

Other Decks in Programming

Transcript

  1. Removing Unused Indexes
    January 2021
    Andy Atkinson
    andyatkinson.com

    View full-size slide

  2. Benefits of indexes
    Query performance
    Reduce query execution time (reduce "cost")
    Enforce uniqueness (unique constraint)
    Sorted

    View full-size slide

  3. Downsides of Indexes
    Use disk space.
    Slow down UPDATE
    , INSERT
    and DELETE
    operations
    Indexed columns prevent H.O.T. updates
    They contribute to VACUUM
    time
    They add to query planning time
    Space contributes to backup and restore time

    View full-size slide

  4. How did we start down this path?
    Exceeding IOPS limits on cloud provider
    For very large tables we were seeing very infrequent and long running autovacuum
    Removing indexes helps improve write rate

    View full-size slide

  5. Discovery Process 1 of 2
    Make sure these are set to on
    SHOW track_activities;

    SHOW track_counts;

    PG tracks all index scans!
    Watch out for indexes used only by infrequent jobs
    Run unused indexes query on primary and replicas

    View full-size slide

  6. Discovery Process 2 of 2
    Query to find indexes with 0 scans. Credit: Cybertec
    SELECT s.schemaname,

    s.relname AS tablename,

    s.indexrelname AS indexname,

    pg_relation_size(s.indexrelid) AS index_size

    FROM pg_catalog.pg_stat_user_indexes s

    JOIN pg_catalog.pg_index i ON s.indexrelid = i.indexrelid

    WHERE s.idx_scan = 0 -- has never been scanned

    AND 0 <>ALL (i.indkey) -- no index column is an expression

    AND NOT i.indisunique -- is not a UNIQUE index

    AND NOT EXISTS -- does not enforce a constraint

    (SELECT 1 FROM pg_catalog.pg_constraint c

    WHERE c.conindid = s.indexrelid)

    ORDER BY pg_relation_size(s.indexrelid) DESC;

    View full-size slide

  7. Removal Process
    Fetch the create index statement prior to the removal, as a recovery plan.
    SELECT pg_get_indexdef(indexrelid) AS index_query

    FROM pg_index WHERE indrelid = 'users'::regclass;

    Perform the create index...
    with the CONCURRENTLY
    option
    Rails: We drop them concurrently, which requires a migration file with
    disable_ddl_transaction!
    and we use the concurrently
    option.
    Rails: We check for the presence of the index first e.g. index_exists?(:posts,
    :title)
    in case the index wasn't added to all environments

    View full-size slide

  8. Future investigations: Corruption and bloat
    Both tables and indexes may have bloat
    To address bloat, reindex
    the index
    To reindex on PG < 12 consider tools like pg_repack
    On PG >= 12, consider REINDEX INDEX CONCURRENTLY idx
    ;

    View full-size slide

  9. References
    Some helpful posts on removing unused indexes, common DBA tasks, queries for index
    maintenance, and index bloat and index types.
    1. https://www.cybertec-postgresql.com/en/get-rid-of-your-unused-indexes
    2. Common DBA tasks for PostgreSQL
    3. https://www.percona.com/blog/2020/03/31/useful-queries-for-postgresql-index-
    maintenance
    4. Significant database and index bloat
    5. https://thoughtbot.com/blog/postgres-index-types

    View full-size slide

  10. Summary
    Remove unused indexes to improve write rate
    Find unused indexes by looking for zero scans
    For frequently utilized indexes that are bloated, remove the bloat

    View full-size slide