Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Zero-downtime Postgres migrations - the hard parts

Zero-downtime Postgres migrations - the hard parts

Talk video: http://www.thedotpost.com/2015/06/chris-sinjakli-zero-downtime-postgresql-migrations

As a payments company, GoCardless can't take unexpected downtime lightly. Back in January, we were hit by a 15 second API outage during a Postgres schema change. We thought we had all the bases covered, but obviously we'd missed something.

In the talk I briefly introduce the problem of zero-downtime schema changes, what happened to us, and give a few tips on how to avoid it happening to you.

The talk is a condensed version of this blog post: https://gocardless.com/blog/zero-downtime-postgres-migrations-the-hard-parts/

Presented at dotScale 2015

Chris Sinjakli

June 08, 2015
Tweet

More Decks by Chris Sinjakli

Other Decks in Programming

Transcript

  1. Zero-downtime
    Postgres migrations -
    the hard parts
    @ChrisSinjo

    View Slide

  2. Apps change
    frequently

    View Slide

  3. Many apps can’t
    take downtime

    View Slide

  4. The easy parts

    View Slide

  5. - Multiple app versions
    - Fast DDL statements

    View Slide

  6. Fine, until it’s not

    View Slide

  7. We took downtime :(

    View Slide

  8. Let’s talk about
    queues

    View Slide

  9. -- Slow query (AccessShare)
    SELECT DISTINCT(customer_id) FROM payments;
    -- Forces this to queue (AccessExclusive)
    ALTER TABLE payments
    ADD COLUMN refunded boolean;
    -- Which blocks these (AccessShare)
    SELECT * FROM payments WHERE id = 123;

    View Slide

  10. - Kill slow queries
    - Set lock_timeout low
    - Enable log_lock_waits

    View Slide

  11. We blogged about this

    two weeks ago
    I’ll tweet the link from @ChrisSinjo

    View Slide