$30 off During Our Annual Pro Sale. View Details »

Postgres: an intro for new developers

Postgres: an intro for new developers

Selena Deckelmann

February 27, 2015
Tweet

More Decks by Selena Deckelmann

Other Decks in Technology

Transcript

  1. Postgres
    Postgres: An introduction
    Selena Deckelmann
    Mozilla
    @selenamarie
    [email protected]

    View Slide

  2. AMA

    Please, ask questions.

    Hands raised is useful so I look toward the voice.

    Can I get a scribe volunteer?

    View Slide

  3. Socorro

    Crash storage and analysis for Breakpad, the crash
    collection client for Firefox

    3-5 million crashes per day

    S3, Postgres, Elastic Search, RabbitMQ

    Python, .js and some C

    http://socorro.readthedocs.org

    View Slide

  4. View Slide

  5. about:crashes

    View Slide

  6. crash-stats.mozilla.com

    View Slide

  7. View Slide

  8. View Slide

  9. View Slide

  10. Postgres Developer Summit, 2008

    View Slide

  11. History of Postgres

    Code first written in 1986

    Open sourced in 1996 (version 6.0)

    Current released version: 9.4

    300-500 contributors per year

    Extremely active developer and user community

    View Slide

  12. My story

    Worked on an ERP migration for a bike company from
    Mac OS 9 DB Qube → Postgres

    Was already into open source, started a user group

    Started running conferences

    Was a consultant for a few years

    Worked on a pathological schema with ~500k+ tables,
    100k schemas

    View Slide

  13. Key features

    SQL system

    Strict attention to standards

    Procedural languages in the database like pl/v8 (.js) and
    pl/python

    Rich developer interfaces

    Designed for stability and data safety

    View Slide

  14. Framework support

    Django prefers Postgres :)

    Flask support is great

    Tell me what frameworks I should know about :)

    View Slide

  15. Cloudy Postgres

    Heroku
    heroku addons:add heroku­postgresql:hobby­dev

    Amazon RDS Postgres

    Engine Yard

    Rackspace

    View Slide

  16. DIY Postgres

    There are Puppet/Chef (or whatever) recipes out there!

    Main files to be concerned about:
    postgresql.conf # DB config (GUCs)
    pg_hba.conf # access control
    recovery.conf # replication

    Postgres team distributes RPMs and .deb and we have
    packagers for most major distros

    OS X? Postgres.app for local dev http://postgresapp.com/
    Wild west for reproducible deploys. brew maybe?

    Windows has good support from EnterpriseDB

    View Slide

  17. Types: your secret weapon

    Help you keep your data tidy

    Make queries faster!

    Makes migrations and testing easier

    View Slide

  18. Types: your secret weapon

    Core: TEXT, INTEGER, BIGINT, BOOLEAN, FLOAT,
    NUMERIC, DATE, TIMESTAMPTZ
    and so many more!

    Common: UUID, JSON, JSONB (9.4), tsearch types (full
    text search)

    Third party: PostGIS
    – We guess half of all Postgres users use PostGIS

    View Slide

  19. Built in functions

    http://www.postgresql.org/docs/9.4/static/functions.ht
    ml

    Tons of work already done for you for doing useful and
    common things
    – Network address manipulation:
    http://www.postgresql.org/docs/9.4/static/functions
    -net.html
    – Date/time functions and operators:
    http://www.postgresql.org/docs/9.4/static/functions
    -datetime.html

    Excellent support for range types means lots of
    awesome, easy to write queries for you.

    View Slide

  20. protips

    JOIN using columns of the same type

    Never SELECT *

    Indexes can be made from functions

    Test with ROLLBACK

    EXPLAIN and EXPLAIN ANALYZE

    ANALYZE is your friend

    pg_stat_user_tables/indexes

    UNLOGGED tables

    View Slide

  21. JOIN using columns of the same type

    Example of something that seems ok, but turns out bad
    at 200m rows:
    SELECT things, stuff
    FROM foo
    JOIN bar ON foo.uuid = bar.uuid::text;

    Use the type system. It will help you.

    View Slide

  22. Never SELECT *

    Seems ok:
    SELECT * FROM simple_table;

    Hmmm:
    SELECT * FROM simple_table
    JOIN splendid_table ON simple_table.id = splendid_table.id;

    Errrgghhhhh...
    SELECT * FROM simple_table s1
    JOIN splendid_table s2 ON s1.id = s2.id
    JOIN superb_table s3 ON s1.id = s3.id
    JOIN whatevs_table s4 ON s1.thing = s4.thing;

    View Slide

  23. Indexes can be made with functions

    Useful for matching common query predicates
    CREATE INDEX new_index ON
    raw_crashes(get_text('name', json_data));

    Good for glue between feature updates or
    stepping stones for schema refactoring

    View Slide

  24. Test with ROLLBACK

    Table and data changes are transaction safe in Postgres
    BEGIN;
    DROP TABLE important_things;
    -- oops
    ROLLBACK;

    View Slide

  25. EXPLAIN and EXPLAIN ANALYZE

    EXPLAIN is how Postgres can tell you how it decided to
    execute a query.

    Easy help with the output: http://explain.depesz.com/

    EXPLAIN ANALYZE actually runs the query (caution!)
    and then includes real durations

    View Slide

  26. ANALYZE is your friend

    ANALYZE is how statistics about your tables are created
    – Creates a histogram of the contents of each column and
    common values (if relevant)
    – Used by the planner to choose efficient plans for queries

    ANALYZE after bulk data loads!
    – Autoanalyze runs periodically on its own, but right after
    huge data imports stats can be out of date

    Increase default_statistics_target if your EXPLAIN plans are
    wacky/indeterminate (default is 100, and fine for many
    situations)

    View Slide

  27. pg_stat_user_tables/indexes

    Tons of info about tables can be accessed with SELECT

    Estimated row count, index scans, sequential scans,
    dead/live tuples and autovacuum/autoanalyze runs

    Common use: find out indexes aren't being used or
    tables have a ton of expensive seq_scans

    View Slide

  28. Unlogged tables

    Caution: MongoDB mode

    You can create UNLOGGED tables
    CREATE UNLOGGED TABLE foo (bar int);

    Fast

    Not replicated, not saved if database crashes or
    shutdown uncleanly

    Can be useful for test suites (do your own benchmark!)

    Lots of other uses if you're working with ephemeral data

    View Slide

  29. Great resources

    Postgres Guide http://www.postgresguide.com/

    The main docs:
    http://www.postgresql.org/docs/9.4/static/

    Planet Postgres: http://planet.postgresql.org/

    Mailing lists: [email protected] ,
    [email protected],
    [email protected], [email protected]

    IRC: Freenode #postgresql

    View Slide

  30. Postgres
    Postgres: An introduction
    Selena Deckelmann
    Mozilla
    @selenamarie
    [email protected]

    View Slide