Upgrade to Pro — share decks privately, control downloads, hide ads and more …

PostgreSQL for Pythonistas.

PostgreSQL for Pythonistas.

PostgreSQL is a fantastic database with support for many features not found in other open source databases. This talk will show you how to use some of the more interesting and unique capabilities of PostgreSQL.

Learn how to use advanced column types and how they map to python, how to query an external non-SQL database from PostgreSQL (with SQL) and how to write stored procedures in Python, to implement logic within the database.

The rough outline of the talk will be:

Introduction to myself and what I do.
Quick PostgreSQL history lesson.
Advanced PostgreSQL column types.
Accessing external databases with foreign data wrappers.
Using PLPython to write stored procedures.

This talk requires no previous experience of PostgreSQL but general experience with a RDMS would be useful.

Dougal Matthews

July 04, 2013
Tweet

More Decks by Dougal Matthews

Other Decks in Technology

Transcript

  1. POSTGRESQL
    FOR PYTHONISTAS
    Thursday, 4 July 13

    View Slide

  2. WHAT DO I DO?
    • Working as a senior Python developer for Artirix.
    • Building backend systems and services.
    • Organiser of Python Glasgow.
    Maximising the Value of Content, Data & Information
    Thursday, 4 July 13

    View Slide

  3. AGENDA.
    Combining two of my favourite things! To make the most of
    Python with Postgres, and the most of Postgres with Python.
    • Very brief history.
    • PL/Python.
    • Listen/Notify
    • Foreign Data Wrappers
    Thursday, 4 July 13

    View Slide

  4. HISTORY
    • Ingress (1973)
    • Postgres
    • PostgreSQL (1997)
    Thursday, 4 July 13

    View Slide

  5. Foreign Data Wrappers
    PL/PYTHON
    5
    Thursday, 4 July 13

    View Slide

  6. PL/PYTHON - WHY?
    • Use all the Python libraries.
    • Speak to anything Python can.
    • Python is nicer to write than PL/pgSQL.
    Thursday, 4 July 13

    View Slide

  7. DIVING IN, A SIMPLE FUNCTION.
    CREATE EXTENSION plpythonu;
    CREATE OR REPLACE FUNCTION
    python_pow(a integer, b integer)
    returns integer AS $$
    return pow(a, b)
    $$ LANGUAGE plpythonu;
    Thursday, 4 July 13

    View Slide

  8. SELECT python_pow(2, 10);
    python_pow
    ------------
    1024
    (1 row)
    Thursday, 4 July 13

    View Slide

  9. LOGGING
    CREATE OR REPLACE FUNCTION
    python_pow(a integer, b integer)
    returns integer AS $$
    plpy.notice("a to the power of b")
    return pow(a, b)
    $$ LANGUAGE plpythonu;
    Thursday, 4 July 13

    View Slide

  10. EXCEPTIONS
    CREATE OR REPLACE FUNCTION
    python_pow(a integer, b integer)
    returns integer AS $$
    if a == 0 and b < 0:
    raise Exception("Zero Division")
    return pow(a, b)
    $$ LANGUAGE plpythonu;
    Thursday, 4 July 13

    View Slide

  11. EXCEPTIONS
    SELECT python_pow(0, -1);
    ERROR: Exception: Zero Division
    CONTEXT: Traceback (most recent call last):
    PL/Python function "python_pow", line 3, in

    raise Exception("Zero Division")
    PL/Python function "python_pow"
    Thursday, 4 July 13

    View Slide

  12. TRIGGERS
    CREATE OR REPLACE FUNCTION
    check_editable()
    returns trigger AS $$
    if not TD['old']['is_editable']:
    raise Exception("Not allowed.")
    $$ LANGUAGE plpythonu;
    CREATE TRIGGER verify_editable BEFORE
    UPDATE ON events FOR EACH ROW EXECUTE
    PROCEDURE check_editable();
    Thursday, 4 July 13

    View Slide

  13. TRIGGERS
    UPDATE events SET year = "2014" WHERE id = 123;
    ERROR: Exception: Not Allowed
    CONTEXT: Traceback (most recent call last):
    PL/Python function "check_editable", line 2,
    in
    raise Exception("Not Allowed")
    PL/Python function "check_editable"
    Thursday, 4 July 13

    View Slide

  14. WHEN?
    • Cache invalidation or warm up with triggers.
    • Avoid network round trips.
    • Implement logic constraints in the database.
    Thursday, 4 July 13

    View Slide

  15. LIMITATIONS
    • No virtualenvs and super user privileges. :(
    • Debugging is a pain.
    • Requires Migrations for your Python functions.
    • Use in moderation
    Thursday, 4 July 13

    View Slide

  16. Foreign Data Wrappers
    LISTEN / NOTIFY
    16
    Thursday, 4 July 13

    View Slide

  17. QUICK EXAMPLE
    conn = psycopg2.connect(...)
    curs = conn.cursor()
    curs.execute("LISTEN events_notify;")
    while 1:
    if select.select([conn],[],[],5) == ([],[],[]):
    continue # after a timeout retry
    else:
    conn.poll()
    while conn.notifies:
    n = conn.notifies.pop()
    print notify.channel, notify.payload
    Thursday, 4 July 13

    View Slide

  18. LIMITATIONS
    NOTIFY events_notify, 'New event added';
    NOTIFY events_json, '{"msg": "New event added", "event":
    {"title": "EuroPython"}}';
    Thursday, 4 July 13

    View Slide

  19. Foreign Data Wrappers
    FOREIGN DATA
    WRAPPERS
    19
    Thursday, 4 July 13

    View Slide

  20. FOREIGN DATA WRAPPERS
    • SQL Management of External Data
    • Really good support from 9.1
    • Many backends: Static files, Twitter, CouchDB, Redis ...
    Thursday, 4 July 13

    View Slide

  21. REDIS
    CREATE EXTENSION redis_fdw;
    CREATE SERVER redis_server
    FOREIGN DATA WRAPPER redis_fdw
    OPTIONS (address '127.0.0.1', port '6379');
    CREATE USER MAPPING FOR PUBLIC SERVER redis_server
    OPTIONS (password 'secret');
    Thursday, 4 July 13

    View Slide

  22. REDIS
    CREATE FOREIGN TABLE redis_0(key text, value text)
    SERVER redis_server OPTIONS (database '0');
    SELECT * FROM redis_0 ORDER BY key;
    key | value
    ---------+--------
    event | EuroPython
    location | Florence
    (2 rows)
    Thursday, 4 July 13

    View Slide

  23. REDIS
    CREATE FOREIGN TABLE redis_1(key text, value text[])
    SERVER redis_server
    OPTIONS (tabletype 'hash', tablekeyprefix 'hash',
    database '1');
    SELECT key, hstore(value) FROM redis_1 ORDER BY key;
    key | hstore
    ---------+----------------------------
    event_1 | "name"=>"EuroPython", "where"=>"Florence"
    (1 rows)
    Thursday, 4 July 13

    View Slide

  24. WRITING A FDW
    So, who here likes C?
    Redis - 1.5k SLOC
    MongoDB - 8k SLOC
    Thursday, 4 July 13

    View Slide

  25. WRITING A FDW WITH PYTHON!
    • Multicorn - A FDW that allows you to write other FDW’s in Python.
    • Comes with a number built in and makes it trivial to write more.
    Thursday, 4 July 13

    View Slide

  26. from multicorn import ForeignDataWrapper
    import urllib from icalendar
    import Calendar, Event
    class ICalFdw(ForeignDataWrapper):
    def __init__ ( self , options, columns):
    super (ICalFdw, self ). __init__ (options, columns)
    self .url = options.get( 'url' , None )
    self .columns = columns
    def execute( self , quals, columns):
    ical_file = urllib .urlopen( self .url).read()
    cal = Calendar.from_string(ical_file)
    for v in cal.walk( 'vevent' ):
    e = Event(v)
    line = {}
    for column_name in self .columns:
    line[column_name] = e.decoded(column_name)
    yield line
    Thursday, 4 July 13

    View Slide

  27. CREATE FOREIGN TABLE gmail (
    "Message-ID" character varying,
    "From" character varying,
    "Subject" character varying,
    "payload" character varying,
    "flags" character varying[],
    "To" character varying)
    SERVER multicorn_imap OPTIONS (
    host 'imap.gmail.com',
    port '465',
    payload_column 'payload',
    flags_column 'flags',
    ssl 'True',
    login 'mylogin',
    password 'mypassword'
    );
    SELECT flags, "Subject", payload FROM gmail
    WHERE "Subject" LIKE '%euro%'
    OR "Subject" LIKE '%python%' LIMIT 2;
    Thursday, 4 July 13

    View Slide

  28. OR, SOMETHING MORE INTERESTING.
    • Using Multicorn to write a FDW to use SQLAlchemy to access
    the database your running in.
    Thursday, 4 July 13

    View Slide

  29. OR, SOMETHING MORE INTERESTING
    FOREIGN DATA WRAPPER INCEPTION
    • Using Multicorn to write a FDW to use SQLAlchemy to access
    the database your running in.
    Thursday, 4 July 13

    View Slide

  30. FDW CONCLUSION AND LIMITATIONS
    • Moving data around.
    • For when you really want SQL Joins back.
    • Can be slow - really limited query planner performance.
    Thursday, 4 July 13

    View Slide

  31. BONUS SLIDE - ASYNC QUERIES
    conn = psycopg2.connect(..., async=1)
    wait(conn)
    acurs = conn.cursor()
    acurs.execute("SELECT pg_sleep(10);
    SELECT * from events;")
    # Do something useful here
    wait(acurs.connection)
    print acurs.fetchone()[0]
    Thursday, 4 July 13

    View Slide

  32. POSTGRES WEEKLY
    “A free, once–weekly e-mail round-up of PostgreSQL news and
    articles”
    http://postgresweekly.com/
    Thursday, 4 July 13

    View Slide

  33. Questions?
    Follow me on Twitter: d0ugal
    artirix.com
    dougalmatthews.com
    speakerdeck.com/d0ugal
    Thursday, 4 July 13

    View Slide