$30 off During Our Annual Pro Sale. View Details »

PostgreSQL for Pythonistas.

PostgreSQL for Pythonistas.

PostgreSQL is a fantastic database with support for many features not found in other open source databases. This talk will show you how to use some of the more interesting and unique capabilities of PostgreSQL.

Learn how to use advanced column types and how they map to python, how to query an external non-SQL database from PostgreSQL (with SQL) and how to write stored procedures in Python, to implement logic within the database.

The rough outline of the talk will be:

Introduction to myself and what I do.
Quick PostgreSQL history lesson.
Advanced PostgreSQL column types.
Accessing external databases with foreign data wrappers.
Using PLPython to write stored procedures.

This talk requires no previous experience of PostgreSQL but general experience with a RDMS would be useful.

Dougal Matthews

July 04, 2013

More Decks by Dougal Matthews

Other Decks in Technology



  2. WHAT DO I DO? • Working as a senior Python

    developer for Artirix. • Building backend systems and services. • Organiser of Python Glasgow. Maximising the Value of Content, Data & Information Thursday, 4 July 13
  3. AGENDA. Combining two of my favourite things! To make the

    most of Python with Postgres, and the most of Postgres with Python. • Very brief history. • PL/Python. • Listen/Notify • Foreign Data Wrappers Thursday, 4 July 13
  4. HISTORY • Ingress (1973) • Postgres • PostgreSQL (1997) Thursday,

    4 July 13
  5. Foreign Data Wrappers PL/PYTHON 5 Thursday, 4 July 13

  6. PL/PYTHON - WHY? • Use all the Python libraries. •

    Speak to anything Python can. • Python is nicer to write than PL/pgSQL. Thursday, 4 July 13

    REPLACE FUNCTION python_pow(a integer, b integer) returns integer AS $$ return pow(a, b) $$ LANGUAGE plpythonu; Thursday, 4 July 13
  8. SELECT python_pow(2, 10); python_pow ------------ 1024 (1 row) Thursday, 4

    July 13
  9. LOGGING CREATE OR REPLACE FUNCTION python_pow(a integer, b integer) returns

    integer AS $$ plpy.notice("a to the power of b") return pow(a, b) $$ LANGUAGE plpythonu; Thursday, 4 July 13
  10. EXCEPTIONS CREATE OR REPLACE FUNCTION python_pow(a integer, b integer) returns

    integer AS $$ if a == 0 and b < 0: raise Exception("Zero Division") return pow(a, b) $$ LANGUAGE plpythonu; Thursday, 4 July 13
  11. EXCEPTIONS SELECT python_pow(0, -1); ERROR: Exception: Zero Division CONTEXT: Traceback

    (most recent call last): PL/Python function "python_pow", line 3, in <module> raise Exception("Zero Division") PL/Python function "python_pow" Thursday, 4 July 13
  12. TRIGGERS CREATE OR REPLACE FUNCTION check_editable() returns trigger AS $$

    if not TD['old']['is_editable']: raise Exception("Not allowed.") $$ LANGUAGE plpythonu; CREATE TRIGGER verify_editable BEFORE UPDATE ON events FOR EACH ROW EXECUTE PROCEDURE check_editable(); Thursday, 4 July 13
  13. TRIGGERS UPDATE events SET year = "2014" WHERE id =

    123; ERROR: Exception: Not Allowed CONTEXT: Traceback (most recent call last): PL/Python function "check_editable", line 2, in <module> raise Exception("Not Allowed") PL/Python function "check_editable" Thursday, 4 July 13
  14. WHEN? • Cache invalidation or warm up with triggers. •

    Avoid network round trips. • Implement logic constraints in the database. Thursday, 4 July 13
  15. LIMITATIONS • No virtualenvs and super user privileges. :( •

    Debugging is a pain. • Requires Migrations for your Python functions. • Use in moderation Thursday, 4 July 13
  16. Foreign Data Wrappers LISTEN / NOTIFY 16 Thursday, 4 July

  17. QUICK EXAMPLE conn = psycopg2.connect(...) curs = conn.cursor() curs.execute("LISTEN events_notify;")

    while 1: if select.select([conn],[],[],5) == ([],[],[]): continue # after a timeout retry else: conn.poll() while conn.notifies: n = conn.notifies.pop() print notify.channel, notify.payload Thursday, 4 July 13
  18. LIMITATIONS NOTIFY events_notify, 'New event added'; NOTIFY events_json, '{"msg": "New

    event added", "event": {"title": "EuroPython"}}'; Thursday, 4 July 13
  19. Foreign Data Wrappers FOREIGN DATA WRAPPERS 19 Thursday, 4 July

  20. FOREIGN DATA WRAPPERS • SQL Management of External Data •

    Really good support from 9.1 • Many backends: Static files, Twitter, CouchDB, Redis ... Thursday, 4 July 13

    redis_fdw OPTIONS (address '', port '6379'); CREATE USER MAPPING FOR PUBLIC SERVER redis_server OPTIONS (password 'secret'); Thursday, 4 July 13
  22. REDIS CREATE FOREIGN TABLE redis_0(key text, value text) SERVER redis_server

    OPTIONS (database '0'); SELECT * FROM redis_0 ORDER BY key; key | value ---------+-------- event | EuroPython location | Florence (2 rows) Thursday, 4 July 13
  23. REDIS CREATE FOREIGN TABLE redis_1(key text, value text[]) SERVER redis_server

    OPTIONS (tabletype 'hash', tablekeyprefix 'hash', database '1'); SELECT key, hstore(value) FROM redis_1 ORDER BY key; key | hstore ---------+---------------------------- event_1 | "name"=>"EuroPython", "where"=>"Florence" (1 rows) Thursday, 4 July 13
  24. WRITING A FDW So, who here likes C? Redis -

    1.5k SLOC MongoDB - 8k SLOC Thursday, 4 July 13
  25. WRITING A FDW WITH PYTHON! • Multicorn - A FDW

    that allows you to write other FDW’s in Python. • Comes with a number built in and makes it trivial to write more. Thursday, 4 July 13
  26. from multicorn import ForeignDataWrapper import urllib from icalendar import Calendar,

    Event class ICalFdw(ForeignDataWrapper): def __init__ ( self , options, columns): super (ICalFdw, self ). __init__ (options, columns) self .url = options.get( 'url' , None ) self .columns = columns def execute( self , quals, columns): ical_file = urllib .urlopen( self .url).read() cal = Calendar.from_string(ical_file) for v in cal.walk( 'vevent' ): e = Event(v) line = {} for column_name in self .columns: line[column_name] = e.decoded(column_name) yield line Thursday, 4 July 13
  27. CREATE FOREIGN TABLE gmail ( "Message-ID" character varying, "From" character

    varying, "Subject" character varying, "payload" character varying, "flags" character varying[], "To" character varying) SERVER multicorn_imap OPTIONS ( host 'imap.gmail.com', port '465', payload_column 'payload', flags_column 'flags', ssl 'True', login 'mylogin', password 'mypassword' ); SELECT flags, "Subject", payload FROM gmail WHERE "Subject" LIKE '%euro%' OR "Subject" LIKE '%python%' LIMIT 2; Thursday, 4 July 13
  28. OR, SOMETHING MORE INTERESTING. • Using Multicorn to write a

    FDW to use SQLAlchemy to access the database your running in. Thursday, 4 July 13

    Multicorn to write a FDW to use SQLAlchemy to access the database your running in. Thursday, 4 July 13
  30. FDW CONCLUSION AND LIMITATIONS • Moving data around. • For

    when you really want SQL Joins back. • Can be slow - really limited query planner performance. Thursday, 4 July 13
  31. BONUS SLIDE - ASYNC QUERIES conn = psycopg2.connect(..., async=1) wait(conn)

    acurs = conn.cursor() acurs.execute("SELECT pg_sleep(10); SELECT * from events;") # Do something useful here wait(acurs.connection) print acurs.fetchone()[0] Thursday, 4 July 13
  32. POSTGRES WEEKLY “A free, once–weekly e-mail round-up of PostgreSQL news

    and articles” http://postgresweekly.com/ Thursday, 4 July 13
  33. Questions? Follow me on Twitter: d0ugal artirix.com dougalmatthews.com speakerdeck.com/d0ugal Thursday,

    4 July 13