Upgrade to Pro — share decks privately, control downloads, hide ads and more …

putting python in postgresql

putting python in postgresql

Conferences Box

March 09, 2012
Tweet

More Decks by Conferences Box

Other Decks in Technology

Transcript

  1. Why on earth would you want to do that? •

    Build in Python rather than banging your head against PL/PGSQL • Utilize PyPi packages • Use a network connection (ZMQ anyone?) • Retrofit features on a large/complex/ proprietary system Tuesday, March 13, 12
  2. Installing pl/python • Ubuntu - aptitude install postgresql-plpython-<version> • Mac

    OS X Homebrew - PYTHON = / path/to/python brew install postgresql Tuesday, March 13, 12
  3. Setting up the database •createdb <database name> •createlang plpythonu <databasename>

    •Check with: SELECT * FROM pg_language Tuesday, March 13, 12
  4. Writing your first function CREATE OR REPLACE FUNCTION pymulti(a integer,

    b integer) returns integer AS $$ return a * b $$ LANGUAGE plpythonu; Tuesday, March 13, 12
  5. Writing your first function CREATE OR REPLACE FUNCTION pymulti(a integer,

    b integer) returns integer AS $$ return a * b $$ LANGUAGE plpythonu; Tuesday, March 13, 12
  6. Datatypes PostgreSQL Python small int and int int bigint long

    boolean bool All text types str SQL Array list Custom Types dict Tuesday, March 13, 12
  7. Debugging • Using print doesn’t work all that well. •

    You can log directly to PostgreSQL’s log file with: plpy.notice(“<msg>”) plpy.debug(“<msg>”) plpy.error(“<msg>”) plpy.fatal(“<msg>”) Tuesday, March 13, 12
  8. Or log with Python... import logging logger = logging.BasicConfig( filename=’/tmp/plpy.log’

    level=logging.INFO ) logging.info(“Hi there!”) Tuesday, March 13, 12
  9. DANGER WILL ROBINSON! • Kind of a pain to maintain

    and debug • Can easily confuse your DBA (if that isn’t you) • Not exactly slow, but definitely not free • Use with caution: This should be in your back of tricks if you need it, but use sparingly • Requires superuser privs and no virtualenvs Tuesday, March 13, 12
  10. When should you use it? • Rolling up/aggregating data •

    Enforce constraints across teams/devs/ languages • Protect data integrity from ad hoc queries • Change/Add features at the DB level when you can’t at the app level Tuesday, March 13, 12
  11. Let’s say we are setup like this... CREATE TABLE trigger_test

    ( id serial, username varchar, is_active boolean default true, balance int4 ); INSERT INTO trigger_test (username, is_active, balance) values ('frankwiles', 't', 1000); INSERT INTO trigger_test (username, is_active, balance) values ('jacobkaplanmoss', 'f', 50); INSERT INTO trigger_test (username, is_active, balance) values (‘jefftriplett', 't', 10); Tuesday, March 13, 12
  12. Let’s ensure you can’t change an inactive user’s balance CREATE

    OR REPLACE FUNCTION check_active() returns trigger AS $$ class NotActive(Exception): pass if not TD["old"]["is_active"]: raise NotActive $$ LANGUAGE plpythonu; Tuesday, March 13, 12
  13. If updating active and balance... CREATE OR REPLACE FUNCTION check_active()

    returns trigger AS $$ class NotActive(Exception): pass if not TD["old"]["is_active"] and not TD["new"]["is_active"]: raise NotActive $$ LANGUAGE plpythonu; Tuesday, March 13, 12
  14. How to hook a trigger to a table CREATE TRIGGER

    double_check_active BEFORE UPDATE ON trigger_test FOR EACH ROW EXECUTE PROCEDURE check_active(); Tuesday, March 13, 12
  15. What happens? pycon=# update trigger_test set balance = 60 where

    username=‘jacobkaplanmoss’; ERROR: NotActive: CONTEXT: Traceback (most recent call last): PL/Python function "check_active", line 5, in <module> raise NotActive PL/Python function "check_active" pycon=# Tuesday, March 13, 12
  16. Info in TD TD[“event”] INSERT, UPDATE, DELETE, TRUNCATE TD[“when”] BEFORE,

    AFTER, or INSTEAD OF TD[“level”] ROW or STATEMENT TD[“new”]/TD[“old”] New and Old data TD[“name”] Name of trigger TD[“table_name”] Table trigger called on TD[“table_schema”] Schema table is in TD[“args”] Arguments to trigger function Tuesday, March 13, 12
  17. CREATE TRIGGER options CREATE TRIGGER name {BEFORE|AFTER} { event[ OR

    ... ] } ON table [ FOR [EACH] {ROW |STATEMENT} ] [ WHEN ( condition ) ] EXECUTE PROCEDURE function_name ( arguments ) Tuesday, March 13, 12
  18. Checking credit card numbers CREATE OR REPLACE FUNCTION checkcc(ccnum varchar)

    returns boolean AS $$ sum = 0 num_digits = len(ccnum) for count in range(0, num_digits): digit = int(card_number[count]) if not (( count & 1 ) ^ oddeven ): digit = digit * 2 if digit > 9: digit = digit - 9 sum = sum + digit if sum & 10 == 0: return True else: return False $$ LANGUAGE plpythonu; Tuesday, March 13, 12
  19. Let’s do something more interesting... CREATE TABLE msgs ( id

    serial, to_user varchar, from_user varchar, read boolean default false, message text ); Tuesday, March 13, 12
  20. CREATE OR REPLACE FUNCTION handle_insert() RETURNS trigger AS $$ import

    redis to_user = TD["new"]["to_user"] unread_key = "unread-%s" % to_user r = redis.Redis() r.incr(unread_count_key) $$ LANGUAGE plpythonu; For INSERTs... Tuesday, March 13, 12
  21. CREATE OR REPLACE FUNCTION handle_update() RETURNS trigger AS $$ import

    redis to_user = TD["new"]["to_user"] unread_key = "unread-%s" % to_user r = redis.Redis() if not TD["old"]["read"] and TD["new"]["read"]: r.decr(unread_count_key) else: r.incr(unread_count_key) $$ LANGUAGE plpythonu; For UPDATEs... Tuesday, March 13, 12
  22. CREATE OR REPLACE FUNCTION handle_delete() RETURNS trigger AS $$ import

    redis to_user = TD["new"]["to_user"] unread_key = "unread-%s" % to_user r = redis.Redis() r.decr(unread_count_key) $$ LANGUAGE plpythonu; For DELETEs... Tuesday, March 13, 12
  23. Attach them to the table CREATE TRIGGER counter_insert AFTER INSERT

    ON msgs FOR EACH ROW EXECUTE PROCEDURE handle_insert(); CREATE TRIGGER counter_update AFTER UPDATE ON msgs FOR EACH ROW EXECUTE PROCEDURE handle_update(); CREATE TRIGGER counter_delete AFTER DELETE ON msgs FOR EACH ROW EXECUTE PROCEDURE handle_delete(); Tuesday, March 13, 12
  24. Full database access! CREATE OR REPLACE FUNCTION db_count() RETURNS trigger

    AS $$ if TD[“new”][“active”] == False: plpy.execute( “INSERT INTO finished_tickets (ticket_id) VALUES (%d)” % TD[“new”] [“id”] $$ LANGUAGE plpythonu; Tuesday, March 13, 12
  25. Internal ideas... • Aggregate/rollup data • Push “noisy” calculations into

    the DB to avoid round trips • Build and maintain materialized views • Regenerate expensive reports only when the underlying data changes enough to warrant it Tuesday, March 13, 12
  26. More external ideas... • Send emails based on data additions/changes

    • Expire/repopulate caches • Fire off celery tasks • Trigger backups based on % of data changed • Hit external APIs based on conditions • Lots of fun to be had with ZeroMQ... Tuesday, March 13, 12
  27. Retrofitting an existing system • Send email alerts • Populate

    your system with data based on INSERTs in another • Want an SMS when your boss adds a ticket to Trac, but don’t want to dive into the code? Tuesday, March 13, 12