Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Python & PostgreSQL - A Wonderful Wedding (English)

Python & PostgreSQL - A Wonderful Wedding (English)

Python and PostgreSQL, two tools we like to use for our projects but do you know everything about them?

The talk will give an overview of psycopg2, Peewee, SQLAlchemy, Alembic and PL/Python, these libraries can be used with PostgreSQL.

* psycopg2, the well known connector, this basic component is really useful, well documented and battle-tested and used by the most famous toolkits of the Python ecosystem.
* Peewee is a basic but awesome Object Relational Mapper.
* SQLAlchemy, a Python SQL toolkit and Object Relational Mapper, you can use this library to create your models and interact with them.
* Alembic, a lightweight database migration tool for usage with SQLAlchemy, allows to create some migration scripts for your project.
* PL/Python, a procedural language for PostgreSQL, allows to write functions in the Python language.
* MultiCorn, a Foreign Data Wrapper in Python

Stéphane Wirtel

January 31, 2015
Tweet

More Decks by Stéphane Wirtel

Other Decks in Programming

Transcript

  1. Python & Postgresql
    A Wonderful Wedding
    Stéphane Wirtel
    @PythonFOSDEM 2015
    http://goo.gl/c3U0vA

    View full-size slide

  2. Stéphane Wirtel
    #fellow Python Software Foundation
    #member EuroPython Society
    #member AFPy
    @PythonFOSDEM
    @matrixise
    [email protected]
    http://wirtel.be

    View full-size slide

  3. Schedule
    Qu’allons-nous apprendre ?

    View full-size slide

  4. Python, what is it?
    Programming language
    Clear and easy syntax
    Interpreted
    Multi-plateforms
    Dynamic Typing
    Multi-paradigms
    Garbage Collector

    View full-size slide

  5. PostgreSQL - Story
    Developed at Berkeley
    Rewriting of Ingres
    1985 -> Postgres (come from Post-Ingres)
    1995 -> Postgres95 (0.01) (Add the SQL Interpreter)
    1997 -> PostgreSQL 6.0
    2013 -> PostgreSQL 9.3
    2014 -> PostgreSQL 9.4 (b4)

    View full-size slide

  6. PostgreSQL - TL;DR
    ORDBMS, ACID
    SQL:2011
    DataTypes
    Transactional DDL
    Concurrent Index
    Extensions
    Common Table Expression
    MultiVersion Concurrency Control
    Cross-Platform

    View full-size slide

  7. PostgreSQL - TL;DR
    Replication
    Foreign Data Wrappers
    Procedural Languages
    Triggers
    Full text Search
    Views (Materialized)
    Table Inheritance
    Listen/Notify

    View full-size slide

  8. PostgreSQL, FDW ?
    Allow to use external resources from PostgreSQL
    Twitter, RSS, CSV, XML
    FileSystem
    Processes

    View full-size slide

  9. PostgreSQL, PL ?
    PL => Procedural Languages
    Stored Functions
    Safe (sandbox, SQL, PL/pgSQL) / Unsafe (C)
    PL/Python, PL/V8, PL/PERL

    View full-size slide

  10. DB-API 2.0
    The Standard…
    http://python.org/peps/pep-0249

    View full-size slide

  11. DB-API 2.0 aka PEP-249
    API for the Database Engines Connectors
    Easy to use and to understand
    Use 2 concepts:
    • Connection
    • Cursor
    http://legacy.python.org/dev/peps/pep-0249/

    View full-size slide

  12. Connection
    Wrap a database connection
    Handle the transactions and their internal state
    (commit and rollback)
    You cant’ execute SQL queries.

    View full-size slide

  13. Connection - API
    • connect(parameters=None)
    • close()
    • commit()
    • rollback()
    • cursor([name=None])

    View full-size slide

  14. Connection - Example
    import driver
    conn = driver.connect(database='database',
    host='localhost',
    port=5432)
    try:
    # create the cursor
    # use the cursor
    except Exception:
    conn.rollback()
    else:
    conn.commit()
    conn.close()

    View full-size slide

  15. Cursor
    Created from an existing connection
    Used for DML (Database Manipulation Language),
    INSERT, SELECT, UPDATE, DELETE
    cursor = conn.cursor()
    cursor.execute("""
    SELECT column1, column2
    FROM tableA
    """)
    for column1, column2 in cursor.fetchall():
    print(column1, column2)

    View full-size slide

  16. Cursor - API
    • callproc(procname[, parameters])
    • execute(query[, parameters])
    • fetchone(),
    fetchmany([size=cursor.arraysize]),
    fetchall()
    • close()

    View full-size slide

  17. Cursor - execute
    execute(query[, parameters])
    Performance and Security
    NEVER use the string interpolation
    (%) and the concatenation (+)
    => SQL Injection

    View full-size slide

  18. Cursor - execute (2)
    Accept string formatting for the
    request.
    qmark Question mark WHERE field = ?
    numeric Numeric positional WHERE field = :1
    named Named WHERE field = :code
    format ANSI C print format WHERE field = %s
    pyformat Python format WHERE field = %(name)s

    View full-size slide

  19. “Advanced” Example ;-)
    import driver
    conn = driver.connect(database='database',
    host='localhost',
    port=5432)
    cursor = conn.cursor()
    try:
    cursor.execute("SELECT column1, column2 FROM tableA")
    for column1, column2 in cursor.fetchall():
    print(column1, column2)
    except Exception:
    conn.rollback()
    else:
    conn.commit()
    finally:
    cursor.close()
    conn.close()

    View full-size slide

  20. psycopg2
    Basic…
    pip install psycopg2

    View full-size slide

  21. Introduction
    PostgreSQL Adaptor, based on libpq (C lib for PostgreSQL)
    DB-API 2.0 Compliant
    Multi-thread / Thread-safe
    Connection Pool
    Full Asynchronous, Coroutines
    Support ALL the data types(Json, Hstore, …)
    Python 2.5+, 3.1+, PostgreSQL 7.4+

    View full-size slide

  22. Cursor
    cursor = conn.cursor()
    cursor = conn.cursor(name=“pycon_cursor”)
    • Database Cursor
    • DML.
    • Named Cursor => Used for the big result set.

    View full-size slide

  23. SQL Requests
    cursor.execute("SELECT * FROM table")
    cursor.execute("INSERT INTO table (field1, field2) VALUES (%s, %s)",
    (field1, field2))
    cursor.execute("DELETE FROM table")
    cursor.execute("UPDATE table SET field1=%s", ('value',))
    cursor.execute("CREATE DATABASE database")
    cursor.execute("DROP DATABASE database")

    View full-size slide

  24. Example
    import psycopg2
    conn = psycopg2.connect(host='localhost', port=5432,
    user='username', password='password',
    database='database')
    cursor = conn.cursor()
    try:
    cursor.execute("SELECT column1, column2 FROM tableA")
    for column1, column2 in cursor.fetchall():
    print(column1, column2)
    except Exception:
    conn.rollback()
    else:
    conn.commit()
    finally:
    cursor.close()
    conn.close()

    View full-size slide

  25. Example - Context Manager
    import psycopg2
    DSN = dict(host='localhost', port=5432,
    user='username', password='password',
    database='database')
    with psycopg2.connect(**DSN) as conn:
    with conn.cursor() as cursor:
    cursor.execute("SELECT column1, column2 FROM tableA")
    for column1, column2 in cursor.fetchall():
    print(column1, column2)
    conn.commit()

    View full-size slide

  26. Object Relational
    Mapping

    View full-size slide

  27. Introduction
    “Object-relational mapping in computer science
    is a programming technique for converting data
    between incompatible type systems in object-
    oriented programming languages.”
    Wikipedia

    View full-size slide

  28. In brief
    • Map a Business Class on a SQL Table!
    • Add methods for add, modify, delete and search

    View full-size slide

  29. Peewee-ORM
    a small Object Relational Mapping
    pip install peewee

    View full-size slide

  30. Peewee
    ORM
    Simple, Easy to understand and extensible
    Python 2.6+ et 3.2+
    Extensions
    CLOC = 3200 (MySQL, Sqlite3 et PostgreSQL)
    Document is very good

    View full-size slide

  31. Connection
    import peewee
    database = peewee.PostgresqlDatabase('demo')

    View full-size slide

  32. Models
    class BaseModel(peewee.Model):
    id = peewee.PrimaryKeyField()
    class Meta:
    database = database
    class Address(BaseModel):
    street = peewee.CharField()
    zipcode = peewee.CharField()
    country = peewee.CharField()
    class Contact(BaseModel):
    firstname = peewee.CharField(null=False)
    lastname = peewee.CharField(null=False)
    email = peewee.CharField(unique=True, null=False, index=True)
    address = peewee.ForeignKeyField(Address, null=False,
    related_name=‘contacts')

    View full-size slide

  33. Creation of T
    ables
    database.create_tables([Address, Contact])

    View full-size slide

  34. Peewee - T
    ransactions
    tx = database.transaction()
    tx.commit()
    tx.rollback()
    with database.transaction():
    pass

    View full-size slide

  35. Create
    with database.transaction():
    address = Address.create(
    street='Avenue Franklin D. Roosevelt 50',
    zipcode='1050',
    country='Brussels')
    contact = Contact.create(
    firstname='Stephane', lastname='Wirtel',
    email='[email protected]',
    address=address
    )
    print(contact.id)

    View full-size slide

  36. Read
    contact = Contact.get(Contact.email == '[email protected]')
    print(contact.firstname)
    contacts = Contact.select()
    for contact in contacts:
    print(contact.firstname)
    for contact in contacts.where(Contact.email == '[email protected]'):
    print(contact.firstname)

    View full-size slide

  37. Update
    with database.transaction():
    contact = Contact.get(Contact.email == '[email protected]')
    contact.firstname = 'Speaker'
    contact.save()

    View full-size slide

  38. Delete
    # Remove an entire collection
    contacts = Contact.delete().where(Contact.email == '[email protected]')
    contacts.execute()
    # Remove one record
    contact = Contact.get(Contact.email == '[email protected]')
    contact.delete_instance()

    View full-size slide

  39. SQLAlchemy
    The Master of Object Relational Mapping
    pip install sqlalchemy

    View full-size slide

  40. SQLAlchemy
    • Handle the connections
    • Abstract the SQL connectors
    • ORM

    View full-size slide

  41. Connection
    from sqlalchemy import create_engine
    engine = create_engine('postgresql:///demo', echo=True)

    View full-size slide

  42. Models
    from sqlalchemy import Column, Integer, String, ForeignKey
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import relationship
    Base = declarative_base()
    class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    street = Column(String)
    zipcode = Column(String)
    country = Column(String, nullable=False)
    class Contact(Base):
    __tablename__ = 'contact'
    id = Column(Integer, primary_key=True)
    firstname = Column(String, nullable=False)
    lastname = Column(String, nullable=False)
    email = Column(String, nullable=False)
    address_id = Column(Integer, ForeignKey(Address.id), nullable=False)
    address = relationship('Address')

    View full-size slide

  43. Creation of T
    ables
    Base.metadata.create_all(engine)

    View full-size slide

  44. Sessions
    Wraps the transactions
    session = Session(engine)
    session.commit()

    View full-size slide

  45. Create
    session = Session(engine)
    address = Address(street='SQLAlchemy Street', zipcode='5432',
    country='Belgique')
    contact = Contact(
    firstname='Stephane', lastname='Wirtel',
    email='[email protected]',
    address=address
    )
    session.add(contact)
    session.commit()

    View full-size slide

  46. Read
    contact =
    session.query(Contact).filter_by(email='[email protected]').first()
    print(contact.firstname)
    contacts = session.query(Contact).all()
    for contact in contacts:
    print(contact.firstname)
    contacts =
    session.query(Contact).filter_by(email='[email protected]').all()
    for contact in contacts:
    print(contact.firstname)

    View full-size slide

  47. Update
    contact = session.query(Contact) \
    .filter_by(email='[email protected]').first()
    contact.email = ‘[email protected]'
    session.add(contact)
    session.commit()

    View full-size slide

  48. Delete
    contact = session.query(Contact) \
    .filter_by(email='[email protected]').first()
    session.delete(contact)
    session.commit()

    View full-size slide

  49. Alembic
    La Migration Facile
    pip install alembic

    View full-size slide

  50. Alembic
    Based on SQLAlchemy
    Use a Migration environment
    Configuration File INI
    Migration Scripts
    http://alembic.readthedocs.org/

    View full-size slide

  51. Alembic
    cd votreproject
    alembic init migrations
    > tree
    migrations/
    !"" README
    !"" alembic.ini
    !"" env.py
    !"" script.py.mako
    #"" versions

    View full-size slide

  52. Alembic
    revision = '22630db6f519'
    down_revision = None
    from alembic import op
    import sqlalchemy as sa
    def upgrade():
    op.create_table('user',
    sa.Column('id', sa.Integer(), nullable=False),
    sa.Column('created_at', sa.DateTime(), nullable=False),
    sa.Column('name', sa.String(length=255), nullable=False),
    sa.Column('email', sa.String(length=255), nullable=True),
    sa.Column('password', sa.String(length=255), nullable=True),
    sa.Column('active', sa.Boolean(), nullable=True),
    sa.Column('confirmed_at', sa.DateTime(), nullable=True),
    sa.PrimaryKeyConstraint('id'),
    sa.UniqueConstraint('email')
    )
    def downgrade():
    op.drop_table('user')
    alembic revision -m “bootstrap"
    alembic upgrade head

    View full-size slide

  53. Alembic
    revision = '7e4b6a43e6c'
    down_revision = '22630db6f519'
    from alembic import op
    import sqlalchemy as sa
    def upgrade():
    op.add_column('user', sa.Column('slug', sa.Unicode(length=255),
    nullable=False, server_default=''))
    def downgrade():
    op.drop_column('user', 'slug')
    alembic revision -m “add_slug”
    alembic upgrade head

    View full-size slide

  54. Alembic
    > tree
    migrations/
    !"" README
    !"" alembic.ini
    !"" env.py
    !"" script.py.mako
    #"" versions
    !"" 22630db6f519_bootstrap.py
    #"" 7e4b6a43e6c_add_slug.py

    View full-size slide

  55. Alembic
    from alembic import op
    from sqlalchemy.orm import Session
    from youproject import models
    def upgrade():
    engine = op.get_bind()
    session = Session(bind=engine)
    for contact in session.query(models.Contact).all():
    session.add(
    models.WifiUser(
    login=contact.name.lower(),
    password=random_string()
    )
    )
    session.commit()
    alembic upgrade head

    View full-size slide

  56. Alembic
    alembic revision
    alembic upgrade head
    alembic upgrade +2
    alembic downgrade -1
    alembic downgrade base
    alembic current
    alembic history

    View full-size slide

  57. Danger Area !

    View full-size slide

  58. Multicorn
    How to connect PostgreSQL to the rest of the World!
    http://multicorn.org

    View full-size slide

  59. Reminder: Foreign Data Wrappers
    Use an external resource from PostgreSQL
    Twitter, RSS, CSV, XML
    FileSystem
    Processes
    PostgreSQL, Oracle, MongoDB

    View full-size slide

  60. declarations for dynamic loading */
    _MODULE_MAGIC;
    G_FUNCTION_INFO_V1(mongo_fdw_handler);
    PG_FUNCTION_INFO_V1(mongo_fdw_validator);
    /*
    * mongo_fdw_handler creates and returns a struct with pointers to foreign table
    * callback functions.
    */
    Datum
    mongo_fdw_handler(PG_FUNCTION_ARGS)
    { FdwRoutine *fdwRoutine = makeNode(FdwRoutine);
    fdwRoutine->GetForeignRelSize = MongoGetForeignRelSize;
    fdwRoutine->GetForeignPaths = MongoGetForeignPaths;
    fdwRoutine->GetForeignPlan = MongoGetForeignPlan;
    fdwRoutine->ExplainForeignScan = MongoExplainForeignScan;
    fdwRoutine->BeginForeignScan = MongoBeginForeignScan;
    fdwRoutine->IterateForeignScan = MongoIterateForeignScan;
    fdwRoutine->ReScanForeignScan = MongoReScanForeignScan;
    fdwRoutine->EndForeignScan = MongoEndForeignScan;
    fdwRoutine->AnalyzeForeignTable = MongoAnalyzeForeignTable;
    PG_RETURN_POINTER(fdwRoutine);

    View full-size slide

  61. Multicorn
    • PostgreSQL Extension
    • Allow to write a Foreign Data Wrapper in Python
    • You can use your favourites SQL tools with FDW
    • Support FULL SQL ;-)

    View full-size slide

  62. Examples of Basic FDW
    • RSS, CSV, XML
    • LDAP
    • Gmail, IMAP
    • Google Search
    • SQLAlchemy (mysql, sqlite, oracle, …)

    View full-size slide

  63. Simple Example
    CREATE EXTENSION multicorn;

    View full-size slide

  64. Simple Example (2)
    CREATE SERVER wirtel_be_srv
    FOREIGN DATA WRAPPER multicorn
    OPTIONS (
    wrapper 'multicorn.rssfdw.RssFdw'
    );
    CREATE FOREIGN TABLE wirtel_be_rss (
    "pubDate" TIMESTAMP,
    description CHARACTER VARYING,
    title CHARACTER VARYING,
    link CHARACTER VARYING
    ) SERVER wirtel_be_srv OPTIONS(
    url 'http://wirtel.be/feeds/python.rss.xml'
    );

    View full-size slide

  65. Simple Example (3)
    SELECT "pubDate", title, link FROM wirtel_be_rss LIMIT 1;
    -[ RECORD 1 ]------------------------------------------------------------------------
    pubDate | 2014-10-19 00:00:00
    title | Python @ FOSDEM 2015 - Call For Proposals
    link | http://wirtel.be/posts/en/2014/10/19/python-fosdem-2015-call-for-proposals/

    View full-size slide

  66. with Odoo (OpenERP)
    from multicorn import ForeignDataWrapper
    import erppeek
    class OdooForeignDataWrapper(ForeignDataWrapper):
    def __init__(self, options, columns):
    super(OdooForeignDataWrapper, self).__init__(options, columns)
    url = 'http://{hostname}:{password}'.format(**options)
    self.client = erppeek.Client(url,
    options['database'],
    options['username'],
    options['password'])
    self.object_name = options['object']
    def execute(self, quals, columns):
    proxy = self.client.model(self.object_name)
    item = {}
    for record in proxy.browse([]):
    for column in columns:
    item[column] = record[column]
    yield item

    View full-size slide

  67. with Odoo (OpenERP) (2)
    CREATE EXTENSION multicorn;
    CREATE SERVER multicorn_odoo
    FOREIGN DATA WRAPPER multicorn
    OPTIONS (wrapper 'multicorn.odoofdw.OdooForeignDataWrapper');
    CREATE FOREIGN TABLE odoo_users (
    login character varying,
    name character varying)
    SERVER multicorn_odoo OPTIONS (
    hostname 'localhost',
    port '8069',
    database 'odoo',
    username 'admin',
    password 'admin',
    object 'res.users'
    );
    SELECT id, login, name, active
    FROM odoo_users
    WHERE login = 'admin';

    View full-size slide

  68. PL/Python
    Extend PostgreSQL with Python 2.x/3.x

    View full-size slide

  69. PL/Python
    Python 2 or Python 3
    Use all your favourite Python libraries (PyPI)
    Learning Curve very small PL/pgSQL
    Untrusted

    View full-size slide

  70. PL/Python
    apt-get install postgresql-plpython-9.4
    CREATE EXTENSION IF NOT EXISTS plpythonu;

    View full-size slide

  71. PL/Python
    CREATE OR REPLACE FUNCTION str_title(s VARCHAR) RETURNS VARCHAR AS $$
    return s.title()
    $$ LANGUAGE plpythonu;
    demo=# select str_title('hello world');
    -[ RECORD 1 ]----------
    str_title | Hello World

    View full-size slide

  72. PL/Python - DataT
    ypes
    PostgreSQL Python
    integer, bigint int, long
    boolean bool
    text types str
    SQL Array list
    Custom Types dict

    View full-size slide

  73. PL/Python - Debug
    Built-in Functions
    • plpy.notice(“”)
    • plpy.debug(“”)
    • plpy.error(“”)
    • plpy.fatal(“”)
    Don’t use ”print”

    View full-size slide

  74. PL/Python
    CREATE OR REPLACE FUNCTION get_pid_cpu_mem(pid INT)
    RETURNS TABLE(pid INT, cpu_perc FLOAT, mem_perc FLOAT) AS $$
    import psutil
    process = psutil.Process(pid)
    return [
    {
    'pid': pid,
    'cpu_perc': process.get_cpu_percent(interval=0),
    'mem_perc': process.get_memory_percent()
    }
    ]
    $$ LANGUAGE plpythonu;

    View full-size slide

  75. PL/Python
    -[ RECORD 1 ]----+---------------------------------
    datid | 16416
    datname | demo
    pid | 14680
    usesysid | 16384
    usename | stephane
    application_name | psql
    client_addr |
    client_hostname |
    client_port | -1
    backend_start | 2014-10-25 04:22:08.235532+02
    xact_start | 2014-10-25 04:25:28.712812+02
    query_start | 2014-10-25 04:25:28.712812+02
    state_change | 2014-10-25 04:25:28.712815+02
    waiting | f
    state | active
    query | select * from pg_stat_activity ;
    SELECT * FROM pg_stat_activity;

    View full-size slide

  76. PL/Python
    pid | cpu_perc | mem_perc
    -------+----------+--------------
    14680 | 0 | 1.4454081372
    (1 row)
    SELECT * FROM get_pid_cpu_mem(14680);

    View full-size slide

  77. PL/Python
    pid | cpu_perc | mem_perc | application_name
    -------+----------+---------------+------------------
    14680 | 0 | 1.50435626678 | psql
    (1 row)
    WITH stats AS (
    SELECT psa.*, get_pid_cpu_mem(psa.pid) as attrs
    from pg_stat_activity psa
    )
    SELECT (stats.attrs).pid,
    (stats.attrs).cpu_perc,
    (stats.attrs).mem_perc,
    stats.application_name
    FROM stats;

    View full-size slide

  78. PL/Python - Attention
    Unsafe -> pas de sandbox
    Difficult to maintain and to debug
    Your DBA will become Crazy ;-)
    No virtualenv and ask the superuser privileges.

    View full-size slide

  79. PL/Python - Pistes ;-)
    Treatment of big results
    Add hard constraints and business logic in the system.
    Add new features to PostgreSQL (see “str_title”)
    Triggers ?
    Use external libs from PyPI (requests, redis, zeromq,
    smtp…)
    Full Access to the database !!!

    View full-size slide

  80. Questions ?
    @matrixise
    https://speakerdeck.com/matrixise

    View full-size slide