Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Python & PostgreSQL - A Wonderful Wedding (English)

Python & PostgreSQL - A Wonderful Wedding (English)

Python and PostgreSQL, two tools we like to use for our projects but do you know everything about them?

The talk will give an overview of psycopg2, Peewee, SQLAlchemy, Alembic and PL/Python, these libraries can be used with PostgreSQL.

* psycopg2, the well known connector, this basic component is really useful, well documented and battle-tested and used by the most famous toolkits of the Python ecosystem.
* Peewee is a basic but awesome Object Relational Mapper.
* SQLAlchemy, a Python SQL toolkit and Object Relational Mapper, you can use this library to create your models and interact with them.
* Alembic, a lightweight database migration tool for usage with SQLAlchemy, allows to create some migration scripts for your project.
* PL/Python, a procedural language for PostgreSQL, allows to write functions in the Python language.
* MultiCorn, a Foreign Data Wrapper in Python

Stéphane Wirtel

January 31, 2015
Tweet

More Decks by Stéphane Wirtel

Other Decks in Programming

Transcript

  1. Python & Postgresql
    A Wonderful Wedding
    Stéphane Wirtel
    @PythonFOSDEM 2015
    http://goo.gl/c3U0vA

    View Slide

  2. Stéphane Wirtel
    #fellow Python Software Foundation
    #member EuroPython Society
    #member AFPy
    @PythonFOSDEM
    @matrixise
    [email protected]
    http://wirtel.be

    View Slide

  3. Schedule
    Qu’allons-nous apprendre ?

    View Slide

  4. View Slide

  5. Python

    View Slide

  6. Python, what is it?
    Programming language
    Clear and easy syntax
    Interpreted
    Multi-plateforms
    Dynamic Typing
    Multi-paradigms
    Garbage Collector

    View Slide

  7. Postgresql

    View Slide

  8. PostgreSQL - Story
    Developed at Berkeley
    Rewriting of Ingres
    1985 -> Postgres (come from Post-Ingres)
    1995 -> Postgres95 (0.01) (Add the SQL Interpreter)
    1997 -> PostgreSQL 6.0
    2013 -> PostgreSQL 9.3
    2014 -> PostgreSQL 9.4 (b4)

    View Slide

  9. PostgreSQL - TL;DR
    ORDBMS, ACID
    SQL:2011
    DataTypes
    Transactional DDL
    Concurrent Index
    Extensions
    Common Table Expression
    MultiVersion Concurrency Control
    Cross-Platform

    View Slide

  10. PostgreSQL - TL;DR
    Replication
    Foreign Data Wrappers
    Procedural Languages
    Triggers
    Full text Search
    Views (Materialized)
    Table Inheritance
    Listen/Notify

    View Slide

  11. PostgreSQL, FDW ?
    Allow to use external resources from PostgreSQL
    Twitter, RSS, CSV, XML
    FileSystem
    Processes

    View Slide

  12. PostgreSQL, PL ?
    PL => Procedural Languages
    Stored Functions
    Safe (sandbox, SQL, PL/pgSQL) / Unsafe (C)
    PL/Python, PL/V8, PL/PERL

    View Slide

  13. DB-API 2.0
    The Standard…
    http://python.org/peps/pep-0249

    View Slide

  14. DB-API 2.0 aka PEP-249
    API for the Database Engines Connectors
    Easy to use and to understand
    Use 2 concepts:
    • Connection
    • Cursor
    http://legacy.python.org/dev/peps/pep-0249/

    View Slide

  15. Connection
    Wrap a database connection
    Handle the transactions and their internal state
    (commit and rollback)
    You cant’ execute SQL queries.

    View Slide

  16. Connection - API
    • connect(parameters=None)
    • close()
    • commit()
    • rollback()
    • cursor([name=None])

    View Slide

  17. Connection - Example
    import driver
    conn = driver.connect(database='database',
    host='localhost',
    port=5432)
    try:
    # create the cursor
    # use the cursor
    except Exception:
    conn.rollback()
    else:
    conn.commit()
    conn.close()

    View Slide

  18. Cursor
    Created from an existing connection
    Used for DML (Database Manipulation Language),
    INSERT, SELECT, UPDATE, DELETE
    cursor = conn.cursor()
    cursor.execute("""
    SELECT column1, column2
    FROM tableA
    """)
    for column1, column2 in cursor.fetchall():
    print(column1, column2)

    View Slide

  19. Cursor - API
    • callproc(procname[, parameters])
    • execute(query[, parameters])
    • fetchone(),
    fetchmany([size=cursor.arraysize]),
    fetchall()
    • close()

    View Slide

  20. Cursor - execute
    execute(query[, parameters])
    Performance and Security
    NEVER use the string interpolation
    (%) and the concatenation (+)
    => SQL Injection

    View Slide

  21. Cursor - execute (2)
    Accept string formatting for the
    request.
    qmark Question mark WHERE field = ?
    numeric Numeric positional WHERE field = :1
    named Named WHERE field = :code
    format ANSI C print format WHERE field = %s
    pyformat Python format WHERE field = %(name)s

    View Slide

  22. “Advanced” Example ;-)
    import driver
    conn = driver.connect(database='database',
    host='localhost',
    port=5432)
    cursor = conn.cursor()
    try:
    cursor.execute("SELECT column1, column2 FROM tableA")
    for column1, column2 in cursor.fetchall():
    print(column1, column2)
    except Exception:
    conn.rollback()
    else:
    conn.commit()
    finally:
    cursor.close()
    conn.close()

    View Slide

  23. psycopg2
    Basic…
    pip install psycopg2

    View Slide

  24. Introduction
    PostgreSQL Adaptor, based on libpq (C lib for PostgreSQL)
    DB-API 2.0 Compliant
    Multi-thread / Thread-safe
    Connection Pool
    Full Asynchronous, Coroutines
    Support ALL the data types(Json, Hstore, …)
    Python 2.5+, 3.1+, PostgreSQL 7.4+

    View Slide

  25. Cursor
    cursor = conn.cursor()
    cursor = conn.cursor(name=“pycon_cursor”)
    • Database Cursor
    • DML.
    • Named Cursor => Used for the big result set.

    View Slide

  26. SQL Requests
    cursor.execute("SELECT * FROM table")
    cursor.execute("INSERT INTO table (field1, field2) VALUES (%s, %s)",
    (field1, field2))
    cursor.execute("DELETE FROM table")
    cursor.execute("UPDATE table SET field1=%s", ('value',))
    cursor.execute("CREATE DATABASE database")
    cursor.execute("DROP DATABASE database")

    View Slide

  27. Example
    import psycopg2
    conn = psycopg2.connect(host='localhost', port=5432,
    user='username', password='password',
    database='database')
    cursor = conn.cursor()
    try:
    cursor.execute("SELECT column1, column2 FROM tableA")
    for column1, column2 in cursor.fetchall():
    print(column1, column2)
    except Exception:
    conn.rollback()
    else:
    conn.commit()
    finally:
    cursor.close()
    conn.close()

    View Slide

  28. Example - Context Manager
    import psycopg2
    DSN = dict(host='localhost', port=5432,
    user='username', password='password',
    database='database')
    with psycopg2.connect(**DSN) as conn:
    with conn.cursor() as cursor:
    cursor.execute("SELECT column1, column2 FROM tableA")
    for column1, column2 in cursor.fetchall():
    print(column1, column2)
    conn.commit()

    View Slide

  29. Object Relational
    Mapping

    View Slide

  30. Introduction
    “Object-relational mapping in computer science
    is a programming technique for converting data
    between incompatible type systems in object-
    oriented programming languages.”
    Wikipedia

    View Slide

  31. In brief
    • Map a Business Class on a SQL Table!
    • Add methods for add, modify, delete and search

    View Slide

  32. Peewee-ORM
    a small Object Relational Mapping
    pip install peewee

    View Slide

  33. Peewee
    ORM
    Simple, Easy to understand and extensible
    Python 2.6+ et 3.2+
    Extensions
    CLOC = 3200 (MySQL, Sqlite3 et PostgreSQL)
    Document is very good

    View Slide

  34. Connection
    import peewee
    database = peewee.PostgresqlDatabase('demo')

    View Slide

  35. Models
    class BaseModel(peewee.Model):
    id = peewee.PrimaryKeyField()
    class Meta:
    database = database
    class Address(BaseModel):
    street = peewee.CharField()
    zipcode = peewee.CharField()
    country = peewee.CharField()
    class Contact(BaseModel):
    firstname = peewee.CharField(null=False)
    lastname = peewee.CharField(null=False)
    email = peewee.CharField(unique=True, null=False, index=True)
    address = peewee.ForeignKeyField(Address, null=False,
    related_name=‘contacts')

    View Slide

  36. Creation of T
    ables
    database.create_tables([Address, Contact])

    View Slide

  37. Peewee - T
    ransactions
    tx = database.transaction()
    tx.commit()
    tx.rollback()
    with database.transaction():
    pass

    View Slide

  38. Create
    with database.transaction():
    address = Address.create(
    street='Avenue Franklin D. Roosevelt 50',
    zipcode='1050',
    country='Brussels')
    contact = Contact.create(
    firstname='Stephane', lastname='Wirtel',
    email='[email protected]',
    address=address
    )
    print(contact.id)

    View Slide

  39. Read
    contact = Contact.get(Contact.email == '[email protected]')
    print(contact.firstname)
    contacts = Contact.select()
    for contact in contacts:
    print(contact.firstname)
    for contact in contacts.where(Contact.email == '[email protected]'):
    print(contact.firstname)

    View Slide

  40. Update
    with database.transaction():
    contact = Contact.get(Contact.email == '[email protected]')
    contact.firstname = 'Speaker'
    contact.save()

    View Slide

  41. Delete
    # Remove an entire collection
    contacts = Contact.delete().where(Contact.email == '[email protected]')
    contacts.execute()
    # Remove one record
    contact = Contact.get(Contact.email == '[email protected]')
    contact.delete_instance()

    View Slide

  42. SQLAlchemy
    The Master of Object Relational Mapping
    pip install sqlalchemy

    View Slide

  43. SQLAlchemy
    • Handle the connections
    • Abstract the SQL connectors
    • ORM

    View Slide

  44. Connection
    from sqlalchemy import create_engine
    engine = create_engine('postgresql:///demo', echo=True)

    View Slide

  45. Models
    from sqlalchemy import Column, Integer, String, ForeignKey
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import relationship
    Base = declarative_base()
    class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    street = Column(String)
    zipcode = Column(String)
    country = Column(String, nullable=False)
    class Contact(Base):
    __tablename__ = 'contact'
    id = Column(Integer, primary_key=True)
    firstname = Column(String, nullable=False)
    lastname = Column(String, nullable=False)
    email = Column(String, nullable=False)
    address_id = Column(Integer, ForeignKey(Address.id), nullable=False)
    address = relationship('Address')

    View Slide

  46. Creation of T
    ables
    Base.metadata.create_all(engine)

    View Slide

  47. Sessions
    Wraps the transactions
    session = Session(engine)
    session.commit()

    View Slide

  48. Create
    session = Session(engine)
    address = Address(street='SQLAlchemy Street', zipcode='5432',
    country='Belgique')
    contact = Contact(
    firstname='Stephane', lastname='Wirtel',
    email='[email protected]',
    address=address
    )
    session.add(contact)
    session.commit()

    View Slide

  49. Read
    contact =
    session.query(Contact).filter_by(email='[email protected]').first()
    print(contact.firstname)
    contacts = session.query(Contact).all()
    for contact in contacts:
    print(contact.firstname)
    contacts =
    session.query(Contact).filter_by(email='[email protected]').all()
    for contact in contacts:
    print(contact.firstname)

    View Slide

  50. Update
    contact = session.query(Contact) \
    .filter_by(email='[email protected]').first()
    contact.email = ‘[email protected]'
    session.add(contact)
    session.commit()

    View Slide

  51. Delete
    contact = session.query(Contact) \
    .filter_by(email='[email protected]').first()
    session.delete(contact)
    session.commit()

    View Slide

  52. Alembic
    La Migration Facile
    pip install alembic

    View Slide

  53. Alembic
    Based on SQLAlchemy
    Use a Migration environment
    Configuration File INI
    Migration Scripts
    http://alembic.readthedocs.org/

    View Slide

  54. Alembic
    cd votreproject
    alembic init migrations
    > tree
    migrations/
    !"" README
    !"" alembic.ini
    !"" env.py
    !"" script.py.mako
    #"" versions

    View Slide

  55. Alembic
    revision = '22630db6f519'
    down_revision = None
    from alembic import op
    import sqlalchemy as sa
    def upgrade():
    op.create_table('user',
    sa.Column('id', sa.Integer(), nullable=False),
    sa.Column('created_at', sa.DateTime(), nullable=False),
    sa.Column('name', sa.String(length=255), nullable=False),
    sa.Column('email', sa.String(length=255), nullable=True),
    sa.Column('password', sa.String(length=255), nullable=True),
    sa.Column('active', sa.Boolean(), nullable=True),
    sa.Column('confirmed_at', sa.DateTime(), nullable=True),
    sa.PrimaryKeyConstraint('id'),
    sa.UniqueConstraint('email')
    )
    def downgrade():
    op.drop_table('user')
    alembic revision -m “bootstrap"
    alembic upgrade head

    View Slide

  56. Alembic
    revision = '7e4b6a43e6c'
    down_revision = '22630db6f519'
    from alembic import op
    import sqlalchemy as sa
    def upgrade():
    op.add_column('user', sa.Column('slug', sa.Unicode(length=255),
    nullable=False, server_default=''))
    def downgrade():
    op.drop_column('user', 'slug')
    alembic revision -m “add_slug”
    alembic upgrade head

    View Slide

  57. Alembic
    > tree
    migrations/
    !"" README
    !"" alembic.ini
    !"" env.py
    !"" script.py.mako
    #"" versions
    !"" 22630db6f519_bootstrap.py
    #"" 7e4b6a43e6c_add_slug.py

    View Slide

  58. Alembic
    from alembic import op
    from sqlalchemy.orm import Session
    from youproject import models
    def upgrade():
    engine = op.get_bind()
    session = Session(bind=engine)
    for contact in session.query(models.Contact).all():
    session.add(
    models.WifiUser(
    login=contact.name.lower(),
    password=random_string()
    )
    )
    session.commit()
    alembic upgrade head

    View Slide

  59. Alembic
    alembic revision
    alembic upgrade head
    alembic upgrade +2
    alembic downgrade -1
    alembic downgrade base
    alembic current
    alembic history

    View Slide

  60. Danger Area !

    View Slide

  61. Multicorn
    How to connect PostgreSQL to the rest of the World!
    http://multicorn.org

    View Slide

  62. Reminder: Foreign Data Wrappers
    Use an external resource from PostgreSQL
    Twitter, RSS, CSV, XML
    FileSystem
    Processes
    PostgreSQL, Oracle, MongoDB

    View Slide

  63. declarations for dynamic loading */
    _MODULE_MAGIC;
    G_FUNCTION_INFO_V1(mongo_fdw_handler);
    PG_FUNCTION_INFO_V1(mongo_fdw_validator);
    /*
    * mongo_fdw_handler creates and returns a struct with pointers to foreign table
    * callback functions.
    */
    Datum
    mongo_fdw_handler(PG_FUNCTION_ARGS)
    { FdwRoutine *fdwRoutine = makeNode(FdwRoutine);
    fdwRoutine->GetForeignRelSize = MongoGetForeignRelSize;
    fdwRoutine->GetForeignPaths = MongoGetForeignPaths;
    fdwRoutine->GetForeignPlan = MongoGetForeignPlan;
    fdwRoutine->ExplainForeignScan = MongoExplainForeignScan;
    fdwRoutine->BeginForeignScan = MongoBeginForeignScan;
    fdwRoutine->IterateForeignScan = MongoIterateForeignScan;
    fdwRoutine->ReScanForeignScan = MongoReScanForeignScan;
    fdwRoutine->EndForeignScan = MongoEndForeignScan;
    fdwRoutine->AnalyzeForeignTable = MongoAnalyzeForeignTable;
    PG_RETURN_POINTER(fdwRoutine);

    View Slide

  64. Multicorn
    • PostgreSQL Extension
    • Allow to write a Foreign Data Wrapper in Python
    • You can use your favourites SQL tools with FDW
    • Support FULL SQL ;-)

    View Slide

  65. Examples of Basic FDW
    • RSS, CSV, XML
    • LDAP
    • Gmail, IMAP
    • Google Search
    • SQLAlchemy (mysql, sqlite, oracle, …)

    View Slide

  66. Simple Example
    CREATE EXTENSION multicorn;

    View Slide

  67. Simple Example (2)
    CREATE SERVER wirtel_be_srv
    FOREIGN DATA WRAPPER multicorn
    OPTIONS (
    wrapper 'multicorn.rssfdw.RssFdw'
    );
    CREATE FOREIGN TABLE wirtel_be_rss (
    "pubDate" TIMESTAMP,
    description CHARACTER VARYING,
    title CHARACTER VARYING,
    link CHARACTER VARYING
    ) SERVER wirtel_be_srv OPTIONS(
    url 'http://wirtel.be/feeds/python.rss.xml'
    );

    View Slide

  68. Simple Example (3)
    SELECT "pubDate", title, link FROM wirtel_be_rss LIMIT 1;
    -[ RECORD 1 ]------------------------------------------------------------------------
    pubDate | 2014-10-19 00:00:00
    title | Python @ FOSDEM 2015 - Call For Proposals
    link | http://wirtel.be/posts/en/2014/10/19/python-fosdem-2015-call-for-proposals/

    View Slide

  69. with Odoo (OpenERP)
    from multicorn import ForeignDataWrapper
    import erppeek
    class OdooForeignDataWrapper(ForeignDataWrapper):
    def __init__(self, options, columns):
    super(OdooForeignDataWrapper, self).__init__(options, columns)
    url = 'http://{hostname}:{password}'.format(**options)
    self.client = erppeek.Client(url,
    options['database'],
    options['username'],
    options['password'])
    self.object_name = options['object']
    def execute(self, quals, columns):
    proxy = self.client.model(self.object_name)
    item = {}
    for record in proxy.browse([]):
    for column in columns:
    item[column] = record[column]
    yield item

    View Slide

  70. with Odoo (OpenERP) (2)
    CREATE EXTENSION multicorn;
    CREATE SERVER multicorn_odoo
    FOREIGN DATA WRAPPER multicorn
    OPTIONS (wrapper 'multicorn.odoofdw.OdooForeignDataWrapper');
    CREATE FOREIGN TABLE odoo_users (
    login character varying,
    name character varying)
    SERVER multicorn_odoo OPTIONS (
    hostname 'localhost',
    port '8069',
    database 'odoo',
    username 'admin',
    password 'admin',
    object 'res.users'
    );
    SELECT id, login, name, active
    FROM odoo_users
    WHERE login = 'admin';

    View Slide

  71. PL/Python
    Extend PostgreSQL with Python 2.x/3.x

    View Slide

  72. PL/Python
    Python 2 or Python 3
    Use all your favourite Python libraries (PyPI)
    Learning Curve very small PL/pgSQL
    Untrusted

    View Slide

  73. PL/Python
    apt-get install postgresql-plpython-9.4
    CREATE EXTENSION IF NOT EXISTS plpythonu;

    View Slide

  74. PL/Python
    CREATE OR REPLACE FUNCTION str_title(s VARCHAR) RETURNS VARCHAR AS $$
    return s.title()
    $$ LANGUAGE plpythonu;
    demo=# select str_title('hello world');
    -[ RECORD 1 ]----------
    str_title | Hello World

    View Slide

  75. PL/Python - DataT
    ypes
    PostgreSQL Python
    integer, bigint int, long
    boolean bool
    text types str
    SQL Array list
    Custom Types dict

    View Slide

  76. PL/Python - Debug
    Built-in Functions
    • plpy.notice(“”)
    • plpy.debug(“”)
    • plpy.error(“”)
    • plpy.fatal(“”)
    Don’t use ”print”

    View Slide

  77. PL/Python
    CREATE OR REPLACE FUNCTION get_pid_cpu_mem(pid INT)
    RETURNS TABLE(pid INT, cpu_perc FLOAT, mem_perc FLOAT) AS $$
    import psutil
    process = psutil.Process(pid)
    return [
    {
    'pid': pid,
    'cpu_perc': process.get_cpu_percent(interval=0),
    'mem_perc': process.get_memory_percent()
    }
    ]
    $$ LANGUAGE plpythonu;

    View Slide

  78. PL/Python
    -[ RECORD 1 ]----+---------------------------------
    datid | 16416
    datname | demo
    pid | 14680
    usesysid | 16384
    usename | stephane
    application_name | psql
    client_addr |
    client_hostname |
    client_port | -1
    backend_start | 2014-10-25 04:22:08.235532+02
    xact_start | 2014-10-25 04:25:28.712812+02
    query_start | 2014-10-25 04:25:28.712812+02
    state_change | 2014-10-25 04:25:28.712815+02
    waiting | f
    state | active
    query | select * from pg_stat_activity ;
    SELECT * FROM pg_stat_activity;

    View Slide

  79. PL/Python
    pid | cpu_perc | mem_perc
    -------+----------+--------------
    14680 | 0 | 1.4454081372
    (1 row)
    SELECT * FROM get_pid_cpu_mem(14680);

    View Slide

  80. PL/Python
    pid | cpu_perc | mem_perc | application_name
    -------+----------+---------------+------------------
    14680 | 0 | 1.50435626678 | psql
    (1 row)
    WITH stats AS (
    SELECT psa.*, get_pid_cpu_mem(psa.pid) as attrs
    from pg_stat_activity psa
    )
    SELECT (stats.attrs).pid,
    (stats.attrs).cpu_perc,
    (stats.attrs).mem_perc,
    stats.application_name
    FROM stats;

    View Slide

  81. PL/Python - Attention
    Unsafe -> pas de sandbox
    Difficult to maintain and to debug
    Your DBA will become Crazy ;-)
    No virtualenv and ask the superuser privileges.

    View Slide

  82. PL/Python - Pistes ;-)
    Treatment of big results
    Add hard constraints and business logic in the system.
    Add new features to PostgreSQL (see “str_title”)
    Triggers ?
    Use external libs from PyPI (requests, redis, zeromq,
    smtp…)
    Full Access to the database !!!

    View Slide

  83. View Slide

  84. Questions ?
    @matrixise
    https://speakerdeck.com/matrixise

    View Slide