Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Python & PostgreSQL - A Wonderful Wedding

Python & PostgreSQL - A Wonderful Wedding

Stéphane Wirtel

October 25, 2014
Tweet

More Decks by Stéphane Wirtel

Other Decks in Programming

Transcript

  1. Python & Postgresql
    A Wonderful Wedding
    Stéphane Wirtel
    @PyConFR 2014 - Lyon
    http://goo.gl/UsPLwh

    View Slide

  2. Stéphane Wirtel
    #fellow @ThePSF
    #member EPS
    #member AFPy
    @PythonFOSDEM
    @matrixise
    [email protected]
    http://wirtel.be

    View Slide

  3. Agenda
    Qu’allons-nous apprendre ?

    View Slide

  4. View Slide

  5. Python

    View Slide

  6. Python, c’est quoi ?
    Langage de Programmation
    Syntaxe simple & lisible
    Interprété
    Multi-plateformes
    Typage Dynamique Fort
    Multi-paradigmes
    Garbage Collector

    View Slide

  7. Postgresql

    View Slide

  8. PostgreSQL - Story
    Développée à Berkeley
    Réécriture de Ingres
    1985 -> Postgres (vient de Post-Ingres)
    1995 -> Postgres95 (0.01) (Ajout d’un interpréter SQL)
    1997 -> PostgreSQL 6.0
    2013 -> PostgreSQL 9.3
    2014 -> PostgreSQL 9.4 (b4)

    View Slide

  9. PostgreSQL - TL;DR
    ORDBMS, ACID
    SQL:2011
    DataTypes
    Transactional DDL
    Concurrent Index
    Extensions
    Common Table Expression
    MultiVersion Concurrency Control
    Cross-Platform

    View Slide

  10. PostgreSQL - TL;DR
    Replication
    Foreign Data Wrappers
    Procedural Languages
    Triggers
    Full text Search
    Views (Materialized)
    Table Inheritance
    Listen/Notify

    View Slide

  11. PostgreSQL, FDW ?
    Utiliser une source externe à PostgreSQL
    Twitter, RSS, CSV, XML
    FileSystem
    Processes

    View Slide

  12. PostgreSQL, PL ?
    Étend la base de données
    Fonctions stockées
    Safe (sandbox, SQL, PL/pgSQL) / Unsafe (C)
    PL/Python, PL/V8, PL/PERL

    View Slide

  13. DB-API 2.0
    Commençons par le commencement…
    http://python.org/peps/pep-0249

    View Slide

  14. DB-API 2.0 aka PEP-249
    API pour les connecteurs de Base de Données
    Facile à utiliser, et à comprendre
    Utilise deux concepts:
    • Connection
    • Curseur
    http://legacy.python.org/dev/peps/pep-0249/

    View Slide

  15. Connection
    Emballe une connection vers la base de données
    Gère les transactions et leur état (Commit/
    Rollback)
    N’exécute pas de requêtes SQL.

    View Slide

  16. Connection - API
    • connect(parameters=None)
    • close()
    • commit()
    • rollback()
    • cursor([name=None])

    View Slide

  17. Connection - Exemple
    import driver
    conn = driver.connect(database='database',
    host='localhost',
    port=5432)
    try:
    # cree le curseur
    # utilise le curseur
    except Exception:
    conn.rollback()
    else:
    conn.commit()
    conn.close()

    View Slide

  18. Curseur
    Créé via une instance d’une connection
    Utilisé pour la manipulation et interrogation de la
    Base de Données
    cursor = conn.cursor()
    cursor.execute("""
    SELECT column1, column2
    FROM tableA
    """)
    for column1, column2 in cursor.fetchall():
    print(column1, column2)

    View Slide

  19. Curseur - API
    • callproc(procname[, parameters])
    • execute(query[, parameters])
    • fetchone(),
    fetchmany([size=cursor.arraysize]),
    fetchall()
    • close()

    View Slide

  20. Curseur - execute
    execute(query[, parameters])
    Performance et Sécurité
    JAMAIS utiliser l’interpolation (%)
    et la concaténation (+)
    => SQL Injection

    View Slide

  21. Curseur - execute (2)
    Accepte un formatage de la requête.
    qmark Question mark WHERE field = ?
    numeric Numeric positional WHERE field = :1
    named Named WHERE field = :code
    format ANSI C print format WHERE field = %s
    pyformat Python format WHERE field = %(name)s

    View Slide

  22. Exemple plus complet ;-)
    import driver
    conn = driver.connect(database='database',
    host='localhost',
    port=5432)
    cursor = conn.cursor()
    try:
    cursor.execute("SELECT column1, column2 FROM tableA")
    for column1, column2 in cursor.fetchall():
    print(column1, column2)
    except Exception:
    conn.rollback()
    else:
    conn.commit()
    finally:
    cursor.close()
    conn.close()

    View Slide

  23. psycopg2
    La Base…
    pip install psycopg2

    View Slide

  24. Introduction
    PostgreSQL Adaptor, basé sur libpq
    DB-API 2.0 Compliant
    Multi-thread
    Pool de Connections
    Full Asynchronous, Coroutines
    Supporte TOUS les types de données de PostgreSQL (Json,
    Hstore, …)
    Python 2.5+, 3.1+, PostgreSQL 7.4+

    View Slide

  25. Curseurs
    cursor = conn.cursor()
    cursor = conn.cursor(name=“pycon_cursor”)
    • Curseur coté PostgreSQL
    • Interrogation, manipulation des données de la
    base.
    • Nommé => Utilisé pour les gros ensembles de
    données

    View Slide

  26. Requêtes SQL
    cursor.execute("SELECT * FROM table")
    cursor.execute("INSERT INTO table (field1, field2) VALUES (%s, %s)",
    (field1, field2))
    cursor.execute("DELETE FROM table")
    cursor.execute("UPDATE table SET field1=%s", ('value',))
    cursor.execute("CREATE DATABASE database")
    cursor.execute("DROP DATABASE database")

    View Slide

  27. Exemple
    import psycopg2
    conn = psycopg2.connect(host='localhost', port=5432,
    user='username', password='password',
    database='database')
    cursor = conn.cursor()
    try:
    cursor.execute("SELECT column1, column2 FROM tableA")
    for column1, column2 in cursor.fetchall():
    print(column1, column2)
    except Exception:
    conn.rollback()
    else:
    conn.commit()
    finally:
    cursor.close()
    conn.close()

    View Slide

  28. Exemple - Context Manager
    import psycopg2
    DSN = dict(host='localhost', port=5432,
    user='username', password='password',
    database='database')
    with psycopg2.connect(**DSN) as conn:
    with conn.cursor() as cursor:
    cursor.execute("SELECT column1, column2 FROM tableA")
    for column1, column2 in cursor.fetchall():
    print(column1, column2)
    conn.commit()

    View Slide

  29. Object Relational
    Mapping

    View Slide

  30. Introduction
    “Technique de Programmation créant l’illusion
    d’une base de données orientée objet à partir
    d’une base de données relationnelle en définissant
    des correspondances entre cette base de
    données et les objets du langage utilisé.”
    Wikipedia

    View Slide

  31. En bref
    • Mapper une classe Business sur une table !
    • Méthodes pour ajouter, modifier, supprimer et
    rechercher

    View Slide

  32. Peewee-ORM
    Object Relational Mapping
    pip install peewee

    View Slide

  33. Peewee
    ORM
    Simple, facile à comprendre et extensible
    Python 2.6+ et 3.2+
    Extensions
    3200 lignes de code (MySQL, Sqlite3 et PostgreSQL)
    Très bonne documentation

    View Slide

  34. Connection
    import peewee
    database = peewee.PostgresqlDatabase('demo')

    View Slide

  35. Modèles
    class BaseModel(peewee.Model):
    id = peewee.PrimaryKeyField()
    class Meta:
    database = database
    class Address(BaseModel):
    street = peewee.CharField()
    zipcode = peewee.CharField()
    country = peewee.CharField()
    class Contact(BaseModel):
    firstname = peewee.CharField(null=False)
    lastname = peewee.CharField(null=False)
    email = peewee.CharField(unique=True, null=False, index=True)
    address = peewee.ForeignKeyField(Address, null=False,
    related_name=‘contacts')

    View Slide

  36. Création des tables
    database.create_tables([Address, Contact])

    View Slide

  37. Peewee - T
    ransactions
    tx = database.transaction()
    tx.commit()
    tx.rollback()
    with database.transaction():
    pass

    View Slide

  38. Create
    with database.transaction():
    address = Address.create(
    street='Rue de Lyon',
    zipcode='90001',
    country='France')
    contact = Contact.create(
    firstname='Stephane', lastname='Wirtel',
    email='[email protected]',
    address=address
    )
    print(contact.id)

    View Slide

  39. Read
    contact = Contact.get(Contact.email == '[email protected]')
    print(contact.firstname)
    contacts = Contact.select()
    for contact in contacts:
    print(contact.firstname)
    for contact in contacts.where(Contact.email == '[email protected]'):
    print(contact.firstname)

    View Slide

  40. Update
    with database.transaction():
    contact = Contact.get(Contact.email == '[email protected]')
    contact.firstname = 'Speaker'
    contact.save()

    View Slide

  41. Delete
    # Suppression pour une collection
    contacts = Contact.delete().where(Contact.email == '[email protected]')
    contacts.execute()
    # Suppression d'un seul enregistrement
    contact = Contact.get(Contact.email == '[email protected]')
    contact.delete_instance()

    View Slide

  42. SQLAlchemy
    Object Relational Mapping
    pip install sqlalchemy

    View Slide

  43. SQLAlchemy
    • Gestionnaire de connections
    • Abstraction des connecteurs SQL
    • Langage d’Expression SQL
    • ORM

    View Slide

  44. Connection
    from sqlalchemy import create_engine
    engine = create_engine('postgresql:///demo', echo=True)

    View Slide

  45. Modèles
    from sqlalchemy import Column, Integer, String, ForeignKey
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import relationship
    Base = declarative_base()
    class Address(Base):
    __tablename__ = 'address'
    id = Column(Integer, primary_key=True)
    street = Column(String)
    zipcode = Column(String)
    country = Column(String, nullable=False)
    class Contact(Base):
    __tablename__ = 'contact'
    id = Column(Integer, primary_key=True)
    firstname = Column(String, nullable=False)
    lastname = Column(String, nullable=False)
    email = Column(String, nullable=False)
    address_id = Column(Integer, ForeignKey(Address.id), nullable=False)
    address = relationship('Address')

    View Slide

  46. Création des tables
    Base.metadata.create_all(engine)

    View Slide

  47. Sessions
    Wraps les transactions
    session = Session(engine)
    session.commit()

    View Slide

  48. Create
    session = Session(engine)
    address = Address(street='SQLAlchemy Street', zipcode='5432',
    country='Belgique')
    contact = Contact(
    firstname='Stephane', lastname='Wirtel',
    email='[email protected]',
    address=address
    )
    session.add(contact)
    session.commit()

    View Slide

  49. Read
    contact =
    session.query(Contact).filter_by(email='[email protected]').first()
    print(contact.firstname)
    contacts = session.query(Contact).all()
    for contact in contacts:
    print(contact.firstname)
    contacts =
    session.query(Contact).filter_by(email='[email protected]').all()
    for contact in contacts:
    print(contact.firstname)

    View Slide

  50. Update
    contact = session.query(Contact) \
    .filter_by(email='[email protected]').first()
    contact.email = '[email protected]'
    session.add(contact)
    session.commit()

    View Slide

  51. Delete
    contact = session.query(Contact) \
    .filter_by(email='[email protected]').first()
    session.delete(contact)
    session.commit()

    View Slide

  52. Alembic
    La Migration Facile
    pip install alembic

    View Slide

  53. Alembic
    Basé sur SQLAlchemy
    Utilise un environnement de Migration
    Fichier de configuration INI
    Scripts de migration
    http://alembic.readthedocs.org/

    View Slide

  54. Alembic
    cd votreproject
    alembic init migrations
    > tree
    migrations/
    !"" README
    !"" alembic.ini
    !"" env.py
    !"" script.py.mako
    #"" versions

    View Slide

  55. Alembic
    revision = '22630db6f519'
    down_revision = None
    from alembic import op
    import sqlalchemy as sa
    def upgrade():
    op.create_table('user',
    sa.Column('id', sa.Integer(), nullable=False),
    sa.Column('created_at', sa.DateTime(), nullable=False),
    sa.Column('name', sa.String(length=255), nullable=False),
    sa.Column('email', sa.String(length=255), nullable=True),
    sa.Column('password', sa.String(length=255), nullable=True),
    sa.Column('active', sa.Boolean(), nullable=True),
    sa.Column('confirmed_at', sa.DateTime(), nullable=True),
    sa.PrimaryKeyConstraint('id'),
    sa.UniqueConstraint('email')
    )
    def downgrade():
    op.drop_table('user')
    alembic revision -m “bootstrap"
    alembic upgrade head

    View Slide

  56. Alembic
    revision = '7e4b6a43e6c'
    down_revision = '22630db6f519'
    from alembic import op
    import sqlalchemy as sa
    def upgrade():
    op.add_column('user', sa.Column('slug', sa.Unicode(length=255),
    nullable=False, server_default=''))
    def downgrade():
    op.drop_column('user', 'slug')
    alembic revision -m “add_slug”
    alembic upgrade head

    View Slide

  57. Alembic
    > tree
    migrations/
    !"" README
    !"" alembic.ini
    !"" env.py
    !"" script.py.mako
    #"" versions
    !"" 22630db6f519_bootstrap.py
    #"" 7e4b6a43e6c_add_slug.py

    View Slide

  58. Alembic
    from alembic import op
    from sqlalchemy.orm import Session
    from youproject import models
    def upgrade():
    engine = op.get_bind()
    session = Session(bind=engine)
    for contact in session.query(models.Contact).all():
    session.add(
    models.WifiUser(
    login=contact.name.lower(),
    password=random_string()
    )
    )
    session.commit()
    alembic upgrade head

    View Slide

  59. Alembic
    alembic revision
    alembic upgrade head
    alembic upgrade +2
    alembic downgrade -1
    alembic downgrade base
    alembic current
    alembic history

    View Slide

  60. Zone Dangereuse !!!

    View Slide

  61. Multicorn
    Où comment se connecter au monde !
    http://multicorn.org

    View Slide

  62. Rappel: Foreign Data Wrappers
    Utiliser une source externe à PostgreSQL
    Twitter, RSS, CSV, XML
    FileSystem
    Processes
    PostgreSQL, Oracle, MongoDB

    View Slide

  63. declarations for dynamic loading */
    _MODULE_MAGIC;
    G_FUNCTION_INFO_V1(mongo_fdw_handler);
    PG_FUNCTION_INFO_V1(mongo_fdw_validator);
    /*
    * mongo_fdw_handler creates and returns a struct with pointers to foreign table
    * callback functions.
    */
    Datum
    mongo_fdw_handler(PG_FUNCTION_ARGS)
    { FdwRoutine *fdwRoutine = makeNode(FdwRoutine);
    fdwRoutine->GetForeignRelSize = MongoGetForeignRelSize;
    fdwRoutine->GetForeignPaths = MongoGetForeignPaths;
    fdwRoutine->GetForeignPlan = MongoGetForeignPlan;
    fdwRoutine->ExplainForeignScan = MongoExplainForeignScan;
    fdwRoutine->BeginForeignScan = MongoBeginForeignScan;
    fdwRoutine->IterateForeignScan = MongoIterateForeignScan;
    fdwRoutine->ReScanForeignScan = MongoReScanForeignScan;
    fdwRoutine->EndForeignScan = MongoEndForeignScan;
    fdwRoutine->AnalyzeForeignTable = MongoAnalyzeForeignTable;
    PG_RETURN_POINTER(fdwRoutine);

    View Slide

  64. Multicorn
    • Extension PostgreSQL
    • Permet d’écrire un Foreign Data Wrapper
    • Wrapper Python de l’API C de PostgreSQL
    • Tous les outils utilisant SQL peuvent utiliser un
    FDW
    • Support FULL SQL ;-)

    View Slide

  65. FDW de base
    • RSS, CSV, XML
    • LDAP
    • Gmail, IMAP
    • Google Search
    • SQLAlchemy (mysql, sqlite, oracle, …)

    View Slide

  66. Simple exemple
    CREATE EXTENSION multicorn;

    View Slide

  67. Simple exemple (2)
    CREATE SERVER wirtel_be_srv
    FOREIGN DATA WRAPPER multicorn
    OPTIONS (
    wrapper 'multicorn.rssfdw.RssFdw'
    );
    CREATE FOREIGN TABLE wirtel_be_rss (
    "pubDate" TIMESTAMP,
    description CHARACTER VARYING,
    title CHARACTER VARYING,
    link CHARACTER VARYING
    ) SERVER wirtel_be_srv OPTIONS(
    url 'http://wirtel.be/feeds/python.rss.xml'
    );

    View Slide

  68. Simple exemple (3)
    SELECT "pubDate", title, link FROM wirtel_be_rss LIMIT 1;
    -[ RECORD 1 ]------------------------------------------------------------------------
    pubDate | 2014-10-19 00:00:00
    title | Python @ FOSDEM 2015 - Call For Proposals
    link | http://wirtel.be/posts/en/2014/10/19/python-fosdem-2015-call-for-proposals/

    View Slide

  69. avec OpenERP
    from multicorn import ForeignDataWrapper
    import erppeek
    class OpenERPForeignDataWrapper(ForeignDataWrapper):
    def __init__(self, options, columns):
    super(OpenERPForeignDataWrapper, self).__init__(options, columns)
    url = 'http://{hostname}:{password}'.format(**options)
    self.client = erppeek.Client(url,
    options['database'],
    options['username'],
    options['password'])
    self.object_name = options['object']
    def execute(self, quals, columns):
    proxy = self.client.model(self.object_name)
    item = {}
    for record in proxy.browse([]):
    for column in columns:
    item[column] = record[column]
    yield item

    View Slide

  70. avec OpenERP (2)
    CREATE EXTENSION multicorn;
    CREATE SERVER multicorn_openerp
    FOREIGN DATA WRAPPER multicorn
    OPTIONS (wrapper 'multicorn.openerpfdw.OpenERPForeignDataWrapper');
    CREATE FOREIGN TABLE oe_users (
    login character varying,
    name character varying)
    SERVER multicorn_openerp OPTIONS (
    hostname 'localhost',
    port '8069',
    database 'openerp',
    username 'admin',
    password 'admin',
    object 'res.users'
    );
    SELECT id, login, name, active
    FROM oe_users
    WHERE login = 'admin';

    View Slide

  71. PL/Python
    Etendre PostgreSQL

    View Slide

  72. PL/Python
    Python 2 ou Python 3
    Utilise toutes les librairies Python (PyPI)
    Apprentissage plus rapide que PL/pgSQL

    View Slide

  73. PL/Python
    apt-get install postgresql-plpython-9.3
    CREATE EXTENSION IF NOT EXISTS plpythonu;

    View Slide

  74. PL/Python
    CREATE OR REPLACE FUNCTION str_title(s VARCHAR) RETURNS VARCHAR AS $$
    return s.title()
    $$ LANGUAGE plpythonu;
    demo=# select str_title('hello world');
    -[ RECORD 1 ]----------
    str_title | Hello World

    View Slide

  75. PL/Python - DataT
    ypes
    PostgreSQL Python
    integer, bigint int, long
    boolean bool
    text types str
    SQL Array list
    Custom Types dict

    View Slide

  76. PL/Python - Debug
    Fonctions de base pour afficher les messages
    • plpy.notice(“”)
    • plpy.debug(“”)
    • plpy.error(“”)
    • plpy.fatal(“”)
    Oublier le ”print”

    View Slide

  77. PL/Python
    CREATE OR REPLACE FUNCTION get_pid_cpu_mem(pid INT)
    RETURNS TABLE(pid INT, cpu_perc FLOAT, mem_perc FLOAT) AS $$
    import psutil
    process = psutil.Process(pid)
    return [
    {
    'pid': pid,
    'cpu_perc': process.get_cpu_percent(interval=0),
    'mem_perc': process.get_memory_percent()
    }
    ]
    $$ LANGUAGE plpythonu;

    View Slide

  78. PL/Python
    -[ RECORD 1 ]----+---------------------------------
    datid | 16416
    datname | demo
    pid | 14680
    usesysid | 16384
    usename | stephane
    application_name | psql
    client_addr |
    client_hostname |
    client_port | -1
    backend_start | 2014-10-25 04:22:08.235532+02
    xact_start | 2014-10-25 04:25:28.712812+02
    query_start | 2014-10-25 04:25:28.712812+02
    state_change | 2014-10-25 04:25:28.712815+02
    waiting | f
    state | active
    query | select * from pg_stat_activity ;
    SELECT * FROM pg_stat_activity;

    View Slide

  79. PL/Python
    pid | cpu_perc | mem_perc
    -------+----------+--------------
    14680 | 0 | 1.4454081372
    (1 row)
    SELECT * FROM get_pid_cpu_mem(14680);

    View Slide

  80. PL/Python
    pid | cpu_perc | mem_perc | application_name
    -------+----------+---------------+------------------
    14680 | 0 | 1.50435626678 | psql
    (1 row)
    WITH stats AS (
    SELECT psa.*, get_pid_cpu_mem(psa.pid) as attrs
    from pg_stat_activity psa
    )
    SELECT (stats.attrs).pid,
    (stats.attrs).cpu_perc,
    (stats.attrs).mem_perc,
    stats.application_name
    FROM stats;

    View Slide

  81. PL/Python - Attention
    Unsafe -> pas de sandbox
    Difficile à maintenir et à debugger
    Comment rendre fou votre DBA ;-)
    Pas de virtualenv et demande les privilèges
    superuser.

    View Slide

  82. PL/Python - Pistes ;-)
    Traitement sur grosses masses de données
    Ajouter des contraintes fortes et logique dans le système.
    Ajouter de nouvelles fonctionnalités dans PostgreSQL
    “str_title”
    Triggers ?
    Utilisation des libs de PyPI (requests, redis, zmq, smtp…)
    Accès complet à la base de données !!!

    View Slide

  83. View Slide

  84. Questions ?
    @matrixise
    https://speakerdeck.com/matrixise

    View Slide