Slide 1

Slide 1 text

Python & Postgresql A Wonderful Wedding Stéphane Wirtel @PyConFR 2014 - Lyon http://goo.gl/UsPLwh

Slide 2

Slide 2 text

Stéphane Wirtel #fellow @ThePSF #member EPS #member AFPy @PythonFOSDEM @matrixise [email protected] http://wirtel.be

Slide 3

Slide 3 text

Agenda Qu’allons-nous apprendre ?

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

Python

Slide 6

Slide 6 text

Python, c’est quoi ? Langage de Programmation Syntaxe simple & lisible Interprété Multi-plateformes Typage Dynamique Fort Multi-paradigmes Garbage Collector

Slide 7

Slide 7 text

Postgresql

Slide 8

Slide 8 text

PostgreSQL - Story Développée à Berkeley Réécriture de Ingres 1985 -> Postgres (vient de Post-Ingres) 1995 -> Postgres95 (0.01) (Ajout d’un interpréter SQL) 1997 -> PostgreSQL 6.0 2013 -> PostgreSQL 9.3 2014 -> PostgreSQL 9.4 (b4)

Slide 9

Slide 9 text

PostgreSQL - TL;DR ORDBMS, ACID SQL:2011 DataTypes Transactional DDL Concurrent Index Extensions Common Table Expression MultiVersion Concurrency Control Cross-Platform

Slide 10

Slide 10 text

PostgreSQL - TL;DR Replication Foreign Data Wrappers Procedural Languages Triggers Full text Search Views (Materialized) Table Inheritance Listen/Notify

Slide 11

Slide 11 text

PostgreSQL, FDW ? Utiliser une source externe à PostgreSQL Twitter, RSS, CSV, XML FileSystem Processes

Slide 12

Slide 12 text

PostgreSQL, PL ? Étend la base de données Fonctions stockées Safe (sandbox, SQL, PL/pgSQL) / Unsafe (C) PL/Python, PL/V8, PL/PERL

Slide 13

Slide 13 text

DB-API 2.0 Commençons par le commencement… http://python.org/peps/pep-0249

Slide 14

Slide 14 text

DB-API 2.0 aka PEP-249 API pour les connecteurs de Base de Données Facile à utiliser, et à comprendre Utilise deux concepts: • Connection • Curseur http://legacy.python.org/dev/peps/pep-0249/

Slide 15

Slide 15 text

Connection Emballe une connection vers la base de données Gère les transactions et leur état (Commit/ Rollback) N’exécute pas de requêtes SQL.

Slide 16

Slide 16 text

Connection - API • connect(parameters=None) • close() • commit() • rollback() • cursor([name=None])

Slide 17

Slide 17 text

Connection - Exemple import driver conn = driver.connect(database='database', host='localhost', port=5432) try: # cree le curseur # utilise le curseur except Exception: conn.rollback() else: conn.commit() conn.close()

Slide 18

Slide 18 text

Curseur Créé via une instance d’une connection Utilisé pour la manipulation et interrogation de la Base de Données cursor = conn.cursor() cursor.execute(""" SELECT column1, column2 FROM tableA """) for column1, column2 in cursor.fetchall(): print(column1, column2)

Slide 19

Slide 19 text

Curseur - API • callproc(procname[, parameters]) • execute(query[, parameters]) • fetchone(), fetchmany([size=cursor.arraysize]), fetchall() • close()

Slide 20

Slide 20 text

Curseur - execute execute(query[, parameters]) Performance et Sécurité JAMAIS utiliser l’interpolation (%) et la concaténation (+) => SQL Injection

Slide 21

Slide 21 text

Curseur - execute (2) Accepte un formatage de la requête. qmark Question mark WHERE field = ? numeric Numeric positional WHERE field = :1 named Named WHERE field = :code format ANSI C print format WHERE field = %s pyformat Python format WHERE field = %(name)s

Slide 22

Slide 22 text

Exemple plus complet ;-) import driver conn = driver.connect(database='database', host='localhost', port=5432) cursor = conn.cursor() try: cursor.execute("SELECT column1, column2 FROM tableA") for column1, column2 in cursor.fetchall(): print(column1, column2) except Exception: conn.rollback() else: conn.commit() finally: cursor.close() conn.close()

Slide 23

Slide 23 text

psycopg2 La Base… pip install psycopg2

Slide 24

Slide 24 text

Introduction PostgreSQL Adaptor, basé sur libpq DB-API 2.0 Compliant Multi-thread Pool de Connections Full Asynchronous, Coroutines Supporte TOUS les types de données de PostgreSQL (Json, Hstore, …) Python 2.5+, 3.1+, PostgreSQL 7.4+

Slide 25

Slide 25 text

Curseurs cursor = conn.cursor() cursor = conn.cursor(name=“pycon_cursor”) • Curseur coté PostgreSQL • Interrogation, manipulation des données de la base. • Nommé => Utilisé pour les gros ensembles de données

Slide 26

Slide 26 text

Requêtes SQL cursor.execute("SELECT * FROM table") cursor.execute("INSERT INTO table (field1, field2) VALUES (%s, %s)", (field1, field2)) cursor.execute("DELETE FROM table") cursor.execute("UPDATE table SET field1=%s", ('value',)) cursor.execute("CREATE DATABASE database") cursor.execute("DROP DATABASE database")

Slide 27

Slide 27 text

Exemple import psycopg2 conn = psycopg2.connect(host='localhost', port=5432, user='username', password='password', database='database') cursor = conn.cursor() try: cursor.execute("SELECT column1, column2 FROM tableA") for column1, column2 in cursor.fetchall(): print(column1, column2) except Exception: conn.rollback() else: conn.commit() finally: cursor.close() conn.close()

Slide 28

Slide 28 text

Exemple - Context Manager import psycopg2 DSN = dict(host='localhost', port=5432, user='username', password='password', database='database') with psycopg2.connect(**DSN) as conn: with conn.cursor() as cursor: cursor.execute("SELECT column1, column2 FROM tableA") for column1, column2 in cursor.fetchall(): print(column1, column2) conn.commit()

Slide 29

Slide 29 text

Object Relational Mapping

Slide 30

Slide 30 text

Introduction “Technique de Programmation créant l’illusion d’une base de données orientée objet à partir d’une base de données relationnelle en définissant des correspondances entre cette base de données et les objets du langage utilisé.” Wikipedia

Slide 31

Slide 31 text

En bref • Mapper une classe Business sur une table ! • Méthodes pour ajouter, modifier, supprimer et rechercher

Slide 32

Slide 32 text

Peewee-ORM Object Relational Mapping pip install peewee

Slide 33

Slide 33 text

Peewee ORM Simple, facile à comprendre et extensible Python 2.6+ et 3.2+ Extensions 3200 lignes de code (MySQL, Sqlite3 et PostgreSQL) Très bonne documentation

Slide 34

Slide 34 text

Connection import peewee database = peewee.PostgresqlDatabase('demo')

Slide 35

Slide 35 text

Modèles class BaseModel(peewee.Model): id = peewee.PrimaryKeyField() class Meta: database = database class Address(BaseModel): street = peewee.CharField() zipcode = peewee.CharField() country = peewee.CharField() class Contact(BaseModel): firstname = peewee.CharField(null=False) lastname = peewee.CharField(null=False) email = peewee.CharField(unique=True, null=False, index=True) address = peewee.ForeignKeyField(Address, null=False, related_name=‘contacts')

Slide 36

Slide 36 text

Création des tables database.create_tables([Address, Contact])

Slide 37

Slide 37 text

Peewee - T ransactions tx = database.transaction() tx.commit() tx.rollback() with database.transaction(): pass

Slide 38

Slide 38 text

Create with database.transaction(): address = Address.create( street='Rue de Lyon', zipcode='90001', country='France') contact = Contact.create( firstname='Stephane', lastname='Wirtel', email='[email protected]', address=address ) print(contact.id)

Slide 39

Slide 39 text

Read contact = Contact.get(Contact.email == '[email protected]') print(contact.firstname) contacts = Contact.select() for contact in contacts: print(contact.firstname) for contact in contacts.where(Contact.email == '[email protected]'): print(contact.firstname)

Slide 40

Slide 40 text

Update with database.transaction(): contact = Contact.get(Contact.email == '[email protected]') contact.firstname = 'Speaker' contact.save()

Slide 41

Slide 41 text

Delete # Suppression pour une collection contacts = Contact.delete().where(Contact.email == '[email protected]') contacts.execute() # Suppression d'un seul enregistrement contact = Contact.get(Contact.email == '[email protected]') contact.delete_instance()

Slide 42

Slide 42 text

SQLAlchemy Object Relational Mapping pip install sqlalchemy

Slide 43

Slide 43 text

SQLAlchemy • Gestionnaire de connections • Abstraction des connecteurs SQL • Langage d’Expression SQL • ORM

Slide 44

Slide 44 text

Connection from sqlalchemy import create_engine engine = create_engine('postgresql:///demo', echo=True)

Slide 45

Slide 45 text

Modèles from sqlalchemy import Column, Integer, String, ForeignKey from sqlalchemy.ext.declarative import declarative_base from sqlalchemy.orm import relationship Base = declarative_base() class Address(Base): __tablename__ = 'address' id = Column(Integer, primary_key=True) street = Column(String) zipcode = Column(String) country = Column(String, nullable=False) class Contact(Base): __tablename__ = 'contact' id = Column(Integer, primary_key=True) firstname = Column(String, nullable=False) lastname = Column(String, nullable=False) email = Column(String, nullable=False) address_id = Column(Integer, ForeignKey(Address.id), nullable=False) address = relationship('Address')

Slide 46

Slide 46 text

Création des tables Base.metadata.create_all(engine)

Slide 47

Slide 47 text

Sessions Wraps les transactions session = Session(engine) session.commit()

Slide 48

Slide 48 text

Create session = Session(engine) address = Address(street='SQLAlchemy Street', zipcode='5432', country='Belgique') contact = Contact( firstname='Stephane', lastname='Wirtel', email='[email protected]', address=address ) session.add(contact) session.commit()

Slide 49

Slide 49 text

Read contact = session.query(Contact).filter_by(email='[email protected]').first() print(contact.firstname) contacts = session.query(Contact).all() for contact in contacts: print(contact.firstname) contacts = session.query(Contact).filter_by(email='[email protected]').all() for contact in contacts: print(contact.firstname)

Slide 50

Slide 50 text

Update contact = session.query(Contact) \ .filter_by(email='[email protected]').first() contact.email = '[email protected]' session.add(contact) session.commit()

Slide 51

Slide 51 text

Delete contact = session.query(Contact) \ .filter_by(email='[email protected]').first() session.delete(contact) session.commit()

Slide 52

Slide 52 text

Alembic La Migration Facile pip install alembic

Slide 53

Slide 53 text

Alembic Basé sur SQLAlchemy Utilise un environnement de Migration Fichier de configuration INI Scripts de migration http://alembic.readthedocs.org/

Slide 54

Slide 54 text

Alembic cd votreproject alembic init migrations > tree migrations/ !"" README !"" alembic.ini !"" env.py !"" script.py.mako #"" versions

Slide 55

Slide 55 text

Alembic revision = '22630db6f519' down_revision = None from alembic import op import sqlalchemy as sa def upgrade(): op.create_table('user', sa.Column('id', sa.Integer(), nullable=False), sa.Column('created_at', sa.DateTime(), nullable=False), sa.Column('name', sa.String(length=255), nullable=False), sa.Column('email', sa.String(length=255), nullable=True), sa.Column('password', sa.String(length=255), nullable=True), sa.Column('active', sa.Boolean(), nullable=True), sa.Column('confirmed_at', sa.DateTime(), nullable=True), sa.PrimaryKeyConstraint('id'), sa.UniqueConstraint('email') ) def downgrade(): op.drop_table('user') alembic revision -m “bootstrap" alembic upgrade head

Slide 56

Slide 56 text

Alembic revision = '7e4b6a43e6c' down_revision = '22630db6f519' from alembic import op import sqlalchemy as sa def upgrade(): op.add_column('user', sa.Column('slug', sa.Unicode(length=255), nullable=False, server_default='')) def downgrade(): op.drop_column('user', 'slug') alembic revision -m “add_slug” alembic upgrade head

Slide 57

Slide 57 text

Alembic > tree migrations/ !"" README !"" alembic.ini !"" env.py !"" script.py.mako #"" versions !"" 22630db6f519_bootstrap.py #"" 7e4b6a43e6c_add_slug.py

Slide 58

Slide 58 text

Alembic from alembic import op from sqlalchemy.orm import Session from youproject import models def upgrade(): engine = op.get_bind() session = Session(bind=engine) for contact in session.query(models.Contact).all(): session.add( models.WifiUser( login=contact.name.lower(), password=random_string() ) ) session.commit() alembic upgrade head

Slide 59

Slide 59 text

Alembic alembic revision alembic upgrade head alembic upgrade +2 alembic downgrade -1 alembic downgrade base alembic current alembic history

Slide 60

Slide 60 text

Zone Dangereuse !!!

Slide 61

Slide 61 text

Multicorn Où comment se connecter au monde ! http://multicorn.org

Slide 62

Slide 62 text

Rappel: Foreign Data Wrappers Utiliser une source externe à PostgreSQL Twitter, RSS, CSV, XML FileSystem Processes PostgreSQL, Oracle, MongoDB

Slide 63

Slide 63 text

declarations for dynamic loading */ _MODULE_MAGIC; G_FUNCTION_INFO_V1(mongo_fdw_handler); PG_FUNCTION_INFO_V1(mongo_fdw_validator); /* * mongo_fdw_handler creates and returns a struct with pointers to foreign table * callback functions. */ Datum mongo_fdw_handler(PG_FUNCTION_ARGS) { FdwRoutine *fdwRoutine = makeNode(FdwRoutine); fdwRoutine->GetForeignRelSize = MongoGetForeignRelSize; fdwRoutine->GetForeignPaths = MongoGetForeignPaths; fdwRoutine->GetForeignPlan = MongoGetForeignPlan; fdwRoutine->ExplainForeignScan = MongoExplainForeignScan; fdwRoutine->BeginForeignScan = MongoBeginForeignScan; fdwRoutine->IterateForeignScan = MongoIterateForeignScan; fdwRoutine->ReScanForeignScan = MongoReScanForeignScan; fdwRoutine->EndForeignScan = MongoEndForeignScan; fdwRoutine->AnalyzeForeignTable = MongoAnalyzeForeignTable; PG_RETURN_POINTER(fdwRoutine);

Slide 64

Slide 64 text

Multicorn • Extension PostgreSQL • Permet d’écrire un Foreign Data Wrapper • Wrapper Python de l’API C de PostgreSQL • Tous les outils utilisant SQL peuvent utiliser un FDW • Support FULL SQL ;-)

Slide 65

Slide 65 text

FDW de base • RSS, CSV, XML • LDAP • Gmail, IMAP • Google Search • SQLAlchemy (mysql, sqlite, oracle, …)

Slide 66

Slide 66 text

Simple exemple CREATE EXTENSION multicorn;

Slide 67

Slide 67 text

Simple exemple (2) CREATE SERVER wirtel_be_srv FOREIGN DATA WRAPPER multicorn OPTIONS ( wrapper 'multicorn.rssfdw.RssFdw' ); CREATE FOREIGN TABLE wirtel_be_rss ( "pubDate" TIMESTAMP, description CHARACTER VARYING, title CHARACTER VARYING, link CHARACTER VARYING ) SERVER wirtel_be_srv OPTIONS( url 'http://wirtel.be/feeds/python.rss.xml' );

Slide 68

Slide 68 text

Simple exemple (3) SELECT "pubDate", title, link FROM wirtel_be_rss LIMIT 1; -[ RECORD 1 ]------------------------------------------------------------------------ pubDate | 2014-10-19 00:00:00 title | Python @ FOSDEM 2015 - Call For Proposals link | http://wirtel.be/posts/en/2014/10/19/python-fosdem-2015-call-for-proposals/

Slide 69

Slide 69 text

avec OpenERP from multicorn import ForeignDataWrapper import erppeek class OpenERPForeignDataWrapper(ForeignDataWrapper): def __init__(self, options, columns): super(OpenERPForeignDataWrapper, self).__init__(options, columns) url = 'http://{hostname}:{password}'.format(**options) self.client = erppeek.Client(url, options['database'], options['username'], options['password']) self.object_name = options['object'] def execute(self, quals, columns): proxy = self.client.model(self.object_name) item = {} for record in proxy.browse([]): for column in columns: item[column] = record[column] yield item

Slide 70

Slide 70 text

avec OpenERP (2) CREATE EXTENSION multicorn; CREATE SERVER multicorn_openerp FOREIGN DATA WRAPPER multicorn OPTIONS (wrapper 'multicorn.openerpfdw.OpenERPForeignDataWrapper'); CREATE FOREIGN TABLE oe_users ( login character varying, name character varying) SERVER multicorn_openerp OPTIONS ( hostname 'localhost', port '8069', database 'openerp', username 'admin', password 'admin', object 'res.users' ); SELECT id, login, name, active FROM oe_users WHERE login = 'admin';

Slide 71

Slide 71 text

PL/Python Etendre PostgreSQL

Slide 72

Slide 72 text

PL/Python Python 2 ou Python 3 Utilise toutes les librairies Python (PyPI) Apprentissage plus rapide que PL/pgSQL

Slide 73

Slide 73 text

PL/Python apt-get install postgresql-plpython-9.3 CREATE EXTENSION IF NOT EXISTS plpythonu;

Slide 74

Slide 74 text

PL/Python CREATE OR REPLACE FUNCTION str_title(s VARCHAR) RETURNS VARCHAR AS $$ return s.title() $$ LANGUAGE plpythonu; demo=# select str_title('hello world'); -[ RECORD 1 ]---------- str_title | Hello World

Slide 75

Slide 75 text

PL/Python - DataT ypes PostgreSQL Python integer, bigint int, long boolean bool text types str SQL Array list Custom Types dict

Slide 76

Slide 76 text

PL/Python - Debug Fonctions de base pour afficher les messages • plpy.notice(“”) • plpy.debug(“”) • plpy.error(“”) • plpy.fatal(“”) Oublier le ”print”

Slide 77

Slide 77 text

PL/Python CREATE OR REPLACE FUNCTION get_pid_cpu_mem(pid INT) RETURNS TABLE(pid INT, cpu_perc FLOAT, mem_perc FLOAT) AS $$ import psutil process = psutil.Process(pid) return [ { 'pid': pid, 'cpu_perc': process.get_cpu_percent(interval=0), 'mem_perc': process.get_memory_percent() } ] $$ LANGUAGE plpythonu;

Slide 78

Slide 78 text

PL/Python -[ RECORD 1 ]----+--------------------------------- datid | 16416 datname | demo pid | 14680 usesysid | 16384 usename | stephane application_name | psql client_addr | client_hostname | client_port | -1 backend_start | 2014-10-25 04:22:08.235532+02 xact_start | 2014-10-25 04:25:28.712812+02 query_start | 2014-10-25 04:25:28.712812+02 state_change | 2014-10-25 04:25:28.712815+02 waiting | f state | active query | select * from pg_stat_activity ; SELECT * FROM pg_stat_activity;

Slide 79

Slide 79 text

PL/Python pid | cpu_perc | mem_perc -------+----------+-------------- 14680 | 0 | 1.4454081372 (1 row) SELECT * FROM get_pid_cpu_mem(14680);

Slide 80

Slide 80 text

PL/Python pid | cpu_perc | mem_perc | application_name -------+----------+---------------+------------------ 14680 | 0 | 1.50435626678 | psql (1 row) WITH stats AS ( SELECT psa.*, get_pid_cpu_mem(psa.pid) as attrs from pg_stat_activity psa ) SELECT (stats.attrs).pid, (stats.attrs).cpu_perc, (stats.attrs).mem_perc, stats.application_name FROM stats;

Slide 81

Slide 81 text

PL/Python - Attention Unsafe -> pas de sandbox Difficile à maintenir et à debugger Comment rendre fou votre DBA ;-) Pas de virtualenv et demande les privilèges superuser.

Slide 82

Slide 82 text

PL/Python - Pistes ;-) Traitement sur grosses masses de données Ajouter des contraintes fortes et logique dans le système. Ajouter de nouvelles fonctionnalités dans PostgreSQL “str_title” Triggers ? Utilisation des libs de PyPI (requests, redis, zmq, smtp…) Accès complet à la base de données !!!

Slide 83

Slide 83 text

No content

Slide 84

Slide 84 text

Questions ? @matrixise https://speakerdeck.com/matrixise