Je vais prendre des raccourcis • Points de douleurs avec SQLAlchemy (venant du monde Django) • La vrai vie est plus compliquée que les extraits de code montrés ici
"/pyconfr2025/step1.py", line 11, in <module> print(session.query(User).all()) ~~~~~~~~~~~~~~~~~~~~~~~^^ sqlalchemy.exc.IntegrityError: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely) (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "pyconfr2025_user_email_key" … user = session.query(User).filter(User.email=="[email protected]").one() user.email = "[email protected]" users = session.query(User).all() autres_modifications_user(user, users) session.flush()
en base de donnée • ou pas assez fi nement (flush() marche en tout ou rien) • no_autoflush règle globalement le problème • Mais même si autoflush désactivé, certaines commandes font un flush : commit(), begin_nested() IntegrityError en lecture Points de douleur
print(f"User {user.email} has:") print("groups:") for group in user.groups: print(f"- {group.name}") end = datetime.datetime.now() print(f"Printed {len(users)} users in {(end - start).total_seconds()} seconds") joinedload Code naïf - posons le cadre
print(f"User {user.email} has:") print("groups:") for group in user.groups: print(f"- {group.name}") end = datetime.datetime.now() print(f"Printed {len(users)} users in {(end - start).total_seconds()} seconds") joinedload Code naïf - posons le cadre
print(f"User {user.email} has:") print("groups:") for group in user.groups: print(f"- {group.name}") end = datetime.datetime.now() print(f"Printed {len(users)} users in {(end - start).total_seconds()} seconds") joinedload Code naïf - posons le cadre
print(f"User {user.email} has:") print("groups:") for group in user.groups: print(f"- {group.name}") end = datetime.datetime.now() print(f"Printed {len(users)} users in {(end - start).total_seconds()} seconds") joinedload Code naïf - posons le cadre
B Utilisateur 003 Entreprise C Utilisateur 004 Entreprise A Utilisateur 005 None / NULL SELECT user.*, company.* FROM user LEFT OUTER JOIN company ON user.company_id = company
sqlalchemy.orm.joinedload(User.groups), ).all() for user in users: print(f"User {user.email} has:") print("groups:") for group in user.groups: print(f"- {group.name}") end = datetime.datetime.now() print(f"Loaded {len(users)} users in {(end - start).total_seconds()} seconds")
= session.query(User).options( sqlalchemy.orm.joinedload(User.groups), sqlalchemy.orm.joinedload(User.tags), ).all() for user in users: print(f"User {user.email} has:") print("groups:") for group in user.groups: print(f"- {group.name}") print("tags:") for tag in user.tags: print(f"- {tag.name}") end = datetime.datetime.now() print(f"Loaded {len(users)} users in {(end - start).total_seconds()} seconds")
sqlalchemy.orm.selectinload(User.tags), ).all() for user in users: print(f"User {user.email} has:") print("groups:") for group in user.groups: print(f"- {group.name}") print("tags:") for tag in user.tags: print(f"- {tag.name}") end = datetime.datetime.now() print(f"Loaded {len(users)} users in {(end - start).total_seconds()} seconds")
- timedelta(weeks=4) for user in users: user.created_at = ( created_at if len(tags)>5 else datetime.now() ) session.commit() 3 requêtes SELECT N requêtes UPDATE
- timedelta(weeks=4) for user in users: user.created_at = ( created_at if len(tags)>5 else datetime.now() ) session.commit() 3 requêtes SELECT N requêtes UPDATE 3xN requêtes SELECT
- timedelta(weeks=3) for batch_ids in batched(user_ids, 50): users = session.query(User).options( sqlalchemy.orm.selectinload(User.groups), sqlalchemy.orm.selectinload(User.tags), ).filter(User.id.in_([i for i, in batch_ids])).all() for user in users: user.created_at = created_at session.commit()
instances will be fully expired after each commit(), so that all attribute/object access subsequent to a completed transaction will load from the most recent database state.