Slide 1

Slide 1 text

DJANGO WITH POSTGRESQL SUPERPOWERS PAOLO MELCHIORRE ~ @pauloxnet

Slide 2

Slide 2 text

Paolo Melchiorre ~ @pauloxnet 2 Django database backends

Slide 3

Slide 3 text

Paolo Melchiorre ~ @pauloxnet Django 3 © 1946 William Gottlieb (Public Domain) - Django Reinhardt at the Aquarium jazz club in New York

Slide 4

Slide 4 text

Paolo Melchiorre ~ @pauloxnet PostgreSQL 4 © 2019 Nam Anh (Unsplash Free Use) - Elephant walking during daytime

Slide 5

Slide 5 text

Paolo Melchiorre ~ @pauloxnet @pauloxnet • CTO @ 20tab • Software engineer • Python developer • Django contributor • PostgreSQL user Paolo Melchiorre 5 © 2022 Bartek Pawlik (CC BY-NC-SA) - DjangoCon US

Slide 6

Slide 6 text

Paolo Melchiorre ~ @pauloxnet 6 Django no-code $ python3.11 -m venv ~/.mysite $ . ~/.mysite/bin/activate $ python -m pip install django~=4.1 … Successfully installed … django-4.1.7 … $ cd ~/projects $ python -m django startproject mysite

Slide 7

Slide 7 text

Paolo Melchiorre ~ @pauloxnet 7 © 1943 Antoine de Saint-Exupéry (Public domain) - Snake digesting an elephant

Slide 8

Slide 8 text

Paolo Melchiorre ~ @pauloxnet 8 Psycopg psycopg (v0.01 ~2001) PostgreSQL driver, Python DB API 2.0, LGPL psycopg1 (v1.0 ~2001) psycopg2 (v2.0 ~2006) Python3 (v2.4 ~2011) JSON (v2.5 ~2013) psycopg3 (v3.0 ~2021)

Slide 9

Slide 9 text

Paolo Melchiorre ~ @pauloxnet 9 Installing PostgreSQL C client library $ sudo apt install libpq-dev … Setting up libpq-dev (14.7-0) …

Slide 10

Slide 10 text

Paolo Melchiorre ~ @pauloxnet 10 Installing psycopg2 $ python3 -m pip install psycopg2 … Successfully installed psycopg2-2.9.5

Slide 11

Slide 11 text

Paolo Melchiorre ~ @pauloxnet 11 Activating PostgreSQL backend DATABASES = { "default": { "ENGINE": "django.db.backends.postgresql", "HOST": "127.0.0.1", "NAME": "my_database_name", "PASSWORD": "my_database_password", "PORT": "5432", "USER": "my_database_user", } }

Slide 12

Slide 12 text

Paolo Melchiorre ~ @pauloxnet 12 PostgreSQL Connection URIs # environment DATABASE_URL=postgres://user:password@host:port/name # python3 -m pip install dj-database-url import dj_database_url DATABASES = {"default": dj_database_url.config()} # python3 -m pip install django-configurations from configurations import values DATABASES = values.DatabaseURLValue()

Slide 13

Slide 13 text

Paolo Melchiorre ~ @pauloxnet Making queries 13 https:/ /docs.djangoproject.com/en/stable/topics/db/queries/

Slide 14

Slide 14 text

Paolo Melchiorre ~ @pauloxnet 14 Blog models from django.db import models class Author(models.Model): name = models.CharField(max_length=200) class Entry(models.Model): headline = models.CharField(max_length=255) body_text = models.TextField() authors = models.ManyToManyField("Author")

Slide 15

Slide 15 text

Paolo Melchiorre ~ @pauloxnet 15 Making queries >>> from blog.models import Author >>> Author.objects.filter(name__contains="Terry") SELECT "name" FROM "blog_author" WHERE "name" LIKE '%Terry%'

Slide 16

Slide 16 text

Paolo Melchiorre ~ @pauloxnet Kickstarter 16 https:/ /www.kickstarter.com/projects/mjtamlyn/improved-postgresql-support-in-django

Slide 17

Slide 17 text

Paolo Melchiorre ~ @pauloxnet 17 PostgreSQL contrib django.contrib.postgres (v1.8 ~2015) Fields, indexes, functions, extensions, … Crowdfound Marc Tamlyn (~2014) JSONûeld* (v1.9 ~2015) Full-text search (v1.10 ~2016) RandomUUID (v2.0 ~2017) Operator Classes (v3.2 ~2021)

Slide 18

Slide 18 text

Paolo Melchiorre ~ @pauloxnet 18 Activating PostgreSQL module INSTALLED_APPS = [ "django.contrib.admin", "django.contrib.auth", "django.contrib.contenttypes", "django.contrib.sessions", "django.contrib.messages", "django.contrib.staticfiles", "django.contrib.postgres", "blog", ]

Slide 19

Slide 19 text

Paolo Melchiorre ~ @pauloxnet DjangoCon Europe 2017 19 © 2017 Paolo Melchiorre (CC BY-SA) - DjangoCon Europe Sprint

Slide 20

Slide 20 text

Paolo Melchiorre ~ @pauloxnet • Version 4 (random) UUID • pgcrypto module • cryptographic functions • CryptoExtension migration • in-core from PG 13 RandomUUID 20

Slide 21

Slide 21 text

Paolo Melchiorre ~ @pauloxnet 21 UUID field import uuid from django.db import models class Entry(models.Model): headline = models.CharField(max_length=255) body_text = models.TextField() authors = models.ManyToManyField("Author") uuid = models.UUIDField(default=uuid.uuid4)

Slide 22

Slide 22 text

Paolo Melchiorre ~ @pauloxnet 22 RandomUUID update >>> from django.contrib.postgres import functions >>> from blog.models import Entry >>> Entry.objects.update(uuid=functions.RandomUUID()) UPDATE "blog_entry" SET "uuid" = GEN_RANDOM_UUID() -- ~1 million UUIDs in ~30 s

Slide 23

Slide 23 text

Paolo Melchiorre ~ @pauloxnet 23 © 2017 Paolo Melchiorre (CC BY-SA) - EuroPython Sprints

Slide 24

Slide 24 text

Paolo Melchiorre ~ @pauloxnet • Fields, expressions, functions • GIN index • GiST index • Phrase search • Web search Full-text search 24

Slide 25

Slide 25 text

Paolo Melchiorre ~ @pauloxnet 25 Search lookup >>> from blog.models import Entry >>> Entry.objects.filter(headline__search="tomato") SELECT * FROM "blog_entry" WHERE to_tsvector("headline") @@ plainto_tsquery('tomato')

Slide 26

Slide 26 text

Paolo Melchiorre ~ @pauloxnet 26 SearchVector from django.contrib.postgres import indexes, search from django.db import models V = search.SearchVector("headline", config="english") class Entry(models.Model): ... class Meta: indexes = [indexes.GinIndex(V, name="v_idx")]

Slide 27

Slide 27 text

Paolo Melchiorre ~ @pauloxnet 27 SearchQuery >>> from django.contrib.postgres import search >>> from blog.models import Entry, V >>> q = "ham OR tuna -jam" >>> Q = search.SearchQuery(q, search_type="websearch") >>> Entry.objects.annotate(search=V).filter(search=Q) SELECT * FROM "blog_entry" WHERE to_tsvector('english', "headline") @@ websearch_to_tsquery('ham OR tuna -jam)

Slide 28

Slide 28 text

Paolo Melchiorre ~ @pauloxnet 28 © 2018 Paolo Melchiorre (CC BY-SA) - Trondheim (Norway)

Slide 29

Slide 29 text

Paolo Melchiorre ~ @pauloxnet • PG array • Array of data • Base ûeld • Size • Multi-dimensional Array Field 29

Slide 30

Slide 30 text

Paolo Melchiorre ~ @pauloxnet 30 Array Email Field from django.contrib.postgres.fields import ArrayField from django.db import models class Author(models.Model): name = models.CharField(max_length=200) emails = ArrayField(models.EmailField())

Slide 31

Slide 31 text

Paolo Melchiorre ~ @pauloxnet 31 Making array queries >>> from django.db.models import Q >>> from blog.models import Author >>> q = Q(emails__contains=["[email protected]"]) >>> Author.objects.filter(q).values_list("emails") [['[email protected]', '[email protected]']] SELECT "emails" FROM "blog_author" WHERE "emails" @> ARRAY['[email protected]']

Slide 32

Slide 32 text

Paolo Melchiorre ~ @pauloxnet 32 django-better-admin-arrayfield Before After

Slide 33

Slide 33 text

Paolo Melchiorre ~ @pauloxnet 33 © 2013 Paolo Melchiorre (CC BY-SA) - San Francisco (California)

Slide 34

Slide 34 text

Paolo Melchiorre ~ @pauloxnet • California Civic Data Coalition • PostgreSQL COPY support • COPY TO • COPY FROM • Temporary tables Django Postgres Copy 34

Slide 35

Slide 35 text

Paolo Melchiorre ~ @pauloxnet 35 Django Postgres Copy - Model from django.db import models from postgres_copy import CopyManager class Feature(models.Model): name = models.CharField(max_length=255) alternative_names = models.TextField() osm_type = models.CharField(max_length=8) osm_id = models.PositiveBigIntegerField() ... objects = CopyManager()

Slide 36

Slide 36 text

Paolo Melchiorre ~ @pauloxnet 36 Django Postgres Copy - CSV >>> import pathlib >>> pathlib.Path("/tmp/italy.csv").stat().st_size 217879987 (~208 MB) >>> from blog.models import Feature >>> Feature.objects.from_csv("/tmp/italy.csv") CPU times: user 354 ms, sys: 101 ms, total: 455 ms 918904 (~1 M) records loaded Wall time: 3.46 s

Slide 37

Slide 37 text

Paolo Melchiorre ~ @pauloxnet 37 Django Postgres Copy - SQL CREATE TEMPORARY TABLE "temp_blog_feature" ("..."); COPY "temp_blog_feature" FROM STDIN WITH CSV HEADER; -- COPY 918904 -- Time: 2233,077 ms (00:02,233) INSERT INTO "blog_feature" ("...") ( SELECT ... FROM "temp_blog_feature" ); DROP TABLE IF EXISTS "temp_blog_feature";

Slide 38

Slide 38 text

Paolo Melchiorre ~ @pauloxnet 38 Django Postgres Copy - GZip >>> import pathlib >>> pathlib.Path("/tmp/italy.csv.gz").stat().st_size 42876187 (~41 MB) >>> from geonames.models import Feature >>> import gzip >>> with gzip.open("/tmp/italy.csv.gz", "rb") as f: Feature.objects.from_csv(f) CPU times: user 1.02 s, sys: 42.8 ms, total: 1.06 s 918904 (~1M) records loaded Wall time: 3.45 s

Slide 39

Slide 39 text

Paolo Melchiorre ~ @pauloxnet Tree 39 © 2020 Paolo Melchiorre (CC BY-SA) - Abruzzo (Italy)

Slide 40

Slide 40 text

Paolo Melchiorre ~ @pauloxnet • ltree module • Hierarchical tree-like data • Materialized path • Dedicated PathField • Abstract TreeModel Django ltree 40

Slide 41

Slide 41 text

Paolo Melchiorre ~ @pauloxnet 41 TreeModel from django.contrib.postgres.indexes import GistIndex from django_ltree.models import TreeModel class Entry(TreeModel): ... class Meta: indexes = [GistIndex(fields=['path'])]

Slide 42

Slide 42 text

Paolo Melchiorre ~ @pauloxnet 42 Tree - Example top pictures science astronomy astronomy astrophysics cosmology galaxies stars astronauts

Slide 43

Slide 43 text

Paolo Melchiorre ~ @pauloxnet 43 Tree - Queries >>> from blog.models import Entry >>> path = "top.science" >>> q = Entry.objects.filter(path__descendants=path) >>> q.order_by("path").values_list("path", flat=True) [top.science, top.science.astronomy, top.science.astronomy.astrophysics, top.science.astronomy.cosmology] SELECT "path" FROM "blog_entry" WHERE "path" <@ 'top.science' ORDER BY "path" ASC

Slide 44

Slide 44 text

Paolo Melchiorre ~ @pauloxnet 44 © 2020 Paolo Melchiorre (CC BY-SA) Appennines (Italy)

Slide 45

Slide 45 text

Paolo Melchiorre ~ @pauloxnet • PostgreSQL extension • Best* GeoDjango backend • Spatial data types • Spatial indexing • Spatial functions PostGIS 45

Slide 46

Slide 46 text

Paolo Melchiorre ~ @pauloxnet 46 GeoDjango Compatibility Table ● Spatial Lookups (18/31) ● Database functions (15/32) ● Aggregate Functions (0/5) MariaDB 49% ● Spatial Lookups (31/31) ● Database functions (32/32) ● Aggregate Functions (5/5) PostGIS 100% ● Spatial Lookups (22/31) ● Database functions (30/32) ● Aggregate Functions (4/5) Spatialite 82% ● Spatial Lookups (18/31) ● Database functions (16/32) ● Aggregate Functions (0/5) MySQL 50%

Slide 47

Slide 47 text

Paolo Melchiorre ~ @pauloxnet 47 What’s next • Indexes • Aggregation functions • Trigram extension • Range ûelds • ...

Slide 48

Slide 48 text

Paolo Melchiorre ~ @pauloxnet 48 Tips • docs in djangoproject.com • details in postgresql.org • source code in github.com • questions in stackoverüow.com

Slide 49

Slide 49 text

Paolo Melchiorre ~ @pauloxnet 49 License CC BY-SA 4.0 This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Slide 50

Slide 50 text

Paolo Melchiorre ~ @pauloxnet @psycopg danielevarrazzo psycopg lists.postgresql.org psycopg.org 50

Slide 51

Slide 51 text

Paolo Melchiorre ~ @pauloxnet @20tab 20tab 20tab [email protected] 20tab.com 51

Slide 52

Slide 52 text

Paolo Melchiorre ~ @pauloxnet @pauloxnet paolomelchiorre pauloxnet paulox.net 52 @[email protected]