Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Django with PostgreSQL superpowers

Django with PostgreSQL superpowers

-- A talk I gave at Python Web Conference 2023

Django per se is a database-agnostic web framework, but real-world projects based on it are not. PostgreSQL has the richest feature set of any supported databases and we’ll see how to use these superpowers in Django.

More info on my blog: https://www.paulox.net/2023/03/15/python-web-conf-2023/

Paolo Melchiorre

March 15, 2023
Tweet

More Decks by Paolo Melchiorre

Other Decks in Technology

Transcript

  1. Paolo Melchiorre ~ @pauloxnet 2 Django database backends <Django attempts

    to support as many features as possible on all database backends. However, not all database backends are alike …= — <Databases=, Django Documentation
  2. Paolo Melchiorre ~ @pauloxnet Django 3 © 1946 William Gottlieb

    (Public Domain) - Django Reinhardt at the Aquarium jazz club in New York
  3. Paolo Melchiorre ~ @pauloxnet PostgreSQL 4 © 2019 Nam Anh

    (Unsplash Free Use) - Elephant walking during daytime
  4. Paolo Melchiorre ~ @pauloxnet @pauloxnet • CTO @ 20tab •

    Software engineer • Python developer • Django contributor • PostgreSQL user Paolo Melchiorre 5 © 2022 Bartek Pawlik (CC BY-NC-SA) - DjangoCon US
  5. Paolo Melchiorre ~ @pauloxnet 6 Django no-code $ python3.11 -m

    venv ~/.mysite $ . ~/.mysite/bin/activate $ python -m pip install django~=4.1 … Successfully installed … django-4.1.7 … $ cd ~/projects $ python -m django startproject mysite
  6. Paolo Melchiorre ~ @pauloxnet 7 © 1943 Antoine de Saint-Exupéry

    (Public domain) - Snake digesting an elephant
  7. Paolo Melchiorre ~ @pauloxnet 8 Psycopg psycopg (v0.01 ~2001) PostgreSQL

    driver, Python DB API 2.0, LGPL psycopg1 (v1.0 ~2001) psycopg2 (v2.0 ~2006) Python3 (v2.4 ~2011) JSON (v2.5 ~2013) psycopg3 (v3.0 ~2021)
  8. Paolo Melchiorre ~ @pauloxnet 9 Installing PostgreSQL C client library

    $ sudo apt install libpq-dev … Setting up libpq-dev (14.7-0) …
  9. Paolo Melchiorre ~ @pauloxnet 10 Installing psycopg2 $ python3 -m

    pip install psycopg2 … Successfully installed psycopg2-2.9.5
  10. Paolo Melchiorre ~ @pauloxnet 11 Activating PostgreSQL backend DATABASES =

    { "default": { "ENGINE": "django.db.backends.postgresql", "HOST": "127.0.0.1", "NAME": "my_database_name", "PASSWORD": "my_database_password", "PORT": "5432", "USER": "my_database_user", } }
  11. Paolo Melchiorre ~ @pauloxnet 12 PostgreSQL Connection URIs # environment

    DATABASE_URL=postgres://user:password@host:port/name # python3 -m pip install dj-database-url import dj_database_url DATABASES = {"default": dj_database_url.config()} # python3 -m pip install django-configurations from configurations import values DATABASES = values.DatabaseURLValue()
  12. Paolo Melchiorre ~ @pauloxnet 14 Blog models from django.db import

    models class Author(models.Model): name = models.CharField(max_length=200) class Entry(models.Model): headline = models.CharField(max_length=255) body_text = models.TextField() authors = models.ManyToManyField("Author")
  13. Paolo Melchiorre ~ @pauloxnet 15 Making queries >>> from blog.models

    import Author >>> Author.objects.filter(name__contains="Terry") SELECT "name" FROM "blog_author" WHERE "name" LIKE '%Terry%'
  14. Paolo Melchiorre ~ @pauloxnet 17 PostgreSQL contrib django.contrib.postgres (v1.8 ~2015)

    Fields, indexes, functions, extensions, … Crowdfound Marc Tamlyn (~2014) JSONûeld* (v1.9 ~2015) Full-text search (v1.10 ~2016) RandomUUID (v2.0 ~2017) Operator Classes (v3.2 ~2021)
  15. Paolo Melchiorre ~ @pauloxnet 18 Activating PostgreSQL module INSTALLED_APPS =

    [ "django.contrib.admin", "django.contrib.auth", "django.contrib.contenttypes", "django.contrib.sessions", "django.contrib.messages", "django.contrib.staticfiles", "django.contrib.postgres", "blog", ]
  16. Paolo Melchiorre ~ @pauloxnet DjangoCon Europe 2017 19 © 2017

    Paolo Melchiorre (CC BY-SA) - DjangoCon Europe Sprint
  17. Paolo Melchiorre ~ @pauloxnet • Version 4 (random) UUID •

    pgcrypto module • cryptographic functions • CryptoExtension migration • in-core from PG 13 RandomUUID 20
  18. Paolo Melchiorre ~ @pauloxnet 21 UUID field import uuid from

    django.db import models class Entry(models.Model): headline = models.CharField(max_length=255) body_text = models.TextField() authors = models.ManyToManyField("Author") uuid = models.UUIDField(default=uuid.uuid4)
  19. Paolo Melchiorre ~ @pauloxnet 22 RandomUUID update >>> from django.contrib.postgres

    import functions >>> from blog.models import Entry >>> Entry.objects.update(uuid=functions.RandomUUID()) UPDATE "blog_entry" SET "uuid" = GEN_RANDOM_UUID() -- ~1 million UUIDs in ~30 s
  20. Paolo Melchiorre ~ @pauloxnet • Fields, expressions, functions • GIN

    index • GiST index • Phrase search • Web search Full-text search 24
  21. Paolo Melchiorre ~ @pauloxnet 25 Search lookup >>> from blog.models

    import Entry >>> Entry.objects.filter(headline__search="tomato") SELECT * FROM "blog_entry" WHERE to_tsvector("headline") @@ plainto_tsquery('tomato')
  22. Paolo Melchiorre ~ @pauloxnet 26 SearchVector from django.contrib.postgres import indexes,

    search from django.db import models V = search.SearchVector("headline", config="english") class Entry(models.Model): ... class Meta: indexes = [indexes.GinIndex(V, name="v_idx")]
  23. Paolo Melchiorre ~ @pauloxnet 27 SearchQuery >>> from django.contrib.postgres import

    search >>> from blog.models import Entry, V >>> q = "ham OR tuna -jam" >>> Q = search.SearchQuery(q, search_type="websearch") >>> Entry.objects.annotate(search=V).filter(search=Q) SELECT * FROM "blog_entry" WHERE to_tsvector('english', "headline") @@ websearch_to_tsquery('ham OR tuna -jam)
  24. Paolo Melchiorre ~ @pauloxnet • PG array • Array of

    data • Base ûeld • Size • Multi-dimensional Array Field 29
  25. Paolo Melchiorre ~ @pauloxnet 30 Array Email Field from django.contrib.postgres.fields

    import ArrayField from django.db import models class Author(models.Model): name = models.CharField(max_length=200) emails = ArrayField(models.EmailField())
  26. Paolo Melchiorre ~ @pauloxnet 31 Making array queries >>> from

    django.db.models import Q >>> from blog.models import Author >>> q = Q(emails__contains=["[email protected]"]) >>> Author.objects.filter(q).values_list("emails") [['[email protected]', '[email protected]']] SELECT "emails" FROM "blog_author" WHERE "emails" @> ARRAY['[email protected]']
  27. Paolo Melchiorre ~ @pauloxnet • California Civic Data Coalition •

    PostgreSQL COPY support • COPY TO • COPY FROM • Temporary tables Django Postgres Copy 34
  28. Paolo Melchiorre ~ @pauloxnet 35 Django Postgres Copy - Model

    from django.db import models from postgres_copy import CopyManager class Feature(models.Model): name = models.CharField(max_length=255) alternative_names = models.TextField() osm_type = models.CharField(max_length=8) osm_id = models.PositiveBigIntegerField() ... objects = CopyManager()
  29. Paolo Melchiorre ~ @pauloxnet 36 Django Postgres Copy - CSV

    >>> import pathlib >>> pathlib.Path("/tmp/italy.csv").stat().st_size 217879987 (~208 MB) >>> from blog.models import Feature >>> Feature.objects.from_csv("/tmp/italy.csv") CPU times: user 354 ms, sys: 101 ms, total: 455 ms 918904 (~1 M) records loaded Wall time: 3.46 s
  30. Paolo Melchiorre ~ @pauloxnet 37 Django Postgres Copy - SQL

    CREATE TEMPORARY TABLE "temp_blog_feature" ("..."); COPY "temp_blog_feature" FROM STDIN WITH CSV HEADER; -- COPY 918904 -- Time: 2233,077 ms (00:02,233) INSERT INTO "blog_feature" ("...") ( SELECT ... FROM "temp_blog_feature" ); DROP TABLE IF EXISTS "temp_blog_feature";
  31. Paolo Melchiorre ~ @pauloxnet 38 Django Postgres Copy - GZip

    >>> import pathlib >>> pathlib.Path("/tmp/italy.csv.gz").stat().st_size 42876187 (~41 MB) >>> from geonames.models import Feature >>> import gzip >>> with gzip.open("/tmp/italy.csv.gz", "rb") as f: Feature.objects.from_csv(f) CPU times: user 1.02 s, sys: 42.8 ms, total: 1.06 s 918904 (~1M) records loaded Wall time: 3.45 s
  32. Paolo Melchiorre ~ @pauloxnet • ltree module • Hierarchical tree-like

    data • Materialized path • Dedicated PathField • Abstract TreeModel Django ltree 40
  33. Paolo Melchiorre ~ @pauloxnet 41 TreeModel from django.contrib.postgres.indexes import GistIndex

    from django_ltree.models import TreeModel class Entry(TreeModel): ... class Meta: indexes = [GistIndex(fields=['path'])]
  34. Paolo Melchiorre ~ @pauloxnet 42 Tree - Example top pictures

    science astronomy astronomy astrophysics cosmology galaxies stars astronauts
  35. Paolo Melchiorre ~ @pauloxnet 43 Tree - Queries >>> from

    blog.models import Entry >>> path = "top.science" >>> q = Entry.objects.filter(path__descendants=path) >>> q.order_by("path").values_list("path", flat=True) [top.science, top.science.astronomy, top.science.astronomy.astrophysics, top.science.astronomy.cosmology] SELECT "path" FROM "blog_entry" WHERE "path" <@ 'top.science' ORDER BY "path" ASC
  36. Paolo Melchiorre ~ @pauloxnet • PostgreSQL extension • Best* GeoDjango

    backend • Spatial data types • Spatial indexing • Spatial functions PostGIS 45
  37. Paolo Melchiorre ~ @pauloxnet 46 GeoDjango Compatibility Table • Spatial

    Lookups (18/31) • Database functions (15/32) • Aggregate Functions (0/5) MariaDB 49% • Spatial Lookups (31/31) • Database functions (32/32) • Aggregate Functions (5/5) PostGIS 100% • Spatial Lookups (22/31) • Database functions (30/32) • Aggregate Functions (4/5) Spatialite 82% • Spatial Lookups (18/31) • Database functions (16/32) • Aggregate Functions (0/5) MySQL 50%
  38. Paolo Melchiorre ~ @pauloxnet 47 What’s next • Indexes •

    Aggregation functions • Trigram extension • Range ûelds • ...
  39. Paolo Melchiorre ~ @pauloxnet 48 Tips • docs in djangoproject.com

    • details in postgresql.org • source code in github.com • questions in stackoverüow.com
  40. Paolo Melchiorre ~ @pauloxnet 49 License CC BY-SA 4.0 This

    work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.