Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Django with PostgreSQL superpowers

Django with PostgreSQL superpowers

-- A talk I gave at Python Web Conference 2023

Django per se is a database-agnostic web framework, but real-world projects based on it are not. PostgreSQL has the richest feature set of any supported databases and we’ll see how to use these superpowers in Django.

More info on my blog: https://www.paulox.net/2023/03/15/python-web-conf-2023/

Paolo Melchiorre

March 15, 2023
Tweet

More Decks by Paolo Melchiorre

Other Decks in Technology

Transcript

  1. DJANGO WITH
    POSTGRESQL SUPERPOWERS
    PAOLO MELCHIORRE ~ @pauloxnet

    View Slide

  2. Paolo Melchiorre ~ @pauloxnet
    2
    Django database backends
    as many features as possible
    on all database backends.
    However,
    not all database backends are alike …=

    View Slide

  3. Paolo Melchiorre ~ @pauloxnet
    Django
    3
    © 1946 William Gottlieb (Public Domain) - Django Reinhardt at the Aquarium jazz club in New York

    View Slide

  4. Paolo Melchiorre ~ @pauloxnet
    PostgreSQL
    4
    © 2019 Nam Anh (Unsplash Free Use) - Elephant walking during daytime

    View Slide

  5. Paolo Melchiorre ~ @pauloxnet
    @pauloxnet
    • CTO @ 20tab
    • Software engineer
    • Python developer
    • Django contributor
    • PostgreSQL user
    Paolo Melchiorre
    5
    © 2022 Bartek Pawlik (CC BY-NC-SA) - DjangoCon US

    View Slide

  6. Paolo Melchiorre ~ @pauloxnet
    6
    Django no-code
    $ python3.11 -m venv ~/.mysite
    $ . ~/.mysite/bin/activate
    $ python -m pip install django~=4.1

    Successfully installed … django-4.1.7 …
    $ cd ~/projects
    $ python -m django startproject mysite

    View Slide

  7. Paolo Melchiorre ~ @pauloxnet
    7
    © 1943 Antoine de Saint-Exupéry (Public domain) - Snake digesting an elephant

    View Slide

  8. Paolo Melchiorre ~ @pauloxnet
    8
    Psycopg
    psycopg (v0.01 ~2001)
    PostgreSQL driver, Python DB API 2.0, LGPL
    psycopg1 (v1.0 ~2001)
    psycopg2 (v2.0 ~2006)
    Python3 (v2.4 ~2011)
    JSON (v2.5 ~2013)
    psycopg3 (v3.0 ~2021)

    View Slide

  9. Paolo Melchiorre ~ @pauloxnet
    9
    Installing PostgreSQL C client library
    $ sudo apt install libpq-dev

    Setting up libpq-dev (14.7-0) …

    View Slide

  10. Paolo Melchiorre ~ @pauloxnet
    10
    Installing psycopg2
    $ python3 -m pip install psycopg2

    Successfully installed psycopg2-2.9.5

    View Slide

  11. Paolo Melchiorre ~ @pauloxnet
    11
    Activating PostgreSQL backend
    DATABASES = {
    "default": {
    "ENGINE": "django.db.backends.postgresql",
    "HOST": "127.0.0.1",
    "NAME": "my_database_name",
    "PASSWORD": "my_database_password",
    "PORT": "5432",
    "USER": "my_database_user",
    }
    }

    View Slide

  12. Paolo Melchiorre ~ @pauloxnet
    12
    PostgreSQL Connection URIs
    # environment
    DATABASE_URL=postgres://user:[email protected]:port/name
    # python3 -m pip install dj-database-url
    import dj_database_url
    DATABASES = {"default": dj_database_url.config()}
    # python3 -m pip install django-configurations
    from configurations import values
    DATABASES = values.DatabaseURLValue()

    View Slide

  13. Paolo Melchiorre ~ @pauloxnet
    Making queries
    13
    https:/
    /docs.djangoproject.com/en/stable/topics/db/queries/

    View Slide

  14. Paolo Melchiorre ~ @pauloxnet
    14
    Blog models
    from django.db import models
    class Author(models.Model):
    name = models.CharField(max_length=200)
    class Entry(models.Model):
    headline = models.CharField(max_length=255)
    body_text = models.TextField()
    authors = models.ManyToManyField("Author")

    View Slide

  15. Paolo Melchiorre ~ @pauloxnet
    15
    Making queries
    >>> from blog.models import Author
    >>> Author.objects.filter(name__contains="Terry")
    SELECT "name"
    FROM "blog_author"
    WHERE "name" LIKE '%Terry%'

    View Slide

  16. Paolo Melchiorre ~ @pauloxnet
    Kickstarter
    16
    https:/
    /www.kickstarter.com/projects/mjtamlyn/improved-postgresql-support-in-django

    View Slide

  17. Paolo Melchiorre ~ @pauloxnet
    17
    PostgreSQL contrib
    django.contrib.postgres (v1.8 ~2015)
    Fields, indexes, functions, extensions, …
    Crowdfound Marc Tamlyn (~2014)
    JSONûeld* (v1.9 ~2015)
    Full-text search (v1.10 ~2016)
    RandomUUID (v2.0 ~2017)
    Operator Classes (v3.2 ~2021)

    View Slide

  18. Paolo Melchiorre ~ @pauloxnet
    18
    Activating PostgreSQL module
    INSTALLED_APPS = [
    "django.contrib.admin",
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.messages",
    "django.contrib.staticfiles",
    "django.contrib.postgres",
    "blog",
    ]

    View Slide

  19. Paolo Melchiorre ~ @pauloxnet
    DjangoCon Europe 2017
    19
    © 2017 Paolo Melchiorre (CC BY-SA) - DjangoCon Europe Sprint

    View Slide

  20. Paolo Melchiorre ~ @pauloxnet
    • Version 4 (random) UUID
    • pgcrypto module
    • cryptographic functions
    • CryptoExtension migration
    • in-core from PG 13
    RandomUUID
    20

    View Slide

  21. Paolo Melchiorre ~ @pauloxnet
    21
    UUID field
    import uuid
    from django.db import models
    class Entry(models.Model):
    headline = models.CharField(max_length=255)
    body_text = models.TextField()
    authors = models.ManyToManyField("Author")
    uuid = models.UUIDField(default=uuid.uuid4)

    View Slide

  22. Paolo Melchiorre ~ @pauloxnet
    22
    RandomUUID update
    >>> from django.contrib.postgres import functions
    >>> from blog.models import Entry
    >>> Entry.objects.update(uuid=functions.RandomUUID())
    UPDATE "blog_entry"
    SET "uuid" = GEN_RANDOM_UUID()
    -- ~1 million UUIDs in ~30 s

    View Slide

  23. Paolo Melchiorre ~ @pauloxnet
    23
    © 2017 Paolo Melchiorre (CC BY-SA) - EuroPython Sprints

    View Slide

  24. Paolo Melchiorre ~ @pauloxnet
    • Fields, expressions, functions
    • GIN index
    • GiST index
    • Phrase search
    • Web search
    Full-text search
    24

    View Slide

  25. Paolo Melchiorre ~ @pauloxnet
    25
    Search lookup
    >>> from blog.models import Entry
    >>> Entry.objects.filter(headline__search="tomato")
    SELECT *
    FROM "blog_entry"
    WHERE to_tsvector("headline") @@
    plainto_tsquery('tomato')

    View Slide

  26. Paolo Melchiorre ~ @pauloxnet
    26
    SearchVector
    from django.contrib.postgres import indexes, search
    from django.db import models
    V = search.SearchVector("headline", config="english")
    class Entry(models.Model):
    ...
    class Meta:
    indexes = [indexes.GinIndex(V, name="v_idx")]

    View Slide

  27. Paolo Melchiorre ~ @pauloxnet
    27
    SearchQuery
    >>> from django.contrib.postgres import search
    >>> from blog.models import Entry, V
    >>> q = "ham OR tuna -jam"
    >>> Q = search.SearchQuery(q, search_type="websearch")
    >>> Entry.objects.annotate(search=V).filter(search=Q)
    SELECT *
    FROM "blog_entry"
    WHERE to_tsvector('english', "headline") @@
    websearch_to_tsquery('ham OR tuna -jam)

    View Slide

  28. Paolo Melchiorre ~ @pauloxnet
    28
    © 2018 Paolo Melchiorre (CC BY-SA) - Trondheim (Norway)

    View Slide

  29. Paolo Melchiorre ~ @pauloxnet
    • PG array
    • Array of data
    • Base ûeld
    • Size
    • Multi-dimensional
    Array Field
    29

    View Slide

  30. Paolo Melchiorre ~ @pauloxnet
    30
    Array Email Field
    from django.contrib.postgres.fields import ArrayField
    from django.db import models
    class Author(models.Model):
    name = models.CharField(max_length=200)
    emails = ArrayField(models.EmailField())

    View Slide

  31. Paolo Melchiorre ~ @pauloxnet
    31
    Making array queries
    >>> from django.db.models import Q
    >>> from blog.models import Author
    >>> q = Q(emails__contains=["[email protected]"])
    >>> Author.objects.filter(q).values_list("emails")
    [['[email protected]', '[email protected]']]
    SELECT "emails"
    FROM "blog_author"
    WHERE "emails" @> ARRAY['[email protected]']

    View Slide

  32. Paolo Melchiorre ~ @pauloxnet
    32
    django-better-admin-arrayfield
    Before After

    View Slide

  33. Paolo Melchiorre ~ @pauloxnet
    33
    © 2013 Paolo Melchiorre (CC BY-SA) - San Francisco (California)

    View Slide

  34. Paolo Melchiorre ~ @pauloxnet
    • California Civic Data Coalition
    • PostgreSQL COPY support
    • COPY TO
    • COPY FROM
    • Temporary tables
    Django Postgres Copy
    34

    View Slide

  35. Paolo Melchiorre ~ @pauloxnet
    35
    Django Postgres Copy - Model
    from django.db import models
    from postgres_copy import CopyManager
    class Feature(models.Model):
    name = models.CharField(max_length=255)
    alternative_names = models.TextField()
    osm_type = models.CharField(max_length=8)
    osm_id = models.PositiveBigIntegerField()
    ...
    objects = CopyManager()

    View Slide

  36. Paolo Melchiorre ~ @pauloxnet
    36
    Django Postgres Copy - CSV
    >>> import pathlib
    >>> pathlib.Path("/tmp/italy.csv").stat().st_size
    217879987 (~208 MB)
    >>> from blog.models import Feature
    >>> Feature.objects.from_csv("/tmp/italy.csv")
    CPU times: user 354 ms, sys: 101 ms, total: 455 ms
    918904 (~1 M) records loaded
    Wall time: 3.46 s

    View Slide

  37. Paolo Melchiorre ~ @pauloxnet
    37
    Django Postgres Copy - SQL
    CREATE TEMPORARY TABLE "temp_blog_feature" ("...");
    COPY "temp_blog_feature" FROM STDIN WITH CSV HEADER;
    -- COPY 918904
    -- Time: 2233,077 ms (00:02,233)
    INSERT INTO "blog_feature" ("...") (
    SELECT ... FROM "temp_blog_feature"
    );
    DROP TABLE IF EXISTS "temp_blog_feature";

    View Slide

  38. Paolo Melchiorre ~ @pauloxnet
    38
    Django Postgres Copy - GZip
    >>> import pathlib
    >>> pathlib.Path("/tmp/italy.csv.gz").stat().st_size
    42876187 (~41 MB)
    >>> from geonames.models import Feature
    >>> import gzip
    >>> with gzip.open("/tmp/italy.csv.gz", "rb") as f:
    Feature.objects.from_csv(f)
    CPU times: user 1.02 s, sys: 42.8 ms, total: 1.06 s
    918904 (~1M) records loaded
    Wall time: 3.45 s

    View Slide

  39. Paolo Melchiorre ~ @pauloxnet
    Tree
    39
    © 2020 Paolo Melchiorre (CC BY-SA) - Abruzzo (Italy)

    View Slide

  40. Paolo Melchiorre ~ @pauloxnet
    • ltree module
    • Hierarchical tree-like data
    • Materialized path
    • Dedicated PathField
    • Abstract TreeModel
    Django ltree
    40

    View Slide

  41. Paolo Melchiorre ~ @pauloxnet
    41
    TreeModel
    from django.contrib.postgres.indexes import GistIndex
    from django_ltree.models import TreeModel
    class Entry(TreeModel):
    ...
    class Meta:
    indexes = [GistIndex(fields=['path'])]

    View Slide

  42. Paolo Melchiorre ~ @pauloxnet
    42
    Tree - Example
    top
    pictures
    science
    astronomy astronomy
    astrophysics cosmology galaxies stars astronauts

    View Slide

  43. Paolo Melchiorre ~ @pauloxnet
    43
    Tree - Queries
    >>> from blog.models import Entry
    >>> path = "top.science"
    >>> q = Entry.objects.filter(path__descendants=path)
    >>> q.order_by("path").values_list("path", flat=True)
    [top.science,
    top.science.astronomy,
    top.science.astronomy.astrophysics,
    top.science.astronomy.cosmology]
    SELECT "path" FROM "blog_entry"
    WHERE "path" <@ 'top.science' ORDER BY "path" ASC

    View Slide

  44. Paolo Melchiorre ~ @pauloxnet
    44
    © 2020 Paolo Melchiorre (CC BY-SA) Appennines (Italy)

    View Slide

  45. Paolo Melchiorre ~ @pauloxnet
    • PostgreSQL extension
    • Best* GeoDjango backend
    • Spatial data types
    • Spatial indexing
    • Spatial functions
    PostGIS
    45

    View Slide

  46. Paolo Melchiorre ~ @pauloxnet
    46
    GeoDjango Compatibility Table
    ● Spatial Lookups (18/31)
    ● Database functions (15/32)
    ● Aggregate Functions (0/5)
    MariaDB
    49%
    ● Spatial Lookups (31/31)
    ● Database functions (32/32)
    ● Aggregate Functions (5/5)
    PostGIS
    100%
    ● Spatial Lookups (22/31)
    ● Database functions (30/32)
    ● Aggregate Functions (4/5)
    Spatialite
    82%
    ● Spatial Lookups (18/31)
    ● Database functions (16/32)
    ● Aggregate Functions (0/5)
    MySQL
    50%

    View Slide

  47. Paolo Melchiorre ~ @pauloxnet
    47
    What’s next
    • Indexes
    • Aggregation functions
    • Trigram extension
    • Range ûelds
    • ...

    View Slide

  48. Paolo Melchiorre ~ @pauloxnet
    48
    Tips
    • docs in djangoproject.com
    • details in postgresql.org
    • source code in github.com
    • questions in stackoverüow.com

    View Slide

  49. Paolo Melchiorre ~ @pauloxnet
    49
    License
    CC BY-SA 4.0
    This work is licensed under
    a Creative Commons
    Attribution-ShareAlike 4.0
    International License.

    View Slide

  50. Paolo Melchiorre ~ @pauloxnet
    @psycopg
    danielevarrazzo
    psycopg
    lists.postgresql.org
    psycopg.org
    50

    View Slide

  51. Paolo Melchiorre ~ @pauloxnet
    @20tab
    20tab
    20tab
    [email protected]
    20tab.com
    51

    View Slide

  52. Paolo Melchiorre ~ @pauloxnet
    @pauloxnet
    paolomelchiorre
    pauloxnet
    paulox.net
    52
    @[email protected]

    View Slide