$30 off During Our Annual Pro Sale. View Details »

Dungeon Master's guide to Django's ORM

Dungeon Master's guide to Django's ORM

Presentation at DjangoCon-US, 2012

Malcolm Tredinnick

September 05, 2012
Tweet

More Decks by Malcolm Tredinnick

Other Decks in Programming

Transcript

  1. Dungeon Master's guide to Django's ORM
    Malcolm Tredinnick

    View Slide

  2. Me

    Python user since 1997

    Database dabbler since the 1980's

    Django user since September 2005

    Django committer since May 2006

    Fingers in most parts of the code base

    View Slide

  3. Some History

    View Slide

  4. 12 July 2005, 8:25 pm CDT

    “Imported Django from private SVN repository”

    django/core/meta.py was ORM code

    2,142 lines

    View Slide

  5. 30 January, 2006

    “Added 'When will you release Django 1.0?' to
    FAQ”

    “[...]This should happen in a couple of months
    or so, although it's entirely possible that it
    could happen earlier. That translates into
    February or March 2006.”

    View Slide

  6. 1 May, 2006

    “MERGED MAGIC-REMOVAL BRANCH TO
    TRUNK. This change is highly backwards-
    incompatible.”

    django/db/models/query.py appears.

    parse_lookup() and lookup_inner().

    View Slide

  7. Late July 2006

    I start to look at restructuring of SQL
    generation.

    Three fairly different approaches attempted
    – first one worked, but was probably too complex.
    – second one sucked. I don't like to talk about it.
    – third one is what we have now.

    View Slide

  8. 4 July 2007

    “Merged Unicode branch into trunk
    (r4952:5608). This should be fully backwards
    compatible for all practical purposes.”

    Made database encoding much easier.

    Bought some goodwill.

    View Slide

  9. Sunday, 27 April 2008
    Merged the queryset-refactor branch into trunk.
    This is a big internal change, but mostly backwards
    compatible with existing code.
    Also adds a couple of new features.
    Fixed #245, #1050, #1656, #1801, #2076, #2091, #2150,
    #2253, #2306, #2400, #2430, #2482, #2496, #2676,
    #2737, #2874, #2902, #2939, #3037, #3141, #3288,
    #3440, #3592, #3739, #4088, #4260, #4289, #4306,
    #4358, #4464, #4510, #4858, #5012, #5020, #5261,
    #5295, #5321, #5324, #5325, #5555, #5707, #5796,
    #5817, #5987, #6018, #6074, #6088, #6154, #6177,
    #6180, #6203, #6658

    View Slide

  10. 3 September 2008, 7:38pm EDT

    Django 1.0

    First DjangoCon on weekend of 6 – 7
    September.

    View Slide

  11. 15 January, 2009

    “Fixed #3566 -- Added support for aggregation
    to the ORM.”

    Mostly a Google Summer of Code project by
    Nicolas Lara.

    Supervision, carrot and stick provided by
    Russell Keith-Magee.

    View Slide

  12. Was it worth it?

    View Slide

  13. Was it worth it?

    Code structure has remained fairly stable.

    The layers of abstraction feel roughly correct.

    Minor code duplication in recent times.
    – Hmm... code sprints later this week

    Mostly logical code flow. (Stop laughing!)

    View Slide

  14. How does it work?

    View Slide

  15. Interlude: a useful rule
    The developers who were here before you were
    probably not insane.

    View Slide

  16. Code layers

    django/db/models/
    – managers.py
    – query.py (~1800 lines)

    django/db/models/sql/
    – query.py (~2000 lines)
    – compiler.py (~1100 lines)

    django/db/backends/*/
    – base.py
    – operations.py

    View Slide

  17. A trip down the rabbit hole
    Article.objects.filter(date__lte=today,
    author__byline=”anonymous”)

    View Slide

  18. The classes

    django.db.models.query.QuerySet

    django.db.models.sql.query.Query

    django.db.models.sql.where.WhereNode

    django.db.models.sql.compiler.SQLCompiler

    View Slide

  19. filter(date__lte=today, author__byline=”anonymous”)
    QuerySet._filter_or_exclude(negate=False,
    date__lte=today, author__byline=”anonymous”)

    View Slide

  20. QuerySet._filter_or_exclude(negate=False,
    date__lte=today,
    author__byline=”anonymous”)
    Query.add_q(
    Q(date__lte=today,
    author__byline=”anonymous”)
    )

    View Slide

  21. Query.add_q(
    Q(date__lte=today,
    author__byline=”anonymous”)
    )
    (“date__lte”, today)
    (“author__byline”, “anonymous”)
    Query.add_filter( ... )

    View Slide

  22. Query.add_filter((“author__byline”, “anonymous”), ... )
    Query.setup_joins((“author”, “byline”), ...)
    fields = (“author”, “byline”)
    lookup_type = “exact”
    value = “anonymous”

    View Slide

  23. Query.setup_joins((“author”,
    “byline”), ...)
    Query.alias_map = {
    “T1”: (“app_article”, “T1”, None, None, None, None, False),
    “T2”: (“app_author”, “T2”, INNER, “T1”, “author”, “id”, False),
    ...
    }
    nullable join?

    View Slide

  24. Query.add_filter((“author__byline”, “anonymous”), ... )
    WhereNode.add( ..., “exact”, “anonymous”)
    fields = (“author”, “byline”)
    lookup_type = “exact”
    value = “anonymous”

    View Slide

  25. In the background...

    Aliases are reference counted
    – sometimes joins can be trimmed away (for speed)

    order() can cause new joins to be added

    So can model inheritance

    SQL “HAVING” is like “WHERE”
    – also uses WhereNode

    View Slide

  26. In the background...

    All aliases in a QuerySet can be changed at
    once
    – T1, T2, T3, .. --> U1, U2, U3, ...
    – for nested queries

    QuerySets can be merged

    Same table can appear with different aliases

    View Slide

  27. In the background...

    select_related(), defer(), only() affect columns
    to be selected

    View Slide

  28. Cloning QuerySets

    Happens a lot!
    – qs . filter(...) . filter( ...) <-- 3 clones

    Effectively as fast as possible; still slow.

    We need to add a “no_clone” experts-only
    feature.

    View Slide

  29. Executing a query
    QuerySet.__iter__()
    QuerySet.iterator()
    Query.get_compiler( ... )
    SQLCompiler.results_iter()
    SQLCompiler.as_sql()
    Create
    SQL
    Return
    results
    (rows)

    View Slide

  30. Food for thought

    Look at Query.add_q()
    – can handle custom Q-like objects
    – E.g. create some to utilise advanced DB features

    Look at expression objects
    – django.db.models.expressions.ExpressionNode
    – create custom F-like objects

    View Slide

  31. Anybody still awake?
    Question time
    http://speakerdeck.com/u/malcolmt/

    View Slide