DjangoConEurope 22: Deep Inside Django ORM – How Django build queries

Bas Steins

September 21, 2022

Transcript

  1. WHO AM I

    - Software Developer, Consultant, Trainer
    - Currently working for Miltenyi Biotec
    - German, living in the Netherlands
    - Sebastian Steins (my German friends call me Seb(i), my Dutch friends call me Bas)
    - Programming since the age of 14, Python since 2007, Django since 2008
    - Btw, Django is boring tech now: first release in 2005 (that was when Web 2.0 was a thing)
  2. STRUCTURE

    - The call chain from Manager to Compiler
    - How to query a lazy database
    - Filters, Expressions, WhereNodes
    - Views
  3. A WORD OF WARNING

    The inner mechanics of the ORM (everything below Managers and QuerySets) are undocumented and not intended to be accessed in your project. They are subject to change (especially since the introduction of the async ORM). Just be warned, but hack on!
  4. LET'S START WITH A SIMPLE MODEL…

        from django.db import models


        class Blog(models.Model):
            title = models.CharField(max_length=255)


        class Author(models.Model):
            name = models.CharField(max_length=255)


        class Post(models.Model):
            author = models.ForeignKey(Author, on_delete=...)
            blog = models.ForeignKey(Blog, on_delete=...)
            title = models.CharField(max_length=255)
            body = models.TextField()
  5. NOW WHAT HAPPENS BEHIND THE SCENES?

        #              Manager
        #         Model   |   Filter
        #          _|_  __|__    |
        #         /   \/     \ /   \
          blogs = Blog.objects.all()
        # \___/
        #   |
        # Queryset
  6. CALL CHAIN

    Manager creates QuerySet

        # manager.py
        class BaseManager:
            ...

            def get_queryset(self):
                """
                Return a new QuerySet object. Subclasses can override this method
                to customize the behavior of the Manager.
                """
                return self._queryset_class(model=self.model, using=self._db, hints=self._hints)

            ...
  7. CALL CHAIN

    QuerySet creates Query

        # query.py
        class QuerySet:
            """Represent a lazy database lookup for a set of objects."""

            ...

            @property
            def query(self):
                if self._deferred_filter:
                    negate, args, kwargs = self._deferred_filter
                    self._filter_or_exclude_inplace(negate, args, kwargs)
                    self._deferred_filter = None
                return self._query

            ...
  8. CALL CHAIN

    Query passes Query to SQLCompiler

        # query.py
        class Query:
            ...

            def sql_with_params(self):
                """
                Return the query as an SQL string and the parameters that will be
                substituted into the query.
                """
                compiler = self.get_compiler(DEFAULT_DB_ALIAS)
                return compiler.as_sql()

            ...
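
The three hand-offs on slides 6 to 8 can be sketched without Django at all. The classes below (`ToyManager`, `ToyQuerySet`, `ToyQuery`, `ToyCompiler`) are hypothetical stand-ins invented for illustration; only the method names `get_queryset`, `get_compiler`, `sql_with_params` and `as_sql` are borrowed from the slides, and the real implementations do far more.

```python
class ToyCompiler:
    """Stand-in for SQLCompiler: turns a query object into (sql, params)."""

    def __init__(self, query):
        self.query = query

    def as_sql(self):
        table = self.query.table
        columns = ", ".join(f'"{table}"."{name}"' for name in self.query.columns)
        return f'SELECT {columns} FROM "{table}"', ()


class ToyQuery:
    """Stand-in for Query: holds the description of what to select."""

    def __init__(self, table, columns):
        self.table = table
        self.columns = columns

    def get_compiler(self):
        return ToyCompiler(self)

    def sql_with_params(self):
        # The query hands itself to a compiler and asks for SQL.
        return self.get_compiler().as_sql()

    def __str__(self):
        sql, params = self.sql_with_params()
        return sql % params


class ToyQuerySet:
    """Stand-in for QuerySet: owns a query object."""

    def __init__(self, table, columns):
        self.query = ToyQuery(table, columns)


class ToyManager:
    """Stand-in for a Manager: its job is to create querysets."""

    def __init__(self, table, columns):
        self.table = table
        self.columns = columns

    def get_queryset(self):
        return ToyQuerySet(self.table, self.columns)

    def all(self):
        return self.get_queryset()


objects = ToyManager("demo_blog", ["id", "title"])
qs = objects.all()
sql, params = qs.query.sql_with_params()
print(sql)
```

The point of the sketch is only the direction of the calls: manager, then queryset, then query, then compiler.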
  9. CALL CHAIN

    Now what does the compiler do?

        # compiler.py
        class SQLCompiler:
            ...

            def get_select(self):
                """
                Return three values:
                - a list of 3-tuples of (expression, (sql, params), alias)
                - a klass_info structure,
                - a dictionary of annotations

                The (sql, params) is what the expression will produce, and alias is the
                "AS alias" for the column (possibly None).

                The klass_info structure contains the following information:
  10. THE Query OBJECT

        qs = Blog.objects.all()
        print(qs.query)
        # SELECT "demo_blog"."id", "demo_blog"."title" FROM "demo_blog"
        print(qs.query.alias_map)
        # {'demo_blog': <django.db.models.sql.datastructures.BaseTable object at
        print(qs.query.alias_refcount)
        # {'demo_blog': 0}
        print(qs.query.table_map)
        # {'demo_blog': ['demo_blog']}
        print(qs.query.base_table)
        # demo_blog
  11. FORCE A QUERY

    As long as you do not call any of these methods, QuerySet does nothing with the database.
  12. FORCE A QUERY

    On a QuerySet, call a method that does not return another QuerySet. Most of these methods are special Python methods (dunder methods). QuerySet implements:

    - __iter__ (e.g. for blog in qs)
    - __len__ (returns the number of instances)
    - __getitem__ (e.g. qs[0]; slices such as qs[:10] implement LIMIT and stay lazy unless a step is given)
    - a couple of others, like __bool__ (__nonzero__ on Python 2)
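
The laziness described on slides 11 and 12 can be illustrated with a small toy class. `LazyRows` and `fake_database` are hypothetical names made up for this sketch; `_result_cache` and `_fetch_all` mirror attribute names found on Django's QuerySet, but the logic here is a simplification and not Django's code.

```python
class LazyRows:
    """Toy stand-in for QuerySet: rows are fetched only when a dunder needs them."""

    def __init__(self, fetch):
        self._fetch = fetch        # callable that "hits the database"
        self._result_cache = None  # filled on first evaluation

    def filter(self, predicate):
        # Returns another lazy object; nothing is fetched here.
        return LazyRows(lambda: [row for row in self._fetch() if predicate(row)])

    def _fetch_all(self):
        if self._result_cache is None:
            self._result_cache = list(self._fetch())

    def __iter__(self):
        self._fetch_all()
        return iter(self._result_cache)

    def __len__(self):
        self._fetch_all()
        return len(self._result_cache)

    def __bool__(self):
        self._fetch_all()
        return bool(self._result_cache)


calls = []  # records every simulated database round trip


def fake_database():
    calls.append("SELECT")
    return [{"id": 1, "title": "A"}, {"id": 2, "title": "B"}]


qs = LazyRows(fake_database).filter(lambda row: row["id"] > 1)
queries_before = len(calls)            # building and filtering fetched nothing
titles = [row["title"] for row in qs]  # __iter__ forces the fetch
queries_after = len(calls)
count = len(qs)                        # answered from the cache
```

Chaining `filter` only wraps callables; the first dunder call triggers the single fetch, and later calls reuse the cached rows.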
  13. WHAT HAPPENS WITH QuerySet.filter()?

        # query.py
        class QuerySet:
            ...

            def _filter_or_exclude_inplace(self, negate, args, kwargs):
                if negate:
                    self._query.add_q(~Q(*args, **kwargs))
                else:
                    self._query.add_q(Q(*args, **kwargs))

            ...
  14. WHAT HAPPENS WITH QuerySet.filter()?

    KEY TAKEAWAY: filter() and exclude() are just a thin abstraction over the well-documented Q objects.
  15. THE WhereNode TREE

    THIS LOOKS A LOT LIKE LISP

        blogs = Blog.objects.filter(Q(Q(id=1) | Q(title__istartswith="A")))
        print(blogs.query.where)
        # (AND:
        #   (OR: Exact(Col(demo_blog, demo.Blog.id), 1),
        #        IStartsWith(Col(demo_blog, demo.Blog.title), 'A')))
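
To see why combining conditions with `|`, `&` and `~` yields a Lisp-like tree, here is a minimal sketch of a connector node. The `Node` class is a hypothetical toy, not Django's `Q` or `WhereNode`: it does not flatten nested nodes or resolve lookups into expressions such as `Exact(...)`.

```python
class Node:
    """Toy tree node: a connector (AND/OR), a negation flag and children."""

    AND, OR = "AND", "OR"

    def __init__(self, *children, connector=AND, negated=False, **lookups):
        # Keyword lookups become (name, value) leaves, sorted for stable output.
        self.children = list(children) + sorted(lookups.items())
        self.connector = connector
        self.negated = negated

    def _combine(self, other, connector):
        return Node(self, other, connector=connector)

    def __and__(self, other):
        return self._combine(other, self.AND)

    def __or__(self, other):
        return self._combine(other, self.OR)

    def __invert__(self):
        return Node(self, negated=True)

    def __repr__(self):
        prefix = "NOT " if self.negated else ""
        inner = ", ".join(repr(child) for child in self.children)
        return f"({prefix}{self.connector}: {inner})"


tree = Node(id=1) | Node(title__istartswith="A")
print(tree)
# (OR: (AND: ('id', 1)), (AND: ('title__istartswith', 'A')))
```

Operator overloading does all the work: each operator returns a new node whose children are the operands, so arbitrary nesting falls out for free.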
  16. INNER MECHANICS: _add_q

        def _add_q(
            self,
            q_object,
            used_aliases,
            branch_negated=False,
            current_negated=False,
            allow_joins=True,
            split_subq=True,
            check_filterable=True,
        ):
            """Add a Q-object to the current filter."""
            connector = q_object.connector
            current_negated = current_negated ^ q_object.negated
            branch_negated = branch_negated or q_object.negated
            target_clause = WhereNode(connector=connector, negated=q_object.n
            joinpromoter = JoinPromoter(
                q_object.connector, len(q_object.children), current_negated
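
The two flag updates in the excerpt above behave differently under nesting: `current_negated` flips with XOR, so a double negation cancels, while `branch_negated` uses `or` and stays set once any ancestor was negated. The sketch below replays just those two lines on a hypothetical `ToyQ` tree; `ToyQ` and `walk` are invented for illustration and do nothing else that `_add_q` does.

```python
from dataclasses import dataclass, field


@dataclass
class ToyQ:
    """Hypothetical minimal node; leaves are plain (lookup, value) tuples."""

    children: list = field(default_factory=list)
    negated: bool = False


def walk(node, current_negated=False, branch_negated=False, out=None):
    """Collect (lookup, current_negated, branch_negated) for every leaf."""
    out = [] if out is None else out
    # The same two updates as in the _add_q excerpt.
    current_negated = current_negated ^ node.negated
    branch_negated = branch_negated or node.negated
    for child in node.children:
        if isinstance(child, ToyQ):
            walk(child, current_negated, branch_negated, out)
        else:
            out.append((child[0], current_negated, branch_negated))
    return out


# NOT( id=1 AND NOT( title="A" ) )
tree = ToyQ([("id", 1), ToyQ([("title", "A")], negated=True)], negated=True)
flags = walk(tree)
print(flags)
# [('id', True, True), ('title', False, True)]
```

For the inner `title` leaf the two negations cancel in `current_negated`, yet `branch_negated` still records that a negation occurred somewhere on the path.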
  17. WHAT ABOUT VIEWS?

    We have a model. But we also have a database view, which might be faster to query.

    HOW TO QUERY THAT VIEW BUT STILL GET OUR MODEL?
  18. "MONKEY PATCH" THE QUERY OBJECT!

        from django.db.models.sql.datastructures import BaseTable

        bt = BaseTable("demo_secondblog", "demo_secondblog")
        blogs.query.alias_map = {'demo_secondblog': bt}
        blogs.query.alias_refcount = {'demo_secondblog': 0}
        blogs.query.table_map = {'demo_secondblog': ["demo_secondblog"]}
        blogs.query.base_table = "demo_secondblog"
        print(blogs.query)
        # SELECT "demo_secondblog"."id",
        #        "demo_secondblog"."title" FROM
        #        "demo_secondblog"
  19. BUT: THERE IS A BETTER* WAY TO USE IT

    (* and more obvious)

    UNMANAGED MODELS

        class UnmanagedBlog(Blog):
            class Meta:
                managed = False
                db_table = "demo_secondblog"
  20. CALL CHAIN

    How do values flow back from compiler to model instance?

    - QuerySet.__iter__
    - QuerySet.iterator
    - Query.get_compiler
    - SQLCompiler.results_iter
    - SQLCompiler.as_sql
  21. ITER

        # query.py
        class ModelIterable:
            ...

            def __iter__(self):
                ...
                for row in compiler.results_iter(results):
                    obj = model_cls.from_db(
                        db, init_list, row[model_fields_start:model_fields_en
                    )
                    ...
                ...
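
The loop above turns raw row tuples into model instances. A rough, Django-free sketch of that idea follows; `ToyBlog`, `results_iter` and `iterate_models` are hypothetical stand-ins, and the `_db` attribute is an invention of this sketch (Django tracks the database alias elsewhere on the instance).

```python
class ToyBlog:
    """Stand-in for a model class with two fields."""

    field_names = ("id", "title")

    def __init__(self, id=None, title=None):
        self.id = id
        self.title = title

    @classmethod
    def from_db(cls, db, field_names, values):
        # Pair each column value with its field name and build an instance.
        instance = cls(**dict(zip(field_names, values)))
        instance._db = db  # hypothetical: remember where the row came from
        return instance


def results_iter(rows):
    # Stand-in for SQLCompiler.results_iter: yields raw row tuples.
    yield from rows


def iterate_models(rows, db="default", model_fields_start=0, model_fields_end=2):
    for row in results_iter(rows):
        # Only the slice of the row that belongs to the model's own fields is used.
        values = row[model_fields_start:model_fields_end]
        yield ToyBlog.from_db(db, ToyBlog.field_names, values)


blogs = list(iterate_models([(1, "First"), (2, "Second")]))
print([blog.title for blog in blogs])
# ['First', 'Second']
```

The slice bounds matter when a row carries extra columns before or after the model's own fields; the defaults here assume a row with exactly the two model columns.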
  22. HELLO WORLD EXAMPLE: SOFT DELETE MANAGER

        class BlogManager(models.Manager):
            def get_queryset(self):
                return super().get_queryset().filter(deleted=False)


        class Blog(models.Model):
            objects = BlogManager()

            title = models.CharField(max_length=255)
            deleted = models.BooleanField(default=False)
            ...
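
Why overriding `get_queryset` is enough can be shown with plain Python. `ToyManager` and `SoftDeleteManager` below are hypothetical toys operating on a list of dicts; the assumption being illustrated is that every public entry point (here only `all`) is built on top of `get_queryset`.

```python
class ToyManager:
    """Stand-in for a Manager working on an in-memory list of rows."""

    def __init__(self, rows):
        self._rows = rows

    def get_queryset(self):
        return list(self._rows)

    def all(self):
        # Public entry points delegate to get_queryset.
        return self.get_queryset()


class SoftDeleteManager(ToyManager):
    def get_queryset(self):
        # Filtering here affects everything built on top of get_queryset.
        return [row for row in super().get_queryset() if not row["deleted"]]


rows = [
    {"title": "Kept", "deleted": False},
    {"title": "Gone", "deleted": True},
]
visible = SoftDeleteManager(rows).all()
everything = ToyManager(rows).all()
print([row["title"] for row in visible])
# ['Kept']
```

Keeping a second, unfiltered manager around (as `ToyManager` does here) is one way to still reach the soft-deleted rows.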
  23. QUESTIONS?

    Slides at https://bas.surf/djangoconeu22-slides or via pipx run bascodes djangoconeu22-slides

    CONTACT
    Bas Steins, Twitter: @bascodes, [email protected], https://bas.codes