On The Look-out For Your Data (DjangoCon Europe 2018)

On The Look-out For Your Data (DjangoCon Europe 2018)

My talk from DjangoCon Europe 2018

Cd7648c536b4dbe940246b74044fbc52?s=128

Markus H

May 24, 2018
Tweet

Transcript

  1. On The Look-Out For Your Data @m_holtermann #djangocon Europe 2018

  2. I’m Markus Holtermann @m_holtermann • github.com/MarkusH • markusholtermann.eu @laterpay •

    laterpay.net • Django Contributor • Software Engineer at
  3. What Is Search?

  4. How To Search In Django?

  5. What Is Search?

  6. What Is Search? Try to find something by looking or

    otherwise seeking carefully and thoroughly. — Oxford English Dictionary
  7. What Is Search? Try to find something by looking or

    otherwise seeking carefully and thoroughly. — Oxford English Dictionary
  8. What Is Search? Try to find something by looking or

    otherwise seeking carefully and thoroughly. — Oxford English Dictionary
  9. Search Is Hard

  10. Searching In Django?

  11. from django.shortcuts import get_object_or_404, render from blog.models import Article def

    article_view(request, pk): article = get_object_or_404(Article, pk=pk) return render( request, 'article.html', context={'article': article}, )
  12. Searching Text

  13. from django.shortcuts import get_list_or_404, render from blog.models import Article def

    article_view(request): articles = get_list_or_404( Article, text__icontains=request.GET.get('query', ''), ) return render( request, 'articles.html', context={'articles': articles}, )
  14. SELECT * FROM blog_article WHERE text ILIKE '%Looking for text%'

  15. Trigrams

  16. -- As superuser # CREATE EXTENSION pg_trgm; # SELECT show_trgm('I

    love Django'); show_trgm ---------------------------------------------- - {" d", " i", " l", " dj", " i ", " lo", ang, dja, "go ", jan, lov, ngo, ove, "ve "}
  17. from django.contrib.postgres.indexes import GistIndex class TrigramGistIndex(GistIndex): suffix = 'trgm_gist' sql

    = 'CREATE INDEX %(name)s ON %(table)s %(using)s \ (UPPER(%(columns)s) gist_trgm_ops)%(extra)s' def create_sql(self, model, schema_editor, using=''): statement = super().create_sql(model, schema_editor, using=using) statement.template = self.sql return statement
  18. Searching Text

  19. Full-text Search

  20. Word order doesn’t matter “Django Migrations” = “Migrations Django”

  21. Stemming computer, compute, computation = comput

  22. Ignoring Stopwords “Django is the best” = “Django best”

  23. __search & PostgreSQL https://docs.djangoproject.com/en/ 2.0/ref/contrib/postgres/search/

  24. External Search Tools

  25. None
  26. from django.db import models, transaction class Article(models.Model): # ... def

    save(self, *args, **kwargs): with transaction.atomic(): transaction.on_commit( lambda: update_search(self.pk)) super().save(*args, **kwargs)
  27. from django.db import models, transaction class Article(models.Model): # ... def

    delete(self, *args, **kwargs): pk = self.pk with transaction.atomic(): transaction.on_commit( lambda: delete_search(pk)) return super().delete( *args, **kwargs)
  28. Maintain A Complete Search Index

  29. What Is Search? Try to find something by looking or

    otherwise seeking carefully and thoroughly. — Oxford English Dictionary
  30. • Example: https://github.com/MarkusH/talk-django-search • Search in Django: https://docs.djangoproject.com/en/2.0/topics/db/search/ • Choosing

    a PostgreSQL text search method: https://blog.2ndquadrant.com/text-search-strategies-in-postgresql/ • Trigram Extension: https://www.postgresql.org/docs/10/static/pgtrgm.html • Full-text search: https://www.postgresql.org/docs/10/static/textsearch-tables.html
  31. Thank you! @m_holtermann

  32. import blog.indexes from django.contrib.postgres.operations import TrigramExtension from django.db import migrations

    class Migration(migrations.Migration): dependencies = [('blog', '0002_auto_20180503_1925')] operations = [ TrigramExtension(), migrations.AddIndex( model_name='entry', index=blog.indexes.TrigramGistIndex( fields=['body'], name='body_trgm_gist')), ]
  33. -- Creates extension pg_trgm CREATE EXTENSION IF NOT EXISTS "pg_trgm";

    -- Create index body_trgm_gist on field(s) -- body of model entry CREATE INDEX "body_trgm_gist" ON "blog_entry" USING gist (UPPER("body") gist_trgm_ops);