Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
On The Look-out For Your Data (DjangoCon Europe 2018)
Search
Markus H
May 24, 2018
Technology
0
12k
On The Look-out For Your Data (DjangoCon Europe 2018)
My talk from DjangoCon Europe 2018
Markus H
May 24, 2018
Tweet
Share
More Decks by Markus H
See All by Markus H
🐍 ❤️ 🦀 — Python loves Rust
markush
0
180
Knock! Knock! Who's There?
markush
0
50
An Introduction To Kubernetes ☸
markush
0
72
Writing Safe Database Migrations (DjangoCon Europe 2021)
markush
0
13k
A Pony On The Move: How Migrations Work In Django 🐎
markush
0
12k
All Hands on Deck — Handling Security Issues
markush
0
13k
Logging Rethought 2: The Actions of Frank Taylor Jr. (PyCon UK 2019)
markush
0
41
Logging Rethought 2: The Actions of Frank Taylor Jr. (PyCon Australia 2019)
markush
1
170
Logging Rethought 2: The Actions of Frank Taylor Jr. (DjangoCon Europe 2019)
markush
0
13k
Other Decks in Technology
See All in Technology
JSON攻略法.pdf
miyakemito
8
5.1k
LLM開発・活用の舞台裏@2024.04.25
yushin_n
1
340
競技としてのKaggle、役に立つKaggle
yu4u
3
1.8k
Cloud Native Java with Spring Boot (CNCF Aarhus, April 2024)
thomasvitale
1
170
ワールドカフェI /チューターを改良する / World Café I and Improving the Tutors
ks91
PRO
0
120
KubeConにproposalを送りたい人へのアドバイス
sat
PRO
3
260
ChatworkのSRE部って実は 半分くらいPlatform Engineering部かもしれない
saramune
0
160
KubeCon EU 2024 Recap “Kubernetes Policy Time Machine: Where to Next?”
ryysud
0
220
LayerXにおけるLLMプロダクト開発の今までとこれから
layerx
PRO
1
360
私が trocco を推す理由
__allllllllez__
1
240
レガシーをぶっ壊せ。AEONで始めるDevRelの話 / Qiita Night 2024-2-22
aeonpeople
3
1.3k
よく聞くけど使ったことないソフトウェアNo.1 KafkaとSnowflake
foursue
4
360
Featured
See All Featured
Teambox: Starting and Learning
jrom
128
8.4k
Art, The Web, and Tiny UX
lynnandtonic
289
19k
Building Effective Engineering Teams - LeadDev
addyosmani
28
1.8k
GraphQLとの向き合い方2022年版
quramy
32
12k
Producing Creativity
orderedlist
PRO
337
39k
Scaling GitHub
holman
457
140k
Design by the Numbers
sachag
274
18k
Side Projects
sachag
451
41k
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
155
14k
Exploring the Power of Turbo Streams & Action Cable | RailsConf2023
kevinliebholz
2
3.4k
VelocityConf: Rendering Performance Case Studies
addyosmani
320
23k
What's in a price? How to price your products and services
michaelherold
237
11k
Transcript
On The Look-Out For Your Data @m_holtermann #djangocon Europe 2018
I’m Markus Holtermann @m_holtermann • github.com/MarkusH • markusholtermann.eu @laterpay •
laterpay.net • Django Contributor • Software Engineer at
What Is Search?
How To Search In Django?
What Is Search?
What Is Search? Try to find something by looking or
otherwise seeking carefully and thoroughly. — Oxford English Dictionary
What Is Search? Try to find something by looking or
otherwise seeking carefully and thoroughly. — Oxford English Dictionary
What Is Search? Try to find something by looking or
otherwise seeking carefully and thoroughly. — Oxford English Dictionary
Search Is Hard
Searching In Django?
from django.shortcuts import get_object_or_404, render from blog.models import Article def
article_view(request, pk): article = get_object_or_404(Article, pk=pk) return render( request, 'article.html', context={'article': article}, )
Searching Text
from django.shortcuts import get_list_or_404, render from blog.models import Article def
article_view(request): articles = get_list_or_404( Article, text__icontains=request.GET.get('query', ''), ) return render( request, 'articles.html', context={'articles': articles}, )
SELECT * FROM blog_article WHERE text ILIKE '%Looking for text%'
Trigrams
-- As superuser # CREATE EXTENSION pg_trgm; # SELECT show_trgm('I
love Django'); show_trgm ---------------------------------------------- - {" d", " i", " l", " dj", " i ", " lo", ang, dja, "go ", jan, lov, ngo, ove, "ve "}
from django.contrib.postgres.indexes import GistIndex class TrigramGistIndex(GistIndex): suffix = 'trgm_gist' sql
= 'CREATE INDEX %(name)s ON %(table)s %(using)s \ (UPPER(%(columns)s) gist_trgm_ops)%(extra)s' def create_sql(self, model, schema_editor, using=''): statement = super().create_sql(model, schema_editor, using=using) statement.template = self.sql return statement
Searching Text
Full-text Search
Word order doesn’t matter “Django Migrations” = “Migrations Django”
Stemming computer, compute, computation = comput
Ignoring Stopwords “Django is the best” = “Django best”
__search & PostgreSQL https://docs.djangoproject.com/en/ 2.0/ref/contrib/postgres/search/
External Search Tools
None
from django.db import models, transaction class Article(models.Model): # ... def
save(self, *args, **kwargs): with transaction.atomic(): transaction.on_commit( lambda: update_search(self.pk)) super().save(*args, **kwargs)
from django.db import models, transaction class Article(models.Model): # ... def
delete(self, *args, **kwargs): pk = self.pk with transaction.atomic(): transaction.on_commit( lambda: delete_search(pk)) return super().delete( *args, **kwargs)
Maintain A Complete Search Index
What Is Search? Try to find something by looking or
otherwise seeking carefully and thoroughly. — Oxford English Dictionary
• Example: https://github.com/MarkusH/talk-django-search • Search in Django: https://docs.djangoproject.com/en/2.0/topics/db/search/ • Choosing
a PostgreSQL text search method: https://blog.2ndquadrant.com/text-search-strategies-in-postgresql/ • Trigram Extension: https://www.postgresql.org/docs/10/static/pgtrgm.html • Full-text search: https://www.postgresql.org/docs/10/static/textsearch-tables.html
Thank you! @m_holtermann
import blog.indexes from django.contrib.postgres.operations import TrigramExtension from django.db import migrations
class Migration(migrations.Migration): dependencies = [('blog', '0002_auto_20180503_1925')] operations = [ TrigramExtension(), migrations.AddIndex( model_name='entry', index=blog.indexes.TrigramGistIndex( fields=['body'], name='body_trgm_gist')), ]
-- Creates extension pg_trgm CREATE EXTENSION IF NOT EXISTS "pg_trgm";
-- Create index body_trgm_gist on field(s) -- body of model entry CREATE INDEX "body_trgm_gist" ON "blog_entry" USING gist (UPPER("body") gist_trgm_ops);