
Combining Django and Elasticsearch

Markus H
February 17, 2015

Transcript

  1. Combining Django & Elasticsearch

  2. Search is hard

  3. What is relevant information?

  4. Classify information importance

  5. Selective search attributes

  6. What’s Elasticsearch?

  7. Based on Lucene

  8. Distributed, multitenant, full-text

  9. RESTful API

  10. JSON serialization

  11. Open Source: Apache License 2.0

  12. It’s Java

  13. Elasticsearch & Python

  14. Python Bindings

  15. pyelasticsearch

  16. elasticsearch, elasticsearch_dsl

  17. python-requests http://docs.python-requests.org/en/latest/_static/requests-sidebar.png

  18. Use the official libraries

      $ pip install elasticsearch-dsl

  19. >>> from elasticsearch import Elasticsearch
      >>> from elasticsearch_dsl import Search
      >>> client = Elasticsearch()
      >>> s = Search(using=client, index="blog") \
      ...     .query("match", title="django") \
      ...     .filter("term", is_public=True)
      >>> response = s.execute()

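Since Elasticsearch speaks JSON over a RESTful API, the DSL query on slide 19 ends up as a JSON request body. The following is a minimal sketch of what such a body could look like, using only the standard library. The `build_search_body` helper is hypothetical, and the bool/must/filter layout is an assumption based on the current Elasticsearch query DSL; the exact body that elasticsearch_dsl emits depends on the library and server version.

```python
import json


def build_search_body(title_query, public_only=True):
    """Build a request body similar in spirit to the DSL query on slide 19.

    Hypothetical helper for illustration; not part of elasticsearch_dsl.
    """
    bool_query = {"must": [{"match": {"title": title_query}}]}
    if public_only:
        # Filter clauses restrict the result set without changing scores.
        bool_query["filter"] = [{"term": {"is_public": True}}]
    return {"query": {"bool": bool_query}}


body = build_search_body("django")

# The serialized string is what a client would send to the search
# endpoint of the "blog" index (for example POST /blog/_search).
print(json.dumps(body, indent=2))
```

Building the body by hand like this shows why the official libraries are attractive: the chained `.query(...)` and `.filter(...)` calls hide the nesting.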
  20. Elasticsearch & Django

  21. Django Bindings

  22. Haystack

  23. djangoes

  24. elasticutils

  25. django-simple-elasticsearch

  26. Example

  27. Index Mapping

      from elasticsearch_dsl import DocType, String

      class SearchArticle(DocType):
          title = String()
          text = String(analyzer='english')
          slug = String(index='not_analyzed')
          url = String(index='not_analyzed')
          # ...

          class Meta:
              index = 'blog'

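  28. AppConfig

      from django.apps import AppConfig
      from django.conf import settings
      from django.db.models.signals import post_save
      from elasticsearch_dsl import connections

      from . import models, search, signals

      class BlogConfig(AppConfig):
          name = 'blog'

          def ready(self):
              # Configure the Elasticsearch connections from Django settings.
              connections.connections.configure(
                  **settings.ELASTICSEARCH_CONNS)
              # Set up the mapping for the search document.
              search.SearchArticle.init()
              post_save.connect(signals.post_save_article,
                                sender=models.Article)
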
  29. Signals

      from .search import SearchArticle

      def post_save_article(sender, instance, created, **kwargs):
          # New articles get a fresh document; existing ones are loaded first.
          article = SearchArticle(id=instance.pk) if created else \
              SearchArticle.get(id=instance.pk)
          article.title = instance.title
          article.text = instance.text
          article.url = instance.get_absolute_url()
          article.save()

      def post_delete_article(sender, instance, **kwargs):
          article = SearchArticle.get(id=instance.pk)
          article.delete()

  30. Search View

      from django.shortcuts import render

      from .search import SearchArticle

      def search(request):
          q = request.GET.get('q', '')
          context = {'query': q, 'results': []}
          if q:
              search = SearchArticle.search()
              search = search.query('simple_query_string', query=q,
                                    fields=['title', 'text'])
              # is_authenticated is a method in the Django versions of
              # early 2015; it became a property in Django 1.10.
              if not request.user.is_authenticated():
                  search = search.filter('term', is_public=True)
              context['results'] = search.execute()
          return render(request, 'blog/search.html', context)

  31. Problems

  32. Indexing happens during request time

  33. Solution: Celery Integration

  34. Celery Integration

      import elasticsearch
      from celery import shared_task

      from .models import Article
      from .search import SearchArticle

      @shared_task(bind=True, default_retry_delay=60, max_retries=3)
      def index_article(self, pk):
          try:
              article = Article.objects.get(pk=pk)
          except Article.DoesNotExist:
              # Article not found (yet): schedule a retry.
              raise self.retry()
          try:
              search_article = SearchArticle.get(id=pk)
          except elasticsearch.NotFoundError:
              search_article = SearchArticle(id=pk)
          search_article.title = article.title
          # ...
          search_article.save()

  35. Demo

  36. What I want to see

  37. Auto generated mappings

  38. Celery integration

  39. Management commands to (re)index
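As a rough illustration of what a (re)index command might do internally, the sketch below turns article-like objects into bulk actions in the dictionary format accepted by `elasticsearch.helpers.bulk`. The `article_actions` generator and the `FakeArticle` stand-in are hypothetical names introduced for this example; the field names follow the `SearchArticle` mapping from slide 27, and the talk itself does not show such a command.

```python
class FakeArticle:
    """Stand-in for the Django Article model, for illustration only."""

    def __init__(self, pk, title, text):
        self.pk = pk
        self.title = title
        self.text = text

    def get_absolute_url(self):
        return "/blog/%d/" % self.pk


def article_actions(articles, index="blog"):
    """Yield one bulk-index action per article-like object."""
    for article in articles:
        yield {
            "_index": index,
            "_id": article.pk,
            "_source": {
                "title": article.title,
                "text": article.text,
                "url": article.get_absolute_url(),
            },
        }


actions = list(article_actions([
    FakeArticle(1, "Hello", "First post"),
    FakeArticle(2, "Django", "Second post"),
]))

# Inside a management command, assuming a configured client, the
# generator could be passed on directly:
#   from elasticsearch.helpers import bulk
#   bulk(client, article_actions(Article.objects.iterator()))
```

Using a generator keeps memory use flat for large tables, since the bulk helper consumes actions in chunks.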

  40. 3rd party app support

  41. Django admin integration

  42. Pagination
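Pagination can already be approximated by hand, because an elasticsearch_dsl `Search` object supports Python slicing, which maps to the `from` and `size` request parameters. The helper below is a minimal sketch of the page arithmetic; `page_bounds` is a hypothetical name and not part of any library.

```python
def page_bounds(page, per_page=10):
    """Return (start, stop) slice bounds for a 1-based page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    start = (page - 1) * per_page
    return start, start + per_page


start, stop = page_bounds(3, per_page=20)

# In the search view from slide 30 this could be applied as:
#   search = search[start:stop]
# before calling search.execute().
```

This covers only the slicing; page counts and links would still have to come from the total hit count of the response.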

  43. Thank you! markusholtermann.eu @m_holtermann github.com/MarkusH