Elasticsearch DSL

Slides from DjangoCon US 2014 by Honza Král, presenting on the Python DSL for Elasticsearch.

Elasticsearch Inc

September 09, 2014

Transcript

  1. StackOverflow Question

        {
          "id": 7635,
          "accepted_answer_id": 7641,
          "answer_count": 9,
          "title": "Are you able to close your eyes and focus/think just on your code?",
          "body": "How do I ......?",
          "creation_date": "2010-09-27T19:16:57.757",
          "closed_date": "2011-11-13T12:12:05.937",
          "comment_count": 2,
          "comments": [{
            "creation_date": "2010-09-27T19:31:27.200",
            "id": 9372,
            "owner": { "display_name": "sange", "id": 3092 },
            "post_id": 7635,
            "text": "I sometimes close my eyes or stare at something ....."
          }, {......}],
          "favorite_count": 2,
          "last_activity_date": "2010-09-28T00:28:08.393",
          "owner": { "display_name": "flow", "id": 3761 },
          "rating": 6,
          "tags": [ "focus", "concentration" ],
          "view_count": 368
        }
  2. Queries (unstructured)

    Core Queries: match, multi_match, phrase, fuzzy, regexp, wildcard
    Compound Queries: filtered, bool, function score
    Rely on analysis, produce a score (relevancy)
  3. Example

        {
          "query": {
            "filtered": {
              "query": {
                "bool": {
                  "must": [
                    {"multi_match": {"fields": ["title^10", "body"], "query": "php"}},
                    {"has_child": { "child_type": "answer", "query": {"match": {"body": "python"}}}}
                  ],
                  "must_not": {"multi_match": {"fields": ["title", "body"], "query": "python"}}
                }
              },
              "filter": {"range": {"creation_date": {"from": "2012-01-01"}}}
            }
          },
          "aggs": {
            "tags": {
              "terms": {"field": "tags"},
              "aggs": {
                "comment_avg": {"avg": {"field": "comment_count"}}
              }
            },
            "frequency": {"date_histogram": {"field": "creation_date", "interval": "month"}}
          }
        }
  4. Elasticsearch

    Distributed: load balancing, node failure, node discovery
    Different deployment environments: nginx, thrift, PaaS
    REST API: 96 API endpoints, 672 parameters, escaping, encoding
  5. dict -> JSON

        from elasticsearch import Elasticsearch

        es = Elasticsearch()
        result = es.search(body={
            "query": {
                "filtered": {
                    "query": {
                        "bool": {
                            "must": [{"match": {"title": "python"}}],
                            "must_not": [{"match": {"title": "ruby"}}]
                        }
                    },
                    "filter": {
                        "range": {"creation_date": {"from": "2012-01-01"}}
                    }
                }
            }
        })
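The `dict -> JSON` step on slide 5 can be tried without a running cluster. The following is a minimal sketch using only the standard library: `build_body` is a hypothetical helper (it is not part of elasticsearch-py), and the sketch only shows that the request body is an ordinary nested dict that gets serialized to JSON before it is sent.

```python
import json


def build_body(include, exclude, since):
    """Hypothetical helper: assemble the 'filtered' query from slide 5 as a plain dict."""
    return {
        "query": {
            "filtered": {
                "query": {
                    "bool": {
                        "must": [{"match": {"title": include}}],
                        "must_not": [{"match": {"title": exclude}}],
                    }
                },
                "filter": {"range": {"creation_date": {"from": since}}},
            }
        }
    }


body = build_body("python", "ruby", "2012-01-01")
payload = json.dumps(body)  # the JSON text a client would put in the request body
print(payload)
```

Because the body is only data, every change to the query means editing nested dicts by hand, which is the pain point the following slides address.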
  6. elasticsearch-dsl

        from datetime import date
        from elasticsearch_dsl import Search, Q

        # create Search, bind it to client
        s = Search(using=es)

        # querying twice will combine queries
        # or combine manually: s.query(Q() & ~Q())
        s = s.query('match', title='python').query(~Q('match', title='ruby'))

        # filter will turn it into a filtered query
        s = s.filter('range', creation_date={"from": date(2012, 1, 1)})

        # get a fancy result object!
        result = s.execute()
  8. Q/F/A

    Shortcut for creating Queries/Filters/Aggregations
    Can use dicts or name + params
    Has elementary boolean logic

        Q('match', title='python') == Q({'match': {'title': 'python'}})
        Q('match', title='python') == Match(title='python')

        Q(1) & Q(2) == Q('bool', must=[Q(1), Q(2)])
        Q(1) | Q(2) | Q(3) == Q('bool', should=[Q(1), Q(2), Q(3)])
        ~F(1) == F('bool', must_not=[F(1)])
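To make the boolean shortcuts on slide 8 concrete, here is a toy sketch of how `&`, `|` and `~` could be mapped onto `bool` queries with operator overloading. The `Query` class below is hypothetical and written only for illustration; it is not the `Q` from elasticsearch-dsl, and the real library may combine or flatten queries differently.

```python
class Query:
    """Toy query node; NOT the real elasticsearch_dsl.Q, only an illustration."""

    def __init__(self, name, **params):
        self.name = name
        self.params = params

    def _is_should(self):
        # True for a bool query that holds nothing but a 'should' list.
        return self.name == "bool" and list(self.params) == ["should"]

    def __and__(self, other):
        return Query("bool", must=[self, other])

    def __or__(self, other):
        # Flatten chained ORs so that a | b | c yields one 'should' list.
        left = self.params["should"] if self._is_should() else [self]
        right = other.params["should"] if other._is_should() else [other]
        return Query("bool", should=left + right)

    def __invert__(self):
        return Query("bool", must_not=[self])

    def __eq__(self, other):
        return isinstance(other, Query) and self.to_dict() == other.to_dict()

    def to_dict(self):
        def render(value):
            if isinstance(value, Query):
                return value.to_dict()
            if isinstance(value, list):
                return [render(v) for v in value]
            return value

        return {self.name: {k: render(v) for k, v in self.params.items()}}


a = Query("match", title="python")
b = Query("match", title="ruby")
c = Query("match", title="php")

combined = a & ~b       # 'must' containing a and a nested 'must_not'
either = a | b | c      # one 'should' list with three entries
print(combined.to_dict())
print(either.to_dict())
```

Comparing through `to_dict()` keeps equality structural, which is what lets the identities on the slide be written with `==`.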
  9. .query().filter() chaining

    Copy is made on each change

        s2 = s1.query(Q(1))
        s1 != s2

    Same for other methods

        s[0:10], s.using(es), s.index('today', 'yesterday'), ...

    Except Aggs!

        s.aggs.bucket('per_tag').metric('avg').metric('max')
        s.aggs.bucket('per_country').bucket('per_tag').metric('avg')
        s.aggs['per_country'].metric(...)
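The "copy is made on each change" behaviour from slide 9 can be sketched with a small stand-in class. `ToySearch` and its simplified `to_dict()` output are hypothetical and are not how elasticsearch-dsl is implemented; the sketch only shows the pattern of cloning before mutating, so that chained calls never alter the object they were called on.

```python
import copy


class ToySearch:
    """Illustrative stand-in for a Search object; not elasticsearch-dsl's implementation."""

    def __init__(self):
        self._queries = []
        self._filters = []

    def _clone(self):
        # Deep copy, so the new object shares no mutable state with the old one.
        return copy.deepcopy(self)

    def query(self, name, **params):
        s = self._clone()
        s._queries.append({name: params})
        return s

    def filter(self, name, **params):
        s = self._clone()
        s._filters.append({name: params})
        return s

    def to_dict(self):
        # Simplified output format, chosen for this sketch only.
        return {"queries": self._queries, "filters": self._filters}


s1 = ToySearch().query("match", title="python")
s2 = s1.filter("range", creation_date={"from": "2012-01-01"})

print(s1.to_dict())  # s1 is not modified by the call that produced s2
print(s2.to_dict())
```

The practical benefit of this design is that a base search can be kept around and specialized several times without the variants interfering with each other.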
  10. Response

    Response object is returned:

        response = s.execute()
        if not response.success():
            print("Partial results!")

    You can iterate over it and get hits:

        for h in response:
            print(h._meta.id, h.title)

    Aggregations can be accessed:

        top_tag = response.aggregations.per_tag.buckets[0]
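Attribute access such as `response.aggregations.per_tag.buckets[0]` on slide 10 can be approximated with a thin wrapper around the raw JSON. The sketch below is an assumption about the general technique, not the library's actual response class: `AttrView` and `wrap` are hypothetical names, and the `raw` dict is made-up sample data shaped like an aggregation result.

```python
class AttrView:
    """Hypothetical read-only wrapper giving attribute access to nested dicts."""

    def __init__(self, data):
        self._data = data

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so '_data' is unaffected.
        try:
            value = self._data[name]
        except KeyError:
            raise AttributeError(name)
        return wrap(value)

    def __getitem__(self, key):
        return wrap(self._data[key])


def wrap(value):
    """Wrap dicts (and dicts inside lists) so their keys read like attributes."""
    if isinstance(value, dict):
        return AttrView(value)
    if isinstance(value, list):
        return [wrap(v) for v in value]
    return value


# Made-up sample data, for illustration only.
raw = {
    "aggregations": {
        "per_tag": {"buckets": [{"key": "python", "doc_count": 2}]}
    }
}

response = wrap(raw)
top_tag = response.aggregations.per_tag.buckets[0]
print(top_tag.key, top_tag.doc_count)
```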
  11. Migration Path

        query = {
            "query": {
                "filtered": {
                    "query": {
                        "bool": {
                            "must": [{"match": {"title": "python"}}],
                            "must_not": [{"match": {"title": "ruby"}}]
                        }
                    },
                    "filter": {
                        "range": {"creation_date": {"from": date(2012, 1, 1)}}
                    }
                }
            }
        }

        s = Search.from_dict(query)
        s = ...
        query = s.to_dict()
  12. Put your data in....

    Sync after change - signals

        def sync_to_es(instance, **kwargs):
            es.index(
                index=settings.ES_INDEX,
                doc_type=str(instance._meta),
                id=instance.pk,
                body=instance.to_json())

    Bulk load - mgmt command

        from operator import methodcaller
        from elasticsearch.helpers import bulk

        es.indices.put_mapping(index=settings.ES_INDEX, body={...})

        bulk(es,
            map(methodcaller('to_dict'), Model.objects.iterator()),
            index=settings.ES_INDEX,
            doc_type=str(Model._meta))
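For the bulk load on slide 12, it can help to see what the stream of documents might look like before it reaches the `bulk` helper. The sketch below is an assumption-laden illustration: `Post` is a hypothetical stand-in for a Django model instance, the index and type names are invented, and the `_index` / `_type` / `_id` / `_source` keys follow the action format described in the elasticsearch-py helpers documentation as I understand it. Nothing here talks to a cluster.

```python
from collections import namedtuple

# Hypothetical stand-in for a Django model instance.
Post = namedtuple("Post", ["pk", "title"])


def to_actions(instances, index, doc_type):
    """Lazily yield one bulk action dict per instance."""
    for instance in instances:
        yield {
            "_index": index,
            "_type": doc_type,
            "_id": instance.pk,
            "_source": {"title": instance.title},
        }


posts = [Post(1, "first"), Post(2, "second")]
actions = list(to_actions(posts, index="questions", doc_type="post"))
print(actions)
```

Since `to_actions` is a generator, a large queryset would be streamed rather than loaded into memory at once, which is the same reason the slide uses `Model.objects.iterator()`.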
  13. Thank You!

    Honza Král
    twitter: @honzakral
    email: [email protected]

    Support: http://elasticsearch.com/support
    Training: http://training.elasticsearch.com/
    We are hiring: http://elasticsearch.com/about/jobs/