
Elasticsearch 5.x You Know, for Search

a getting started guide

Shai Gabai

May 24, 2017

Transcript

  1. What is Elasticsearch? • Full-text search & Analytics engine •

    Open source • NoSQL database • “Schemaless” • Inverted indices - Lucene based
  2. Elastic stack Versions

    • Release dates: 16-Aug, 16-Mar, 16-Feb, 15-Nov, 15-Oct
    • Elasticsearch: 2.4, 2.3, 2.2, 2.1, 2.0
    • Logstash: 2.4, 2.3, 2.2, 2.1, 2.0
    • Kibana: 4.6, 4.5, 4.4, 4.3, 4.2
    • Beats: 1.3, 1.2, 1.1, 1.0
  3. Use Cases • Full text search • Logging & Analysis

    • Event data • Analytics & Aggregations • Data visualization • Alerting & Classification • Suggestions & Autocomplete • Performance monitoring
  4. Concepts • SQL RDBMS → Elasticsearch

    • Database → Index
    • Table → Type
    • Row → Document
    • Column → Field
    • Schema → Mapping
    • Index → Everything is Indexed
    • SQL → Query DSL
  5. Concepts: Add Failover • All primary and replica shards are allocated

    [Diagram: Node-1 and Node-2 hold the primary shards Idx-1-sh1, Idx-1-sh2, Idx-1-sh3 and the replicas Idx-1-R1, Idx-1-R2, Idx-1-R3]
  6. Concepts: Scale Horizontally • Shards have been reallocated to spread the load

    [Diagram: the same three primary shards and three replicas spread across Node-1, Node-2 and Node-3]
  7. Concepts: Scale Some More • Increasing the number of replicas to 2

    [Diagram: three primary shards and two copies of each replica (Idx-1-R1, Idx-1-R2, Idx-1-R3) across Node-1, Node-2 and Node-3]
  8. Concepts: Cluster after killing one node • A cluster must have a master node in order to function correctly

    [Diagram: Node-1 is gone; Node-2 and Node-3 hold Idx-1-sh1, Idx-1-sh2, Idx-1-sh3 and Idx-1-R1, Idx-1-R2, Idx-1-R3]
  9. Getting Started http://localhost:9200/ { "name" : "node-01", "cluster_name" : "shagaba",

    "cluster_uuid" : "3p3dLj8bQYqOZGLBX3GAHg", "version" : { "number" : "5.1.1", "build_hash" : "5395e21", "build_date" : "2016-12-06T12:36:15.409Z", "build_snapshot" : false, "lucene_version" : "6.3.0" }, "tagline" : "You Know, for Search" }
  10. Field datatypes • Core datatypes • String – text, keyword

    • Numeric – long, integer, short, byte, double, float • Date – date • Boolean – boolean • Complex datatypes • Array • Object
  11. Field datatypes • Geo datatypes • geo_point – for lat/lon

    points • geo_shape – for complex shapes like polygons • Specialised datatypes • ip – for IPv4 and IPv6 addresses • completion – to provide auto-complete suggestions
  12. Document { "name": "John Smith", "age": 42, "confirmed": true, "join_date":

    "2017-01-01", "home": { "lat": 51.5, "lon": 0.1 }, "accounts": [ { "type": "facebook", "id": "johnsmith" } ] }
  13. Document Metadata • _index Where the document lives • _type

    The class of object that the document represents • _id The unique identifier for the document • _version Enables optimistic concurrency control on a single document level • _source The original document that was indexed
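The role of `_version` in optimistic concurrency control can be sketched with a toy in-memory store. This is only a model of the idea (a write carries the version it expects and is rejected if the document has changed in the meantime), not Elasticsearch code; `store`, `index_document` and `VersionConflict` are hypothetical names introduced for illustration.

```python
class VersionConflict(Exception):
    """Raised when a write is based on a stale version (hypothetical name)."""


store = {}  # doc_id -> {"_version": int, "_source": dict}


def index_document(doc_id, source, expected_version=None):
    """Create or replace a document and return its new version.

    If expected_version is given, the write only succeeds when it equals
    the version currently stored (0 means "does not exist yet").
    """
    current = store.get(doc_id)
    current_version = current["_version"] if current else 0
    if expected_version is not None and expected_version != current_version:
        raise VersionConflict(
            f"expected version {expected_version}, found {current_version}"
        )
    new_version = current_version + 1
    store[doc_id] = {"_version": new_version, "_source": source}
    return new_version


v1 = index_document("123", {"title": "My first blog entry"})
v2 = index_document("123", {"title": "Edited once"}, expected_version=1)

# A second writer that still believes the document is at version 1 is rejected.
try:
    index_document("123", {"title": "Stale edit"}, expected_version=1)
    conflict = False
except VersionConflict:
    conflict = True

print(v1, v2, conflict)
```

The rejected writer is expected to re-read the document and retry, which avoids locking at the cost of an occasional retry.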
  14. Indexing a Document • Using Our Own ID PUT /{index}/{type}/{id}

    { "field": "value", ... } PUT /website/blog/123 { "title": "My first blog entry", "text": "Just trying this out...", "date": "2017/01/01" }
  15. Indexing a Document • Elasticsearch responds { "_index": "website", "_type":

    "blog", "_id": "123", "_version": 1, "created": true } 201 (CREATED) if it's a newly created doc 200 (OK) if the doc was updated (replaced/reindexed)
  16. Indexing a Document • Autogenerating IDs POST /website/blog/ { "title":

    "My second blog entry", "text": "Still trying this out...", "date": "2017/01/01" } { "_index": "website", "_type": "blog", "_id": "AVFgSgVHUP18jI2wRx0w", "_version": 1, "created": true }
  17. Retrieving a Document • Retrieving the whole Document GET /website/blog/123?pretty

    { "_index" : "website", "_type" : "blog", "_id" : "123", "_version" : 1, "found" : true, "_source" : { "title": "My first blog entry", "text": "Just trying this out...", "date": "2017/01/01" } } 200 (OK) if exists 404 (NOT FOUND) if doesn’t exist
  18. Retrieving a Document • Not Found GET /website/blog/456?pretty HTTP/1.1 404

    Not Found Content-Type: application/json; charset=UTF-8 Content-Length: 83 { "_index" : "website", "_type" : "blog", "_id" : "456", "found" : false }
  19. Retrieving a Document • Retrieving Part of a Document GET

    /website/blog/123?_source=title,text { "_index" : "website", "_type" : "blog", "_id" : "123", "_version" : 1, "found" : true, "_source" : { "title": "My first blog entry", "text": "Just trying this out..." } }
  20. Retrieving a Document • Retrieving Fields without Metadata GET /website/blog/123/_source

    { "title": "My first blog entry", "text": "Just trying this out...", "date": "2017/01/01" }
  21. Checking whether a Document Exists • Check if a document

    is in the index • Without the overhead of loading it HEAD /website/blog/7890 HTTP/1.1 200 OK Content-Type: application/json; charset=UTF-8 Content-Length: 0 HTTP/1.1 404 Not Found Content-Type: application/json; charset=UTF-8 Content-Length: 0 200 (OK) if _id exists 404 (NOT FOUND) if _id doesn’t exist
  22. Updating a Whole Document • Documents are immutable PUT /website/blog/123

    { "title": "My first blog entry", "text": "I am starting to get the hang of this...", "date": "2017/01/01" } { "_index" : "website", "_type" : "blog", "_id" : "123", "_version" : 2, "created": false }
  23. Partial Update • Partial document merged with existing document POST

    /website/blog/123/_update { "doc": { "title": "Partial Update to Document" } } { "_index" : "website", "_type" : "blog", "_id" : "123", "_version" : 2, "created": false }
  24. Deleting a Document • When the Document is found DELETE

    /website/blog/123 { "found" : true, "_index" : "website", "_type" : "blog", "_id" : "123", "_version" : 3 } 200 (OK) if exists 404 (NOT FOUND) if doesn’t exist
  25. Deleting a Document • When the Document isn’t found DELETE

    /website/blog/123 { "found" : false, "_index" : "website", "_type" : "blog", "_id" : "123", "_version" : 4 } • Elasticsearch keeps a record of deletes, but forgets about them after 60 seconds by default. • This is called deletes garbage collection.
  26. The need for Text Analysis Exact values • 156 •

    1.9 • 2017-01-01 • true / false • “Hello World” Full text “The quick brown fox jumped over the lazy dog”
  27. The need for Text Analysis • Stopwords • "a", "and",

    "but", "how", "or", "what", "else", "etc", "the"… • Case sensitivity • "Hello World", "hello world", "HELLO WORLD"... • Grammar • "jumps", "jumping", "jumped", "jump" • Synonyms • "walk", "hike", "tour", "parade", "march" • Relevance scoring
  28. Inverted Index • Elasticsearch uses a structure called an inverted

    index, which is designed to allow very fast full-text searches. • An inverted index consists of a list of all the unique words that appear in any document, and for each word, a list of the documents in which it appears.
  29. Inverted Index • Doc_1: “The quick brown fox jumped over the lazy dog”

    • Doc_2: “Quick brown foxes leap over lazy dogs in summer”
  30. Inverted Index • List docs containing terms

    Term    | Doc_1 | Doc_2
    --------|-------|------
    Quick   |       | X
    The     | X     |
    brown   | X     | X
    dog     | X     |
    dogs    |       | X
    fox     | X     |
    foxes   |       | X
    in      |       | X
    jumped  | X     |
    lazy    | X     | X
    leap    |       | X
    over    | X     | X
    quick   | X     |
    summer  |       | X
    the     | X     |
  31. Inverted Index • Search for “quick brown”

    Term    | Doc_1 | Doc_2
    --------|-------|------
    brown   | X     | X
    quick   | X     |
    Total   | 2     | 1

    Both documents match, but the first document has more matches than the second.
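The table and the “quick brown” search above can be reproduced with a few lines of Python. This is a minimal sketch that splits on whitespace and applies no normalization, which is why “Quick” and “quick” stay separate terms; `build_inverted_index` and `search` are illustrative names, not Elasticsearch APIs.

```python
from collections import defaultdict

docs = {
    "Doc_1": "The quick brown fox jumped over the lazy dog",
    "Doc_2": "Quick brown foxes leap over lazy dogs in summer",
}


def build_inverted_index(documents):
    """Map each unique term to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Count, per document, how many query terms it contains."""
    hits = defaultdict(int)
    for term in query.split():
        for doc_id in index.get(term, ()):
            hits[doc_id] += 1
    return dict(hits)


inverted = build_inverted_index(docs)
result = search(inverted, "quick brown")
print(result)
```

Doc_1 contains both terms and Doc_2 only “brown”, matching the totals of 2 and 1 in the table.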
  32. Inverted Index • A few problems with the index as built so far

    • “Quick” and “quick” • “The” and “the” • “fox” and “foxes” • “dog” and “dogs” • “jumped” and “leap” • “the” doesn’t bring much value
  33. Inverted Index • Normalize into a standard format

    • “Quick” can be lowercased to become “quick”. • “The” can be lowercased to become “the”. • “foxes” can be stemmed to its root form: “fox”. • “dogs” could be stemmed to “dog”. • “jumped” and “leap” are synonyms and can be indexed as just the single term “jump”.

    Term    | Doc_1 | Doc_2
    --------|-------|------
    brown   | X     | X
    dog     | X     | X
    fox     | X     | X
    in      |       | X
    jump    | X     | X
    lazy    | X     | X
    over    | X     | X
    quick   | X     | X
    summer  |       | X
  34. Inverted Index • Search for “Quick fox” would fail

    We no longer have the exact term “Quick” in our index.
  35. Inverted Index • Solution • Apply the same normalization rules

    that we used on the content field to our query string, so that it becomes a query for “quick fox”.
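A small sketch of the same idea: one `normalize` function is used both when building the index and when handling the query, so “Quick fox” finds both documents. The stopword, stem and synonym tables below are tiny hand-written stand-ins chosen to match the slides, not real analyzer data.

```python
# Hand-written lookup tables, assumed for illustration only.
STOPWORDS = {"the"}
STEMS = {"foxes": "fox", "dogs": "dog"}
SYNONYMS = {"jumped": "jump", "leap": "jump"}


def normalize(text):
    """Lowercase, drop stopwords, then apply the stem and synonym tables."""
    terms = []
    for token in text.lower().split():
        if token in STOPWORDS:
            continue
        token = STEMS.get(token, token)
        token = SYNONYMS.get(token, token)
        terms.append(token)
    return terms


docs = {
    "Doc_1": "The quick brown fox jumped over the lazy dog",
    "Doc_2": "Quick brown foxes leap over lazy dogs in summer",
}

index = {}
for doc_id, text in docs.items():
    for term in normalize(text):
        index.setdefault(term, set()).add(doc_id)


def search(query):
    """Return the documents that contain every normalized query term."""
    postings = [index.get(term, set()) for term in normalize(query)]
    return set.intersection(*postings) if postings else set()


matches = search("Quick fox")
print(sorted(index))
print(sorted(matches))
```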
  36. Analysis • The process of converting text into tokens or

    terms which are added to the inverted index for searching. • Tokenization – tokenizing a block of text into individual terms suitable for use in an inverted index. • Normalization – normalizing these terms into a standard form to improve their “searchability”.
  37. Introduction • Special algorithms that determine how a string field

    in a document is transformed into terms in an inverted index. • Character filters – replace characters in the text being analyzed • Tokenizers – break text down into terms • Token filters – add/change/remove terms • Built-in analyzers • Custom analyzers • When are analyzers used? • Index time • Search time
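The three stages can be pictured as a simple pipeline: character filters run on the raw string, one tokenizer splits it, and token filters rewrite the resulting list. The sketch below uses regular expressions as rough stand-ins for an HTML-stripping character filter and a word tokenizer; all function names are hypothetical and the real components handle many more cases.

```python
import re


def html_strip(text):
    """Character filter: crude removal of anything that looks like a tag."""
    return re.sub(r"<[^>]+>", "", text)


def word_tokenizer(text):
    """Tokenizer: split into runs of word characters."""
    return re.findall(r"\w+", text)


def lowercase(tokens):
    """Token filter: lowercase every token."""
    return [token.lower() for token in tokens]


def analyze(text, char_filters, tokenizer, token_filters):
    """Run char filters, then the tokenizer, then token filters, in order."""
    for char_filter in char_filters:
        text = char_filter(text)
    tokens = tokenizer(text)
    for token_filter in token_filters:
        tokens = token_filter(tokens)
    return tokens


tokens = analyze(
    "The 2 <b>QUICK Foxes</b>",
    char_filters=[html_strip],
    tokenizer=word_tokenizer,
    token_filters=[lowercase],
)
print(tokens)
```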
  38. Built-in Analyzers • Standard analyzer • Simple analyzer • Whitespace analyzer •

    Stop analyzer • Keyword analyzer • Pattern analyzer • Language analyzer • Fingerprint analyzer • Custom analyzer
  39. Standard Analyzer • Standard Tokenizer [ The, 2, QUICK, Brown,

    Foxes, jumped, over, the, lazy, dog's, bone ]
  40. Standard Analyzer • Lowercase Filter [ the, 2, quick, brown,

    foxes, jumped, over, the, lazy, dog's, bone ]
  41. Standard Analyzer • Stopwords Filter (disabled by default) [ 2,

    quick, brown, foxes, jumped, over, lazy, dog's, bone ]
  42. Simple Analyzer "The 2 QUICK Brown-Foxes jumped over the lazy

    dog's bone." 1. Lowercase Tokenizer [ the, quick, brown, foxes, jumped, over, the, lazy, dog, s, bone ] Equivalent to the letter tokenizer combined with the lowercase token filter, but is more efficient as it performs both steps in a single pass
  43. Stop Analyzer "The 2 QUICK Brown-Foxes jumped over the lazy

    dog's bone." 1. Lowercase Tokenizer [ the, quick, brown, foxes, jumped, over, the, lazy, dog, s, bone ] 2. Stop Filter [ quick, brown, foxes, jumped, over, lazy, dog, s, bone ]
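The token lists on the Simple and Stop analyzer slides can be approximated in a few lines: split on anything that is not a letter, lowercase, and optionally drop stopwords. This is a sketch limited to ASCII letters, and `SAMPLE_STOPWORDS` is a small illustrative subset rather than the analyzer's actual list.

```python
import re

# Small sample for illustration; not the analyzer's full stopword list.
SAMPLE_STOPWORDS = {"a", "an", "and", "the", "of", "or"}


def simple_analyze(text):
    """Split on non-letters and lowercase (ASCII letters only in this sketch)."""
    return [token.lower() for token in re.findall(r"[A-Za-z]+", text)]


def stop_analyze(text, stopwords=SAMPLE_STOPWORDS):
    """Same as simple_analyze, followed by stopword removal."""
    return [token for token in simple_analyze(text) if token not in stopwords]


sentence = "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
simple_tokens = simple_analyze(sentence)
stop_tokens = stop_analyze(sentence)
print(simple_tokens)
print(stop_tokens)
```

Note how the digit disappears and `dog's` splits into `dog` and `s`, because only letters form tokens.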
  44. Whitespace Analyzer "The 2 QUICK Brown-Foxes jumped over the lazy

    dog's bone." 1. Whitespace Tokenizer [ The, 2, QUICK, Brown-Foxes, jumped, over, the, lazy, dog's, bone. ]
  45. English Analyzer "The Quick Brown Fox jumped over the Lazy

    Dog!" 1. Standard Tokenizer [ The, Quick, Brown, Fox, jumped, over, the, Lazy, Dog ] 2. Lowercase Filter [ the, quick, brown, fox, jumped, over, the, lazy, dog ] 3. English Stemmer [ the, quick, brown, fox, jump, over, the, lazy, dog ] 4. English Stopwords [ quick, brown, fox, jump, over, lazy, dog ]
  46. HebMorph • Open source AGPL3 • Commercial Option Available •

    Itamar Syn-Hershko • https://github.com/synhershko/HebMorph • http://code972.com/hebmorph
  47. Testing Analyzer • Performs the analysis process on a text and returns

    the token breakdown of the text. GET /_analyze { "analyzer": "english", "text": "The 2 QUICK Foxes jumped." }
  48. Testing Analyzer

    { "tokens": [ { "token": "2", "start_offset": 4, "end_offset": 5, "type": "<NUM>", "position": 1 }, { "token": "quick", "start_offset": 6, "end_offset": 11, "type": "<ALPHANUM>", "position": 2 }, { "token": "fox", "start_offset": 12, "end_offset": 17, "type": "<ALPHANUM>", "position": 3 }, { "token": "jump", "start_offset": 18, "end_offset": 25, "type": "<ALPHANUM>", "position": 4 } ] }
  49. PUT my_index { "settings": { "analysis": { "analyzer": { "my_custom_analyzer":

    { "type": "custom", "char_filter": [ "html_strip" ], "tokenizer": "standard", "filter": [ "lowercase" ] } } } } } Custom Analyzer
  50. • Specifying an index – performs the analysis process on

    a text. GET my_index/_analyze { "analyzer": "my_custom_analyzer", "text": "The 2 <b>QUICK Foxes</b>" } POST my_index/_analyze { "analyzer": "my_custom_analyzer", "text": "The 2 <b>QUICK Foxes</b>" } Custom Analyzer
  51. Custom Analyzer

    { "tokens": [ { "token": "the", "start_offset": 0, "end_offset": 3, "type": "<ALPHANUM>", "position": 0 }, { "token": "2", "start_offset": 4, "end_offset": 5, "type": "<NUM>", "position": 1 }, { "token": "quick", "start_offset": 9, "end_offset": 14, "type": "<ALPHANUM>", "position": 2 }, { "token": "foxes", "start_offset": 15, "end_offset": 24, "type": "<ALPHANUM>", "position": 3 } ] }
  52. Mapping • Dynamic field mapping • Put mappings • Get

    mappings • Mapping analyzer • Multilingual documents
  53. Dynamic Field Mapping • JSON datatype → Elasticsearch datatype

    • null → No field is added.
    • true/false → boolean field
    • floating point number → float field
    • integer → long field
    • object → object field
    • array → Depends on the first non-null value
    • string → Either a date field (date detection), a double or long field (numeric detection) or a text field, with a keyword sub-field.
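The rules above can be written down as a small function over parsed JSON values. This is a rough sketch: `guess_field_type` is a hypothetical helper, the date check is a single ISO-style regular expression standing in for date detection, and numeric detection of strings is left out.

```python
import re

# Crude stand-in for date detection: strings that start with yyyy-MM-dd.
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}")


def guess_field_type(value):
    """Return the field type the rules above would pick, or None for 'no field'."""
    if value is None:
        return None
    if isinstance(value, bool):  # must come before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, float):
        return "float"
    if isinstance(value, int):
        return "long"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        for item in value:
            if item is not None:
                return guess_field_type(item)  # first non-null value decides
        return None
    if isinstance(value, str):
        return "date" if ISO_DATE.match(value) else "text"
    raise TypeError(f"not a JSON value: {value!r}")


document = {
    "name": "John Smith",
    "age": 42,
    "confirmed": True,
    "join_date": "2017-01-01",
    "home": {"lat": 51.5, "lon": 0.1},
    "accounts": [{"type": "facebook", "id": "johnsmith"}],
}
mapping = {field: guess_field_type(value) for field, value in document.items()}
print(mapping)
```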
  54. • Creates an index called twitter with the message field

    in the tweet mapping type PUT twitter { "mappings": { "tweet": { "properties": { "message": { "type": "text" } } } } } Put Mapping
  55. • Uses the PUT mapping API to add a new

    field called user_name to the tweet mapping type. PUT twitter/_mapping/tweet { "properties": { "user_name": { "type": "text" } } } Put Mapping
  56. GET /twitter/_mapping/tweet { "twitter": { "mappings": { "tweet": { "properties":

    { "message": { "type": "text" }, "user_name": { "type": "text" } } } } } } Get Mapping
  57. Get Mapping • Get mappings for tweet and user types

    • GET /_mapping/tweet,user • GET /_all/_mapping/tweet,user • Get mappings of all indices and types • GET /_all/_mapping • GET /_mapping
  58. Mapping Analyzer PUT /my_index { "mappings": { "my_type": { "properties":

    { "text4u": { "type": "text" }, "english4u": { "type": "text", "analyzer": "english" } } } } }
  59. • Finding the right strategy for handling documents written in

    several languages can be challenging. • Mixing languages in the same inverted index can be problematic. • We must take into consideration • Index Time • Search Time Multilingual documents
  60. At Index Time • Multilingual documents come in three main

    varieties: • One predominant language per document, which may contain snippets from other languages (One Language per Document) • One predominant language per field, which may contain snippets from other languages (One Language per Field) • A mixture of languages per field (Mixed-Language Fields.) Multilingual documents
  61. Multilingual documents: At Query Time • Identify the main language: • the

    language that the user has chosen in the UI • the accept-language HTTP header from the user’s browser. • User searches also come in three main varieties: • Users search for words in their main language. • Users search for words in a different language, but expect results in their main language. • Users search for words in a different language, and expect results in that language.
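As one possible way to identify the main language from the accept-language header, the sketch below picks the highest-weighted language that the application supports. `main_language`, the `supported` tuple and the `blogs-<lang>` index naming are assumptions made for illustration, and the parsing covers only the common `tag;q=value` form.

```python
def main_language(accept_language, supported=("en", "fr"), default="en"):
    """Return the best supported primary language from an Accept-Language value."""
    ranked = []
    for position, item in enumerate(accept_language.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # weight defaults to 1 when no q parameter is present
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        # Sort by descending weight, then by original order; keep the primary subtag.
        ranked.append((-quality, position, tag.split("-")[0]))
    for _, _, language in sorted(ranked):
        if language in supported:
            return language
    return default


language = main_language("fr-CH, fr;q=0.9, en;q=0.8")
index_name = f"blogs-{language}"
print(index_name)
```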
  62. One Language per Document PUT /blogs-en { "mappings": { "post": { "properties": { "title":

    { "type": "text", "fields": { "stemmed": { "type": "text", "analyzer": "english" } }}}}}}
  63. One Language per Document PUT /blogs-fr { "mappings": { "post": { "properties": { "title":

    { "type": "text", "fields": { "stemmed": { "type": "text", "analyzer": "french" } }}}}}}
  64. One Language per Document GET /blogs-*/post/_search { "query": { "multi_match": { "query": "deja vu",

    "fields": [ "title", "title.stemmed" ], "type": "most_fields" } } }
  65. One Language per Field PUT /movies { "mappings": { "movie": { "properties": { "title":

    { "type": "text" }, "title_br": { "type": "text", "analyzer": "brazilian" }, "title_cz": { "type": "text", "analyzer": "czech" }, "title_en": { "type": "text", "analyzer": "english" }, "title_es": { "type": "text", "analyzer": "spanish" } } } } }
  66. One Language per Field GET /movies/movie/_search { "query": { "multi_match": { "query": "club de

    la lucha", "fields": [ "title*" ], "type": "most_fields" } } }
  67. Mixed-Language Fields PUT /movies { "mappings": { "movie": { "properties": { "title":

    { "type": "text", "fields": { "de": { "type": "text", "analyzer": "german" }, "en": { "type": "text", "analyzer": "english" }, "fr": { "type": "text", "analyzer": "french" }, "es": { "type": "text", "analyzer": "spanish" } }}}}}}
  68. Mixed-Language Fields GET /movies/movie/_search { "query": { "multi_match": { "query": "club de

    la lucha", "fields": [ "title*" ], "type": "most_fields", "minimum_should_match": "75%" } } }
  69. • Compact Language Detector (CLD) from Google • Open source

    – Apache License 2.0 • It is small, fast, and accurate, and can detect 160+ languages from as little as two sentences. • It can even detect multiple languages within a single block of text • https://github.com/CLD2Owners/cld2 Identifying Language
  70. Query DSL • Structure • Introducing the Query • Executing

    Searches and Filters • Match All queries • Full text queries • Term level queries • Compound queries • Joining queries • Geo queries
  71. Structure • Based on JSON • Flexible • Powerful •

    Leaf and Compound query clauses • Query and Filter context
  72. Query Context • Relevance – Score • Full text •

    Not cached • Slower "How well does this document match this query clause?" Structure
  73. Filter Context • Boolean true/false • Exact values • Cached

    • Faster "Does this document match this query clause?" Structure
  74. Introducing the Query GET /website/blog/_search { "query": { ... }

    } GET /website/blog/_search { "query" : { "match_all": {} } }
  75. Introducing the Query GET /website/blog/_search { "query" : { "match_all": {} },

    "from" : 10, "size" : 10, "sort": { "title" : { "order" : "desc" } } }
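In a search request body, `from`, `size` and `sort` sit next to `query` rather than inside it. A small helper makes that structure explicit and converts a page number into an offset; `search_body` is a hypothetical function that only builds the dictionary and does not talk to a cluster.

```python
def search_body(query, page=1, page_size=10, sort=None):
    """Build a search request body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    body = {
        "query": query,
        "from": (page - 1) * page_size,  # number of hits to skip
        "size": page_size,               # number of hits to return
    }
    if sort is not None:
        body["sort"] = sort
    return body


body = search_body({"match_all": {}}, page=2, sort={"title": {"order": "desc"}})
print(body)
```

Page 2 with a page size of 10 gives `"from": 10, "size": 10`, the values used in the request above.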
  76. Executing Searches • Returns all documents in the account type

    within bank index GET /bank/account/_search { "query": { "match_all": {} } } • Returns all documents in the bank index GET /bank/_search { "query": { "match_all": {} } }
  77. Executing Searches • Returns the account numbered 20 GET /bank/account/_search

    { "query": { "match": { "account" : 20 } } } • Returns all accounts containing the term "mill" in the address GET /bank/account/_search { "query": { "match" : { "address" : "mill" } } }
  78. Executing Searches • Returns all accounts containing the term "mill"

    or "lane" in the address GET /bank/account/_search { "query": { "match": { "address": "mill lane" } } } • Returns all accounts containing the phrase "mill lane" in the address GET /bank/account/_search { "query": { "match_phrase": { "address": "mill lane" } } }
  79. Executing Searches • Returns all accounts containing "mill" and "lane"

    in the address GET /bank/account/_search { "query": { "bool": { "must": [ { "match": { "address": "mill" } }, { "match": { "address": "lane" } } ] } } }
  80. Executing Searches • Returns all accounts containing "mill" or "lane"

    in the address GET /bank/account/_search { "query": { "bool": { "should": [ { "match": { "address": "mill" } }, { "match": { "address": "lane" } } ] } } }
  81. Executing Searches • Returns all accounts that contain neither "mill"

    nor "lane" in the address GET /bank/account/_search { "query": { "bool": { "must_not": [ { "match": { "address": "mill" } }, { "match": { "address": "lane" } } ] } } }
  82. Executing Searches • Returns all accounts of anybody who is

    40 years old but doesn’t live in ID(aho) GET /bank/account/_search { "query": { "bool": { "must": [ { "match": { "age": "40" } } ], "must_not": [ { "match": { "state": "ID" } } ] } } }
  83. • Returns all accounts with balances between 20000 and 30000,

    inclusive. GET /bank/account/_search { "query": { "bool": { "must": { "match_all": {} }, "filter": { "range": { "balance": { "gte": 20000, "lte": 30000 } } } } } } Executing Filters
  84. GET /bank/account/_search { "query": { "bool": { "must": [ {

    "match": { "title": "Search" }}, { "match": { "content": "Elasticsearch" }} ], "filter": [ { "term": { "status": "published" }}, { "range": { "publish_date": { "gte": "2015-01-01" }}} ] } } } Query and Filter Context
  85. • Matches all documents, giving them all a _score of

    1.0. GET /bank/account/_search { "query": { "match_all": {} } } • The inverse of the match_all query, which matches no documents. GET /bank/account/_search { "query": { "match_none": {} } } Match All Queries
  86. Full text queries The high-level full text queries understand how

    the field being queried is analyzed and will apply each field's analyzer (or search_analyzer) to the query string before executing. • match • match_phrase • match_phrase_prefix • multi_match • common_terms • query_string • simple_query_string
  87. Match Query GET /_search { "query": { "match" : {

    "message" : "QUICK BROWN FOX" } } } GET /_search { "query": { "match" : { "message" : { "query" : "QUICK BROWN FOX", "operator" : "and" } } } } • minimum_should_match • fuzziness - Levenshtein edit distance: kiuck > qiuck > quick • zero_terms_query
  88. Match Query GET /_search { "query": { "match" : {

    "message" : { "query" : "QUICK BROWN FOX", "operator" : "and" } } } } • minimum_should_match • fuzziness - Levenshtein edit distance: kiuck > qiuck > quick • zero_terms_query
  89. Match Phrase Query Analyzes the text and creates a phrase

    query out of the analyzed text. A phrase query matches terms up to a configurable slop (which defaults to 0) in any order. GET /_search { "query": { "match_phrase" : { "message" : "QUICK BROWN FOX" } } }
  90. Match Phrase Query GET /_search { "query": { "match_phrase" :

    { "message" : { "query" : "BROWN QUICK FOX", "slop" : 10 } } } }
  91. Multi Match Query Match query on multiple fields GET /_search

    { "query" : { "multi_match" : { "query" : "this is a test", "fields" : [ "subject", "message" ] } } }
  92. Multi Match Query GET /_search { "query": { "multi_match" :

    { "query": "brown fox", "type": "best_fields", "fields": [ "subject", "message" ] } } } • best_fields • most_fields • cross_fields • phrase • phrase_prefix
  93. Term level queries The term-level queries operate on the exact

    terms that are stored in the inverted index. Used for structured data like numbers, dates, and enums. • wildcard • regexp • *fuzzy • type • ids • term • terms • range • exists • prefix
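Fuzzy matching rests on edit distance. The chain kiuck > qiuck > quick shown earlier for the match query's fuzziness option implies two edits: one substitution, then one swap of adjacent letters. The sketch below computes that distance under the assumption that an adjacent transposition counts as a single edit (the optimal string alignment variant); `edit_distance` is an illustrative function, not the engine's implementation.

```python
def edit_distance(a, b):
    """Edits needed to turn a into b: insert, delete, substitute, or swap neighbours."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent swap
    return d[-1][-1]


step_one = edit_distance("kiuck", "qiuck")
step_two = edit_distance("qiuck", "quick")
total = edit_distance("kiuck", "quick")
print(step_one, step_two, total)
```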
  94. Term Query Let's create a document for this example (assuming full_text is mapped as text and exact_value as keyword) PUT

    my_index/my_type/1 { "full_text": "Quick Foxes!", "exact_value": "Quick Foxes!" } • full_text - inverted index will contain the terms: [quick, foxes] • exact_value - inverted index will contain the exact term: [Quick Foxes!]
  95. Term Query This query matches because the exact_value field contains

    the exact term Quick Foxes! GET my_index/my_type/_search { "query": { "term": { "exact_value": "Quick Foxes!" } } } • exact_value - inverted index will contain the exact term: [Quick Foxes!]
  96. Term Query This query does not match, because the full_text

    field only contains the terms quick and foxes. It does not contain the exact term Quick Foxes! GET my_index/my_type/_search { "query": { "term": { "full_text": "Quick Foxes!" } } } • full_text - inverted index will contain the terms: [quick, foxes]
  97. Term Query A term query for the term foxes matches

    the full_text field. GET my_index/my_type/_search { "query": { "term": { "full_text": "foxes" } } } • full_text - inverted index will contain the terms: [quick, foxes]
  98. Term Query This match query on the full_text field first

    analyzes the query string, then looks for documents containing quick or foxes or both. GET my_index/my_type/_search { "query": { "match": { "full_text": "Quick Foxes!" } } } • full_text - inverted index will contain the terms: [quick, foxes]
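The four term and match examples above can be replayed against a toy index. The sketch assumes `full_text` is analyzed into lowercase word terms and `exact_value` is stored as a single untouched term; `analyze`, `term_query` and `match_query` are hypothetical helpers, not client library calls.

```python
import re


def analyze(text):
    """Rough stand-in for analysis of a text field: lowercase word terms."""
    return [token.lower() for token in re.findall(r"\w+", text)]


document = {"full_text": "Quick Foxes!", "exact_value": "Quick Foxes!"}

# Terms as they would sit in the inverted index, per field.
indexed_terms = {
    "full_text": set(analyze(document["full_text"])),  # analyzed text
    "exact_value": {document["exact_value"]},          # one exact term
}


def term_query(field, value):
    """Look the value up as-is, with no analysis of the query."""
    return value in indexed_terms[field]


def match_query(field, text):
    """Analyze the query text first, then match any resulting term."""
    return any(term in indexed_terms[field] for term in analyze(text))


results = [
    term_query("exact_value", "Quick Foxes!"),
    term_query("full_text", "Quick Foxes!"),
    term_query("full_text", "foxes"),
    match_query("full_text", "Quick Foxes!"),
]
print(results)
```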
  99. Ranges on date fields GET _search { "query": { "range"

    : { "date" : { "gte" : "now-1d/d", "lt" : "now/d" } } } } • date math • date format • timezone
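With `gte` set to `now-1d/d` and `lt` set to `now/d`, both bounds are rounded down to the start of a day, so the range covers all of yesterday. The sketch below computes the same two bounds with the standard library, assuming UTC; `yesterday_range` is an illustrative helper and the sample timestamp is arbitrary.

```python
from datetime import datetime, timedelta, timezone


def yesterday_range(now):
    """Return (gte, lt): start of yesterday and start of today for the given time."""
    start_of_today = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return start_of_today - timedelta(days=1), start_of_today


now = datetime(2017, 5, 24, 15, 30, tzinfo=timezone.utc)  # arbitrary sample time
gte, lt = yesterday_range(now)
print(gte.isoformat(), lt.isoformat())
```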
  100. Exists Query Returns documents that have at least one non-null

    value in the original field GET /_search { "query": { "exists" : { "field" : "user" } } } These documents would all match the query { "user": "jane" } { "user": "" } { "user": "-" } { "user": ["jane"] } { "user": ["jane", null ] } These documents would not match the query: { "user": null } { "user": [] } { "user": [null] }
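The matching rule behind these examples is short enough to state in code: a field exists when it holds at least one value that is not null. `field_exists` is a hypothetical helper that mirrors the eight sample documents above.

```python
def field_exists(document, field):
    """True if the field holds at least one non-null value."""
    value = document.get(field)
    values = value if isinstance(value, list) else [value]
    return any(item is not None for item in values)


matching = [
    {"user": "jane"},
    {"user": ""},
    {"user": "-"},
    {"user": ["jane"]},
    {"user": ["jane", None]},
]
not_matching = [
    {"user": None},
    {"user": []},
    {"user": [None]},
]
print([field_exists(doc, "user") for doc in matching])
print([field_exists(doc, "user") for doc in not_matching])
```

An empty string counts as a value here, which is why `{ "user": "" }` matches.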
  101. Compound queries Compound queries wrap other compound or leaf queries, either

    to combine their results and scores, to change their behaviour, or to switch from query to filter context. • constant_score • bool • dis_max • function_score • boosting • indices
  102. Bool Query A query that matches documents matching boolean combinations

    of other queries. • must • filter • should • must_not
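How the four clause types combine can be sketched with plain predicates: every `must` and `filter` clause has to match, no `must_not` clause may match, and `should` clauses only raise the score. The score below is a simplified count of matching `must` and `should` clauses, not real relevance scoring, and the sketch does not model the case where `should` is the only clause type present; all names are hypothetical.

```python
def term(field, value):
    """Clause: the field (single value or list) contains the exact value."""
    def clause(doc):
        stored = doc.get(field)
        values = stored if isinstance(stored, list) else [stored]
        return value in values
    return clause


def number_range(field, gte, lte):
    """Clause: the field is a number within [gte, lte]."""
    def clause(doc):
        value = doc.get(field)
        return value is not None and gte <= value <= lte
    return clause


def bool_query(must=(), filter=(), should=(), must_not=()):
    """Return a function giving a toy score, or None when the doc does not match."""
    def evaluate(doc):
        if not all(clause(doc) for clause in list(must) + list(filter)):
            return None
        if any(clause(doc) for clause in must_not):
            return None
        # filter clauses decide matching only; they add nothing to the score
        return sum(1 for clause in list(must) + list(should) if clause(doc))
    return evaluate


query = bool_query(
    must=[term("user", "SHAGABA")],
    filter=[term("tag", "tikal")],
    must_not=[number_range("age", 1, 21)],
    should=[term("tag", "spark"), term("tag", "elasticsearch")],
)
docs = [
    {"user": "SHAGABA", "tag": ["tikal", "spark"], "age": 30},
    {"user": "SHAGABA", "tag": ["tikal"], "age": 18},
    {"user": "SHAGABA", "tag": ["spark"], "age": 30},
]
scores = [query(doc) for doc in docs]
print(scores)
```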
  103. Bool Query POST _search { "query": { "bool" : {

    "must" : { "term" : { "user" : "SHAGABA" } }, "filter": { "term" : { "tag" : "tikal" } }, "must_not" : { "range" : { "age" : { "gte" : 1, "lte" : 21} } }, "should" : [ { "term" : { "tag" : "spark" } }, { "term" : { "tag" : "elasticsearch" } } ] } } }
  104. Summary • Elasticsearch Concepts • Getting started • CRUD •

    Inverted index • Text analysis • Analyzers • Mapping • Query DSL
  105. Links • Elastic • https://www.elastic.co/ • Elasticsearch: The Definitive Guide

    • https://www.elastic.co/guide/en/elasticsearch/guide/current/index.html • Elasticsearch Reference – 5.1 • https://www.elastic.co/guide/en/elasticsearch/reference/5.1/index.html • Elastic video & webinars • https://www.elastic.co/videos