Elasticsearch - Diagram Time!

Elastic Co
June 23, 2017

Presented at Southeast LinuxFest on Friday, June 9

Transcript

  1. Elasticsearch: Diagram time!

    Nik Everett, Software Engineer, 2017-06-09

  2. Elasticsearch is a distributed search and analytics engine.

  3. Indexing a document

    curl \
      -XPUT \
      -H "Content-Type: application/json" \
      localhost:9200/enwiki/doc/1 \
      -d \
      '{
        "title": "Cat",
        "text": "The domestic cat (Latin: Felis catus) is a small, typically furry, carnivorous mammal."
      }'
  4. Indexing a document (CONSOLE syntax)

    PUT /enwiki/doc/1
    {
      "title": "Cat",
      "text": "The domestic cat (Latin: Felis catus) is a small, typically furry, carnivorous mammal.",
      "popularity_score": 2.990724241035e-5
    }
  5. Indexing a document: PUT /enwiki/doc/1

    Diagram: Node 0 holds enwiki 0 and enwiki 1; Node 1 holds enwiki 0; Node 2 holds enwiki 1.
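How the PUT request on this slide could land on one particular shard can be sketched as follows. Elasticsearch chooses the shard by hashing the routing value (the document id by default) modulo the number of primary shards. The real hash is Murmur3, so `zlib.crc32` below is only a stand-in, and `pick_shard`, `nodes_holding`, and the two-shard count are hypothetical names and assumptions read off the diagram.

```python
import zlib

# Shard copies per node, as drawn on the slide.
CLUSTER = {
    "Node 0": ["enwiki 0", "enwiki 1"],
    "Node 1": ["enwiki 0"],
    "Node 2": ["enwiki 1"],
}
NUM_SHARDS = 2  # assumption: the diagram shows only shards 0 and 1


def pick_shard(doc_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a document id to a shard number (stand-in hash, not Murmur3)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def nodes_holding(shard: int) -> list[str]:
    """List the nodes that hold a copy of the given enwiki shard."""
    name = f"enwiki {shard}"
    return [node for node, shards in CLUSTER.items() if name in shards]


shard = pick_shard("1")
print(f"doc 1 -> enwiki {shard}, copies on {nodes_holding(shard)}")
```

Because the shard number depends only on the id and the shard count, any node can work out where a document lives without asking the others.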
  6. Getting a document: GET /enwiki/doc/1

    Diagram: Node 0 holds enwiki 0 and enwiki 1; Node 1 holds enwiki 0; Node 2 holds enwiki 1.
  7. Searching: A simple search

    POST /enwiki/_search
    {
      "query": {
        "match": {
          "text": "small domestic"
        }
      }
    }
  8. Searching: POST /enwiki/_search

    Diagram: Node 0 holds enwiki 0 and enwiki 1; Node 1 holds enwiki 0; Node 2 holds enwiki 1.
  9. Searching: POST /enwiki/_search

    Diagram: Node 0, Node 1, Node 2; Query Phase, then Fetch Phase.
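The two phases named on this slide can be sketched as a scatter-gather: every shard first returns only scores and document ids, the coordinating node keeps the global best hits, and only then are the stored documents loaded. This is a simplified illustration, not Elasticsearch's implementation. The scores, the shard contents, and the function names `query_phase` and `search` are all hypothetical.

```python
import heapq

# Hypothetical per-shard data: doc id -> (score for this query, stored _source).
SHARDS = {
    "enwiki 0": {1: (2.4, {"title": "Cat"}), 4: (0.9, {"title": "Dog"})},
    "enwiki 1": {8: (1.7, {"title": "Bird"}), 9: (0.3, {"title": "Fish"})},
}


def query_phase(shard: dict, size: int) -> list[tuple[float, int]]:
    """Each shard returns only (score, doc id) for its best `size` hits."""
    hits = [(score, doc_id) for doc_id, (score, _source) in shard.items()]
    return heapq.nlargest(size, hits)


def search(size: int) -> list[dict]:
    # Query phase: gather lightweight hits from every shard,
    # then keep the global top `size`.
    candidates = []
    for name, shard in SHARDS.items():
        for score, doc_id in query_phase(shard, size):
            candidates.append((score, doc_id, name))
    winners = heapq.nlargest(size, candidates)
    # Fetch phase: load stored fields only for the winners,
    # from the shard that holds each one.
    return [
        {"_id": doc_id, "_score": score, "_source": SHARDS[name][doc_id][1]}
        for score, doc_id, name in winners
    ]


for hit in search(size=2):
    print(hit)
```

Splitting the work this way means the full `_source` only travels over the network for documents that make the final result page.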
  10. Searching: Query Phase: Finding

    Term          Document Ids
    a             1, 2, 3, 4, 5, 6, 7, 8, 9, 10
    cat           1, 4, 9, 10
    carnivorous   1, 6, 10
    domestic      1, 4, 8, 9
    furry         1, 8, 10

    (Diagram of term letters: c a r t n i … s d)
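The table on this slide is an inverted index, and finding documents comes down to combining its postings lists. A minimal sketch using the slide's own data follows. The helper names `docs_matching_any` and `docs_matching_all` are hypothetical, and real postings are stored in compressed form rather than as Python lists.

```python
# Postings lists copied from the slide: term -> sorted document ids.
INVERTED_INDEX = {
    "a": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "cat": [1, 4, 9, 10],
    "carnivorous": [1, 6, 10],
    "domestic": [1, 4, 8, 9],
    "furry": [1, 8, 10],
}


def docs_matching_any(terms):
    """Union of postings: documents containing at least one term."""
    found = set()
    for term in terms:
        found.update(INVERTED_INDEX.get(term, []))
    return sorted(found)


def docs_matching_all(terms):
    """Intersection of postings: documents containing every term."""
    postings = [set(INVERTED_INDEX.get(term, [])) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []


print(docs_matching_any(["furry", "carnivorous"]))  # [1, 6, 8, 10]
print(docs_matching_all(["domestic", "cat"]))       # [1, 4, 9]
```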
  11. Searching: Fetch Phase: Stored fields

    Docs   Stored Fields
    1-3    id: 1, type: doc, _source: {"title": "Cat", "text": "The domestic cat…"}
           id: 2, type: doc, _source: {"title": "Dog", "text": "The domestic dog…"}
           id: 3, type: doc, _source: {"title": "Bird", "text": "Birds (aves), a subgroup of reptiles…"}
    4-10   …
  12. Searching: Aggregations

    POST /enwiki/_search
    {
      "query": {"match": {"text": "small domestic"}},
      "aggs": {
        "score": {
          "extended_stats": {
            "field": "popularity_score"
          }
        }
      }
    }
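What an `extended_stats` aggregation computes over the matching documents can be sketched in a few lines. The result keys below mirror the main fields of the aggregation's response (`count`, `min`, `max`, `avg`, `sum`, `sum_of_squares`, `variance`, `std_deviation`); the function name and the sample values are hypothetical, and the sketch assumes a non-empty list of values.

```python
import math


def extended_stats(values):
    """Compute the main extended_stats figures for a non-empty list of numbers."""
    count = len(values)
    total = sum(values)
    sum_of_squares = sum(v * v for v in values)
    avg = total / count
    # Population variance; clamp at zero to absorb floating-point error.
    variance = max(sum_of_squares / count - avg * avg, 0.0)
    return {
        "count": count,
        "min": min(values),
        "max": max(values),
        "avg": avg,
        "sum": total,
        "sum_of_squares": sum_of_squares,
        "variance": variance,
        "std_deviation": math.sqrt(variance),
    }


# Hypothetical popularity scores of the documents that matched the query.
stats = extended_stats([1.0, 2.0, 3.0, 4.0])
print(stats)
```

Every figure can be derived from running totals, which is why each shard can compute partial results and the coordinating node can combine them.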
  13. Searching: Doc values

    • stored fields: doc id → field → values
    • doc values: field → doc id → values
    • Go watch:
      • Amusing Algorithms and Details on Data Structures
      • All About Elasticsearch Algorithms and Data Structures
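The two layouts on this slide hold the same data pivoted in opposite directions, which the sketch below illustrates with nested dictionaries. It is only a model of the access pattern, not of the on-disk format. Apart from the Cat score taken from the indexing slide, the scores are made up, and `to_doc_values` is a hypothetical helper.

```python
# Stored fields: doc id -> field -> value (row oriented).
STORED_FIELDS = {
    1: {"title": "Cat", "popularity_score": 2.990724241035e-5},
    2: {"title": "Dog", "popularity_score": 4.1e-5},   # hypothetical score
    3: {"title": "Bird", "popularity_score": 1.2e-5},  # hypothetical score
}


def to_doc_values(stored):
    """Pivot doc id -> field -> value into field -> doc id -> value."""
    columns = {}
    for doc_id, fields in stored.items():
        for field, value in fields.items():
            columns.setdefault(field, {})[doc_id] = value
    return columns


DOC_VALUES = to_doc_values(STORED_FIELDS)

# Fetch-phase style access: everything about one document.
print(STORED_FIELDS[1])
# Aggregation style access: one field across many documents.
print(max(DOC_VALUES["popularity_score"].values()))
```

Returning a hit reads one row; sorting or aggregating reads one column, so each layout serves a different part of the search.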
  14. Searching: Segments

  15. Analysis: Example

    The domestic cat (Latin: Felis catus) is a small, typically furry, carnivorous mammal.
    the domestic cat (latin: felis catus) is a small, typically furry, carnivorous mammal.
    the domestic cat latin felis catus is a small typically furry carnivorous mammal
    the domst cat latin feli catu is a small typic furri carnivor mammal

    POST /enwiki/_search
    { "query": { "term": {"text": "carnivore"} } }

    POST /enwiki/_search
    { "query": { "match": {"text": "carnivore"} } }
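The analysis steps on this slide (lowercase, split into tokens, stem) and the difference between the `term` and `match` queries can be sketched with a toy analyzer. The suffix rules below are invented for illustration and are not the stemmer used in the talk, so the output does not reproduce the slide's stemmed line exactly. All function names are hypothetical.

```python
import re

SUFFIXES = ("ous", "es", "e", "s")  # toy rules, not a real stemmer


def stem(token: str) -> str:
    """Strip the first matching suffix, keeping at least four characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            return token[: -len(suffix)]
    return token


def analyze(text: str) -> list[str]:
    """Lowercase, split on non-alphanumeric characters, then stem each token."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(token) for token in tokens]


TEXT = ("The domestic cat (Latin: Felis catus) is a small, "
        "typically furry, carnivorous mammal.")
indexed_terms = set(analyze(TEXT))


def term_query(term: str) -> bool:
    """Look the term up exactly as given, with no analysis."""
    return term in indexed_terms


def match_query(text: str) -> bool:
    """Analyze the query text first, then look up the resulting terms."""
    return any(term in indexed_terms for term in analyze(text))


print(term_query("carnivore"))   # False
print(match_query("carnivore"))  # True
```

The `term` lookup misses because the index holds the stem `carnivor`, not the literal word; the `match` lookup succeeds because the query text goes through the same analysis as the document.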
  16. Analysis: Boost precise matches

    the domestic cat latin felis catus is a small typically furry carnivorous mammal
    the domst cat latin feli catu is a small typic furri carnivor mammal

    POST /enwiki/_search
    {
      "query": {
        "bool": {
          "should": [
            {"match": {"text.precise": {"query": "carnivore", "boost": 5}}},
            {"match": {"text.stemmed": {"query": "carnivore", "boost": 1}}}
          ]
        }
      }
    }
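The effect of the two boosted `should` clauses can be sketched by scoring each document once against its unstemmed tokens and once against its stemmed tokens, then adding the results. This is a deliberate simplification: each matching clause contributes just its boost, whereas real relevance scoring also weighs term frequency and rarity. The second document, the toy stemmer, and the function names are hypothetical.

```python
import re


def tokens(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_stem(token):
    """Invented suffix stripping, standing in for a real stemmer."""
    for suffix in ("ous", "e"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            return token[: -len(suffix)]
    return token


DOCS = {
    "Cat": "a small, typically furry, carnivorous mammal",
    "Carnivore": "a carnivore is an animal that eats meat",  # hypothetical document
}


def score(text, query, precise_boost=5, stemmed_boost=1):
    """Sum the boosts of the clauses that match (simplified should-clause scoring)."""
    precise = set(tokens(text))                # models text.precise
    stemmed = {toy_stem(t) for t in precise}   # models text.stemmed
    total = 0
    for q in tokens(query):
        if q in precise:
            total += precise_boost
        if toy_stem(q) in stemmed:
            total += stemmed_boost
    return total


for title, text in DOCS.items():
    print(title, score(text, "carnivore"))
```

Both documents match, so neither is lost to stemming, but the one containing the exact word ranks higher.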
  17. Summary

    • Requests bounce from node to node asynchronously
    • Indexes logically hold documents
    • Shards physically hold documents
    • Data is written to disk multiple times in different ways to optimize access patterns
    • Each data structure is immutable and optimized by a background process
    • Analyze text to find things better and faster
  18. Except where otherwise noted, this work is licensed under http://creativecommons.org/licenses/by-nd/4.0/

    Creative Commons and the double C in a circle are registered trademarks of Creative Commons in the United States and other countries. Third party marks and brands are the property of their respective holders.