Slide 1

Slide 1 text

Karel Minařík elasticsearch in 15 minutes

Slide 2

Slide 2 text

Plug & Play

Slide 3

Slide 3 text

Installation $ wget https://download.elasticsearch.org/... $ tar -xf elasticsearch-0.90.2.tar.gz $ ./elasticsearch-0.90.2/bin/elasticsearch -f ... [INFO ][node][Ghost Maker] {0.90.2}[5645]: initializing ...

Slide 4

Slide 4 text

Index a document... $ curl -X PUT localhost:9200/products/product/1 -d '{ "title" : "Welcome!" }'

Slide 5

Slide 5 text

Update a document... $ curl -X PUT localhost:9200/products/product/1 -d '{ "title" : "Welcome to the Elasticsearch meetup!" }'

Slide 6

Slide 6 text

Search for documents.... $ curl -X GET localhost:9200/products/_search?q=welcome

Slide 7

Slide 7 text

Add a node... $ ./elasticsearch-0.90.2/bin/elasticsearch -f -D es.node.name=Node2 ...[cluster.service] [Node2] detected_master [Node1] ...

Slide 8

Slide 8 text

Add another node... $ ./elasticsearch-0.90.2/bin/elasticsearch -f -D es.node.name=Node3 ...[cluster.service] [Node3] detected_master [Node1] ...

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

Until you know what to tweak...

Slide 11

Slide 11 text

Shard & Cluster

Slide 12

Slide 12 text

A curl  -­‐XPUT  'http://localhost:9200/a/'  -­‐d  '{        "settings"  :  {                "index"  :  {                        "number_of_shards"      :  3,                        "number_of_replicas"  :  1                }        } }' Index is partitioned into 3 primary shards, each is duplicated in 1 replica shard A1 A2 A3 Replicas Primaries A1' A2' A3'

Slide 13

Slide 13 text

1 node 2 nodes 3 nodes Demo "index.routing.allocation.exclude.name"      :  "Node1" "cluster.routing.allocation.exclude.name"  :  "Node3" ... http://git.io/elasticat

Slide 14

Slide 14 text

JSON & HTTP

Slide 15

Slide 15 text

{    "id"        :  "abc123",    "title"  :  "A  JSON  Document",    "body"    :  "A  JSON  document  is  a  ...",    "published_on"  :  "2013/06/27  10:00:00",    "featured"          :  true,        "tags"    :  ["search",  "json"],    "author"  :  {        "first_name"  :  "Clara",        "last_name"    :  "Rice",        "email"            :  "[email protected]"    } } Documents as JSON Data structure with basic types, arrays and deep hierarchies

Slide 16

Slide 16 text

Documents as JSON https://wiki.postgresql.org/images/b/b4/Pg-as-nosql-pgday-fosdem-2013.pdf

Slide 17

Slide 17 text

http:// Lingua Franca of APIs Also supported: Native Java protocol, Thrift, Memcached

Slide 18

Slide 18 text

Search & Find

Slide 19

Slide 19 text

Terms apple apple  iphone Phrases "apple  iphone" Proximity "apple  safari"~5 Fuzzy apple~0.8 Wildcards app* *pp* Boosting apple^10  safari Range [2011/05/01  TO  2011/05/31] [java  TO  json] Boolean apple  AND  NOT  iphone +apple  -­‐iphone (apple  OR  iphone)  AND  NOT  review Fields title:iphone^15  OR  body:iphone published_on:[2011/05/01  TO  "2011/05/27  10:00:00"] http://lucene.apache.org/java/3_1_0/queryparsersyntax.html $  curl  -­‐X  GET  "http://localhost:9200/_search?q="

Slide 20

Slide 20 text

curl  -­‐X  GET  localhost:9200/articles/_search  -­‐d  '{ "query" : { "filtered" : { "query" : { "bool" : { "must" : { "match" : { "author.first_name" : { "query" : "claire", "fuzziness" : 0.1 } } }, "must" : { "multi_match" : { "query" : "elasticsearch", "fields" : ["title^10", "body"] } } } }, "filter": { "and" : [ { "terms" : { "tags" : ["search"] } }, { "range" : { "published_on": {"from": "2013"} } }, { "term" : { "featured" : true } } ] } } } }' JSON-based Query DSL

Slide 21

Slide 21 text

curl  -­‐X  GET  localhost:9200/articles/_search  -­‐d  '{ "query" : { "filtered" : { "query" : { "bool" : { "must" : { "match" : { "author.first_name" : { "query" : "claire", "fuzziness" : 0.1 } } }, "must" : { "multi_match" : { "query" : "elasticsearch", "fields" : ["title^10", "body"] } } } }, "filter": { "and" : [ { "terms" : { "tags" : ["search"] } }, { "range" : { "published_on": {"from": "2013"} } }, { "term" : { "featured" : true } } ] } } } }' JSON-based Query DSL

Slide 22

Slide 22 text

curl  -­‐X  GET  localhost:9200/articles/_search  -­‐d  '{ "query" : { "filtered" : { "query" : { "bool" : { "must" : { "match" : { "author.first_name" : { "query" : "claire", "fuzziness" : 0.1 } } }, "must" : { "multi_match" : { "query" : "elasticsearch", "fields" : ["title^10", "body"] } } } }, "filter": { "and" : [ { "terms" : { "tags" : ["search"] } }, { "range" : { "published_on": {"from": "2013"} } }, { "term" : { "featured" : true } } ] } } } }' JSON-based Query DSL

Slide 23

Slide 23 text

curl  -­‐X  GET  localhost:9200/articles/_search  -­‐d  '{ "query" : { "filtered" : { "query" : { "bool" : { "must" : { "match" : { "author.first_name" : { "query" : "claire", "fuzziness" : 0.1 } } }, "must" : { "multi_match" : { "query" : "elasticsearch", "fields" : ["title^10", "body"] } } } }, "filter": { "and" : [ { "terms" : { "tags" : ["search"] } }, { "range" : { "published_on": {"from": "2013"} } }, { "term" : { "featured" : true } } ] } } } }' JSON-based Query DSL

Slide 24

Slide 24 text

curl  -­‐X  GET  localhost:9200/articles/_search  -­‐d  '{ "query" : { "filtered" : { "query" : { "bool" : { "must" : { "match" : { "author.first_name" : { "query" : "claire", "fuzziness" : 0.1 } } }, "must" : { "multi_match" : { "query" : "elasticsearch", "fields" : ["title^10", "body"] } } } }, "filter": { "and" : [ { "terms" : { "tags" : ["search"] } }, { "range" : { "published_on": {"from": "2013"} } }, { "term" : { "featured" : true } } ] } } } }' JSON-based Query DSL

Slide 25

Slide 25 text

“Find all articles with ‘search’ in their title or body, give matches in titles higher score” Full-text Search “Find all articles from year 2013 tagged ‘search’” Structured Search See custom_score and custom_filters_score queries Custom Scoring

Slide 26

Slide 26 text

Fetch document field ➝ Pick configured analyzer ➝ Parse text into tokens ➝ Apply token filters ➝ Store into index How Search Engine Works? Result Results Query How Users See Search?

Slide 27

Slide 27 text

Mapping curl -X PUT localhost:9200/articles/_mapping -d '{ "article" : { "properties" : { "title" : { "type" : "string", "analyzer" : "czech" } } } }' Configuring document properties for the search engine

Slide 28

Slide 28 text

_analyze?pretty&format=text&text=Žluťoučký+kůň+skákal+přes+potok The _analyze API [žluťoučký:0-­‐>9:] \n\n2:  \n[kůň:10-­‐ >13:]\n\n3:   \n[skákal:14-­‐>20:] \n\n4:  \n[přes:21-­‐ >25:]\n\n5:   \n[potok:26-­‐>31:] _analyze?pretty&format=text&text=Žluťoučký+kůň+skákal+přes +potok&analyzer=czech [žluťoučk:0-­‐>9:]\n \n2:  \n[koň:10-­‐ >13:]\n\n3:   \n[skákal:14-­‐>20:] \n\n5:  \n[potok:26-­‐ >31:]\n _analyze?text=...&tokenizer=X&filters=A,B,C

Slide 29

Slide 29 text

Slice & Dice

Slide 30

Slide 30 text

Query Facets

Slide 31

Slide 31 text

Location Product Tim e OLAP Cube Dimensions, measures, aggregations

Slide 32

Slide 32 text

Slice Dice Drill Down / Roll Up Show me sales numbers for all products across all locations in year 2013 Show me product A sales numbers across all locations over all years Show me products sales numbers in location X over all years

Slide 33

Slide 33 text

curl -X POST 'localhost:9200/articles/_search?search_type=count&pretty' -d '{ "facets": { "tag-cloug": { "terms" : { "field" : "tags" } } } }' “Tag Cloud” With the terms Facet "facets"  :  {        "tag-­‐cloug"  :  {            "terms"  :  [  {                "term"  :  "ruby",                "count"  :  3            },  {                "term"  :  "java",                "count"  :  2            },            ...            }  ]        }    } Simplest “map/reduce” aggregation: document count per tag

Slide 34

Slide 34 text

curl -X GET 'localhost:9200/scores/_search/?search_type=count&pretty' -d '{ "facets": { "scores-per-subject" : { "terms_stats" : { "key_field" : "subject", "value_field" : "score" } } } }' Statistics on Student Scores With the terms_stats Facet "facets"  :  {        "scores-­‐per-­‐subject"  :  {            "_type"  :  "terms_stats",            "missing"  :  0,            "terms"  :  [  {                "term"  :  "math",                "count"  :  4,                "total_count"  :  4,                "min"  :  25.0,                "max"  :  92.0,                "total"  :  267.0,                "mean"  :  66.75            },  ...  ]        }    } Aggregating statistics per subject

Slide 35

Slide 35 text

curl -X GET 'localhost:9200/demo-scores/_search/?search_type=count&pretty' '{ "query" : { "match" : { "student" : "john" } }, "facets": { "scores-per-subject" : { "terms_stats" : { "key_field" : "subject", "value_field" : "score" } } } }' Statistics on Student Scores With the terms_stats Facet "facets"  :  {        "scores-­‐per-­‐subject"  :  {            "_type"  :  "terms_stats",            "missing"  :  0,            "terms"  :  [    {                "term"  :  "math",                "count"  :  1,                "total_count"  :  1,                "min"  :  85.0,                "max"  :  85.0,                "total"  :  85.0,                "mean"  :  85.0            },  ...  ]        }    } Realtime filtering with queries and filters

Slide 36

Slide 36 text

Facets Terms Terms Stats Statistical Range Histogram Date Histogram Filter Query Geo Distance

Slide 37

Slide 37 text

Above & Beyond

Slide 38

Slide 38 text

Above & Beyond Bulk operations (For indexing and search operations) Percolator (“reversed search” — alerts, classification, …) Suggesters (“Did you mean …?”) Index aliases (Grouping or “renaming” of indices) Index templates (Automatic index configuration) Monitoring API (Amount of memory used, number of operations, …) …

Slide 39

Slide 39 text

thanks!