
Elasticsearch workshop

David Pilato
October 04, 2013

Elasticsearch basic workshop: discover elasticsearch



Transcript

  1. Elasticsearch - The Company
    • Founded in 2012
    • By the people behind the elasticsearch project
    • http://www.elasticsearch.com
    • Professional services
    • Training (public & onsite)
    • Development support
    • Production support subscription
      • targeting production
      • 3 levels of SLAs
      • differing in response times and availability

    Sunday, October 6, 2013
  2. Agenda
    • Elasticsearch overview
    • Workshop 0: getting started
    • Workshop 1: let’s index some documents
    • Workshop 2: let’s search them
    • Workshop 3: let’s pull some analytics
    • Workshop 4: let’s add a powerful live UI on top
  3. A search engine
    • A search engine:
      • Create indexes from documents
      • Search in indexes
  4. Elasticsearch
    • Cloud-based search engine
    • Based on Lucene
    • Hides Lucene complexity by exposing all services
      • HTTP / REST / JSON
    • Works with all technologies
    • Horizontal scaling, replication, failover
    • Blazing fast!
    • It’s a search engine! Not a search tool in a box!
  5. Think document!
    • Change your mindset:
      • Forget SQL!
      • Index what you want to find
    • A document:
      • A JSON Object
      • Core field types (Strings, Numbers, Booleans)
      • Complex field types (Arrays, Objects)
      • Additional field types (GeoPoints, Binaries, Attachments)
  6. Organize your documents!
    • Document coordinates:
      • index (holds setup)
      • type (holds mapping)
      • id (can be auto-generated)

        {
          "text" : "Welcome the the #elasticsearch #workshop",
          "created_at": "2012-04-06T20:45:36.000Z",
          "truncated": false,
          "retweet_count": 34,
          "hashtag": [
            { "text": "elasticsearch", "start": 27, "end": 40 },
            { "text": "workshop", "start": 47, "end": 55 }
          ],
          "user": {
            "id": 51172224,
            "name": "David Pilato",
            "screen_name": "dadoonet"
          }
        }
  7. Glossary
    • Node
      • a running Elasticsearch instance (JVM process)
    • Cluster
      • a group of nodes
    • Shard
      • a part of an index
      • a Lucene index under the hood
      • primary: unique in the cluster
      • replica: one or more copies of the primary
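
To make the primary/replica vocabulary concrete, here is a small Python sketch that counts how many shard copies (each one a Lucene index) a single Elasticsearch index occupies across the cluster. The function name `total_shards` and the example figures of 5 primaries and 1 replica are illustrative assumptions, not values taken from the slides.

```python
def total_shards(primaries: int, replicas_per_primary: int) -> int:
    """Return the number of shard copies the cluster holds for one index.

    Each primary exists exactly once; every replica is a full copy of a primary.
    """
    if primaries < 1 or replicas_per_primary < 0:
        raise ValueError("need at least one primary and a non-negative replica count")
    return primaries * (1 + replicas_per_primary)


if __name__ == "__main__":
    # Hypothetical index: 5 primary shards, each with 1 replica.
    print(total_shards(5, 1))
```

With more than one node running, those copies can be spread over the nodes, which is what the head plugin visualizes later in the workshop.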
  8. Workshop 0: setup
    • Get elasticsearch 0.90.5
        curl -OL -k http://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.5.zip
    • Edit config/elasticsearch.yml
        cluster.name: handson
        discovery.zen.ping.multicast.enabled: false
        discovery.zen.ping.unicast.hosts: ["localhost"]
    • Install head plugin
        bin/plugin -install mobz/elasticsearch-head
  9. Workshop 0: play with nodes
    • Start an elasticsearch node
        bin/elasticsearch -f
    • Open head
        open http://localhost:9200/_plugin/head/
  10. Indexing a document
        curl -XPOST localhost:9200/person/person -d '{
          "name":"Anaelle Alessio",
          "dateOfBirth":"2009-09-05",
          "gender":"female",
          "marketing":{ "shoes":1000, "fashion":1200, "music":800 },
          "address":{ "country":"England", "zipcode":"5226", "city":"Plymouth", "countrycode":"GB" }
        }'

        {
          "ok":true,
          "_index":"person",
          "_type":"person",
          "_id":"zvb2udm2TSa8Zyp9LnD1nA",
          "_version":1
        }
  11. Getting a document
        curl localhost:9200/person/person/zvb2udm2TSa8Zyp9LnD1nA?pretty

        {
          "_index":"person",
          "_type":"person",
          "_id":"zvb2udm2TSa8Zyp9LnD1nA",
          "_version":1,
          "exists":true,
          "_source":{
            "name":"Anaelle Alessio",
            "dateOfBirth":"2009-09-05",
            "gender":"female",
            "marketing":{ "shoes":1000, "fashion":1200, "music":800 },
            "address":{ "country":"England", "zipcode":"5226", "city":"Plymouth", "countrycode":"GB" }
          }
        }
  12. Updating a document
        curl -XPUT localhost:9200/person/person/zvb2udm2TSa8Zyp9LnD1nA -d '{
          "name":"Anaelle Alessio",
          "dateOfBirth":"2009-09-05",
          "gender":"female",
          "marketing":{ "shoes":1001, "fashion":1200, "music":800 },
          "address":{ "country":"England", "zipcode":"5226", "city":"Plymouth", "countrycode":"GB" }
        }'

        {
          "ok":true,
          "_index":"person",
          "_type":"person",
          "_id":"zvb2udm2TSa8Zyp9LnD1nA",
          "_version":2
        }
  13. Deleting a document
        curl -XDELETE localhost:9200/person/person/zvb2udm2TSa8Zyp9LnD1nA

        {
          "ok":true,
          "found" : true,
          "_index":"person",
          "_type":"person",
          "_id":"zvb2udm2TSa8Zyp9LnD1nA",
          "_version":3
        }
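
The index, update and delete responses above report `_version` 1, 2 and 3 for the same id. The toy in-memory store below mimics only that counter so the sequence can be replayed without a server. `ToyVersionedStore` is a hypothetical class written for this illustration; it is not how Elasticsearch is implemented, and its answer for deleting a missing document is a simplification.

```python
class ToyVersionedStore:
    """In-memory stand-in for the per-document version counter."""

    def __init__(self):
        self._docs = {}      # id -> current source
        self._versions = {}  # id -> last version number, kept after deletes

    def put(self, doc_id, source):
        """Create or replace a document; every write bumps the version."""
        version = self._versions.get(doc_id, 0) + 1
        self._versions[doc_id] = version
        self._docs[doc_id] = source
        return {"ok": True, "_id": doc_id, "_version": version}

    def get(self, doc_id):
        """Return the stored source, or exists=False when it is absent."""
        if doc_id not in self._docs:
            return {"_id": doc_id, "exists": False}
        return {
            "_id": doc_id,
            "_version": self._versions[doc_id],
            "exists": True,
            "_source": self._docs[doc_id],
        }

    def delete(self, doc_id):
        """Remove a document; a successful delete also bumps the version."""
        if doc_id not in self._docs:
            return {"found": False, "_id": doc_id}
        del self._docs[doc_id]
        self._versions[doc_id] += 1
        return {"ok": True, "found": True, "_id": doc_id,
                "_version": self._versions[doc_id]}
```

Replaying the slides with this toy (one `put`, a second `put` with `"shoes": 1001`, then `delete`) yields versions 1, 2 and 3 in that order.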
  14. Workshop 1: Index some persons
        curl -XPUT localhost:9200/person/person/1 -d '{
          "name":"Anaelle Alessio"
        }'

        curl -XPUT localhost:9200/person/person/1 -d '{
          "name":"Anaelle Alessio",
          "dateOfBirth":"2009-09-05"
        }'

        curl -XPUT localhost:9200/person/person/2 -d '{
          "name":"Joe Smith"
        }'

        curl -XPUT localhost:9200/person/person/2 -d '{
          "name":"Joe Smith",
          "gender":"male"
        }'
  15. Workshop 1: 500 000 persons
    • Use injector script
        java -jar injector.jar 500000 10000
    • See effect in head plugin
        open http://localhost:9200/_plugin/head/
    • You can start more than one node
        bin/elasticsearch -f
        bin/elasticsearch -f
        bin/elasticsearch -f
        ...
  16. Search types
    Name              Description
    Match All         Get all elements (useful when combined with filters)
    QueryString       Full text search (analyzed). Wildcards allowed (Lucene syntax: +, -, FROM, TO, ^)
    Term              Search for a term within a field (not analyzed)
    Match             Search for text within a field (analyzed) (OR search by default)
    Wildcard          Search with wildcards (*, ?)
    Bool              Multi-criteria search (MUST, MUST NOT, SHOULD)
    Range             Range search (>, >=, <, <=)
    Prefix            "Begins with" search (more efficient than wildcard*)
    Filtered          Apply filters on queries (filters are cached!)
    Fuzzy like this   Approximate matching (think misspelling)
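
The "analyzed" versus "not analyzed" labels in the table are easier to grasp with a toy inverted index. The sketch below assumes an analyzer that splits text into words and lowercases them, which is roughly what a default text field does; `analyze` and `ToyIndex` are hypothetical helpers written for this illustration, not Elasticsearch or Lucene APIs.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer: split on non-word characters and lowercase each token."""
    return [token.lower() for token in re.findall(r"\w+", text)]


class ToyIndex:
    def __init__(self):
        # term -> ids of the documents whose analyzed text contains it
        self._postings = defaultdict(set)

    def index(self, doc_id, text):
        """Analyze the text at index time and record each resulting term."""
        for term in analyze(text):
            self._postings[term].add(doc_id)

    def term_query(self, value):
        """Look the value up verbatim, with no analysis of the query."""
        return set(self._postings.get(value, set()))

    def match_query(self, text):
        """Analyze the query text first, then OR the resulting terms."""
        hits = set()
        for term in analyze(text):
            hits |= self._postings.get(term, set())
        return hits


if __name__ == "__main__":
    idx = ToyIndex()
    idx.index(1, "Germany")
    print(idx.term_query("Germany"), idx.term_query("germany"), idx.match_query("Germany"))
```

Under that assumption, a term lookup for `Germany` finds nothing because only the lowercased term `germany` was stored, while a match lookup analyzes the query text the same way the document was analyzed and finds it.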
  17. Searching persons in Germany
        curl localhost:9200/person/person/_search?pretty -d '{
          "query": {
            "term": {
              "address.country": { "value": "Germany" }
            }
          }
        }'

        {
          "took" : 3,
          "hits" : {
            "total" : 0,
            "max_score" : null,
            "hits" : [ ]
          }
        }
  18. Searching persons in germany
        curl localhost:9200/person/person/_search?pretty -d '{
          "query": {
            "term": {
              "address.country": { "value": "germany" }
            }
          }
        }'

        {
          "took" : 4,
          "hits" : {
            "total" : 30004,
            "max_score" : 2.100946,
            "hits" : [ {
              "_index" : "person",
              "_type" : "person",
              "_id" : "SUy7Py3zSvqhjQroJPVFCw",
              "_score" : 2.100946,
              "_source" : {"name":"Fadi Norah", "address": {"country":"Germany"}}
            }, ... ]
          }
        }
  19. Searching persons in Germany
        curl localhost:9200/person/person/_search?pretty -d '{
          "query": {
            "match": { "address.country": "Germany" }
          }
        }'

        {
          "took" : 4,
          "hits" : {
            "total" : 30004,
            "max_score" : 2.100946,
            "hits" : [ {
              "_index" : "person",
              "_type" : "person",
              "_id" : "SUy7Py3zSvqhjQroJPVFCw",
              "_score" : 2.100946,
              "_source" : {"name":"Fadi Norah", "address": {"country":"Germany"}}
            }, ... ]
          }
        }
  20. Workshop 2: Search for persons
    • Born in 1970 and living in Germany
        curl localhost:9200/person/person/_search?pretty -d '{
          "query": {
            "bool": {
              "must": [
                { "match": { "address.country": "Germany" } },
                { "range": { "dateOfBirth": { "from": "1970", "to": "1971" } } }
              ]
            }
          }
        }'
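
The Workshop 2 query can be generated for any birth year and country. The Python sketch below only builds the JSON body, mirroring the `from`/`to` bounds used on the slide; the function name `born_in_year_and_country` is an assumption made for this example, and whether the upper bound is treated as inclusive is left to the server's range settings.

```python
import json


def born_in_year_and_country(year, country):
    """Build a bool query body: the country must match AND the birth date
    must fall in the range from `year` to `year + 1`."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"address.country": country}},
                    {"range": {"dateOfBirth": {"from": str(year), "to": str(year + 1)}}},
                ]
            }
        }
    }


if __name__ == "__main__":
    # Print a body that can be passed to curl with -d.
    print(json.dumps(born_in_year_and_country(1970, "Germany"), indent=2))
```

Both clauses sit under `must`, so a person has to satisfy the country match and the date range at the same time.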
  21. Break by country
        curl "localhost:9200/person/person/_search?search_type=count&pretty" -d '{
          "facets": {
            "by_country": {
              "terms": { "field": "address.country" }
            }
          }
        }'

        { ...,
          "facets" : {
            "by_country" : {
              "_type" : "terms",
              "missing" : 0,
              "total" : 90001,
              "other" : 0,
              "terms" : [
                { "term" : "england", "count" : 30051 },
                { "term" : "germany", "count" : 30004 },
                { "term" : "france", "count" : 15034 },
                { "term" : "spain", "count" : 14912 }
              ]}}}

    [Pie chart: england 33%, germany 33%, france 17%, spain 17%]
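
The percentages in the pie chart follow directly from the facet counts. A minimal Python sketch, assuming the `terms` array has already been parsed from the response (the helper name `facet_percentages` is introduced here for illustration):

```python
def facet_percentages(terms):
    """Convert terms-facet entries into whole-number percentages per term."""
    total = sum(entry["count"] for entry in terms)
    if total == 0:
        return {entry["term"]: 0 for entry in terms}
    return {entry["term"]: round(100 * entry["count"] / total) for entry in terms}


if __name__ == "__main__":
    # Counts copied from the facet response on the slide.
    terms = [
        {"term": "england", "count": 30051},
        {"term": "germany", "count": 30004},
        {"term": "france", "count": 15034},
        {"term": "spain", "count": 14912},
    ]
    print(facet_percentages(terms))
```

The four counts add up to the facet's `total` of 90001, so no documents fall into `other` or `missing`.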
  22. Date of birth histogram
        curl "localhost:9200/person/person/_search?search_type=count&pretty" -d '{
          "facets": {
            "by_date": {
              "date_histogram": { "field": "dateOfBirth", "interval": "3650d" }
            }
          }
        }'

        { ...,
          "facets": {
            "by_date": {
              "_type": "date_histogram",
              "entries": [
                { "time": -946080000000, "count": 39 },
                { "time": -630720000000, "count": 12677 },
                { "time": -315360000000, "count": 12936 },
                ...
              ]
            }
          }}

    [Bar chart: number of persons (0 to 30000) by date of birth (1940 to 2000)]
  23. Workshop 3: Search for persons
    • Born in 1970 and living in Germany, with a breakdown by gender
        curl localhost:9200/person/person/_search?pretty -d '{
          "query": {
            "bool": {
              "must": [
                { "match": { "address.country": "Germany"} },
                { "range": { "dateOfBirth": { "from": "1970", "to": "1971" }}}
              ]}},
          "facets": {
            "by_gender": {
              "terms": { "field": "gender" }
            }
          }
        }'
  24. Workshop 4: setup
    • Get kibana
        bin/plugin -install elasticsearch/kibana
        # or
        curl -OL -k http://download.elasticsearch.org/kibana/kibana/kibana-latest.zip
    • Open kibana
        open http://localhost:9200/_plugin/kibana/
    • Build your dashboard as you need!