Elasticsearch workshop

David Pilato
October 04, 2013

Elasticsearch basic workshop: discover elasticsearch

Transcript

  1. Elasticsearch
    Workshop
    Sunday 6 October 2013

  2. Elasticsearch - The Company
    • Founded in 2012
    • By the people behind the elasticsearch project
    • http://www.elasticsearch.com
    • Professional services
    • Training (public & onsite)
    • Development support
    • Production support subscription
    • targeting production
    • 3 levels of SLAs
    • differing in response times and availability

  3. Agenda
    • Elasticsearch overview
    • Workshop 0: getting started
    • Workshop 1: let’s index some documents
    • Workshop 2: let’s search them
    • Workshop 3: let’s pull some analytics
    • Workshop 4: let’s add a powerful live UI on top

  4. elasticsearch
    fundamentals

  5. A search engine
    • A search engine:
    • Create indexes from documents
    • Search in indexes

  6. Elasticsearch
    • Cloud based search engine
    • Based on Lucene
    • Hides Lucene complexity by exposing all services over HTTP / REST / JSON
    • Works with all technologies
    • Horizontal scaling, replication, failover
    • Blazing fast!
    • It’s a search engine! Not a search tool in a box!

  7. Think document!
    • Change your mindset:
    • Forget SQL!
    • Index what you want to find
    • A document:
    • A JSON Object
    • Core field types (Strings, Numbers, Booleans)
    • Complex field types (Arrays, Objects)
    • Additional field types (GeoPoints, Binaries, Attachments)

  8. Organize your documents!
    • Document coordinates:
    • index (holds setup)
    • type (holds mapping)
    • id (can be auto-generated)
    {
      "text": "Welcome to the #elasticsearch #workshop",
      "created_at": "2012-04-06T20:45:36.000Z",
      "truncated": false,
      "retweet_count": 34,
      "hashtag": [
        { "text": "elasticsearch", "start": 27, "end": 40 },
        { "text": "workshop", "start": 47, "end": 55 }
      ],
      "user": {
        "id": 51172224,
        "name": "David Pilato",
        "screen_name": "dadoonet"
      }
    }
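
The three coordinates map directly onto the REST path used in the later workshops (`/index/type/id`). A minimal sketch of that mapping, assuming a hypothetical helper named `document_path` (it is not part of any Elasticsearch client):

```python
from typing import Optional
from urllib.parse import quote


def document_path(index: str, doc_type: str, doc_id: Optional[str] = None) -> str:
    """Return the REST path for a document, or for its type when no id is given."""
    parts = [index, doc_type]
    if doc_id is not None:
        parts.append(doc_id)
    # Percent-encode each coordinate so unusual ids cannot break the path.
    return "/" + "/".join(quote(part, safe="") for part in parts)


print(document_path("person", "person", "1"))  # /person/person/1
print(document_path("person", "person"))       # /person/person
```

Leaving the id out gives the path to POST to when the id should be auto-generated.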

  9. Glossary
    • Node
    • a running Elasticsearch instance (JVM process)
    • Cluster
    • a group of nodes
    • Shard
    • a part of an index
    • a Lucene index under the hood
    • primary: unique in the cluster
    • replica: one or more copies of the primary

  10. Workshop 0
    setup

  11. Workshop 0: setup
    • Get elasticsearch 0.90.5
    curl -OL -k http://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.5.zip
    • Edit config/elasticsearch.yml
    cluster.name: handson
    discovery.zen.ping.multicast.enabled: false
    discovery.zen.ping.unicast.hosts: ["localhost"]
    • Install head plugin
    bin/plugin -install mobz/elasticsearch-head

  12. Workshop 0: play with nodes
    • Start an elasticsearch node
    bin/elasticsearch -f
    • Open head
    open http://localhost:9200/_plugin/head/

  13. Workshop 1
    We index persons

  14. Indexing a document
    curl -XPOST localhost:9200/person/person -d '{
      "name":"Anaelle Alessio",
      "dateOfBirth":"2009-09-05",
      "gender":"female",
      "marketing":{
        "shoes":1000,
        "fashion":1200,
        "music":800
      },
      "address":{
        "country":"England",
        "zipcode":"5226",
        "city":"Plymouth",
        "countrycode":"GB"
      }
    }'
    {
      "ok":true,
      "_index":"person",
      "_type":"person",
      "_id":"zvb2udm2TSa8Zyp9LnD1nA",
      "_version":1
    }
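
The same request can be prepared from a script. The sketch below only builds the HTTP request and does not send it; `build_index_request` and `BASE_URL` are hypothetical names, and the node address is assumed to be the local one used in the workshop.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:9200"  # assumed local node, as in the workshop


def build_index_request(index: str, doc_type: str, document: dict,
                        doc_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) the HTTP request that indexes `document`."""
    url = f"{BASE_URL}/{index}/{doc_type}"
    if doc_id is None:
        method = "POST"  # no id given: the server generates one
    else:
        url += f"/{doc_id}"
        method = "PUT"   # explicit id: create or replace that document
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method=method,
        headers={"Content-Type": "application/json"},
    )


person = {"name": "Anaelle Alessio", "dateOfBirth": "2009-09-05", "gender": "female"}
request = build_index_request("person", "person", person)
print(request.get_method(), request.full_url)

# To actually send it against a running node:
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode("utf-8"))
```

Passing `doc_id` switches the request to a PUT on `/person/person/<id>`, which is the form used on the "Updating a document" slide.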

  15. Getting a document
    curl "localhost:9200/person/person/zvb2udm2TSa8Zyp9LnD1nA?pretty"
    {
      "_index":"person",
      "_type":"person",
      "_id":"zvb2udm2TSa8Zyp9LnD1nA",
      "_version":1,
      "exists":true,
      "_source":{
        "name":"Anaelle Alessio",
        "dateOfBirth":"2009-09-05",
        "gender":"female",
        "marketing":{
          "shoes":1000,
          "fashion":1200,
          "music":800
        },
        "address":{
          "country":"England",
          "zipcode":"5226",
          "city":"Plymouth",
          "countrycode":"GB"
        }
      }
    }

  16. Updating a document
    curl -XPUT localhost:9200/person/person/zvb2udm2TSa8Zyp9LnD1nA -d '{
      "name":"Anaelle Alessio",
      "dateOfBirth":"2009-09-05",
      "gender":"female",
      "marketing":{
        "shoes":1001,
        "fashion":1200,
        "music":800
      },
      "address":{
        "country":"England",
        "zipcode":"5226",
        "city":"Plymouth",
        "countrycode":"GB"
      }
    }'
    {
      "ok":true,
      "_index":"person",
      "_type":"person",
      "_id":"zvb2udm2TSa8Zyp9LnD1nA",
      "_version":2
    }

  17. Deleting a document
    curl -XDELETE localhost:9200/person/person/zvb2udm2TSa8Zyp9LnD1nA
    {
      "ok":true,
      "found":true,
      "_index":"person",
      "_type":"person",
      "_id":"zvb2udm2TSa8Zyp9LnD1nA",
      "_version":3
    }
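
Across the index, update and delete responses the `_version` field goes 1, 2, 3: every write to the same id bumps the counter, and an update replaces the whole document. The toy model below only mirrors those version numbers; `ToyStore` is a hypothetical in-memory stand-in and says nothing about how Elasticsearch stores documents.

```python
class ToyStore:
    """In-memory stand-in that mimics the version counter seen in the responses."""

    def __init__(self):
        self._docs = {}      # id -> current source (absent once deleted)
        self._versions = {}  # id -> last version number, kept across deletes

    def index(self, doc_id, source):
        """Create or fully replace a document and bump its version."""
        version = self._versions.get(doc_id, 0) + 1
        self._versions[doc_id] = version
        self._docs[doc_id] = dict(source)  # the whole document is replaced
        return {"_id": doc_id, "_version": version}

    def delete(self, doc_id):
        """Remove a document; the delete itself also counts as a new version."""
        found = doc_id in self._docs
        self._docs.pop(doc_id, None)
        version = self._versions.get(doc_id, 0) + 1
        self._versions[doc_id] = version
        return {"found": found, "_id": doc_id, "_version": version}


store = ToyStore()
print(store.index("zvb2", {"marketing": {"shoes": 1000}}))  # _version 1
print(store.index("zvb2", {"marketing": {"shoes": 1001}}))  # _version 2
print(store.delete("zvb2"))                                 # _version 3
```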

  18. Workshop 1: Index some persons
    curl -XPUT localhost:9200/person/person/1 -d '{
      "name":"Anaelle Alessio"
    }'
    curl -XPUT localhost:9200/person/person/1 -d '{
      "name":"Anaelle Alessio",
      "dateOfBirth":"2009-09-05"
    }'
    curl -XPUT localhost:9200/person/person/2 -d '{
      "name":"Joe Smith"
    }'
    curl -XPUT localhost:9200/person/person/2 -d '{
      "name":"Joe Smith",
      "gender":"male"
    }'

  19. Workshop 1: 500 000 persons
    • Use injector script
    java -jar injector.jar 500000 10000
    • See effect in head plugin
    open http://localhost:9200/_plugin/head/
    • You can start more than one node
    bin/elasticsearch -f
    bin/elasticsearch -f
    bin/elasticsearch -f
    ...

  20. Workshop 2
    We search for persons

  21. Search types
    Name             Description
    Match All        Get all elements (useful when combined with filters)
    QueryString      Full text search (analyzed). Wildcards allowed (Lucene syntax: +, -, FROM, TO, ^)
    Term             Search for a term within a field (not analyzed)
    Match            Search for text within a field (analyzed) (OR search by default)
    Wildcard         Search with wildcards (*, ?)
    Bool             Multi-criteria search (MUST, MUST NOT, SHOULD)
    Range            Range search (>, >=, <, <=)
    Prefix           "Begins with" search (more efficient than wildcard*)
    Filtered         Apply filters on queries (filters are cached!)
    Fuzzy like this  Approximate matching (think misspelling)

  22. Searching persons in Germany
    curl "localhost:9200/person/person/_search?pretty" -d '{
      "query": {
        "term": {
          "address.country": {
            "value": "Germany"
          }
        }
      }
    }'
    {
      "took" : 3,
      "hits" : {
        "total" : 0,
        "max_score" : null,
        "hits" : [ ]
      }
    }

  23. Searching persons in germany
    curl "localhost:9200/person/person/_search?pretty" -d '{
      "query": {
        "term": {
          "address.country": {
            "value": "germany"
          }
        }
      }
    }'
    {
      "took" : 4,
      "hits" : {
        "total" : 30004,
        "max_score" : 2.100946,
        "hits" : [ {
          "_index" : "person",
          "_type" : "person",
          "_id" : "SUy7Py3zSvqhjQroJPVFCw",
          "_score" : 2.100946, "_source" : {"name":"Fadi Norah", "address":{"country":"Germany"}}}, ... ]
      }
    }

  24. Searching persons in Germany
    curl "localhost:9200/person/person/_search?pretty" -d '{
      "query": {
        "match": {
          "address.country": "Germany"
        }
      }
    }'
    {
      "took" : 4,
      "hits" : {
        "total" : 30004,
        "max_score" : 2.100946,
        "hits" : [ {
          "_index" : "person",
          "_type" : "person",
          "_id" : "SUy7Py3zSvqhjQroJPVFCw",
          "_score" : 2.100946, "_source" : {"name":"Fadi Norah", "address":{"country":"Germany"}}}, ... ]
      }
    }
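
The three searches above differ only in whether the query text is analyzed: the term query for "Germany" finds nothing, while the term query for "germany" and the match query for "Germany" both find the same 30004 persons. A toy sketch of that behaviour, assuming the field was indexed with a lowercasing analyzer; `analyze` and `ToyIndex` are hypothetical helpers written for illustration, not Elasticsearch code.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer: split on non-word characters and lowercase each token."""
    return [token.lower() for token in re.findall(r"\w+", text)]


class ToyIndex:
    """Tiny inverted index used to contrast term and match lookups."""

    def __init__(self):
        self._postings = defaultdict(set)  # token -> ids of matching documents

    def add(self, doc_id, text):
        for token in analyze(text):  # field values are analyzed when indexed
            self._postings[token].add(doc_id)

    def term(self, value):
        """Term lookup: the value is compared to the stored tokens as-is."""
        return set(self._postings.get(value, set()))

    def match(self, text):
        """Match lookup: the text is analyzed first, then its tokens are OR-ed."""
        hits = set()
        for token in analyze(text):
            hits |= self._postings.get(token, set())
        return hits


index = ToyIndex()
index.add("p1", "Germany")
index.add("p2", "England")
print(index.term("Germany"))   # set(): the stored token is "germany"
print(index.term("germany"))   # {'p1'}
print(index.match("Germany"))  # {'p1'}
```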

  25. Workshop 2: Search for persons
    • Born in 1970 and living in Germany
    curl "localhost:9200/person/person/_search?pretty" -d '{
      "query": {
        "bool": {
          "must": [ {
              "match": {
                "address.country": "Germany"
              }
            }, {
              "range": {
                "dateOfBirth": {
                  "from": "1970",
                  "to": "1971"
                }
              }
            }
          ]
        }
      }
    }'
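
When the year or country changes often, the request body can be generated rather than edited by hand. A minimal sketch that reproduces the bool query above; `born_in_year_and_country` is a hypothetical helper name, and the body is only printed, not sent.

```python
import json


def born_in_year_and_country(year: int, country: str) -> dict:
    """Build the bool query body: match on country AND range on dateOfBirth."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match": {"address.country": country}},
                    {"range": {"dateOfBirth": {"from": str(year),
                                               "to": str(year + 1)}}},
                ]
            }
        }
    }


body = born_in_year_and_country(1970, "Germany")
print(json.dumps(body, indent=2))
```

The printed JSON can be passed to `curl -d` exactly like the hand-written body.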

  26. Workshop 3
    Make sense of our data: facets!

  27. Break by country
    curl "localhost:9200/person/person/_search?search_type=count&pretty" -d '{
      "facets": {
        "by_country": {
          "terms": {
            "field": "address.country"
          }
        }
      }
    }'
    { ..., "facets" : {
      "by_country" : {
        "_type" : "terms",
        "missing" : 0,
        "total" : 90001,
        "other" : 0,
        "terms" : [ {
          "term" : "england",
          "count" : 30051
        }, {
          "term" : "germany",
          "count" : 30004
        }, {
          "term" : "france",
          "count" : 15034
        }, {
          "term" : "spain",
          "count" : 14912
        } ]}}}
    [Pie chart: england 33%, germany 33%, france 17%, spain 17%]
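
The chart percentages follow from the facet counts divided by `total`. A short sketch of that arithmetic, using the numbers from the response above; `percentages` is a hypothetical helper name.

```python
# Counts copied from the "by_country" facet response above.
terms = [
    {"term": "england", "count": 30051},
    {"term": "germany", "count": 30004},
    {"term": "france", "count": 15034},
    {"term": "spain", "count": 14912},
]
total = 90001


def percentages(terms, total):
    """Return each term's share of the total, rounded to a whole percent."""
    return {entry["term"]: round(100 * entry["count"] / total) for entry in terms}


print(percentages(terms, total))
# {'england': 33, 'germany': 33, 'france': 17, 'spain': 17}
```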

  28. Date of birth histogram
    curl "localhost:9200/person/person/_search?search_type=count&pretty" -d '{
      "facets": {
        "by_date": {
          "date_histogram": {
            "field": "dateOfBirth",
            "interval": "3650d"
          }
        }
      }
    }'
    { ..., "facets": {
      "by_date": {
        "_type": "date_histogram",
        "entries": [
          {
            "time": -946080000000,
            "count": 39
          },
          {
            "time": -630720000000,
            "count": 12677
          },
          {
            "time": -315360000000,
            "count": 12936
          }, ...
        ]
      }
    }}
    [Bar chart: number of persons (axis 0 to 30000) by date of birth (axis 1940 to 2000)]
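
The `time` values in the response are milliseconds since 1 January 1970, negative for earlier dates. A small sketch that converts the three entries shown above into calendar dates; `bucket_start` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

# Entries copied from the "by_date" facet response above.
entries = [
    {"time": -946080000000, "count": 39},
    {"time": -630720000000, "count": 12677},
    {"time": -315360000000, "count": 12936},
]


def bucket_start(time_ms: int) -> datetime:
    """Convert epoch milliseconds (possibly negative) to a datetime."""
    # Adding a timedelta avoids platform limits on negative timestamps.
    return EPOCH + timedelta(milliseconds=time_ms)


for entry in entries:
    print(bucket_start(entry["time"]).date(), entry["count"])
# 1940-01-09 39
# 1950-01-06 12677
# 1960-01-04 12936
```

Each bucket starts a few days into its decade because the interval is a fixed 3650 days counted from the epoch, which ignores leap days.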

  29. Workshop 3: Search for persons
    • Born in 1970 and living in Germany, with gender distribution
    curl "localhost:9200/person/person/_search?pretty" -d '{
      "query": { "bool": { "must": [
        { "match": { "address.country": "Germany"} },
        { "range": { "dateOfBirth": { "from": "1970", "to": "1971" }}}
      ]}},
      "facets": {
        "by_gender": {
          "terms": {
            "field": "gender"
          }
        }
      }
    }'

  30. Workshop 4
    Click and play!

  31. Workshop 4: setup
    • Get kibana
    bin/plugin -install elasticsearch/kibana
    # or
    curl -OL -k http://download.elasticsearch.org/kibana/kibana/kibana-latest.zip
    • Open kibana
    open http://localhost:9200/_plugin/kibana/
    • Build your dashboard as you need!

  32. Elasticsearch Workshop
    Thanks!