Quick Introduction to Elasticsearch

Elasticsearch and MIT Sloan Data Analytics Hackathon
Cambridge, MA - May 10, 2014

Igor Motov

May 10, 2014

Transcript

  1. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited.
    Elasticsearch and MIT Sloan Data Analytics Hackathon
    Cambridge, MA - May 10, 2014
    Elasticsearch Quick Introduction
  2. About Me
    • Igor Motov
    • Developer at Elasticsearch Inc.
    • Github: imotov
    • Twitter: @imotov
  3. About Elasticsearch Inc.
    • Founded in 2012
      By the people behind Elasticsearch and Apache Lucene
      http://www.elasticsearch.com
      Headquarters: Amsterdam and Los Altos, CA
    • We provide
      Training (public & onsite)
      Development support
      Production support subscription (SLA)
  4. About Elasticsearch
    • Real time search and analytics engine
      JSON-oriented, Apache Lucene-based
    • Automatic Schema Detection
      Enables control of it when needed
    • Distributed
      Scales Up+Out, Highly Available
    • Multi-tenancy
      Dynamically create/delete indices
    • API centric
      Most functionality is exposed through an API
  5. Basic Concepts
    • Cluster: a group of nodes sharing the same set of indices
    • Node: a running Elasticsearch instance (typically a JVM process)
    • Index: a set of documents of possibly different types stored in one or more shards
    • Type: a set of documents in an index that share the same schema
    • Shard: a Lucene index, allocated on one of the nodes
  6. Basic Concepts - Document
    • JSON Object
        {
          "rank": 21,
          "city": "Boston",
          "state": "Massachusetts",
          "population2010": 617594,
          "land_area": 48.277,
          "density": 12793,
          "ansi": 619463,
          "location": {
            "lat": 42.332,
            "lon": 71.0202
          },
          "abbreviation": "MA"
        }
    • Identified by index/type/id
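The index/type/id addressing on this slide maps onto the REST path used by the curl examples later in the deck. The following is a minimal sketch of that mapping, assuming a node listening on http://localhost:9200 as in those examples; the helper name `document_url` is hypothetical and not part of any Elasticsearch client.

```python
from urllib.parse import quote


def document_url(index, doc_type, doc_id, base_url="http://localhost:9200"):
    """Build the REST URL that addresses one document by index/type/id."""
    # Percent-encode each segment so an id containing "/" stays a single path segment.
    parts = [quote(str(part), safe="") for part in (index, doc_type, doc_id)]
    return base_url.rstrip("/") + "/" + "/".join(parts)


print(document_url("test-data", "cities", 21))
# prints http://localhost:9200/test-data/cities/21
```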
  7. Downloading elasticsearch
    • http://www.elasticsearch.org/download/
      Windows
      Everything else
  8. What’s in a distribution?
        .
        ├── LICENSE.txt
        ├── NOTICE.txt
        ├── README.textile
        ├── bin                            executable scripts
        │   ├── elasticsearch
        │   ├── elasticsearch.in.sh
        │   └── plugin
        ├── config                         node config files
        │   ├── elasticsearch.yml
        │   └── logging.yml
        ├── data                           data storage
        │   └── elasticsearch
        ├── lib                            libs
        │   ├── elasticsearch-x.y.z.jar
        │   └── ...
        └── logs                           log files
            ├── elasticsearch.log
            └── elasticsearch_index_search_slowlog.log
  9. Configuration (multicast)
    • Configuration
      config/elasticsearch.yml
        cluster.name: "elasticsearch-imotov"    # unique name
  10. Configuration (stand-alone)
    • Configuration
      config/elasticsearch.yml
        cluster.name: "elasticsearch-imotov"          # unique name
        network.host: "127.0.0.1"                     # listen only on localhost
        discovery.zen.ping.multicast.enabled: false   # disable multicast
        discovery.zen.ping.unicast.hosts: ["localhost:9300", "localhost:9301", "localhost:9302"]   # search for other nodes on localhost
  11. Starting elasticsearch
    • Foreground
        $ bin/elasticsearch
    • Background
        $ bin/elasticsearch -d
  12. Is it running?
        $ curl -XGET "http://localhost:9200/?pretty"
        {
          "status" : 200,
          "name" : "Kamal",
          "version" : {
            "number" : "1.1.1",
            "build_hash" : "f1585f096d3f3985e73456debdc1a0745f512bbc",
            "build_timestamp" : "2014-04-16T14:27:12Z",
            "build_snapshot" : false,
            "lucene_version" : "4.7"
          },
          "tagline" : "You Know, for Search"
        }
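The same check can be scripted. The sketch below uses only the Python standard library and assumes the root endpoint replies with JSON shaped like the response on this slide; the function names `summarize_root_response` and `is_running` are hypothetical helpers, not part of an official client.

```python
import json
import urllib.request


def summarize_root_response(body):
    """Extract the node name and version numbers from the root endpoint's JSON reply."""
    info = json.loads(body)
    version = info.get("version", {})
    return {
        "name": info.get("name"),
        "elasticsearch": version.get("number"),
        "lucene": version.get("lucene_version"),
    }


def is_running(base_url="http://localhost:9200", timeout=2.0):
    """Return True if a GET on the root endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url + "/", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError and socket timeouts are OSError subclasses
        return False


# Abbreviated copy of the response shown on the slide.
sample = (
    '{"status": 200, "name": "Kamal", '
    '"version": {"number": "1.1.1", "lucene_version": "4.7"}, '
    '"tagline": "You Know, for Search"}'
)
print(summarize_root_response(sample))
```

Calling `is_running()` needs a reachable node, so the example only exercises the parsing step.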
  13. Communicating with Elasticsearch
    • REST API
      Curl
      Ruby
      Python
      PHP
      Perl
      JavaScript (community supported)
    • Binary Protocol
      Java
  14. Pick your client
    • Java
      included in distribution
    • Ruby, PHP, Perl, Python
      http://www.elasticsearch.org/blog/unleash-the-clients-ruby-python-php-perl/
    • Everything Else
      http://www.elasticsearch.org/guide/clients/
  15. Indexing a document
        $ curl -XPUT "http://localhost:9200/test-data/cities/21" -d '{
          "rank": 21,
          "city": "Boston",
          "state": "Massachusetts",
          "population2010": 617594,
          "land_area": 48.277,
          "density": 12793,
          "ansi": 619463,
          "location": {
            "lat": 42.332,
            "lon": 71.0202
          },
          "abbreviation": "MA"
        }'
        {"ok":true,"_index":"test-data","_type":"cities","_id":"21","_version":1}
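For readers who prefer a script over curl, the request above can be prepared with the Python standard library. This is a sketch under assumptions: the helper name `build_index_request` is hypothetical, the document is abbreviated, and the Content-Type header is an addition that the curl command on the slide does not send.

```python
import json
import urllib.request


def build_index_request(index, doc_type, doc_id, document, base_url="http://localhost:9200"):
    """Prepare (but do not send) a PUT request that indexes `document`."""
    url = "{}/{}/{}/{}".format(base_url, index, doc_type, doc_id)
    body = json.dumps(document).encode("utf-8")
    # The Content-Type header is an assumption; the slide's curl call omits it.
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Abbreviated version of the Boston document from the slide.
boston = {"rank": 21, "city": "Boston", "state": "Massachusetts", "abbreviation": "MA"}
request = build_index_request("test-data", "cities", 21, boston)
print(request.get_method(), request.full_url)
# prints PUT http://localhost:9200/test-data/cities/21
```

Sending it is a separate step (`urllib.request.urlopen(request)`) and requires a running node.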
  16. Getting a document
        $ curl -XGET "http://localhost:9200/test-data/cities/21?pretty"
        {
          "_index" : "test-data",
          "_type" : "cities",
          "_id" : "21",
          "_version" : 1,
          "exists" : true,
          "_source" : {
            "rank": 21,
            "city": "Boston",
            "state": "Massachusetts",
            "population2010": 617594,
            "land_area": 48.277,
            "density": 12793,
            "ansi": 619463,
            "location": {
              "lat": 42.332,
              "lon": 71.0202
            },
            "abbreviation": "MA"
          }
        }
  17. Updating a document
        $ curl -XPUT "http://localhost:9200/test-data/cities/21" -d '{
          "rank": 21,
          "city": "Boston",
          "state": "Massachusetts",
          "population2010": 617594,
          "population2012": 636479,
          "land_area": 48.277,
          "density": 12793,
          "ansi": 619463,
          "location": {
            "lat": 42.332,
            "lon": 71.0202
          },
          "abbreviation": "MA"
        }'
        {"ok":true,"_index":"test-data","_type":"cities","_id":"21","_version":2}
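The update on this slide sends the whole document again under the same id, and the response reports `_version` 2 instead of 1. The toy model below imitates only that observable behavior in memory. `ToyIndex` is a hypothetical illustration and says nothing about how Elasticsearch stores or versions documents internally.

```python
class ToyIndex:
    """In-memory stand-in that mimics whole-document replacement with a version counter."""

    def __init__(self, name):
        self.name = name
        self._docs = {}  # (type, id) -> (version, source)

    def put(self, doc_type, doc_id, source):
        # A PUT to an existing id replaces the stored document and bumps the version.
        previous_version = self._docs.get((doc_type, doc_id), (0, None))[0]
        version = previous_version + 1
        self._docs[(doc_type, doc_id)] = (version, dict(source))
        return {"_index": self.name, "_type": doc_type, "_id": doc_id, "_version": version}

    def get(self, doc_type, doc_id):
        version, source = self._docs[(doc_type, doc_id)]
        return {"_version": version, "_source": source}


index = ToyIndex("test-data")
first = index.put("cities", "21", {"city": "Boston", "population2010": 617594})
second = index.put(
    "cities", "21",
    {"city": "Boston", "population2010": 617594, "population2012": 636479},
)
print(first["_version"], second["_version"])  # prints 1 2
```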
  18. Searching
        $ curl -XGET 'http://localhost:9200/test-data/cities/_search?pretty' -d '{
          "query": {
            "match": {
              "city": "Boston"
            }
          }
        }'
  19. Searching
        {
          "took" : 5,
          "timed_out" : false,
          "_shards" : {
            "total" : 1,
            "successful" : 1,
            "failed" : 0
          },
          "hits" : {
            "total" : 1,
            "max_score" : 6.1357985,
            "hits" : [ {
              "_index" : "test-data",
              "_type" : "cities",
              "_id" : "21",
              "_score" : 6.1357985,
              "_source" : {"rank":"21","city":"Boston",...}
            } ]
          }
        }
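A script typically builds the query body as a dictionary and then walks `hits.hits` in the reply. The sketch below assumes the request and response shapes shown on the two Searching slides; `match_query` and `hit_summaries` are hypothetical helper names, and the sample response is a trimmed copy of the slide's output.

```python
import json


def match_query(field, text):
    """Build the request body for a single-field match query."""
    return {"query": {"match": {field: text}}}


def hit_summaries(response):
    """Return (id, score) pairs from a parsed search response."""
    return [(hit["_id"], hit["_score"]) for hit in response["hits"]["hits"]]


# Trimmed copy of the response on the slide (the elided _source is left out).
sample_response = {
    "took": 5,
    "timed_out": False,
    "hits": {
        "total": 1,
        "max_score": 6.1357985,
        "hits": [
            {"_index": "test-data", "_type": "cities", "_id": "21", "_score": 6.1357985}
        ],
    },
}

print(json.dumps(match_query("city", "Boston")))
print(hit_summaries(sample_response))
```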
  20. Range Queries
        $ curl -XGET "http://localhost:9200/test-data/cities/_search?pretty" -d '{
          "query": {
            "range": {
              "population2012": {
                "from": 500000,
                "to": 1000000
              }
            }
          }
        }'
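The range body can be generated rather than typed by hand. This is a small sketch that mirrors the `from`/`to` keys used on the slide; the helper name `range_query` and its bounds check are assumptions added for illustration.

```python
import json


def range_query(field, lower, upper):
    """Build a range query body using the same "from"/"to" keys as the slide."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return {"query": {"range": {field: {"from": lower, "to": upper}}}}


print(json.dumps(range_query("population2012", 500000, 1000000), indent=2))
```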
  21. Boolean Queries
        $ curl -XGET "http://localhost:9200/test-data/cities/_search?pretty" -d '{
          "query": {
            "bool": {
              "should": [
                { "match": { "state": "Texas" } },
                { "match": { "state": "California" } }
              ],
              "must": {
                "range": {
                  "population2012": {
                    "from": 500000,
                    "to": 1000000
                  }
                }
              },
              "minimum_should_match": 1
            }
          }
        }'
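Boolean queries nest other clauses, so composing them from small builders keeps the structure readable. The sketch below reproduces the body from this slide; the names `match_clause`, `range_clause` and `bool_query` are hypothetical helpers rather than a client API.

```python
import json


def match_clause(field, text):
    """A match clause without the outer "query" wrapper."""
    return {"match": {field: text}}


def range_clause(field, lower, upper):
    """A range clause without the outer "query" wrapper."""
    return {"range": {field: {"from": lower, "to": upper}}}


def bool_query(must=None, should=None, minimum_should_match=None):
    """Combine clauses into a bool query body, skipping parts that were not given."""
    bool_body = {}
    if must is not None:
        bool_body["must"] = must
    if should is not None:
        bool_body["should"] = should
    if minimum_should_match is not None:
        bool_body["minimum_should_match"] = minimum_should_match
    return {"query": {"bool": bool_body}}


body = bool_query(
    must=range_clause("population2012", 500000, 1000000),
    should=[match_clause("state", "Texas"), match_clause("state", "California")],
    minimum_should_match=1,
)
print(json.dumps(body, indent=2))
```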
  22. MatchAll Query
        $ curl -XGET "http://localhost:9200/test-data/cities/_search?pretty" -d '{
          "query": {
            "match_all": { }
          }
        }'
  23. Sorting and Paging
        $ curl -XGET "http://localhost:9200/test-data/cities/_search?pretty" -d '{
          "query": {
            "match_all": { }
          },
          "sort": [
            {"state": {"order": "asc"}},
            {"population2010": {"order": "desc"}}
          ],
          "from": 0,
          "size": 20
        }'
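Paging comes down to arithmetic on `from` and `size`, and the two sort keys behave like a primary and a secondary ordering. The sketch below shows both ideas. `page_request` and `sort_locally` are hypothetical helpers, the zero-based page number is a convention chosen here, and the sample documents are placeholder values rather than real city data.

```python
def page_request(page, size, sort_spec):
    """Body for page `page` (0-based) of a match_all search with `size` hits per page."""
    if page < 0 or size <= 0:
        raise ValueError("page must be >= 0 and size must be > 0")
    return {
        "query": {"match_all": {}},
        "sort": sort_spec,
        "from": page * size,  # number of hits to skip
        "size": size,
    }


def sort_locally(docs):
    """Order docs by state ascending, then population2010 descending."""
    # Python's sort is stable: sort by the secondary key first, then by the primary key.
    by_population = sorted(docs, key=lambda d: d["population2010"], reverse=True)
    return sorted(by_population, key=lambda d: d["state"])


slide_sort = [{"state": {"order": "asc"}}, {"population2010": {"order": "desc"}}]
third_page = page_request(2, 20, slide_sort)
print(third_page["from"], third_page["size"])  # prints 40 20

placeholder_docs = [
    {"state": "B", "population2010": 5},
    {"state": "A", "population2010": 1},
    {"state": "A", "population2010": 9},
]
print(sort_locally(placeholder_docs))
```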
  24. Analysis
    • By default, strings are
      - Divided into words (tokens)
      - Converted to lower-case, token by token
  25. Analysis Example
    • “Elasticsearch is a powerful open source search and analytics engine.”
        1. elasticsearch
        2. is
        3. a
        4. powerful
        5. open
        6. source
        7. search
        8. and
        9. analytics
        10. engine
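The two default steps (split into tokens, then lower-case) can be approximated in a few lines. This is a rough sketch using a regular expression, not the actual Lucene analyzer, and the function name `simple_analyze` is hypothetical; it happens to reproduce the token list on this slide.

```python
import re


def simple_analyze(text):
    """Lower-case the text and split it into runs of word characters."""
    return re.findall(r"\w+", text.lower())


sentence = "Elasticsearch is a powerful open source search and analytics engine."
for position, token in enumerate(simple_analyze(sentence), start=1):
    print(position, token)
```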
  26. Customizing the mapping
        $ curl -XPUT 'http://localhost:9200/my_index/' -d '{
          "settings": {
            "index": {
              "number_of_shards": 1,
              "number_of_replicas": 0
            }
          },
          "mappings": {
            "my_type": {
              "properties": {
                "description": { "type": "string" },
                "sku": { "type": "string", "index": "not_analyzed" },
                "count": { "type": "integer" },
                "price": { "type": "float" },
                "location": { "type": "geo_point" }
              }
            }
          }
        }'
    Annotations:
      description: analyzed text
      sku: exact match
      location: geo location
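The difference between an analyzed string and a `not_analyzed` one is which terms end up being searchable. The sketch below illustrates that idea with a regular-expression tokenizer standing in for the default analysis; it is an approximation, the helper names are hypothetical, and the SKU value "ABC-123" is made up for the example.

```python
import re


def index_terms(value, analyzed=True):
    """Terms stored for a string field: tokens if analyzed, the whole value otherwise."""
    if analyzed:
        return re.findall(r"\w+", value.lower())
    return [value]


def term_matches(term, value, analyzed):
    """True if an exact lookup of `term` would find the stored value."""
    return term in index_terms(value, analyzed)


sku = "ABC-123"  # hypothetical SKU
print(index_terms(sku, analyzed=True))   # prints ['abc', '123']
print(index_terms(sku, analyzed=False))  # prints ['ABC-123']
print(term_matches("ABC-123", sku, analyzed=False))  # prints True
print(term_matches("ABC-123", sku, analyzed=True))   # prints False
```

Under these assumptions, only the unanalyzed field keeps the original value as a single term, which is why the slide labels it "exact match".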
  27. Elasticsearch Reference
    • http://www.elasticsearch.org/guide/
  28. Ideas for hackathon
    • Explore data
      wikipedia
      twitter
      enron emails
    • Play with Kibana
    • Build Elasticsearch plugins
    • Get prizes
  29. Elasticsearch Meetup
    http://www.meetup.com/Elasticsearch-Boston/
  30. We are hiring
    http://www.elasticsearch.com/about/jobs/