Defrag Conference 14: Elasticsearch

Peter Kim
November 18, 2014

Slides for my brief talk about Elasticsearch at Defrag Conference 2014, Broomfield, CO.


Transcript

  1. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited. me • Solutions Architect @elasticsearch • Previously consulting engineer at Digital Reasoning, MarkLogic and Endeca • Married for ten years; two boys ages 2 and 4
  2. brief intro to elasticsearch • Full-text search and analytics engine • Distributed, horizontally scalable • Blazing fast sub-second response times • JSON everywhere: documents, queries, responses • HTTP/REST API + client APIs for Java, Python, .NET, Ruby, PHP, JavaScript, and more • APIs for everything! Queries, inserts, administration, configuration, status, etc. • Open source with Apache 2 license
  3. full text search • original query text • stem-matched results • snippeting • contextual summarization (aka facets) • geospatial search • efficient, relevance-ranked pagination
  4. and it’s easy. almost magic. • Download and go! No complex Hadoop install required. • Developer experience is the same whether you work with a 1-node or a 100-node cluster. $ wget https://download.elasticsearch.org/... $ tar xf elasticsearch-1.4.0.tar.gz $ ./elasticsearch-1.4.0/bin/elasticsearch ... [2014-11-14 05:01:09,214][INFO ][node ] [Crusader] started ...
  5. add documents » curl -XPUT localhost:9200/library/book/1 -d ' { "title" : "Elasticsearch - The definitive guide", "authors" : ["Clinton Gormley", "Zachary Tong"], "started" : "2013-02-04", "pages" : 230 }'
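The indexing call on this slide can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper name `build_index_request` is hypothetical, the `Content-Type` header is an assumption the slide's curl command does not set, and the URL layout (`/index/type/id`) simply mirrors the slide. The request is built but not sent, so the snippet runs without a live node.

```python
import json
import urllib.request


def build_index_request(host, index, doc_type, doc_id, document):
    """Build (but do not send) a PUT request that would index one document.

    Hypothetical helper: mirrors `curl -XPUT host/index/type/id -d '{...}'`.
    """
    url = "http://{}/{}/{}/{}".format(host, index, doc_type, doc_id)
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        # Assumption: declaring the body as JSON; the slide's curl call omits this.
        headers={"Content-Type": "application/json"},
    )


# The same document shown on the slide.
book = {
    "title": "Elasticsearch - The definitive guide",
    "authors": ["Clinton Gormley", "Zachary Tong"],
    "started": "2013-02-04",
    "pages": 230,
}

request = build_index_request("localhost:9200", "library", "book", 1, book)
# Against a running node, the request could be sent with:
#   urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes the shape of the call easy to inspect before pointing it at a real cluster.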
  6. search documents » curl -XGET localhost:9200/library/_search?q=elasticsearch { "took" : 2, "timed_out" : false, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 1, "max_score" : 0.076713204, "hits" : [ { "_index" : "library", "_type" : "book", "_id" : "1", "_score" : 0.076713204, "_source" : { "title" : "Elasticsearch - The definitive guide", "authors" : [ "Clinton Gormley", "Zachary Tong" ], "started" : "2013-02-04", "pages" : 230 } } ] } }
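Because the response is plain JSON, pulling out the hits takes only a few lines. The sketch below parses an abridged copy of the response from this slide (the `_shards`, `_index` and `_type` fields are left out for brevity); the helper name `summarize_hits` is hypothetical.

```python
import json

# Abridged copy of the search response shown on the slide.
response_text = """
{
  "took": 2,
  "timed_out": false,
  "hits": {
    "total": 1,
    "max_score": 0.076713204,
    "hits": [
      {
        "_id": "1",
        "_score": 0.076713204,
        "_source": {
          "title": "Elasticsearch - The definitive guide",
          "authors": ["Clinton Gormley", "Zachary Tong"],
          "started": "2013-02-04",
          "pages": 230
        }
      }
    ]
  }
}
"""


def summarize_hits(response):
    """Return (id, score, title) for each hit, in the order the response lists them."""
    return [
        (hit["_id"], hit["_score"], hit["_source"]["title"])
        for hit in response["hits"]["hits"]
    ]


response = json.loads(response_text)
summary = summarize_hits(response)
```

The original document comes back under `_source`, and the relevance score sits next to it under `_score`.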
  7. what are people doing with elasticsearch? • Centralized log analysis • Enterprise search • Source code search • Location-aware mobile search • SIEM • Cloud search service • Quality of service monitoring • E-commerce search and navigation • Social media analytics • Open data APIs
  8. (near) real-time full-text search and analytics + scale-out architecture + it’s easy! = the big data holy grail
  9. counting • Much of what a search engine does starts with the simple act of counting • TF/IDF to calculate relevance • Elasticsearch knows the counts of everything in its index More alike than you think!
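To illustrate how relevance can start from counting, here is a minimal sketch of textbook TF/IDF over tokenized documents. It is an assumption-laden illustration rather than the scoring Elasticsearch actually applies (Lucene's formula adds further factors), and the function name `tf_idf` and the sample documents are made up for this example.

```python
import math
from collections import Counter


def tf_idf(term, doc_tokens, all_docs):
    """Textbook TF/IDF: (count of term in this doc) * log(N / docs containing term)."""
    term_frequency = Counter(doc_tokens)[term]
    doc_frequency = sum(1 for tokens in all_docs if term in tokens)
    if term_frequency == 0 or doc_frequency == 0:
        return 0.0
    inverse_doc_frequency = math.log(len(all_docs) / doc_frequency)
    return term_frequency * inverse_doc_frequency


# Tiny made-up corpus: "the" appears everywhere, "beer" in one document only.
docs = [
    ["the", "beer", "is", "cold"],
    ["the", "cake", "is", "sweet"],
    ["the", "kibana", "dashboard"],
]

common_score = tf_idf("the", docs[0], docs)
rare_score = tf_idf("beer", docs[0], docs)
```

A word found in every document scores zero here, while a word found in a single document scores highest. Both results come from nothing more than two counts.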
  10. the typical nature of words and documents [Chart: % of all documents containing word, from 0 to 100, for example words "the", "and", "beer", "cake", "defrag" and "Kibana". Annotations: a small number of very common words; many, many uncommon words.]
  11. random samples should hold no surprises [Chart: % of documents in a random sample (0 to 100) plotted against % of all documents containing word (0 to 100), for example words "the", "and", "beer", "cake", "defrag" and "Kibana".] In a random sample of documents, words appear with their normal degrees of popularity
  12. words used in the search results for “denver” vs. all documents [Chart: % of search results containing the word (0 to 100) plotted against % of all documents containing word (0 to 100), for example words "the", "and", "high", "beer", "denver", "Broncos", "Hancock" and "Elway". Annotation: area of uncommonly common terms.]
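The idea of "uncommonly common" terms can be sketched by comparing how often a word appears in the search results with how often it appears in all documents. The ratio test below is a deliberate simplification and an assumption of this example; Elasticsearch uses its own scoring heuristic. The names `doc_percentage` and `uncommonly_common`, the `min_ratio` threshold and the sample documents are all hypothetical.

```python
def doc_percentage(term, docs):
    """Percentage of documents (each a set of words) that contain the term."""
    if not docs:
        return 0.0
    return 100.0 * sum(1 for words in docs if term in words) / len(docs)


def uncommonly_common(foreground, background, min_ratio=2.0):
    """Terms whose share of the foreground is at least min_ratio times their background share."""
    candidate_terms = set().union(*foreground) if foreground else set()
    found = {}
    for term in candidate_terms:
        fg = doc_percentage(term, foreground)
        bg = doc_percentage(term, background)
        if bg > 0 and fg / bg >= min_ratio:
            found[term] = (fg, bg)
    return found


# Made-up corpus; the "search results" are the documents mentioning "denver".
all_docs = [
    {"the", "denver", "broncos", "elway"},
    {"the", "denver", "beer"},
    {"the", "beer", "and", "cake"},
    {"the", "and", "kibana"},
    {"the", "cake"},
    {"the", "and", "beer"},
]
search_results = [words for words in all_docs if "denver" in words]

significant = uncommonly_common(search_results, all_docs)
```

In this toy corpus "the" and "beer" are just as common in the results as everywhere else, so they drop out. "denver", "broncos" and "elway" are over-represented in the results and are kept.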
  13. many use cases • Root cause analysis in fault reports • Detecting credit card fraud • Making product recommendations • Finding unusual crime patterns • Refining searches + training classifiers • …
  14. Top crime types by the straight numbers
  15. Top crime types using significant terms
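The contrast between the last two slides roughly corresponds to swapping a `terms` aggregation for a `significant_terms` aggregation in the search request body. The sketch below only builds the two JSON bodies. The field names `force` and `crime_type`, the filter value and the helper name `crime_type_body` are assumptions for illustration, since the slides do not show the actual index mapping or query.

```python
import json


def crime_type_body(agg_type, filter_field, filter_value, terms_field):
    """Build a search body: filter to a subset, then aggregate on one field.

    agg_type is "terms" (straight counts) or "significant_terms"
    (terms over-represented in the subset compared with the whole index).
    """
    if agg_type not in ("terms", "significant_terms"):
        raise ValueError("unsupported aggregation type: {}".format(agg_type))
    return {
        "query": {"term": {filter_field: filter_value}},
        "aggs": {
            "top_crime_types": {agg_type: {"field": terms_field}},
        },
    }


# Hypothetical field names and filter value.
straight_counts = crime_type_body("terms", "force", "transport_police", "crime_type")
significant = crime_type_body("significant_terms", "force", "transport_police", "crime_type")

# The serialized body is what would be sent to the _search endpoint.
payload = json.dumps(significant)
```

The two bodies differ only in the aggregation type, so the same query can be read either by raw counts or by what is unusual about the filtered subset.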