is strictly prohibited.
me
• Solutions Architect @elasticsearch
• Previously consulting engineer at Digital Reasoning, MarkLogic and Endeca
• Married for ten years; two boys ages 2 and 4
brief intro to elasticsearch
• Full-text search and analytics engine
• Distributed, horizontally scalable
• Blazing fast sub-second response times
• JSON everywhere: documents, queries, responses
• HTTP/REST API + client APIs for Java, Python, .NET, Ruby, PHP, JavaScript, and more
• APIs for everything! Queries, inserts, administration, configuration, status, etc.
• Open source with Apache 2 license
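To make "JSON everywhere" over HTTP/REST concrete, here is a minimal sketch that builds, but does not send, a request to index one document. It assumes a node on the default port 9200 and the 1.x-style path `/{index}/{type}/{id}`; the index name `talks`, the type `talk`, the id, the document fields, and the helper `build_index_request` are all hypothetical, chosen for illustration only.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc_type, doc_id, document):
    """Build (but do not send) an HTTP PUT that would index one JSON document."""
    url = f"{base_url}/{index}/{doc_type}/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical index, type, id, and document, for illustration only.
request = build_index_request(
    "http://localhost:9200",
    "talks",
    "talk",
    "1",
    {"title": "brief intro to elasticsearch", "year": 2014},
)

print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing `request` to `urllib.request.urlopen` would actually send it; that step is left out so the sketch runs without a cluster.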
full text search
• original query text
• stem-matched results
• snippeting
• contextual summarization (aka facets)
• geospatial search
• efficient, relevance-ranked pagination
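Several of the features on this slide map onto keys of a single JSON search body. The sketch below assembles one as a plain Python dict: `match` for the query text, `highlight` for snippets, a `terms` aggregation for facet-style counts, and `from`/`size` for pagination. The field names `body` and `category` and the helper `build_search_body` are hypothetical. Whether results are stem-matched depends on the analyzer configured for the field, which is not shown here, and geospatial filtering is left out.

```python
import json


def build_search_body(text, field, facet_field, page=0, page_size=10):
    """Assemble a search request body as a plain dict (nothing is sent)."""
    return {
        # Full-text match; the query text is analyzed like the indexed field.
        "query": {"match": {field: text}},
        # Ask for highlighted snippets from the matched field.
        "highlight": {"fields": {field: {}}},
        # Facet-style counts of the values in another field.
        "aggs": {"facet": {"terms": {"field": facet_field}}},
        # Pagination: skip the earlier pages, return one page of hits.
        "from": page * page_size,
        "size": page_size,
    }


# Hypothetical field names, third page of ten results.
search_body = build_search_body("running shoes", "body", "category", page=2)
print(json.dumps(search_body, indent=2))
```

The resulting JSON would be sent as the body of a request to an index's `_search` endpoint.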
and it’s easy. almost magic.
• Download and go! No complex Hadoop install required.
• The developer experience is the same on a 1-node cluster as on a 100-node cluster.
$ wget https://download.elasticsearch.org/...
$ tar xf elasticsearch-1.4.0.tar.gz
$ ./elasticsearch-1.4.0/bin/elasticsearch
...
[2014-11-14 05:01:09,214][INFO ][node ] [Crusader] started
...
what are people doing with elasticsearch?
• Centralized log analysis
• Enterprise search
• Source code search
• Location-aware mobile search
• SIEM
• Cloud search service
• Quality of service monitoring
• E-commerce search and navigation
• Social media analytics
• Open data APIs
counting
• Much of what a search engine does starts with the simple act of counting
• TF/IDF to calculate relevance
• Elasticsearch knows the counts of everything in its index
More alike than you think!
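The link between counting and relevance can be sketched with the textbook form of TF/IDF: term frequency in a document multiplied by log(N / df), where N is the number of documents and df the number that contain the term. This is a simplified illustration, not the exact formula Lucene or Elasticsearch applies; the function name `tf_idf_scores` and the toy documents are assumptions.

```python
import math
from collections import Counter


def tf_idf_scores(term, docs):
    """Score each document for one term with textbook TF-IDF: tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain the term at least once.
    df = sum(1 for tokens in tokenized if term in tokens)
    if df == 0:
        return [0.0] * n_docs
    idf = math.log(n_docs / df)
    # Term frequency: how often the term occurs within each document.
    return [Counter(tokens)[term] * idf for tokens in tokenized]


# Toy corpus, for illustration only.
docs = [
    "the beer and the cake",
    "the kibana dashboard",
    "the beer",
]

the_scores = tf_idf_scores("the", docs)
kibana_scores = tf_idf_scores("kibana", docs)
print(the_scores)
print(kibana_scores)
```

A word that appears in every document gets an IDF of log(1) = 0 and so contributes nothing to relevance, while a word found in only one document scores highest there.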
the typical nature of words and documents
[Figure: words plotted by % of all documents containing word (axis 0 to 100). Example words: the, and, beer, cake, Kibana, defrag. Annotations: "A small number of very common words"; "Many, many uncommon words".]
random samples should hold no surprises
[Figure: % of documents in a random sample (axis 0 to 100) plotted against % of all documents containing word (axis 0 to 100). Example words: the, and, beer, cake, Kibana, defrag.]
In a random sample of documents, words appear with their normal degrees of popularity
words used in the search results for “denver” vs. all documents
[Figure: % of search results containing the word plotted against % of all documents containing word (axis 0 to 100). Example words: denver, the, and, high, beer, Broncos, Hancock, Elway. Annotation: "Area of uncommonly common terms".]
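The comparison on this slide can be sketched by computing, for each word, the fraction of search results that contain it and the fraction of all documents that contain it, then keeping the words that are more popular in the results than overall. The ranking heuristic used here (absolute change times relative change) is one simple choice and not necessarily what Elasticsearch uses; the toy documents and the names `doc_fraction` and `uncommonly_common_terms` are assumptions.

```python
def doc_fraction(term, docs):
    """Fraction of documents (each a set of words) that contain the term."""
    if not docs:
        return 0.0
    return sum(1 for words in docs if term in words) / len(docs)


def uncommonly_common_terms(search_results, all_docs):
    """Rank terms that are more popular in the results than in all documents."""
    vocabulary = set().union(*search_results)
    scored = []
    for term in vocabulary:
        foreground = doc_fraction(term, search_results)
        background = doc_fraction(term, all_docs)
        # Keep only terms over-represented in the results.
        if background > 0 and foreground > background:
            # Assumed heuristic: absolute change times relative change.
            score = (foreground - background) * (foreground / background)
            scored.append((term, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy corpus, for illustration only: each document is a set of words.
all_docs = [
    {"the", "denver", "broncos", "elway"},
    {"the", "denver", "beer", "and"},
    {"the", "cake", "and"},
    {"the", "kibana"},
    {"the", "beer", "and"},
    {"the", "defrag"},
]
search_results = [words for words in all_docs if "denver" in words]

ranked = uncommonly_common_terms(search_results, all_docs)
for term, score in ranked:
    print(term, round(score, 2))
```

Words such as "the" and "and" drop out because they are no more frequent in the results than everywhere else, while "denver", "broncos", and "elway" land in the area of uncommonly common terms.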
many use cases
• Root cause analysis in fault reports
• Detecting credit card fraud
• Making product recommendations
• Finding unusual crime patterns
• Refining searches + training classifiers
• …