Slide 1

Slide 1 text

Elasticsearch: beyond full-text search
Alexander Reelsen, @spinscale
#gotoaar #elasticsearch

Slide 2

Slide 2 text

Copyright Elasticsearch 2013. Copying, publishing and/or distributing without written permission is strictly prohibited.

About me
• Elasticsearch core developer
    Features, bug fixing, package maintenance, documentation, blog posts
• Development support
• Production support
• Trainings
• Conferences & talks
• Interests: Java, JavaScript, web apps

Slide 3

Slide 3 text

Beyond full-text search?

Slide 4

Slide 4 text

Unstructured search

Slide 5

Slide 5 text

Structured search

Slide 6

Slide 6 text

Enrichment

Slide 7

Slide 7 text

Sorting

Slide 8

Slide 8 text

Pagination

Slide 9

Slide 9 text

Aggregation

Slide 10

Slide 10 text

Suggestions

Slide 11

Slide 11 text

Introduction

Slide 12

Slide 12 text

Elasticsearch in 10 seconds
• Schema-free, REST & JSON based distributed document store
• Open source: Apache License 2.0
• Zero configuration
• Used by GitHub, Mozilla, SoundCloud, Stack Overflow, Foursquare, Fog Creek, StumbleUpon

Slide 13

Slide 13 text

Zero configuration

$ wget https://download.elasticsearch.org/...
$ tar -xf elasticsearch-0.90.5.tar.gz
$ ./elasticsearch-0.90.5/bin/elasticsearch -f
...
[INFO ][node][Ghost Maker] {0.90.5}[5645]: initializing ...

Slide 14

Slide 14 text

Index & search data

curl -X PUT localhost:9200/products/product/1 -d '
{
  "created_at" : "2013/09/05 15:45:10",
  "name" : "Macbook Air",
  "price" : {
    "net" : 1699,
    "tax" : 322.81
  }
}'

curl -X GET 'localhost:9200/products/product/_search?q=macbook'

Slide 15

Slide 15 text

Distributed
• Replication: Data duplication
    Read scalability
    Removing SPOF
• Sharding: Data partitioning
    Split logical data over several machines
    Write scalability
    Control data flows
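A minimal sketch of how sharding partitions data, assuming each document is routed by hashing its ID modulo the number of primary shards. `ShardRouter` and `shardFor` are hypothetical names used only for illustration, and `String.hashCode()` is a stand-in for Elasticsearch's own hash function.

```java
final class ShardRouter {
    private final int numberOfShards;

    ShardRouter(int numberOfShards) {
        if (numberOfShards <= 0) {
            throw new IllegalArgumentException("numberOfShards must be positive");
        }
        this.numberOfShards = numberOfShards;
    }

    /** Maps a document ID to a shard number in [0, numberOfShards). */
    int shardFor(String documentId) {
        // Assumption: String.hashCode() stands in for the real hash function.
        // floorMod keeps the result non-negative for negative hash codes.
        return Math.floorMod(documentId.hashCode(), numberOfShards);
    }

    public static void main(String[] args) {
        ShardRouter orders = new ShardRouter(4);
        for (String id : new String[] {"1", "2", "3", "42"}) {
            System.out.println("order " + id + " -> shard " + orders.shardFor(id));
        }
    }
}
```

Because the shard is derived from the ID alone, any node can work out where a document lives without a lookup table. It also suggests why the shard count is fixed at index creation: changing it would change where existing IDs map.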

Slide 16

Slide 16 text

Distributed

[Diagram: node 1 holding the shards of the orders and products indices]

curl -X PUT localhost:9200/orders -d '{
  "settings.index.number_of_shards" : 4,
  "settings.index.number_of_replicas" : 1
}'

curl -X PUT localhost:9200/products -d '{
  "settings.index.number_of_shards" : 2,
  "settings.index.number_of_replicas" : 0
}'

Slide 17

Slide 17 text

Distributed

[Diagram: shards of the orders and products indices spread across node 1 and node 2]

Slide 18

Slide 18 text

Distributed

[Diagram: shards of the orders and products indices spread across node 1, node 2 and node 3]

Slide 19

Slide 19 text

Ecosystem
• Plugins
• Clients for many languages
    Ruby, Python, PHP, Perl, JavaScript, Scala, Clojure
• Kibana & Logstash
• Hadoop integration

Slide 20

Slide 20 text

From data to information

Slide 21

Slide 21 text

What is data?
• Whatever provides value for your business
• Domain data
    Internal: Orders, products
    External: Social media streams, email
• Application data
    Log files
    Metrics

Slide 22

Slide 22 text

Asking questions to your data
• How many orders were created?
• How many orders were created in the last month?
• How many orders were created every day in the last month?
• What is the average revenue per shopping cart?
• What is the average shopping cart size per order (EUR or #items)? Per hour?

Slide 23

Slide 23 text

Order as JSON

curl -X PUT localhost:9200/orders/order/1 -d '
{
  "created_at" : "2013/09/05 15:45:10",
  "items" : [
    ...
  ],
  "total" : 245.37
}'

Slide 24

Slide 24 text

Asking questions to your data
• How many orders were created?
    Uses: count

curl -X GET http://localhost:9200/orders/order/_count

Slide 25

Slide 25 text

Asking questions to your data
• How many orders were created in the last month?
    Uses: filter

curl -X GET http://localhost:9200/orders/order/_count -d '{
  "range": {
    "created_at": {
      "gte": "2013/09/01",
      "lt": "2013/10/01"
    }
  }
}'

Slide 26

Slide 26 text

Asking questions to your data
• How many orders were created every day in the last month?
    Uses: filter, count/day

Slide 27

Slide 27 text

Asking questions to your data
• How many orders were created every day in the last month?
    Uses: filter, count/day

curl -X GET http://localhost:9200/orders/order/_search -d '{
  "facets": {
    "created": {
      "date_histogram" : {
        "field" : "created_at",
        "interval" : "1d"
      },
      "facet_filter" : {
        "range": {
          "created_at": {
            "gte": "2013/09/01",
            "lt" : "2013/10/01"
          }
        }
      }
    }
  }
}'
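To make the "filter, then count per day" combination concrete, here is a rough in-memory sketch of the same computation, not of how Elasticsearch implements it. `OrdersPerDay` and `countPerDay` are hypothetical names, and the timestamps are invented sample data.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class OrdersPerDay {

    /** Counts orders per calendar day, keeping only days in [from, to). */
    static Map<LocalDate, Long> countPerDay(List<LocalDateTime> createdAt,
                                            LocalDate from, LocalDate to) {
        Map<LocalDate, Long> buckets = new TreeMap<>();
        for (LocalDateTime timestamp : createdAt) {
            LocalDate day = timestamp.toLocalDate();
            // Filter step: gte "from", lt "to", mirroring the range filter.
            if (!day.isBefore(from) && day.isBefore(to)) {
                // Histogram step: one bucket per day.
                buckets.merge(day, 1L, Long::sum);
            }
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<LocalDateTime> createdAt = Arrays.asList(
            LocalDateTime.of(2013, 9, 5, 15, 45, 10),
            LocalDateTime.of(2013, 9, 5, 18, 0, 0),
            LocalDateTime.of(2013, 9, 6, 9, 30, 0),
            LocalDateTime.of(2013, 10, 1, 0, 0, 0)); // falls outside the range
        System.out.println(countPerDay(createdAt,
            LocalDate.of(2013, 9, 1), LocalDate.of(2013, 10, 1)));
    }
}
```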

Slide 28

Slide 28 text

Asking questions to your data
• What is the average revenue per shopping cart?
    Uses: filter, scripting, stats

Slide 29

Slide 29 text

Asking questions to your data
• What is the average revenue per shopping cart?
    Uses: filter, scripting, stats

curl -X GET http://localhost:9200/orders/order/_search -d '{
  "facets": {
    "avg_revenue": {
      "facet_filter" : {
        "range": {
          "created_at": {
            "gte": "2013/09/01",
            "lt" : "2013/10/01"
          }
        }
      },
      "statistical" : {
        "script" : "doc[\u0027total\u0027].value * 0.1 + 2"
      }
    }
  }
}'

Slide 30

Slide 30 text

Asking questions to your data
• What is the average shopping cart size per order (EUR or #items)? Per hour?
    Uses: filter, scripting, stats, per

Slide 31

Slide 31 text

From data to visualization

Slide 32

Slide 32 text

From numbers to simplicity
• JSON is not a management-compatible notation
• Writing your own visualization app for all the different data is tedious
• Enter Kibana!

Slide 33

Slide 33 text

Kibana

Slide 34

Slide 34 text

Kibana

Slide 35

Slide 35 text

Kibana

Slide 36

Slide 36 text

Kibana

Slide 37

Slide 37 text

From data to notification

Slide 38

Slide 38 text

Houston, we have a problem!
• The average response time of your payment API just increased over 2 seconds over the last 15 minutes
• A credit card fraud detection kicks in
• Visits are exploding after the television commercial
• The “win-a-car” voucher has reached its usage limit
• Memory usage exceeds physical memory

Slide 39

Slide 39 text

Meet the metrics library!
• Measure inside your application
• Gauges, Timers, Counters, Meters, Histograms
• Healthchecks
• Report to elasticsearch

Slide 40

Slide 40 text

Meet the metrics library!

MetricRegistry metrics = new MetricRegistry();
Meter requestsMeter = metrics.meter("incoming-http-requests");

// in your app code
requestsMeter.mark(1);

Timer responses = metrics.timer("responses");
Timer.Context context = responses.time();
try {
    // etc;
    return "OK";
} finally {
    // stop the timer even if the request handling throws
    context.stop();
}

Slide 41

Slide 41 text

Metrics elasticsearch reporter
• Reports from your application into elasticsearch
• Uses HTTP, no elasticsearch dependency
• Realtime notification via percolation
    Send an email, a pager alert or an MQ message

Slide 42

Slide 42 text

Percolation
• Normal: Index documents, run queries
• Percolator: Register queries, run against documents
• Use cases: Price agent, contextual ads, classification before indexing (geo, tag, categorization), metrics
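The inversion behind percolation can be sketched in a few lines: queries are stored up front, and each incoming document is tested against them. This is only an in-memory illustration of the idea, with plain Java predicates standing in for real queries; `MiniPercolator`, `register` and `percolate` are hypothetical names, not part of any Elasticsearch API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class MiniPercolator<D> {
    // Registered queries by ID, kept in registration order.
    private final Map<String, Predicate<D>> queries = new LinkedHashMap<>();

    /** Stores a query under an ID, before any document is seen. */
    void register(String id, Predicate<D> query) {
        queries.put(id, query);
    }

    /** Returns the IDs of all registered queries that match the document. */
    List<String> percolate(D document) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Predicate<D>> entry : queries.entrySet()) {
            if (entry.getValue().test(document)) {
                matches.add(entry.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Documents here are response times in milliseconds (invented example).
        MiniPercolator<Double> percolator = new MiniPercolator<>();
        percolator.register("slow-payment-api", millis -> millis > 2000.0);
        System.out.println(percolator.percolate(2500.0));
        System.out.println(percolator.percolate(120.0));
    }
}
```

A notification system then only needs to act on the returned query IDs, for example by sending an alert for each match.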

Slide 43

Slide 43 text

Percolation support

ElasticsearchReporter reporter =
    ElasticsearchReporter.forRegistry(registry)
        .percolateNotifier(new PagerNotifier())
        .percolateMetrics(".*")
        .build();
reporter.start(60, TimeUnit.SECONDS);

public class PagerNotifier implements Notifier {
    @Override
    public void notify(JsonMetric metric, String id) {
        // send pager duty here
    }
}

Slide 44

Slide 44 text

Cockpit - Sample App

Slide 45

Slide 45 text

From data to insight

Slide 46

Slide 46 text

Know it all!
• Long term data required (index everything!)
• Visualization is a great start
• Deep insight into your data required
    Know your data
    Know your data format
    Concrete questions with lots of dimensions

Slide 47

Slide 47 text

Aggregations
• aka: composable facets
• Take the output of a facet operation
• Use it as an input of another facet operation
• Remember: What is the average shopping cart value per order per hour?

Slide 48

Slide 48 text

Aggregations

curl -X GET 'http://localhost:9200/orders/order/_search' -d '{
  "aggs" : {
    "avg_shopping_cart_per_hour" : {
      "filter" : {
        "range": {
          "created_at": {
            "gte": "2013/09/01",
            "lt" : "2013/10/01"
          }
        }
      },
      "date_histogram" : {
        "field" : "created_at",
        "interval" : "1h"
      },
      "aggregations" : {
        "avg" : { "avg" : { "field" : "total" } }
      }
    }
  }
}'

Slide 49

Slide 49 text

Aggregations

curl -X GET 'http://localhost:9200/orders/order/_search' -d '{
  "aggs" : {
    "avg_shopping_cart_per_hour" : {
      "filter" : {
        "range": {
          "created_at": {
            "gte": "2013/09/01",
            "lt" : "2013/10/01"
          }
        }
      },
      "histogram" : {
        "script" : "doc[\u0027created_at\u0027].date.hourOfDay"
      },
      "aggregations" : {
        "avg" : { "avg" : { "field" : "total" } }
      }
    }
  }
}'
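What "composable" means here can be sketched in memory: an outer step buckets orders by hour of day, and an inner step reduces each bucket to the average of `total`. This is an illustration of the computation only, under the assumption that orders carry just a timestamp and a total; `AvgCartPerHour`, `Order` and `averageTotalPerHour` are hypothetical names and the sample orders are invented.

```java
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class AvgCartPerHour {

    /** Hypothetical order type with only the two fields used here. */
    static final class Order {
        final LocalDateTime createdAt;
        final double total;

        Order(LocalDateTime createdAt, double total) {
            this.createdAt = createdAt;
            this.total = total;
        }
    }

    /** Average order total per hour of day (0-23), for hours that have orders. */
    static Map<Integer, Double> averageTotalPerHour(List<Order> orders) {
        // Outer step: bucket by hour of day, tracking {sum, count} per bucket.
        Map<Integer, double[]> sumAndCount = new TreeMap<>();
        for (Order order : orders) {
            double[] acc = sumAndCount.computeIfAbsent(
                order.createdAt.getHour(), hour -> new double[2]);
            acc[0] += order.total;
            acc[1] += 1;
        }
        // Inner step: reduce each bucket to its average.
        Map<Integer, Double> averages = new TreeMap<>();
        for (Map.Entry<Integer, double[]> bucket : sumAndCount.entrySet()) {
            averages.put(bucket.getKey(), bucket.getValue()[0] / bucket.getValue()[1]);
        }
        return averages;
    }

    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(
            new Order(LocalDateTime.of(2013, 9, 5, 15, 45, 10), 100.0),
            new Order(LocalDateTime.of(2013, 9, 6, 15, 10, 0), 200.0),
            new Order(LocalDateTime.of(2013, 9, 6, 9, 0, 0), 50.0));
        System.out.println(averageTotalPerHour(orders));
    }
}
```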

Slide 50

Slide 50 text

Ask complex questions
• Product pageviews
    Sum of page views per price range including price statistics (min/max/avg/sum/count)
• Geo location
    Physical store: Home of buyers per weekday combined with money spent
• Protip: Reduce memory consumption using probabilistic data structures, losing precision
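As one illustration of the memory-for-precision trade mentioned in the protip, the sketch below estimates a distinct count with linear counting: a fixed-size bitmap instead of a set of all values seen. Linear counting is chosen here purely as a simple example; nothing in the talk says it is the structure Elasticsearch uses. `LinearCounter` is a hypothetical name, and the bitmap size is an arbitrary assumption.

```java
import java.util.BitSet;

final class LinearCounter {
    private final int size;
    private final BitSet bits;

    LinearCounter(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        this.size = size;
        this.bits = new BitSet(size);
    }

    /** Records a value by setting one bit; duplicates set the same bit again. */
    void add(String value) {
        bits.set(Math.floorMod(mix(value.hashCode()), size));
    }

    /** Estimated number of distinct values added; infinite if every bit is set. */
    double estimate() {
        int zeroBits = size - bits.cardinality();
        // Linear counting estimate: size * ln(size / zeroBits).
        return size * Math.log((double) size / zeroBits);
    }

    // Bit mixer (the MurmurHash3 32-bit finalizer) to spread String.hashCode() values.
    private static int mix(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    public static void main(String[] args) {
        LinearCounter visitors = new LinearCounter(16384); // 16384 bits = 2 KiB
        for (int i = 0; i < 1000; i++) {
            visitors.add("user-" + i);
        }
        System.out.println("estimated distinct visitors: " + visitors.estimate());
    }
}
```

The bitmap stays at 2 KiB however many values are added, but the answer is an estimate, and it degrades as the bitmap fills up.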

Slide 51

Slide 51 text

roundup

Slide 52

Slide 52 text

Roundup
• Insight
• Visualization
• Notification

Slide 53

Slide 53 text

Thanks for listening!
Alexander Reelsen, @spinscale
We’re hiring: http://www.elasticsearch.com/about/jobs
#gotoaar #elasticsearch

Slide 54

Slide 54 text

roadmap

Slide 55

Slide 55 text

Roadmap
• Elasticsearch 1.0
    Distributed percolator (already in master)
    Aggregations
    Snapshot/Restore

Slide 56

Slide 56 text

links

Slide 57

Slide 57 text

Links
• Elasticsearch
    http://www.elasticsearch.org
• Logstash
    http://logstash.net
• Kibana
    http://three.kibana.org
• elasticsearch-metrics-reporter
    https://github.com/elasticsearch/metrics-elasticsearch-reporter-java

Slide 58

Slide 58 text

Links
• Clients
    http://www.elasticsearch.org/blog/unleash-the-clients-ruby-python-php-perl/
• Metrics
    http://metrics.codahale.com/
• Aggregations
    https://github.com/elasticsearch/elasticsearch/issues/3300
• Elasticsearch Hadoop integration
    https://github.com/elasticsearch/elasticsearch-hadoop

Slide 59

Slide 59 text

Links
• Talk on probabilistic data structures
    http://www.infoq.com/presentations/scalability-data-mining
• Icons
    http://www.doublejdesign.co.uk/
    http://www.iconarchive.com/