New features in Elasticsearch 1.0 & 1.1

Presented by Luca Cavanna at the Amsterdam Elasticsearch meetup April 2014 (http://www.meetup.com/ElasticSearch-NL/events/171263412/)

Elasticsearch Inc

April 03, 2014

  1. New features in Elasticsearch 1.1
    @lucacavanna
    Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited.
  2. JSON distributed real-time analytics RESTful Lucene open source schema-free document oriented search
  3. Setup
    $ wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.1.0.zip
    $ unzip elasticsearch-1.1.0.zip
    $ cd elasticsearch-1.1.0
    $ bin/elasticsearch
  4. Is it alive?
    $ curl localhost:9200
    {
      "status" : 200,
      "name" : "Moondark",
      "version" : {
        "number" : "1.1.0",
        "build_hash" : "2181e113dea80b4a9e31e58e9686658a2d46e363",
        "build_timestamp" : "2014-03-25T15:59:51Z",
        "build_snapshot" : false,
        "lucene_version" : "4.7"
      },
      "tagline" : "You Know, for Search"
    }
  5. Index
    $ curl -XPUT localhost:9200/twitter/status/1 -d '
    {
      "text" : "New features in elasticsearch 1.1",
      "user" : {
        "name" : "Luca Cavanna",
        "screen_name" : "lucacavanna"
      },
      "place" : {
        "country" : "Netherlands",
        "country_code" : "nl"
      },
      "created_at" : "2014-04-03",
      "retweet_count" : 50
    }
    '
  6. Get
    $ curl -XGET localhost:9200/twitter/status/1
    Delete
    $ curl -XDELETE localhost:9200/twitter/status/1
  7. Search
    $ curl -XGET localhost:9200/_search?q=elasticsearch
    $ curl -XGET localhost:9200/_search -d '
    {
      "query" : {
        "query_string" : {
          "query" : "elasticsearch AND features"
        }
      }
    }
    '
  8. Search - query DSL
    $ curl -XGET localhost:9200/_search -d '
    {
      "query" : {
        "filtered" : {
          "query" : {
            "bool" : {
              "must" : [
                { "match" : { "text" : { "query" : "elasticsearch features", "operator" : "AND" } } }
              ],
              "should" : [
                { "match" : { "text" : "pizza" } }
              ]
            }
          },
          "filter" : {
            "range" : {
              "created_at" : { "from" : "2014-04-01" }
            }
          }
        }
      }
    }
    '
  9. snapshot & restore
    Photo by John: http://www.flickr.com/people/60026579@N00
  10. backup in 0.90
    • disable flush
    • find all primary shards location (optional)
    • copy files (rsync)
    • re-enable flush
  11. backup in 1.0 - repository
    $ curl -XPUT localhost:9200/_snapshot/local -d '
    {
      "type" : "fs",
      "settings" : {
        "location" : "/data/es/backup"
      }
    }
    '
  12. backup in 1.0 - snapshot
    $ curl -XPUT localhost:9200/_snapshot/local/backup_1 -d '
    {
      "indices" : "*,-twitter*"
    }
    '
  13. restore in 0.90
    • close the index
    • find all existing shards
    • replace files with ones from backup
    • re-open the index
  14. restore in 1.0
    • close the index/indices
    $ curl -XPOST localhost:9200/2014-*/_close
    • restore existing snapshot
    $ curl -XPOST localhost:9200/_snapshot/local/backup_1/_restore -d '
    {
      "indices" : "2014-*"
    }
    '
  15. Facets in 0.90
    • terms / terms stats
    • range
    • histogram / date histogram
    • statistical
    • geo distance
    • filter / query
  16. retweets stats per user
    $ curl -XGET localhost:9200/twitter/_search -d '
    {
      "facets" : {
        "retweets_per_user" : {
          "terms_stats" : {
            "key_field" : "user.screen_name",
            "value_field" : "retweet_count"
          }
        }
      }
    }
    '
  17. retweets stats per user
    {
      "facets" : {
        "retweets_per_user" : {
          "_type" : "terms_stats",
          "missing" : 0,
          "terms" : [{
            "term" : "lucacavanna",
            "count" : 1,
            "total_count" : 1,
            "min" : 50.0,
            "max" : 50.0,
            "total" : 50.0,
            "mean" : 50.0
          }]
        }
      }
    }
  18. cool, then… give me the retweets stats per month, per user
  19. retweets stats per month per user
    $ curl -XGET localhost:9200/twitter/_search -d '
    {
      "aggs" : {
        "month" : {
          "date_histogram" : { "field" : "created_at", "interval" : "month" },
          "aggs" : {
            "user" : {
              "terms" : { "field" : "user.screen_name" },
              "aggs" : {
                "retweets" : {
                  "stats" : { "field" : "retweet_count" }
                }
              }
            }
          }
        }
      }
    }
    '
  20. retweets stats per month per user
    {
      "aggregations" : {
        "month" : {
          "buckets" : [
            {
              "key_as_string" : "Tue Apr 01 00:00:00 +0000 2014",
              "key" : 1396310400000,
              "doc_count" : 1,
              "user" : {
                "buckets" : [
                  {
                    "key" : "lucacavanna",
                    "doc_count" : 1,
                    "retweets" : { "count" : 1, "min" : 50, "max" : 50, "avg" : 50, "sum" : 50 }
                  }
                ]
              }
            }
          ]
        }
      }
    }
  21. buckets
    • global
    • filter
    • missing
    • terms
    • range
    • date_range
    • ipv4_range
    • histogram
    • date_histogram
    • geo_distance
    • geohash_grid
    • nested
  22. metrics
    • value_count
    • stats
    • extended_stats
    • avg
    • min
    • max
    • sum
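
The way buckets and metrics compose can be sketched outside Elasticsearch. The following is a minimal in-memory illustration in Python, not how Elasticsearch executes aggregations: it groups documents by month, then by user, then reduces each leaf bucket to stats, mirroring the request on slide 19. The sample documents and the function names are made up for the illustration.

```python
from collections import defaultdict

# Hypothetical sample documents, shaped like the tweet indexed on slide 5.
tweets = [
    {"created_at": "2014-04-03", "screen_name": "lucacavanna", "retweet_count": 50},
    {"created_at": "2014-04-10", "screen_name": "lucacavanna", "retweet_count": 10},
    {"created_at": "2014-04-12", "screen_name": "someone", "retweet_count": 4},
    {"created_at": "2014-03-18", "screen_name": "lucacavanna", "retweet_count": 7},
]


def stats(values):
    """Metric: reduce the values collected in one bucket to a few numbers."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": sum(values) / len(values),
        "sum": sum(values),
    }


def retweets_per_month_per_user(docs):
    """Bucket by month, then by user, then compute stats per leaf bucket."""
    buckets = defaultdict(lambda: defaultdict(list))
    for doc in docs:
        month = doc["created_at"][:7]  # "YYYY-MM" stands in for date_histogram
        buckets[month][doc["screen_name"]].append(doc["retweet_count"])
    return {
        month: {user: stats(counts) for user, counts in users.items()}
        for month, users in buckets.items()
    }


result = retweets_per_month_per_user(tweets)
print(result["2014-04"]["lucacavanna"])
```

Each bucketing step only decides which documents belong together; the metric runs last, on whatever documents ended up in each innermost bucket.
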
  23. register query
    $ curl -XPUT localhost:9200/twitter/.percolator/es-features -d '
    {
      "query" : {
        "match" : {
          "text" : { "query" : "elasticsearch features", "operator" : "AND" }
        }
      },
      "alert_type" : "mention"
    }
    '
  24. percolate document
    $ curl -XGET localhost:9200/twitter/tweet/_percolate -d '
    {
      "doc" : {
        "text" : "New features in elasticsearch 1.0",
        "user" : {
          "name" : "Luca Cavanna",
          "screen_name" : "lucacavanna"
        },
        "created_at" : "2014-03-18"
      }
    }
    '
    {
      …
      "total" : 1,
      "matches" : [{
        "_index" : "twitter",
        "_id" : "es-features"
      }]
    }
  25. 0.90 VS 1.x
    0.90:
    • single shard
    • sequential execution
    • _percolator index
    • single index percolation
    1.x:
    • arbitrary number of shards
    • parallel execution
    • .percolator type (any index)
    • multi index percolation
  26. new percolation features in 1.0
    • percolate existing documents
    • percolate count api
    • filter support (in addition to queries)
    • highlighting
    • scoring
    • multi percolate
    • support for aggregations
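
Percolation inverts the usual search flow: queries are stored first, and each incoming document is checked against them. The toy Python sketch below illustrates only that idea. The `ToyPercolator` class, its "all terms must be present" query model, and the whitespace tokenizer are assumptions made for the illustration and have nothing to do with the Elasticsearch implementation.

```python
def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real text analysis."""
    return set(text.lower().split())


class ToyPercolator:
    """Stores queries up front, then matches incoming documents against them."""

    def __init__(self):
        self.queries = {}  # query id -> set of terms that must all be present

    def register(self, query_id, required_terms):
        self.queries[query_id] = {term.lower() for term in required_terms}

    def percolate(self, doc):
        """Return the ids of all registered queries that match the document."""
        terms = tokenize(doc["text"])
        return sorted(qid for qid, required in self.queries.items() if required <= terms)


percolator = ToyPercolator()
percolator.register("es-features", ["elasticsearch", "features"])
percolator.register("pizza", ["pizza"])

matches = percolator.percolate({"text": "New features in elasticsearch 1.0"})
print(matches)
```

The document itself is never stored here; only the ids of the matching queries come back, which is the same shape as the `matches` array on slide 24.
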
  27. Which node is the master? (0.90)
    $ curl localhost:9200/_cluster/state/nodes,master_node?pretty
    {
      "cluster_name" : "elasticsearch",
      "master_node" : "yT4GUfIWTY6aJdQtWVEFpw",
      "nodes" : {
        "R-5_0LiORAWmr_cYLXO69Q" : {
          "name" : "Woodgod",
          "transport_address" : "inet[/]",
          "attributes" : {}
        },
        "yT4GUfIWTY6aJdQtWVEFpw" : {
          "name" : "Moondark",
          "transport_address" : "inet[/]",
          "attributes" : {}
        },
        "pR0NmKeGTVGget2O1qSqCQ" : {
          "name" : "Adaptoid",
          "transport_address" : "inet[/]",
          "attributes" : {}
        }
      }
    }
  31. Which node is the master? (1.0)
    $ curl localhost:9200/_cat/master
    yT4GUfIWTY6aJdQtWVEFpw Lucas-MacBook-Air.local Moondark
  32. _cat/* api
    • /_cat/aliases
    • /_cat/allocation
    • /_cat/count
    • /_cat/health
    • /_cat/indices
    • /_cat/master
    • /_cat/nodes
    • /_cat/pending_tasks
    • /_cat/thread_pool
    • /_cat/shards
    • /_cat/plugins
  33. …and more
    • disk based field data (aka doc values)
    • field data circuit breaker
    • tribe node (aka federated search)
  34. …count all the things
    cardinality aggregation
  35. How many distinct users tweeted, per month, per country?
  36. Distinct users per month, per country
    $ curl -XGET localhost:9200/twitter/_search -d '
    {
      "aggs" : {
        "month" : {
          "date_histogram" : { "field" : "created_at", "interval" : "month" },
          "aggs" : {
            "country" : {
              "terms" : { "field" : "place.country.keyword" },
              "aggs" : {
                "distinct_users" : {
                  "cardinality" : { "field" : "user.screen_name" }
                }
              }
            }
          }
        }
      }
    }
    '
  37. Distinct users per month, per country
    {
      "aggregations" : {
        "month" : {
          "buckets" : [
            {
              "key_as_string" : "2014-04-01T00:00:00.000Z",
              "key" : 1396310400000,
              "doc_count" : 1097354,
              "country" : {
                "buckets" : [
                  {
                    "key" : "United States",
                    "doc_count" : 501244,
                    "distinct_users" : { "value" : 471504 }
                  },
                  {
                    "key" : "Indonesia",
                    "doc_count" : 452933,
                    "distinct_users" : { "value" : 312002 }
                  }
                ]
              }
            }
          ]
        }
      }
    }
  38. cardinality aggregation
    • HyperLogLog++ algorithm
    • Approximate counts based on the hashes of the field values
    • Configurable precision: trade memory for accuracy
    • Hashes can be provided while indexing
    • Hashes can be computed at index time
    • Scripting support
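
To give a feel for how distinct counts can be approximated from hashes, here is a minimal sketch of the plain HyperLogLog estimator in Python. It is deliberately not the HyperLogLog++ variant named on the slide (there is no bias correction and no sparse representation), and the function name, the use of SHA-1 as the hash, and the default precision `p=12` are assumptions chosen for the illustration.

```python
import hashlib
import math


def hll_estimate(values, p=12):
    """Estimate the number of distinct values using 2**p small registers."""
    m = 1 << p
    registers = [0] * m
    for value in values:
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")   # take 64 bits of the hash
        index = h >> (64 - p)                   # first p bits pick a register
        rest = h & ((1 << (64 - p)) - 1)        # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - p) - rest.bit_length() + 1
        registers[index] = max(registers[index], rank)

    alpha = 0.7213 / (1 + 1.079 / m)            # bias constant, valid for m >= 128
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        # Small-range correction: fall back to linear counting.
        return m * math.log(m / zeros)
    return raw


# 100,000 values, of which 50,000 are distinct (made-up data).
users = ["user%d" % (i % 50000) for i in range(100000)]
estimate = hll_estimate(users)
print(round(estimate))
```

Memory use depends only on the number of registers, not on the number of values seen. With `m = 2**p` registers the typical relative error is roughly `1.04 / sqrt(m)`, which is the memory-for-accuracy trade-off the precision setting controls.
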
  39. …know your data
    percentiles aggregation
  40. How is the number of retweets distributed, per month, per country?
  41. Retweets stats per month, per country
    $ curl -XGET localhost:9200/twitter/_search -d '
    {
      "aggs" : {
        "month" : {
          "date_histogram" : { "field" : "created_at", "interval" : "month" },
          "aggs" : {
            "country" : {
              "terms" : { "field" : "place.country.keyword" },
              "aggs" : {
                "retweets" : {
                  "stats" : { "field" : "retweet_count" }
                }
              }
            }
          }
        }
      }
    }
    '
  42. Retweets stats per month, per country
    {
      "aggregations" : {
        "month" : {
          "buckets" : [
            {
              "key_as_string" : "2014-04-01T00:00:00.000Z",
              "key" : 1396310400000,
              "doc_count" : 1097354,
              "country" : {
                "buckets" : [
                  {
                    "key" : "United States",
                    "doc_count" : 169442,
                    "retweets" : {
                      "min" : 0,
                      "max" : 230681,
                      "avg" : 946.0165939898582,
                      "sum" : 47945067
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    }
  43. Interesting… but how about the outliers?
  44. Retweets per month, per country
    $ curl -XGET localhost:9200/twitter/_search -d '
    {
      "aggs" : {
        "month" : {
          "date_histogram" : { "field" : "created_at", "interval" : "month" },
          "aggs" : {
            "country" : {
              "terms" : { "field" : "place.country.keyword" },
              "aggs" : {
                "retweets" : {
                  "percentiles" : { "field" : "retweet_count" }
                }
              }
            }
          }
        }
      }
    }
    '
  45. Retweets per month, per country
    {
      "aggregations" : {
        "month" : {
          "buckets" : [
            {
              "key_as_string" : "2014-04-01T00:00:00.000Z",
              "key" : 1396310400000,
              "doc_count" : 1097354,
              "country" : {
                "buckets" : [
                  {
                    "key" : "United States",
                    "doc_count" : 169442,
                    "retweets" : {
                      "1.0" : 1,
                      "5.0" : 1,
                      "25.0" : 2,
                      "50.0" : 21.927867004790084,
                      "75.0" : 218.26625104274626,
                      "95.0" : 3199.6148040638604,
                      "99.0" : 15889.028205128077
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    }
  46. percentiles aggregation
    • t-digest algorithm
    • Approximate percentiles
    • Configurable compression: trade memory for accuracy
    • Request specific percentiles only
    • Scripting support
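
Why percentiles say more than an average when outliers are present can be shown on a tiny example. The Python sketch below computes exact nearest-rank percentiles on a made-up list of retweet counts. It is not the t-digest algorithm, which approximates these values without keeping every number in memory; the `percentile` helper and the sample data are assumptions for the illustration.

```python
import math


def percentile(sorted_values, percent):
    """Exact nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(percent / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


# Hypothetical retweet counts: most tweets get a handful, one goes viral.
retweets = sorted([0, 1, 1, 1, 2, 2, 3, 5, 8, 230681])

mean = sum(retweets) / len(retweets)
summary = {p: percentile(retweets, p) for p in (50, 95, 99)}

print(mean)
print(summary)
```

A single viral tweet pulls the mean far away from what a typical tweet looks like, while the median stays small and the high percentiles expose the outlier explicitly.
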
  47. …revealing the uncommonly common
    significant terms
  48. What’s the right hashtag for… “tulip”?
  49. Right hashtag for “tulip”
    $ curl -XGET localhost:9200/twitter/_search -d '
    {
      "query" : {
        "match" : { "text" : "tulip" }
      },
      "aggs" : {
        "interesting_tags" : {
          "significant_terms" : { "field" : "hashtags.text" }
        }
      }
    }
    '
  50. Right hashtag for “tulip”
    {
      "aggregations" : {
        "interesting_tags" : {
          "doc_count" : 40,
          "buckets" : [
            {
              "key" : "spring",
              "doc_count" : 38,
              "score" : 3397.32,
              "bg_count" : 45
            }
          ]
        }
      }
    }
  51. What tweets have the wrong hashtag or don’t have one?
  52. Look for a specific hashtag
    $ curl -XGET localhost:9200/twitter/_search -d '
    {
      "query" : {
        "match" : { "hashtags.text" : "elasticsearch" }
      },
      "aggs" : {
        "interesting_tags" : {
          "significant_terms" : { "field" : "text" }
        }
      }
    }
    '
  53. Like this but not this query
    $ curl -XGET localhost:9200/twitter/_search -d '
    {
      "query" : {
        "bool" : {
          "must_not" : [
            { "match" : { "hashtags.text" : "elasticsearch" } }
          ],
          "must" : [{
            "terms" : { "text" : ["lucene", "aggregations", …] }
          }]
        }
      }
    }
    '
  54. significant terms
    • Background set: the whole index
    • Foreground set: documents matching the query
    • Approximate counts
    • Configurable shard_size for better accuracy
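
The foreground/background comparison can be illustrated with a small scoring function. The Python sketch below uses one simple heuristic (absolute change in frequency multiplied by relative change); this is an assumption made for the illustration and not necessarily the formula Elasticsearch 1.1 uses, so the numbers will not reproduce the `score` shown on slide 50. The index size and the counts for "the" are made up; only the counts for "spring" come from that slide.

```python
def significance_score(fg_count, fg_total, bg_count, bg_total):
    """Score a term by how much more frequent it is in the foreground set."""
    fg_rate = fg_count / fg_total   # share of matching docs containing the term
    bg_rate = bg_count / bg_total   # share of all docs containing the term
    if bg_rate == 0 or fg_rate <= bg_rate:
        return 0.0                  # not over-represented in the foreground
    return (fg_rate - bg_rate) * (fg_rate / bg_rate)


INDEX_SIZE = 1_000_000  # hypothetical number of documents in the index

# "spring": 38 of the 40 "tulip" tweets, but only 45 tweets in the whole index.
spring = significance_score(38, 40, 45, INDEX_SIZE)
# "the": common in the foreground, but just as common everywhere else.
the = significance_score(30, 40, 600_000, INDEX_SIZE)

print(spring > the)
```

A term that is merely frequent scores low because it is frequent in the background too; a term scores high only when it is uncommon overall and common among the documents matching the query.
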
  55. …and more
    • better cross field queries
    • search templates
    • aliases support in index templates
    • recovery api & _cat/recovery
    • _cat/segments
  56. thank you!
    Support: http://elasticsearch.com/support
    Training: http://training.elasticsearch.com
    We are hiring: http://elasticsearch.com/about/jobs/
    @lucacavanna