New Features in Elasticsearch v1.0

Igor Motov
November 04, 2013

Boston Elasticsearch Meetup
Nov 4, 2013

Transcript

  1. Igor Motov: New Features in Elasticsearch 1.0

    Copyright Elasticsearch 2013. Copying, publishing and/or distributing without written permission is strictly prohibited.
  2. about

    • Developer at Elasticsearch Inc.
      joined Elasticsearch Inc.: Oct 2012
      Elasticsearch contributor since Apr 2011
    • Elasticsearch Inc.
      founded: July 2012
      headquarters: Amsterdam and Los Altos, CA
      provides: training (public & onsite), development support, production support subscription
  3. v1.0 ?

    • v0.4.0 - Feb 8, 2010
    • v0.5.0 - Mar 5, 2010
    • …
    • v0.19.0 - Mar 1, 2012
    • v0.20.0 - Dec 7, 2012
    • v0.90.0 - Apr 29, 2013
    • v1.0 - Soon
  4. v1.0

    • rolling upgrades, because not everyone can afford "scheduled maintenance"
    • ability to back up data, because "rm -rf" happens
  5. v1.0

    • rolling upgrades
    • snapshot/restore (backup)
    • _cat API
    • aggregations
    • distributed percolator
  6. rolling upgrades (photo by Kamyar Adl)
  7. snapshot and restore (photo by John, http://www.flickr.com/people/60026579@N00)
  8. snapshot and restore: backup
  9. backup in 0.90

    1. disable flush
    2. find all primary shard locations (optional)
    3. copy files from primary shards (rsync)
    4. enable flush
  10. backup in v1.0

    $ curl -XPUT localhost:9200/_snapshot/my_backup/snapshot_20131010

    (my_backup: repository; snapshot_20131010: snapshot name)
  11. repositories

    • Snapshot Storage
      Shared File System - v1.0
      S3 - v1.0
      HDFS
      Google Compute Engine
      Microsoft Azure
      ...
  12. register repository

    $ curl -XPUT "localhost:9200/_snapshot/my_backup" -d '{
      "type": "fs",
      "settings": {
        "location": "/mnt/es-test-repo"
      }
    }'

    (my_backup: repository; "type": repository type; "location": repository location)
  13. start snapshot

    $ curl -XPUT "localhost:9200/_snapshot/my_backup/snapshot_20131010" -d '{
      "indices": "+test_*,-test_4"
    }'

    (my_backup: repository; snapshot_20131010: snapshot name; "indices": index list, optional)
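
The register and snapshot calls above can also be scripted. The following is a minimal sketch using only Python's standard library; it builds the two requests shown on the slides without sending them. The helper names `register_repository` and `start_snapshot` are hypothetical, and the host is assumed to be localhost:9200 as in the curl examples.

```python
import json
import urllib.request

BASE = "http://localhost:9200"  # assumed host, as in the curl examples


def register_repository(name, location):
    """Build the PUT request that registers a shared file system repository."""
    body = {"type": "fs", "settings": {"location": location}}
    return urllib.request.Request(
        f"{BASE}/_snapshot/{name}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
    )


def start_snapshot(repository, snapshot, indices=None):
    """Build the PUT request that starts a snapshot; the index list is optional."""
    data = None
    if indices is not None:
        data = json.dumps({"indices": indices}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE}/_snapshot/{repository}/{snapshot}",
        data=data,
        method="PUT",
    )


repo_req = register_repository("my_backup", "/mnt/es-test-repo")
snap_req = start_snapshot("my_backup", "snapshot_20131010", "+test_*,-test_4")
print(repo_req.get_method(), repo_req.full_url)
print(snap_req.get_method(), snap_req.full_url)
# To actually send a request against a running node:
#   urllib.request.urlopen(repo_req)
```

Building the request separately from sending it keeps the sketch usable without a running cluster, for example in a dry run of a backup script.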
  14. restore in 0.90

    1. close the index (shutdown the cluster)
    2. find all existing index shards
    3. replace all index shards with data from backup
    4. open the index (start the cluster)
  15. restore in 1.0

    $ curl -XPOST "localhost:9200/test_*/_close"

    (close all indices that start with test_)

    $ curl -XPOST "localhost:9200/_snapshot/my_backup/snapshot_20131010" -d '{
      "indices": "test_*"
    }'

    (my_backup: repository name; snapshot_20131010: snapshot name; "indices": index list)
  16. https://github.com/elasticsearch/elasticsearch/issues/3826
  17. distributed percolator (image source: Wikipedia)
  18. percolator

    • reverse search
    • alerts
    • updatable search results
  19. registering percolator in 0.90

    $ curl -XPUT "localhost:9200/_percolator/twitter/es-tweets" -d '{
      "query": {
        "match": { "text": "elasticsearch" }
      }
    }'

    (twitter: target index; es-tweets: query id)
  20. document percolation in 0.90

    $ curl -XGET "localhost:9200/twitter/tweet/_percolate" -d '{
      "doc": {
        "text": "#elasticsearch is awesome",
        "nick": "@imotov",
        "name": "Igor Motov",
        "date": "2013-11-03"
      }
    }'

    {
      "ok": true,
      "matches": ["es-tweets"]
    }

    (twitter: target index; _percolate: percolation end point; "doc": document to be percolated; "matches": matching queries)
  21. how does it work in 0.90?

    • all queries are stored in a special _percolate index
    • the _percolate index has 1 primary shard, which is replicated to every node
    • each percolated document is indexed in memory
    • all queries are executed against this document sequentially
    • execution time is linear in the number of queries!
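
The cost model in the bullets above can be illustrated with a toy model. This sketch is not Elasticsearch's implementation: it assumes every registered query is a simple match on one field, stands in for the in-memory index with a set of lowercased tokens, and uses the hypothetical names `registered_queries`, `index_in_memory` and `percolate`. The second registered query is made up for illustration.

```python
import re

# Registered queries: query id -> (field, term). "es-tweets" mirrors the slides;
# "lucene-tweets" is a made-up second query for illustration.
registered_queries = {
    "es-tweets": ("text", "elasticsearch"),
    "lucene-tweets": ("text", "lucene"),
}


def index_in_memory(doc):
    """Toy stand-in for the in-memory index: field -> set of lowercased tokens."""
    return {
        field: set(re.findall(r"\w+", str(value).lower()))
        for field, value in doc.items()
    }


def percolate(doc, queries):
    """Run every registered query against the one document, one after another."""
    indexed = index_in_memory(doc)
    matches = []
    for query_id, (field, term) in queries.items():  # one pass per query
        if term in indexed.get(field, set()):
            matches.append(query_id)
    return matches


doc = {"text": "#elasticsearch is awesome", "nick": "@imotov"}
print(percolate(doc, registered_queries))
```

The loop in `percolate` visits every registered query for every document, which is where the linear execution time comes from.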
  23. that doesn't scale
  24. registering percolator in 1.0

    $ curl -XPUT "localhost:9200/some_index/_percolator/es-tweets" -d '{
      "query": {
        "match": { "body": "elasticsearch" }
      }
    }'

    (some_index: any index with as many shards as you need; _percolator: reserved percolator type; es-tweets: query id)
  25. now that can scale!
  26. multi index support

    $ curl -XGET "localhost:9200/twitter,facebook/_percolate" -d '{
      "doc": {
        "body": "#elasticsearch is awesome",
        "nick": "@imotov",
        "name": "Igor Motov",
        "date": "2013-11-03"
      }
    }'

    ("doc": document to be percolated)
  27. full alias support

    $ curl -XGET "localhost:9200/soc_media_alias/_percolate" -d '{
      "doc": {
        "body": "#elasticsearch is awesome",
        "nick": "@imotov",
        "name": "Igor Motov",
        "date": "2013-11-03"
      }
    }'

    ("doc": document to be percolated)
  28. other features

    • percolation of existing document
    • percolate count api
    • filter support (in addition to queries in 0.90)
    • highlighting
    • scoring
    • multi percolate (bulk percolation)
  29. https://github.com/elasticsearch/elasticsearch/issues/3173
  30. _cat/* api (image source: Wikipedia)
  31. _cat/* api: no, it will not help you organize your massive collection of cat pictures
  32. _cat/* api: it's because humans suck at reading JSON
  34. Which one is the master? (v0.90)

    $ curl "localhost:9200/_cluster/state?pretty&filter_metadata=true&filter_routing_table=true"
    {
      "cluster_name" : "elasticsearch",
      "master_node" : "GNf0hEXlTfaBvQXKBF300A",
      "blocks" : { },
      "nodes" : {
        "ObdRqLHGQ6CMI5rOEstA5A" : {
          "name" : "Triton",
          "transport_address" : "inet[/10.0.1.11:9300]",
          "attributes" : { }
        },
        "4C7pKbfhTvu0slcSy_G4_w" : {
          "name" : "Kid Colt",
          "transport_address" : "inet[/10.0.1.12:9300]",
          "attributes" : { }
        },
        "GNf0hEXlTfaBvQXKBF300A" : {
          "name" : "Lang, Steven",
          "transport_address" : "inet[/10.0.1.13:9300]",
          "attributes" : { }
        }
      }
    }
  35. Which one is the master? (v1.0)

    $ curl localhost:9200/_cat/master
    GNf0hEXlTfaBvQXKBF300A 10.0.1.13 Lang, Steven
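
The difference between the two approaches shows up when the answer has to be extracted by a script. The sketch below uses the responses from the slides as literal input rather than calling a cluster. The function names are hypothetical, and the column order of the `_cat/master` line (node id, IP address, node name) is assumed from the single example shown above.

```python
import json

# Trimmed cluster state response, copied from the v0.90 example.
cluster_state = json.loads("""
{
  "cluster_name": "elasticsearch",
  "master_node": "GNf0hEXlTfaBvQXKBF300A",
  "nodes": {
    "ObdRqLHGQ6CMI5rOEstA5A": {"name": "Triton"},
    "4C7pKbfhTvu0slcSy_G4_w": {"name": "Kid Colt"},
    "GNf0hEXlTfaBvQXKBF300A": {"name": "Lang, Steven"}
  }
}
""")

# Output line copied from the v1.0 example.
cat_master_line = "GNf0hEXlTfaBvQXKBF300A 10.0.1.13 Lang, Steven"


def master_from_cluster_state(state):
    """v0.90 style: read the master id, then look it up in the nodes map."""
    node_id = state["master_node"]
    return node_id, state["nodes"][node_id]["name"]


def master_from_cat(line):
    """v1.0 style: split the line into id, ip and name (assumed column order)."""
    node_id, ip, name = line.split(None, 2)
    return node_id, ip, name.strip()


print(master_from_cluster_state(cluster_state))
print(master_from_cat(cat_master_line))
```

Splitting at most twice keeps node names that contain spaces, such as "Lang, Steven", in one piece.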
  36. /_cat/count

    $ curl localhost:9200/_cat/count
    1383501234301 12:53:54 3344067

    (3344067: count)
  37. _cat/* api

    • /_cat/allocation
    • /_cat/count
    • /_cat/health
    • /_cat/master
    • /_cat/nodes
    • /_cat/recovery
    • /_cat/shards
    • /_cat/indices
  38. aggregations
  39. facets in 0.90
  40. What's wrong with facets in 0.90?
  41. Nothing!
  42. We just want MOAR features!!!
  43. facets in 0.90

    • terms / terms stats
    • range
    • histogram / date histogram
    • filter/query
    • statistical
    • geo distance
  44. terms facet

    • Divides documents into buckets based on the value of a selected term
    • Calculates statistics on some other field of these documents for each bucket
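
The two bullets above describe a group-then-summarize operation. The sketch below models it in plain Python over a handful of in-memory documents; it is an illustration of the idea, not how Elasticsearch computes facets. The function name `terms_stats` and the sort order are assumptions, and only the Boston density is taken from the talk's data set; the other rows are made up.

```python
from collections import defaultdict

# Toy documents. Only the Boston density comes from the talk's data set;
# the other rows are made-up values for illustration.
cities = [
    {"city": "Boston", "state": "MA", "density": 12793},
    {"city": "City A", "state": "MA", "density": 4000},
    {"city": "City B", "state": "CA", "density": 6000},
    {"city": "City C", "state": "CA", "density": 2000},
    {"city": "City D", "state": "CA", "density": 1000},
]


def terms_stats(docs, key_field, value_field):
    """Bucket docs by key_field, then compute stats on value_field per bucket."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc[key_field]].append(doc[value_field])
    entries = []
    for term, values in buckets.items():
        entries.append({
            "term": term,
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "total": sum(values),
            "mean": sum(values) / len(values),
        })
    # Largest buckets first (an assumed ordering for this sketch).
    return sorted(entries, key=lambda entry: entry["count"], reverse=True)


stats = terms_stats(cities, "state", "density")
for entry in stats:
    print(entry)
```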
  45. index of large US cities

    {
      "rank": "21",
      "city": "Boston",
      "state": "MA",
      "population2012": "636479",
      "population2010": "617594",
      "land_area": "48.277",
      "density": "12793",
      "ansi": "619463",
      "location": {
        "lat": "42.332",
        "lon": "71.0202"
      }
    }
  46. terms facet request

    $ curl -XGET "localhost:9200/test-data/cities/_search?pretty" -d '{
      "facets": {
        "stat1": {
          "terms_stats": {
            "key_field": "state",
            "value_field": "density"
          }
        }
      }
    }'

    ("key_field": group by this field; "value_field": calculate stats for this field)
  47. terms facet response

    "facets" : {
      "stat1" : {
        "_type" : "terms_stats",
        "missing" : 0,
        "terms" : [ {
          "term" : "CA",
          "count" : 69,
          "total_count" : 69,
          "min" : 1442.0,
          "max" : 17179.0,
          "total" : 383545.0,
          "mean" : 5558.623188405797
        }, {
          "term" : "TX",
          "count" : 32,
          "total_count" : 32,
          "min" : 1096.0,
          "max" : 3974.0,
          "total" : 79892.0,
          "mean" : 2496.625
        }, {
          "term" : "FL",
          "count" : 20,
          ...

    ("term": group by field; "count" through "mean": stats)
  48. histogram facet request

    $ curl -XGET "localhost:9200/test-data/cities/_search?pretty" -d '{
      "facets": {
        "population_ranges": {
          "histogram": {
            "key_field": "population2012",
            "value_field": "density",
            "interval": 500000
          }
        }
      }
    }'

    ("key_field": group by this field; "value_field": calculate stats by this field)
  49. histogram facet response

    "facets" : {
      "population_ranges" : {
        "_type" : "histogram",
        "entries" : [ {
          "key" : 0,
          "count" : 255,
          "min" : 171.0,
          "max" : 17346.0,
          "total" : 980306.0,
          "total_count" : 252,
          "mean" : 3890.1031746031745
        }, {
          "key" : 500000,
          "count" : 25,
          "min" : 956.0,
          "max" : 17179.0,
          "total" : 116597.0,
          "total_count" : 25,
          "mean" : 4663.88
        }, {
          "key" : 1000000,
          "count" : 4,
          "min" : 2798.0,
          ...

    ("key": group by field (population); "count" through "mean": stats (density))
  50. MOAR!!! But what if I want an average density by population histogram for each state?
  51. aggregations: Buckets, Calculators
  52. aggs = buckets + calcs

    buckets: CA, TX, MA, CO, AZ

    "facets" : {
      "population_ranges" : {
        "_type" : "histogram",
        "entries" : [ {
          "key" : 0,
          "count" : 255,
          "min" : 171.0,
          "max" : 17346.0,
          "total" : 980306.0,
          "total_count" : 252,
          "mean" : 3890.1031746031745
        }, {
          "key" : 500000,
          "count" : 25,
          "min" : 956.0,
          "max" : 17179.0,
          "total" : 116597.0,
          "total_count" : 25,
          "mean" : 4663.88
        }, {
          "key" : 1000000,
          "count" : 4,
          "min" : 2798.0,
          "max" : 4020.0,
          "total" : 13216.0,
          "total_count" : 4,
          "mean" : 3304.0
        }, {
          ...
  53. aggs = buckets + calcs

    buckets: CA, TX, MA, CO, AZ
  54. density by state aggregation

    $ curl -XGET "localhost:9200/test-data/cities/_search?pretty" -d '{
      "aggs" : {
        "mean_density_by_state" : {
          "terms" : {
            "field" : "state"
          },
          "aggs": {
            "mean_density": {
              "avg" : {
                "field" : "density"
              }
            }
          }
        }
      }
    }'

    (terms "field": group by this field; avg "field": calculate stats for this field)
  55. aggregation response

    "aggregations" : {
      "mean_density_by_state" : {
        "terms" : [ {
          "term" : "CA",
          "doc_count" : 69,
          "mean_density" : {
            "value" : 5558.623188405797
          }
        }, {
          "term" : "TX",
          "doc_count" : 32,
          "mean_density" : {
            "value" : 2496.625
          }
        }, {
          "term" : "FL",
          "doc_count" : 20,
          "mean_density" : {
            "value" : 4006.6
          }
        }, {
          "term" : "CO",
          "doc_count" : 11,
          ...

    ("term": group by state; "mean_density": density stats)
  56. density by population aggregation

    $ curl -XGET "localhost:9200/test-data/cities/_search?pretty" -d '{
      "aggs" : {
        "mean_density_by_population" : {
          "histogram" : {
            "field" : "population2012",
            "interval": 500000
          },
          "aggs": {
            "mean_density": {
              "avg" : {
                "field" : "density"
              }
            }
          }
        }
      }
    }'

    (histogram: group by population; avg: calculate stats on density)
  57. aggregation response

    "aggregations" : {
      "mean_density_by_population" : [ {
        "key" : 0,
        "doc_count" : 255,
        "mean_density" : {
          "value" : 3890.1031746031745
        }
      }, {
        "key" : 500000,
        "doc_count" : 25,
        "mean_density" : {
          "value" : 4663.88
        }
      }, {
        "key" : 1000000,
        "doc_count" : 4,
        "mean_density" : {
          "value" : 3304.0
        }
      }, {
        "key" : 1500000,
        "doc_count" : 1,
        "mean_density" : {
          ...

    ("key": group by population; "mean_density": density stats)
  58. density by population by state

    $ curl -XGET "localhost:9200/test-data/cities/_search?pretty" -d '{
      "aggs" : {
        "mean_density_by_population_by_state": {
          "terms" : { "field" : "state" },
          "aggs": {
            "mean_density_by_population" : {
              "histogram" : {
                "field" : "population2012",
                "interval": 500000
              },
              "aggs": {
                "mean_density": {
                  "avg" : {
                    "field" : "density"
                  }
                }
              }
            }
          }
        }
      }
    }'

    (terms: group by state; histogram: group by population; avg: calculate stats on density)
  59. aggregation response

    "aggregations" : {
      "mean_density_by_population_by_state" : {
        "terms" : [ {
          "term" : "CA",
          "doc_count" : 69,
          "mean_density_by_population" : [ {
            "key" : 0,
            "doc_count" : 64,
            "mean_density" : {
              "value" : 5382.453125
            }
          }, {
            "key" : 500000,
            "doc_count" : 3,
            "mean_density" : {
              "value" : 8985.333333333334
            }
          }, {
            "key" : 1000000,
            "doc_count" : 1,
            "mean_density" : {
              "value" : 4020.0
            }
          }, {
            "key" : 3500000,
            "doc_count" : 1,
            "mean_density" : {
              "value" : 8092.0
            }
          } ]
        }, {
          "term" : "TX",
          "doc_count" : 32,
          ...

    ("term": group by state; "key": group by population; "mean_density": stats on density)
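
The "buckets + calcs" idea behind the nested request can be modeled in a few lines of plain Python. This is a conceptual sketch over made-up in-memory documents, not Elasticsearch's implementation; the names `terms`, `histogram`, `avg` and `aggregate` are hypothetical helpers that mirror the aggregation names in the request.

```python
from collections import defaultdict

# Toy documents with made-up values (not the data set from the talk).
cities = [
    {"state": "CA", "population2012": 300000, "density": 5000},
    {"state": "CA", "population2012": 450000, "density": 7000},
    {"state": "CA", "population2012": 800000, "density": 9000},
    {"state": "TX", "population2012": 1200000, "density": 3000},
]


def terms(field):
    """Bucketing function: one bucket per distinct value of the field."""
    return lambda doc: doc[field]


def histogram(field, interval):
    """Bucketing function: the key is the lower bound of the doc's interval."""
    return lambda doc: (doc[field] // interval) * interval


def avg(field):
    """Calculator: mean of the field over the docs in a bucket."""
    return lambda docs: sum(d[field] for d in docs) / len(docs)


def aggregate(docs, bucketers, calc):
    """Apply bucketers outermost first, then run calc on each innermost bucket."""
    if not bucketers:
        return calc(docs)
    buckets = defaultdict(list)
    for doc in docs:
        buckets[bucketers[0](doc)].append(doc)
    return {
        key: aggregate(members, bucketers[1:], calc)
        for key, members in buckets.items()
    }


result = aggregate(
    cities,
    [terms("state"), histogram("population2012", 500000)],
    avg("density"),
)
print(result)
```

Because bucketing functions and calculators are separate pieces, swapping the order of the two bucketers or replacing `avg` with another calculator needs no change to `aggregate`, which is the flexibility facets lacked.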
  60. calc aggregators

    • avg
    • min
    • max
    • sum
    • count
    • stats
    • extended stats
  61. bucket aggregators

    • global
    • filter
    • missing
    • terms
    • range
    • date range
    • ip range
    • histogram
    • date histogram
    • geo distance
    • nested
  62. https://github.com/elasticsearch/elasticsearch/issues/3300
  63. thank you!

    Igor Motov
    twitter: @imotov
    email: igor.motov@elasticsearch.com

    • Support: http://elasticsearch.com/support
    • Training: http://training.elasticsearch.com/
    • We are hiring: http://elasticsearch.com/about/jobs/