New Features in Elasticsearch v1.0

Igor Motov
November 04, 2013

Boston Elasticsearch Meetup

Transcript

  1. Copyright Elasticsearch 2013. Copying, publishing and/or distributing without written permission is strictly prohibited
    Igor Motov
    New Features in Elasticsearch 1.0

  2.
    about
    • Developer at Elasticsearch Inc
    joined Elasticsearch Inc.: Oct 2012
    Elasticsearch contributor since Apr 2011
    • Elasticsearch Inc
    founded: July 2012
    headquarters: Amsterdam and Los Altos, CA
    provides: training (public & onsite), development support,
    production support subscription

  3.
    v1.0 ?
    • v0.4.0 - Feb 8, 2010
    • v0.5.0 - Mar 5, 2010
    • …
    • v0.19.0 - Mar 1, 2012
    • v0.20.0 - Dec 7, 2012
    • v0.90.0 - Apr 29, 2013
    • v1.0 - Soon

  4.
    v1.0
    • rolling upgrades
    because not everyone can afford "scheduled maintenance"
    • ability to back up data
    because "rm -rf" happens

  5.
    v1.0
    • rolling upgrades
    • snapshot/restore (backup)
    • _cat API
    • aggregations
    • distributed percolator

  6.
    rolling upgrades
    Photo by Kamyar Adl

  7.
    snapshot and restore
    Photo by John http://www.flickr.com/people/60026579@N00

  8.
    snapshot and restore
    backup

  9.
    backup in 0.90
    1. disable flush
    2. find all primary shard locations (optional)
    3. copy files from primary shards (rsync)
    4. enable flush
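
The four steps above are easy to get wrong, because step 4 has to run even when the copy in step 3 fails. A minimal Python sketch of that control flow follows. It assumes flushing is toggled through the index setting `index.translog.disable_flush`; `send` and `copy_files` are hypothetical callbacks standing in for an HTTP client and the rsync call, so nothing here contacts a real cluster.

```python
from contextlib import contextmanager


@contextmanager
def flush_disabled(send, index):
    """Steps 1 and 4: disable flush for `index`, then always re-enable it."""
    # Assumption: the setting name is index.translog.disable_flush.
    path = "/%s/_settings" % index
    send("PUT", path, {"index": {"translog.disable_flush": True}})
    try:
        yield
    finally:
        # Runs even if the file copy raised, so flush is never left disabled.
        send("PUT", path, {"index": {"translog.disable_flush": False}})


def backup(send, copy_files, index):
    """Step 3 wrapped by steps 1 and 4; copy_files stands in for rsync."""
    with flush_disabled(send, index):
        copy_files(index)
```

The context manager is only a design choice for the sketch: it keeps the "enable flush" request next to the "disable flush" request so the two cannot drift apart.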

  10.
    backup in v1.0
    $ curl -XPUT localhost:9200/_snapshot/my_backup/snapshot_20131010
    repository: my_backup
    snapshot name: snapshot_20131010

  11.
    repositories
    • Snapshot Storage
    Shared File System - v1.0
    S3 - v1.0
    HDFS
    Google Compute Engine
    Microsoft Azure
    ...

  12.
    register repository
    $ curl -XPUT "localhost:9200/_snapshot/my_backup" -d '{
      "type": "fs",
      "settings": {
        "location": "/mnt/es-test-repo"
      }
    }'
    repository: my_backup
    repository type: fs
    repository location: /mnt/es-test-repo
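
For scripted setups, the same registration call can be assembled programmatically. The sketch below only builds the method, URL and JSON body shown on the slide and sends nothing; the function name and the `host` default are illustrative assumptions, not part of any Elasticsearch client.

```python
import json


def register_repository_request(repository, location, host="localhost:9200"):
    """Return (method, url, body) for registering a shared file system repository."""
    url = "http://%s/_snapshot/%s" % (host, repository)
    body = json.dumps({"type": "fs", "settings": {"location": location}})
    return "PUT", url, body


method, url, body = register_repository_request("my_backup", "/mnt/es-test-repo")
print(method, url)
print(body)
```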

  13.
    start snapshot
    $ curl -XPUT "localhost:9200/_snapshot/my_backup/snapshot_20131010" -d '{
      "indices": "+test_*,-test_4"
    }'
    repository: my_backup
    snapshot name: snapshot_20131010
    index list (optional): +test_*,-test_4
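
The index list reads as "include everything matching `test_*`, then exclude `test_4`". The following Python sketch illustrates that reading with shell-style wildcards. It is an approximation for explanation only, not the resolution logic Elasticsearch itself uses, and the index names are made up.

```python
from fnmatch import fnmatchcase


def resolve_indices(expression, existing):
    """Apply +include / -exclude wildcard patterns from left to right."""
    selected = []
    for part in expression.split(","):
        part = part.strip()
        if not part:
            continue
        exclude = part.startswith("-")
        pattern = part.lstrip("+-")
        matches = [name for name in existing if fnmatchcase(name, pattern)]
        if exclude:
            selected = [name for name in selected if name not in matches]
        else:
            selected += [name for name in matches if name not in selected]
    return selected


print(resolve_indices("+test_*,-test_4", ["prod", "test_1", "test_2", "test_4"]))
```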

  14.
    restore in 0.90
    1. close the index (shut down the cluster)
    2. find all existing index shards
    3. replace all index shards with data from backup
    4. open the index (start the cluster)

  15.
    restore in 1.0
    $ curl -XPOST "localhost:9200/test_*/_close"
    close all indices that start with test_
    $ curl -XPOST "localhost:9200/_snapshot/my_backup/snapshot_20131010/_restore" -d '{
      "indices": "test_*"
    }'
    repository: my_backup
    snapshot name: snapshot_20131010
    index list: test_*
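
The order matters here: the indices are closed first and restored second. A small Python sketch of that two-request plan follows. It assumes the restore endpoint is the snapshot URL with a `_restore` suffix, as in the released 1.0 API; the function name and `host` default are hypothetical, and the requests are only built, never sent.

```python
import json


def restore_requests(repository, snapshot, indices, host="localhost:9200"):
    """Return the ordered (method, url, body) requests for a restore."""
    base = "http://%s" % host
    close = ("POST", "%s/%s/_close" % (base, indices), None)
    restore = (
        "POST",
        # Assumption: restore is exposed as <snapshot URL>/_restore.
        "%s/_snapshot/%s/%s/_restore" % (base, repository, snapshot),
        json.dumps({"indices": indices}),
    )
    return [close, restore]


for method, url, body in restore_requests("my_backup", "snapshot_20131010", "test_*"):
    print(method, url, body or "")
```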

  16.
    https://github.com/elasticsearch/elasticsearch/issues/3826

  17.
    distributed percolator
    Image Source: Wikipedia,

  18.
    percolator
    • reverse search
    • alerts
    • updatable search results

  19.
    registering percolator in 0.90
    $ curl -XPUT "localhost:9200/_percolator/twitter/es-tweets" -d '{
      "query": {
        "match": { "text": "elasticsearch" }
      }
    }'
    target index: twitter
    query id: es-tweets

  20.
    document percolation in 0.90
    $ curl -XGET "localhost:9200/twitter/tweet/_percolate" -d '{
      "doc": {
        "text": "#elasticsearch is awesome",
        "nick": "@imotov",
        "name": "Igor Motov",
        "date": "2013-11-03"
      }
    }'
    target index: twitter
    percolation end point: _percolate
    document to be percolated: the "doc" object
    {
      "ok": true,
      "matches": ["es-tweets"]
    }
    matching queries: es-tweets

  21.
    how does it work in 0.90?
    • all queries are stored in a special _percolator index
    • the _percolator index has 1 primary shard, which is replicated to every node
    • each percolated document is indexed in memory
    • all queries are executed against this document sequentially
    • execution time is linear in the number of queries!
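
The behaviour described above can be mimicked in a few lines of Python. This is a toy model under simplifying assumptions: queries are reduced to a single "field contains term" check, and the in-memory index is just a set of lowercase tokens per field. The class and method names are made up for illustration.

```python
import re


def tokenize(text):
    """Lowercase word tokens, standing in for a real analyzer."""
    return set(re.findall(r"\w+", str(text).lower()))


class ToyPercolator:
    def __init__(self):
        self.queries = {}  # query id -> (field, term)

    def register(self, query_id, field, term):
        self.queries[query_id] = (field, term.lower())

    def percolate(self, doc):
        # "Index" the single document in memory, once.
        index = {field: tokenize(value) for field, value in doc.items()}
        matches = []
        # Every registered query runs against the document, one after another,
        # so the cost grows linearly with the number of queries.
        for query_id, (field, term) in self.queries.items():
            if term in index.get(field, set()):
                matches.append(query_id)
        return matches


percolator = ToyPercolator()
percolator.register("es-tweets", "text", "elasticsearch")
percolator.register("cat-tweets", "text", "cats")
print(percolator.percolate({"text": "#elasticsearch is awesome", "nick": "@imotov"}))
```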

  23.
    that doesn’t scale

  24.
    registering percolator in 1.0
    $ curl -XPUT "localhost:9200/some_index/_percolator/es-tweets" -d '{
      "query": {
        "match": { "body": "elasticsearch" }
      }
    }'
    some_index: any index with as many shards as you need
    _percolator: reserved percolator type
    query id: es-tweets
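
Storing the queries in a regular index means they are spread over that index's shards, so each shard only has to run its own share. The Python sketch below illustrates the idea with a toy model. The CRC32-based routing, the thread pool and all names are assumptions chosen for the example and do not describe how Elasticsearch routes or executes percolator queries.

```python
import re
import zlib
from concurrent.futures import ThreadPoolExecutor


def shard_for(query_id, num_shards):
    """Deterministic routing of a query id to a shard (illustrative hash)."""
    return zlib.crc32(query_id.encode("utf-8")) % num_shards


class ShardedPercolator:
    def __init__(self, num_shards):
        self.shards = [dict() for _ in range(num_shards)]

    def register(self, query_id, field, term):
        shard = self.shards[shard_for(query_id, len(self.shards))]
        shard[query_id] = (field, term.lower())

    def percolate(self, doc):
        tokens = {f: set(re.findall(r"\w+", str(v).lower())) for f, v in doc.items()}

        def run(shard):
            # Each shard checks only the queries it stores.
            return [qid for qid, (f, term) in shard.items() if term in tokens.get(f, ())]

        with ThreadPoolExecutor(max_workers=len(self.shards)) as pool:
            partial = list(pool.map(run, self.shards))
        # Merge the per-shard results into one response.
        return sorted(qid for part in partial for qid in part)


percolator = ShardedPercolator(num_shards=3)
percolator.register("es-tweets", "body", "elasticsearch")
percolator.register("cat-tweets", "body", "cats")
print(percolator.percolate({"body": "#elasticsearch is awesome"}))
```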

  25.
    now that can scale!

  26.
    multi index support
    $ curl -XGET "localhost:9200/twitter,facebook/_percolate" -d '{
      "doc": {
        "body": "#elasticsearch is awesome",
        "nick": "@imotov",
        "name": "Igor Motov",
        "date": "2013-11-03"
      }
    }'
    document to be percolated: the "doc" object

  27.
    full alias support
    $ curl -XGET "localhost:9200/soc_media_alias/_percolate" -d '{
      "doc": {
        "body": "#elasticsearch is awesome",
        "nick": "@imotov",
        "name": "Igor Motov",
        "date": "2013-11-03"
      }
    }'
    document to be percolated: the "doc" object

  28.
    other features
    • percolation of existing document
    • percolate count api
    • filter support (in addition to queries in 0.90)
    • highlighting
    • scoring
    • multi percolate (bulk percolation)

  29.
    https://github.com/elasticsearch/elasticsearch/issues/3173

  30.
    _cat/* api
    Image Source: Wikipedia,

  31.
    _cat/* api
    no, it will not help you organize your
    massive collection of cat pictures

  32.
    _cat/* api
    It’s because humans suck
    at reading JSON

  34.
    Which one is the master? (v0.90)
    $ curl "localhost:9200/_cluster/state?pretty&filter_metadata=true&filter_routing_table=true"
    {
      "cluster_name" : "elasticsearch",
      "master_node" : "GNf0hEXlTfaBvQXKBF300A",
      "blocks" : { },
      "nodes" : {
        "ObdRqLHGQ6CMI5rOEstA5A" : {
          "name" : "Triton",
          "transport_address" : "inet[/10.0.1.11:9300]",
          "attributes" : { }
        },
        "4C7pKbfhTvu0slcSy_G4_w" : {
          "name" : "Kid Colt",
          "transport_address" : "inet[/10.0.1.12:9300]",
          "attributes" : { }
        },
        "GNf0hEXlTfaBvQXKBF300A" : {
          "name" : "Lang, Steven",
          "transport_address" : "inet[/10.0.1.13:9300]",
          "attributes" : { }
        }
      }
    }

  35.
    Which one is the master? (v1.0)
    $ curl localhost:9200/_cat/master
    GNf0hEXlTfaBvQXKBF300A 10.0.1.13 Lang, Steven
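
To see what the one-line `_cat/master` answer saves, here is roughly what a script had to do with the 0.90 cluster state: look up `master_node`, find that id under `nodes`, and pull the IP out of `transport_address`. The sketch works on a trimmed copy of the JSON from the slide; the function name and the regular expression are illustrative assumptions.

```python
import json
import re

CLUSTER_STATE = json.loads("""
{
  "cluster_name": "elasticsearch",
  "master_node": "GNf0hEXlTfaBvQXKBF300A",
  "nodes": {
    "ObdRqLHGQ6CMI5rOEstA5A": {"name": "Triton", "transport_address": "inet[/10.0.1.11:9300]"},
    "4C7pKbfhTvu0slcSy_G4_w": {"name": "Kid Colt", "transport_address": "inet[/10.0.1.12:9300]"},
    "GNf0hEXlTfaBvQXKBF300A": {"name": "Lang, Steven", "transport_address": "inet[/10.0.1.13:9300]"}
  }
}
""")


def master_line(state):
    """Format the master as 'id ip name', like the _cat/master output above."""
    master_id = state["master_node"]
    node = state["nodes"][master_id]
    match = re.search(r"/([\d.]+):\d+", node["transport_address"])
    ip = match.group(1) if match else "-"
    return "%s %s %s" % (master_id, ip, node["name"])


print(master_line(CLUSTER_STATE))
```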

  36.
    _cat/count
    $ curl localhost:9200/_cat/count
    1383501234301 12:53:54 3344067
    count: 3344067
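
Because the output is plain whitespace-separated text, it is also trivial to consume from a script. The Python sketch below splits the line shown above into its three columns. Treating the first column as milliseconds since the epoch is an assumption based on its 13 digits; the function name is made up.

```python
from datetime import datetime, timezone


def parse_cat_count(line):
    """Split a _cat/count line into epoch, wall-clock time and document count."""
    epoch, clock, count = line.split()
    return {"epoch_ms": int(epoch), "time": clock, "count": int(count)}


row = parse_cat_count("1383501234301 12:53:54 3344067")
# Assumption: the first column is in milliseconds.
utc = datetime.fromtimestamp(row["epoch_ms"] / 1000.0, tz=timezone.utc)
print(row["count"], utc.strftime("%Y-%m-%d %H:%M:%S"))
```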

  37.
    _cat/* api
    • /_cat/allocation
    • /_cat/count
    • /_cat/health
    • /_cat/master
    • /_cat/nodes
    • /_cat/recovery
    • /_cat/shards
    • /_cat/indices

  38.
    aggregations

  39.
    facets in 0.90
    facets

  40.
    What’s wrong with facets in 0.90?

  41.
    Nothing!

  42.
    We just want MOAR features!!!

  43.
    facets in 0.90
    • terms / terms stats
    • range
    • histogram / date histogram
    • filter/query
    • statistical
    • geo distance

  44.
    terms stats facet
    • Divides documents into buckets based on the terms of a selected field
    • Calculates statistics on some other field of these documents for each bucket
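
In plain Python, the two bullets above amount to a group-by followed by summary statistics per group. The sketch below mirrors the output fields of the facet (count, min, max, total, mean) on a tiny data set. Only the Boston density comes from the deck; the other rows are made up, and the function is an illustration rather than the facet's implementation.

```python
from collections import defaultdict


def terms_stats(docs, key_field, value_field):
    """Group docs by key_field and compute stats over value_field per group."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[key_field]].append(float(doc[value_field]))
    entries = []
    for term, values in groups.items():
        total = sum(values)
        entries.append({
            "term": term,
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "total": total,
            "mean": total / len(values),
        })
    # Largest buckets first, ties broken by term.
    return sorted(entries, key=lambda e: (-e["count"], e["term"]))


cities = [
    {"state": "MA", "density": "12793"},  # Boston, from the deck
    {"state": "MA", "density": "1000"},   # made-up row
    {"state": "TX", "density": "2000"},   # made-up row
]
for entry in terms_stats(cities, "state", "density"):
    print(entry)
```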

  45.
    index of large US cities
    {
      "rank": "21",
      "city": "Boston",
      "state": "MA",
      "population2012": "636479",
      "population2010": "617594",
      "land_area": "48.277",
      "density": "12793",
      "ansi": "619463",
      "location": {
        "lat": "42.332",
        "lon": "71.0202"
      }
    }

  46.
    terms stats facet request
    $ curl -XGET "localhost:9200/test-data/cities/_search?pretty" -d '{
      "facets": {
        "stat1": {
          "terms_stats": {
            "key_field": "state",
            "value_field": "density"
          }
        }
      }
    }'
    key_field: group by this field
    value_field: calculate stats for this field

  47.
    terms stats facet response
    "facets" : {
      "stat1" : {
        "_type" : "terms_stats",
        "missing" : 0,
        "terms" : [ {
          "term" : "CA",
          "count" : 69,
          "total_count" : 69,
          "min" : 1442.0,
          "max" : 17179.0,
          "total" : 383545.0,
          "mean" : 5558.623188405797
        }, {
          "term" : "TX",
          "count" : 32,
          "total_count" : 32,
          "min" : 1096.0,
          "max" : 3974.0,
          "total" : 79892.0,
          "mean" : 2496.625
        }, {
          "term" : "FL",
          "count" : 20,
          ...
    term: group by field
    count, total_count, min, max, total, mean: stats

  48.
    histogram facet request
    $ curl -XGET "localhost:9200/test-data/cities/_search?pretty" -d '{
      "facets": {
        "population_ranges": {
          "histogram": {
            "key_field": "population2012",
            "value_field": "density",
            "interval": 500000
          }
        }
      }
    }'
    key_field: group by this field
    value_field: calculate stats for this field
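
With a fixed `interval`, each document lands in the bucket whose key is its value rounded down to a multiple of the interval. A short Python sketch of that bucketing follows; the helper names are hypothetical, and the rounding rule is the usual one for fixed-width histograms rather than something stated on the slide. Boston's 2012 population is taken from the sample document in the deck, the other values are made up.

```python
from collections import Counter


def bucket_key(value, interval):
    """Round value down to the nearest multiple of interval."""
    return (int(value) // interval) * interval


def histogram_counts(values, interval):
    """Count how many values fall into each bucket, ordered by key."""
    return sorted(Counter(bucket_key(v, interval) for v in values).items())


print(bucket_key(636479, 500000))  # Boston's population2012
print(histogram_counts([636479, 120000, 499999, 1200000], 500000))
```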

  49.
    histogram facet response
    "facets" : {
      "population_ranges" : {
        "_type" : "histogram",
        "entries" : [ {
          "key" : 0,
          "count" : 255,
          "min" : 171.0,
          "max" : 17346.0,
          "total" : 980306.0,
          "total_count" : 252,
          "mean" : 3890.1031746031745
        }, {
          "key" : 500000,
          "count" : 25,
          "min" : 956.0,
          "max" : 17179.0,
          "total" : 116597.0,
          "total_count" : 25,
          "mean" : 4663.88
        }, {
          "key" : 1000000,
          "count" : 4,
          "min" : 2798.0,
          "max" : 4020.0,
          "total" : 13216.0,
          "total_count" : 4,
          "mean" : 3304.0
        }, {
          ...
    key: group by field (population)
    count, min, max, total, total_count, mean: stats (density)

  50.
    MOAR!!!
    But what if I want an average density by
    population histogram for each state?

  51.
    aggregations
    Buckets Calculators

  53.
    aggs = buckets + calcs
    CA
    TX
    MA
    CO
    AZ

  54.
    density by state aggregation
    $ curl -XGET "localhost:9200/test-data/cities/_search?pretty" -d '{
      "aggs" : {
        "mean_density_by_state" : {
          "terms" : {
            "field" : "state"
          },
          "aggs": {
            "mean_density": {
              "avg" : {
                "field" : "density"
              }
            }
          }
        }
      }
    }'
    terms: group by this field
    avg: calculate stats for this field

  55.
    aggregation response
    "aggregations" : {
      "mean_density_by_state" : {
        "terms" : [ {
          "term" : "CA",
          "doc_count" : 69,
          "mean_density" : {
            "value" : 5558.623188405797
          }
        }, {
          "term" : "TX",
          "doc_count" : 32,
          "mean_density" : {
            "value" : 2496.625
          }
        }, {
          "term" : "FL",
          "doc_count" : 20,
          "mean_density" : {
            "value" : 4006.6
          }
        }, {
          "term" : "CO",
          "doc_count" : 11,
          ...
    term: group by state
    mean_density: density stats

  56.
    density by population aggregation
    $ curl -XGET "localhost:9200/test-data/cities/_search?pretty" -d '{
      "aggs" : {
        "mean_density_by_population" : {
          "histogram" : {
            "field" : "population2012",
            "interval": 500000
          },
          "aggs": {
            "mean_density": {
              "avg" : {
                "field" : "density"
              }
            }
          }
        }
      }
    }'
    histogram: group by population
    avg: calculate stats on density

  57.
    aggregation response
    "aggregations" : {
      "mean_density_by_population" : [ {
        "key" : 0,
        "doc_count" : 255,
        "mean_density" : {
          "value" : 3890.1031746031745
        }
      }, {
        "key" : 500000,
        "doc_count" : 25,
        "mean_density" : {
          "value" : 4663.88
        }
      }, {
        "key" : 1000000,
        "doc_count" : 4,
        "mean_density" : {
          "value" : 3304.0
        }
      }, {
        "key" : 1500000,
        "doc_count" : 1,
        "mean_density" : {
          ...
    key: group by population
    mean_density: density stats

  58.
    density by population by state
    $ curl -XGET "localhost:9200/test-data/cities/_search?pretty" -d '{
      "aggs" : {
        "mean_density_by_population_by_state": {
          "terms" : { "field" : "state" },
          "aggs": {
            "mean_density_by_population" : {
              "histogram" : {
                "field" : "population2012",
                "interval": 500000
              },
              "aggs": {
                "mean_density": {
                  "avg" : {
                    "field" : "density"
                  }
                }
              }
            }
          }
        }
      }
    }'
    terms: group by state
    histogram: group by population
    avg: calculate stats on density
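
The nesting in this request is the whole point of "aggs = buckets + calcs": a bucket aggregator splits the documents, and whatever is nested inside runs once per bucket. The Python sketch below imitates that composition on a tiny data set. It is a conceptual model with made-up helper names, not Elasticsearch code; the Boston row uses values from the deck and the remaining rows are invented for the example.

```python
from collections import defaultdict


def terms(docs, field):
    """Bucket aggregator: one bucket of documents per distinct value."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc[field]].append(doc)
    return buckets


def histogram(docs, field, interval):
    """Bucket aggregator: one bucket per fixed-width interval."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[(int(doc[field]) // interval) * interval].append(doc)
    return buckets


def avg(docs, field):
    """Calc aggregator: mean of a numeric field over one bucket."""
    return sum(float(doc[field]) for doc in docs) / len(docs)


def mean_density_by_population_by_state(docs, interval=500000):
    # state buckets -> population buckets -> average density
    return {
        state: {
            key: avg(bucket, "density")
            for key, bucket in sorted(histogram(state_docs, "population2012", interval).items())
        }
        for state, state_docs in sorted(terms(docs, "state").items())
    }


cities = [
    {"city": "Boston", "state": "MA", "population2012": "636479", "density": "12793"},
    {"city": "A", "state": "MA", "population2012": "100000", "density": "4000"},   # made up
    {"city": "B", "state": "MA", "population2012": "200000", "density": "6000"},   # made up
    {"city": "C", "state": "TX", "population2012": "1200000", "density": "3000"},  # made up
]
print(mean_density_by_population_by_state(cities))
```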

  59.
    aggregation response
    "aggregations" : {
      "mean_density_by_population_by_state" : {
        "terms" : [ {
          "term" : "CA",
          "doc_count" : 69,
          "mean_density_by_population" : [ {
            "key" : 0,
            "doc_count" : 64,
            "mean_density" : {
              "value" : 5382.453125
            }
          }, {
            "key" : 500000,
            "doc_count" : 3,
            "mean_density" : {
              "value" : 8985.333333333334
            }
          }, {
            "key" : 1000000,
            "doc_count" : 1,
            "mean_density" : {
              "value" : 4020.0
            }
          }, {
            "key" : 3500000,
            "doc_count" : 1,
            "mean_density" : {
              "value" : 8092.0
            }
          } ]
        }, {
          "term" : "TX",
          "doc_count" : 32,
          ...
    term: group by state
    key: group by population
    mean_density: stats on density

  60.
    calc aggregators
    • avg
    • min
    • max
    • sum
    • count
    • stats
    • extended stats

  61.
    bucket aggregators
    • global
    • filter
    • missing
    • terms
    • range
    • date range
    • ip range
    • histogram
    • date histogram
    • geo distance
    • nested

  62.
    https://github.com/elasticsearch/elasticsearch/issues/3300

  63.
    thank you!
    Igor Motov
    twitter: @imotov
    email: [email protected]
    • Support: http://elasticsearch.com/support
    • Training: http://training.elasticsearch.com/
    • We are hiring: http://elasticsearch.com/about/jobs/
