Elasticsearch - aggregations

Elasticsearch Inc
May 26, 2014

"Elasticsearch - aggregations" at Berlin Buzzwords 2014


Transcript

  1. Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited.

    outline
    • what aggregations are
    • why we built them
    • how they work
      what the trade-offs are
  2. aggregations

    • analytics
      histograms, distributions, statistics
    • over any partition of your data
      anything that can be selected with queries/filters
    • in near real time
      computed on the fly, ~1s refresh interval
    • that can be composed
      unlike facets
  3. bucket / metrics

    • bucket
      terms, histogram, range, filter, geohash grid
    • metrics
      stats, min / max / avg / sum, percentiles, cardinality

    root aggregation: collects everything
    inner aggregation: bucket
    leaf aggregation: bucket or metric
  4. traffic analysis

    { "source_ip" : "77.104.12.13", "timestamp" : "2014-05-25T23:44:12.779Z" }

    [Chart: unique visitors per day, Mon to Sun]

    aggregation tree: root → histogram (timestamp) → cardinality (source_ip)
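The tree on this slide could be written as a search request body along the following lines. This is a sketch rather than the request used in the talk: the aggregation names `visits_per_day` and `unique_visitors` are made up for illustration, the daily interval is taken from the chart, and `interval` is the parameter name from the 1.x-era `date_histogram` (later releases use `calendar_interval`).

```python
import json

# Hypothetical aggregation names; only the field names come from the slide.
unique_visitors_per_day = {
    "size": 0,  # assumption: only aggregation results are wanted, no hits
    "aggs": {
        "visits_per_day": {
            # one bucket per day, keyed on the timestamp field
            "date_histogram": {"field": "timestamp", "interval": "day"},
            "aggs": {
                # approximate count of distinct source IPs inside each day bucket
                "unique_visitors": {"cardinality": {"field": "source_ip"}}
            },
        }
    },
}

print(json.dumps(unique_visitors_per_day, indent=2))
```

The metric is nested under the bucket aggregation, so it is computed once per day bucket rather than once for the whole result set.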
  5. performance analysis

    { "resp_time" : 205, "timestamp" : "2014-05-25T23:44:12.779Z" }

    [Chart: median, 90th and 99th percentiles over time, 0:00 to 21:00]

    aggregation tree: root → histogram (timestamp) → percentiles (resp_time)
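A request body for this slide might look like the sketch below. The aggregation names `over_time` and `resp_time_percentiles` and the hourly interval are assumptions; the slide only gives the field names and the three percentiles. `interval` is again the 1.x-era parameter name.

```python
import json

# Hypothetical names and interval; fields and percentiles come from the slide.
latency_over_time = {
    "size": 0,  # assumption: aggregation results only
    "aggs": {
        "over_time": {
            "date_histogram": {"field": "timestamp", "interval": "hour"},
            "aggs": {
                "resp_time_percentiles": {
                    # median, 90th and 99th percentile of the response time
                    "percentiles": {"field": "resp_time", "percents": [50, 90, 99]}
                }
            },
        }
    },
}

print(json.dumps(latency_over_time, indent=2))
```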
  6. e-commerce

    { "category" : "Dresses", "site" : "Zalando", "brand" : "Desigual", "designation": "dress", "price": 85 }

    • Dresses: 23 offers, 9 sites
      • Urbanist: 12 min_price: 60
      • Desigual: 8 min_price: 85
      • Life: 3 min_price: 52
    • Shoes: 19, 3 sites
    • Skirts: 8, 5 sites

    aggregation tree: root → terms (category) → [ cardinality (site), terms (brand) → min (price) ]
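The composition on this slide, two sibling aggregations under one bucket aggregation with a further metric below one of them, can be sketched as a nested request body. The aggregation names (`categories`, `sites`, `brands`, `min_price`) and the `describe` helper are illustrative only; the helper just prints the tree back in the slide's notation.

```python
# Hypothetical aggregation names; field names come from the sample document.
offers_by_category = {
    "size": 0,  # assumption: aggregation results only
    "aggs": {
        "categories": {
            "terms": {"field": "category"},
            "aggs": {
                "sites": {"cardinality": {"field": "site"}},
                "brands": {
                    "terms": {"field": "brand"},
                    "aggs": {"min_price": {"min": {"field": "price"}}},
                },
            },
        }
    },
}

def describe(aggs, depth=0):
    """Return one 'type (field)' line per aggregation, indented by nesting depth."""
    lines = []
    for body in aggs.values():
        agg_type = next(key for key in body if key != "aggs")
        lines.append("  " * depth + f"{agg_type} ({body[agg_type]['field']})")
        lines.extend(describe(body.get("aggs", {}), depth + 1))
    return lines

print("\n".join(describe(offers_by_category["aggs"])))
```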
  11. why on elasticsearch?

    • powerful when combined with search
      data exploration
    • search engines have had faceted search for a very long time
      storage is optimized for such a workload
    • aggregations are a new iteration with increased capabilities / flexibility
  12. why is it fast?

    • data stored to make information retrieval fast
      yet indexing remains faster than you would expect
    • optimized data structures
      compressed columnar storage (field data / doc values)
      strings are enums (per segment)
    • single pass on your data
      no matter how many levels of aggregations there are
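To illustrate "strings are enums (per segment)", here is a toy sketch of the idea: each distinct string in a segment gets a small integer, so counting terms becomes array indexing instead of string hashing. This is a simplified model and not Elasticsearch's actual data structures; `encode_segment` is a made-up helper and the sample values are borrowed from the category column used later in the deck.

```python
def encode_segment(values):
    """Map each string to its position in the segment's sorted, de-duplicated term list."""
    terms = sorted(set(values))
    ordinal_of = {term: i for i, term in enumerate(terms)}
    return terms, [ordinal_of[value] for value in values]

# One column of string values, one entry per document in the segment.
terms, ordinals = encode_segment(["Shoes", "Clothing", "Shoes", "Sports", "Sports"])

# Counting per term now only touches integers.
counts = [0] * len(terms)
for ordinal in ordinals:
    counts[ordinal] += 1

# Strings are looked up again only when the result is built.
result = dict(zip(terms, counts))
print(result)
```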
  13. how it works (shard level)

    [Diagram: inverted index, top hits collector, aggregations collector]
  14. how it works (shard level)

    Category   Price
    Shoes      60
    Clothing   80
    Shoes      50
    Sports     10
    Sports     35

    terms (category) → min (price), computed per bucket of the parent aggregation
  15. how it works (shard level)

    Documents as on slide 14; buckets after the first document:

    Bucket     Docs   min (price)
    Shoes      1      60
  16. how it works (shard level)

    Documents as on slide 14; buckets after the second document:

    Bucket     Docs   min (price)
    Shoes      1      60
    Clothing   1      80
  17. how it works (shard level)

    Documents as on slide 14; buckets after the third document:

    Bucket     Docs   min (price)
    Shoes      2      50
    Clothing   1      80
  18. how it works (shard level)

    Documents as on slide 14; buckets after the fourth document:

    Bucket     Docs   min (price)
    Shoes      2      50
    Clothing   1      80
    Sports     1      10
  19. how it works (shard level)

    Documents as on slide 14; buckets after the fifth and last document:

    Bucket     Docs   min (price)
    Shoes      2      50
    Clothing   1      80
    Sports     2      10
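The shard-level walk-through can be reproduced with a few lines of Python. This is a toy model of the collection step, not Elasticsearch code: `collect_shard` is a made-up name, and the documents are the five category/price pairs from the slides.

```python
# (category, price) pairs in the order shown on the slides.
docs = [("Shoes", 60), ("Clothing", 80), ("Shoes", 50), ("Sports", 10), ("Sports", 35)]

def collect_shard(docs):
    """Single pass: each document updates the doc count and running min of its bucket."""
    buckets = {}
    for category, price in docs:
        bucket = buckets.setdefault(category, {"doc_count": 0, "min_price": float("inf")})
        bucket["doc_count"] += 1
        bucket["min_price"] = min(bucket["min_price"], price)
    return buckets

shard_result = collect_shard(docs)
for category, bucket in shard_result.items():
    print(category, bucket["doc_count"], bucket["min_price"])
# Shoes 2 50
# Clothing 1 80
# Sports 2 10
```

Each document is visited once: it is routed to its terms bucket, and the min metric of that bucket is updated in the same step.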
  20. how it works (cluster level)

    Per-shard results (category, doc count, min price):

    Clothing     5   45        Shoes      2   50
    Shoes        3   60        Clothing   1   80
    Accessories  12  5         Sports     2   10
  21. how it works (cluster level)

    Per-shard results as on slide 20; merged so far:

    Shoes        5   50
  22. how it works (cluster level)

    Per-shard results as on slide 20; merged so far:

    Shoes        5   50
    Clothing     6   45
  23. how it works (cluster level)

    Per-shard results as on slide 20; merged so far:

    Shoes        5   50
    Clothing     6   45
    Sports       2   10
  24. how it works (cluster level)

    Per-shard results as on slide 20; final merged result:

    Shoes        5   50
    Clothing     6   45
    Sports       2   10
    Accessories  12  5
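The cluster-level merge shown on these slides can be sketched the same way: buckets with the same key are combined by summing doc counts and taking the smaller of the two minima. `reduce_shards` is a made-up name, and the two inputs are the per-shard results from the slides as (doc count, min price) pairs.

```python
# Per-shard results from the slides: category -> (doc count, min price).
shard_a = {"Clothing": (5, 45), "Shoes": (3, 60), "Accessories": (12, 5)}
shard_b = {"Shoes": (2, 50), "Clothing": (1, 80), "Sports": (2, 10)}

def reduce_shards(shard_results):
    """Merge buckets by key: doc counts are summed, min prices are min-ed."""
    merged = {}
    for result in shard_results:
        for category, (count, min_price) in result.items():
            if category in merged:
                prev_count, prev_min = merged[category]
                merged[category] = (prev_count + count, min(prev_min, min_price))
            else:
                merged[category] = (count, min_price)
    return merged

merged = reduce_shards([shard_a, shard_b])
for category in ("Shoes", "Clothing", "Sports", "Accessories"):
    print(category, *merged[category])
# Shoes 5 50
# Clothing 6 45
# Sports 2 10
# Accessories 12 5
```

The merge works because both doc count and min can be combined from partial results without going back to the documents.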
  25. goodies

    • support for document relations via nested documents and the nested/reverse_nested aggs
      no parent/child support (yet?)
    • significant_terms
      find the uncommonly common
    • upcoming top_hits aggregation in 1.3
      compute top hits on each bucket
    • performance / memory usage improved in 1.2
      upgrade if you rely on aggregations