Slide 1

Slide 1 text

aggregations
Adrien Grand @jpountz
Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited.

Slide 2

Slide 2 text

outline
• what aggregations are
• why we built them
• how they work
  what the trade-offs are

Slide 3

Slide 3 text

aggregations
• analytics
  histograms, distributions, statistics
• over any partition of your data
  anything that can be selected with queries/filters
• in near real time
  computed on the fly, ~1s refresh interval
• that can be composed
  unlike facets

Slide 4

Slide 4 text

bucket / metrics
• bucket
  terms
  histogram
  range
  filter
  geohash grid
• metrics
  stats
  min / max / avg / sum
  percentiles
  cardinality
root aggregation: collects everything
inner aggregation: bucket
leaf aggregation: bucket or metric

Slide 5

Slide 5 text

traffic analysis
{
  "source_ip" : "77.104.12.13",
  "timestamp" : "2014-05-25T23:44:12.779Z"
}
[chart: Unique visitors per day, Mon to Sun]
root
  histogram (timestamp)
    cardinality (source_ip)
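
The traffic-analysis tree (a histogram on timestamp with a cardinality on source_ip under it) can be imitated locally. This is a minimal Python sketch, assuming documents shaped like the one on the slide. The sample documents and the helper name `unique_visitors_per_day` are made up for illustration, and an exact set stands in for the cardinality aggregation, which Elasticsearch computes approximately.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample documents, shaped like the one on the slide.
docs = [
    {"source_ip": "77.104.12.13", "timestamp": "2014-05-25T23:44:12.779Z"},
    {"source_ip": "77.104.12.13", "timestamp": "2014-05-25T08:01:00.000Z"},
    {"source_ip": "10.0.0.7", "timestamp": "2014-05-25T09:30:00.000Z"},
    {"source_ip": "10.0.0.7", "timestamp": "2014-05-26T00:10:00.000Z"},
]


def unique_visitors_per_day(documents):
    """Bucket documents by day, then count distinct source IPs per bucket."""
    visitors = defaultdict(set)
    for doc in documents:
        ts = datetime.strptime(doc["timestamp"], "%Y-%m-%dT%H:%M:%S.%fZ")
        # The day is the bucket key; the set plays the role of the metric.
        visitors[ts.date().isoformat()].add(doc["source_ip"])
    return {day: len(ips) for day, ips in sorted(visitors.items())}


print(unique_visitors_per_day(docs))
```

The same IP seen twice on one day counts once, which is what separates a cardinality from a plain document count.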

Slide 6

Slide 6 text

performance analysis
{
  "resp_time" : 205,
  "timestamp" : "2014-05-25T23:44:12.779Z"
}
[chart: Median, 90th, 99th percentiles over time]
root
  histogram (timestamp)
    percentiles (resp_time)
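
A request body for this tree might look roughly like the sketch below, written as a Python dict. Several things here are assumptions rather than content from the slides: `date_histogram` stands in for the slide's "histogram (timestamp)", the hourly interval is a guess, the aggregation names `over_time` and `resp_time_percentiles` are invented, and the name of the interval parameter depends on the Elasticsearch version in use.

```python
import json

# Assumed request body; field names come from the slide's sample document.
request_body = {
    "size": 0,  # only aggregation results are wanted, no search hits
    "aggs": {
        "over_time": {  # hypothetical aggregation name
            # Bucket aggregation: one bucket per time interval (assumed hourly).
            "date_histogram": {"field": "timestamp", "interval": "hour"},
            "aggs": {
                "resp_time_percentiles": {  # hypothetical aggregation name
                    # Metric aggregation computed inside each time bucket.
                    "percentiles": {"field": "resp_time", "percents": [50, 90, 99]}
                }
            },
        }
    },
}

print(json.dumps(request_body, indent=2))
```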

Slide 7

Slide 7 text

e-commerce
{
  "category" : "Dresses",
  "site" : "Zalando",
  "brand" : "Desigual",
  "designation": "dress",
  "price": 85
}
• Dresses: 23 offers, 9 sites
  • Urbanist: 12, min_price: 60
  • Desigual: 8, min_price: 85
  • Life: 3, min_price: 52
• Shoes: 19, 3 sites
• Skirts: 8, 5 sites
root
  terms (category)
    cardinality (site)
    terms (brand)
      min (price)
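
The e-commerce tree shows why composition matters: a bucket aggregation can hold both metrics and further bucket aggregations. As a rough sketch, the tree can be assembled from two small Python helpers. `metric` and `terms` are hypothetical helpers written for this example, not part of any client library, and the aggregation names (`categories`, `sites`, `brands`, `min_price`) are invented.

```python
import json


def metric(kind, field):
    """Leaf metric aggregation, e.g. metric("min", "price")."""
    return {kind: {"field": field}}


def terms(field, **sub_aggs):
    """Bucket aggregation on a field, with optional named sub-aggregations."""
    agg = {"terms": {"field": field}}
    if sub_aggs:
        agg["aggs"] = sub_aggs
    return agg


# terms (category) -> cardinality (site) and terms (brand) -> min (price)
request_body = {
    "aggs": {
        "categories": terms(
            "category",
            sites=metric("cardinality", "site"),
            brands=terms("brand", min_price=metric("min", "price")),
        )
    }
}

print(json.dumps(request_body, indent=2))
```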

Slide 12

Slide 12 text

why on elasticsearch?
• powerful when combined with search
  data exploration
• search engines have had faceted search for a very long time
  storage is optimized for such a workload
• aggregations are a new iteration
  with increased capabilities / flexibility

Slide 13

Slide 13 text

why is it fast?
• data stored to make information retrieval fast
  yet indexing remains faster than what you expect
• optimized data structures
  compressed columnar storage (field data / doc values)
  strings are enums (per segment)
• single pass on your data
  no matter how many levels of aggregations there are
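
The "strings are enums" point can be illustrated with a toy column. The sketch below is only an illustration of the idea, not how Lucene or Elasticsearch actually lay out their data: each distinct string gets a small integer, documents store that integer, and counting becomes array indexing in one pass.

```python
# Hypothetical per-segment column of category values, one entry per document.
values = ["Shoes", "Clothing", "Shoes", "Sports", "Sports"]

# Term dictionary: each distinct string is assigned an integer (its ordinal).
terms = sorted(set(values))
ordinal_of = {term: i for i, term in enumerate(terms)}

# Columnar storage: documents hold ordinals instead of strings.
column = [ordinal_of[v] for v in values]

# Counting per term is integer array indexing, done in a single pass.
counts = [0] * len(terms)
for ordinal in column:
    counts[ordinal] += 1

# Strings are resolved once per bucket, only when results are rendered.
result = {terms[i]: n for i, n in enumerate(counts)}
print(result)
```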

Slide 14

Slide 14 text

how it works (shard level)
inverted index
top hits collector
aggregations collector

Slide 15

Slide 15 text

how it works (shard level)
Category  Price
Shoes     60
Clothing  80
Shoes     50
Sports    10
Sports    35
terms (category)
min (price)
bucket of the parent aggregation

Slide 16

Slide 16 text

how it works (shard level)
Category  Price
Shoes     60
Clothing  80
Shoes     50
Sports    10
Sports    35
terms (category)  count  min (price)
Shoes             1      60

Slide 17

Slide 17 text

how it works (shard level)
Category  Price
Shoes     60
Clothing  80
Shoes     50
Sports    10
Sports    35
terms (category)  count  min (price)
Shoes             1      60
Clothing          1      80

Slide 18

Slide 18 text

how it works (shard level)
Category  Price
Shoes     60
Clothing  80
Shoes     50
Sports    10
Sports    35
terms (category)  count  min (price)
Shoes             2      50
Clothing          1      80

Slide 19

Slide 19 text

how it works (shard level)
Category  Price
Shoes     60
Clothing  80
Shoes     50
Sports    10
Sports    35
terms (category)  count  min (price)
Shoes             2      50
Clothing          1      80
Sports            1      10

Slide 20

Slide 20 text

how it works (shard level)
Category  Price
Shoes     60
Clothing  80
Shoes     50
Sports    10
Sports    35
terms (category)  count  min (price)
Shoes             2      50
Clothing          1      80
Sports            2      10
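
The shard-level walk-through can be replayed in a few lines of Python. This is a simplified sketch of the idea, not Elasticsearch's collector code: each matching document is visited once and updates both the terms bucket (its count) and the min metric nested under it. The function name `collect` and the dict layout are assumptions made for the example.

```python
# Rows from the slides: (category, price) for each matching document.
rows = [("Shoes", 60), ("Clothing", 80), ("Shoes", 50), ("Sports", 10), ("Sports", 35)]


def collect(rows):
    """Single pass: each document updates its bucket's count and min price."""
    buckets = {}
    for category, price in rows:
        # Create the bucket the first time a category is seen.
        bucket = buckets.setdefault(
            category, {"doc_count": 0, "min_price": float("inf")}
        )
        bucket["doc_count"] += 1
        bucket["min_price"] = min(bucket["min_price"], price)
    return buckets


for category, bucket in collect(rows).items():
    print(category, bucket["doc_count"], bucket["min_price"])
```

Adding another level of aggregation would add more state per bucket, but not another pass over the documents.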

Slide 21

Slide 21 text

how it works (cluster level)
first shard
Clothing     5   45
Shoes        3   60
Accessories  12  5
second shard
Shoes        2   50
Clothing     1   80
Sports       2   10

Slide 22

Slide 22 text

how it works (cluster level)
first shard
Clothing     5   45
Shoes        3   60
Accessories  12  5
second shard
Shoes        2   50
Clothing     1   80
Sports       2   10
merged result
Shoes        5   50

Slide 23

Slide 23 text

how it works (cluster level)
first shard
Clothing     5   45
Shoes        3   60
Accessories  12  5
second shard
Shoes        2   50
Clothing     1   80
Sports       2   10
merged result
Shoes        5   50
Clothing     6   45

Slide 24

Slide 24 text

how it works (cluster level)
first shard
Clothing     5   45
Shoes        3   60
Accessories  12  5
second shard
Shoes        2   50
Clothing     1   80
Sports       2   10
merged result
Shoes        5   50
Clothing     6   45
Sports       2   10

Slide 25

Slide 25 text

how it works (cluster level)
first shard
Clothing     5   45
Shoes        3   60
Accessories  12  5
second shard
Shoes        2   50
Clothing     1   80
Sports       2   10
merged result
Shoes        5   50
Clothing     6   45
Sports       2   10
Accessories  12  5
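
The cluster-level step is a merge of per-shard buckets: counts are summed and the minimum of the minimums is kept. Here is a minimal Python sketch using the numbers from the slides. The function name `reduce_shards` and the `(count, min_price)` tuple layout are assumptions, and the sketch assumes every shard returns all of its buckets.

```python
# Per-shard results from the slides: term -> (doc count, min price).
shard_results = [
    {"Clothing": (5, 45), "Shoes": (3, 60), "Accessories": (12, 5)},
    {"Shoes": (2, 50), "Clothing": (1, 80), "Sports": (2, 10)},
]


def reduce_shards(shard_results):
    """Merge buckets with the same term: sum the counts, keep the lowest min."""
    merged = {}
    for shard in shard_results:
        for term, (count, min_price) in shard.items():
            if term in merged:
                prev_count, prev_min = merged[term]
                merged[term] = (prev_count + count, min(prev_min, min_price))
            else:
                merged[term] = (count, min_price)
    return merged


for term, (count, min_price) in reduce_shards(shard_results).items():
    print(term, count, min_price)
```

The merge works because both sum and min can be combined from partial results without going back to the documents.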

Slide 26

Slide 26 text

goodies
• support for document relations
  via nested documents and the nested/reverse_nested aggs
  no parent/child support (yet?)
• significant_terms
  find the uncommonly common
• upcoming top_hits aggregations in 1.3
  compute top hits on each bucket
• performance / memory usage improved in 1.2
  Upgrade if you rely on aggregations

Slide 27

Slide 27 text

thank you!