
Elasticsearch Aggregations


Kanji Yomoda

May 23, 2022


Transcript

  1. Confidential & Proprietary 2021
     Types of aggregations
     • Metric aggregations => calculate metrics, such as a sum or average, from field values.
     • Bucket aggregations => group documents into buckets, based on field values, ranges, or other criteria.
     • Pipeline aggregations => take input from other aggregations.
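The three families can be combined in one search request. The following is a minimal sketch of such a request body built as a plain Python dict; the field names (`sold_at`, `price`) and the aggregation names are hypothetical and not taken from the slides.

```python
import json

# Hypothetical fields: "sold_at" (date) and "price" (number).
search_body = {
    "size": 0,  # return aggregation results only, no hits
    "aggs": {
        # Bucket aggregation: one bucket per calendar month.
        "sales_per_month": {
            "date_histogram": {"field": "sold_at", "calendar_interval": "month"},
            "aggs": {
                # Metric aggregation: sum of price inside each monthly bucket.
                "revenue": {"sum": {"field": "price"}},
            },
        },
        # Pipeline aggregation: reads the output of the aggregations above
        # and reports the month with the highest revenue.
        "best_month": {
            "max_bucket": {"buckets_path": "sales_per_month>revenue"},
        },
    },
}

print(json.dumps(search_body, indent=2))
```

The dict can be sent as the JSON body of a `_search` request by whichever client is in use.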
  2. Metrics aggregations
     • Avg • Boxplot • Cardinality • Extended stats • Geo-bounds • Geo-centroid • Geo-Line • Matrix stats • Max • Median absolute deviation • Min • Percentile ranks • Percentiles • Rate • Scripted metric • Stats • String stats • Sum • T-test • Top hits • Top metrics • Value count • Weighted avg
  3. Bucket aggregations
     • Adjacency matrix • Auto-interval date histogram • Categorize text • Children • Composite • Date histogram • Date range • Diversified sampler • Filter • Filters • Geo-distance • Geohash grid • Geohex grid • Geotile grid • Global • Histogram • IP prefix • IP range • Missing • Multi Terms • Nested • Parent • Random sampler • Range • Rare terms • Reverse nested • Sampler • Significant terms • Significant text • Terms • Variable width histogram • Subtleties of bucketing range fields
  4. Pipeline aggregations
     • Extended stats bucket • Inference bucket • Max bucket • Min bucket • Moving function • Moving percentiles • Normalize • Percentiles bucket • Serial differencing • Stats bucket • Sum bucket • Average bucket • Bucket script • Bucket count K-S test • Bucket correlation • Bucket selector • Bucket sort • Change point • Cumulative cardinality • Cumulative sum • Derivative
  5. random_sampler aggregation
     = roughly seeing only 0.1% of the documents (1 in every 1,000 docs)
     = needed to get consistent results
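The two annotations on this slide presumably point at the `probability` and `seed` parameters of `random_sampler`; that mapping is an assumption, since the slide's code is not in the transcript. A minimal sketch of wrapping existing aggregations in a sampler follows; the helper name `sampled_aggs`, the wrapper name `sampled`, and the `price` field are hypothetical.

```python
import json


def sampled_aggs(aggs, probability=0.001, seed=None):
    """Wrap `aggs` in a random_sampler aggregation (hypothetical helper).

    probability=0.001 samples roughly 0.1% of the documents.
    A fixed seed makes repeated requests sample the same documents.
    """
    # Elasticsearch accepts a probability in (0, 0.5) or exactly 1.
    if not (0 < probability < 0.5 or probability == 1):
        raise ValueError("probability must be in (0, 0.5) or exactly 1")
    sampler = {"probability": probability}
    if seed is not None:
        sampler["seed"] = seed
    return {"sampled": {"random_sampler": sampler, "aggs": aggs}}


# Usage: average of a hypothetical "price" field over a 0.1% sample.
body = {
    "size": 0,
    "aggs": sampled_aggs({"avg_price": {"avg": {"field": "price"}}}, seed=42),
}
print(json.dumps(body, indent=2))
```

Sampled results are approximations, so this trades accuracy for speed on large indices.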
  6. Spec
     • Facet group: (name: colors, facets [{white:10}, { black: 6},...])
     • Show the top N facets for each metadata field (category, brand, color, etc.)
     • A facet group keeps showing all of its own facet counts when a filter on that same group is applied
     • Facet counts in a group are filtered by the filters applied on the other groups
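One way to produce the facet group shape in this spec is to map the buckets of a `terms` aggregation response onto it. This is a sketch under that assumption; the helper name `to_facet_group` and the sample response are hypothetical.

```python
def to_facet_group(name, terms_agg, top_n=None):
    """Convert a terms aggregation result into the facet group shape.

    `terms_agg` is the response fragment for one terms aggregation,
    i.e. a dict with a "buckets" list of {"key": ..., "doc_count": ...}.
    """
    buckets = terms_agg["buckets"]
    if top_n is not None:
        # Terms buckets arrive sorted by doc_count (descending) by default,
        # so slicing keeps the top N facets.
        buckets = buckets[:top_n]
    return {
        "name": name,
        "facets": [{bucket["key"]: bucket["doc_count"]} for bucket in buckets],
    }


# Hypothetical response fragment for a terms aggregation on a color field.
colors_agg = {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
        {"key": "white", "doc_count": 10},
        {"key": "black", "doc_count": 6},
        {"key": "red", "doc_count": 2},
    ],
}

print(to_facet_group("colors", colors_agg, top_n=2))
```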
  7. Filters aggregation × post_filter
     Search
     • Category aggs: brand filter, color filter
     • Brand aggs: color filter
     • Color aggs: brand filter
     • post_filter: brand filter, color filter
     Response
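The layout on this slide can be sketched as a request builder: each facet's aggregation is restricted by the filters selected on the other facets, and `post_filter` applies all selected filters to the hits only. The sketch uses the single `filter` aggregation rather than `filters` for brevity; the function name, the `values` sub-aggregation name, and the field names are hypothetical.

```python
import json


def build_faceted_search(query, facet_fields, selected, size=10):
    """Build a search body with per-facet filtered aggs and a post_filter.

    facet_fields: keyword fields to facet on, e.g. ["category", "brand"].
    selected: user-selected values per field, e.g. {"brand": ["acme"]}.
    """

    def selected_filters(exclude=None):
        # One terms filter per selected field, skipping the facet's own field.
        clauses = [
            {"terms": {field: values}}
            for field, values in selected.items()
            if field != exclude and values
        ]
        if not clauses:
            return {"match_all": {}}
        return {"bool": {"filter": clauses}}

    aggs = {}
    for field in facet_fields:
        aggs[field] = {
            # Restrict this facet by the filters on the OTHER facets only,
            # so its own selected values do not hide its sibling values.
            "filter": selected_filters(exclude=field),
            "aggs": {"values": {"terms": {"field": field, "size": size}}},
        }

    body = {"query": query, "aggs": aggs}
    if any(selected.values()):
        # post_filter narrows the hits after aggregations are computed.
        body["post_filter"] = selected_filters()
    return body


body = build_faceted_search(
    query={"match": {"title": "shirt"}},
    facet_fields=["category", "brand", "color"],
    selected={"brand": ["acme"], "color": ["white"]},
)
print(json.dumps(body, indent=2))
```

With brand and color selected, the category facet is filtered by both, the brand facet only by color, and the color facet only by brand, which matches the arrangement on the slide.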
  8. References
     • Aggregations | Elasticsearch Guide [8.2]
     • Aggregate data faster with new the random_sampler aggregation
     • Building faceted search with elasticsearch for e-commerce: part 1