
Elasticsearch Aggregations

Kanji Yomoda

May 23, 2022

Transcript

  1. Confidential & Proprietary 2021 Types of aggregations
    • Metric aggregations => calculate metrics, such as a sum or average, from field values.
    • Bucket aggregations => group documents into buckets, based on field values, ranges, or other criteria.
    • Pipeline aggregations => take input from other aggregations instead of documents or fields.
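
A minimal sketch of how the three types can appear together in one search request body. The field names `brand` and `price` and the aggregation names are hypothetical, chosen only for illustration; the body is built as a plain Python dict and printed as JSON, so no Elasticsearch client is assumed.

```python
import json

# Hypothetical index fields: "brand" (keyword) and "price" (numeric).
request_body = {
    "size": 0,  # return aggregation results only, no search hits
    "aggs": {
        # Bucket aggregation: one bucket per distinct brand value.
        "by_brand": {
            "terms": {"field": "brand"},
            "aggs": {
                # Metric aggregation: average price inside each brand bucket.
                "avg_price": {"avg": {"field": "price"}}
            },
        },
        # Pipeline aggregation: reads the per-bucket averages produced above
        # and reports the bucket with the highest average.
        "top_brand_by_avg_price": {
            "max_bucket": {"buckets_path": "by_brand>avg_price"}
        },
    },
}

print(json.dumps(request_body, indent=2))
```

The pipeline aggregation never touches documents directly; its `buckets_path` points at the output of the bucket and metric aggregations.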
  2. Metrics aggregations
    • Avg • Boxplot • Cardinality • Extended stats • Geo-bounds • Geo-centroid • Geo-line • Matrix stats • Max • Median absolute deviation • Min • Percentile ranks • Percentiles • Rate • Scripted metric • Stats • String stats • Sum • T-test • Top hits • Top metrics • Value count • Weighted avg
  3. Bucket aggregations
    • Adjacency matrix • Auto-interval date histogram • Categorize text • Children • Composite • Date histogram • Date range • Diversified sampler • Filter • Filters • Geo-distance • Geohash grid • Geohex grid • Geotile grid • Global • Histogram • IP prefix • IP range • Missing • Multi Terms • Nested • Parent • Random sampler • Range • Rare terms • Reverse nested • Sampler • Significant terms • Significant text • Terms • Variable width histogram • Subtleties of bucketing range fields
  4. Pipeline aggregations
    • Extended stats bucket • Inference bucket • Max bucket • Min bucket • Moving function • Moving percentiles • Normalize • Percentiles bucket • Serial differencing • Stats bucket • Sum bucket • Average bucket • Bucket script • Bucket count K-S test • Bucket correlation • Bucket selector • Bucket sort • Change point • Cumulative cardinality • Cumulative sum • Derivative
  5. random_sampler aggregation
    = roughly seeing only 0.1% of the documents (about 1 in every 1,000 docs)
    = needed to get consistent results
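
The two notes above read like annotations on a request, which the transcript does not preserve. A minimal sketch of what such a request body might look like follows. It assumes the 0.1% note refers to `probability: 0.001` and the consistency note refers to the `seed` parameter; both mappings are guesses, and the helper name `sampled_request`, the aggregation name `sampling`, and the field `price` are hypothetical.

```python
import json


def sampled_request(inner_aggs, probability=0.001, seed=None):
    """Wrap aggregations in a random_sampler aggregation.

    probability=0.001 means roughly 0.1% of documents are sampled.
    A fixed seed makes repeated calls sample the same subset of documents.
    """
    sampler = {"probability": probability}
    if seed is not None:
        sampler["seed"] = seed
    return {
        "size": 0,  # aggregation results only
        "aggs": {
            "sampling": {
                "random_sampler": sampler,
                # The sampled documents feed these sub-aggregations.
                "aggs": inner_aggs,
            }
        },
    }


body = sampled_request({"avg_price": {"avg": {"field": "price"}}}, seed=42)
print(json.dumps(body, indent=2))
```

Sampling trades accuracy for speed, so it suits aggregations over large indices where an approximate result is acceptable.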
  6. Spec
    • Facet group: (name: colors, facets: [{white: 10}, {black: 6}, ...])
    • Show the top N facets for each metadata field (category, brand, color, etc.)
    • Show all facet counts for a field when results are filtered by that field itself
    • Show facet counts filtered by the other applied filters
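
As an illustration of the facet group shape in the spec, here is a small sketch that converts the `buckets` of a terms aggregation result into that shape. The function name `to_facet_group` is hypothetical, and the sample buckets simply reuse the white/black counts from the slide rather than real data.

```python
def to_facet_group(name, terms_agg_result):
    """Convert a terms aggregation result into a facet group.

    Each bucket {"key": ..., "doc_count": ...} becomes {key: doc_count}.
    """
    facets = [{b["key"]: b["doc_count"]} for b in terms_agg_result["buckets"]]
    return {"name": name, "facets": facets}


# Sample terms aggregation result, mirroring the counts shown on the slide.
sample = {
    "buckets": [
        {"key": "white", "doc_count": 10},
        {"key": "black", "doc_count": 6},
    ]
}

group = to_facet_group("colors", sample)
print(group)  # {'name': 'colors', 'facets': [{'white': 10}, {'black': 6}]}
```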
  7. Filters aggregation × post_filter
    Search
      • Category aggs: brand filter, color filter
      • Brand aggs: color filter
      • post_filter: brand filter, color filter
      • Color aggs: brand filter
    Response
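
The layout above can be sketched as a request builder: each facet's aggregation applies every selected filter except its own, and `post_filter` applies all of them to the hits. This is an illustrative sketch, not the deck's actual request. It uses one single-filter `filter` aggregation per facet rather than the `filters` aggregation named in the slide title, and the function name `build_facet_request`, the `match_all` query, and the field names are assumptions.

```python
import json


def build_facet_request(selected, facet_fields, top_n=10):
    """Build a faceted search body.

    selected: dict mapping a field name to the list of values chosen by the user.
    facet_fields: fields to show facets for.
    """

    def combined(exclude=None):
        # Combine all selected filters, optionally skipping one field.
        clauses = [
            {"terms": {field: values}}
            for field, values in selected.items()
            if values and field != exclude
        ]
        return {"bool": {"filter": clauses}} if clauses else {"match_all": {}}

    aggs = {}
    for field in facet_fields:
        aggs[field] = {
            # Filter by the OTHER selections so this facet keeps all its counts.
            "filter": combined(exclude=field),
            "aggs": {"values": {"terms": {"field": field, "size": top_n}}},
        }

    return {
        "query": {"match_all": {}},
        # post_filter narrows the hits without affecting the aggregations.
        "post_filter": combined(),
        "aggs": aggs,
    }


req = build_facet_request(
    {"brand": ["acme"], "color": ["white"]},
    ["category", "brand", "color"],
)
print(json.dumps(req, indent=2))
```

Because `post_filter` runs after aggregations are computed, the brand facet still lists every brand matching the color filter, while the returned hits honor both selections.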
  8. References
    • Aggregations | Elasticsearch Guide [8.2]
    • Aggregate data faster with new the random_sampler aggregation
    • Building faceted search with elasticsearch for e-commerce: part 1