
Druid GroupBy Query Performance Tuning

Jihoon Son
October 19, 2017

Transcript

  1. Druid Aggregation Queries
    • Timeseries query
      ◦ Grouping over only the time dimension
    • Top-N query
      ◦ Grouping over only time and a single dimension, with sorting over that dimension
      ◦ Approximate query
    • GroupBy query
      ◦ Grouping over any dimensions
      ◦ Flexible but slow
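As a rough illustration of the groupBy query type, the sketch below builds a native groupBy query body as a Python dict and prints it as JSON. The datasource name (`web_logs`), the dimension names, the `bytes` column, and the interval are hypothetical and only serve the example; the top-level keys follow Druid's native query format.

```python
import json


def build_groupby_query(data_source, dimensions, interval):
    """Build a native Druid groupBy query body as a plain dict."""
    return {
        "queryType": "groupBy",
        "dataSource": data_source,
        "granularity": "all",
        "intervals": [interval],
        # Unlike timeseries and topN, groupBy accepts any number of dimensions.
        "dimensions": dimensions,
        "aggregations": [
            {"type": "count", "name": "rows"},
            # "bytes" is a hypothetical metric column.
            {"type": "longSum", "name": "total_bytes", "fieldName": "bytes"},
        ],
    }


# Hypothetical datasource, dimensions, and interval.
query = build_groupby_query(
    "web_logs", ["country", "browser"], "2017-10-01/2017-10-19"
)
print(json.dumps(query, indent=2))
```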
  2. GroupBy Query Processing
    • Overview
      [Diagram: the query and its result pass through the broker and multiple historicals; the broker and each historical have processing threads and HTTP threads; the processing threads are labeled druid.processing.numThreads]
  3. GroupBy Query Processing
    • Overview
      [Diagram: broker and historicals, each with processing threads and HTTP threads; query in, result out]
      ◦ Use a processing thread for processing a segment
      ◦ Use an HTTP thread to combine the per-segment results and return them to the broker
  4. GroupBy Query Processing
    • Two possible processing strategies
      ◦ GroupBy v2
        ▪ Default strategy since 0.10.0 (added in 0.9.2)
        ▪ Hash aggregation using a fully off-heap map for per-segment results
        ▪ Sort-merge aggregation for merging per-segment results
        ▪ Better performance and memory management than v1
      ◦ GroupBy v1
        ▪ Legacy
        ▪ Hash aggregation using a map that is partially on-heap and partially off-heap for per-segment results
        ▪ Druid's indexing mechanism for merging per-segment results
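The sort-merge step for combining per-segment results can be sketched in a few lines. This is a minimal illustration under simplifying assumptions, not Druid's implementation: per-segment results are plain Python dicts rather than off-heap buffers, the only aggregator is a sum, and the function name and sample data are made up for the example.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def sort_merge_aggregate(per_segment_results):
    """Merge per-segment {grouping_key: sum} maps by sorting and merging."""
    # Sort each per-segment result by grouping key.
    sorted_runs = [sorted(result.items()) for result in per_segment_results]
    # k-way merge of the sorted runs; equal keys become adjacent.
    merged = heapq.merge(*sorted_runs, key=itemgetter(0))
    # Combine adjacent rows that share a grouping key.
    return [
        (key, sum(value for _, value in rows))
        for key, rows in groupby(merged, key=itemgetter(0))
    ]


# Hypothetical per-segment results keyed by (country, browser).
segment_a = {("us", "chrome"): 3, ("kr", "firefox"): 1}
segment_b = {("us", "chrome"): 2, ("de", "safari"): 5}
merged_result = sort_merge_aggregate([segment_a, segment_b])
print(merged_result)
```

Because every run is sorted, the merge only ever compares the heads of the runs, so it never needs a second hash table holding all groups at once.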
  5. GroupBy Query v2
    • Processing steps
      ◦ Per-segment aggregation in historicals
      ◦ Sorting and merging in historicals
      ◦ Sending merged results back to the broker
      ◦ Merging in the broker
  6. Per-segment Aggregation in Historicals
    • Spilling mode (concurrency = 4)
      [Diagram: segments, processing threads, and hash tables, with hash table contents marked as spilled]
  7. Improvements on GroupBy Query v2
    • The hash aggregation cost is not small
      ◦ Hash + linear probing
      ◦ Growing when the hash table is full
      ◦ Key serialization to ByteBuffer
    • Array-based aggregation
      ◦ Available since 0.11.0
      ◦ String-typed dimensions are dictionary-encoded
        ▪ Each string corresponds to an integer
      ◦ Use dictionary keys as grouping keys instead of the actual strings
      ◦ No need to hash or serialize keys, or to grow a hash table
      ◦ About 14% performance improvement
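The difference between the two approaches can be sketched as follows. This is an illustration only, assuming a single string dimension, a sum aggregator, and a small in-memory dictionary; the function names and data are hypothetical and do not mirror Druid's classes.

```python
def hash_aggregate(rows):
    """Group by the string value itself: every row pays for hashing the key."""
    totals = {}
    for value, metric in rows:
        totals[value] = totals.get(value, 0) + metric
    return totals


def array_aggregate(dictionary, encoded_rows):
    """Group by dictionary id: the id is used directly as an array index."""
    totals = [0] * len(dictionary)
    used = [False] * len(dictionary)
    for dict_id, metric in encoded_rows:
        totals[dict_id] += metric  # no hashing, no key serialization
        used[dict_id] = True
    # Translate ids back to strings only once, when emitting results.
    return {dictionary[i]: totals[i] for i in range(len(dictionary)) if used[i]}


# Hypothetical dictionary for one string dimension: id -> value.
dictionary = ["de", "kr", "us"]
encoded_rows = [(2, 3), (1, 1), (2, 2), (0, 5)]
decoded_rows = [(dictionary[i], m) for i, m in encoded_rows]

array_result = array_aggregate(dictionary, encoded_rows)
hash_result = hash_aggregate(decoded_rows)
print(array_result)
```

The array has a fixed size known up front (the dictionary cardinality), which is why there is nothing to grow.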
  8. Improvements on GroupBy Query v2
    • Parallel sort
      [Diagram: hash tables, processing threads, and an HTTP thread, with Sort and Merge steps]
  9. Improvements on GroupBy Query v2
    • Parallel sort
      ◦ Available since 0.11.0
      ◦ About 10% performance improvement
  10. Performance Tuning
    • Optimizing segment size
      ◦ Generally recommended: 200 to 800 MB per segment
    • Enabling disk spilling
      ◦ Disk spilling is disabled by default because it can increase the latency of other queries
        ▪ Processing threads are responsible for disk spilling
      ◦ Before enabling it, make sure that your queries cannot be processed in memory and that you need the correct result (see the limit pushdown optimization)
    • Query result caching
      ◦ Query result caching is disabled for groupBy queries by default
        ▪ GroupBy queries usually return too many results
  11. Performance Tuning
    • Hash table parameter tuning
      ◦ Open addressing hash table + linear probing
      ◦ Gradually growing when the hash table is full
      ◦ bufferGrouperInitialBuckets: initial number of buckets. 1024 by default
      ◦ bufferGrouperMaxLoadFactor: maximum load factor. 0.7 by default
    • Limit pushdown
      ◦ Available since 0.10.1
      ◦ Similar to the topN query
      ◦ Pushes the limit spec down to historicals for early pruning
      ◦ Enabled when the orderBy fields are a subset of the grouping keys
      ◦ Set forceLimitPushDown to force the use of this technique
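To make the two hash table parameters concrete, here is a toy open addressing table with linear probing. It is a sketch under stated assumptions, not Druid's buffer grouper: it stores entries in a Python list instead of a ByteBuffer, supports only a sum aggregator, and doubles its bucket count on growth (the doubling policy is an assumption of this sketch). The constructor arguments are named after `bufferGrouperInitialBuckets` and `bufferGrouperMaxLoadFactor` only to show what they control.

```python
class LinearProbingTable:
    """Toy open addressing hash table that sums values per key."""

    def __init__(self, initial_buckets=1024, max_load_factor=0.7):
        self.buckets = [None] * initial_buckets
        self.max_load_factor = max_load_factor
        self.size = 0

    @staticmethod
    def _find_slot(buckets, key):
        # Linear probing: step to the next bucket until the key or a free slot.
        index = hash(key) % len(buckets)
        while buckets[index] is not None and buckets[index][0] != key:
            index = (index + 1) % len(buckets)
        return index

    def _grow(self):
        # Assumed policy: double the bucket count and reinsert every entry.
        bigger = [None] * (len(self.buckets) * 2)
        for entry in self.buckets:
            if entry is not None:
                bigger[self._find_slot(bigger, entry[0])] = entry
        self.buckets = bigger

    def add(self, key, value):
        index = self._find_slot(self.buckets, key)
        if self.buckets[index] is None:
            # Grow before a new key would push the table past the load factor.
            if self.size + 1 > self.max_load_factor * len(self.buckets):
                self._grow()
                index = self._find_slot(self.buckets, key)
            self.buckets[index] = [key, 0]
            self.size += 1
        self.buckets[index][1] += value

    def get(self, key):
        entry = self.buckets[self._find_slot(self.buckets, key)]
        return None if entry is None else entry[1]


# A deliberately tiny table so that growth is easy to observe.
table = LinearProbingTable(initial_buckets=4, max_load_factor=0.7)
for i in range(10):
    table.add(f"key-{i}", i)
table.add("key-3", 100)
print(table.size, len(table.buckets), table.get("key-3"))
```

A larger initial bucket count avoids repeated growth for high-cardinality groupings, while a lower load factor shortens probe sequences at the cost of memory.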
  12. Performance Tuning
    • Parallel combine
      ◦ Available since 0.11.1
      ◦ Applied only when data are spilled to disk
      ◦ About 45% performance improvement
      ◦ Set numParallelCombineThreads to a value larger than 1
      ◦ Using processing threads for combining may affect the performance of other queries
        ▪ The processing thread pool is shared across all query processing
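The idea behind parallel combine can be sketched as a two-level merge: several workers each combine a subset of the sorted runs, and a final pass combines the partial results. This is only a structural sketch with made-up names and data. It assumes in-memory sorted lists in place of spilled files and a sum aggregator, and the slides do not describe how Druid arranges its combining threads. In CPython the threads also will not speed up this CPU-bound work.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby
from operator import itemgetter


def combine(sorted_runs):
    """Merge sorted (key, value) runs and sum the values of equal keys."""
    merged = heapq.merge(*sorted_runs, key=itemgetter(0))
    return [
        (key, sum(value for _, value in rows))
        for key, rows in groupby(merged, key=itemgetter(0))
    ]


def parallel_combine(sorted_runs, num_threads):
    """Combine runs with several workers, then combine their partial results."""
    if num_threads <= 1 or len(sorted_runs) <= 1:
        return combine(sorted_runs)
    # Deal the runs out to the workers round-robin, dropping empty shares.
    chunks = [sorted_runs[i::num_threads] for i in range(num_threads)]
    chunks = [chunk for chunk in chunks if chunk]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        partials = list(pool.map(combine, chunks))
    # Each partial result is itself a sorted run, so one more merge finishes.
    return combine(partials)


# Hypothetical sorted runs standing in for spilled data.
runs = [
    [("a", 1), ("c", 2)],
    [("a", 3), ("b", 1)],
    [("b", 4), ("c", 1)],
    [("a", 1)],
]
combined = parallel_combine(runs, num_threads=2)
print(combined)
```

In this sketch the worker pool is private to the call; in Druid the combining work runs on the shared processing threads, which is why a large numParallelCombineThreads can slow down other queries.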