
All About Elasticsearch Algorithms and Data Structures

Elastic Co
February 19, 2016


Fast search usually boils down to data organization, which is why Elasticsearch is based on an inverted index. But sometimes speed comes from clever algorithms. Last year we looked at four such algorithms, but there are dozens more. In this talk we'll explore a new set of interesting algorithms in Elasticsearch.


Transcript

  1. Filter Caching • A filter either matches or does not match a document • Because segments are immutable, we have an opportunity to cache frequent filters
  2. Filter Caching • A filter either matches or does not match a document • Because segments are immutable, we have an opportunity to cache frequent filters • A filter over six documents where docs #1, #4, and #6 match can be cached as the bitmap [1, 0, 0, 1, 0, 1]
  3. Some points to keep in mind • Each Lucene segment can hold up to 2³¹ − 1 documents (so doc IDs fit in 4 bytes) • Cached filters are stored in memory, so compression is important • However, using the cached form must be faster than re-executing the filter
  4. Approach #1: Sorted List • Store the matching doc IDs in a sorted list • Docs #1, #4, #6 become [1, 4, 6]
  5. Approach #1: Sorted List • Very compact when filters are sparse • [1, 4, 6] is just 12 bytes, yay!
  6. Approach #1: Sorted List • Dense filters become problematic • A filter matching all of docs #1 through #100,000,000 becomes [1, 2, …, 99999999, 100000000]: 381 MB, oh no! =(
  7. Approach #2: Bitmaps • Store a single bit per document instead: 1 if it matches, 0 if it doesn't • [1, 1, 1, 1, 1, 0, 1, …, 1, 1] over the 100,000,000 documents
  8. Approach #2: Bitmaps • For the dense 100,000,000-document filter, the bitmap is down to 12 MB! =)
  9. Approach #2: Bitmaps • … except it's identical for the sparse case too • [1, 0, 0, 1, 0, 1, 0, 0, …, 0, 0] for docs #1, #4, #6 is still 12 MB, hmm…
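A quick back-of-the-envelope sketch (Python, not Lucene code) makes the tradeoff concrete, assuming 4-byte doc IDs and a 100-million-document segment:

```python
# Back-of-the-envelope memory cost of caching one filter over a
# 100-million-document segment, assuming 4-byte doc IDs.
SEGMENT_DOCS = 100_000_000

def sorted_list_bytes(matches: int) -> int:
    return matches * 4                # one 4-byte ID per matching doc

def bitmap_bytes(total_docs: int) -> int:
    return total_docs // 8            # one bit per doc, sparse or dense

for matches in (3, SEGMENT_DOCS):
    print(f"{matches:>11,} matches: list {sorted_list_bytes(matches):>11,} B, "
          f"bitmap {bitmap_bytes(SEGMENT_DOCS):,} B")
# 3 matches: 12 B vs 12,500,000 B; all match: 400,000,000 B vs 12,500,000 B
```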
  10. Alternative #3: Various Compressed Bitmaps • Byte-aligned Bitmap Code (BBC) • Word-Aligned Hybrid (WAH) • PLWAH / EWAH variants • Compressed’n’Composable Integer Set (CONCISE) • Compressed Adaptive Index (COMPAX) • SECOMPAX / ICX • “Traditional” compression (LZ4, DEFLATE, etc.)
  11. Alternative #3: Various RLE Compressed Bitmaps • Good compression! • Slower (relatively) than sorted lists or raw bitmaps • Slow random access to individual bits • May lose the ability to bitwise AND/OR multiple bitmaps together
  12. Overview so far • Sorted Lists: great for sparse, expensive for dense • Raw Bitmaps: great for dense, expensive for sparse • RLE Compressed: great compression for heterogeneous data, but slow(er) decoding and slow random access
  13. Roaring Bitmaps • A hybrid structure that aims for the strengths of all three: compact for sparse data, compact for dense data, and still fast to access
  14. Partition the doc ID space into chunks of 2¹⁶ documents • Chunk 0 covers doc IDs 0–65535, chunk 1 covers 65536–131071, chunk 2 covers 131072–196607, … • Each chunk records which of its documents match
  15. Store each chunk's matches in a container, and keep the containers in a vector (positions 0, 1, 2, …)
  16. The container's position in the vector corresponds to the 16 most-significant bits of the doc ID, so every chunk can be re-indexed 0–65535
  17. Only the 16 least-significant bits are stored per entry: 2 bytes instead of 4, with the other 16 bits of the ID implicit in the container's position
  18. For each container, ask: fewer than 4096 values?
  19. Fewer than 4096 values? • Save as a sorted list, e.g. [1, 1920, 3303]
  20. More than 4096 values?
  21. More than 4096 values? • Save as a dense bitmap
  22. More than 61440 values? • Super dense, relatively few zeros • Save as an “inverted” sorted list of the non-matching IDs, e.g. [2382, 9112, 10229] • (A Lucene contribution)
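Putting the container logic together, here is a rough sketch of the Roaring idea (a simplification, not the actual Lucene implementation): split each doc ID into a 16-bit chunk key and a 16-bit offset, then pick the cheapest representation per chunk:

```python
# Sketch of the Roaring container choice described above. The thresholds
# mirror the slides: < 4096 -> sorted list, > 61440 -> inverted list.
from collections import defaultdict

DENSE_MIN = 4096             # below this, a sorted 2-byte list is smaller
INVERTED_MIN = 65536 - 4096  # above 61440, store the *missing* IDs instead

def build_containers(doc_ids):
    chunks = defaultdict(list)
    for doc in sorted(doc_ids):
        chunks[doc >> 16].append(doc & 0xFFFF)   # high bits pick the chunk
    containers = {}
    for key, offsets in chunks.items():
        if len(offsets) < DENSE_MIN:
            containers[key] = ("sorted_list", offsets)    # 2 bytes/entry
        elif len(offsets) > INVERTED_MIN:
            missing = sorted(set(range(65536)) - set(offsets))
            containers[key] = ("inverted_list", missing)  # store the zeros
        else:
            bitmap = bytearray(8192)                      # 2^16 bits
            for off in offsets:
                bitmap[off >> 3] |= 1 << (off & 7)
            containers[key] = ("bitmap", bitmap)
    return containers

print({k: v[0] for k, v in build_containers(range(70000)).items()})
# chunk 0 holds all 65536 offsets -> inverted_list; chunk 1 holds 4464 -> bitmap
```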
  23. Variously weighted averages • Simple (no weighting) • Linear • Exponential • Double-Exponential (Holt) • Triple-Exponential (Holt-Winters)
  24. Variously weighted averages • Simple (no weighting) • Linear • Exponential • Double-Exponential (Holt) • Triple-Exponential (Holt-Winters) • These have configurable parameters
  25. Configurable parameters • α “Level”: Exponential, Holt, Holt-Winters • β “Trend”: Holt, Holt-Winters • γ “Seasonal”: Holt-Winters (that's a gamma)
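For reference, the standard textbook form of Holt's double-exponential smoothing, shown only to illustrate what α and β control (this is not Elasticsearch's code, and the sample series is made up):

```python
# Holt's double-exponential smoothing: alpha smooths the "level",
# beta smooths the "trend". Standard textbook formulation.
def holt(series, alpha, beta):
    level, trend = series[0], series[1] - series[0]
    for x in series[1:]:
        last_level = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
    return level, trend

level, trend = holt([10, 12, 14, 18, 24, 27, 31, 38], alpha=0.5, beta=0.3)
print(f"one-step-ahead forecast: {level + trend:.1f}")  # level + 1 * trend
```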
  26. Turns out, tuning parameters is hard • Small changes had large impact • Changing one parameter affected the other parameters • Not intuitive to mere mortals (e.g. me) • Frustrating user experience
  27. anneal: to heat and then slowly cool (metal, glass, etc.) in order to make it stronger (Merriam-Webster Dictionary)
  28. Simulated Annealing Process 1. Pick a random neighbor* 2. Evaluate its “cost” • If “cost” > “best_cost”, keep the solution • Otherwise discard it, BUT with random probability p keep it anyway 3. Repeat, lowering the probability p over time
  29. Simulated Annealing Process • *“Random neighbor” = mutate one of the parameters, leaving the rest constant
  30. [Figure: score across the solution space, stuck in a local minimum; best score 12, temperature 75] • Notice how the random acceptance unsticks the search from the “pretty good” solution
  31. [Figure: score across the solution space; best score 85, temperature 70] • 85 > 12, which allows finding the better solution
  32. [Figure: score across the solution space; best score 85, temperature 15] • As the temperature drops, the chance of keeping random changes decreases
  33. Simulated Annealing • Randomness samples the entire solution space • “Unsticks” from local minima • Over time random changes become less likely, homing in on a solution
  34. Simulated Annealing in Elasticsearch • 100 iterations per “round” • Temperature decreases by 10% each round • Ends when temperature < 0.0001 • ~6600 iterations total
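A minimal sketch of that loop, following the schedule above. The score function, neighbor step, and starting temperature are stand-ins, and the acceptance rule shown is the common Metropolis one rather than whatever Elasticsearch uses internally:

```python
import math, random

def anneal(score, initial, step=0.1):
    best = current = initial
    temp = 1.0                                   # starting temp: a guess
    while temp > 0.0001:                         # ends when temp < 0.0001
        for _ in range(100):                     # 100 iterations per round
            # "Random neighbor": nudge the parameters slightly.
            candidate = [p + random.uniform(-step, step) for p in current]
            delta = score(candidate) - score(current)
            # Keep better solutions; keep worse ones with a probability
            # that shrinks as the temperature drops.
            if delta > 0 or random.random() < math.exp(delta / temp):
                current = candidate
                if score(current) > score(best):
                    best = current
        temp *= 0.9                              # cool by 10% per round
    return best

# Toy objective: a local optimum near x = -1, a better one near x = 1.
print(anneal(lambda p: -(p[0]**2 - 1)**2 - (p[0] - 2)**2 / 10, [-1.0]))
```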
  35. Simulated Annealing in Elasticsearch • “Trains” on the last window of data [Figure: training window, with forecasting ahead of it and backcasting behind it]
  36. T-Digest Percentiles • The t-digest algorithm is used to compute quantiles • Quite similar to k-means • Builds sorted centroids • Constraint on the max size of a cluster: 4 · count · q · (1 − q) / C, where C = compression
  37. T-Digest Percentiles • Compression trades accuracy for memory usage • About 5 · C centroids • Error almost always < 3 / C • Excellent accuracy on extreme quantiles thanks to the q(1 − q) factor • Implemented on numbers, but could work on anything that is comparable and can be averaged
  38. Calculating T-Digest Percentiles [Figure: a digest of sorted centroid means with their counts] • 40 values overall
  39. Calculating T-Digest Percentiles • 40 values overall • −9 is the value for 0 ≤ q < 1/40
  40. Calculating T-Digest Percentiles • 40 values overall • −5 is the value for 1/40 ≤ q < 4/40
  41. Calculating T-Digest Percentiles • 40 values overall • 1 is the value for 4/40 ≤ q < 6/40 • etc.
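Reading a quantile out of the digest then reduces to walking the sorted centroids until the cumulative count passes q · n. A sketch in the simple "step" style of these slides (the real t-digest interpolates between neighboring centroids; the centroid values below are made up):

```python
# Quantile lookup over sorted (mean, count) centroids, step-style.
def quantile(centroids, q):
    total = sum(count for _, count in centroids)
    target = q * total
    seen = 0
    for mean, count in centroids:       # centroids sorted by mean
        seen += count
        if target < seen:
            return mean
    return centroids[-1][0]             # q == 1.0 falls through to the max

digest = [(-9, 1), (-5, 3), (1, 2), (2, 8), (4, 6), (7, 2), (10, 3)]
print(quantile(digest, 0.0))   # -9: the value for 0 <= q < 1/25
print(quantile(digest, 0.5))   # 2
```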
  42. Inserting Values into T-Digest • Inserting 8 into the histogram
  43. Inserting Values into T-Digest • Inserting 8 into the histogram • Find the centroid nearest the value (here, the centroid at 7 with count 2)
  44. Inserting Values into T-Digest • Inserting 8 into the histogram • Increment the centroid's count (2 → 3) and adjust its mean ((7 · 2 + 8) / 3 ≈ 7.3) • Notice that the capacity of all centroids increases slightly, since the total count grew
  45. Inserting Values into T-Digest • Inserting 5 into the histogram
  46. Inserting Values into T-Digest • Inserting 5 into the histogram • Find the centroid nearest the value
  47. Inserting Values into T-Digest • Inserting 5 into the histogram • Incrementing the nearest centroid's count would exceed its size threshold • So create a new centroid with value 5 and count 1 instead
  48. T-Digest Practical Notes • Adding a new value outside the current bounds always creates a new centroid (because q(1 − q) is 0 there) • When the histogram grows too large, compress it: reinsert the centroids in random order • In practice, compress when the centroid count is > 20 · C
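A toy version of the insertion rule from the last few slides, using the 4 · n · q · (1 − q) / C size constraint. Heavily simplified (no compression pass), and the compression value here is chosen artificially low so this tiny example behaves like the slides:

```python
# Toy t-digest insertion: absorb the value into the nearest centroid if
# that centroid's size limit 4*n*q*(1-q)/C allows it, else start a new one.
def insert(centroids, value, compression=5.0):
    n = sum(c for _, c in centroids) + 1
    # Find the centroid whose mean is nearest the new value.
    i = min(range(len(centroids)), key=lambda j: abs(centroids[j][0] - value))
    mean, count = centroids[i]
    # Approximate the centroid's quantile by its cumulative position.
    q = (sum(c for _, c in centroids[:i]) + count / 2) / n
    if count + 1 <= 4 * n * q * (1 - q) / compression:
        # Fold the value in: bump the count and shift the mean.
        centroids[i] = ((mean * count + value) / (count + 1), count + 1)
    else:
        centroids.append((value, 1))     # over the limit: new centroid
        centroids.sort()
    return centroids

digest = [(-9, 1), (-5, 3), (1, 6), (2, 10), (4, 8), (5, 6), (7, 2), (10, 4)]
print(insert(digest, 8))   # (7, 2) absorbs the 8 and becomes (7.33, 3)
```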
  49. HDRHistogram Percentiles • Uses a combination of logarithmic and linear bucketing • Conceptually buckets values at two levels: • Logarithmically scaled buckets • Linearly scaled sub-buckets • No bound on the values in each bucket (in practice, limited to the range of a long)
  50. HDRHistogram Percentiles • The accuracy parameter is expressed as the number of significant figures of a value to preserve in the histogram • Can be between 0 and 5 • The number of significant figures trades accuracy for memory usage • It determines the number of linear sub-buckets used for each logarithmic bucket
  51. HDRHistogram Bucketing (1 s.f.) • Logarithmic buckets 10⁰, 10¹, …, 10⁷, each divided into linear sub-buckets (e.g. 10, 20, 30, …, 90 within the 10¹ bucket)
  52. HDRHistogram Bucketing (2 s.f.) • Same logarithmic buckets, but finer linear sub-buckets (e.g. 100, 110, 120, …, 990 within the 10² bucket)
  53. Calculating HDRHistogram Percentiles [Figure: sub-bucket counts across the logarithmic buckets 10⁰–10⁷] • 250 values overall
  54. Calculating HDRHistogram Percentiles • 250 values overall • 1 is the value for 0 ≤ q < 1/250
  55. Calculating HDRHistogram Percentiles • 250 values overall • 2 is the value for 1/250 ≤ q < 4/250
  56. Calculating HDRHistogram Percentiles • 250 values overall • 70 is the value for q = 0.2
  57. Inserting Values into HDRHistogram • Inserting 42 into the histogram
  58. Inserting Values into HDRHistogram • Inserting 42 into the histogram • Find the logarithmic bucket for the value (10¹)
  59. Inserting Values into HDRHistogram • Inserting 42 into the histogram • Find the sub-bucket for the value (40)
  60. Inserting Values into HDRHistogram • Inserting 42 into the histogram • Increment the count for that sub-bucket
  61. Inserting Values into HDRHistogram • Inserting 1,400,300 into the histogram • There is no logarithmic bucket to hold the value yet
  62. Inserting Values into HDRHistogram • Inserting 1,400,300 into the histogram • Create logarithmic buckets (and sub-buckets) up through 10⁶ to cover the new value, all with zero counts
  63. Inserting Values into HDRHistogram • Inserting 1,400,300 into the histogram • Find the logarithmic bucket for the value (10⁶)
  64. Inserting Values into HDRHistogram • Inserting 1,400,300 into the histogram • Find the sub-bucket for the value (1.4 E6)
  65. Inserting Values into HDRHistogram • Inserting 1,400,300 into the histogram • Increment the count for that sub-bucket
  66. HDRHistogram Practical Notes • Implemented as a flat long array with base-2 logarithmic bucket values • Accuracy can be better than the configured significant digits, but never worse • The in-memory size depends on the range of values and the number of significant digits • The implementation takes values as longs, but a wrapper supporting doubles is available
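To illustrate the two-level bucketing in decimal terms (the real HdrHistogram works in base 2 internally, as noted above), a toy bucket-index function:

```python
# Decimal model of the two-level bucketing on the slides. Returns the
# logarithmic bucket (as a power-of-ten exponent) and the linear
# sub-bucket's lower bound for a value.
def bucket_of(value, sig_figs=1):
    assert value >= 1                              # positive values only
    exponent = len(str(int(value))) - 1            # the 10^exponent bucket
    width = 10 ** max(exponent - sig_figs + 1, 0)  # linear sub-bucket width
    return exponent, (int(value) // width) * width

print(bucket_of(42))                     # (1, 40): 10^1 bucket, sub-bucket 40
print(bucket_of(1_400_300, sig_figs=2))  # (6, 1400000): the 1.4E6 sub-bucket
```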
  67. Which should I use? • The default in Elasticsearch is currently t-digest • Use t-digest when you are interested in extreme values (e.g. the 99.99th percentile) • T-digest adapts to the data, so it suits a wide variety of distributions, at the expense of some speed • HDRHistogram is fast because its histogram layout is fixed: no compression passes or centroid recalculation • HDRHistogram requires positive values and works best when the data is zero-based, so it cannot be applied to every use case • HDRHistogram performs very well on latency data
  68. Please attribute Elastic with a link to elastic.co. Except where otherwise noted, this work is licensed under http://creativecommons.org/licenses/by-nd/4.0/ Creative Commons and the double C in a circle are registered trademarks of Creative Commons in the United States and other countries. Third-party marks and brands are the property of their respective holders.
  69. Alternative #3: Various RLE Compressed Bitmaps • Generally encode “runs” with codewords [Figure: a bitmap split into runs of all 0's, runs of all 1's, and mixed “dirty” regions]
  70. Alternative #3: Various RLE Compressed Bitmaps • Generally encode “runs” with codewords • Example encoding: 31 bits “dirty”, 3× 31 bits “all zero”, 31 bits “dirty”, 31 bits “dirty”, 2× 31 bits “all one”
  71. Alternative #3: Various RLE Compressed Bitmaps • Each codeword is a machine word: a flag bit distinguishes literal words, which carry 31 “dirty” bits verbatim, from fill words, which carry the fill value (0 or 1) and a run length (e.g. “3” = ..100, “2” = ..010)
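A toy word-aligned encoder in this WAH style, chopping the bitmap into 31-bit groups and emitting fill codewords for homogeneous runs and literal codewords for "dirty" groups (illustrative only, not any specific library's word format):

```python
# Toy WAH-style run-length encoding over 31-bit groups, as sketched above.
def wah_encode(bits):
    words, i = [], 0
    while i < len(bits):
        group = bits[i:i + 31]
        if set(group) in ({0}, {1}):                  # homogeneous group
            fill, run = group[0], 1
            while bits[i + 31 * run : i + 31 * (run + 1)] == [fill] * 31:
                run += 1                              # extend the fill run
            words.append(("fill", fill, run))         # run groups of fill bits
            i += 31 * run
        else:
            words.append(("dirty", group))            # literal 31-bit group
            i += 31
    return words

bitmap = [1, 0, 1] + [0] * 90 + [1] * 31
print(wah_encode(bitmap))
# [('dirty', [1, 0, 1, 0, ...]), ('fill', 0, 2), ('fill', 1, 1)]
```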