
All About Elasticsearch Algorithms and Data Structures

Elastic Co
February 19, 2016


Fast search usually boils down to data organization, which is why Elasticsearch is based on an inverted index. But sometimes speed comes from clever algorithms. Last year we looked at four such algorithms, but there are dozens more. In this talk we'll explore a new set of interesting algorithms in Elasticsearch.


Transcript

  1. Filter Caching • A filter either matches or does not match a document • Because segments are immutable, we have an opportunity to cache frequent filters
  2. Filter Caching • A filter either matches or does not match a document • Because segments are immutable, we have an opportunity to cache frequent filters • A filter over six documents where docs #1, #4, and #6 match can be cached as the bitmap [1, 0, 0, 1, 0, 1]
  3. Some points to keep in mind • Each Lucene segment can hold up to 2³¹ − 1 documents (so doc IDs fit in 4 bytes) • Cached filters are stored in memory, so compression is important • However, using the cached form must be faster than re-executing the filter
  4. Approach #1: Sorted List • Store the matching doc IDs in a sorted list • Docs #1, #4, #6 become [1, 4, 6]
  5. Approach #1: Sorted List • Very compact when filters are sparse • [1, 4, 6] is just 12 bytes, yay!
  6. Approach #1: Sorted List • Dense filters become problematic • A filter matching all of docs #1 through #100,000,000 becomes [1, 2, …, 99999999, 100000000]: 381 MB, oh no! =(
  7. Approach #2: Bitmaps • Store a single bit per document instead: 1 if it matches, 0 if it doesn't • [1, 1, 1, 1, 1, 0, 1, …, 1, 1] over the 100,000,000 documents
  8. Approach #2: Bitmaps • For the dense 100,000,000-document filter, the bitmap is down to 12 MB! =)
  9. Approach #2: Bitmaps • … except it's identical for the sparse case too • [1, 0, 0, 1, 0, 1, 0, 0, …, 0, 0] for docs #1, #4, #6 is still 12 MB, hmm…
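A quick back-of-the-envelope sketch (Python, not Lucene code) makes the tradeoff concrete, assuming 4-byte doc IDs and a 100-million-document segment:

```python
# Back-of-the-envelope memory cost of caching one filter over a
# 100-million-document segment, assuming 4-byte doc IDs.
SEGMENT_DOCS = 100_000_000

def sorted_list_bytes(matches: int) -> int:
    return matches * 4                # one 4-byte ID per matching doc

def bitmap_bytes(total_docs: int) -> int:
    return total_docs // 8            # one bit per doc, sparse or dense

for matches in (3, SEGMENT_DOCS):
    print(f"{matches:>11,} matches: list {sorted_list_bytes(matches):>11,} B, "
          f"bitmap {bitmap_bytes(SEGMENT_DOCS):,} B")
# 3 matches: 12 B vs 12,500,000 B; all match: 400,000,000 B vs 12,500,000 B
```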
  10. Alternative #3: Various Compressed Bitmaps • Byte-aligned Bitmap Code (BBC) • Word-Aligned Hybrid (WAH) • PLWAH / EWAH variants • Compressed’n’Composable Integer Set (CONCISE) • Compressed Adaptive Index (COMPAX) • SECOMPAX / ICX • “Traditional” compression (LZ4, DEFLATE, etc.)
  11. Alternative #3: Various RLE Compressed Bitmaps • Good compression! • Slower (relatively) than sorted lists or raw bitmaps • Slow random access to individual bits • May lose the ability to bitwise AND/OR multiple bitmaps together
  12. Overview so far • Sorted Lists: great for sparse, expensive for dense • Raw Bitmaps: great for dense, expensive for sparse • RLE Compressed: great compression for heterogeneous data, but slow(er) decoding and slow random access
  13. Roaring Bitmaps • A hybrid structure that aims for the strengths of all three: compact for sparse data, compact for dense data, and still fast to access
  14. Partition the doc ID space into chunks of 2¹⁶ documents • Chunk 0 covers doc IDs 0–65535, chunk 1 covers 65536–131071, chunk 2 covers 131072–196607, … • Each chunk records which of its documents match
  15. Store each chunk's matches in a container, and keep the containers in a vector (positions 0, 1, 2, …)
  16. The container's position in the vector corresponds to the 16 most-significant bits of the doc ID, so every chunk can be re-indexed 0–65535
  17. Only the 16 least-significant bits are stored per entry: 2 bytes instead of 4, with the other 16 bits of the ID implicit in the container's position
  18. For each container, ask: fewer than 4096 values?
  19. Fewer than 4096 values? • Save as a sorted list, e.g. [1, 1920, 3303]
  20. More than 4096 values?
  21. More than 4096 values? • Save as a dense bitmap
  22. More than 61440 values? • Super dense, relatively few zeros • Save as an “inverted” sorted list of the non-matching IDs, e.g. [2382, 9112, 10229] • (A Lucene contribution)
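Putting the container logic together, here is a rough sketch of the Roaring idea (a simplification, not the actual Lucene implementation): split each doc ID into a 16-bit chunk key and a 16-bit offset, then pick the cheapest representation per chunk:

```python
# Sketch of the Roaring container choice described above. The thresholds
# mirror the slides: < 4096 -> sorted list, > 61440 -> inverted list.
from collections import defaultdict

DENSE_MIN = 4096             # below this, a sorted 2-byte list is smaller
INVERTED_MIN = 65536 - 4096  # above 61440, store the *missing* IDs instead

def build_containers(doc_ids):
    chunks = defaultdict(list)
    for doc in sorted(doc_ids):
        chunks[doc >> 16].append(doc & 0xFFFF)   # high bits pick the chunk
    containers = {}
    for key, offsets in chunks.items():
        if len(offsets) < DENSE_MIN:
            containers[key] = ("sorted_list", offsets)    # 2 bytes/entry
        elif len(offsets) > INVERTED_MIN:
            missing = sorted(set(range(65536)) - set(offsets))
            containers[key] = ("inverted_list", missing)  # store the zeros
        else:
            bitmap = bytearray(8192)                      # 2^16 bits
            for off in offsets:
                bitmap[off >> 3] |= 1 << (off & 7)
            containers[key] = ("bitmap", bitmap)
    return containers

print({k: v[0] for k, v in build_containers(range(70000)).items()})
# chunk 0 holds all 65536 offsets -> inverted_list; chunk 1 holds 4464 -> bitmap
```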
  23. Variously weighted averages • Simple (no weighting) • Linear • Exponential • Double-Exponential (Holt) • Triple-Exponential (Holt-Winters)
  24. Variously weighted averages • Simple (no weighting) • Linear • Exponential • Double-Exponential (Holt) • Triple-Exponential (Holt-Winters) • These have configurable parameters
  25. Configurable parameters • α “Level”: Exponential, Holt, Holt-Winters • β “Trend”: Holt, Holt-Winters • γ “Seasonal”: Holt-Winters (that's a gamma)
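For reference, the standard textbook form of Holt's double-exponential smoothing, shown only to illustrate what α and β control (this is not Elasticsearch's code, and the sample series is made up):

```python
# Holt's double-exponential smoothing: alpha smooths the "level",
# beta smooths the "trend". Standard textbook formulation.
def holt(series, alpha, beta):
    level, trend = series[0], series[1] - series[0]
    for x in series[1:]:
        last_level = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
    return level, trend

level, trend = holt([10, 12, 14, 18, 24, 27, 31, 38], alpha=0.5, beta=0.3)
print(f"one-step-ahead forecast: {level + trend:.1f}")  # level + 1 * trend
```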
  26. Turns out, tuning parameters is hard • Small changes had large impact • Changing one parameter affected the other parameters • Not intuitive to mere mortals (e.g. me) • Frustrating user experience
  27. anneal: to heat and then slowly cool (metal, glass, etc.) in order to make it stronger (Merriam-Webster Dictionary)
  28. Simulated Annealing Process 1. Pick a random neighbor* 2. Evaluate its “cost” • If “cost” > “best_cost”, keep the solution • Otherwise discard it, BUT with random probability p keep it anyway 3. Repeat, lowering the probability p over time
  29. Simulated Annealing Process • *“Random neighbor” = mutate one of the parameters, leaving the rest constant
  30. [Figure: score across the solution space, stuck in a local minimum; best score 12, temperature 75] • Notice how the random acceptance unsticks the search from the “pretty good” solution
  31. [Figure: score across the solution space; best score 85, temperature 70] • 85 > 12, which allows finding the better solution
  32. [Figure: score across the solution space; best score 85, temperature 15] • As the temperature drops, the chance of keeping random changes decreases
  33. Simulated Annealing • Randomness samples the entire solution space • “Unsticks” from local minima • Over time random changes become less likely, homing in on a solution
  34. Simulated Annealing in Elasticsearch • 100 iterations per “round” • Temperature decreases by 10% each round • Ends when temperature < 0.0001 • ~6600 iterations total
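A minimal sketch of that loop, following the schedule above. The score function, neighbor step, and starting temperature are stand-ins, and the acceptance rule shown is the common Metropolis one rather than whatever Elasticsearch uses internally:

```python
import math, random

def anneal(score, initial, step=0.1):
    best = current = initial
    temp = 1.0                                   # starting temp: a guess
    while temp > 0.0001:                         # ends when temp < 0.0001
        for _ in range(100):                     # 100 iterations per round
            # "Random neighbor": nudge the parameters slightly.
            candidate = [p + random.uniform(-step, step) for p in current]
            delta = score(candidate) - score(current)
            # Keep better solutions; keep worse ones with a probability
            # that shrinks as the temperature drops.
            if delta > 0 or random.random() < math.exp(delta / temp):
                current = candidate
                if score(current) > score(best):
                    best = current
        temp *= 0.9                              # cool by 10% per round
    return best

# Toy objective: a local optimum near x = -1, a better one near x = 1.
print(anneal(lambda p: -(p[0]**2 - 1)**2 - (p[0] - 2)**2 / 10, [-1.0]))
```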
  35. Simulated Annealing in Elasticsearch • “Trains” on the last window of data [Figure: training window, with forecasting ahead of it and backcasting behind it]
  36. T-Digest Percentiles • The t-digest algorithm is used to compute quantiles • Quite similar to k-means • Builds sorted centroids • Constraint on the max size of a cluster: 4 · count · q · (1 − q) / C, where C = compression
  37. T-Digest Percentiles • Compression trades accuracy for memory usage • About 5 · C centroids • Error almost always < 3 / C • Excellent accuracy on extreme quantiles thanks to the q(1 − q) factor • Implemented on numbers, but could work on anything that is comparable and can be averaged
  38. Calculating T-Digest Percentiles [Figure: a digest of sorted centroid means with their counts] • 40 values overall
  39. Calculating T-Digest Percentiles • 40 values overall • −9 is the value for 0 ≤ q < 1/40
  40. Calculating T-Digest Percentiles • 40 values overall • −5 is the value for 1/40 ≤ q < 4/40
  41. Calculating T-Digest Percentiles • 40 values overall • 1 is the value for 4/40 ≤ q < 6/40 • etc.
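Reading a quantile out of the digest then reduces to walking the sorted centroids until the cumulative count passes q · n. A sketch in the simple "step" style of these slides (the real t-digest interpolates between neighboring centroids; the centroid values below are made up):

```python
# Quantile lookup over sorted (mean, count) centroids, step-style.
def quantile(centroids, q):
    total = sum(count for _, count in centroids)
    target = q * total
    seen = 0
    for mean, count in centroids:       # centroids sorted by mean
        seen += count
        if target < seen:
            return mean
    return centroids[-1][0]             # q == 1.0 falls through to the max

digest = [(-9, 1), (-5, 3), (1, 2), (2, 8), (4, 6), (7, 2), (10, 3)]
print(quantile(digest, 0.0))   # -9: the value for 0 <= q < 1/25
print(quantile(digest, 0.5))   # 2
```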
  42. Inserting Values into T-Digest • Inserting 8 into the histogram
  43. Inserting Values into T-Digest • Inserting 8 into the histogram • Find the centroid nearest the value (here, the centroid at 7 with count 2)
  44. Inserting Values into T-Digest • Inserting 8 into the histogram • Increment the centroid's count (2 → 3) and adjust its mean ((7 · 2 + 8) / 3 ≈ 7.3) • Notice that the capacity of all centroids increases slightly, since the total count grew
  45. Inserting Values into T-Digest • Inserting 5 into the histogram
  46. Inserting Values into T-Digest • Inserting 5 into the histogram • Find the centroid nearest the value
  47. Inserting Values into T-Digest • Inserting 5 into the histogram • Incrementing the nearest centroid's count would exceed its size threshold • So create a new centroid with value 5 and count 1 instead
  48. T-Digest Practical Notes • Adding a new value outside the current bounds always creates a new centroid (because q(1 − q) is 0 there) • When the histogram grows too large, compress it: reinsert the centroids in random order • In practice, compress when the centroid count is > 20 · C
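A toy version of the insertion rule from the last few slides, using the 4 · n · q · (1 − q) / C size constraint. Heavily simplified (no compression pass), and the compression value here is chosen artificially low so this tiny example behaves like the slides:

```python
# Toy t-digest insertion: absorb the value into the nearest centroid if
# that centroid's size limit 4*n*q*(1-q)/C allows it, else start a new one.
def insert(centroids, value, compression=5.0):
    n = sum(c for _, c in centroids) + 1
    # Find the centroid whose mean is nearest the new value.
    i = min(range(len(centroids)), key=lambda j: abs(centroids[j][0] - value))
    mean, count = centroids[i]
    # Approximate the centroid's quantile by its cumulative position.
    q = (sum(c for _, c in centroids[:i]) + count / 2) / n
    if count + 1 <= 4 * n * q * (1 - q) / compression:
        # Fold the value in: bump the count and shift the mean.
        centroids[i] = ((mean * count + value) / (count + 1), count + 1)
    else:
        centroids.append((value, 1))     # over the limit: new centroid
        centroids.sort()
    return centroids

digest = [(-9, 1), (-5, 3), (1, 6), (2, 10), (4, 8), (5, 6), (7, 2), (10, 4)]
print(insert(digest, 8))   # (7, 2) absorbs the 8 and becomes (7.33, 3)
```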
  49. HDRHistogram Percentiles • Uses a combination of logarithmic and linear bucketing • Conceptually buckets values at two levels: • Logarithmically scaled buckets • Linearly scaled sub-buckets • No bound on the values in each bucket (in practice, limited to the range of a long)
  50. HDRHistogram Percentiles • The accuracy parameter is expressed as the number of significant figures of a value to preserve in the histogram • Can be between 0 and 5 • The number of significant figures trades accuracy for memory usage • It determines the number of linear sub-buckets used for each logarithmic bucket
  51. HDRHistogram Bucketing (1 s.f.) • Logarithmic buckets 10⁰, 10¹, …, 10⁷, each divided into linear sub-buckets (e.g. 10, 20, 30, …, 90 within the 10¹ bucket)
  52. HDRHistogram Bucketing (2 s.f.) • Same logarithmic buckets, but finer linear sub-buckets (e.g. 100, 110, 120, …, 990 within the 10² bucket)
  53. Calculating HDRHistogram Percentiles [Figure: sub-bucket counts across the logarithmic buckets 10⁰–10⁷] • 250 values overall
  54. Calculating HDRHistogram Percentiles • 250 values overall • 1 is the value for 0 ≤ q < 1/250
  55. Calculating HDRHistogram Percentiles • 250 values overall • 2 is the value for 1/250 ≤ q < 4/250
  56. Calculating HDRHistogram Percentiles • 250 values overall • 70 is the value for q = 0.2
  57. Inserting Values into HDRHistogram • Inserting 42 into the histogram
  58. Inserting Values into HDRHistogram • Inserting 42 into the histogram • Find the logarithmic bucket for the value (10¹)
  59. Inserting Values into HDRHistogram • Inserting 42 into the histogram • Find the sub-bucket for the value (40)
  60. Inserting Values into HDRHistogram • Inserting 42 into the histogram • Increment the count for that sub-bucket
  61. Inserting Values into HDRHistogram • Inserting 1,400,300 into the histogram • There is no logarithmic bucket to hold the value yet
  62. Inserting Values into HDRHistogram • Inserting 1,400,300 into the histogram • Create logarithmic buckets (and sub-buckets) up through 10⁶ to cover the new value, all with zero counts
  63. Inserting Values into HDRHistogram • Inserting 1,400,300 into the histogram • Find the logarithmic bucket for the value (10⁶)
  64. Inserting Values into HDRHistogram • Inserting 1,400,300 into the histogram • Find the sub-bucket for the value (1.4 E6)
  65. Inserting Values into HDRHistogram • Inserting 1,400,300 into the histogram • Increment the count for that sub-bucket
  66. HDRHistogram Practical Notes • Implemented as a flat long array with base-2 logarithmic bucket values • Accuracy can be better than the configured significant digits, but never worse • The in-memory size depends on the range of values and the number of significant digits • The implementation takes values as longs, but a wrapper supporting doubles is available
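To illustrate the two-level bucketing in decimal terms (the real HdrHistogram works in base 2 internally, as noted above), a toy bucket-index function:

```python
# Decimal model of the two-level bucketing on the slides. Returns the
# logarithmic bucket (as a power-of-ten exponent) and the linear
# sub-bucket's lower bound for a value.
def bucket_of(value, sig_figs=1):
    assert value >= 1                              # positive values only
    exponent = len(str(int(value))) - 1            # the 10^exponent bucket
    width = 10 ** max(exponent - sig_figs + 1, 0)  # linear sub-bucket width
    return exponent, (int(value) // width) * width

print(bucket_of(42))                     # (1, 40): 10^1 bucket, sub-bucket 40
print(bucket_of(1_400_300, sig_figs=2))  # (6, 1400000): the 1.4E6 sub-bucket
```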
  67. Which should I use? • The default in Elasticsearch is currently t-digest • Use t-digest when you are interested in extreme values (e.g. the 99.99th percentile) • T-digest adapts to the data, so it suits a wide variety of distributions, at the expense of some speed • HDRHistogram is fast because its histogram layout is fixed: no compression passes or centroid recalculation • HDRHistogram requires positive values and works best when the data is zero-based, so it cannot be applied to every use case • HDRHistogram performs very well on latency data
  68. Please attribute Elastic with a link to elastic.co. Except where otherwise noted, this work is licensed under http://creativecommons.org/licenses/by-nd/4.0/ Creative Commons and the double C in a circle are registered trademarks of Creative Commons in the United States and other countries. Third-party marks and brands are the property of their respective holders.
  69. Alternative #3: Various RLE Compressed Bitmaps • Generally encode “runs” with codewords [Figure: a bitmap split into runs of all 0's, runs of all 1's, and mixed “dirty” regions]
  70. Alternative #3: Various RLE Compressed Bitmaps • Generally encode “runs” with codewords • Example encoding: 31 bits “dirty”, 3× 31 bits “all zero”, 31 bits “dirty”, 31 bits “dirty”, 2× 31 bits “all one”
  71. Alternative #3: Various RLE Compressed Bitmaps • Each codeword is a machine word: a flag bit distinguishes literal words, which carry 31 “dirty” bits verbatim, from fill words, which carry the fill value (0 or 1) and a run length (e.g. “3” = ..100, “2” = ..010)
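A toy word-aligned encoder in this WAH style, chopping the bitmap into 31-bit groups and emitting fill codewords for homogeneous runs and literal codewords for "dirty" groups (illustrative only, not any specific library's word format):

```python
# Toy WAH-style run-length encoding over 31-bit groups, as sketched above.
def wah_encode(bits):
    words, i = [], 0
    while i < len(bits):
        group = bits[i:i + 31]
        if set(group) in ({0}, {1}):                  # homogeneous group
            fill, run = group[0], 1
            while bits[i + 31 * run : i + 31 * (run + 1)] == [fill] * 31:
                run += 1                              # extend the fill run
            words.append(("fill", fill, run))         # run groups of fill bits
            i += 31 * run
        else:
            words.append(("dirty", group))            # literal 31-bit group
            i += 31
    return words

bitmap = [1, 0, 1] + [0] * 90 + [1] * 31
print(wah_encode(bitmap))
# [('dirty', [1, 0, 1, 0, ...]), ('fill', 0, 2), ('fill', 1, 1)]
```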