Nearly-Optimal Mergesort

Sebastian Wild
August 20, 2018


Nearly-optimal mergesort: Fast, practical sorting methods that optimally adapt to existing runs

Mergesort can make use of existing order in the input by picking up existing runs, i.e., already-sorted segments. Since the lengths of these runs can be arbitrary, simply merging them as they arrive can be wasteful: merging can degenerate to inserting a single element into a long run.

In this talk, I show that we can find an optimal merging order (up to lower-order terms of the cost) with negligible overhead, and thereby get the same worst-case guarantee as standard mergesort (again up to lower-order terms) while exploiting existing runs when they are present. I present two new mergesort variants, peeksort and powersort, that are simple, stable, optimally adaptive, and fast in practice (never slower than standard mergesort and Timsort, but significantly faster on certain inputs).

I gave this talk at ESA 2018 and it is based on joint work with Ian Munro. The paper and further information can be found on my website:
https://www.wild-inter.net/publications/munro-wild-2018
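For intuition, here is a minimal Python sketch of the core idea behind peeksort: split near the midpoint, but only at an existing run boundary, then sort both halves recursively and merge. The helper names (`split_point`, `merge`) are my own, and re-scanning runs at every level is a simplification of the paper's implementation, which avoids that overhead.

```python
def merge(a, lo, mid, hi):
    """Stable two-way merge of the sorted ranges a[lo:mid] and a[mid:hi]."""
    out, i, j = [], lo, mid
    while i < mid and j < hi:
        if a[j] < a[i]:           # strict comparison keeps equal elements stable
            out.append(a[j]); j += 1
        else:
            out.append(a[i]); i += 1
    out += a[i:mid] + a[j:hi]
    a[lo:hi] = out

def split_point(a, lo, hi):
    """Return the boundary of the run containing the midpoint that lies
    closer to the midpoint (runs = maximal sorted segments)."""
    mid = (lo + hi) // 2
    left = mid
    while left > lo and a[left - 1] <= a[left]:
        left -= 1
    right = mid
    while right + 1 < hi and a[right] <= a[right + 1]:
        right += 1
    right += 1                    # boundary just after the run's last element
    return left if mid - left <= right - mid else right

def peeksort(a, lo=0, hi=None):
    """Sort a[lo:hi] in place, never splitting an existing run."""
    if hi is None:
        hi = len(a)
    if hi - lo <= 1:
        return
    cut = split_point(a, lo, hi)
    if cut == lo or cut == hi:    # the whole range is a single run: done
        return
    peeksort(a, lo, cut)
    peeksort(a, cut, hi)
    merge(a, lo, cut, hi)
```

On an already-sorted input this detects the single run at the top level and does no merging at all, which is where the adaptivity comes from.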


Transcript

  1. Nearly-Optimal Mergesorts: Fast, Practical Sorting Methods That Optimally Adapt to

    Existing Runs. Sebastian Wild (wild@uwaterloo.ca), joint work with Ian Munro. ESA 2018, 26th Annual European Symposium on Algorithms. Sebastian Wild, Nearly-Optimal Mergesorts, 2018-08-20, 0 / 15
  2. Outline: 1 Adaptive Sorting – Status Quo; 2 Natural Mergesort; 3 Peeksort; 4 Powersort; 5 Experiments.
  3. Adaptive Sorting

    Adaptive algorithm: exploit “structure” of the input; adaptive sorting: exploit “presortedness” — few inversions, few runs, few outliers, ... many more. Optimal algorithms (up to constant factors!) are known for many measures of presortedness. Want: practical methods that are optimal up to lower-order terms, have low overhead for detecting presortedness, and are competitive on inputs without presortedness.
  16. State of the art

    1 “fat-pivot” quicksort: split into < P, = P, > P; adapts to duplicate elements on average; optimal up to a small constant factor (1.386 for plain quicksort, 1.188 with median-of-3, 1.088 with ninther); low-overhead implementation; ...
    2 Timsort: adaptive mergesort variant; ...; adapts to existing runs, but not optimally! a factor ≥ 1.5 worse (Buss & Knop 2018). Timsort still broken! “it is still possible to cause the Java implementation to fail: [...] causing an error at runtime in Java’s sorting method.” Observation: Timsort’s merge rules are quite intricate. ? Why these rules? ? Why are they so sensitive to small changes (cf. the Java version!)? ... and can’t we find simpler rules?
  35. Run-Length Entropy

    Our measure of unsortedness: runs, i.e., maximal contiguous sorted ranges. Simple version: lg(#runs). Fine-grained version: the entropy of the run lengths L1, ..., Lr: H(L1/n, ..., Lr/n) = Σ_{i=1}^{r} (Li/n) · lg(n/Li). Comparison lower bound: there are n! permutations in total, but the input is sorted within runs, so n!/(L1! ··· Lr!) possible inputs remain. Need lg( n!/(L1! ··· Lr!) ) = H(L1/n, ..., Lr/n) · n − O(n) comparisons (written only H in the following).
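Both quantities from this slide can be computed directly; a small sketch (the function names are mine, not from the talk):

```python
import math

def run_length_entropy(run_lengths):
    """H(L1/n, ..., Lr/n) = sum of (Li/n) * lg(n/Li) over the runs."""
    n = sum(run_lengths)
    return sum(L / n * math.log2(n / L) for L in run_lengths)

def lower_bound_cmps(run_lengths):
    """lg( n! / (L1! ... Lr!) ): comparisons needed to distinguish all
    inputs that are sorted within each run."""
    n = sum(run_lengths)
    return math.log2(math.factorial(n)) - sum(
        math.log2(math.factorial(L)) for L in run_lengths)
```

For r equal-length runs the entropy is exactly lg r (e.g. four runs of length 4 give H = 2), and the counting bound stays within the stated O(n) of H · n.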
  45. Outline: 1 Adaptive Sorting – Status Quo; 2 Natural Mergesort; 3 Peeksort; 4 Powersort; 5 Experiments.
  46. Natural Mergesort

    “natural” mergesort = run-adaptive mergesort (Knuth 1973). Conceptually two steps (interleaved in code): 1 Find runs in the input. 2 Merge them in some order (Knuth: simple bottom-up). Here: only binary merges (step 2 becomes: merge 2 runs, repeat until a single run remains) and only stable sorts (merge 2 adjacent runs). Merge trees: [example merge tree over the input 15 17 12 19 2 9 13 7 11 1 4 8 10 14 23 5 21 3 6 16 18 20 22]. Merge costs: cost of a merge := size of its output (≈ memory transfers, ≈ #cmps); total cost = total area of the merge tree.
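The two conceptual steps can be sketched as Knuth-style natural mergesort: find the maximal runs, then merge adjacent pairs bottom-up, accumulating the merge cost (= summed output sizes, i.e., the area of the merge tree). Function names are illustrative.

```python
def find_runs(a):
    """Split a into maximal non-decreasing segments; returns (start, end) pairs."""
    runs, start = [], 0
    for i in range(1, len(a)):
        if a[i] < a[i - 1]:       # strict descent ends the current run
            runs.append((start, i))
            start = i
    runs.append((start, len(a)))
    return runs

def natural_mergesort(a):
    """Sort a in place by merging adjacent runs bottom-up;
    return the total merge cost (sum of output sizes)."""
    runs = find_runs(a)
    cost = 0
    while len(runs) > 1:
        nxt = []
        for i in range(0, len(runs) - 1, 2):
            (lo, mid), (_, hi) = runs[i], runs[i + 1]
            merged, x, y = [], lo, mid  # stable merge of a[lo:mid], a[mid:hi]
            while x < mid and y < hi:
                if a[y] < a[x]:
                    merged.append(a[y]); y += 1
                else:
                    merged.append(a[x]); x += 1
            merged += a[x:mid] + a[y:hi]
            a[lo:hi] = merged
            cost += hi - lo
            nxt.append((lo, hi))
        if len(runs) % 2:         # odd run carries over to the next round
            nxt.append(runs[-1])
        runs = nxt
    return cost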
  72. Mergesort meets search trees Different merge trees yield different cost!

    Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 5 / 15
  73. Mergesort meets search trees Different merge trees yield different cost!

    2 4 6 8 10 12 14 16 3 5 1 9 7 17 11 13 15 0 2 4 6 8 10 12 14 16 3 5 1 9 7 17 11 13 15 0 Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 5 / 15
  74. Mergesort meets search trees Different merge trees yield different cost!

    2 4 6 8 10 12 14 16 3 5 1 9 7 17 11 13 15 0 2 4 6 8 10 12 14 16 3 5 1 9 7 17 11 13 15 0 Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 5 / 15
  75. Mergesort meets search trees Different merge trees yield different cost!

    2 4 6 8 10 12 14 16 3 5 1 9 7 17 11 13 15 0 merge costs: 42 2 4 6 8 10 12 14 16 3 5 1 9 7 17 11 13 15 0 Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 5 / 15
  76. Mergesort meets search trees Different merge trees yield different cost!

    2 4 6 8 10 12 14 16 3 5 1 9 7 17 11 13 15 0 merge costs: 42 2 4 6 8 10 12 14 16 3 5 1 9 7 17 11 13 15 0 Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 5 / 15
  77. Mergesort meets search trees Different merge trees yield different cost!

    2 4 6 8 10 12 14 16 3 5 1 9 7 17 11 13 15 0 merge costs: 42 2 4 6 8 10 12 14 16 3 5 1 9 7 17 11 13 15 0 merge costs: 71 Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 5 / 15
  78. Mergesort meets search trees Different merge trees yield different cost!

    15 17 12 19 2 9 13 7 11 1 4 8 10 14 23 5 21 3 6 16 18 20 22 Merge cost = total area of Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 5 / 15
  79. Mergesort meets search trees Different merge trees yield different cost!

    15 17 12 19 2 9 13 7 11 1 4 8 10 14 23 5 21 3 6 16 18 20 22 Merge cost = total area of = total length of paths to all array entries Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 5 / 15
  80. Mergesort meets search trees Different merge trees yield different cost!

    2 3 2 2 6 2 6 Merge cost = total area of = total length of paths to all array entries = weighted external path w leaf weight(w) · depth(w) length Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 5 / 15
  81. Mergesort meets search trees Different merge trees yield different cost!

    2 3 2 2 6 2 6 Merge cost = total area of = total length of paths to all array entries = weighted external path w leaf weight(w) · depth(w) length optimal merge tree = optimal BST for leaf weights L1, . . . , Lr Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 5 / 15
  82. Mergesort meets search trees Different merge trees yield different cost!

    2 3 2 2 6 2 6 Merge cost = total area of = total length of paths to all array entries = weighted external path w leaf weight(w) · depth(w) length optimal merge tree = optimal BST for leaf weights L1, . . . , Lr How to compute good merge tree? Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 5 / 15
  83. Mergesort meets search trees Different merge trees yield different cost!

    2 3 2 2 6 2 6 Merge cost = total area of = total length of paths to all array entries = weighted external path w leaf weight(w) · depth(w) length optimal merge tree = optimal BST for leaf weights L1, . . . , Lr How to compute good merge tree? Huffman merge Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 5 / 15
  84. Mergesort meets search trees Different merge trees yield different cost!

    2 3 2 2 6 2 6 Merge cost = total area of = total length of paths to all array entries = weighted external path w leaf weight(w) · depth(w) length optimal merge tree = optimal BST for leaf weights L1, . . . , Lr How to compute good merge tree? Huffman merge merge shortest runs Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 5 / 15
  85. Mergesort meets search trees Different merge trees yield different cost!

    2 3 2 2 6 2 6 Merge cost = total area of = total length of paths to all array entries = weighted external path w leaf weight(w) · depth(w) length optimal merge tree = optimal BST for leaf weights L1, . . . , Lr How to compute good merge tree? Huffman merge merge shortest runs indep. discovered Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 5 / 15
  86. Mergesort meets search trees Different merge trees yield different cost!

    [merge-tree figure] Merge cost = total area = total length of paths to all array entries = weighted external path length, i.e., the sum over leaves w of weight(w) · depth(w). Optimal merge tree = optimal BST for leaf weights L1, . . . , Lr. How to compute a good merge tree? Huffman merge (merge shortest runs; independently discovered by Golin & Sedgewick 1993, Takaoka 1998, Barbay & Navarro 2009, Chandramouli & Goldstein 2014): must sort lengths, not stable. Hu-Tucker merge (optimal alphabetic tree): have to store lengths, complicated algorithm. Nearly-optimal BST merge (...the 70s are calling): simple (greedy) linear-time methods, almost optimal (cost ≤ H + 2); but still have to store lengths and need an extra scan to detect runs. Avoidable? Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 5 / 15
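The Huffman merge policy mentioned above (always merge the two shortest runs) can be sketched in a few lines with a min-heap. This is a minimal illustration, not code from the talk; the function name and the example run lengths are mine. Note that the policy orders runs by length rather than by position, which is exactly why it is not stable.

```python
import heapq

def huffman_merge_cost(run_lengths):
    """Cost of the Huffman merge policy (always merge the two shortest
    runs): merging runs of lengths a and b costs a + b elements moved."""
    heap = list(run_lengths)
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        cost += a + b
        heapq.heappush(heap, a + b)
    return cost

# hypothetical run profile; the total cost equals the weighted external
# path length of the resulting (Huffman) merge tree
print(huffman_merge_cost([2, 3, 2, 2, 6]))  # -> 33
```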
  102. Outline 1 Adaptive Sorting – Status Quo 2 Natural Mergesort 3 Peeksort 4 Powersort 5 Experiments

    Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 5 / 15
  103. Peeksort Method 1: weight-balancing (Mehlhorn 1975, Bayer 1975)

    Choose the root to balance subtree weights, then recurse on the subtrees. Peeksort can simulate weight-balancing without knowing/storing all runs! "Peek" at the middle of the array to find the closest run boundary, split there and recurse. Can avoid redundant work: find the full run straddling the midpoint; 4 parameters for recursive calls, ℓ, r, e, s, where e and s store the outermost runs (empty if ℓ = e resp. s = r); each run is scanned only once. Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 6 / 15
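The recursion just described can be sketched as follows. This is a simplified reconstruction from the slide's description (peek at the middle, extend to the surrounding run boundaries, split at the one nearest the midpoint, recurse, merge); the extra parameters e and s that let the real algorithm scan each run only once are omitted, and all identifiers are mine.

```python
def merge(left, right):
    """Stable merge of two sorted lists."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

def peeksort(a, lo=0, hi=None):
    """Peek at the middle, extend to the run boundaries around it,
    split at the boundary closest to the midpoint, recurse, merge."""
    if hi is None:
        hi = len(a)
    if hi - lo <= 1:
        return
    mid = lo + (hi - lo) // 2
    i = mid                       # start of the run containing a[mid]
    while i > lo and a[i - 1] <= a[i]:
        i -= 1
    j = mid + 1                   # end of that run (exclusive)
    while j < hi and a[j - 1] <= a[j]:
        j += 1
    if i == lo and j == hi:
        return                    # the whole range is one run: done
    cut = i if mid - i < j - mid else j
    if cut == lo or cut == hi:    # never split off an empty range
        cut = j if cut == lo else i
    peeksort(a, lo, cut)
    peeksort(a, cut, hi)
    a[lo:hi] = merge(a[lo:cut], a[cut:hi])
```

On a nearly sorted input the straddling-run check pays off: a single long run is detected and returned without any merging.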
  128. Analysis of peeksort

    Theorem (Horibe 1977, Bayer 1975): Weight-balancing on leaf probabilities α1, . . . , αr yields a BST with search cost (exp. #cmps to find a random leaf chosen with prob. αi) C ≤ H(α1, . . . , αr) + 2. Immediate corollary: Peeksort incurs merge cost M ≤ (H + 2)n; Peeksort needs C ≤ n (detect runs) + (H + 2)n (merge cost) cmps. Both are optimal up to O(n) terms: Peeksort exploits existing runs optimally up to lower order terms! Are we done then? Revisiting the earlier concerns (have to store lengths, extra scan to detect runs): one run at a time, we load runs (peeking) without putting memory transfers to good use ... can't we do better? Timsort does better: a newly detected run is usually merged soon after. Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 7 / 15
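The quantities in the bound are easy to evaluate: with runs of lengths L1, . . . , Lr and αi = Li / n, the entropy H = Σ αi lg(1/αi) gives the merge-cost bound (H + 2)n. A small numeric sketch (the run profile is a made-up example, not one from the talk):

```python
from math import log2

def run_entropy(run_lengths):
    """H(alpha_1, ..., alpha_r) for alpha_i = L_i / n: the entropy of
    the run-length profile appearing in the (H + 2)n bound."""
    n = sum(run_lengths)
    return sum((L / n) * log2(n / L) for L in run_lengths)

runs = [6, 2, 6, 2]   # hypothetical run profile, n = 16
n = sum(runs)
H = run_entropy(runs)
print(f"H = {H:.3f}, merge-cost bound (H + 2) * n = {(H + 2) * n:.1f}")
```

A single run gives H = 0 (merge cost O(n) only), while r equal runs give H = lg r, recovering the usual n lg r merging bound.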
  144. Outline 1 Adaptive Sorting – Status Quo 2 Natural Mergesort 3 Peeksort 4 Powersort 5 Experiments

    Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 7 / 15
  145. The bisection heuristic Timsort proceed left to right: detect the

    next run push it onto stack of runs merge some devil is in the details runs from the stack cannot use weight-balancing Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
  146. The bisection heuristic Timsort proceed left to right: detect the

    next run push it onto stack of runs merge some devil is in the details runs from the stack cannot use weight-balancing Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
  147. The bisection heuristic Timsort proceed left to right: detect the

    next run push it onto stack of runs merge some devil is in the details runs from the stack cannot use weight-balancing Method 2 : bisection (Mehlhorn 1977) Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
  148. The bisection heuristic Timsort proceed left to right: detect the

    next run push it onto stack of runs merge some devil is in the details runs from the stack cannot use weight-balancing Method 2 : bisection (Mehlhorn 1977) Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
  149. The bisection heuristic Timsort proceed left to right: detect the

    next run push it onto stack of runs merge some devil is in the details runs from the stack cannot use weight-balancing Method 2 : bisection (Mehlhorn 1977) 1⁄2 Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
  150. The bisection heuristic Timsort proceed left to right: detect the

    next run push it onto stack of runs merge some devil is in the details runs from the stack cannot use weight-balancing Method 2 : bisection (Mehlhorn 1977) 1⁄2 Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
  151. The bisection heuristic Timsort proceed left to right: detect the

    next run push it onto stack of runs merge some devil is in the details runs from the stack cannot use weight-balancing Method 2 : bisection (Mehlhorn 1977) 1⁄2 Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
  152. The bisection heuristic Timsort proceed left to right: detect the

    next run push it onto stack of runs merge some devil is in the details runs from the stack cannot use weight-balancing Method 2 : bisection (Mehlhorn 1977) 1⁄2 1⁄2 1⁄4 3⁄4 Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
  153. The bisection heuristic Timsort proceed left to right: detect the

    next run push it onto stack of runs merge some devil is in the details runs from the stack cannot use weight-balancing Method 2 : bisection (Mehlhorn 1977) 1⁄2 1⁄2 1⁄4 3⁄4 weight-balancing chose this! Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
  154. The bisection heuristic Timsort proceed left to right: detect the

    next run push it onto stack of runs merge some devil is in the details runs from the stack cannot use weight-balancing Method 2 : bisection (Mehlhorn 1977) 1⁄2 1⁄2 1⁄4 3⁄4 weight-balancing chose this! Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
  155. The bisection heuristic Timsort proceed left to right: detect the

    next run push it onto stack of runs merge some devil is in the details runs from the stack cannot use weight-balancing Method 2 : bisection (Mehlhorn 1977) 1⁄2 1⁄2 1⁄4 3⁄4 weight-balancing chose this! Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
  156. The bisection heuristic Timsort proceed left to right: detect the

    next run push it onto stack of runs merge some devil is in the details runs from the stack cannot use weight-balancing Method 2 : bisection (Mehlhorn 1977) 1⁄2 1⁄2 1⁄4 3⁄4 weight-balancing chose this! 1⁄2 1⁄4 3⁄4 1⁄8 7⁄8 Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
  157. The bisection heuristic Timsort proceed left to right: detect the

    next run push it onto stack of runs merge some devil is in the details runs from the stack cannot use weight-balancing Method 2 : bisection (Mehlhorn 1977) 1⁄2 1⁄2 1⁄4 3⁄4 weight-balancing chose this! 1⁄2 1⁄4 3⁄4 1⁄8 7⁄8 Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
  158. The bisection heuristic Timsort proceed left to right: detect the

    next run push it onto stack of runs merge some devil is in the details runs from the stack cannot use weight-balancing Method 2 : bisection (Mehlhorn 1977) 1⁄2 1⁄2 1⁄4 3⁄4 weight-balancing chose this! 1⁄2 1⁄4 3⁄4 1⁄8 7⁄8 split out of range! no node created Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
  159. The bisection heuristic Timsort proceed left to right: detect the

    next run push it onto stack of runs merge some devil is in the details runs from the stack cannot use weight-balancing Method 2 : bisection (Mehlhorn 1977) 1⁄2 1⁄2 1⁄4 3⁄4 weight-balancing chose this! 1⁄2 1⁄4 3⁄4 1⁄8 7⁄8 split out of range! no node created 1⁄2 1⁄4 3⁄4 1⁄8 7⁄8 15⁄16 Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
  160. The bisection heuristic Timsort proceed left to right: detect the

    next run push it onto stack of runs merge some devil is in the details runs from the stack cannot use weight-balancing Method 2 : bisection (Mehlhorn 1977) 1⁄2 1⁄2 1⁄4 3⁄4 weight-balancing chose this! 1⁄2 1⁄4 3⁄4 1⁄8 7⁄8 split out of range! no node created 1⁄2 1⁄4 3⁄4 1⁄8 7⁄8 15⁄16 Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
  161. The bisection heuristic Timsort proceed left to right: detect the

    next run push it onto stack of runs merge some devil is in the details runs from the stack cannot use weight-balancing Method 2 : bisection (Mehlhorn 1977) 1⁄2 1⁄2 1⁄4 3⁄4 weight-balancing chose this! 1⁄2 1⁄4 3⁄4 1⁄8 7⁄8 split out of range! no node created 1⁄2 1⁄4 3⁄4 1⁄8 7⁄8 15⁄16 Alternative view: node powers inner node midpoint interval = normalized interval [1..n] → [0, 1] power = min s.t. contains c · 2− depends only on 2 runs Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
  162. The bisection heuristic Timsort proceed left to right: detect the

    next run push it onto stack of runs merge some devil is in the details runs from the stack cannot use weight-balancing Method 2 : bisection (Mehlhorn 1977) 1⁄2 1⁄2 1⁄4 3⁄4 weight-balancing chose this! 1⁄2 1⁄4 3⁄4 1⁄8 7⁄8 split out of range! no node created 1⁄2 1⁄4 3⁄4 1⁄8 7⁄8 15⁄16 Alternative view: node powers inner node midpoint interval = normalized interval [1..n] → [0, 1] power = min s.t. contains c · 2− depends only on 2 runs Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
  163. The bisection heuristic Timsort proceed left to right: detect the

    next run, push it onto a stack of runs, and merge some runs from the stack; the devil is in the details: we cannot use weight-balancing here. Method 2: bisection (Mehlhorn 1977). [Figure: bisection tree with split points 1⁄2, 1⁄4, 3⁄4, 1⁄8, 7⁄8, 15⁄16; a split out of a boundary's range creates no node, unlike the split weight-balancing would have chosen.] Alternative view: node powers. Each inner node corresponds to a midpoint interval (normalize [1..n] to [0, 1]); its power is the minimal ℓ such that the interval contains a point c · 2^(−ℓ). The power depends only on the 2 adjacent runs. Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 8 / 15
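The node-power view lends itself to a direct implementation. A minimal Python sketch of the definition above, assuming 0-indexed, inclusive run boundaries; the function name and interface are chosen here for illustration, not taken from the paper:

```python
def node_power(s1, e1, s2, e2, n):
    """Power of the merge-tree node between adjacent runs A[s1..e1] and
    A[s2..e2] (0-indexed, inclusive) in an array of length n: the minimal
    l such that the normalized midpoints of the two runs lie in different
    dyadic intervals of width 2**-l, i.e. the interval between them
    contains a point c * 2**-l. (Real implementations replace this loop
    by fixed-point arithmetic and a count-leading-zeros instruction.)"""
    a = (s1 + e1) / 2 / n   # normalized midpoint of run 1, in [0, 1)
    b = (s2 + e2) / 2 / n   # normalized midpoint of run 2; a < b
    l = 0
    while int(a * (1 << l)) == int(b * (1 << l)):
        l += 1
    return l
```

For n = 8, the boundary between A[0..3] and A[4..7] sits at the midpoint 1⁄2 and gets power 1, while the boundary between A[0..1] and A[2..3] sits at 1⁄4 and gets power 2, matching the bisection tree above.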
  165. Powersort Powersort proceed left to right: detect next run &

    compute its power, push the run onto a stack of runs, and, while the new node is less powerful than the top of the stack, merge the topmost runs. Example: runs a, b, c, d, e, f with boundary powers 3, 2, 1, 2, 4. The low-power boundaries trigger merges as the runs arrive (a + b → ab, then ab + c → abc); d and e are pushed; the final merge-down phase merges e + f → ef, then d + ef → def, and finally abc + def → abcdef. More good properties: the power is efficient to compute, O(1) with bitwise tricks (clz, count leading zeros); the stack never stores more than lg n runs, since the powers on the stack are strictly monotonic (highest on top), so the stack height is at most the maximum power, lg n + 1. Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 9 / 15
  194. Powersort Powersort proceed left to right: detect next run &

    compute its power, push the run onto a stack of runs, and, while the new node is less powerful, merge the topmost runs; the result is the fully merged array abcdef. Theorem (Mehlhorn 1977): The bisection heuristic yields a BST with search cost C ≤ H(α1, . . . , αr) + 2. Hence Powersort has the same merge/comparison cost as Peeksort: it exploits runs optimally up to lower-order terms, but it detects runs lazily, with no extra scan! More good properties: the power is efficient to compute, O(1) with bitwise tricks (clz, count leading zeros); the stack never stores more than lg n runs, since the powers on the stack are strictly monotonic (highest on top), so the stack height is at most the maximum power, lg n + 1. Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 9 / 15
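The stack discipline walked through on these slides can be assembled into a compact sketch. A hedged Python version, assuming ascending runs only and a plain out-of-place merge (real implementations also reverse descending runs, use galloping or other tuned merges, and compute powers with bitwise tricks):

```python
def extend_run(A, s):
    """End index (inclusive) of the maximal weakly increasing run at s."""
    e = s
    while e + 1 < len(A) and A[e] <= A[e + 1]:
        e += 1
    return e

def merge(A, s, m, e):
    """Stable merge of sorted A[s..m] and A[m+1..e]."""
    left, right = A[s:m + 1], A[m + 1:e + 1]
    i = j = 0
    for k in range(s, e + 1):
        if j >= len(right) or (i < len(left) and left[i] <= right[j]):
            A[k] = left[i]; i += 1
        else:
            A[k] = right[j]; j += 1

def node_power(s1, e1, s2, e2, n):
    """Minimal l s.t. the runs' normalized midpoints are separated
    by a point c * 2**-l (see the bisection slide)."""
    a = (s1 + e1) / 2 / n
    b = (s2 + e2) / 2 / n
    l = 0
    while int(a * (1 << l)) == int(b * (1 << l)):
        l += 1
    return l

def powersort(A):
    n = len(A)
    if n <= 1:
        return
    stack = []                        # (run start, power); powers increase toward top
    s1, e1 = 0, extend_run(A, 0)      # current run
    while e1 < n - 1:
        s2, e2 = e1 + 1, extend_run(A, e1 + 1)
        p = node_power(s1, e1, s2, e2, n)
        # boundaries with power >= p are deeper in the merge tree,
        # so their merges must happen before the new boundary's merge
        while stack and stack[-1][1] >= p:
            s0, _ = stack.pop()
            merge(A, s0, s1 - 1, e1)
            s1 = s0
        stack.append((s1, p))
        s1, e1 = s2, e2
    while stack:                      # merge-down phase
        s0, _ = stack.pop()
        merge(A, s0, s1 - 1, e1)
        s1 = s0
```

Each boundary's power is fixed as soon as its two runs are known, so merges can be decided on the fly; the merge-down phase at the end empties the stack.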
  203. Outline 1 Adaptive Sorting – Status Quo 2 Natural Mergesort

    3 Peeksort 4 Powersort 5 Experiments Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 9 / 15
  209. Experimental Evaluation Hypotheses: 1 Negligible overhead: Peek- and powersort are

    as fast as standard mergesort on inputs with high H. 2 Run-adaptiveness helps: Adaptive methods are faster on inputs with low H. 3 Timsort’s weak point: Timsort is much slower than peek-/powersort on certain inputs. Setup: Java implementations, reproduced in C++; mildly hand-tuned code; sorting int[]s of length around 10^7. Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 10 / 15
  212. Negligible Overhead 1 Negligible overhead: Peek- and powersort are as

    good as standard mergesort on inputs with high H. Study: random permutations, Java runtimes (also reproduced in C++). [Plot: time / (n lg n) against n = 10^5 … 10^8 for top-down mergesort, bottom-up mergesort, peeksort, powersort, Timsort, trotsort (Timsort w/o galloping), and Arrays.sort(int[]) (quicksort, not stable).] Timsort’s galloping merge is too slow here. Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 11 / 15
  215. Negligible Overhead 1 Negligible overhead: Peek- and powersort are as

    good as standard mergesort on inputs with high H. Study: random permutations, Java runtimes (also reproduced in C++). [Plot, zoomed to time / (n lg n) between 4.2 and 4.6, n = 10^5 … 10^8: top-down mergesort, bottom-up mergesort, peeksort, powersort, Timsort, trotsort (Timsort w/o galloping), Arrays.sort(int[]).] No significant difference to standard mergesort. Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 11 / 15
  221. Run-adaptiveness helps 2 Run-adaptiveness helps: Adaptive methods are faster on

    inputs with low H. Study: “random runs”: a random permutation with consecutive ranges of Geometric(1/√n)-distributed lengths sorted; Java runtimes (also reproduced in C++), n = 10^7. This gives ≈ √n runs of average length √n; 10% of the runs are shorter than 0.1 √n and 5% are longer than 5 √n: moderate presortedness. [Plot: running time, 500–700 ms, and merge cost for top-down mergesort, bottom-up mergesort, peeksort, powersort, Timsort, trotsort (no galloping), Arrays.sort(int[]).] The adaptive methods beat quicksort by 20%. Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 12 / 15
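The “random runs” input model is easy to reproduce. A Python sketch under the stated model: shuffle 1..n, then sort consecutive ranges whose lengths are drawn Geometric(1/√n). The function name and seeding are chosen here for illustration:

```python
import random

def random_runs(n, seed=1):
    """Random permutation of 1..n with consecutive ranges of
    Geometric(1/sqrt(n)) lengths sorted, giving about sqrt(n) runs
    of average length sqrt(n)."""
    rng = random.Random(seed)
    A = list(range(1, n + 1))
    rng.shuffle(A)
    p = n ** -0.5                 # success probability 1/sqrt(n)
    i = 0
    while i < n:
        length = 1                # geometric: trials until first success
        while rng.random() >= p:
            length += 1
        A[i:i + length] = sorted(A[i:i + length])  # slice clamps at n
        i += length
    return A
```

With n = 10^7 this yields roughly √n ≈ 3162 runs of average length √n, the “moderate presortedness” regime of the experiment.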
  228. Timsort’s weak point 3 Timsort’s weak point: Timsort is much

    slower than peek-/powersort on certain inputs. Study: “Timsort-drags”: a known family of bad-case run-length sequences Rtim(n) = (L1, . . . , Lr) by Buss & Knop 2018; Java runtimes (also reproduced in C++), n = 2^24 ≈ 1.6 · 10^7. [Plot: running time, 1,200–2,000 ms, and normalized merge costs, 0.8–1.3, for td-mergesort, bu-mergesort, peeksort, powersort, Timsort, trotsort, Arrays.sort(int[]).] Timsort/trotsort has 40% higher merge cost, 10% higher running time in Java, and 40% higher running time in C++. Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 13 / 15
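To experiment with inputs like these, it helps to turn a run-length profile L1, . . . , Lr into a concrete array; the recursive definition of Rtim(n) itself is in Buss & Knop 2018 and is not reproduced here. A hedged Python sketch, with helper names chosen for illustration:

```python
def input_with_runs(lengths):
    """Integer array whose maximal (weakly increasing) runs have
    exactly the given lengths: each run is ascending, and each run
    starts strictly below the previous run's last element."""
    A, base = [], 0
    for L in lengths:
        A.extend(range(base, base + L))  # one ascending run
        base -= 1                        # forces a descent to the next run
    return A

def count_runs(A):
    """Number of maximal weakly increasing runs."""
    return 1 + sum(A[k] > A[k + 1] for k in range(len(A) - 1))
```

Feeding such profiles to the sorting methods reproduces the merge-cost gap: the run boundaries, not the element values, determine the merge tree.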
  229. Conclusion We have seen optimal run-adaptivity by conceptually simple methods

    with negligible overhead in running time; correctness and performance bounds are easy to prove, including the constant factor. What’s next? Improvements during the merge? Nearly-optimal multiway merging? Preprocessing by patience sort to get longer runs? A practical method that adapts to both duplicates and runs? Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 14 / 15
  237. Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 15 / 15

  238. Icons made by Freepik and Gregor Cresnar from www.flaticon.com. Squid

    photo by LuqueStock / Freepik Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 16 / 15
  239. Appendix: Negligible Overhead – C++ 1 Negligible overhead: Peek- and

    powersort are as good as standard mergesort on inputs with high H. Study: random permutations, C++ runtimes. [Plot: time / (n lg n) for n = 10^4 . . . 10^8: top-down mergesort, bottom-up mergesort, boustrophedonic bottom-up, peeksort, powersort, Timsort, trotsort (Timsort w/o galloping), trotsort (Timsort w/o galloping, straight insertion sort), std::sort.] Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 17 / 15
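The baseline against which “negligible overhead” is measured is the merge cost of standard top-down mergesort, which is ≈ n lg n (each level of the recursion contributes merges of total size n). A minimal sketch of that cost recurrence (counting, as in the talk, each merge of total size m as cost m; the function name is mine, not from the talk):

```python
def td_mergesort_cost(n):
    """Merge cost of standard top-down mergesort on n elements:
    split in halves, solve recursively, then one merge of size n."""
    if n <= 1:
        return 0  # a single element needs no merging
    left = n // 2
    return td_mergesort_cost(left) + td_mergesort_cost(n - left) + n
```

For powers of two this gives exactly n lg n, e.g. td_mergesort_cost(8) = 24 = 8 · lg 8, which is the n lg n curve the plot normalizes by.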
  241. Appendix: Run-adaptiveness helps – merge cost 2 Run-adaptiveness helps: Adaptive

    methods are faster on inputs with low H. Study: “random runs”: a random permutation with consecutive ranges of Geometric(1/√n) lengths sorted; Java runtimes, n = 10^7. This gives ≈ √n runs of average length √n (10% of runs < 0.1√n, 5% of runs > 5√n): moderate presortedness. [Plot: running time (500–700 ms) and normalized merge costs (0.55–0.7) for top-down mergesort, bottom-up mergesort, peeksort, powersort, Timsort, trotsort (no galloping), Arrays.sort(int[]).] The adaptive methods beat quicksort by 20%. Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 18 / 15
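The “random runs” inputs from this study can be sketched as follows: shuffle, then sort consecutive ranges whose lengths are geometrically distributed with mean √n. This is my reconstruction of the generator described on the slide (function name and the inverse-transform-free geometric sampling are my choices, not from the talk):

```python
import math
import random

def random_runs_input(n, seed=0):
    """Sketch of the 'random runs' inputs: a random permutation in which
    consecutive ranges of Geometric(1/sqrt(n)) lengths are sorted,
    giving ~sqrt(n) runs of mean length sqrt(n)."""
    rng = random.Random(seed)
    a = list(range(n))
    rng.shuffle(a)
    p = 1.0 / math.sqrt(n)
    i = 0
    while i < n:
        # sample a geometric run length with success probability p (mean 1/p)
        length = 1
        while rng.random() > p:
            length += 1
        a[i:i + length] = sorted(a[i:i + length])  # turn the range into a run
        i += length
    return a
```

The result is still a permutation of 0..n-1, but with moderate presortedness: roughly √n existing runs for the adaptive methods to pick up.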
  242. Appendix: Run-adaptiveness helps – C++ 2 Run-adaptiveness helps: Adaptive methods

    are faster on inputs with low H. Study: “random runs”: a random permutation with consecutive ranges of Geometric(1/√n) lengths sorted; C++ runtimes, n = 10^7. This gives ≈ √n runs of average length √n (10% of runs < 0.1√n, 5% of runs > 5√n): moderate presortedness. [Plot: running time (400–500 ms) for top-down mergesort, bottom-up mergesort, boustrophedonic bottom-up, peeksort, powersort, Timsort, trotsort (Timsort w/o galloping), trotsort (Timsort w/o galloping, straight insertion sort), std::sort.] Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 19 / 15
  243. Appendix: Timsort’s weak point – C++ 3 Timsort’s weak point:

    Timsort is much slower than peek-/powersort on certain inputs. Study: “Timsort-drags”: a known family of bad-case sequences Rtim(n) (run lengths L1, . . . , Lr) by Buss & Knop 2018; C++ runtimes, n = 2^24 ≈ 1.6 · 10^7. [Plot: running time (800–1,600 ms) for top-down mergesort, bottom-up mergesort, boustrophedonic bottom-up, peeksort, powersort, Timsort, trotsort (Timsort w/o galloping), trotsort (Timsort w/o galloping, straight insertion sort), std::sort.] Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 20 / 15
  244. Timsort How does Timsort work? Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20

    21 / 15
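All the run-adaptive methods in this talk (Timsort, peeksort, powersort) start the same way: scan left to right for the next maximal sorted segment. A minimal sketch of that run detection (weakly increasing runs only; real implementations, Timsort included, also pick up strictly decreasing runs and reverse them in place):

```python
def runs(a):
    """Decompose a into maximal weakly increasing runs,
    returned as half-open index ranges (start, end)."""
    out, i, n = [], 0, len(a)
    while i < n:
        j = i + 1
        while j < n and a[j - 1] <= a[j]:  # extend the run while sorted
            j += 1
        out.append((i, j))
        i = j
    return out
```

For example, runs([1, 2, 1, 3]) yields [(0, 2), (2, 4)]: two existing runs that a merging order then has to combine.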
  262. Timsort How does Timsort work? Timsort Proceed left to right:

    Find the next run and push it onto the run stack (top runs Z, Y, X, W, . . .). While one of rules A/B/C/D applies, merge the corresponding runs. Rule A: Z > X ⇒ Z, X+Y, . . . Rule B (¬A): Y + Z > X ⇒ Y+Z, X, . . . Rule C (¬A, ¬B): X + Y > W ⇒ Y+Z, X, W, . . . Rule D (¬A, ¬B, ¬C): Z ≥ Y ⇒ Y+Z, . . . Advantages: profits from existing runs; locality of reference for merges. But: opaque rules! Goal: run lengths ≈ Fibonacci numbers ⇒ stack height ≈ logφ(n). 2015: the claimed invariant does not hold without rule C! CPython: fixed the rules; OpenJDK: rather increased the stack size . . . Stack still too small! (Auger et al. ESA 2018) Sebastian Wild Nearly-Optimal Mergesorts 2018-08-20 21 / 15
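The rule-driven merging loop can be sketched as a run-length simulator that counts merge cost (sum of merged run sizes). Caveat: this sketch uses the CPython-style formulation of the invariant checks (merge while the top lengths violate X > Y + Z or Y > Z, merging the middle run with its smaller neighbor), which matches the spirit of rules A–D above but is not a literal transcription of the slide's rule set, and it omits the post-2015 extra check; function names are mine:

```python
def timsort_merge_cost(run_lengths):
    """Simulate Timsort-style rule-based merging on run lengths only,
    returning the total merge cost (each merge of total size m costs m)."""
    stack = []
    cost = 0

    def merge_at(i):
        # merge the runs at stack positions i and i+1
        nonlocal cost
        merged = stack[i] + stack[i + 1]
        cost += merged
        stack[i:i + 2] = [merged]

    for run in run_lengths:
        stack.append(run)
        while True:  # restore the stack invariants after each push
            n = len(stack)
            if n >= 3 and stack[-3] <= stack[-2] + stack[-1]:
                # merge the middle run with its smaller neighbor
                if stack[-3] < stack[-1]:
                    merge_at(n - 3)
                else:
                    merge_at(n - 2)
            elif n >= 2 and stack[-2] <= stack[-1]:
                merge_at(n - 2)
            else:
                break
    while len(stack) > 1:  # final collapse of the remaining runs
        merge_at(len(stack) - 2)
    return cost
```

Feeding it the run-length profiles of Timsort-drag sequences versus random-runs inputs reproduces the kind of merge-cost gap shown in the plots above, without sorting any actual data.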