Funnelselect - Cache-Oblivious Multiple Selection

Sebastian Wild
September 13, 2023

Slides for the talk at ESA 2023 in Amsterdam; paper and further details at https://www.wild-inter.net/publications/brodal-wild-2023

Transcript

  1. Funnelselect: Cache-Oblivious Multiple Selection

    Sebastian Wild, joint work with Gerth Stølting Brodal. European Symposium on Algorithms 2023, CWI Amsterdam. Slide footer: Sebastian Wild, Funnelselect: Cache-Oblivious Multiple Selection, 2023-09-05.
  2. Outline

    1 Multiple Selection
    2 Cache Oblivious Algorithms
    3 Funnelselect
    4 Conclusion
  9. “What’s this about” in 2min

    Two ancient comparison-based problems on an unordered list of N elements:

    Sorting: $O(N \log_2 N)$ comparisons and time; $O\bigl(\frac{N}{B} \log_{M/B} \frac{N}{B}\bigr)$ I/Os. Internal: Mergesort. IO: Multiway Mergesort. CO: Funnelsort.

    (Single) Selection by rank, e.g. find the median: $O(N)$ comparisons and time; $O(N/B)$ I/Os. Internal, IO and CO: Median-of-medians algorithm.

    Multiple Selection sits between the two: the more sorted the output, the more expensive.

    This talk: What happens between selection and sorting in the cache-oblivious model?
  14. The Multiple Selection Problem

    Goal: find the q elements of ranks $r_1 < r_2 < \dots < r_q$ from unsorted elements $x_1, \dots, x_N$ (here: all distinct) ⇝ report the sought elements $x_{(r_1)}, \dots, x_{(r_q)}$ in sorted order.

    Example (N = 15):
    Input $x_1, \dots, x_{15}$: 67 30 45 33 15 99 26 90 55 9 96 45 95 31 3
    Sorted (ranks 1 to 15): 3 9 15 26 30 31 33 45 45 55 67 90 95 96 99
    Query ranks: $r_1 = 1$, $r_2 = 2$, $r_3 = 3$, $r_4 = 8$
    Answer: 3, 9, 15, 45

    Simple algorithms:
    1 q calls to a selection algorithm (quickselect, median of medians) ⇝ $O(Nq)$
    2 Divide & conquer 1: (single) select the $r_{\lceil q/2 \rceil}$-th smallest and partition $x_1, \dots, x_N$ around it. Recursively select $r_1, \dots, r_{\lceil q/2 \rceil - 1}$ and $r_{\lceil q/2 \rceil + 1}, \dots, r_q$ ⇝ $O(N \lg q)$
    3 Divide & conquer 2: Find the median of $x_1, \dots, x_N$ and split the query ranks. Recurse where the subproblem contains query ranks. ⇝ $O(N \lg q)$
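Divide & conquer 1 from the list of simple algorithms can be sketched in Python as follows. This is a minimal in-memory illustration, not the authors' code: the names `select` and `multiselect` are hypothetical, randomized quickselect stands in for the single-selection step, and three-way partitioning is an added assumption so that repeated values (such as the two 45s in the example) are handled.

```python
import random


def select(xs, r):
    """Return the r-th smallest element of xs (1-based) by randomized quickselect."""
    xs = list(xs)
    while True:
        pivot = random.choice(xs)
        less = [x for x in xs if x < pivot]
        n_equal = sum(1 for x in xs if x == pivot)
        if r <= len(less):
            xs = less
        elif r <= len(less) + n_equal:
            return pivot
        else:
            r -= len(less) + n_equal
            xs = [x for x in xs if x > pivot]


def multiselect(xs, ranks):
    """Return the elements with the given sorted 1-based ranks, in sorted order."""
    if not ranks:
        return []
    # Pick the middle query rank (r_ceil(q/2) in the slide's 1-based notation).
    mid = (len(ranks) - 1) // 2
    pivot = select(xs, ranks[mid])
    less = [x for x in xs if x < pivot]
    greater = [x for x in xs if x > pivot]
    n_equal = len(xs) - len(less) - len(greater)
    lo, hi = len(less), len(less) + n_equal
    left_ranks = [r for r in ranks if r <= lo]
    hits = [r for r in ranks if lo < r <= hi]       # ranks answered by the pivot
    right_ranks = [r - hi for r in ranks if r > hi]  # re-based for the right part
    return (multiselect(less, left_ranks)
            + [pivot] * len(hits)
            + multiselect(greater, right_ranks))


data = [67, 30, 45, 33, 15, 99, 26, 90, 55, 9, 96, 45, 95, 31, 3]
print(multiselect(data, [1, 2, 3, 8]))
```

Each recursion level touches every surviving element a constant number of times in expectation, and halving the query ranks gives about lg q levels, which matches the O(N lg q) bound stated on the slide.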
  17. Fine-grained bounds

    “Gap Entropy”: $\mathcal{B} = \sum_{i=1}^{q+1} \Delta_i \log_2 \frac{N}{\Delta_i}$ with $\Delta_i = r_i - r_{i-1}$ ($1 \le i \le q+1$, $r_0 = 0$ and $r_{q+1} = N+1$).

    Example: $r_1 = 1$, $r_2 = 2$, $r_3 = 3$, $r_4 = 8$ with N = 15 gives $\Delta_1 = 1$, $\Delta_2 = 1$, $\Delta_3 = 1$, $\Delta_4 = 5$, $\Delta_5 = 8$ ⇝ $\mathcal{B} \approx 26.9$.

    Lower bound of $\mathcal{B} - O(N)$ comparisons via a “finish sorting” argument.

    Upper bound of $\mathcal{B} + o(\mathcal{B}) + O(N)$ comparisons (Kaligosi, Mehlhorn, Munro & Sanders: Towards Optimal Multiple Selection, ICALP 2005) follows from a nontrivial analysis; much easier: repeated median selection & recursion gets $O(\mathcal{B} + N)$ (→ paper).

    ⇝ In internal memory, the multiple-selection problem is solved ✓
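The gap entropy of the example can be checked numerically. This is a small sketch, assuming the 15-element example with query ranks 1, 2, 3, 8; the function names `gaps` and `gap_entropy` are illustrative and do not come from the talk.

```python
from math import log2


def gaps(n, ranks):
    """Gaps Delta_i = r_i - r_(i-1) with sentinels r_0 = 0 and r_(q+1) = n + 1."""
    bounds = [0] + list(ranks) + [n + 1]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def gap_entropy(n, ranks):
    """Sum of Delta_i * log2(n / Delta_i) over all q + 1 gaps."""
    return sum(d * log2(n / d) for d in gaps(n, ranks))


print(gaps(15, [1, 2, 3, 8]))                    # the five gaps of the example
print(round(gap_entropy(15, [1, 2, 3, 8]), 1))   # the slide reports about 26.9
```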
  19. Section 2: Cache Oblivious Algorithms
  23. IO Model & CO Model

    Setup: a CPU with a cache of size M (fast random access) and an external memory of unbounded size (slow; access only in blocks of B cells).

    IO Model (External Memory Model): cost of computation = number of “I/Os” = number of blocks transferred between cache & external memory.

    Cache Oblivious (CO) Model: the algorithm can’t use M and B (I/Os are automatic, OPT paging policy) ⇝ a CO algorithm works for all M and B, even for hierarchies! (Frigo, Leiserson, Prokop, Ramachandran: Cache-oblivious algorithms, TALG 2012)

    Tall Cache Assumption: $M \ge B^{1+\varepsilon}$ (think: $\varepsilon = 1$) ⇝ the cache fits many cache lines. Necessary for the existence of I/O-optimal (comparison-based) cache-oblivious sorting algorithms (Brodal, Fagerberg: On the limits of cache-obliviousness, STOC 2003).
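To make the cost measure concrete, the following sketch counts block transfers for a trace of memory addresses. It is an illustration under stated assumptions: the function name `count_ios` is hypothetical, and it uses LRU replacement as a stand-in for the OPT paging policy that the CO model assumes.

```python
from collections import OrderedDict


def count_ios(addresses, M, B):
    """Count block transfers for an address trace.

    Assumes a cache of M cells organised in blocks of B cells with LRU
    replacement (an assumption; the CO model itself assumes OPT paging).
    """
    capacity = M // B          # number of blocks that fit in cache
    cache = OrderedDict()      # block id -> True, ordered by recency
    ios = 0
    for a in addresses:
        block = a // B
        if block in cache:
            cache.move_to_end(block)       # hit: refresh recency
        else:
            ios += 1                        # miss: one block transfer
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)   # evict least recently used block
    return ios


N, B, M = 1024, 16, 64
scan = count_ios(range(N), M, B)
strided = count_ios((i * B + j for j in range(B) for i in range(N // B)), M, B)
print(scan, strided)
```

The sequential scan costs N/B I/Os, while the strided trace touches a new block on every access and pays one I/O per cell, which is the gap that I/O-efficient algorithms are designed to avoid.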
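  27. Previous Work & Our Results

    Reference | Comparisons | I/Os | Comments

    Single selection:
    Blum et al. 1973 | wc $5.4305N$ | $O(N/B)$ | CO, deterministic

    Multiple selection:
    Dobkin & Munro 1981 | wc $3\mathcal{B} + O(N)$ | $O((\mathcal{B} + N)/B)$ | CO, deterministic
    Hu et al. 2014 | wc $O(\mathcal{B} + N)$ | $O(\mathcal{B}_{I/O} + N/B)$ | deterministic
    Barbay et al. 2016 | wc $O(\mathcal{B} + N)$ | $O(\mathcal{B}_{I/O} + N/B)$ | online, determ., $M \ge B^{1+\varepsilon}$
    Funnelselect | E $O(\mathcal{B} + N)$ | $O(\mathcal{B}_{I/O} + N/B)$ | CO, randomized, $M \ge B^{1+\varepsilon}$
    Lower bound | $\mathcal{B} - O(N)$ | $\Omega(\mathcal{B}_{I/O}) - O(N/B)$ | $M \ge B^{1+\varepsilon}$

    Here wc = worst case, E = expected, $\mathcal{B}$ is the gap entropy and B the block size:
    $\mathcal{B} = \sum_{i=1}^{q+1} \Delta_i \log_2 \frac{N}{\Delta_i}$
    $\mathcal{B}_{I/O} = \sum_{i=1}^{q+1} \frac{\Delta_i}{B} \log_{M/B} \frac{N}{\Delta_i} = \frac{\mathcal{B}}{B \log_2(M/B)} \ll \frac{\mathcal{B}}{B}$ (usually)

    Funnelselect is the first cache-oblivious I/O-optimal algorithm.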
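The relation between the comparison gap entropy and its I/O counterpart can be checked numerically. This is a sketch only: the function names and the values chosen for N, M, B and the query ranks are made up for illustration.

```python
from math import log, log2


def gaps(n, ranks):
    """Gaps between consecutive query ranks, with sentinels 0 and n + 1."""
    bounds = [0] + list(ranks) + [n + 1]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def gap_entropy(n, ranks):
    """Comparison gap entropy: sum of Delta_i * log2(n / Delta_i)."""
    return sum(d * log2(n / d) for d in gaps(n, ranks))


def io_gap_entropy(n, ranks, M, B):
    """I/O gap entropy: sum of (Delta_i / B) * log_{M/B}(n / Delta_i)."""
    return sum((d / B) * log(n / d, M / B) for d in gaps(n, ranks))


# Hypothetical parameters: one million elements, 2^20-cell cache, 2^10-cell blocks.
n, ranks, M, B = 10**6, [10, 1000, 500000], 2**20, 2**10
b = gap_entropy(n, ranks)
b_io = io_gap_entropy(n, ranks, M, B)
print(b_io, b / (B * log2(M / B)), b / B)
```

The first two printed values agree up to rounding, and both are smaller than the third by the factor log2(M/B), which is the saving that a multiway, I/O-aware partitioning scheme can realize over running a binary algorithm naively.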
  31. Our technical challenge

    1 Key algorithmic change from internal ⇝ external memory (for sorting and multiple selection): binary partitioning ⇝ ≈ $M/B$-way partitioning. Can’t do that cache obliviously.
    2 Cache-oblivious sorting uses $N^{\delta}$-way partitioning by “funnels” ⇝ the first partitioning round already has sorting complexity ⇝ $\omega(\mathcal{B}_{I/O})$ I/Os. Can’t use that for multiple selection.
  40. Funnels for Partitioning

    Can use funnels from Funnelsort in reverse!

    Funnel: $k = \sqrt[4]{N}$-way merger (recursive binary merges) with judiciously sized buffers for intermediate results.
    Simplifying assumptions: $N = 2^{2^i}$ and d = 4.

    Figure: binary tree of partitioning nodes with pivots P8 (root); P4, P12; P2, P6, P10, P14; P1, P3, P5, P7, P9, P11, P13, P15 (bottom level). The input array enters at the root, the output arrays hang off the bottom level.
    Figure: a k-partitioner is built from a √k-partitioner on top of √k-partitioners, with $k^2$ per buffer.

    Funnel recursion (van Emde Boas).
    Funnel buffer sizes (largest in the middle layer) ⇝ overall space $O(k^{5/2}) = O(N^{5/8})$.
    Nodes partition around pivot value P.
    Partition = push input down; when an output buffer is full, recursively push; need a final flush of all buffers.
    ≈ I/O-optimal cache-oblivious gadget for $\hat{k}$-way partitioning with $\hat{k} \approx M^{0.3}$
    ⇝ Can be used for expected I/O-optimal CO quicksort (but better options available)
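The logical routing through the pivot tree can be sketched as follows. This only illustrates which bucket each element ends up in; the buffers, the lazy pushing and the van Emde Boas layout that make the real funnel cache-oblivious are deliberately omitted, and the name `partition_tree` is hypothetical.

```python
def partition_tree(xs, pivots):
    """Split xs into len(pivots) + 1 buckets via a balanced binary tree of pivots.

    With 15 sorted pivots P1..P15, the root compares against P8, its children
    against P4 and P12, and so on, as in the figure on the slide.
    """
    if not pivots:
        return [list(xs)]
    mid = len(pivots) // 2
    p = pivots[mid]
    left = [x for x in xs if x <= p]    # routed to the left subtree
    right = [x for x in xs if x > p]    # routed to the right subtree
    return (partition_tree(left, pivots[:mid])
            + partition_tree(right, pivots[mid + 1:]))


pivots = list(range(10, 160, 10))       # 15 illustrative pivots: 10, 20, ..., 150
buckets = partition_tree(range(160), pivots)
print([len(b) for b in buckets])
```

Every element passes through about log2(k) binary nodes, so the comparison count matches a direct k-way split; the point of the funnel is to organize these same steps so that they are also I/O-efficient without knowing M and B.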
  45. Early Truncation

    Recall: for multiple selection (in general), can’t use the full k-partitioner (already sorting complexity).

    Early truncation: Don’t split buckets that don’t contain any query ranks!

    Figure: the pivot tree after truncation; remaining nodes are P8 (root); P4, P12; P2, P6, P10; P1, P3, P7, P9.

    Buckets depend on (random) pivots ⇝ we only know a query’s bucket after partitioning ...
    Assume bucket boundaries are within a safety margin $\pm\xi = N^{1/2+\delta}$ of their expected location.
    ⇝ If a bucket is expected query free, don’t split further.
    “Expected query free” is known up front (depends only on N & the query ranks)
    ⇝ Can remove unwanted partitioning nodes from the funnel in preprocessing.
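One way to picture the preprocessing step is the sketch below. It is an assumption-laden illustration, not the rule from the paper: it takes the expected boundary of bucket i to be i·N/k, treats a subtree as "expected query free" when no query rank lies within the margin ξ of its expected rank range, and uses the hypothetical name `truncated_buckets`.

```python
from bisect import bisect_left, bisect_right


def truncated_buckets(n, k, ranks, xi):
    """Return the leaves of the truncated pivot tree as (lo, hi) bucket ranges.

    A leaf (lo, hi) stands for the original buckets lo, ..., hi - 1 merged into
    one output. `ranks` must be sorted; `xi` is the safety margin.
    """
    def has_query(lo, hi):
        low = lo * n / k - xi            # expected rank range, widened by xi
        high = hi * n / k + xi
        return bisect_left(ranks, low) < bisect_right(ranks, high)

    def rec(lo, hi):
        if hi - lo == 1 or not has_query(lo, hi):
            return [(lo, hi)]            # stop splitting: single bucket or query free
        mid = (lo + hi) // 2
        return rec(lo, mid) + rec(mid, hi)

    return rec(0, k)


print(truncated_buckets(1600, 16, [150, 820], 20))
```

In this made-up instance only 10 of the 16 possible outputs remain, because subtrees far from both query ranks are never split; the decision uses only N, k, the ranks and ξ, so it can be made before any element is moved.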
  47. Pivot Sampling

    Recall this? “Assume bucket boundaries are within a safety margin $\pm\xi = N^{1/2+\delta}$ of their expected location.”
    We have to make that true by choosing pivots well.

    The following randomized choice works with high probability (by standard Chernoff bound arguments):
    1 Include each element in the sample $\bar{S}$ with prob. $p = 1/\log_2(N)$.
    2 Sort the sample $\bar{S}$.
    3 Pick pivot $P_i$ as the ≈ $ipN/k$-th smallest in $\bar{S}$ ($i = 1, \dots, k-1$).
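The three sampling steps translate almost directly into code. The following is a minimal in-memory sketch; the function name `sample_pivots`, the `rng` parameter, the index clamping and the `ValueError` for a too-small sample are assumptions added for illustration.

```python
import random
from math import log2


def sample_pivots(xs, k, rng=random):
    """Choose k - 1 pivots from a Bernoulli sample of xs."""
    n = len(xs)
    p = 1 / log2(n)
    # Step 1: include each element independently with probability p.
    # Step 2: sort the sample.
    sample = sorted(x for x in xs if rng.random() < p)
    if len(sample) < k:
        raise ValueError("sample too small; a caller would restart")
    pivots = []
    for i in range(1, k):
        # Step 3: the (i * p * n / k)-th smallest sample element (1-based),
        # clamped to the valid index range as a simplification.
        idx = min(max(round(i * p * n / k) - 1, 0), len(sample) - 1)
        pivots.append(sample[idx])
    return pivots


rng = random.Random(1)
xs = list(range(4096))
rng.shuffle(xs)
print(sample_pivots(xs, 16, rng))
```

Since the expected sample size is pN, the index ipN/k targets the i/k quantile of the sample, which in turn lies close to the i/k quantile of the input.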
  49. Overall Algorithm

    Funnelselect:
    1 Sample pivots $P_1, \dots, P_{k-1}$
    2 Build a k-partitioner using $P_1, \dots, P_{k-1}$
    3 Mark expected query free buckets & rewire their parent’s buffer to output
    4 For each bucket with queries: (i) sort the bucket (Funnelsort), (ii) report the sought elements

    Observations:
    No (top-level) recursion needed.
    The algorithm can fail at several places (sample too small, sample too large, pivots too skewed, query in an expected query free bucket) ⇝ restart. But: with high probability, no fails ⇝ no effect on expected running time.
    Can be augmented to produce the input partitioned around the sought elements in contiguous external memory (default: only return the sought elements in order).
    Can be made to handle equal elements.
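The overall control flow can be illustrated with a heavily simplified in-memory sketch. It is not Funnelselect itself: a flat bucket array replaces the funnel, Python's built-in sort replaces Funnelsort, pivots are taken at evenly spaced sample positions, and buckets are skipped based on their actual sizes after partitioning rather than marked up front, so no restart logic is needed. The name `funnelselect_sketch` and its parameters are hypothetical.

```python
import random
from bisect import bisect_right
from math import log2


def funnelselect_sketch(xs, ranks, k=16, seed=0):
    """Return the elements of the given sorted 1-based ranks, in sorted order."""
    rng = random.Random(seed)
    n = len(xs)
    # Step 1: sample pivots (evenly spaced in the sorted sample; a simplification).
    p = 1 / log2(n) if n >= 2 else 1.0
    sample = sorted(x for x in xs if rng.random() < p)
    if len(sample) < k:
        pivots = []                      # tiny input: use a single bucket
    else:
        pivots = [sample[i * len(sample) // k] for i in range(1, k)]
    # Step 2: k-way partitioning (a flat stand-in for the k-partitioner).
    buckets = [[] for _ in range(len(pivots) + 1)]
    for x in xs:
        buckets[bisect_right(pivots, x)].append(x)
    # Steps 3 and 4: sort only buckets that contain a query rank and report.
    out, offset, qi = [], 0, 0
    for bucket in buckets:
        hi = offset + len(bucket)        # this bucket holds ranks offset+1 .. hi
        if qi < len(ranks) and ranks[qi] <= hi:
            bucket.sort()                # stand-in for Funnelsort
            while qi < len(ranks) and ranks[qi] <= hi:
                out.append(bucket[ranks[qi] - offset - 1])
                qi += 1
        offset = hi
    return out


data = [67, 30, 45, 33, 15, 99, 26, 90, 55, 9, 96, 45, 95, 31, 3]
print(funnelselect_sketch(data, [1, 2, 3, 8]))
```

Buckets without query ranks are never sorted, which is where the saving over full sorting comes from; the real algorithm gets the same effect inside the funnel by removing partitioning nodes before any data is moved.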
  53. Conclusion

    We presented Funnelselect, the first I/O-optimal, cache-oblivious multiple-selection algorithm.

    The presented algorithm is inherently randomized, but we have since found a deterministic method! (too late for the proceedings ... but stay tuned)

    Open Problems:
    1 Funnelselect assumes a tall cache. This cannot be avoided since I/O-optimal CO sorting (⊂ multiple selection) requires it. But single selection doesn’t! ⇝ What happens in between?
    2 Can online multiple selection be solved I/O-optimally and cache-obliviously?
  54. We’re hiring!

    3-year postdoc and PhD student positions in Liverpool, for computing over compressed graph-structured data. Sounds cool? → Talk to me!
  55. Credits

    Icons made by Freepik, Gregor Cresnar, Those Icons, Smashicons, Good Ware, Pause08, and Madebyoliver from www.flaticon.com. Vector graphics from Pressfoto, brgfx, macrovector and Jannoon028 on freepik.com. Other photos from www.pixabay.com.
  56. I/O Lower Bound

    Recall: $\mathcal{B} = \sum_{i=1}^{q+1} \Delta_i \log_2 \frac{N}{\Delta_i}$ with $\Delta_i = r_i - r_{i-1}$ ($1 \le i \le q+1$, $r_0 = 0$, $r_{q+1} = N+1$), and
    $\mathcal{B}_{I/O} = \sum_{i=1}^{q+1} \frac{\Delta_i}{B} \log_{M/B} \frac{N}{\Delta_i} = \frac{\mathcal{B}}{B \log_2(M/B)} \ll \frac{\mathcal{B}}{B}$ (usually), where $\mathcal{B}$ is the gap entropy and B the block size.

    Theorem (Lower bound): External-memory multiple selection requires, in expectation, $\Omega(\mathcal{B}_{I/O}) - O\bigl(\frac{N}{B} \log_{M/B} B\bigr)$ I/Os.

    Follows from a general reduction (comparisons bound ⇝ I/Os bound): Arge, Knudsen, Larsen: A general lower bound on the I/O-complexity of comparison-based algorithms, WADS 1993.
    The “finish-sorting” argument is no longer rigorous.
  57. Recap: (Lazy) Funnelsort

    Funnelsort is a $k = \sqrt[4]{N}$-way Mergesort (outer recursion), each merge realized by recursive binary merging (inner recursion) with judiciously sized buffers for intermediate results (funnel).

    Simplifying assumptions: $N = 2^{2^i}$ and d = 4 (i.e., $\varepsilon \ge 2/3$ in the tall cache assumption); d > 2 controls the fanout (≈ $N^{1/d}$-way merging).

    Figure: a k-merger is built from a √k-merger on top of √k-mergers, with $k^2$ per buffer; the input arrays feed the bottom level, the output array is at the root.

    Recursive structure (cf. van Emde Boas trees), largest buffers in the middle layer ⇝ overall space $O(k^{5/2}) = O(N^{5/8})$.
    Merge = fill the output buffer; when an input buffer is empty, recursively fill it.
    ≈ I/O-optimal cache-oblivious gadget for $\hat{k}$-way merging with $\hat{k} \approx M^{0.3}$
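The inner recursion, a k-way merge realized by a tree of binary mergers that pull elements on demand, can be sketched with Python generators. This shows only the lazy pulling; the sized buffers and the memory layout of a real funnel are omitted, and the names `binary_merge` and `merge_tree` are hypothetical.

```python
def binary_merge(a, b):
    """Lazily merge two sorted iterables into one sorted stream."""
    a, b = iter(a), iter(b)
    done = object()                       # sentinel marking an exhausted input
    x, y = next(a, done), next(b, done)
    while x is not done and y is not done:
        if y < x:
            yield y
            y = next(b, done)
        else:
            yield x
            x = next(a, done)
    while x is not done:                  # drain whatever remains on the left
        yield x
        x = next(a, done)
    while y is not done:                  # drain whatever remains on the right
        yield y
        y = next(b, done)


def merge_tree(runs):
    """k-way merge of sorted runs as a balanced tree of binary mergers."""
    if not runs:
        return iter(())
    if len(runs) == 1:
        return iter(runs[0])
    mid = len(runs) // 2
    return binary_merge(merge_tree(runs[:mid]), merge_tree(runs[mid:]))


print(list(merge_tree([[1, 4], [2, 3], [0, 9], [5]])))
```

Each output element is pulled through about log2(k) binary mergers one at a time; the funnel replaces this element-wise pulling by refilling whole buffers, which is what makes the same tree I/O-efficient.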
  58. Tall Caches

    Figure: CPU with a “tall” cache of size M; external memory of unbounded size.

    Tall Cache Assumption: $M \ge B^{1+\varepsilon}$ (think: $\varepsilon = 1$) ⇝ $M/B \ge B^{\varepsilon}$ ⇝ the cache fits many cache lines. Necessary for the existence of I/O-optimal (comparison-based) cache-oblivious sorting algorithms (Brodal, Fagerberg: On the limits of cache-obliviousness, STOC 2003).