Sebastian Wild, Funnelselect: Cache-Oblivious Multiple Selection, 2023-09-05

Input: an unordered list of $N$ elements.

Sorting: $O(N \log_2 N)$ comparisons and time, $O\bigl(\frac{N}{B} \log_{M/B} \frac{N}{B}\bigr)$ I/Os
- internal: Mergesort
- IO: Multiway Mergesort
- CO: Funnelsort

(Single) Selection by rank, e.g. find the median: $O(N)$ comparisons and time, $O(N/B)$ I/Os
- internal: Median-of-medians algorithm
- IO: Median-of-medians algorithm
- CO: Median-of-medians algorithm

Multiple Selection lies between the two: the more sorted the output, the more expensive.

This talk: What happens between selection and sorting in the cache-oblivious model?
IO Model (External Memory Model):
- Cache of size $M$; external memory of unbounded size.
- Access to external memory only in blocks of $B$ cells.
- Cost of computation = number of "I/Os" = number of blocks transferred between cache and external memory.

Cache-Oblivious (CO) Model:
- The algorithm can't use $M$ and $B$ (I/Os happen automatically, under the OPT paging policy).
- ⇝ A CO algorithm works for all $M$ and $B$, even on memory hierarchies!
  Frigo, Leiserson, Prokop, Ramachandran: Cache-oblivious algorithms, TALG 2012

Tall Cache Assumption: $M \geq B^{1+\varepsilon}$ (think: $\varepsilon = 1$) ⇝ the cache fits many cache lines.
- Necessary for the existence of I/O-optimal (comparison-based) cache-oblivious sorting algorithms.
  Brodal, Fagerberg: On the limits of cache-obliviousness, STOC 2003
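Problem | Algorithm | Comparisons | I/Os | Notes
Single selection | Blum et al. 1973 | worst case $5.4305N$ | $O(N/B)$ | CO, deterministic
Multiple selection | Dobkin & Munro 1981 | worst case $3\mathcal{B} + O(N)$ | $O((\mathcal{B} + N)/B)$ | CO, deterministic
Multiple selection | Hu et al. 2014 | worst case $O(\mathcal{B} + N)$ | $O(\mathcal{B}_{I/O} + N/B)$ | deterministic
Multiple selection | Barbay et al. 2016 | worst case $O(\mathcal{B} + N)$ | $O(\mathcal{B}_{I/O} + N/B)$ | online, deterministic, $M \geq B^{1+\varepsilon}$
Multiple selection | Funnelselect | expected $O(\mathcal{B} + N)$ | $O(\mathcal{B}_{I/O} + N/B)$ | CO, randomized, $M \geq B^{1+\varepsilon}$
Multiple selection | Lower bound | $\mathcal{B} - O(N)$ | $\Omega(\mathcal{B}_{I/O}) - O(N/B)$ | $M \geq B^{1+\varepsilon}$

$\mathcal{B} = \sum_{i=1}^{q+1} \Delta_i \log_2 \frac{N}{\Delta_i}$

$\mathcal{B}_{I/O} = \sum_{i=1}^{q+1} \frac{\Delta_i}{B} \log_{M/B} \frac{N}{\Delta_i} = \frac{\mathcal{B}}{B \log_2(M/B)} \ll$ (usually) $\frac{\mathcal{B}}{B}$

Funnelselect is the first cache-oblivious I/O-optimal algorithm for multiple selection.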
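To get a feeling for the two bounds, the following minimal sketch evaluates $\mathcal{B}$ and $\mathcal{B}_{I/O}$ for concrete query ranks. It assumes the gap definition from the backup slide ($\Delta_i = r_i - r_{i-1}$ with $r_0 = 0$ and $r_{q+1} = N + 1$); the function name and parameter names are illustrative and not from the talk.

```python
import math

def multiple_selection_bounds(n, ranks, block_size, cache_size):
    """Evaluate (B, B_io) for distinct 1-based query ranks in 1..n.

    Assumes cache_size > block_size so that log2(M/B) is positive.
    """
    # r_0 = 0 and r_{q+1} = n + 1 frame the sorted query ranks.
    boundaries = [0] + sorted(ranks) + [n + 1]
    gaps = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]
    # B = sum of Delta_i * log2(N / Delta_i)
    comparisons = sum(d * math.log2(n / d) for d in gaps)
    # B_io = B / (B_block * log2(M / B_block))
    ios = comparisons / (block_size * math.log2(cache_size / block_size))
    return comparisons, ios

# Querying every rank is sorting: each gap is 1, so B = (n + 1) * log2(n).
print(multiple_selection_bounds(8, range(1, 9), block_size=4, cache_size=64))
```

With a single median query the sum has only two large gaps and $\mathcal{B}$ stays linear in $N$, which is the selection end of the spectrum.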
1. External-memory algorithms (for sorting and multiple selection) replace binary partitioning by $\approx \frac{M}{B}$-way partitioning.
   ⇝ Can't do that cache obliviously.
2. Cache-oblivious sorting uses $N^{\delta}$-way partitioning by "funnels".
   ⇝ The first partitioning round already has sorting complexity ⇝ $\omega(\mathcal{B}_{I/O})$ I/Os.
   ⇝ Can't use that for multiple selection.
Funnel: $k = \sqrt[4]{N}$-way merger (recursive binary merges) with judiciously sized buffers for intermediate results.
Simplifying assumptions: $N = 2^{2^i}$ and $d = 4$.

[Figure: a $k$-partitioner drawn as a binary tree of pivots $P_1, \dots, P_{15}$ between the input array and the output arrays. Labels: $\sqrt{k}$-partitioner, $\sqrt{k}$-partitioners, $k^2$ per buffer, $k$-partitioner.]

- Funnel recursion (van Emde Boas).
- Funnel buffer sizes (largest in the middle layer) ⇝ overall space $O(k^{5/2}) = O(N^{5/8})$.
- Nodes partition around a pivot value $P$.
- Partition = push input down; when an output buffer is full, recursively push.
- Need a final flush of all buffers.

≈ I/O-optimal cache-oblivious gadget for $\hat{k}$-way partitioning with $\hat{k} \approx M^{0.3}$
⇝ Can be used for expected I/O-optimal CO quicksort (but better options available).
Full $k$-partitioner: already sorting complexity.

Early truncation: Don't split buckets that don't contain any query ranks!

[Figure: pivot tree of the full $k$-partitioner ($P_1, \dots, P_{15}$); after truncation only $P_1, P_2, P_3, P_4, P_6, P_7, P_8, P_9, P_{10}, P_{12}$ remain.]

- Buckets depend on (random) pivots ⇝ we only know a query's bucket after partitioning ...
- Assume bucket boundaries lie within a safety margin $\pm\xi = N^{1/2+\delta}$ of their expected location.
  ⇝ If a bucket is expected query free, don't split further.
- "Expected query free" is known up front (depends only on $N$ and the query ranks).
  ⇝ Can remove unwanted partitioning nodes from the funnel in preprocessing.
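The truncation rule can be sketched as a preprocessing pass over the pivot tree. This is an illustration under stated assumptions, not the talk's implementation: the node covering buckets `a..b-1` is taken to have expected rank range $[aN/k - \xi,\ bN/k + \xi]$, the tree is a balanced binary tree over $k$ buckets, and the function name `kept_pivots` is hypothetical.

```python
def kept_pivots(n, k, ranks, xi, a=0, b=None):
    """Return indices m of the pivots P_m that survive early truncation.

    The current node covers buckets a..b-1 and splits them at pivot P_mid.
    Assumption: its rank range, widened by the safety margin xi, is
    [a*n/k - xi, b*n/k + xi].
    """
    if b is None:
        b = k
    if b - a < 2:
        return []                      # a single bucket has no pivot
    lo, hi = a * n / k - xi, b * n / k + xi
    if not any(lo <= r <= hi for r in ranks):
        return []                      # expected query free: don't split further
    mid = (a + b) // 2
    return (kept_pivots(n, k, ranks, xi, a, mid)
            + [mid]
            + kept_pivots(n, k, ranks, xi, mid, b))

# 16 buckets of expected width 100; four queries, safety margin 10.
print(kept_pivots(1600, 16, [50, 250, 650, 850], xi=10))
# prints [1, 2, 3, 4, 6, 7, 8, 9, 10, 12]
```

The decision uses only $N$, $k$, $\xi$ and the query ranks, so it can run before any element is partitioned. The example ranks were chosen so that the surviving pivots match the truncated tree in the figure.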
Assume bucket boundaries lie within a safety margin $\pm\xi = N^{1/2+\delta}$ of their expected location.
We have to make that true by choosing pivots well.

The following randomized choice works with high probability (by standard Chernoff bound arguments):
1. Include each element in the sample $\bar{S}$ with probability $p = 1/\log_2(N)$.
2. Sort the sample $\bar{S}$.
3. Pick pivot $P_i$ as the $\approx (ipN/k)$th smallest element in $\bar{S}$ ($i = 1, \dots, k-1$).
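The three sampling steps translate almost directly into code. The sketch below works in internal memory and uses Python's built-in sort for the sample; the function name, the `rng` parameter, and the choice to raise an error when the sample is too small are assumptions made for illustration.

```python
import math
import random

def sample_pivots(items, k, rng=random):
    """Pick k - 1 pivots from a Bernoulli sample of items (needs len(items) >= 2)."""
    n = len(items)
    p = 1.0 / math.log2(n)                       # step 1: inclusion probability
    sample = sorted(x for x in items if rng.random() < p)  # steps 1 and 2
    pivots = []
    for i in range(1, k):
        rank = max(1, round(i * p * n / k))      # step 3: target rank in the sample
        if rank > len(sample):
            # One of the failure cases; a caller would restart.
            raise RuntimeError("sample too small, restart")
        pivots.append(sample[rank - 1])
    return pivots

rng = random.Random(2023)
data = list(range(1 << 16))
rng.shuffle(data)
print(sample_pivots(data, 8, rng))  # seven values near multiples of 8192
```

Because the expected sample size is $pN$, the element of rank $ipN/k$ in the sample is an estimate of the element of rank $iN/k$ in the input.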
1. [...] $P_1, \dots, P_{k-1}$
2. Build a $k$-partitioner using $P_1, \dots, P_{k-1}$.
3. Mark expected query free buckets and rewire their parent's buffer to the output.
4. For each bucket with queries:
   i. Sort the bucket (Funnelsort).
   ii. Report the sought elements.

Observations:
- No (top-level) recursion needed.
- The algorithm can fail at several places (sample too small, sample too large, pivots too skewed, query in an expected query free bucket) ⇝ restart. But: with high probability, no fails ⇝ no effect on expected running time.
- Can be augmented to produce the input partitioned around the sought elements in contiguous external memory (default: only return the sought elements in order).
- Can be made to handle equal elements.
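The overall control flow of steps 2 to 4 can be illustrated in internal memory. This sketch deliberately swaps in simpler techniques: binary search over the sorted pivots replaces the funnel, Python's built-in sort replaces Funnelsort, and buckets without queries are detected from the actual bucket sizes after partitioning rather than marked as expected query free up front. The function name is hypothetical.

```python
import bisect

def select_with_pivots(items, ranks, pivots):
    """Return {rank: element} for 1-based query ranks, given sorted pivots."""
    # Step 2 (simplified): k-way partition by binary search over the pivots.
    buckets = [[] for _ in range(len(pivots) + 1)]
    for x in items:
        buckets[bisect.bisect_left(pivots, x)].append(x)

    result = {}
    offset = 0                          # number of elements in earlier buckets
    for bucket in buckets:
        wanted = [r for r in ranks if offset < r <= offset + len(bucket)]
        if wanted:                      # steps 3 and 4: skip buckets without queries
            bucket.sort()               # 4.i: sort only buckets that hold a query
            for r in wanted:
                result[r] = bucket[r - offset - 1]   # 4.ii: report
        offset += len(bucket)
    return result

print(select_with_pivots([9, 1, 8, 2, 7, 3, 6, 4, 5, 10], [1, 5, 10], [3, 7]))
# prints {1: 1, 5: 5, 10: 10}
```

Only buckets that contain a query rank are ever sorted, which is where the savings over full sorting come from when the queries are few or clustered.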
The presented algorithm is inherently randomized, but we have since found a deterministic method! (Too late for the proceedings ... but stay tuned.)

Open Problems:
1. Funnelselect assumes a tall cache. This cannot be avoided, since I/O-optimal CO sorting (a special case of multiple selection) requires it. But single selection doesn't! ⇝ What happens in between?
2. Can online multiple selection be solved I/O-optimally and cache-obliviously?
Ware, Pause08, and Madebyoliver from www.flaticon.com. Vector graphics from Pressfoto, brgfx, macrovector and Jannoon028 on freepik.com. Other photos from www.pixabay.com.
$\mathcal{B} = \sum_{i=1}^{q+1} \Delta_i \log_2 \frac{N}{\Delta_i}$ with $\Delta_i = r_i - r_{i-1}$ ($1 \leq i \leq q+1$, $r_0 = 0$, $r_{q+1} = N+1$)

$\mathcal{B}_{I/O} = \sum_{i=1}^{q+1} \frac{\Delta_i}{B} \log_{M/B} \frac{N}{\Delta_i} = \frac{\mathcal{B}}{B \log_2(M/B)} \ll$ (usually) $\frac{\mathcal{B}}{B}$

Theorem (Lower bound): External-memory multiple selection requires, in expectation, $\Omega(\mathcal{B}_{I/O}) - O\bigl(\frac{N}{B} \log_{M/B} B\bigr)$ I/Os.

- Follows from a general reduction (comparison bound ⇝ I/O bound).
  Arge, Knudsen, Larsen: A general lower bound on the I/O-complexity of comparison-based algorithms, WADS 1993
- The "finish-sorting" argument is no longer rigorous.