
Funnelselect - Cache-Oblivious Multiple Selection

Slides for the talk at ESA 2023 in Amsterdam; paper and further details at https://www.wild-inter.net/publications/brodal-wild-2023

Sebastian Wild

September 13, 2023

Transcript

  1. Funnelselect: Cache-Oblivious Multiple Selection
    Sebastian Wild
    joint work with Gerth Stølting Brodal
    European Symposium on Algorithms 2023
    CWI Amsterdam
    Sebastian Wild Funnelselect: Cache-Oblivious Multiple Selection 2023-09-05 0 / 12


  2. Outline
    1 Multiple Selection
    2 Cache Oblivious Algorithms
    3 Funnelselect
    4 Conclusion

  3. 1 Multiple Selection

  10. “What’s this about” in 2min
    Two ancient comparison-based problems on an unordered list of N elements:
    Sorting
      $O(N \log_2 N)$ comparisons, time
      $O\left(\frac{N}{B} \log_{M/B} \frac{N}{B}\right)$ I/Os
      internal: Mergesort
      IO: Multiway Mergesort
      CO: Funnelsort
    (Single) Selection by Rank, e.g. find median
      $O(N)$ comparisons, time
      $O(N/B)$ I/Os
      internal: Median-of-medians algorithm
      IO: Median-of-medians algorithm
      CO: Median-of-medians algorithm
    Multiple Selection sits between the two (towards sorting: more sorted, more expensive)
    This talk: What happens between selection and sorting in the cache-oblivious model?

  15. The Multiple Selection Problem
    Goal: find the q elements of ranks $r_1 < r_2 < \cdots < r_q$ from unsorted elements $x_1, \ldots, x_N$ (here: all distinct)
    ⇝ Report sought elements $x_{(r_1)}, \ldots, x_{(r_q)}$ in sorted order
    Example:
      i:    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
      x_i: 67 30 45 33 15 99 26 90 55  9 96 45 95 31  3
      sorted (rank: value): 1: 3, 2: 9, 3: 15, 4: 26, 5: 30, 6: 31, 7: 33, 8: 45, 9: 45, 10: 55, 11: 67, 12: 90, 13: 95, 14: 96, 15: 99
      Query ranks: $r_1 = 1$, $r_2 = 2$, $r_3 = 3$, $r_4 = 8$
      Answer: 3, 9, 15, 45
    Simple algorithms:
    1 q calls to a selection algorithm (quickselect, median of medians) ⇝ $O(Nq)$
    2 Divide & conquer 1:
      (Single) select the $r_{\lceil q/2 \rceil}$-th smallest and partition $x_1, \ldots, x_N$ around it.
      Recursively select $r_1, \ldots, r_{\lceil q/2 \rceil - 1}$ and $r_{\lceil q/2 \rceil + 1}, \ldots, r_q$ ⇝ $O(N \lg q)$
    3 Divide & conquer 2:
      Find the median of $x_1, \ldots, x_N$ and split the query ranks.
      Recurse where the subproblem contains query ranks. ⇝ $O(N \lg q)$
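    The "Divide & conquer 1" strategy from this slide can be sketched in a few lines of Python. This is only an illustrative in-memory version under stated assumptions: the helper names `select` and `multiselect` are made up here, single selection is done with a randomized quickselect (expected linear time) rather than median of medians, and elements are assumed distinct as on the slide.

```python
import random


def select(xs, r):
    """Quickselect: return the r-th smallest (1-based) element of xs."""
    pivot = random.choice(xs)
    smaller = [x for x in xs if x < pivot]
    equal = [x for x in xs if x == pivot]
    if r <= len(smaller):
        return select(smaller, r)
    if r <= len(smaller) + len(equal):
        return pivot
    larger = [x for x in xs if x > pivot]
    return select(larger, r - len(smaller) - len(equal))


def multiselect(xs, ranks):
    """Return the elements of the given sorted 1-based ranks, in sorted order.

    Hypothetical helper; assumes the elements of xs are distinct.
    """
    if not ranks:
        return []
    mid = (len(ranks) - 1) // 2        # 0-based index of r_{ceil(q/2)}
    r = ranks[mid]
    pivot = select(xs, r)              # single selection of the middle rank
    smaller = [x for x in xs if x < pivot]
    larger = [x for x in xs if x > pivot]
    # Ranks left of the middle stay as they are; ranks right of it shift by r.
    left = multiselect(smaller, ranks[:mid])
    right = multiselect(larger, [s - r for s in ranks[mid + 1:]])
    return left + [pivot] + right
```

    On the slide's example, `multiselect([67, 30, 45, 33, 15, 99, 26, 90, 55, 9, 96, 45, 95, 31, 3], [1, 2, 3, 8])` returns `[3, 9, 15, 45]`, matching the stated answer.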

  18. Fine-grained bounds
    “Gap Entropy”:
    $\mathcal{B} = \sum_{i=1}^{q+1} \Delta_i \log_2 \frac{N}{\Delta_i}$ with $\Delta_i = r_i - r_{i-1}$ ($1 \le i \le q+1$, $r_0 = 0$ and $r_{q+1} = N + 1$)
    Example ($N = 15$): $r_1 = 1$, $r_2 = 2$, $r_3 = 3$, $r_4 = 8$
      gaps: $\Delta_1 = 1$, $\Delta_2 = 1$, $\Delta_3 = 1$, $\Delta_4 = 5$, $\Delta_5 = 8$
      ⇝ $\mathcal{B} \approx 26.9$
    lower bound of $\mathcal{B} - O(N)$ comparisons
      via “finish sorting” argument
    upper bound of $\mathcal{B} + o(\mathcal{B}) + O(N)$ comparisons
      Kaligosi, Mehlhorn, Munro & Sanders: Towards Optimal Multiple Selection, ICALP 2005
      follows from nontrivial analysis;
      much easier: repeated median selection & recursion gets $O(\mathcal{B} + N)$ (→ paper)
    ⇝ In internal memory, the multiple-selection problem is solved ✓
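    The gap entropy of the running example can be checked numerically. The following is a small sketch; the function name `gap_entropy` is chosen here for illustration, and it simply evaluates the sum from the slide with $r_0 = 0$ and $r_{q+1} = N + 1$.

```python
import math


def gap_entropy(n, ranks):
    """Gap entropy of sorted 1-based query ranks among n elements."""
    boundaries = [0] + list(ranks) + [n + 1]   # r_0 = 0, r_{q+1} = N + 1
    gaps = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]
    return sum(d * math.log2(n / d) for d in gaps)


# Slide example: N = 15, ranks 1, 2, 3, 8 give gaps 1, 1, 1, 5, 8.
print(round(gap_entropy(15, [1, 2, 3, 8]), 1))   # 26.9
```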

  20. 2 Cache Oblivious Algorithms

  24. IO Model & CO Model
    Memory hierarchy (figure): CPU, cache of size M (fast random access), external memory of unbounded size (slow access, only in blocks of B cells)
    IO Model (External Memory Model):
      Cost of computation = #“I/Os” = #blocks transferred between cache & external memory
    Cache Oblivious (CO) Model:
      algorithm can’t use M and B (I/Os automatic, OPT paging policy)
      ⇝ CO algorithm works for all M and B, even hierarchies!
      Frigo, Leiserson, Prokop, Ramachandran: Cache-oblivious algorithms, TALG 2012
    Tall Cache Assumption: $M \ge B^{1+\varepsilon}$ (think: $\varepsilon = 1$)
      ⇝ cache fits many cache lines
      necessary for existence of I/O-optimal (comparison-based) cache-oblivious sorting algorithms
      Brodal, Fagerberg: On the limits of cache-obliviousness, STOC 2003

  28. Previous Work & Our Results
    (wc = worst case, E = expected)
    Reference             Comparisons                 I/Os                                    Comments
    Single selection
    Blum et al. 1973      wc $5.4305N$                $O(N/B)$                                CO, deterministic
    Multiple selection
    Dobkin & Munro 1981   wc $3\mathcal{B} + O(N)$    $O((\mathcal{B} + N)/B)$                CO, deterministic
    Hu et al. 2014        wc $O(\mathcal{B} + N)$     $O(\mathcal{B}_{I/O} + N/B)$            deterministic
    Barbay et al. 2016    wc $O(\mathcal{B} + N)$     $O(\mathcal{B}_{I/O} + N/B)$            online, determ., $M \ge B^{1+\varepsilon}$
    Funnelselect          E $O(\mathcal{B} + N)$      $O(\mathcal{B}_{I/O} + N/B)$            CO, randomized, $M \ge B^{1+\varepsilon}$
    Lower bound           $\mathcal{B} - O(N)$        $\Omega(\mathcal{B}_{I/O}) - O(N/B)$    $M \ge B^{1+\varepsilon}$
    $\mathcal{B} = \sum_{i=1}^{q+1} \Delta_i \log_2 \frac{N}{\Delta_i}$
    $\mathcal{B}_{I/O} = \sum_{i=1}^{q+1} \frac{\Delta_i}{B} \log_{M/B} \frac{N}{\Delta_i} = \frac{\mathcal{B}}{B \log_2(M/B)}$, which is (usually) $\ll \mathcal{B}/B$
    Funnelselect is the first cache-oblivious I/O-optimal multiple-selection algorithm.
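    The identity between the two forms of the I/O gap entropy can be verified numerically. The sketch below uses illustrative names (`gaps`, `gap_entropy`, `io_gap_entropy`) and made-up cache parameters M = 2^20 and B = 2^6, which satisfy the tall-cache assumption with ε = 1.

```python
import math


def gaps(n, ranks):
    """Gaps Delta_i between consecutive query ranks, with r_0 = 0, r_{q+1} = n + 1."""
    boundaries = [0] + list(ranks) + [n + 1]
    return [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]


def gap_entropy(n, ranks):
    """Comparison-model gap entropy: sum of Delta_i * log2(N / Delta_i)."""
    return sum(d * math.log2(n / d) for d in gaps(n, ranks))


def io_gap_entropy(n, ranks, m, b):
    """I/O gap entropy: sum of (Delta_i / B) * log_{M/B}(N / Delta_i)."""
    return sum(d / b * math.log(n / d, m / b) for d in gaps(n, ranks))


# Hypothetical parameters: cache size M = 2**20 cells, block size B = 2**6 cells.
n, ranks, m, b = 10**6, [10, 500_000, 999_000], 2**20, 2**6
direct = io_gap_entropy(n, ranks, m, b)
via_identity = gap_entropy(n, ranks) / (b * math.log2(m / b))
print(math.isclose(direct, via_identity))   # True
```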

  32. Our technical challenge
    1 Key algorithmic change from internal ⇝ external memory (for sorting and multiple selection):
      binary partitioning ⇝ ≈ $M/B$-way partitioning
      can’t do that cache-obliviously
    2 Cache-oblivious sorting uses $N^{\delta}$-way partitioning by “funnels”
      ⇝ first partitioning round already has sorting complexity ⇝ $\omega(\mathcal{B}_{I/O})$ I/Os
      can’t use that for multiple selection

  34. 3 Funnelselect

  42. Funnels for Partitioning
    Can use funnels from Funnelsort in reverse!
    Funnel: $k = \sqrt[4]{N}$-way merger (recursive binary merges)
      judiciously sized buffers for intermediate results
    Simplifying assumptions: $N = 2^{2^i}$ and $d = 4$
    Partitioning funnel (figure): input array at the root, output arrays below the leaves; pivot tree by level:
      P8
      P4, P12
      P2, P6, P10, P14
      P1, P3, P5, P7, P9, P11, P13, P15
    Funnel recursion (van Emde Boas): a k-partitioner is built from $\sqrt{k}$-partitioners, with $k^2$ per buffer
    Funnel buffer sizes (largest in middle layer)
      ⇝ overall space $O(k^{5/2}) = O(N^{5/8})$
    Nodes partition around pivot value P
    Partition = push input down
      when output buffer full, recursively push
      need final flush of all buffers
    ≈ I/O-optimal cache-oblivious gadget for $\hat{k}$-way partitioning with $\hat{k} \approx M^{0.3}$
    ⇝ Can be used for expected I/O-optimal CO quicksort (but better options available)
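    The push-down mechanics of such a partitioner (partition around a pivot, push to a child, empty a buffer when it fills up, flush everything at the end) can be illustrated with a toy in-memory model. This is only a sketch under loud assumptions: the class `PartitionNode`, the function `funnel_partition`, and the single uniform `capacity` are invented for illustration; it uses neither the van Emde Boas memory layout nor the carefully chosen buffer sizes from the talk, so it says nothing about I/O cost.

```python
class PartitionNode:
    """Binary partitioning node with a bounded buffer (toy model, not the real layout)."""

    def __init__(self, pivots, lo, hi, capacity, buckets):
        self.capacity = capacity
        self.buffer = []
        self.buckets = buckets
        self.lo = lo
        if lo == hi:                      # leaf: feeds output bucket number lo
            self.pivot = None
        else:
            mid = (lo + hi) // 2
            self.pivot = pivots[mid]      # this node partitions around pivots[mid]
            self.left = PartitionNode(pivots, lo, mid, capacity, buckets)
            self.right = PartitionNode(pivots, mid + 1, hi, capacity, buckets)

    def push(self, x):
        self.buffer.append(x)
        if len(self.buffer) >= self.capacity:   # buffer full: push its content down
            self.empty_buffer()

    def empty_buffer(self):
        items, self.buffer = self.buffer, []
        for x in items:
            if self.pivot is None:
                self.buckets[self.lo].append(x)
            elif x < self.pivot:
                self.left.push(x)
            else:
                self.right.push(x)

    def flush(self):
        """Final flush: empty every buffer from the root down to the leaves."""
        self.empty_buffer()
        if self.pivot is not None:
            self.left.flush()
            self.right.flush()


def funnel_partition(xs, pivots, capacity=4):
    """Distribute xs into len(pivots) + 1 buckets ordered by the sorted pivots."""
    pivots = sorted(pivots)
    buckets = [[] for _ in range(len(pivots) + 1)]
    root = PartitionNode(pivots, 0, len(pivots), capacity, buckets)
    for x in xs:
        root.push(x)
    root.flush()
    return buckets
```

    For example, `funnel_partition(range(16, 0, -1), [5, 9, 13], capacity=3)` yields four buckets holding the values 1 to 4, 5 to 8, 9 to 12 and 13 to 16 (in some order within each bucket); elements equal to a pivot go to the right.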

  47. Early Truncation
    Recall: for multiple selection (in general), can’t use full k-partitioner (already sorting complexity)
    Early truncation: Don’t split buckets that don’t contain any query ranks!
    Truncated pivot tree (figure), by level:
      P8
      P4, P12
      P2, P6, P10
      P1, P3, P7, P9
    Buckets depend on (random) pivots
    ⇝ Only know a query’s bucket after partitioning ...
    Assume bucket boundaries are close to their expected location, within safety margin $\pm\xi = N^{1/2+\delta}$.
    ⇝ If bucket is expected query free, don’t split further.
    “Expected query free” is known up front (depends only on N & query ranks)
    ⇝ Can remove unwanted partitioning nodes from funnel in preprocessing.

  48. Pivot Sampling
    Recall this? Assume bucket boundaries lie within a safety margin $\pm\xi = N^{1/2+\delta}$
    of their expected location.
    We have to make that true by choosing pivots well.
    The following randomized choice works with high probability
    (by standard Chernoff bound arguments):
    1 Include each element in the sample $\bar{S}$ with prob. $p = 1/\log_2(N)$.
    2 Sort the sample $\bar{S}$.
    3 Pick pivot $P_i$ as the ≈ $(ipN/k)$-th smallest element of $\bar{S}$ ($i = 1, \dots, k-1$)
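The three sampling steps can be sketched as follows. This is an in-memory illustration only: the function name, the rng parameter and the index clamping are hypothetical, and reading the sampling probability as 1/log2(N) is an assumption about the slide's formula.

```python
import math
import random

def sample_pivots(items, k, rng=random):
    """Pick k-1 pivots by Bernoulli sampling (steps 1 to 3 above).

    Assumption: p = 1/log2(N) is our reading of the slide; needs N >= 2.
    """
    n = len(items)
    p = 1.0 / math.log2(n)
    # Steps 1 and 2: include each element with probability p, then sort.
    sample = sorted(x for x in items if rng.random() < p)
    if len(sample) < k - 1:
        raise ValueError("sample too small, caller should retry")
    pivots = []
    for i in range(1, k):
        # Step 3: roughly the (i*p*N/k)-th smallest sample element,
        # clamped so the index always stays inside the sample.
        idx = min(len(sample) - 1, max(0, round(i * p * n / k) - 1))
        pivots.append(sample[idx])
    return pivots
```

Since the sample has expected size pN, the chosen positions split it into k roughly equal parts, which is what places the pivots near the ranks N/k, 2N/k, and so on in the full input.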

  50. Overall Algorithm
    Funnelselect:
    1 Sample pivots $P_1, \dots, P_{k-1}$
    2 Build k-partitioner using $P_1, \dots, P_{k-1}$
    3 Mark expected query free buckets &
    rewire their parent’s buffer to output
    4 For each bucket with queries:
    i Sort the bucket (Funnelsort)
    ii Report sought elements
    Observations:
    No (top-level) recursion needed
    Algorithm can fail at several places
    (sample too small, sample too large, pivots too skewed,
    query in expected query free bucket) ⇝ restart
    but: with high probability, no failures
    ⇝ no effect on expected running time
    Can be augmented to produce the input partitioned around the
    sought elements in contiguous external memory
    (default: only return sought elements in order)
    Can be made to handle equal elements
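The control flow of steps 1 to 4, including the restart on failure, can be sketched in plain Python. This is an in-memory analogue under loud assumptions: it is not cache-oblivious, sorted() stands in for Funnelsort, a binary search over the pivots stands in for the k-partitioner, elements are assumed distinct, the sampling probability is read as 1/log2(N), and the function name and parameters are hypothetical. Of the failure cases listed above, it checks only "sample too small" and "query in expected query free bucket".

```python
import bisect
import math
import random

def multiselect_sketch(items, ranks, k=8, delta=0.1, rng=random):
    """Return the elements of the given 1-based ranks (ranks sorted ascending)."""
    n = len(items)
    p = 1.0 / math.log2(n)            # assumed reading of the sampling slide
    xi = n ** (0.5 + delta)           # safety margin
    # Step 3, decided up front: keep bucket j only if a query rank lies
    # within xi of its expected range (j*n/k, (j+1)*n/k].
    keep = [any(j * n / k - xi < r <= (j + 1) * n / k + xi for r in ranks)
            for j in range(k)]
    while True:                       # restart on any failure
        # Step 1: sample, sort the sample, pick k-1 pivots.
        sample = sorted(x for x in items if rng.random() < p)
        if len(sample) < k:
            continue                  # failure: sample too small
        pivots = [sample[min(len(sample) - 1,
                             max(0, round(i * p * n / k) - 1))]
                  for i in range(1, k)]
        # Step 2: partition; truncated buckets are only counted, not stored.
        sizes = [0] * k
        buckets = [[] for _ in range(k)]
        for x in items:
            j = bisect.bisect_left(pivots, x)
            sizes[j] += 1
            if keep[j]:
                buckets[j].append(x)
        # Step 4: sort the buckets that contain queries and report answers.
        result, start, ok = [], 0, True
        for j in range(k):
            hits = [r for r in ranks if start < r <= start + sizes[j]]
            if hits and not keep[j]:
                ok = False            # failure: query in a truncated bucket
                break
            if hits:
                srt = sorted(buckets[j])
                result.extend(srt[r - start - 1] for r in hits)
            start += sizes[j]
        if ok:
            return result
```

For small inputs or query ranks close to the edge of the safety margin, a restart is not rare in this sketch; the high-probability guarantee stated above is an asymptotic one.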

  52. Outline
    1 Multiple Selection
    2 Cache Oblivious Algorithms
    3 Funnelselect
    4 Conclusion

  53. 4 Conclusion

  54. Conclusion
    We presented Funnelselect,
    the first I/O-optimal, cache-oblivious multiple-selection algorithm.
    The presented algorithm is inherently randomized,
    but we have since found a deterministic method (too late for the proceedings; stay tuned).
    Open Problems:
    1 Funnelselect assumes a tall cache.
    This cannot be avoided, since I/O-optimal cache-oblivious sorting (⊂ multiple selection) requires it.
    But single selection doesn’t! ⇝ What happens in between?
    2 Can online multiple selection be solved I/O-optimally and cache-obliviously?

  57. We’re hiring! for
    Computing over compressed graph-structured data
    3-year postdoc
    PhD student
    Liverpool sounds cool? → Talk to me!

  58. Icons made by Freepik, Gregor Cresnar, Those Icons, Smashicons, Good Ware, Pause08, and Madebyoliver from www.flaticon.com.
    Vector graphics from Pressfoto, brgfx, macrovector and Jannoon028 on freepik.com
    Other photos from www.pixabay.com.

  59. I/O Lower Bound
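    Recall (with $\mathcal{B}$ the comparison bound and $B$ the block size):
    $\mathcal{B} = \sum_{i=1}^{q+1} \Delta_i \log_2 \frac{N}{\Delta_i}$
    with $\Delta_i = r_i - r_{i-1}$ ($1 \le i \le q+1$, $r_0 = 0$, $r_{q+1} = N+1$)
    $\mathcal{B}_{\mathrm{I/O}} = \sum_{i=1}^{q+1} \frac{\Delta_i}{B} \log_{M/B} \frac{N}{\Delta_i} = \frac{\mathcal{B}}{B \log_2(M/B)}$
    [relation symbol lost in the transcript] (usually) $\mathcal{B}/B$
    Theorem (Lower bound)
    External-memory multiple selection requires, in expectation,
    $\Omega(\mathcal{B}_{\mathrm{I/O}}) - O\!\left(\frac{N}{B} \log_{M/B} B\right)$ I/Os.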
    Follows from a general reduction (comparison lower bound ⇝ I/O lower bound):
    Arge, Knudsen, Larsen: A general lower bound on the I/O-complexity of comparison-based algorithms, WADS 1993
    “finish-sorting” argument no longer rigorous
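To get a feel for the two quantities, the sums can be evaluated directly. The sketch below uses hypothetical function and parameter names (m for the cache size M, b for the block size B) and simply transcribes the definitions of the comparison bound and the I/O bound from this slide.

```python
import math

def comparison_bound(n, ranks):
    """Sum of Delta_i * log2(N / Delta_i) over the q+1 gaps between query ranks."""
    bounds = [0] + sorted(ranks) + [n + 1]        # r_0 = 0, r_{q+1} = N + 1
    gaps = [hi - lo for lo, hi in zip(bounds, bounds[1:])]
    return sum(d * math.log2(n / d) for d in gaps)

def io_bound(n, ranks, m, b):
    """Comparison bound divided by b * log2(m / b), the closed form above."""
    return comparison_bound(n, ranks) / (b * math.log2(m / b))
```

Two sanity checks follow from the formula: a single median query yields two gaps of about N/2 and hence a comparison bound of about N, while querying every rank yields N+1 gaps of size 1 and hence about N log2 N, the sorting bound.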

  60. Recap: (Lazy) Funnelsort
    Funnelsort
    is a $k = \sqrt[4]{N}$-way Mergesort (outer recursion)
    each realized by recursive binary merging (inner recursion)
    with judiciously sized buffers for intermediate results (funnel)
    Simplifying assumptions:
    $N = 2^{2^i}$ and $d = 4$
    (i.e., $\varepsilon \ge 2/3$ in tall cache assumption)
    $d > 2$ controls fanout (≈ $N^{1/d}$-way merging)
    [Figure: recursive layout of a k-merger built from √k-mergers: input arrays, buffers ($k^2$ per buffer), output array]
    Recursive structure (cf. van Emde Boas trees)
    largest buffers in middle layer
    ⇝ overall space $O(k^{5/2}) = O(N^{5/8})$
    Merge = fill output buffer
    when input buffer empty, recursively fill it
    ≈ I/O-optimal cache-oblivious gadget
    for $\hat{k}$-way merging with $\hat{k} \approx M^{0.3}$
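The space bound can be checked with a short calculation. This is a sketch under an assumption that the slide does not spell out: a k-merger consists of √k + 1 sub-mergers of order √k plus √k middle buffers of size k² each, as in the usual lazy funnelsort layout.

```latex
\begin{align*}
S(k) &= \sqrt{k}\cdot k^{2} + \left(\sqrt{k}+1\right) S\!\left(\sqrt{k}\right) \\
     &= k^{5/2} + O\!\left(\sqrt{k}\cdot k^{5/4}\right) = O\!\left(k^{5/2}\right), \\
k = N^{1/4} \;&\Longrightarrow\; k^{5/2} = N^{5/8}.
\end{align*}
```

The middle-layer buffers dominate: the recursive sub-mergers contribute only a lower-order term, so the whole funnel occupies space sublinear in N.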

  61. Tall Caches
    [Figure: memory hierarchy with CPU, a tall cache of size M, and external memory of unbounded size]
    Tall Cache Assumption: $M \ge B^{1+\varepsilon}$ (think: $\varepsilon = 1$)
    ⇝ $M/B \ge B^{\varepsilon}$
    ⇝ cache fits many cache lines
    necessary for existence of I/O-optimal
    (comparison-based) cache-oblivious sorting algorithms
    Brodal, Fagerberg: On the limits of cache-obliviousness, STOC 2003
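As a quick numeric illustration of the assumption (the block size below is a made-up value, not taken from the talk):

```latex
\begin{align*}
\varepsilon = 1,\; B = 64 \text{ elements}
  \;&\Longrightarrow\; M \ge B^{1+\varepsilon} = 64^{2} = 4096 \text{ elements}, \\
  &\Longrightarrow\; M/B \ge B^{\varepsilon} = 64 \text{ cache lines}.
\end{align*}
```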