Schedule Recipes

In this talk, we’ll explore a collection of minimal schedulers built using the Linux kernel’s sched_ext framework.

Some are practical, others (intentionally) less so, but all highlight the expressive power and creativity that sched_ext can bring. Like a good set of recipes, these examples are meant to inspire experimentation, lower the barrier to kernel scheduler development, and help you cook up your own ideas, whether for performance tuning, research, or just for fun.

Andrea RIGHI

Kernel Recipes

September 28, 2025

Transcript

  1. What is a scheduler?
     • Kernel component that determines
       ◦ Where each task needs to run
       ◦ When each task needs to run
       ◦ For how long each task needs to run
     • Scheduler
       ◦ CPU allocator (space)
       ◦ Task allocator (time)
  2. sched_ext: extensible scheduler class
     • Technology in the Linux kernel that allows scheduling policies to be implemented as BPF programs (GPLv2)
     • Available since Linux v6.12
     • Key features
       ◦ Bespoke scheduling policies
       ◦ Rapid experimentation
       ◦ Safety (can’t crash the kernel)
  3. BPF + user-space schedulers
     [Diagram: layers labeled Kernel, BPF, and User space; components labeled sched_ext core, BPF scheduler, sched_ext callbacks, libbpf, libbpf-rs, and User-space scheduler]
  4. Topology complexity
     • NUMA node 0
       ◦ L3 cache: Core 0 (big) [CPU 0, CPU 1], Core 1 (big) [CPU 0, CPU 1]
       ◦ L3 cache: Core 2 (LITTLE) [CPU 0, CPU 1], Core 3 (LITTLE) [CPU 0, CPU 1]
     • NUMA node 1
       ◦ L3 cache: Core 0 (big) [CPU 0, CPU 1], Core 1 (big) [CPU 0, CPU 1]
       ◦ L3 cache: Core 2 (LITTLE) [CPU 0, CPU 1], Core 3 (LITTLE) [CPU 0, CPU 1]
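
The topology on this slide can be modeled as plain data to see why placement decisions get complicated. The sketch below is only an illustration in user-space Python, not sched_ext code: the `Cpu` class, `build_topology`, and the `distance` ranking are hypothetical names, and the ranking (same core, same L3, same node, other node) is an assumed ordering of migration cost, not something stated in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    node: int    # NUMA node index
    l3: int      # L3 cache domain index within the node
    core: int    # core index within the node
    kind: str    # "big" or "LITTLE"
    thread: int  # sibling index within the core


def build_topology(nodes=2):
    """Enumerate the CPUs of the layout shown on the slide."""
    cpus = []
    for node in range(nodes):
        for core in range(4):
            l3 = core // 2  # cores 0-1 share one L3, cores 2-3 the other
            kind = "big" if core < 2 else "LITTLE"
            for thread in range(2):
                cpus.append(Cpu(node, l3, core, kind, thread))
    return cpus


def distance(a, b):
    """Hypothetical migration-cost rank: 0 = same core, 3 = other NUMA node."""
    if a.node != b.node:
        return 3
    if a.l3 != b.l3:
        return 2
    if a.core != b.core:
        return 1
    return 0


topology = build_topology()
big_cpus = [c for c in topology if c.kind == "big"]
```

Even this small machine has four distinct "distances" between CPUs plus two core types, which is the kind of detail a placement policy may want to weigh.
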
  5. Recipe #1: The empty scheduler
     • Empty scheduler (use sched_ext default)
       ◦ Global run-queue
       ◦ Round-robin scheduler (20ms time slice)
       ◦ Built-in idle CPU selection policy
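
To make the default behavior described on this slide concrete, here is a minimal user-space simulation of round-robin scheduling from a single shared queue with a 20 ms slice. It is a sketch under simplifying assumptions, not kernel code: all CPUs advance in lockstep rounds, there is no idle CPU selection, and the function name `run_global_round_robin` is made up for this example.

```python
from collections import deque

SLICE_MS = 20  # default time slice quoted on the slide


def run_global_round_robin(tasks, nr_cpus, slice_ms=SLICE_MS):
    """Simulate round-robin from one global queue.

    tasks maps a task name to its remaining work in ms.
    Returns a list of (start_ms, cpu, name, ran_ms) tuples.
    """
    queue = deque(tasks.items())
    timeline = []
    now = 0
    while queue:
        # Each round, every CPU pulls the next task from the shared queue.
        batch = [queue.popleft() for _ in range(min(nr_cpus, len(queue)))]
        for cpu, (name, remaining) in enumerate(batch):
            ran = min(slice_ms, remaining)
            timeline.append((now, cpu, name, ran))
            if remaining > ran:
                # Slice expired with work left: back to the tail of the queue.
                queue.append((name, remaining - ran))
        now += slice_ms
    return timeline
```

Calling `run_global_round_robin({"a": 50, "b": 20, "c": 30}, nr_cpus=2)` shows tasks rotating through both CPUs in 20 ms steps, with any task free to land on any CPU because the queue is shared.
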
  6. Recipe #2: The global vs local runqueue scheduler
     • Round-robin scheduler that can use either a single global queue or multiple per-CPU runqueues
       ◦ Time slice scaled proportionally to the task’s priority (weight) and inversely proportional to the number of contending tasks (fairness)
       ◦ Perfect load balancing (global) vs bad load balancing (local)
       ◦ Not really good if you overload the system (round-robin)
       ◦ Bad for cache locality and scalability (global) vs good for cache locality and scalability (local)
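
The time-slice rule on this slide (proportional to weight, inversely proportional to the number of contending tasks) can be sketched as a small function. The exact formula used in the talk is not in the transcript, so the constants below are assumptions: a 20 ms base slice, a weight of 100 for a default-priority task, and a hypothetical lower bound so that slices never shrink to nothing.

```python
BASE_SLICE_NS = 20_000_000  # assumed base slice: 20 ms
DEFAULT_WEIGHT = 100        # assumed weight of a default-priority task
MIN_SLICE_NS = 100_000      # hypothetical floor, not from the talk


def task_slice_ns(weight, nr_waiting):
    """Scale the slice up with task weight and down with queue contention."""
    contenders = max(nr_waiting, 1)  # avoid dividing by zero on an empty queue
    scaled = BASE_SLICE_NS * weight // DEFAULT_WEIGHT // contenders
    return max(scaled, MIN_SLICE_NS)
```

With these assumptions, a default-weight task competing with three others gets `task_slice_ns(100, 4)`, a quarter of the base slice, while doubling its weight doubles that share. With a global queue `nr_waiting` would count all runnable tasks; with per-CPU queues it would count only those on the local queue.
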
  7. Recipe #3: The “yell at your PC” scheduler
     • Adjust the number of CPUs used based on the noise level around your PC
       ◦ Shows potential BPF / user-space interactions
       ◦ Good for energy consumption (if the environment is quiet)
       ◦ Not usable in public places (e.g., library, open space office, etc.)
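
The user-space half of such a scheduler needs some policy that turns a noise reading into a CPU count. The transcript does not say how the talk's version measures noise or passes the result to the BPF side, so the sketch below covers only the mapping step, with invented names and thresholds (`cpus_for_noise`, `quiet_db`, `loud_db`) and a simple linear scale as the assumed policy.

```python
def cpus_for_noise(noise_db, nr_cpus, quiet_db=30.0, loud_db=90.0):
    """Map a noise level to a number of usable CPUs (always at least one).

    quiet_db and loud_db are hypothetical thresholds: at or below quiet_db
    a single CPU is used, at or above loud_db all CPUs are used.
    """
    if nr_cpus < 1:
        raise ValueError("nr_cpus must be at least 1")
    frac = (noise_db - quiet_db) / (loud_db - quiet_db)
    frac = min(max(frac, 0.0), 1.0)  # clamp to [0, 1]
    return max(1, round(frac * nr_cpus))


def allowed_cpus(noise_db, nr_cpus):
    """Return the set of CPU ids the scheduler may use at this noise level."""
    return set(range(cpus_for_noise(noise_db, nr_cpus)))
```

In a quiet room `allowed_cpus(25, 8)` keeps only CPU 0 available, which is where the energy saving comes from; a real implementation would then have to communicate that set to the BPF scheduler.
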
  8. Key takeaways
     • The one-size-fits-all scheduler approach is no longer sufficient
     • CPU allocation can be more effective than time allocation
     • Shared queues vs local queues can be relevant
     • Prioritizing waker -> wakee pipelines can help improve responsiveness
     • Hybrid schedulers (BPF + user space) have great potential