pluggable and extensible. A plugin works at one or more extension points in the scheduling framework.
• Filter: filters out Nodes that cannot run the Pod (insufficient resources, NodeAffinity mismatch, etc.).
• Score: scores the remaining Nodes to determine the best one (image locality, etc.).
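To make the two extension points concrete, here is a minimal sketch in Go. The `Node`, `Pod`, `FilterPlugin`, `ScorePlugin`, and `cpuFit` names are simplified stand-ins invented for illustration, not the upstream Scheduling Framework types, and the scoring rule is an assumption.

```go
package main

import "fmt"

// Node and Pod are simplified, hypothetical stand-ins for the Kubernetes API types.
type Node struct {
	Name    string
	FreeCPU int // millicores still available
}

type Pod struct {
	Name   string
	CPUReq int // millicores requested
}

// FilterPlugin and ScorePlugin are illustrative interfaces, not the upstream ones.
type FilterPlugin interface {
	Filter(pod Pod, node Node) bool
}

type ScorePlugin interface {
	Score(pod Pod, node Node) int
}

// cpuFit rejects Nodes without enough CPU and prefers Nodes with the most CPU left over.
type cpuFit struct{}

func (cpuFit) Filter(pod Pod, node Node) bool { return node.FreeCPU >= pod.CPUReq }
func (cpuFit) Score(pod Pod, node Node) int   { return node.FreeCPU - pod.CPUReq }

// schedule runs Filter on every Node, scores the survivors, and returns the best Node's name.
func schedule(pod Pod, nodes []Node, f FilterPlugin, s ScorePlugin) (string, bool) {
	best, bestScore, found := "", 0, false
	for _, n := range nodes {
		if !f.Filter(pod, n) {
			continue // the Node cannot run the Pod
		}
		if sc := s.Score(pod, n); !found || sc > bestScore {
			best, bestScore, found = n.Name, sc, true
		}
	}
	return best, found
}

func main() {
	nodes := []Node{{"node-a", 500}, {"node-b", 2000}, {"node-c", 1200}}
	name, ok := schedule(Pod{"web", 1000}, nodes, cpuFit{}, cpuFit{})
	fmt.Println(name, ok)
}
```

The real framework has more extension points and richer return types; the sketch only shows the filter-then-score flow.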
use case, size, etc. of the cluster.
  ◦ We don’t want to implement all scheduling use cases.
• Scheduling is complicated; users don’t want to implement their own scheduler from scratch.
  ◦ Extensibility lets users focus on writing their custom logic and rely on the upstream scheduler for the fundamental scheduling logic.
“Extender”. Each registered webhook is called at specific point(s) during scheduling.
• No need to rebuild the scheduler to extend it.
• The implementation is flexible.
• It hurts scheduling latency badly.
• Its functionality is very limited.
based on the Scheduling Framework. It’s designed to provide better extensibility than the webhook (extender).
• More extensible than the webhook (extender).
• No overhead between the scheduler and plugins.
• Requires a fork/rebuild/replacement of the scheduler.
plugin compiled to a wasm module.
• (Will be) as extensible as a Golang plugin.
• Less troublesome to set up (easier distribution, no rebuild, etc.).
• Can be written in many languages (only a TinyGo SDK exists for now).
• It hurts scheduling latency (but less than the extender does).
• Wasm sandbox limitations.
can still be the best if…
• Scheduling latency is critical in your cluster.
  ◦ The more Pods your cluster usually gets, the faster scheduling needs to be.
• You need to handle a large number of varied objects.
  ◦ The overhead of object transfer and GC grows accordingly (discussed later).
Filter(...) • Score(...) • etc.
Forwards the function calls from the Scheduling Framework to Wasm modules.
Application Binary Interface (ABI): the contract between the host (scheduler) and the wasm module.
Filter(...) • Score(...) • etc.
The wasm module fetches additional data from the scheduler side as necessary.
The functions that the host (scheduler) exposes are also defined in the ABI.
non-Wasm people to create Wasm modules. It would be hard for people to implement a Wasm module from scratch based only on the ABI. The SDK lets people develop wasm modules with an experience very similar to writing Golang plugins: just implement the corresponding interfaces.
the exported function support, which we wanted for a performant design.
• …but it’s actually coming now! We can explore a Go SDK with it in the future.
Issue: cmd/compile: add go:wasmexport directive #65199
Only numeric types are supported.
• For example, we cannot define Filter(pod *v1.Pod).
• The guest can only operate on its own memory.
• Objects cannot be passed by reference.
So, how do we transfer objects (Pods, Nodes, etc.)?
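One common way around a numeric-only boundary is to copy the serialized object into linear memory and pass its offset and length packed into a single 64-bit integer. The sketch below simulates this in plain Go. The `memory` type and the `pack`, `unpack`, `write`, and `read` helpers are hypothetical names chosen for illustration, and the byte payload stands in for whatever serialization format is actually used.

```go
package main

import "fmt"

// memory is a hypothetical stand-in for a wasm module's linear memory.
type memory []byte

// pack encodes an offset and a length into one uint64 so that a byte range
// can cross a boundary that only accepts numeric values.
func pack(ptr, size uint32) uint64 { return uint64(ptr)<<32 | uint64(size) }

// unpack reverses pack.
func unpack(v uint64) (ptr, size uint32) { return uint32(v >> 32), uint32(v) }

// write copies serialized bytes into memory at ptr and returns the packed handle.
func (m memory) write(ptr uint32, data []byte) uint64 {
	copy(m[ptr:], data)
	return pack(ptr, uint32(len(data)))
}

// read resolves a packed handle back into the bytes it refers to.
func (m memory) read(handle uint64) []byte {
	ptr, size := unpack(handle)
	return m[ptr : ptr+size]
}

func main() {
	mem := make(memory, 64)
	// "pod-a" stands in for a serialized Pod; the real encoding is an assumption here.
	handle := mem.write(8, []byte("pod-a"))
	fmt.Println(string(mem.read(handle)))
}
```

Every transfer costs a serialization, a copy, and a deserialization, which is why the amount of data crossing the boundary matters so much for latency.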
when we need them.
• Cache: don’t fetch the same object more than once.
Example: the Pod is fetched from the host when pod.Spec() is called (or served from the cache), while the NodeList is not fetched.
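The lazy-loading and caching idea can be sketched as follows. `lazyPod`, `PodSpec`, and the `fetch` callback are hypothetical names; `fetch` stands in for the host call that transfers the object into the guest.

```go
package main

import "fmt"

// PodSpec is a simplified, hypothetical stand-in for the real Pod spec.
type PodSpec struct{ NodeName string }

// lazyPod defers the host call until the spec is first needed, then caches it.
type lazyPod struct {
	fetch func() PodSpec // hypothetical host call that transfers the object
	spec  *PodSpec       // cached result; nil until the first Spec() call
}

// Spec returns the cached spec, fetching it from the host only on first use.
func (p *lazyPod) Spec() PodSpec {
	if p.spec == nil {
		s := p.fetch()
		p.spec = &s
	}
	return *p.spec
}

func main() {
	calls := 0
	pod := &lazyPod{fetch: func() PodSpec {
		calls++
		return PodSpec{NodeName: "node-a"}
	}}
	pod.Spec()
	pod.Spec()
	fmt.Println("host calls:", calls)
}
```

An object that a plugin never touches is never transferred at all, and one that is touched repeatedly is transferred once.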
is inlined -> inlined GC overhead was over half the latency of a plugin execution.
wasilibs/nottinygc: “High performance GC alternative for TinyGo targeting WASI”
• nottinygc is awesome; it reduced plugin execution latency by ~50% in some scenarios.
• But the repository is archived, so we cannot keep relying on it.
-gc=leaking flag: only allocate memory, never free it.
• The wasm module’s memory usage keeps growing, which is not a problem if the wasm module is short-lived.
• We tried killing and recreating the modules after every scheduling cycle, but performance was poor because the recreation was costly.
is only one in the cluster basically. We have two layers of benchmarking in the project.
• Plugin-level benchmark: shows how long each part of a plugin execution takes.
• Scheduler-level benchmark: shows how much the wasm overhead actually affects scheduling latency.
start the extensibility from your system. But you must consider…
• SDK design: make it easier for people to build wasm guests.
• Object transfer: if you need to operate on large or many objects, you need to put effort into reducing the transfer overhead.
• Benchmark: keep benchmarking. Performance is always a concern.