Slide 1

Slide 1 text

A Tale of Two Plugins: Safely Extending the Kubernetes Scheduler with WebAssembly Kensei Nakada (@sanposhiho)

Slide 2

Slide 2 text

Hello! こんにちは ! 👋 Kensei Nakada (@sanposhiho) • Software Engineer @ • Kubernetes maintainer (SIG-Scheduling approver, SIG-Autoscaling) • Kubernetes contributor award 2022, 2023

Slide 3

Slide 3 text

Kubernetes Scheduler Scheduler Plugins Scheduling Framework Starting from the basics

Slide 4

Slide 4 text

Kubernetes Scheduler The control plane component that finds the best Node for every Pod to run on. Many factors to consider: Image Locality, Taint/Toleration, Resource, Ports, NodeAffinity, PodAffinity/AntiAffinity, etc.

Slide 5

Slide 5 text

Scheduler Plugins Each scheduling factor is implemented as a plugin. The Kubernetes scheduler consists of many plugins: Image Locality Plugin, TaintToleration Plugin, Resource Fit Plugin, NodePorts Plugin, NodeAffinity Plugin, Inter-Pod Affinity Plugin, etc.

Slide 6

Slide 6 text

Scheduling Framework The underlying architecture of the scheduler, which is pluggable and extensible. A plugin works at one or more extension points in the scheduling framework.
Filter: Filter out Nodes that cannot run the Pod. (Insufficient resources, NodeAffinity mismatch, etc.)
Score: Score the remaining Nodes and determine the best one. (Image locality, etc.)
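The Filter-then-Score flow can be sketched in a few lines of Go. This is a minimal illustration, not the scheduler's actual code: `Node`, `Pod`, `FilterFunc`, `ScoreFunc`, `schedule`, `fitsCPU`, and `mostFreeCPU` are all hypothetical names, and the real framework combines many plugins with weighted scores rather than a single function per extension point.

```go
package main

import "fmt"

// Node and Pod are simplified stand-ins for the Kubernetes API types.
type Node struct {
	Name    string
	FreeCPU int // free CPU in millicores
}

type Pod struct {
	Name       string
	RequestCPU int // requested CPU in millicores
}

// FilterFunc reports whether the Node can run the Pod.
type FilterFunc func(pod Pod, node Node) bool

// ScoreFunc ranks a feasible Node; a higher score is better.
type ScoreFunc func(pod Pod, node Node) int

// schedule drops Nodes rejected by filter, scores the survivors,
// and returns the best Node name (false if no Node is feasible).
func schedule(pod Pod, nodes []Node, filter FilterFunc, score ScoreFunc) (string, bool) {
	best, bestScore, found := "", 0, false
	for _, n := range nodes {
		if !filter(pod, n) {
			continue
		}
		if s := score(pod, n); !found || s > bestScore {
			best, bestScore, found = n.Name, s, true
		}
	}
	return best, found
}

// fitsCPU is a toy Filter: the Node must have enough free CPU.
func fitsCPU(pod Pod, node Node) bool { return node.FreeCPU >= pod.RequestCPU }

// mostFreeCPU is a toy Score: prefer the Node with the most CPU left over.
func mostFreeCPU(pod Pod, node Node) int { return node.FreeCPU - pod.RequestCPU }

func main() {
	nodes := []Node{{"node-a", 500}, {"node-b", 2000}, {"node-c", 1500}}
	name, ok := schedule(Pod{Name: "web", RequestCPU: 1000}, nodes, fitsCPU, mostFreeCPU)
	fmt.Println(name, ok)
}
```

Here `node-a` is filtered out for lack of CPU, and the two remaining Nodes are ranked by the toy score.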

Slide 7

Slide 7 text

Scheduling Framework More extension points actually…

Slide 8

Slide 8 text

Existing extensibilities in the scheduler

Slide 9

Slide 9 text

Extensibility matters!
● Scheduling requirements depend on the use case, size, etc. of the cluster.
○ We don’t want to implement all scheduling use cases.
● Scheduling is complicated; users don’t want to implement their own scheduler from scratch.
○ The extension mechanisms let users focus on writing their custom logic and rely on the upstream scheduler for the fundamental scheduling logic.

Slide 10

Slide 10 text

Webhook Golang plugin Extensibility

Slide 11

Slide 11 text

Webhook (Extender) The scheduler has a webhook-based extension called “Extender”. Each registered webhook is called at specific point(s) during scheduling.
• No need to rebuild the scheduler to extend it.
• Flexibility in how the webhook is implemented.
• It impacts the scheduling latency very badly.
• The functionality is very limited.

Slide 12

Slide 12 text

Golang plugin (Scheduling Framework) You can implement your own plugin based on the Scheduling Framework. It’s designed to provide better extensibility than the webhook (extender).
• More extensible than the webhook (extender).
• No overhead between the scheduler and plugins.
• Requires a fork/rebuild/replacement of the scheduler.

Slide 13

Slide 13 text

Webhook Golang plugin Extensibility

Slide 14

Slide 14 text

Webhook Golang plugin Wasm Extensibility NEW!!

Slide 15

Slide 15 text

Wasm extension Next generation of Kubernetes scheduler extension NEW!!

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

Wasm extension Evolving from the Golang plugin, you can write a plugin compiled to a Wasm module. NEW!!
• (Will be) as extensible as the Golang plugin.
• Less troublesome to set up. (Easier distribution, no rebuild, etc.)
• Could be written in many languages. (We only have a TinyGo SDK for now, though.)
• It impacts the scheduling latency negatively. (But less than the extender.)
• Wasm sandbox limitations.

Slide 18

Slide 18 text

Default scheduler

Slide 19

Slide 19 text

Wasm extension: A bit slower than the default scheduler.

Slide 20

Slide 20 text

Webhook (Extender): Much slower than the default scheduler and the wasm extension.

Slide 21

Slide 21 text

So… will the wasm extension replace all plugins? A Golang plugin can still be the best choice if…
● Scheduling latency is critical in your cluster.
○ The more Pods your cluster usually gets, the faster the scheduling needs to be.
● You need to handle a large number of various objects.
○ The overhead from object transfer and GC would be bigger. (Discussed later.)

Slide 22

Slide 22 text

Crafting the wasm extension

Slide 23

Slide 23 text

How it works Using Wazero to embed the wasm runtime within the scheduler.

Slide 24

Slide 24 text

How it works Scheduling Framework Load From: • Remote hosts. • Local files. Load Wasm modules into the scheduler. Wasm extension Go plugin

Slide 25

Slide 25 text

How it works Scheduling Framework Load From: • Remote hosts. • Local files. Load Wasm modules into the scheduler. Wasm extension Go plugin

Slide 26

Slide 26 text

How it works Scheduling Framework Scheduling Framework Filter(...) Filter(...) • Filter(...) • Score(...) • etc Forwards the function calls from the Scheduling Framework to Wasm modules. Wasm extension Go plugin

Slide 27

Slide 27 text

How it works Scheduling Framework Scheduling Framework Filter(...) Filter(...) • Filter(...) • Score(...) • etc Forwards the function calls from the Scheduling Framework to Wasm modules. Application Binary Interface (ABI): The contracts between the host (scheduler) and the wasm module.

Slide 28

Slide 28 text

How it works Scheduling Framework Scheduling Framework Pod(...), Node(...) • Filter(...) • Score(...) • etc The wasm module fetches the additional data from the scheduler side, as necessary. Those functions exposed from the host (scheduler) are also defined with ABI.

Slide 29

Slide 29 text

TinyGo SDK We provide an SDK to make it easier for people unfamiliar with Wasm to create Wasm modules. It would be hard to implement a Wasm module from scratch based only on the ABI. The SDK lets people develop wasm modules with an experience very similar to writing Golang plugins: you just need to implement interfaces.

Slide 30

Slide 30 text

Golang plugin: How-to Each extension point has an interface.

Slide 31

Slide 31 text

TinyGo SDK We provide an SDK to make it easier for people unfamiliar with Wasm to create Wasm modules. It would be hard to implement a Wasm module from scratch based only on the ABI. The SDK lets people develop wasm modules with an experience very similar to writing Golang plugins: you just need to implement the corresponding interfaces.

Slide 32

Slide 32 text

TinyGo SDK Why TinyGo, not Go?
• Go didn’t support exported functions, which we wanted for a performant design.
• …but it’s actually coming now! We can explore a Go SDK with it in the future.
Issue: cmd/compile: add go:wasmexport directive #65199

Slide 33

Slide 33 text

Implement an object transfer
• Wasm function signatures support only numeric types.
• For example, we cannot define Filter(pod *v1.Pod).
• The guest can only operate on its own memory.
• Objects cannot be passed by reference.
So, how do we transfer objects (Pods, Nodes, etc.)?

Slide 34

Slide 34 text

Implement an object transfer
Guest: “Put the Pod at the address (ptr). Don’t use more than limit.”
Host: “I stored the Pod in your memory. The length is xxxx.”
The host can read/write anything in the guest’s memory.
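The ptr/limit exchange above can be simulated in plain Go without a Wasm runtime. This is only a sketch under stated assumptions: `guestMemory`, `hostWritePod`, and `guestGetPod` are hypothetical names, a byte slice stands in for the guest's linear memory, and JSON is used purely to keep the example stdlib-only (the slides do not say which encoding the project uses).

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Pod is a simplified stand-in for the real API object.
type Pod struct {
	Name      string `json:"name"`
	Namespace string `json:"namespace"`
}

// guestMemory models the guest's linear memory, which the host can read and write.
type guestMemory []byte

// hostWritePod is the host side. It serializes the Pod and, if it fits within
// limit, writes it at ptr. It returns the serialized length (0 on failure), so
// only numeric values cross the boundary.
func hostWritePod(mem guestMemory, pod Pod, ptr, limit uint32) uint32 {
	if uint64(ptr)+uint64(limit) > uint64(len(mem)) {
		return 0 // the offered buffer is outside guest memory
	}
	data, err := json.Marshal(pod)
	if err != nil {
		return 0
	}
	n := uint32(len(data))
	if n <= limit {
		copy(mem[ptr:ptr+n], data)
	}
	return n
}

// guestGetPod is the guest side. It offers a buffer (ptr, limit), lets the
// host fill it, then decodes the bytes from its own memory.
func guestGetPod(mem guestMemory, ptr, limit uint32, host func(ptr, limit uint32) uint32) (Pod, error) {
	n := host(ptr, limit)
	switch {
	case n == 0:
		return Pod{}, errors.New("host could not write the Pod")
	case n > limit:
		return Pod{}, fmt.Errorf("buffer too small: need %d bytes, have %d", n, limit)
	}
	var pod Pod
	if err := json.Unmarshal(mem[ptr:ptr+n], &pod); err != nil {
		return Pod{}, err
	}
	return pod, nil
}

func main() {
	mem := make(guestMemory, 1024)
	src := Pod{Name: "web", Namespace: "default"}
	host := func(ptr, limit uint32) uint32 { return hostWritePod(mem, src, ptr, limit) }

	pod, err := guestGetPod(mem, 64, 256, host)
	fmt.Println(pod, err)
}
```

Returning the required length even when the buffer is too small lets the guest retry with a larger buffer, which is one common way to design such a call.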

Slide 35

Slide 35 text

Object transfer is costly…
• Lazy loading: get objects only when they are needed.
• Cache: don’t get the same object more than once.
Example: the Pod is fetched from the host at pod.Spec(), or taken from the cache, while the NodeList is not transferred.
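Lazy loading plus caching can be sketched as a small wrapper type. The names below (`PodSpec`, `lazyPod`, `fetch`) are hypothetical; `fetch` stands in for the costly host call, and no locking is used on the assumption that the guest runs single-threaded.

```go
package main

import "fmt"

// PodSpec is a simplified stand-in for the real spec type.
type PodSpec struct{ NodeName string }

// lazyPod defers the costly host call until Spec() is first used,
// then serves later calls from the cached copy.
type lazyPod struct {
	fetch func() PodSpec // stands in for the object transfer from the host
	spec  *PodSpec       // nil until the first Spec() call
}

func (p *lazyPod) Spec() PodSpec {
	if p.spec == nil {
		s := p.fetch()
		p.spec = &s
	}
	return *p.spec
}

func main() {
	transfers := 0
	pod := &lazyPod{fetch: func() PodSpec {
		transfers++
		return PodSpec{NodeName: "node-a"}
	}}

	fmt.Println("transfers before use:", transfers)
	pod.Spec()
	pod.Spec()
	fmt.Println("transfers after two Spec() calls:", transfers)
}
```

A plugin that never touches the Pod pays nothing, and one that reads it repeatedly pays for a single transfer.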

Slide 36

Slide 36 text

Scheduling Framework Wasm extension Go plugin Wasm module PreScore(...) PreScore() Pod() Just ask to start PreScore(). At this point, no object transfer has been made yet.

Slide 37

Slide 37 text

Scheduling Framework Wasm extension Go plugin Wasm module PreScore(...) PreScore() Pod() (If the Pod isn’t in the cache) Request the Pod object from the host.

Slide 38

Slide 38 text

Garbage collection overhead Wasm has only one thread and GC is inlined -> the inlined GC overhead was over half the latency of a plugin execution.
wasilibs/nottinygc: High performance GC alternative for TinyGo targeting WASI.
• nottinygc is awesome; it reduced plugin execution latency by ~50% in some scenarios.
• But given that the repository is archived, we cannot keep relying on it.

Slide 39

Slide 39 text

Garbage collection overhead Wasm has only one thread and GC is inlined -> the inlined GC overhead was over half the latency of a plugin execution.
-gc=leaking flag: Only allocate memory, never free it.
• The wasm module’s memory usage would keep growing, but that wouldn’t be a problem if the wasm module is short-lived.
• We tried killing and recreating modules after every scheduling cycle, but didn’t get good performance because the recreation was costly.

Slide 40

Slide 40 text

Garbage collection overhead Wasm has only one thread and GC is inlined…
-gc=leaking flag: Only allocate memory, never free it.
wasilibs/nottinygc: High performance GC alternative for TinyGo targeting WASI.

Slide 41

Slide 41 text

Benchmark Performance matters for the scheduler because there is basically only one scheduler in the cluster. We have two layers of benchmarking in the project.
• Plugin-level benchmark to see how long each part takes.
• Scheduler-level benchmark to see how much wasm’s overhead actually impacts the scheduling latency.

Slide 42

Slide 42 text

No content

Slide 43

Slide 43 text

Wrap up

Slide 44

Slide 44 text

Summary Wasm is a valuable option to consider when you add extensibility to your system. But you must consider…
SDK design: Make it easier for people to build wasm guests.
Object transfer: If you need to operate on large or many objects, you need to put effort into reducing the transfer cost.
Benchmark: Keep benchmarking. Performance is always a concern.

Slide 45

Slide 45 text

THANK YOU!