Can We Use eBPF To Debug Performance of The Go Scheduler?

Raghav Roy, Madhav Jivrajani – VMware

Who Are We?

Disclaimer
• Are we eBPF experts?
  ◦ No.
• Are we experts in the history and implementation of the Go runtime?
  ◦ No.
• Are we experts in the Linux kernel?
  ◦ No.
• Why attend this talk?
  ◦ Minimum takeaway: the internals of the Go scheduler and its performance, plus what eBPF is and how it works.
  ◦ If you’re feeling dangerously adventurous: some ideas around optimising latency at the OS scheduler level.
  ◦ More of a “let’s look at solutions searching for a problem” talk.

Small disclaimer: Everything discussed is in reference to Go 1.20.6

So, why are we here? And why do we care?
Pt. 1

Do you know what goes on in the kernel under
the hood?

Standard libraries abstract away the interaction with the kernel. Let’s look at
what happens during a simple “echo” call.

More than 100 syscalls!

We can learn a lot about how an application behaves
if we can see its interaction with the kernel

How would you do that? Thought experiment

Modify the kernel?

Output something when an event occurs in the kernel, in
this case, say a syscall

Why that might not be a great idea:
• Extremely complex codebase, but that isn’t the only problem
• Come up with an approach
• Develop it, have it accepted into the Linux kernel (months!)
• Long time before the kernel with your patch is even adopted by all Linux distros
• Oops, the requirements changed!

Comic Relief

Why eBPF
• The Linux kernel can accept kernel modules that extend its behaviour (but is this safe?)
• eBPF programs can be loaded safely into the kernel, and do just this.

Comic Relief

Caveats!
• What happens if your eBPF program crashes?
• Is it safe to run?

Solution
• What happens if your eBPF program crashes?
• Is it safe to run?
The eBPF Verifier!
• Makes sure the program exits safely
• Makes sure the program only accesses memory that it is supposed to access
Still, only run eBPF programs from verifiable sources

Bonus Section (if you are still not convinced)
• eBPF programs can be loaded and removed dynamically; it doesn’t matter if your application was already running.
• They instantly get visibility over everything that's happening on the machine.
• New functionality can be created very quickly, without all Linux users having to accept the changes.

Okay so we agree this is great, but how does
it work?

How do you write eBPF programs?
• Kernel accepts programs in bytecode format (object file)
• eBPF programs can't be written in most high-level languages. Why?
  1. The compiler needs to emit eBPF bytecode, and not all languages support this
  2. Can’t have runtime features (garbage collection, scheduling, etc.)

What Events exist in the Kernel?

Events
• eBPF programs are event-driven
• Once a program is loaded into the kernel, it needs to be attached to an event
• We already talked about syscalls; what other places can you hook an eBPF program to?

Events
• Syscalls are stable APIs; they don’t change with the kernel version.
• There are also kernel function “entry” and “exit” points (the names explain what they mean). These hooks are called kprobes and kretprobes (newer kernels also offer fentry/fexit as a more efficient alternative)
• eBPF programs can also be attached to userspace functions; these hooks are called uprobes/uretprobes
• Network interface hooks, XDP
• Perf events, tracepoints, etc. (i.e. a vast variety of places to hook your eBPF program to!)

You said something about Maps
Yes, I did

Maps
• Key-value data structures used by both the userspace program and the eBPF program running in the kernel
• Defined alongside eBPF programs, then loaded into the kernel
• The userspace program writes config info, like event registration, into them; the eBPF program reads it
• Userspace and kernel-space programs need a common understanding of the data structures stored in the map
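To make the last point concrete, here is a minimal sketch in plain Go using only the standard library. No real eBPF map is involved: the `event` struct, its fields and the little-endian layout are assumptions made for illustration. The idea is that a map value is just raw bytes, so the side that writes it and the side that reads it must agree on field order and sizes.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// event is a hypothetical map value. It uses fixed-size fields only, so its
// byte layout is the same for whoever writes it and whoever reads it.
type event struct {
	PID   uint32
	CPU   uint32
	Count uint64
}

// encode mimics one side writing the value as raw bytes.
func encode(e event) ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, e); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decode mimics the other side interpreting those bytes with the same layout.
func decode(raw []byte) (event, error) {
	var e event
	err := binary.Read(bytes.NewReader(raw), binary.LittleEndian, &e)
	return e, err
}

func main() {
	raw, err := encode(event{PID: 1234, CPU: 2, Count: 7})
	if err != nil {
		panic(err)
	}
	e, err := decode(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d bytes -> %+v\n", len(raw), e)
}
```

If one side changed a field's width or order without the other side knowing, the same bytes would decode into different values, which is why the definitions are usually shared.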

Now you know the basic components that go into running
an eBPF program (Congrats!)

Let’s visualise the flow of an eBPF program being loaded
into the kernel and reacting to events

Loading it into the Kernel

eBPF program reacts to event-triggers

Types of events

You got the flow!

Things not covered here, but for which resources are provided:
• BCC libraries that have a ton of eBPF programs already written for you to tinker with
• Helper functions that already exist and can capture events when they occur in the kernel
• How eBPF solves portability (Compile Once, Run Everywhere)

So, why are we here? And why do we care?
Pt. 2

Goroutines!
• “Lightweight threads”
• Managed by the Go runtime
• Minimal API (go)

Full talk with much more details on the scheduler: Queues,
Fairness and The Go Scheduler

Let’s take a small example

// An abridged main.go
func main() {
	go doSomething()
	doAnotherThing()
}

go build -o app main.go

func main() {
	go doSomething()
	doAnotherThing()
}

The compiler turns the go statement into a call into the runtime:

func main() {
	runtime.newproc(...)
	doAnotherThing()
}

One such example of calling into the runtime

Let’s actually run our code. ./app

How do we get the code “inside” Goroutines to actually
run on our hardware? We need some way to map Goroutines to OS threads - user-space scheduling!

The Go scheduler does M:N scheduling: many Goroutines are multiplexed onto a smaller number of OS threads.

How do we keep track of Goroutines that are yet
to be run?

Queues!

Hold up… what if a bunch of Goroutines are blocked
on syscalls/IO? Do we really need all this state per thread?

How can we tackle this? ✨Indirection✨

“Go scheduler: Implementing language with lightweight concurrency” by Dmitry Vyukov

Interlude: Fairness

Presence of resource hogs in a FIFO system leads to
something known as the Convoy Effect. This is a common problem to deal with while considering fairness in scheduling.

How do we deal with this in scheduling?
• One way is to just schedule the short-running tasks before the long-running ones.
  ◦ This would require us to know the characteristics of the workload.
• Another way: pre-emption!
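A worked example of the Convoy Effect, as a rough sketch: one 100ms resource hog queued ahead of three 1ms tasks on a single worker. The durations and the helper name `avgWait` are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// avgWait returns the average time tasks spend waiting before they start
// when they run back to back, in the given order, on a single worker.
func avgWait(durations []int) float64 {
	if len(durations) == 0 {
		return 0
	}
	wait, elapsed := 0, 0
	for _, d := range durations {
		wait += elapsed // this task waited for everything before it
		elapsed += d
	}
	return float64(wait) / float64(len(durations))
}

func main() {
	fifo := []int{100, 1, 1, 1} // durations in ms, the hog arrived first

	shortestFirst := append([]int(nil), fifo...)
	sort.Ints(shortestFirst)

	fmt.Println("FIFO average wait (ms):", avgWait(fifo))
	fmt.Println("shortest-first average wait (ms):", avgWait(shortestFirst))
}
```

In FIFO order the waits are 0, 100, 101 and 102 ms, an average of 75.75 ms; running the short tasks first gives waits of 0, 1, 2 and 3 ms, an average of 1.5 ms. Sorting only works here because the durations are known up front, which is exactly the limitation the slide points out.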

Alright, now that we have enough context, the first thing
to ask ourselves is “how do we choose which Goroutine to run?”
1. Check local runqueue
2. Check global runqueue - steal in bulk
3. Check netpoller
4. Steal work in bulk from another p
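The lookup order above can be sketched with toy data structures. This is not the runtime's code: the `g`, `p` and `sched` types, the `findRunnable` method and the batch sizes are simplified stand-ins chosen for illustration.

```go
package main

import "fmt"

// g and p are toy stand-ins for the runtime's goroutine and processor
// structures; they are not the real runtime types.
type g struct{ id int }

type p struct{ runq []g }

type sched struct {
	global  []g  // global run queue
	netpoll []g  // goroutines made ready by network I/O
	ps      []*p // all processors
}

// findRunnable follows the four steps in order and reports where the
// chosen goroutine came from.
func (s *sched) findRunnable(self *p) (g, string, bool) {
	// 1. Local run queue.
	if len(self.runq) > 0 {
		next := self.runq[0]
		self.runq = self.runq[1:]
		return next, "local", true
	}
	// 2. Global run queue: take a batch, run one, keep the rest locally.
	if n := len(s.global); n > 0 {
		share := len(s.ps)
		if share == 0 {
			share = 1
		}
		batch := n/share + 1 // assumed batch size, a simplification
		if batch > n {
			batch = n
		}
		next := s.global[0]
		self.runq = append(self.runq, s.global[1:batch]...)
		s.global = s.global[batch:]
		return next, "global", true
	}
	// 3. Netpoller.
	if len(s.netpoll) > 0 {
		next := s.netpoll[0]
		s.netpoll = s.netpoll[1:]
		return next, "netpoll", true
	}
	// 4. Steal half of another p's queue.
	for _, other := range s.ps {
		if other == self || len(other.runq) == 0 {
			continue
		}
		n := (len(other.runq) + 1) / 2
		stolen := other.runq[:n]
		other.runq = other.runq[n:]
		self.runq = append(self.runq, stolen[1:]...)
		return stolen[0], "stolen", true
	}
	return g{}, "", false
}

func main() {
	p0 := &p{runq: []g{{id: 1}}}
	p1 := &p{runq: []g{{id: 10}, {id: 11}, {id: 12}, {id: 13}}}
	s := &sched{
		global:  []g{{id: 20}, {id: 21}, {id: 22}},
		netpoll: []g{{id: 30}},
		ps:      []*p{p0, p1},
	}
	for {
		next, src, ok := s.findRunnable(p0)
		if !ok {
			break
		}
		fmt.Printf("g%d from %s\n", next.id, src)
	}
}
```

Taking work in batches means a p does not have to go back to the shared queues, and the locking they involve, for every single Goroutine.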

What about the convoy effect? Will that be taken care
of? We spoke about pre-emption earlier, let’s see how Go did it then and now.

Non Co-operative Pre-emption
• Each Goroutine is given a time-slice of 10ms, after which pre-emption is attempted.
  ◦ 10ms is a soft limit.
• Pre-emption occurs by sending a userspace signal to the thread running the Goroutine that needs to be pre-empted.
  ◦ Similar to interrupt-based pre-emption in the kernel.
• The signal used is SIGURG.
“Pardon the Interruption: Loop Preemption in Go 1.14” by Austin Clements

Non Co-operative Pre-emption • Who sends this signal? sysmon

Non Co-operative Pre-emption
• sysmon
  ◦ Daemon running without a p
  ◦ Issues pre-emption requests for long-running Goroutines
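One way to observe this from user code, as a rough sketch: restrict the program to a single p, start a Goroutine that spins in a tight loop, and check whether the sleeping Goroutine still gets to run. The names `spin` and `observePreemption`, the 20ms sleep and the iteration cap are all arbitrary choices for illustration. The result depends on signal-based pre-emption being available on the platform (Go 1.14 or later) and not disabled.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// spin burns CPU in a loop that never blocks or yields voluntarily.
// It reports true if it saw the stop flag, which can only be set once
// another goroutine got to run. The iteration cap is only a safety net
// so the program terminates even if the spinner is never pre-empted.
func spin(stop *atomic.Bool, done chan<- bool) {
	for i := int64(0); i < 1<<33; i++ {
		if stop.Load() {
			done <- true
			return
		}
	}
	done <- false
}

// observePreemption reports whether the calling goroutine managed to run
// again while the spinner was still looping on the only p.
func observePreemption() bool {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	var stop atomic.Bool
	done := make(chan bool, 1)
	go spin(&stop, done)

	time.Sleep(20 * time.Millisecond) // the spinner takes over the only p
	stop.Store(true)                  // runs once the spinner is off the p
	return <-done
}

func main() {
	fmt.Println("spinner was pre-empted:", observePreemption())
}
```

With pre-emption working, the sleeping Goroutine resumes shortly after its timer expires, well before the spinner would have finished on its own.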

Where does the pre-empted Goroutine go?

Awesome! We now have a kinda-sorta good idea about what
happens under the hood. Yay! But all this is taken care of by the runtime itself. Are there any knobs we can turn to try and control some of this behaviour?

runtime APIs to interact with the scheduler
• Try and treat the runtime as a black-box as much as possible!
• (It’s a good thing that) Not a lot of exposed knobs to control the runtime.
• Whatever is available should be understood thoroughly before using in code.

runtime APIs to interact with the scheduler
• NumGoroutine()
• GOMAXPROCS()
• Gosched()
• Goexit()
• LockOSThread()/UnlockOSThread()
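A short sketch exercising a few of these calls. The helper names `parked` and `exitEarly` are made up for illustration; the `runtime` functions are the real ones from the standard library.

```go
package main

import (
	"fmt"
	"runtime"
)

// parked starts n goroutines that block on a channel and returns
// runtime.NumGoroutine() as observed while they are all alive.
func parked(n int) int {
	release := make(chan struct{})
	exited := make(chan struct{})
	for i := 0; i < n; i++ {
		go func() {
			<-release
			exited <- struct{}{}
		}()
	}
	count := runtime.NumGoroutine() // includes the caller itself
	close(release)
	for i := 0; i < n; i++ {
		<-exited
	}
	return count
}

// exitEarly shows that runtime.Goexit terminates the calling goroutine
// after running its deferred calls.
func exitEarly() []string {
	var steps []string
	done := make(chan struct{})
	go func() {
		defer close(done)
		defer func() { steps = append(steps, "deferred call ran") }()
		steps = append(steps, "before Goexit")
		runtime.Goexit()
		steps = append(steps, "never reached")
	}()
	<-done
	return steps
}

func main() {
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0)) // 0 queries without changing
	fmt.Println("goroutines while 5 are parked:", parked(5))
	fmt.Println("Goexit steps:", exitEarly())
	runtime.Gosched() // yields the processor; the caller resumes automatically
}
```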

LockOSThread()/UnlockOSThread()
• Wires the calling Goroutine to the underlying OS thread.
• Primarily used when the Goroutine changes the underlying thread’s state.

LockOSThread()/UnlockOSThread()
• Weaveworks has an excellent case-study on this:
  ◦ https://www.weave.works/blog/linux-namespaces-and-go-don-t-mix
  ◦ https://www.weave.works/blog/linux-namespaces-golang-followup
• Let’s look at the fine print.

LockOSThread()/UnlockOSThread()
• Acts like a “taint” indicating that thread state was changed.
• No
  ◦ other Goroutine can be scheduled on this thread until UnlockOSThread() is called the same number of times as LockOSThread().
  ◦ new thread can be created from a locked thread.
• Don’t create Goroutines from a locked one if they are expected to run on the modified thread state.
• If a Goroutine exits before unlocking the thread, the thread is terminated and is not used for scheduling anymore.
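A minimal sketch of the two ways a locked Goroutine can end. The helper `onLockedThread` and its `reuseThread` flag are hypothetical names for illustration; only `runtime.LockOSThread` and `runtime.UnlockOSThread` are real APIs.

```go
package main

import (
	"fmt"
	"runtime"
)

// onLockedThread runs fn on a goroutine wired to its OS thread.
// With reuseThread set, the thread is unlocked afterwards and goes back
// to the scheduler. Otherwise the goroutine exits while still locked,
// so the runtime terminates the thread instead of reusing it, which is
// what you want if fn left the thread in a modified state.
func onLockedThread(fn func(), reuseThread bool) {
	done := make(chan struct{})
	go func() {
		defer close(done)
		runtime.LockOSThread()
		if reuseThread {
			// One Unlock per Lock; nested locks need matching unlocks.
			defer runtime.UnlockOSThread()
		}
		fn()
	}()
	<-done
}

func main() {
	onLockedThread(func() {
		fmt.Println("thread-specific work, thread will be reused")
	}, true)

	onLockedThread(func() {
		fmt.Println("thread state modified, thread will be discarded")
	}, false)
}
```

Note that fn should not start Goroutines that rely on the modified thread state, since they may run on any other thread.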

So where does eBPF come in?

Demo time!

Demo: https://youtu.be/X0VnDPQRCo4?t=4938

Conclusion
• eBPF: a fast and safe way to extend the functionality of the kernel
• The Go scheduler is a distributed, best-effort preemptive scheduler
• The Go scheduler (and the runtime in general) interacts with the OS using syscalls
• Information about the Go runtime that would otherwise not be visible can be captured by using eBPF to observe the syscalls being made.
• Latency-sensitive, aggressive optimisations are maybe possible at the kernel level, enabled by eBPF?

https://github.com/MadhavJivrajani/gse/tree/main/example/preemption_ebpf

References
• Scalable Go Scheduler Design Doc
• Go scheduler: Implementing language with lightweight concurrency
• The Scheduler Saga
• Analysis of the Go runtime scheduler
• Non-cooperative goroutine preemption
  ◦ Pardon the Interruption: Loop Preemption in Go 1.14
• go/src/runtime/{ proc.go, proc_test.go, preempt.go, runtime2.go, ...}
  ◦ And their corresponding git blames
• Go's work-stealing scheduler

References
• Everything eBPF
• What Is eBPF - O'Reilly Book
• Maps in eBPF
• CO-RE and Portability in eBPF

Thank you!
