Can We Use eBPF To Debug Performance of The Go Scheduler?

Madhav Jivrajani

August 07, 2023

Transcript

  1. Can We Use eBPF To Debug Performance of The Go Scheduler?
     Raghav Roy, Madhav Jivrajani – VMware
  6. Disclaimer
     • Are we eBPF experts?
       ◦ No.
     • Are we experts in the history and implementation of the Go runtime?
       ◦ No.
     • Are we experts in the Linux Kernel?
       ◦ No.
     • Why attend this talk?
       ◦ Minimum takeaway: the internals of the Go scheduler and its performance, plus what eBPF is and how it works.
       ◦ If you’re feeling dangerously adventurous: some ideas around optimising latency at the OS scheduler level.
       ◦ More of a “solution in search of a problem” kind of talk.
  7. We can learn a lot about how an application behaves if we can see its interaction with the kernel
  14. Why patching the kernel might not be a great idea:
      • Extremely complex codebase, but that isn’t the only problem
      • Come up with an approach
      • Develop it and have it accepted into the Linux kernel (months!)
      • It takes a long time before a kernel with your patch is adopted by all Linux distros.
      • Oops, the requirements changed!
  15. Why eBPF
      • The Linux kernel can accept kernel modules that extend its behaviour (but is this safe?)
      • eBPF programs can be loaded safely into the kernel, and do just this.
  18. Solution
      • What happens if your eBPF program crashes?
      • Is it safe to run?
      The eBPF Verifier!
      • Makes sure the program exits safely
      • Makes sure the program only accesses memory that it is supposed to access
      Still, only run eBPF programs from verifiable sources
  21. Bonus Section (if you are still not convinced)
      • eBPF programs can be loaded and removed dynamically; it doesn’t matter if your application was already running.
      • They instantly get visibility over everything that's happening on the machine
      • New functionality can be created very quickly, without all Linux users having to accept the changes.
  25. How do you write eBPF programs
      • The kernel accepts programs in bytecode format (object file)
      • eBPF programs can't be written in just any high-level language, why?
        1. The compiler needs to emit eBPF bytecode, and not all languages support this
        2. Can’t have runtime features (Garbage Collection, Scheduling, etc.)
  27. Events
      • eBPF programs are event-driven
      • Once a program is loaded into the kernel, it needs to be attached to an event
      • We already talked about syscalls; what other places can you hook an eBPF program to?
  31. Events
      • Syscalls are stable APIs and don’t change with the kernel version.
      • There are also kernel function “entry” and “exit” points (the names explain what they mean). These hooks are called kprobes and kretprobes (newer kernel versions also offer fentry/fexit for the same purpose)
      • Programs can also be hooked to userspace functions, via uprobes/uretprobes
      • Network interface hooks, XDP
      • Perf events, tracepoints, etc. (i.e. a vast variety of places to hook your eBPF program to!)
  34. Maps
      • Key-value data structures shared by the userspace program and the eBPF program running in the kernel
      • Defined alongside eBPF programs, then loaded into the kernel
      • The userspace program writes config info, like event registration, into them, to be read by the eBPF program
      • Userspace and kernel-space programs need a common understanding of the data structures stored in the map
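The last point, a shared understanding of what is stored in a map, can be sketched from the userspace side in Go. This is only an illustration: the `schedEvent` type, its fields and the little-endian byte order are assumptions made for the example, not something taken from the talk. The idea is that the value is made of fixed-size fields, so its byte layout is predictable and could be mirrored by a matching struct in the eBPF program.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// schedEvent is a hypothetical map value. It uses fixed-size fields only,
// so the encoded form always has the same length and field offsets.
type schedEvent struct {
	PID       uint32
	TID       uint32
	LatencyNs uint64
}

// encode turns the value into raw bytes (little-endian is assumed here),
// which is the form in which map keys and values are exchanged.
func encode(e schedEvent) ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, e); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decode reads the same layout back into the Go struct.
func decode(raw []byte) (schedEvent, error) {
	var e schedEvent
	err := binary.Read(bytes.NewReader(raw), binary.LittleEndian, &e)
	return e, err
}

func main() {
	raw, err := encode(schedEvent{PID: 42, TID: 43, LatencyNs: 1500})
	if err != nil {
		panic(err)
	}
	fmt.Println("encoded length in bytes:", len(raw))

	back, err := decode(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("decoded: %+v\n", back)
}
```

If either side changes a field's size or order without the other following, the bytes still decode, just into the wrong values, which is why the layout has to be agreed upon up front.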
  35. Let’s visualise the flow of an eBPF program being loaded into the kernel and reacting to Events
  38. Things not covered here, but for which resources are provided:
      • BCC libraries that have a ton of eBPF programs already written for you to tinker with
      • Existing helper functions that can capture events when they occur in the kernel
      • How eBPF solves portability (Compile Once - Run Everywhere)
  39. func main() { go doSomething(); doAnotherThing() }
      The go statement above is turned into a call into the runtime, roughly:
      func main() { runtime.newproc(...); doAnotherThing() }
      One such example of calling into the runtime
  40. How do we get the code “inside” Goroutines to actually run on our hardware? We need some way to map Goroutines to OS threads - user-space scheduling!
  41. Hold up… what if a bunch of Goroutines are blocked on syscalls/IO? Do we really need all this state per thread?
  42. The presence of resource hogs in a FIFO system leads to something known as the Convoy Effect. This is a common problem to deal with when considering fairness in scheduling.
  43. How do we deal with this in scheduling?
      • One way is to just schedule the short-running tasks before the long-running ones.
        ◦ This would require us to know the characteristics of the workload.
      • Another way - pre-emption!
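To make the convoy effect concrete, here is a small Go sketch that compares the average waiting time of tasks under FIFO ordering with the "short tasks first" ordering mentioned above. The task durations are made up for the example, and tasks are assumed to run to completion without pre-emption.

```go
package main

import (
	"fmt"
	"sort"
)

// avgWait returns the mean time a task spends waiting before it starts,
// when tasks run to completion one after another in the given order.
func avgWait(durations []int) float64 {
	if len(durations) == 0 {
		return 0
	}
	elapsed, totalWait := 0, 0
	for _, d := range durations {
		totalWait += elapsed // this task waited for everything before it
		elapsed += d
	}
	return float64(totalWait) / float64(len(durations))
}

// shortestFirst returns a copy of the durations sorted in ascending order.
func shortestFirst(durations []int) []int {
	sorted := append([]int(nil), durations...)
	sort.Ints(sorted)
	return sorted
}

func main() {
	// One resource hog at the head of the queue, followed by short tasks.
	tasks := []int{100, 1, 1, 1}

	fmt.Println("FIFO average wait:", avgWait(tasks))                          // 75.75
	fmt.Println("shortest-first average wait:", avgWait(shortestFirst(tasks))) // 1.5
}
```

Sorting only works because the durations are known in advance here, which is exactly the knowledge a real scheduler usually lacks.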
  45. Alright, now that we have enough context, the first thing to ask ourselves is “how do we choose which Goroutine to run?”
      1. Check local runqueue
      2. Check global runqueue - steal in bulk
      3. Check netpoller
      4. Steal work in bulk from another P
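The four steps above can be sketched as a toy model in Go. This is not the runtime's code: the `g`, `p` and `sched` types, the batch sizes and the `findRunnable` method below are simplified stand-ins invented for illustration, and the real scheduler has extra steps, locking and spinning logic that are left out.

```go
package main

import "fmt"

type g struct{ id int }      // a goroutine
type p struct{ runq []*g }   // a processor with its local runqueue

type sched struct {
	globalq []*g // global runqueue
	netpoll []*g // toy stand-in for goroutines whose network I/O is ready
	ps      []*p // all processors
}

// pop removes and returns the head of a queue, or nil if it is empty.
func pop(q *[]*g) *g {
	if len(*q) == 0 {
		return nil
	}
	gp := (*q)[0]
	*q = (*q)[1:]
	return gp
}

// findRunnable picks the next goroutine for self and reports where it came from.
func (s *sched) findRunnable(self *p) (*g, string) {
	// 1. Local runqueue.
	if gp := pop(&self.runq); gp != nil {
		return gp, "local"
	}
	// 2. Global runqueue: move a batch to the local queue, run the first one.
	if n := len(s.globalq); n > 0 {
		batch := n/len(s.ps) + 1
		if batch > n {
			batch = n
		}
		self.runq = append(self.runq, s.globalq[:batch]...)
		s.globalq = s.globalq[batch:]
		return pop(&self.runq), "global"
	}
	// 3. Netpoller.
	if gp := pop(&s.netpoll); gp != nil {
		return gp, "netpoll"
	}
	// 4. Steal half of another P's local runqueue.
	for _, victim := range s.ps {
		if victim == self || len(victim.runq) == 0 {
			continue
		}
		n := (len(victim.runq) + 1) / 2
		self.runq = append(self.runq, victim.runq[:n]...)
		victim.runq = victim.runq[n:]
		return pop(&self.runq), "steal"
	}
	return nil, "idle"
}

func main() {
	p0 := &p{runq: []*g{{id: 1}}}
	p1 := &p{runq: []*g{{id: 5}, {id: 6}, {id: 7}, {id: 8}}}
	s := &sched{
		globalq: []*g{{id: 2}, {id: 3}},
		netpoll: []*g{{id: 4}},
		ps:      []*p{p0, p1},
	}
	for gp, src := s.findRunnable(p0); gp != nil; gp, src = s.findRunnable(p0) {
		fmt.Printf("run g%d (from %s)\n", gp.id, src)
	}
}
```

Trying local work first keeps the common path free of contention with other processors; the shared queues and stealing only come into play once a P runs dry.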
  46. What about the convoy effect? Will that be taken care of? We spoke about pre-emption earlier, let’s see how Go did it then and now.
  47. Non Co-operative Pre-emption
      • Each Goroutine is given a time-slice of 10ms, after which pre-emption is attempted.
        ◦ 10ms is a soft limit.
      • Pre-emption occurs by sending a userspace signal to the thread running the Goroutine that needs to be pre-empted.
        ◦ Similar to interrupt-based pre-emption in the kernel.
      • The signal that is sent is SIGURG.
      “Pardon the Interruption: Loop Preemption in Go 1.14” by Austin Clements
  48. Non Co-operative Pre-emption
      • sysmon
        ◦ Daemon running without a P
        ◦ Issues pre-emption requests for long-running Goroutines
  49. Awesome! We now have a kinda-sorta good idea about what happens under the hood. Yay! But all this is taken care of by the runtime itself. Are there any knobs we can turn to try and control some of this behaviour?
  50. runtime APIs to interact with the scheduler
      • Try and treat the runtime as a black-box as much as possible!
      • (It’s a good thing that) there are not a lot of exposed knobs to control the runtime.
      • Whatever is available should be understood thoroughly before being used in code.
  51. runtime APIs to interact with the scheduler
      • NumGoroutine()
      • GOMAXPROCS()
      • Gosched()
      • Goexit()
      • LockOSThread()/UnlockOSThread()
  53. LockOSThread()/UnlockOSThread()
      • Wires the calling Goroutine to its underlying OS thread.
      • Primarily used when the Goroutine changes the underlying thread’s state.
  54. LockOSThread()/UnlockOSThread()
      • Weaveworks has an excellent case-study on this:
        ◦ https://www.weave.works/blog/linux-namespaces-and-go-don-t-mix
        ◦ https://www.weave.works/blog/linux-namespaces-golang-followup
      • Let’s look at the fine print.
  55. LockOSThread()/UnlockOSThread()
      • Acts like a “taint” indicating thread state was changed.
      • No
        ◦ other Goroutine can be scheduled on this thread till UnlockOSThread() is called the same number of times as LockOSThread().
        ◦ new thread can be created from a locked thread.
      • Don’t create Goroutines from a locked one if they are expected to run on the modified thread state.
      • If a Goroutine exits before unlocking the thread, the thread is terminated and is not used for scheduling anymore.
  56. Conclusion
      • eBPF: a fast and safe way to extend the functionality of the kernel
      • The Go scheduler is a distributed, best-effort preemptive scheduler
      • The Go scheduler (and the runtime in general) interacts with the OS using syscalls
      • Information about the Go runtime that would otherwise not be visible can be captured by using eBPF to observe the syscalls it makes.
      • Latency-sensitive, aggressive optimisations may be possible at the kernel level, enabled by eBPF?
  57. References
      • Scalable Go Scheduler Design Doc
      • Go scheduler: Implementing language with lightweight concurrency
      • The Scheduler Saga
      • Analysis of the Go runtime scheduler
      • Non-cooperative goroutine preemption
        ◦ Pardon the Interruption: Loop Preemption in Go 1.14
      • go/src/runtime/{ proc.go, proc_test.go, preempt.go, runtime2.go, ...}
        ◦ And their corresponding git blames
      • Go's work-stealing scheduler
  58. References
      • Everything eBPF
      • What Is eBPF - O'Reilly Book
      • Maps in eBPF
      • CO-RE and Portability in eBPF