Can We Use eBPF To Debug Performance of The Go Scheduler?

Madhav Jivrajani

August 07, 2023

Transcript

  1. Can We Use eBPF To Debug
    Performance of The Go Scheduler?
    Raghav Roy, Madhav Jivrajani – VMware

  6. Disclaimer
    ● Are we eBPF experts?
    ○ No.
    ● Are we experts in the history and implementation of the Go runtime?
    ○ No.
    ● Are we experts in the Linux Kernel?
    ○ No.
    ● Why attend this talk?
    ○ Minimum takeaway: what affects the performance of the Go scheduler internally, plus what eBPF is and how it works.
    ○ If you’re feeling dangerously adventurous: some ideas around optimising latency at the OS scheduler
    level.
    ○ This is more of a “solution searching for a problem” kind of talk.

  7. Small disclaimer: Everything discussed is in reference to
    Go 1.20.6

  8. So, why are we here? And why do we care?
    Pt. 1

  9. Do you know what goes on in the kernel under the hood?

  10. Standard libraries abstract away interactions with the kernel. Let’s
    look at what happens during a simple “echo” call

  11. More than 100 syscalls!

  12. We can learn a lot about how an application behaves if we
    can see its interaction with the kernel

  13. How would you do that? Thought experiment

  14. Modify the kernel?

  15. Output something when an event occurs in the kernel, in
    this case, say a syscall

  22. Why that might not be a great idea:
    ● Extremely complex codebase, but that isn’t the only problem
    ● Come up with an approach
    ● Develop it, have it accepted into the Linux kernel (months!)
    ● Long wait before a kernel with your patch is adopted by all Linux distros.
    ● Oops, the requirements changed!

  23. Comic Relief

  25. Why eBPF
    ● The Linux kernel can accept kernel modules that extend its behaviour (but is
    this safe?)
    ● eBPF programs can be loaded safely into the kernel, and do just this.

  26. Comic Relief

  28. Caveats!
    ● What happens if your eBPF program crashes?
    ● Is it safe to run?

  31. Solution
    ● What happens if your eBPF program crashes?
    ● Is it safe to run?
    The eBPF Verifier!
    ● Makes sure the program exits safely
    ● Makes sure the program only accesses memory that it is supposed to access
    Still, only run eBPF programs from sources you can verify

  34. Bonus Section (if you are still not convinced)
    ● eBPF programs can be loaded and removed dynamically, even while your
    application is already running.
    ● Once loaded, they instantly get visibility into everything that's happening on the machine
    ● New functionality can be created very quickly, without all Linux users having to accept
    the changes.

  35. Okay so we agree this is great, but how does it work?

  40. How do you write eBPF programs
    ● Kernel accepts programs in bytecode format (object file)
    ● eBPF programs can't be written in most high-level languages, why?
    1. The compiler needs to emit eBPF bytecode, and not all languages support this
    2. Can’t have runtime features (garbage collection, scheduling, etc.)

  41. What Events exist in the Kernel?

  44. Events
    ● eBPF programs are event-driven
    ● Once a program is loaded into the kernel, it needs to be attached to an event
    ● We already talked about syscalls. Where else can you hook an eBPF
    program?

  49. Events
    ● Syscalls are stable APIs that don’t change with the kernel version.
    ● There are also kernel function “entry” and “exit” points (the names explain what they
    mean). Hooks on these are called kprobes and kretprobes (newer kernel versions also
    offer fentry/fexit as a lower-overhead alternative)
    ● Programs can also be hooked to userspace functions; these hooks are called uprobes/uretprobes
    ● Network interface hooks, such as XDP
    ● Perf events, tracepoints, etc. (i.e., a vast variety of places to hook your
    eBPF program to!)

  51. You said something about Maps
    Yes, I did

  55. Maps
    ● Key-value data structures shared between the userspace program and the eBPF
    program running in the kernel
    ● Defined alongside eBPF programs, then loaded into the kernel
    ● The userspace program writes config info, like event registration, into them, and the
    eBPF program reads it
    ● Userspace and kernel-space programs need a common understanding of the
    data structures stored in the map
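
The need for a common understanding of the stored data can be sketched from the Go side. This is only an illustration: the `event` struct, its fields, and the little-endian encoding are hypothetical, and a real loader would mirror whatever struct its eBPF program actually defines.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// event is a hypothetical map value. Fixed-width fields keep the byte
// layout identical on both sides of the userspace/kernel boundary.
type event struct {
	PID       uint32
	Syscall   uint32
	Timestamp uint64
}

// encode serialises an event into the raw bytes a map value would hold.
func encode(e event) ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, e); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decode parses raw map bytes back into an event.
func decode(raw []byte) (event, error) {
	var e event
	err := binary.Read(bytes.NewReader(raw), binary.LittleEndian, &e)
	return e, err
}

func main() {
	raw, err := encode(event{PID: 42, Syscall: 1, Timestamp: 1000})
	if err != nil {
		panic(err)
	}
	e, err := decode(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(raw), e.PID, e.Syscall, e.Timestamp)
}
```

If either side changed a field width or the field order without the other, the decoded values would silently be wrong, which is why the layout has to be agreed on up front.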

  57. Now you know the basic components that go into running
    an eBPF program
    (Congrats!)

  58. Let’s visualise the flow of an eBPF program being loaded
    into the kernel and reacting to Events

  59. Loading it into the Kernel

  60. eBPF program reacts to event-triggers

  61. Types of events

  62. You got the flow!

  65. Things not covered here, but with resources provided:
    ● BCC, which ships a ton of eBPF programs already written for you to tinker
    with
    ● Existing helper functions that can capture events when they occur in the
    kernel
    ● How eBPF solves portability (Compile Once - Run Everywhere, CO-RE)

  66. So, why are we here? And why do we care?
    Pt. 2

  67. Goroutines!
    ● “Lightweight threads”
    ● Managed by the Go runtime
    ● Minimal API (go)

  68. Full talk with much more details on the scheduler: Queues,
    Fairness and The Go Scheduler

  69. Let’s take a small example

  71. // An abridged main.go
    func main() {
        go doSomething()
        doAnotherThing()
    }
    go build -o app main.go
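
A runnable version of the abridged program might look like the sketch below. The bodies of `doSomething` and `doAnotherThing`, the `run` helper, and the channel used for waiting are all assumptions; the slide elides them. Some form of waiting is needed because a Go program exits when `main` returns, whether or not other Goroutines have run.

```go
package main

import "fmt"

// doSomething is a placeholder for the work done in the new Goroutine.
// It reports its result on a channel so the caller can wait for it.
func doSomething(done chan<- string) {
	done <- "something"
}

// doAnotherThing is a placeholder for work done on the calling Goroutine.
func doAnotherThing() string {
	return "another thing"
}

// run mirrors the slide's main(): start a Goroutine, do other work,
// then wait so the program does not exit before the Goroutine has run.
func run() (string, string) {
	done := make(chan string)
	go doSomething(done)
	other := doAnotherThing()
	return <-done, other
}

func main() {
	a, b := run()
	fmt.Println(a, b)
}
```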

  72. // What we write
    func main() {
        go doSomething()
        doAnotherThing()
    }
    // Roughly what the compiler turns it into
    func main() {
        runtime.newproc(...)
        doAnotherThing()
    }
    This is one example of compiled code calling into the runtime

  73. Let’s actually run our code.
    ./app

  74. How do we get the code “inside” Goroutines to actually
    run on our hardware?
    We need some way to map Goroutines to OS threads - user-space
    scheduling!

  75. The Go scheduler does M:N scheduling: it multiplexes M Goroutines onto N OS threads.
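
To get a feel for many Goroutines sharing a few threads, here is a small sketch. The `runMany` helper and the numbers passed to it are made up for illustration; it relies only on `runtime.GOMAXPROCS`, which limits how many OS threads execute Go code simultaneously.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// runMany starts n Goroutines while at most procs threads may execute
// Go code at the same time, and returns how many Goroutines completed.
func runMany(n, procs int) int {
	prev := runtime.GOMAXPROCS(procs) // returns the previous setting
	defer runtime.GOMAXPROCS(prev)

	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		count int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			count++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return count
}

func main() {
	// Ten thousand Goroutines, two threads running Go code at a time.
	fmt.Println(runMany(10000, 2))
}
```

All Goroutines finish even though far fewer threads are available, because the runtime decides which Goroutine each thread runs next.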

  76. How do we keep track of Goroutines that are yet to be run?

  77. Hold up… what if a bunch of Goroutines are blocked on syscalls/IO?
    Do we really need all this state per thread?

  78. How can we tackle this?
    ✨Indirection✨

  79. “Go scheduler: Implementing language with lightweight concurrency”
    by Dmitry Vyukov

  80. Interlude: Fairness

  81. Presence of resource hogs in a FIFO system leads to something known
    as the Convoy Effect.
    This is a common problem to deal with while considering fairness in scheduling.

  82. How do we deal with this in scheduling?
    ● One way is to just schedule the short-running tasks before the long-running ones.
    ○ This would require knowing the characteristics of the workload in advance.
    ● Another way - pre-emption!
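
A small worked example may make the convoy effect concrete. The job durations below are invented, and the model is deliberately simple: jobs run to completion one after another, with no pre-emption.

```go
package main

import (
	"fmt"
	"sort"
)

// avgWait returns the mean time each job spends waiting before it starts,
// when jobs run to completion one after another in the given order.
func avgWait(durations []int) float64 {
	if len(durations) == 0 {
		return 0
	}
	elapsed, total := 0, 0
	for _, d := range durations {
		total += elapsed // this job waited for everything before it
		elapsed += d
	}
	return float64(total) / float64(len(durations))
}

// shortestFirst returns a sorted copy, modelling "short tasks first".
func shortestFirst(durations []int) []int {
	out := append([]int(nil), durations...)
	sort.Ints(out)
	return out
}

func main() {
	jobs := []int{100, 1, 1} // one resource hog arrives first

	fmt.Println("FIFO average wait:", avgWait(jobs))
	fmt.Println("Shortest-first average wait:", avgWait(shortestFirst(jobs)))
}
```

With the hog at the front, the two short jobs wait 100 and 101 time units, for an average of 67. Running the short jobs first brings the average down to 1, but only because the durations were known ahead of time.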

  84. Alright, now that we have enough context, the first thing to ask
    ourselves is “how do we choose which Goroutine to run?”
    1. Check local runqueue
    2. Check global runqueue - steal in bulk
    3. Check netpoller
    4. Steal work in bulk from another p
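
The four steps can be sketched as a toy model. This is not the runtime's code: the `p` and `sched` types, the string Goroutine names, and the batch sizes are simplified assumptions meant only to show the order of the checks.

```go
package main

import "fmt"

// p is a toy stand-in for the runtime's per-processor state.
type p struct {
	runq []string // local run queue of Goroutine names
}

// sched is a toy stand-in for the global scheduler state.
type sched struct {
	global  []string // global run queue
	netpoll []string // Goroutines made ready by network I/O
	ps      []*p
}

// pop removes and returns the first element of q, if any.
func pop(q *[]string) (string, bool) {
	if len(*q) == 0 {
		return "", false
	}
	g := (*q)[0]
	*q = (*q)[1:]
	return g, true
}

// findRunnable follows the four steps in order and reports where the
// chosen Goroutine came from.
func (s *sched) findRunnable(self *p) (string, string) {
	// 1. Local run queue.
	if g, ok := pop(&self.runq); ok {
		return g, "local"
	}
	// 2. Global run queue: take a batch, run one, keep the rest locally.
	if len(s.global) > 0 {
		n := len(s.global)/len(s.ps) + 1
		if n > len(s.global) {
			n = len(s.global)
		}
		self.runq = append(self.runq, s.global[1:n]...)
		g := s.global[0]
		s.global = s.global[n:]
		return g, "global"
	}
	// 3. Netpoller.
	if g, ok := pop(&s.netpoll); ok {
		return g, "netpoll"
	}
	// 4. Steal half of another p's queue (rounded up).
	for _, victim := range s.ps {
		if victim == self || len(victim.runq) == 0 {
			continue
		}
		n := (len(victim.runq) + 1) / 2
		stolen := victim.runq[:n]
		victim.runq = victim.runq[n:]
		self.runq = append(self.runq, stolen[1:]...)
		return stolen[0], "stolen"
	}
	return "", "idle"
}

func main() {
	p0 := &p{}
	p1 := &p{runq: []string{"g1", "g2", "g3", "g4"}}
	s := &sched{global: []string{"g5"}, netpoll: []string{"g6"}, ps: []*p{p0, p1}}
	for i := 0; i < 4; i++ {
		g, from := s.findRunnable(p0)
		fmt.Println(g, from)
	}
}
```

Stealing in bulk means a p that just ran dry ends up with several Goroutines in its local queue, so its next few lookups stop at step 1.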

  85. What about the convoy effect? Will that be taken care of?
    We spoke about pre-emption earlier, let’s see how Go did it then and now.

  86. Non Co-operative Pre-emption
    ● Each Goroutine is given a time-slice of 10ms, after which pre-emption is attempted.
    ○ 10ms is a soft limit.
    ● Pre-emption occurs by sending a signal to the thread running the Goroutine that needs to
    be pre-empted.
    ○ Similar to interrupt-based pre-emption in the kernel.
    ● The signal used is SIGURG.
    “Pardon the Interruption: Loop Preemption in Go 1.14” by Austin Clements

  87. Non Co-operative Pre-emption
    ● Who sends this signal?
    sysmon

  88. Non Co-operative Pre-emption
    ● sysmon
    ○ Daemon running without a p
    ○ Issues pre-emption requests for long-running Goroutines
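
One way to observe pre-emption from the outside is the sketch below. The `sleepNextToSpinner` helper and the 100ms sleep are made up for illustration. It limits the program to a single p, starts a Goroutine that spins in a tight loop, and checks that the calling Goroutine still gets to run again. On Go 1.14 and later this is expected to return; with asynchronous pre-emption disabled (for example via GODEBUG=asyncpreemptoff=1) it may hang instead.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
	"time"
)

// sleepNextToSpinner runs a busy-looping Goroutine on the only available p
// and returns how long the caller's 100ms sleep actually took.
func sleepNextToSpinner() time.Duration {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)

	var stop atomic.Bool
	go func() {
		// Tight loop that never blocks or yields voluntarily.
		for !stop.Load() {
		}
	}()
	defer stop.Store(true) // let the spinner exit once we are done

	start := time.Now()
	// While we sleep, the spinner occupies the p. We only run again if
	// something takes the p away from it.
	time.Sleep(100 * time.Millisecond)
	return time.Since(start)
}

func main() {
	fmt.Println("woke up after", sleepNextToSpinner())
}
```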

  89. Where does the pre-empted Goroutine go?

  90. Awesome! We now have a kinda-sorta good idea about what happens
    under the hood. Yay!
    But all this is taken care of by the runtime itself. Are there any knobs we can turn to try
    and control some of this behaviour?

  91. runtime APIs to interact with the scheduler
    ● Try and treat the runtime as a black-box as much as possible!
    ● (It’s a good thing that) Not a lot of exposed knobs to control the runtime.
    ● Whatever is available should be understood thoroughly before using in code.

  93. runtime APIs to interact with the scheduler
    ● NumGoroutine()
    ● GOMAXPROCS()
    ● Gosched()
    ● Goexit()
    ● LockOSThread()/UnlockOSThread()
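
A brief sketch of the first four calls in use. The `goexitRunsDefers` helper is hypothetical and exists only to show that `Goexit` ends the calling Goroutine after running its deferred calls.

```go
package main

import (
	"fmt"
	"runtime"
)

// goexitRunsDefers starts a Goroutine that calls runtime.Goexit and
// reports whether its deferred call ran and whether code after Goexit ran.
func goexitRunsDefers() (deferred, reachedEnd bool) {
	done := make(chan struct{})
	go func() {
		defer close(done)
		defer func() { deferred = true }()
		runtime.Goexit()
		reachedEnd = true // not executed: Goexit ends this Goroutine
	}()
	<-done
	return deferred, reachedEnd
}

func main() {
	// GOMAXPROCS(0) only queries the current setting.
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0))

	// NumGoroutine counts the Goroutines that currently exist.
	fmt.Println("Goroutines:", runtime.NumGoroutine())

	// Gosched yields the processor so that other Goroutines can run.
	runtime.Gosched()

	fmt.Println(goexitRunsDefers())
}
```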

  94. LockOSThread()/UnlockOSThread()
    ● Wires the calling Goroutine to its current OS thread.
    ● Primarily used when the Goroutine changes the underlying thread’s state.

  95. LockOSThread()/UnlockOSThread()
    ● Weaveworks has an excellent case-study on this:
    ○ https://www.weave.works/blog/linux-namespaces-and-go-don-t-mix
    ○ https://www.weave.works/blog/linux-namespaces-golang-followup
    ● Let’s look at the fineprint.

  96. LockOSThread()/UnlockOSThread()
    ● Acts like a “taint” indicating thread state was changed.
    ● While the thread is locked:
    ○ No other Goroutine can be scheduled on this thread until UnlockOSThread() is called the same number
    of times as LockOSThread().
    ○ No new thread can be created from the locked thread.
    ● Don’t create Goroutines from a locked one if they are expected to run on the modified thread state.
    ● If a Goroutine exits before unlocking the thread, the thread is terminated and is not used for
    scheduling anymore.
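
A common shape for using these calls is sketched below. The `onLockedThread` helper is hypothetical; it runs a function on a dedicated Goroutine that stays wired to one OS thread for the duration of the call.

```go
package main

import (
	"fmt"
	"runtime"
)

// onLockedThread runs fn on a fresh Goroutine that is wired to its OS
// thread while fn executes, and waits for it to finish.
func onLockedThread(fn func()) {
	done := make(chan struct{})
	go func() {
		defer close(done)
		runtime.LockOSThread()
		// Unlocking assumes fn left the thread state unchanged. If fn
		// had modified thread state, skipping the unlock would make the
		// runtime terminate the thread when this Goroutine exits.
		defer runtime.UnlockOSThread()
		fn()
	}()
	<-done
}

func main() {
	ran := false
	onLockedThread(func() { ran = true })
	fmt.Println("ran on a locked thread:", ran)
}
```

Keeping the lock and unlock inside one short-lived Goroutine makes it harder to accidentally start other Goroutines that expect to inherit the modified thread state.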

  97. So where does eBPF come in?

  98. Demo: https://youtu.be/X0VnDPQRCo4?t=4938

  99. Conclusion
    ● eBPF: A fast and safe way to extend the functionality of the kernel
    ● The Go scheduler is a distributed, best-effort preemptive scheduler
    ● The Go scheduler (and the runtime in general) interacts with the OS using syscalls
    ● Information about the Go runtime that would otherwise not be visible can be captured by
    using eBPF to observe the syscalls it makes.
    ● Latency-sensitive, aggressive optimisations are maybe possible at the kernel level, enabled by eBPF?

  100. https://github.com/MadhavJivrajani/gse/tree/main/example/preemption_ebpf

  101. References
    ● Scalable Go Scheduler Design Doc
    ● Go scheduler: Implementing language with lightweight concurrency
    ● The Scheduler Saga
    ● Analysis of the Go runtime scheduler
    ● Non-cooperative goroutine preemption
    ○ Pardon the Interruption: Loop Preemption in Go 1.14
    ● go/src/runtime/{ proc.go, proc_test.go, preempt.go, runtime2.go, ...}
    ○ And their corresponding git blames
    ● Go's work-stealing scheduler

  102. References
    ● Everything eBPF
    ● What Is eBPF - O'Reilly Book
    ● Maps in eBPF
    ● CO-RE and Portability in eBPF
