Tracing BitVisor with bpftrace
Masanori Misono
The University of Tokyo
2020-11-30 BitVisor Summit 9
Slide 2
About Myself
• Masanori Misono (Shinagawa Laboratory, The University of Tokyo)
• GitHub: @mmisono
• A committer of bpftrace (130+ commits)
• Attending BitVisor Summit since 2016
• 2017 : Packet filtering (+α) in BitVisor with BPF
• 2018 : bitvisor.ko : BitVisor as a module
• 2019 : vIOMMU implementation in BitVisor
Slide 3
Motivation
• Performance evaluation is always of critical importance
• Performance evaluation of BitVisor itself is not so easy
• VMEXIT count, latency
• When a shadow driver is working
• …
• How can we get this?
• We want a tracing tool for BitVisor!
Slide 4
From my 2017 presentation…
Slide 5
What is (e)BPF?
• Linux has extended BPF (hence eBPF) and uses it in various ways
• tracing, networking, security, …
Slide 6
Why is (e)BPF used in Linux?
Slide 7
From my 2017 presentation… (cont’d)
Result
I ported basic BPF functionality to BitVisor
Slide 8
What does it do?
• Use bcc to compile the BPF program
• Implement a hypercall (vmcall/vmmcall) to load a BPF program
• Implement a hypercall to get a BPF map
• Statically instrument tracing points (like Linux’s tracepoints)
Slide 9
It works! But…
• The safety problem
• The verifier is limited (Linux’s verifier is ~10k LOC)
• BPF (user code) runs in VMX root mode, ring 0
What if the program has a bug?
Slide 10
It works! But… (cont’d)
• The implementation is somewhat specific to BitVisor
• We modified BCC to generate dedicated BPF code for BitVisor
• BCC (and other tools) are actively developed
• Can we reduce the modifications to the userland program?
Slide 11
3 years later…
Slide 12
BPF is more and more popular!
https://ebpf.io/summit-2020/
http://www.brendangregg.com/bpf-performance-tools-book.html
https://cloud.google.com/blog/products/containers-kubernetes/bringing-ebpf-and-cilium-to-google-kubernetes-engine
https://gihyo.jp/magazine/SD/archive/2020/202010
Software Design, October 2020 issue
November 6, 2019
August 20, 2020
Slide 13
It’s time to revisit the problem!
Slide 14
It’s time to revisit the problem!
… BCC is really great,
but is there another popular BPF tracing tool now?
Slide 16
bpftrace
(from bpftrace.org)
※ There are other useful tools, of course
Needs only a few lines of script
※ Unofficial mascot
Slide 17
bpftrace
(from bpftrace.org)
※ There are other useful tools, of course
Then get the result!
※ Unofficial mascot
Slide 18
Let’s try to use bpftrace for tracing BitVisor!
Slide 19
Goal
(※ basically the same as in 2017)
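[Diagram: bpftrace, running in the guest OS, loads a BPF program into BitVisor and retrieves data from the BPF map when necessary. Inside BitVisor, an event calls the BPF callback in the BPF VM, which stores/retrieves data in the BPF map through helper functions.]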
Slide 20
Challenge
1. Provide safe execution
2. Use the BPF code that bpftrace generates as is
Slide 21
Challenge
1. Provide safe execution
2. Use the BPF code that bpftrace generates as is
Slide 22
Safe Execution
• Implementing or porting Linux’s verifier is very hard
• Our approach
• Safe execution by running the BPF program in VMX root mode ring 3 (a.k.a. a protection domain)
[Diagram: in VMX root mode, the BitVisor main thread runs in ring 0; an event there invokes the BPF VM, which runs together with the BPF map and helper functions in a ring 3 protection domain.]
Slide 23
Comparison with 2017’s
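[Diagram, 2017: the BPF VM, BPF map, and helper functions all run in ring 0 together with the BitVisor main thread, which raises the event.]
[Diagram, 2020: the BitVisor main thread stays in ring 0; the BPF VM, BPF map, and helper functions run in a ring 3 protection domain.]
The 2020 design adds messaging overhead, but provides safety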
Slide 24
Challenge
1. Provide safe execution
2. Use the BPF code that bpftrace generates as is
Slide 25
Implement the same helper functions as Linux
• A BPF program can call external functions with the BPF CALL instruction
• Implement the same helper functions
• CALL 1 : BPF_MAP_LOOKUP_ELEM
• CALL 2 : BPF_MAP_UPDATE_ELEM
• CALL 3 : BPF_MAP_DELETE_ELEM
• …
[Diagram: the BPF VM calls helper functions, which access the BPF map and other functions.]
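A minimal sketch of how a VM can dispatch CALL instructions to helpers by number. The helper IDs 1 to 3 follow Linux's numbering as listed above; the class and function names (IntMap, make_helper_table, call_helper) are hypothetical and this is a Python model, not the BitVisor code.

```python
# Helper IDs follow Linux's numbering (1 = lookup, 2 = update, 3 = delete).
# All class and function names below are hypothetical, for illustration only.
BPF_FUNC_MAP_LOOKUP_ELEM = 1
BPF_FUNC_MAP_UPDATE_ELEM = 2
BPF_FUNC_MAP_DELETE_ELEM = 3


class IntMap:
    """A toy BPF map holding integer keys and integer values."""

    def __init__(self):
        self._data = {}

    def lookup(self, key):
        # Returns None when the key is absent (Linux returns a NULL pointer).
        return self._data.get(key)

    def update(self, key, value):
        self._data[key] = value
        return 0

    def delete(self, key):
        # Returns 0 on success, -1 if the key did not exist.
        if key in self._data:
            del self._data[key]
            return 0
        return -1


def make_helper_table(bpf_map):
    """Map each CALL immediate to the function the VM should invoke."""
    return {
        BPF_FUNC_MAP_LOOKUP_ELEM: bpf_map.lookup,
        BPF_FUNC_MAP_UPDATE_ELEM: bpf_map.update,
        BPF_FUNC_MAP_DELETE_ELEM: bpf_map.delete,
    }


def call_helper(table, helper_id, *args):
    """What the VM does when it decodes a CALL instruction."""
    if helper_id not in table:
        raise ValueError(f"unsupported helper id {helper_id}")
    return table[helper_id](*args)
```

Because the numbers match Linux's, code generated for Linux can call the same IDs without being regenerated; only the functions behind the table differ.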
Slide 26
Implementation
• Port ubpf to BitVisor (the same as before)
• ubpf is a userland eBPF VM: https://github.com/iovisor/ubpf
• Implement basic BPF helper functions in a protection domain
• Modify bpftrace so that it issues a vmmcall instead of a system call when interacting with BPF functionality
• I did not change any of the BPF code generation part of bpftrace!
Slide 27
How to notify events?
• Statically define events (like Linux’s tracepoints), the same as before
• Example
[Code screenshot: the added part that notifies an event]
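A minimal sketch of a statically defined event, modeled in Python. The event name and the functions attach, notify_event, and handle_vmexit are hypothetical stand-ins; the actual BitVisor instrumentation is C code and is not reproduced here.

```python
# Hypothetical names throughout: EVENT_VMEXIT, attach, notify_event.
EVENT_VMEXIT = "vmexit"

_handlers = {}


def attach(event_name, handler):
    """Register a callback (standing in for a loaded BPF program)."""
    _handlers.setdefault(event_name, []).append(handler)


def notify_event(event_name, *args):
    """The call placed at an instrumentation point; a no-op if nothing is attached."""
    for handler in _handlers.get(event_name, ()):
        handler(*args)


def handle_vmexit(exit_reason):
    """Stand-in for existing hypervisor code with one added notify call."""
    notify_event(EVENT_VMEXIT, exit_reason)  # the added line
    return exit_reason  # the original exit handling would continue here
```

The point of the static approach is that each traceable location needs only one added call, and the set of events is fixed at build time.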
Slide 28
BitVisor Ring0 ↔ Protection Domain
• Use msghandler to call the BPF VM and pass data
• A mechanism for communicating with other threads/processes using callbacks
[Diagram: an event in the BitVisor main thread (ring 0) calls sendmsg() to invoke the BPF VM in the protection domain (ring 3), which holds the BPF map and helper functions.]
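A rough Python model of the ring 0 to ring 3 message passing, assuming the event argument is a single 64-bit integer. ProtectionDomain, send_event, and the two sample programs are hypothetical; the Python exception handling only imitates the isolation that ring 3 provides in the real system.

```python
import struct


class ProtectionDomain:
    """Toy stand-in for the ring 3 side: owns the BPF program and its map."""

    def __init__(self, program):
        self._program = program
        self.counts = {}  # the "BPF map": exit reason -> count

    def msghandler(self, payload):
        # Only bytes cross the boundary; unpack them into the event argument.
        (exit_reason,) = struct.unpack("<Q", payload)
        try:
            self._program(self.counts, exit_reason)
            return 0
        except Exception:
            # A faulty program fails here without taking down the sender.
            return -1


def send_event(domain, exit_reason):
    """Ring 0 side: serialize the argument and send it as a message."""
    return domain.msghandler(struct.pack("<Q", exit_reason))


def count_exits(counts, exit_reason):
    """A well-behaved program: count events per exit reason."""
    counts[exit_reason] = counts.get(exit_reason, 0) + 1


def buggy_program(counts, exit_reason):
    """A program with a bug, to show that the failure stays contained."""
    raise RuntimeError("bug in the traced program")
```

The serialization step is where the messaging overhead comes from: every event costs a copy and a domain switch that the 2017 ring 0 design did not pay.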
Slide 29
BitVisor ↔ bpftrace Communication
• bpftrace interacts with the kernel through the bpf(2) system call
• Implement a corresponding hypercall (vmcall/vmmcall) for bpf(2)
[Diagram: bpftrace in the guest OS issues bpf(BPF_PROG_LOAD) and bpf(BPF_MAP_LOOKUP_ELEM) requests to BitVisor, which holds the BPF map.]
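A minimal sketch of a hypercall handler that mirrors bpf(2) by dispatching on the command number. The command values are those of Linux's bpf(2); the HypervisorBpf class, its dictionary-based attr argument, and the file-descriptor scheme are assumptions made for illustration.

```python
# Command numbers as defined for Linux's bpf(2) system call.
BPF_MAP_CREATE = 0
BPF_MAP_LOOKUP_ELEM = 1
BPF_MAP_UPDATE_ELEM = 2
BPF_PROG_LOAD = 5


class HypervisorBpf:
    """Hypothetical hypervisor-side state behind the bpf-like hypercall."""

    def __init__(self):
        self._maps = {}
        self._progs = {}
        self._next_fd = 3

    def _new_fd(self):
        fd = self._next_fd
        self._next_fd += 1
        return fd

    def hypercall(self, cmd, attr):
        """Dispatch one request; attr is a dict standing in for union bpf_attr."""
        if cmd == BPF_MAP_CREATE:
            fd = self._new_fd()
            self._maps[fd] = {}
            return fd
        if cmd == BPF_PROG_LOAD:
            fd = self._new_fd()
            self._progs[fd] = bytes(attr["insns"])
            return fd
        if cmd in (BPF_MAP_UPDATE_ELEM, BPF_MAP_LOOKUP_ELEM):
            bpf_map = self._maps.get(attr["map_fd"])
            if bpf_map is None:
                return -1  # unknown map descriptor
            if cmd == BPF_MAP_UPDATE_ELEM:
                bpf_map[attr["key"]] = attr["value"]
                return 0
            return bpf_map.get(attr["key"])  # None if the key is absent
        return -1  # unsupported command
```

Keeping the command set identical to bpf(2) is what lets the userland change stay small: the tool swaps the transport (system call to hypercall) and leaves the requests themselves alone.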
Slide 30
Implementation (cont’d)
• Total modifications
• BitVisor ~1000 LOC (excluding ubpf and third-party libraries)
• bpftrace ~300 LOC
• This includes comments, blank lines, and debug code. The actual amount of modification is much smaller
Slide 31
Execution Overview
① Loading BPF Program
1. bpftrace (in the guest OS) compiles and loads the BPF program via vmcall
2. The vmcall handler (BitVisor main thread, ring 0) registers the program with the eBPF VM (protection domain, ring 3, alongside the helper functions and the map)
Slide 32
Execution Overview
② Event handling
• An event occurs in the BitVisor main thread (ring 0)
• The main thread notifies the event to the protection domain (ring 3)
• The eBPF VM runs the BPF program, which uses helper functions and the map
Slide 33
Execution Overview
③ Retrieve tracing information
1. bpftrace (in the guest OS) requests the map data from the vmcall handler
2. The vmcall handler (ring 0) retrieves the map from the protection domain (ring 3)
3. The map data is returned to bpftrace
Slide 34
Demo
Slide 35
Trace Script and the Result
1: External Interrupt
7: Interrupt Window
31: RDMSR
18: VMCALL
In BitVisor
The trace script
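A small sketch for reading a per-exit-reason count map, using only the four exit reason numbers named on this slide. The function format_counts is hypothetical, and any counts passed to it are the caller's own data; no measured results are implied.

```python
# Exit reason numbers and names as listed on the slide.
VMX_EXIT_REASONS = {
    1: "External Interrupt",
    7: "Interrupt Window",
    18: "VMCALL",
    31: "RDMSR",
}


def format_counts(counts):
    """Render {exit_reason: count} as text lines, sorted by ascending count."""
    lines = []
    for reason, n in sorted(counts.items(), key=lambda kv: kv[1]):
        name = VMX_EXIT_REASONS.get(reason, "unknown")
        lines.append(f"{reason:>3} ({name}): {n}")
    return lines
```

Reasons not in the table are printed as "unknown" rather than dropped, so an unexpected exit reason still shows up in the output.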
Slide 36
The Generated Program
Slide 37
Performance Evaluation
• How much overhead does message passing add? (ring0 ↔ ring3)
• I developed and tested everything on VMware Fusion on macOS with nested virtualization
• Therefore, there is no proper evaluation to present today :(
• One day I want to do a proper evaluation
Slide 38
Discussion and Future Work
• The current implementation is very preliminary
• Only supports maps with integer key/value pairs
• I guess it’s not so hard to extend this
• What can we do once this project works well enough?
• Guest-host cooperative tracing
• BitVisor introspection according to the guest behavior
• …
• I think there must be a lot of fun things to do!
Slide 39
Conclusion
• Proposed another way to trace BitVisor’s events
• By utilizing a protection domain, BPF programs run safely in VMX root mode
• Extended bpftrace so that users can trace BitVisor with it
• Let’s enjoy tracing!