2 servers in different config
● Same server (WEBrick 1.7.0, ruby 3.1.2)
● Same bench parameter
● Different value of net.core.somaxconn
○ 4,096 vs 500: how does this difference take effect?
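One known mechanism behind this setting: on Linux, the backlog passed to listen(2) is silently capped at net.core.somaxconn, so identical server code can get a different accept queue length on each host. The following is a minimal sketch of that rule, assuming Linux with /proc mounted; `current_somaxconn` and `effective_backlog` are hypothetical helper names, not part of WEBrick or of the talk.

```ruby
require 'socket'

SOMAXCONN_PATH = '/proc/sys/net/core/somaxconn'

# Read the current sysctl value; returns nil where /proc is unavailable.
def current_somaxconn(path = SOMAXCONN_PATH)
  Integer(File.read(path).strip)
rescue SystemCallError
  nil
end

# Linux truncates the listen(2) backlog to net.core.somaxconn.
def effective_backlog(requested, somaxconn)
  [requested, somaxconn].min
end

server = TCPServer.new('127.0.0.1', 0) # port 0: let the OS pick a free port
server.listen(4096)                    # ask for a large backlog
limit = current_somaxconn
if limit
  puts "somaxconn=#{limit}, effective backlog=#{effective_backlog(4096, limit)}"
else
  puts 'somaxconn not readable on this system'
end
server.close
```

With the two values from the slide, a request for 4,096 keeps its full backlog on the first host and is cut to 500 on the second.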
#2 How it works
[Diagram, very simplified (*). Labels: Scripting, Bytecode, BPF VM, BPF Map (… or perf buffer, etc.), User Interface, Collecting Kernel Data…; regions: The Userland, The Kingdom of Kernel]
ref. https://speakerdeck.com/chikuwait/learn-ebpf?slide=17 by Yuki Nakata, 2020
emoji from https://github.com/twitter/twemoji/tree/master/assets
What is RbBCC?
● A: BCC for Ruby (libbcc FFI binding for Ruby)
● WHAT is BCC?
○ BPF Compiler Collection
○ An SDK for building BPF tools in scripting languages (Python and Lua are officially supported)
○ Ruby is not on its supported list, so I'm developing RbBCC
I'm going to show how to use it: how to write BPF code in Ruby.
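To give a feel for what "BPF code in Ruby" looks like, here is a minimal sketch patterned on the usual BCC "hello world". It assumes the rbbcc gem and libbcc are installed and that the process runs as root; the `RbBCC::BCC.new(text:)` and `trace_print` calls reflect my understanding of the gem's interface and should be checked against its documentation. `run_tracer` and the `RUN_BPF` switch are hypothetical names used only for this illustration.

```ruby
# BPF program in restricted C. By BCC convention, a function named
# kprobe__<symbol> is attached to that kernel function automatically.
PROGRAM = <<~CLANG
  int kprobe__sys_clone(void *ctx)
  {
    bpf_trace_printk("Hello, World!\\n");
    return 0;
  }
CLANG

def run_tracer(program)
  require 'rbbcc'                     # assumption: rbbcc gem and libbcc are installed
  bpf = RbBCC::BCC.new(text: program) # compile and load the BPF program
  puts 'Tracing clone()... Ctrl-C to end.'
  bpf.trace_print                     # stream lines from the kernel trace pipe
rescue LoadError
  warn 'rbbcc is not installed; skipping'
end

# Loading BPF needs root, so only run when explicitly requested.
run_tracer(PROGRAM) if ENV['RUN_BPF'] == '1'
```

The Ruby side only loads the program and reads its output; the part that runs in the kernel is the small C snippet.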
#2 tracepoint (for kernel)
● Not the same thing as Ruby's TracePoint class
● A static entry point for tracing kernel events
● Intended to stay stable across future versions of Linux
○ By contrast, kprobe attaches to a kernel symbol, which can change between versions, so kprobe-based tools may be unstable.
#3 uprobe
● Collecting for rb_str_new(): (function return timestamp - function entry timestamp)
● This difference is the latency of one function call
● function entry = uprobe, function return = uretprobe
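The entry/return pairing above can be sketched in plain Ruby without any BPF. The events below are made-up sample data, and the hash keyed by thread ID stands in for the BPF hash map that a real uprobe/uretprobe pair would share; `latencies` and `log2_histogram` are hypothetical helper names.

```ruby
# Simulated probe events: [kind, thread_id, timestamp_ns].
EVENTS = [
  [:entry, 1, 1_000], [:entry, 2, 1_200],
  [:return, 1, 1_900], [:return, 2, 5_300],
  [:entry, 1, 6_000], [:return, 1, 6_100]
].freeze

# Pair each return with the pending entry on the same thread.
def latencies(events)
  started = {} # thread_id => entry timestamp (plays the role of a BPF hash map)
  events.each_with_object([]) do |(kind, tid, ts), out|
    if kind == :entry
      started[tid] = ts
    else
      t0 = started.delete(tid)
      out << ts - t0 if t0 # skip returns whose entry was not observed
    end
  end
end

# Bucket by floor(log2(ns)), like the power-of-two histograms BCC tools print.
def log2_histogram(values)
  values.each_with_object(Hash.new(0)) { |v, h| h[v.bit_length - 1] += 1 if v > 0 }
end

log2_histogram(latencies(EVENTS)).sort.each do |slot, count|
  puts "#{2**slot}..#{2**(slot + 1) - 1} ns: #{count}"
end
```

Keying by thread ID matters because two threads can be inside the same function at once; a single global "last entry" timestamp would mix their calls.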
#4 USDT
● USDT: Userspace Statically Defined Tracepoint
○ Probe points that the author of a program embedded in advance
○ cf. uprobe traces real function calls dynamically
● USDT is to uprobe what tracepoint is to kprobe

              Dynamic   Static
Kernel space  kprobe    tracepoint
User space    uprobe    USDT
#4 USDT
● Ruby's USDT probes (originally for DTrace, but available via BPF on Linux)
○ Japanese article: https://magazine.rubyist.net/articles/0041/0041-200Special-dtrace.html
○ https://rubyreferences.github.io/rubyref/advanced/dtrace.html
#4 USDT
● Example: USDT probes related to GC:
○ usdt:./bin/ruby:ruby:gc__mark__begin
○ usdt:./bin/ruby:ruby:gc__mark__end
○ usdt:./bin/ruby:ruby:gc__sweep__begin
○ usdt:./bin/ruby:ruby:gc__sweep__end
● They can be used to trace GC latency:
○ (mark_end_time - mark_begin_time)
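As a rough illustration of the begin/end subtraction, the sketch below sums the time spent per GC phase from a list of probe firings. The timestamps are invented sample data rather than measurements, and `phase_totals` is a hypothetical helper; a real tool would receive these events from the probes listed above.

```ruby
# Hypothetical samples: [probe name, timestamp in ns], in firing order.
SAMPLES = [
  ['gc__mark__begin', 10_000], ['gc__mark__end', 14_500],
  ['gc__sweep__begin', 14_600], ['gc__sweep__end', 15_100],
  ['gc__mark__begin', 90_000], ['gc__mark__end', 93_000]
].freeze

# Sum (end - begin) per phase name ("mark", "sweep").
def phase_totals(samples)
  open_at = {} # phase => timestamp of the pending begin probe
  samples.each_with_object(Hash.new(0)) do |(probe, ts), totals|
    _prefix, phase, edge = probe.split('__')
    if edge == 'begin'
      open_at[phase] = ts
    elsif (t0 = open_at.delete(phase))
      totals[phase] += ts - t0
    end
  end
end

phase_totals(SAMPLES).each { |phase, ns| puts "#{phase}: #{ns} ns" }
```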
Summary:
● BPF observability has 4 key tracing sources:

              Dynamic   Static
Kernel space  kprobe    tracepoint
User space    uprobe    USDT

● RbBCC can access all four. Just use Ruby (and a little C).
● Use Ruby to trace Ruby.
Point 1: Reduce iter()/String
● Then measure!

               malloc   calloc   free
Ruston Before  750197   22       753491
Ruston After   110197   22       113596
cf. C JSON     20206    10022    34142

(*) N = 10,000
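The size of the improvement is easier to see with a little arithmetic on the counts above. This sketch only re-uses the numbers from the table; reading N as "number of parsed items" is an assumption, and the constant names are hypothetical.

```ruby
N = 10_000 # from the slide; interpreted here as the number of items (assumption)

COUNTS = {
  before: { malloc: 750_197, calloc: 22,     free: 753_491 },
  after:  { malloc: 110_197, calloc: 22,     free: 113_596 },
  c_json: { malloc: 20_206,  calloc: 10_022, free: 34_142 }
}.freeze

# malloc calls removed by the change, in total and per unit of N.
SAVED = COUNTS[:before][:malloc] - COUNTS[:after][:malloc]

puts "malloc calls saved: #{SAVED} (#{SAVED / N} per unit of N)"
puts format('reduction: %.1f%%', 100.0 * SAVED / COUNTS[:before][:malloc])
puts format('malloc calls relative to C JSON: %.1fx',
            COUNTS[:after][:malloc].to_f / COUNTS[:c_json][:malloc])
```

The difference comes out to exactly 640,000 calls, i.e. 64 per unit of N, which suggests the removed allocations were on the per-item path.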
The result #2
● Comparison before / after all changes, for N = 50,000

                  user      system    total
Ruston Before     0.277292  0.000000  0.277292
Ruston After All  0.051765  0.000000  0.051765
cf. C JSON        0.054263  0.000000  0.054263
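The user / system / total columns have the same shape as the output of Ruby's standard Benchmark library, which is presumably how they were collected. The sketch below shows that kind of measurement for the stock JSON parser and then computes the ratios from the totals on the slide. The input document is a hypothetical stand-in, not the workload used in the talk.

```ruby
require 'benchmark'
require 'json'

n = 50_000
# Hypothetical input; the actual benchmark document is not shown on the slide.
doc = JSON.generate((1..n).map { |i| { 'id' => i, 'name' => "item#{i}" } })

tms = Benchmark.measure { JSON.parse(doc) }
puts format('%.6f %.6f %.6f', tms.utime, tms.stime, tms.total)

# Totals copied from the slide (seconds).
BEFORE = 0.277292
AFTER  = 0.051765
C_JSON = 0.054263

puts format('speedup after all changes: %.2fx', BEFORE / AFTER)
puts format('time relative to C JSON:   %.2fx', AFTER / C_JSON)
```

By these totals the optimized version is roughly 5.4 times faster than before and slightly ahead of the C implementation for this case.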
Lessons learned
● Existing tools are useful (e.g. perf, strace, gdb…)
● To pin down a bottleneck in detail, writing a simple BPF tool is effective.
● uprobe is an entry point for x-raying the performance of native programs, e.g. C, C++ and Rust (also … Zig?)
● Just keep this in mind: measure, reproduce, measure.
Acknowledgements:
● The book “Linux Observability with BPF”
○ by David Calavera, Lorenzo Fontana
○ https://www.oreilly.com/library/view/linux-observability-with/9781492050193/
● Brendan Gregg for his superb articles:
○ https://www.brendangregg.com/bpf-performance-tools-book.html
● Masashi Misono for his Japanese introduction to BPF
○ https://atmarkit.itmedia.co.jp/ait/articles/2004/09/news006.html
Acknowledgements:
● RbBCC received a Ruby Association Grant in 2019
○ report: https://www.ruby.or.jp/ja/news/20200508
○ Mentored by Koichi “ko1” Sasada (Cookpad, Inc.)
○ With advice from Ryosuke Matsumoto (Sakura Internet), Takao Shimayoshi and Yoshiaki Kasahara (Kyushu Univ.)