"Story of Rucy" on RubyKaigi takeout 2021

KONDO Uchio
September 09, 2021

Rucy is a Ruby compiler that targets BPF. Rucy is named after "Ru-C".

This talk was presented at RubyKaigi Takeout 2021:
https://rubykaigi.org/2021-takeout/presentations/udzura.html

Transcript

  1. Story of Rucy:
    “Compile” a Ruby Script into BPF

  2. Uchio Kondo:
    Senior Principal Engineer @ GMO Pepabo
    DataOps & Dev Productivity Engineering Team
    RubyKaigi Speaker (’16, ’18 && ’19)
    RubyKaigi Organizer (’19 @ Fukuoka)
    Hacker Supporter @ Fukuoka City Engineer’s Café
    Ruby, mruby, Rust, Containers, Kernel, Duolingo

  3. ToC
    • 1) What is BPF? Has it something to do with us?
    • 2) What is Rucy for, and Why?
    • 3) What is going on behind Rucy
    • Overview, BPF Opcode and mruby VM Opcode, Transpilation
    • 4) Conclusion, demo aiming at the future
    This is what I will talk about today.
    A lot of background knowledge is needed before getting to Rucy, so I will explain it step by step.

  4. What is BPF?
    Has it something to do with us?

  5. BPF is:
    • One of the latest state-of-the-art Linux technologies
    • Used, for example, in the following areas:
    • Networking
    • Server Tracing (Observability)
    • Do you know DTrace? What BPF can do is much the same.
    • Device Access Auditing for containers
    BPF is one of the cutting-edge Linux technologies. It is used in networking, performance tracing,
    and inside device access control for containers.

  6. How BPF works: Observability
    [Diagram: a userspace program (libbpf) loads a BPF binary into the kernel, where it is verified and run on the BPF VM, attached to kprobes, tracepoints, and so on. The BPF program filters and aggregates events into a BPF Map; the userspace program receives the information and visualizes it.]
    For tracing, the "BPF binary" is loaded into the kernel, gathers kernel information, and returns it to userspace.

  7. How BPF works: Device access auditing
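    [Diagram: a process in a cgroup tries to access devices (/dev/urandom, /dev/null, /dev/zero, /dev/tty, /dev/sdaXX, /dev/sdbXX, /dev/sdcXX, ...). A BPF access filter attached to the cgroup in the kernel is given the access context and decides whether to allow the access.]
    For device access control, a BPF binary program is loaded onto the cgroup,
    and it filters using the context passed in at access time.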

  8. What is Rucy for?
    And... why?

  9. To begin with: Can BPF be used via Ruby?
    • The answer is Yes.
    • I created RbBCC
    • BCC is a BPF SDK for Python, and Lua...
    • I needed a Ruby version, so I created one.
    • BCC uses FFI (ctypes) to access libbcc; RbBCC uses Fiddle instead.
    • RbBCC can do basic traces, as BCC does.
    • RbBCC is one of the results of a Ruby Association Grant in 2019.
    (*) https://github.com/udzura/rbbcc
    Thanks, ko1-san
    Can BPF be used from Ruby? The answer is "Yes".
    You use something called RbBCC. It was also funded by a Ruby Association grant.

  10. BCC’s pitfalls
    • Tools with (Rb)BCC do the whole BPF compilation process
    • When it starts tracing - Just In Time
    • It causes:
    • 1) overhead
    • 2) large number of files required to build
    • The compilation process needs LLVM, clang and kernel headers.
    However, BCC (RbBCC) compiles the BPF program on the spot when tracing starts.
    This causes problems such as overhead and a bloated environment needed for compilation.

  11. Brand-new BPF tool for Ruby
    • The Linux community recommends precompiling the "BPF binary"
    • Because tools that build on the fly are problematic
    • Tools should be served as "One Binary"
    ✴ There are also mechanisms to absorb differences in execution environments such as the kernel version.
    ✴ These portable one-binary commands are called BPF CO-RE (Compile Once, Run Everywhere).
    BPF Tracing Program = Userspace Part (C, Rust, Go...) + BPF Binary (Bytecodes)
    So using a precompiled "BPF binary" is recommended.
    The BPF binary can be linked into the main program and shipped as one binary.

  12. “One Binary” in Ruby
    • mruby seems to be the best way
    • to achieve the "one binary for everything" goal.
    • But the "BPF binary" can, in effect, only be created in C
    With mruby, it looks possible to build one binary.
    However, we also have to prepare the "BPF binary", and in practice that can only be written in C.

  13. Making a BPF binary
    • We need to pass the C code to the clang command:
    • I decided to make a tool that enables Rubyists
    • to make "BPF binaries" without writing any C code!
    ✴ In fact, Rust can create BPF binaries using the RedBPF crate. But this is RubyKaigi.
    Well then, let's make it possible to write everything in Ruby.

  14. Summary so far:
    • BPF is an emerging Linux technology
    • used in fast firewalls, lightweight tracers, and container security.
    • You can use BPF technology from Ruby via BCC.
    • But BCC is problematic because it compiles on the fly.
    • A precompiled "One binary" is recommended.
    • You need to write C for the "BPF binary" part.
    • Rubyists want to write every part of a BPF program in Ruby.
    That is the summary so far.
    In any case, wouldn't you like to write every part that touches BPF in Ruby?

  15. Only with Rucy.
    Yes, with Rucy you can.

  16. What is going on?
    Technologies behind Rucy

  17. Reprise: what is Rucy
    • Rucy is a “Ruby Compiler” to BPF target.
    • In other words:
    • Rucy generates “BPF binary” directly from a plain-old Ruby script.
    ✴ Rucy is named after “Ruby Compiler” (Ru-C -> Rucy)
    Rucy (Ru-C) is a "Ruby Compiler".
    It compiles a Ruby script directly into the "BPF binary" discussed above.

  18. Architecture overview
    Ruby Script → (mruby) → mruby OpCodes → (Rucy transpiler) → BPF OpCodes → BPF Object (ELF format)
    ✴ Rucy is built on Rust, with mruby embedded.
    This is the overall architecture. The Ruby script is compiled into mruby opcodes,
    those are then converted into BPF opcodes, and finally everything is packed into an ELF file.

  19. BPF OpCode Overview
    • BPF has its own VM and OpCode set.
    • The VM can use 10 registers
    • BPF instructions are basically 64-bit, fixed length
    • Instruction classes are: LD, LDX, ST, STX, ALU, JMP, and ALU64
    • Layout:
    ASCII C Struct
    BPF has its own register-machine VM. It has 10 registers, and
    instructions are mostly a fixed 64 bits long. The binary layout is as shown on the slide.
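The fixed 64-bit layout can be illustrated with a small Ruby sketch. This is not code from the talk. The helper name `bpf_insn` is hypothetical. The field order is taken from `struct bpf_insn` in the Linux UAPI headers (8-bit opcode, 4-bit destination register, 4-bit source register, 16-bit offset, 32-bit immediate), and a little-endian host is assumed.

```ruby
# Pack one BPF instruction into its 8-byte form.
# Assumption: little-endian layout of `struct bpf_insn`, where the
# destination register is the low nibble of the second byte.
def bpf_insn(code, dst, src, off, imm)
  [code, (src << 4) | dst, off, imm].pack("CCs<l<")
end

# Opcode pieces as defined in the Linux headers.
BPF_ALU64 = 0x07 # instruction class
BPF_MOV   = 0xb0 # operation
BPF_K     = 0x00 # operand is an immediate

insn = bpf_insn(BPF_ALU64 | BPF_MOV | BPF_K, 0, 0, 0, 1) # r0 = 1
puts insn.bytesize      # 8 bytes, i.e. 64 bits
puts insn.unpack1("H*") # hex dump of the instruction
```

Every instruction produced this way is exactly 8 bytes, which is what makes BPF bytecode easy to emit from a transpiler.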

  20. BPF instruction examples
    Instruction Pseudo code
    Examples of BPF instructions. There are LDX, ALU, JMP, and so on.

  21. Mruby OpCode Overview
    • Like CRuby, mruby has its VM and OpCodes
    • can use 65536 registers, though using only 256 is recommended.
    • Instructions are variable length
    • The length depends on what kind of operands an instruction takes.
    • The operand sizes are:
    • B (8 bits), S (16 bits), W (24 bits), and Z (no operand)
    mruby also has a register-machine VM. Plenty of registers are available.
    Instructions are variable length. There are B, S, W, and Z operands.
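As a rough illustration of how operand kinds determine instruction length, the sketch below adds one opcode byte to the operand sizes listed above. The operand layouts in `OPERANDS` are assumptions recalled from mruby's opcode table, not taken from the slides, and extension prefixes are ignored.

```ruby
# Operand sizes in bytes, as listed on the slide (Z means no operand).
OPERAND_BYTES = { "B" => 1, "S" => 2, "W" => 3 }.freeze

# Assumed operand layouts for a few opcodes (hypothetical table for illustration).
OPERANDS = {
  OP_NOP:    "Z",
  OP_LOADI:  "BB",
  OP_ENTER:  "W",
  OP_JMPNOT: "BS"
}.freeze

# Instruction length in bits: one opcode byte plus the operand bytes.
def insn_bits(op)
  layout = OPERANDS.fetch(op)
  operand_bytes = layout == "Z" ? 0 : layout.each_char.sum { |c| OPERAND_BYTES.fetch(c) }
  (1 + operand_bytes) * 8
end

OPERANDS.each_key { |op| puts "#{op}: #{insn_bits(op)} bits" }
```

Because lengths vary, a transpiler has to decode mruby bytecode sequentially; it cannot index instructions by a fixed stride as it can with BPF.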

  22. mruby instruction examples
    Bytes OpCode&Operand # Pseudo code
    These are mruby instructions. For example, OP_LOADI and OP_ENTER have different instruction lengths.

  23. Transpile OpCodes

  24. Bytecode transpilation process
    • Here is some sample Ruby code.
    Now I will explain how the bytecode is transpiled.
    This is the Ruby code used this time.

  25. This code outputs mruby OpCodes like:
    This Ruby code turns into mruby opcodes like these.

  26. The goal of transpilation
    mruby opcode BPF opcode
    In the end, they are converted into BPF opcodes like these.

  27. First: ignore some instructions
    First, opcodes that have no corresponding BPF instruction are ignored.

  28. Second: direct translation (e.g. OP_LOADI)
    Transpilation Code:
    Anything that can be "translated" as-is is translated directly.
    OP_LOADI becomes a plain assignment on the BPF side as well.
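A minimal Ruby sketch of this direct translation follows. It is not Rucy's actual code (Rucy itself is written in Rust). The helper names are hypothetical, and the sketch assumes that an mruby register number maps to the BPF register with the same number, as the R3 examples on later slides suggest.

```ruby
# Pack one 8-byte BPF instruction (little-endian `struct bpf_insn` layout).
def bpf_insn(code, dst, src, off, imm)
  [code, (src << 4) | dst, off, imm].pack("CCs<l<")
end

BPF_MOV64_IMM = 0xb7 # BPF_ALU64 | BPF_MOV | BPF_K

# mruby `OP_LOADI reg, value` means R(reg) = value.
# Hypothetical translation: emit a BPF "move immediate" into the same register.
def translate_loadi(reg, value)
  bpf_insn(BPF_MOV64_IMM, reg, 0, 0, value)
end

puts translate_loadi(3, 42).unpack1("H*") # r3 = 42
```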

  29. Then: object access to structure access
    // remember local variable name
    Transpilation Code:
    Next, "attribute-like methods on an object" are translated into "members of a struct".
    First, remember which register holds the ctx variable.

  30. Then: object access to structure access
    // calculate offset
    // with varname, symname
    // c.f. ctx.send(:minor)
    Once the variable name and the symbol name are known,
    the offset of the member within the struct can be calculated and translated into BPF.
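The offset calculation can be sketched in Ruby as below. This is an illustration, not Rucy's implementation. `CTX_LAYOUT` and `translate_member_load` are hypothetical names. The offsets assume the context is the kernel's `struct bpf_cgroup_dev_ctx`, which has three consecutive `__u32` fields: `access_type`, `major`, and `minor`.

```ruby
# Pack one 8-byte BPF instruction (little-endian `struct bpf_insn` layout).
def bpf_insn(code, dst, src, off, imm)
  [code, (src << 4) | dst, off, imm].pack("CCs<l<")
end

BPF_LDX_MEM_W = 0x61 # BPF_LDX | BPF_MEM | BPF_W: load a 32-bit word

# Assumed byte offsets of each member in struct bpf_cgroup_dev_ctx.
CTX_LAYOUT = { access_type: 0, major: 4, minor: 8 }.freeze

# Translate `dst = ctx.member` into `dst = *(u32 *)(ctx_reg + offset)`.
def translate_member_load(dst, ctx_reg, member)
  bpf_insn(BPF_LDX_MEM_W, dst, ctx_reg, CTX_LAYOUT.fetch(member), 0)
end

# With ctx remembered as living in r1, `ctx.minor` becomes a load from r1 + 8.
puts translate_member_load(3, 1, :minor).unpack1("H*")
```

The method call disappears entirely: knowing the variable (which struct) and the symbol (which member) is enough to pick a constant offset at compile time.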

  31. More: if statement to JUDGE/GOTO
    “R3: bool = R3 == R4”
    Transpilation Code:
    Next is the JMP instruction; mruby needs more instructions here than BPF does.
    To bridge the gap, OP_EQ is first converted into a subtraction.

  32. More: if statement to JUDGE/GOTO
    // save GOTO label info
    // Check r3 as “u32”
    Transpilation Code:
    If the next instruction checks whether R3 is zero and then does a GOTO,
    the OP_EQ + OP_JMPNOT pair as a whole has been translated.
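The two steps can be sketched in Ruby as follows. This is an illustration under assumptions, not Rucy's code: the helper names are hypothetical, the 64-bit ALU and JMP classes are used even though the slide mentions checking r3 as "u32", and the jump offset is given directly in BPF instructions rather than derived from mruby's byte offsets.

```ruby
# Pack one 8-byte BPF instruction (little-endian `struct bpf_insn` layout).
def bpf_insn(code, dst, src, off, imm)
  [code, (src << 4) | dst, off, imm].pack("CCs<l<")
end

BPF_SUB64_REG = 0x1f # BPF_ALU64 | BPF_SUB | BPF_X: dst -= src
BPF_JNE_IMM   = 0x55 # BPF_JMP | BPF_JNE | BPF_K: if dst != imm goto +off

# mruby `OP_EQ a` means R(a) = R(a) == R(a+1).
# Translated to a subtraction: the result is zero exactly when both were equal.
def translate_eq(a)
  bpf_insn(BPF_SUB64_REG, a, a + 1, 0, 0)
end

# mruby `OP_JMPNOT a, label` jumps when R(a) is false, i.e. the values differed,
# i.e. the subtraction result is non-zero.
def translate_jmpnot(a, off)
  bpf_insn(BPF_JNE_IMM, a, 0, off, 0)
end

puts translate_eq(3).unpack1("H*")        # r3 -= r4
puts translate_jmpnot(3, 2).unpack1("H*") # if r3 != 0 goto +2
```

Note that the subtraction overwrites R3 with a difference instead of a boolean, which is fine only as long as the value is consumed by the following jump and nothing else.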

  33. Finally: return a register value
    // R0 = R3
    // exit (return R0)
    Transpilation Code:
    Finally, the return handling. mruby can return any register, but BPF can only return R0.
    mruby does not use R0, so the value can simply be assigned to it unconditionally.
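A short Ruby sketch of the return translation, again with hypothetical helper names and not taken from Rucy's source: copy the mruby return register into R0, then emit the BPF exit instruction.

```ruby
# Pack one 8-byte BPF instruction (little-endian `struct bpf_insn` layout).
def bpf_insn(code, dst, src, off, imm)
  [code, (src << 4) | dst, off, imm].pack("CCs<l<")
end

BPF_MOV64_REG = 0xbf # BPF_ALU64 | BPF_MOV | BPF_X: dst = src
BPF_EXIT_INSN = 0x95 # BPF_JMP | BPF_EXIT: return r0

# mruby `OP_RETURN a` returns R(a); BPF always returns r0.
def translate_return(a)
  [
    bpf_insn(BPF_MOV64_REG, 0, a, 0, 0), # r0 = r_a
    bpf_insn(BPF_EXIT_INSN, 0, 0, 0, 0)  # exit
  ]
end

translate_return(3).each { |insn| puts insn.unpack1("H*") }
```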

  34. Again: Generated BPF opcodes
    And with that, the BPF opcodes have been generated.

  35. Again: Generated BPF opcodes
    So as not to leave anyone behind, I added some color. I am the binary-coloring person.

  36. Conclusion,
    demo and future
    You’ll see the magic!

  37. Conclusion
    • We can create a BPF binary object from Ruby script directly.
    • This BPF object + mruby + libbpf = whole command binary
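    • Less context switching for programmers
    • Shared data structures (in the future? Not yet implemented...)
    We compiled Ruby to produce BPF. I think being able to write everything in Ruby is convenient.
    In the future, I hope a Ruby class can share data structures between the kernel and userland. In the future.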

  38. Demo #1: cgroup device filter
    cf. counterpart C
    Demo #1. We implement a cgroup device access filter.

  39. Compile it with rucy command
    • Then, check that the loaded program works properly

  40. Demo #2: Tracing kernel function
    • Working demo:
    • https://github.com/udzura/mruby-rubykaigi-rucy-sample
    Demo #2. We trace calls to the kernel's tcp_connect function via kprobe.

  41. Tracing TCP’s connect on the host
    • Using an mruby CLI tool with the Rucy BPF object embedded

  42. Restrictions and aiming at the future
    • Only the bare minimum of BPF functionality is implemented so far.
    🙆: Generate very basic BPF opcodes
    🙆: Call BPF-specific helper functions (e.g. bpf_trace_printk())
    🙅: Handle a BPF Map data structure
    🙅: Inject parameters via rodata, ...
    • I will gradually implement all of them. See you soon!
    At present only the truly minimal functionality is implemented, but I do intend to
    implement the features that seem necessary one by one, such as support for BPF Maps.

  43. Enjoy binary hack!
    See you in 2022
