eBPF Vienna - BPF: Evolution of a Loop

Filip Nikolic
September 25, 2024

BPF: evolution of a loop - by Anton Protopopov

Presented at eBPF Vienna

Transcript

  1. OS Kernel

    • Hardware is… hard
    • So engineers designed the concept of an OS: an abstraction layer
    [diagram labels: Userspace, Kernel, process, VM, RAM]
  2. Running custom code

    • New use cases force us to patch the kernel
    • Upgrading the kernel can be hard => kernel modules
    • Modules are a source of various problems and can easily crash the kernel
    [diagram label: Kernel]
  3. Enter BPF

    • BPF allows users to compile and run custom code inside kernel space
    • But, unlike modules, this is safe
  4. eBPF – modern times

    Evolution:
    • 32-bit architecture -> 64-bit architecture
    • More registers, more instructions
    Revolution:
    • Maps & kernel helpers
    • Verifier the almighty
    • Compiler
  5. eBPF verifier

    Static code analyzer walking the in-kernel copy of the BPF program instructions
    ➔ Ensuring program termination
      ◆ DFS traversal to check program is a DAG
      ◆ Preventing unbounded loops
      ◆ Preventing out-of-bounds or malformed jumps
    ➔ Ensuring memory safety
      ◆ Preventing out-of-bounds memory access
      ◆ Preventing use-after-free bugs and object leaks
      ◆ Also mitigating vulnerabilities in the underlying hardware (Spectre)
    ➔ Ensuring type safety
      ◆ Preventing type confusion bugs
      ◆ BPF Type Format (BTF) for access to (kernel’s) aggregate types
    ➔ Preventing hardware exceptions (division by zero)
      ◆ For unknown scalars, instructions rewritten to follow aarch64 spec
    [ this slide is copy/pasted/edited from eBPF - The Silent Platform Revolution from Cloud Native ]
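The division-by-zero point can be illustrated with a small userspace model. This is only a sketch, assuming the semantics documented for the BPF instruction set (an unsigned 64-bit division by zero yields 0, a modulo by zero leaves the destination unchanged). The function names are hypothetical and this is not the verifier's actual instruction rewrite.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of BPF 64-bit unsigned division:
 * a zero divisor yields 0 instead of raising a hardware exception. */
static uint64_t bpf_div64_model(uint64_t dst, uint64_t src)
{
    return src ? dst / src : 0;
}

/* Hypothetical model of BPF 64-bit unsigned modulo:
 * a zero divisor leaves the destination value unchanged. */
static uint64_t bpf_mod64_model(uint64_t dst, uint64_t src)
{
    return src ? dst % src : dst;
}

/* Usage example: the zero-divisor cases never trap. */
static void div_mod_demo(void)
{
    assert(bpf_div64_model(42, 5) == 8);
    assert(bpf_div64_model(42, 0) == 0);
    assert(bpf_mod64_model(42, 0) == 42);
}
```

The point of the model is that the result is defined for every input, so no runtime trap is possible.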
  6. Verifier: DFS search

    [graph nodes: 0, 1, 2, 3]
    0: r0 = 0
    1: jneq r1, #0, +1
    2: r0 = 2
    3: exit
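The DAG check on the four-instruction program above can be sketched in userspace C. This is a simplified model, not the kernel's code: it is recursive, the `insn_model` type and function names are hypothetical, and each instruction is reduced to its list of successors. An edge into a node that is still being explored (DISCOVERED) is a back-edge; an edge into an already EXPLORED node is fine.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_INSNS 16

enum { UNVISITED = 0, DISCOVERED, EXPLORED };

/* Hypothetical instruction model: only the successors matter here. */
struct insn_model {
    int nsucc;    /* 0 for exit, 1 for fallthrough, 2 for a conditional jump */
    int succ[2];
};

/* Returns true if a back-edge or an out-of-range jump is reachable from i. */
static bool dfs_finds_problem(const struct insn_model *prog, int n, int i, int *state)
{
    state[i] = DISCOVERED;
    for (int k = 0; k < prog[i].nsucc; k++) {
        int t = prog[i].succ[k];
        if (t < 0 || t >= n)
            return true;                 /* malformed jump */
        if (state[t] == DISCOVERED)
            return true;                 /* back-edge: the program has a loop */
        if (state[t] == UNVISITED && dfs_finds_problem(prog, n, t, state))
            return true;
    }
    state[i] = EXPLORED;
    return false;
}

static bool program_is_rejected(const struct insn_model *prog, int n)
{
    int state[MAX_INSNS] = {0};
    assert(n > 0 && n <= MAX_INSNS);
    return dfs_finds_problem(prog, n, 0, state);
}

/* The program from the slide: 0 -> 1, 1 -> {2, 3}, 2 -> 3, 3 = exit. */
static const struct insn_model slide_prog[4] = {
    {1, {1, 0}}, {2, {2, 3}}, {1, {3, 0}}, {0, {0, 0}},
};

/* Same program, but instruction 2 jumps back to instruction 1. */
static const struct insn_model loop_prog[4] = {
    {1, {1, 0}}, {2, {2, 3}}, {1, {1, 0}}, {0, {0, 0}},
};
```

With these definitions, `program_is_rejected(slide_prog, 4)` is false (the graph is a DAG) and `program_is_rejected(loop_prog, 4)` is true.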
  18. eBPF verifier

    Works by simulating execution of all paths of the program
    ➔ Follows control flow graph
      ◆ For each instruction computes set of possible states (BPF register set & stack)
      ◆ Performs safety checks (e.g. memory access) depending on current instruction
      ◆ Register spill/fill tracking for program’s private BPF stack
    ➔ Back-edges in control flow graph
      ◆ Bounded loops by brute-force simulating all iterations up to a limit
    ➔ Dealing with potentially large number of states
      ◆ Path pruning logic compares current state vs prior states
        • Current path “equivalent” to prior paths with safe exit?
      ◆ Function-by-function verification for state reduction
      ◆ On-demand scalar precision (back-)tracking for state reduction
      ◆ Terminates with rejection upon surpassing “complexity” threshold
    [ this slide is copy/pasted/edited from eBPF - The Silent Platform Revolution from Cloud Native ]
  19. eBPF verifier

    BPF register state tracking (fields of a BPF reg)
    • type (u32): uninit, scalar, ptr_to_* types. Types can be composable, e.g. or’ed with ptr_maybe_null.
    • id (u32): Identifier for state propagation (e.g. learned bits from conditions)
    • off (s32): Fixed part of pointer offset (pointer types only).
    • var_off (tnum: u64 value, u64 mask): Represents knowledge of actual value for scalars (known and unknown bits).
    • s64min, s64max (s64), u64min, u64max (u64), s32min, s32max (s32), u32min, u32max (u32): Determined signed and unsigned 64 and 32-bit (sub-register) bounds. Coupled to the var_off tnum, holding a lower and upper bound of the unknown value. Used to determine if any memory access using this register will result in a bad access.
    • …
    [ this slide is copy/pasted/edited from eBPF - The Silent Platform Revolution from Cloud Native ]
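The tnum ("tristate number") idea behind var_off can be sketched in a few lines of userspace C. A set bit in mask means "unknown"; for every other bit, value holds the known bit. The arithmetic below follows the kernel's tnum helpers as far as I know them, but treat it as an illustrative model: the `_model` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* mask bit = 1: bit is unknown; mask bit = 0: bit equals the bit in value. */
struct tnum_model {
    uint64_t value;
    uint64_t mask;
};

static struct tnum_model tnum_const_model(uint64_t v)
{
    return (struct tnum_model){ v, 0 };
}

/* Bitwise AND: a result bit is known-0 if either side is known-0. */
static struct tnum_model tnum_and_model(struct tnum_model a, struct tnum_model b)
{
    uint64_t alpha = a.value | a.mask;
    uint64_t beta = b.value | b.mask;
    uint64_t v = a.value & b.value;
    return (struct tnum_model){ v, alpha & beta & ~v };
}

/* Addition: unknown bits spread through every position a carry may reach. */
static struct tnum_model tnum_add_model(struct tnum_model a, struct tnum_model b)
{
    uint64_t sm = a.mask + b.mask;
    uint64_t sv = a.value + b.value;
    uint64_t sigma = sm + sv;
    uint64_t chi = sigma ^ sv;
    uint64_t mu = chi | a.mask | b.mask;
    return (struct tnum_model){ sv & ~mu, mu };
}

/* Is the concrete value x one of the values this tnum can represent? */
static bool tnum_contains_model(struct tnum_model t, uint64_t x)
{
    return (x & ~t.mask) == t.value;
}

/* Usage example: an unknown byte masked with 0x0f, then offset by 16. */
static void tnum_demo(void)
{
    struct tnum_model byte = { 0, 0xff };
    struct tnum_model idx = tnum_and_model(byte, tnum_const_model(0x0f));
    struct tnum_model sum = tnum_add_model(idx, tnum_const_model(16));

    assert(idx.value == 0 && idx.mask == 0x0f);
    assert(tnum_contains_model(sum, 16) && tnum_contains_model(sum, 31));
    assert(!tnum_contains_model(sum, 32));
}
```

After masking, only the low four bits remain unknown, which is the kind of knowledge the min/max bounds are coupled to.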
  20. Execution simulation example

    • I will not do an example here. A full program walkthrough would be a long animation
    • In one of his talks, Daniel did an awesome visualisation of just a toy example, see the slides here

    struct {
        uint8_t index;
        int32_t value;
        int32_t array[256];
    } s;
    s.array[s.index] = -s.value;
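One way to read the toy example: the index is a uint8_t, so its largest possible value is 255, and the array has 256 elements, so the store can never leave the array. The check below is a simplified userspace sketch of that reasoning, assuming the verifier compares the upper bound of the index against the region size. The struct and function names are hypothetical, and the real verifier tracks far more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Named copy of the struct from the slide. */
struct s_model {
    uint8_t index;
    int32_t value;
    int32_t array[256];
};

/* True if every index in [0, umax_index] addresses a whole element
 * inside a region of region_size bytes. */
static bool index_access_in_bounds(uint64_t umax_index, size_t elem_size,
                                   size_t region_size)
{
    assert(elem_size > 0);
    return umax_index < region_size / elem_size;
}

/* The store from the slide: the index can be at most UINT8_MAX (255). */
static bool s_array_store_is_safe(void)
{
    struct s_model s = {0};
    return index_access_in_bounds(UINT8_MAX, sizeof(s.array[0]), sizeof(s.array));
}
```

With a uint16_t index the same check fails, since 65535 is not below 256; a program would then need an explicit bounds check before the store.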
  21. Loops, version 1: no loops

    • Like in cBPF, no loops were allowed
    • (Back edges are allowed, but they need to lead to “explored” paths.)
  22. Loops, version 2: bounded loops

    Finally, real bounded loops were added by Alexey in 2019 (after a few attempts, see e.g. John’s talk in 2018)
    The main idea of verification:
    • Allow loops in DFS search
    • Detect loops during abstract execution
    • … Brute force
  24. Loops, version 2: bounded loops

    Bounded loops are OK, but easy to misuse. Consider this example (taken from a great documentation resource on BPF started by our colleague Dylan Reimerink: https://ebpf-docs.dylanreimerink.nl)
    [slide annotation on the example: ≤ 65535]
  25. Loops, version 3: helpers to the rescue!

    • Maps can be big, and scanning a whole map can easily overflow the verifier state (limit for bounded loops is ~8K)
    • bpf_for_each_map_elem() helper to the rescue!
    • It takes a map and a callback function as arguments
    • From the verifier’s point of view, this is sure to terminate (it is a helper)
    • From the user’s point of view: we can now iterate over any map! Safety and profit
  26. Loops, version 4: more helpers

    • The idea of bpf_for_each_map_elem() was later generalized
    • bpf_loop() takes nr_loops as an argument and executes a callback up to that many times (at most ~8M)
    • (The callback can, of course, break the loop earlier)
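The bpf_loop() contract can be modelled in userspace C. This is a sketch based on the helper's documented behaviour as I understand it: the callback returns 0 to continue or 1 to stop, the helper returns the number of iterations performed, and it returns an error for non-zero flags or too many loops. `bpf_loop_model`, `MODEL_MAX_LOOPS` and the callback below are hypothetical names; in a real BPF program the loop runs inside the kernel helper.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed upper limit, matching the ~8M figure from the slide. */
#define MODEL_MAX_LOOPS (8u * 1024 * 1024)

typedef long (*loop_cb_t)(uint32_t index, void *ctx);

/* Userspace model: the loop lives in the helper, not in the program,
 * so the program itself contains no back-edge to verify. */
static long bpf_loop_model(uint32_t nr_loops, loop_cb_t cb, void *ctx,
                           uint64_t flags)
{
    uint32_t i;

    if (flags)
        return -EINVAL;
    if (nr_loops > MODEL_MAX_LOOPS)
        return -E2BIG;
    for (i = 0; i < nr_loops; i++) {
        if (cb(i, ctx))
            return (long)i + 1;   /* callback asked to stop early */
    }
    return i;
}

/* Example callback: add up indices and break once index 4 was processed. */
static long sum_until_four(uint32_t index, void *ctx)
{
    uint32_t *sum = ctx;

    *sum += index;
    return index == 4;
}

/* Usage example: asks for 100 iterations, but the callback stops after 5. */
static void bpf_loop_demo(void)
{
    uint32_t sum = 0;

    assert(bpf_loop_model(100, sum_until_four, &sum, 0) == 5);
    assert(sum == 10);
}
```

Because termination is guaranteed by the helper, the verifier only needs to check the callback body once rather than simulate every iteration.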
  27. Loops, version 5: open-coded iterators

    • bpf_loop(), while useful, has its limitations, so the next version of loop support added “open-coded” iterators
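An open-coded iterator follows a new / next / destroy pattern that the program writes out itself instead of passing a callback. The userspace sketch below models a numeric range iterator in that style. The `_model` names and the struct layout are hypothetical, and the error handling (start > end is rejected, very large ranges are rejected) is an assumption based on my reading of the kernel's numbers iterator.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_LOOPS (8 * 1024 * 1024)   /* assumed range limit */

struct iter_num_model {
    int64_t next;   /* next value to hand out */
    int64_t end;    /* exclusive upper bound */
    int cur;        /* storage for the value returned by next() */
};

/* Prepare iteration over [start, end); on error the iterator is left empty. */
static int iter_num_new_model(struct iter_num_model *it, int start, int end)
{
    it->next = it->end = 0;
    it->cur = 0;
    if (start > end)
        return -EINVAL;
    if ((int64_t)end - (int64_t)start > MODEL_MAX_LOOPS)
        return -E2BIG;
    it->next = start;
    it->end = end;
    return 0;
}

/* Returns a pointer to the next value, or NULL once the range is exhausted. */
static int *iter_num_next_model(struct iter_num_model *it)
{
    if (it->next >= it->end)
        return NULL;
    it->cur = (int)it->next++;
    return &it->cur;
}

static void iter_num_destroy_model(struct iter_num_model *it)
{
    it->next = it->end = 0;
}

/* Usage example: the loop is written inline, with no callback. */
static int sum_range_model(int start, int end)
{
    struct iter_num_model it;
    int sum = 0, *v;

    if (iter_num_new_model(&it, start, end) == 0)
        while ((v = iter_num_next_model(&it)))
            sum += *v;
    iter_num_destroy_model(&it);   /* always paired with new */
    return sum;
}
```

The loop body stays in the calling function, so local variables are directly usable without packing them into a callback context.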
  29. Loops, version 6: an iterator that doesn’t iterate

    • A new BPF instruction, may_goto #label, can be used to tell the verifier that we expect a loop to break
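One way to picture may_goto is as a hidden iteration budget: each time the instruction executes, the budget is decremented, and once it reaches zero the jump to the label is taken, so the loop is guaranteed to end even if its own exit condition never fires. The userspace sketch below models only that idea. The names are hypothetical and the budget is chosen by the caller for illustration; my understanding is that the kernel manages its own, much larger, budget.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the hidden per-program counter. */
struct loop_budget {
    uint64_t left;
};

/* Models one execution of may_goto: returns true when the jump is taken. */
static bool may_goto_taken(struct loop_budget *b)
{
    if (b->left == 0)
        return true;
    b->left--;
    return false;
}

/* A loop with no visible bound of its own; only may_goto ends it. */
static uint64_t spin_until_may_goto(uint64_t budget)
{
    struct loop_budget b = { budget };
    uint64_t iterations = 0;

    for (;;) {
        if (may_goto_taken(&b))
            break;              /* the "#label" target is after the loop */
        iterations++;
    }
    return iterations;
}

/* Usage example: the loop runs exactly as many times as the budget allows. */
static void may_goto_demo(void)
{
    assert(spin_until_may_goto(1000) == 1000);
    assert(spin_until_may_goto(0) == 0);
}
```

From the verifier's side, the loop body then only has to be shown safe for an arbitrary iteration, since termination comes from the budget.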
  32. More branches

    • Previously, all the code in the verifier expected that there could be at most two branches: a fallthrough and a conditional jump
    • I am now working on adding indirect jumps to BPF, which will let programs jump to more than two basic blocks
  33. Safety Invariant

    • Safety is an invariant of the BPF infrastructure
    • BPF evolved from a simple infrastructure to a pretty complex one (see, e.g., BPF is Turing complete or Doom in BPF) while keeping BPF’s “safety” property