Beginner's Guide to eBPF Programming for Networking

Liz Rice
October 11, 2021

eBPF has been described as “Superpowers for Linux,” and recently we’ve seen an explosion of tools that use it to power networking, observability and security in the Cloud Native world. It's an exciting technology that enables running bespoke programs directly in the kernel. In this talk, Liz uses live-coding examples to explore how eBPF programs are loaded and run in the kernel, and attached to a variety of networking-related events. You might have seen Liz give a similar talk before, with examples hooking into system calls. This updated version focuses on networking examples, giving insight into how eBPF programs can inspect and manipulate packets to form the basis of sophisticated and high-performance networking tools.


Transcript

  1. Liz Rice, Chief Open Source Officer, Isovalent (@lizrice): A Beginner’s Guide to eBPF Programming for Networking
  2. @lizrice eBPF lets you run custom code in the kernel

  3. @lizrice Attaching eBPF to events: eBPF programs are event-driven and are run when the kernel or an application passes a certain hook point. Pre-defined hooks include system calls, function entry/exit, kernel tracepoints, network events, and several others. (ebpf.io/what-is-ebpf/)
  4. @lizrice eBPF Hello World (diagram: a userspace app makes an execve() syscall; in the kernel, that event triggers the eBPF “Hello world” program)
  5. @lizrice eBPF Hello World

        SEC("kprobe/sys_execve")
        int hello(void *ctx) {
            bpf_printk("I'm alive!");
            return 0;
        }

        $ sudo ./hello
        bash-20241 [004] d... 84210.752785: 0: I'm alive!
        bash-20242 [004] d... 84216.321993: 0: I'm alive!
        bash-20243 [004] d... 84225.858880: 0: I'm alive!

    Each output line starts with info about the process that called the execve syscall. The ./hello binary is the eBPF program plus userspace code to load it.
  6. @lizrice Program types

        enum bpf_prog_type {
            BPF_PROG_TYPE_UNSPEC,
            BPF_PROG_TYPE_SOCKET_FILTER,
            BPF_PROG_TYPE_KPROBE,
            BPF_PROG_TYPE_SCHED_CLS,
            BPF_PROG_TYPE_SCHED_ACT,
            BPF_PROG_TYPE_TRACEPOINT,
            BPF_PROG_TYPE_XDP,
            BPF_PROG_TYPE_PERF_EVENT,
            BPF_PROG_TYPE_CGROUP_SKB,
            BPF_PROG_TYPE_CGROUP_SOCK,
            BPF_PROG_TYPE_LWT_IN,
            BPF_PROG_TYPE_LWT_OUT,
            BPF_PROG_TYPE_LWT_XMIT,
            BPF_PROG_TYPE_SOCK_OPS,
            BPF_PROG_TYPE_SK_SKB,
            BPF_PROG_TYPE_CGROUP_DEVICE,
            BPF_PROG_TYPE_SK_MSG,
            BPF_PROG_TYPE_RAW_TRACEPOINT,
            BPF_PROG_TYPE_CGROUP_SOCK_ADDR,
            BPF_PROG_TYPE_LWT_SEG6LOCAL,
            BPF_PROG_TYPE_LIRC_MODE2,
            BPF_PROG_TYPE_SK_REUSEPORT,
            BPF_PROG_TYPE_FLOW_DISSECTOR,
            /* See /usr/include/linux/bpf.h for the full list. */
        };
  7. @lizrice Program types: enum bpf_prog_type (list repeated from slide 6). eBPF - not just for syscalls!
  8. @lizrice

  9. @lizrice Also, many perf events: sudo perf list

  10. @lizrice Network events (a very non-comprehensive guide)

  11. @lizrice Program types: enum bpf_prog_type (list repeated from slide 6)
  12. @lizrice Kprobes / kretprobes: entry to / exit from a kernel function. Lots of kernel functions relate to networking. Example: the tcp_v4_connect() kernel function
  13. @lizrice Program types: enum bpf_prog_type (list repeated from slide 6)
  14. @lizrice Socket filter (network stack diagram; labels: userspace, kernel, App, syscalls, Socket, Raw socket, TCP/UDP/ICMP, IP, Qdisc, Network connection)
  15. @lizrice Socket filter: “The filtering actions include dropping packets (if the program returns 0) or trimming packets (if the program returns a length less than the original). … Note that we're not trimming or dropping the original packet which would still reach the intended socket intact; we're working with a copy of the packet metadata which raw sockets can access for observability.” https://blogs.oracle.com/linux/post/bpf-a-tour-of-program-types
  16. @lizrice Socket filter: works on a copy of the network packet data. Filters what gets sent to userspace, for performant observability. Example: attach_raw_socket()
  17. @lizrice Program types: enum bpf_prog_type (list repeated from slide 6)
  18. @lizrice XDP (eXpress Data Path): “What if we could run eBPF on the network interface card?”
  19. @lizrice XDP (diagram: a packet arrives over the physical network connection; the eBPF program runs at the NIC / driver, before the kernel network stack)
  20. @lizrice XDP (diagram: a packet arrives over the physical network connection; the eBPF program runs on the NIC, before the kernel network stack). Only some NICs / drivers support XDP
  21. @lizrice XDP (diagram: a packet arrives over a virtual network connection at eth0; the eBPF program runs in the kernel, before the network stack)
  22. @lizrice XDP (eXpress Data Path): inbound packets. Pass / drop / manipulate / redirect packets. Example: attach_xdp()
  23. @lizrice Program types: enum bpf_prog_type (list repeated from slide 6)
  24. @lizrice Traffic control (ingress) (network stack diagram; labels: userspace, kernel, App, syscalls, Socket, Raw socket, TCP/UDP/ICMP, IP, Qdisc, Network connection)
  25. @lizrice Traffic control: traffic filters, attached to queueing disciplines. Ingress / egress (separately). Pass / drop / manipulate / redirect packets. Example: tc("add-filter")
  26. @lizrice Traffic control ingress - ping reply (network stack diagram; labels: userspace, kernel, App, syscalls, Socket, Raw socket, TCP/UDP/ICMP, IP, Qdisc, Network connection)
  27. @lizrice Fewer perf events using TC pingpong: sudo perf trace -e "net:*" ping -c1 <addr>
  28. @lizrice eBPF networking enables efficiency & high performance

  29. @lizrice (diagram of the packet path between host and pod; labels: host, pod, app, socket, veth, veth, eth0, iptables conntrack, iptables INPUT, Linux routing, iptables PREROUTING mangle, iptables conntrack, iptables FORWARD, Linux routing, iptables PREROUTING nat, iptables POSTROUTING mangle, iptables PREROUTING mangle, iptables POSTROUTING nat)
  30. @lizrice (diagram of the packet path between host and pod; labels: host, pod, app, socket, veth, veth, eth0, iptables conntrack, iptables INPUT, Linux routing, iptables PREROUTING mangle, Linux routing)
  31. @lizrice eBPF can instrument apps without any app or config changes
  32. @lizrice A sidecar has a view across one pod (diagram: a pod in userspace holding a container and a sidecar container)
  33. @lizrice Sidecars need YAML (diagram: a pod in userspace holding a container and a sidecar container)

        my-app.yaml
        containers:
        - name: my-app
          ...
        - name: my-app-init
          …
        - name: my-sidecar
          ...
  34. @lizrice eBPF does not need app changes (diagram: a pod in userspace holding two containers, with the kernel underneath)

        my-app.yaml
        containers:
        - name: my-app
          ...
        - name: my-app-init
          …
  35. @lizrice eBPF-enabled networking capabilities
    Inspect packets → Observability: identity-aware data flows, message parsing, security forensics...
    Drop or modify packets → Security: network policies, encryption...
    Redirect packets → Networking functions: load balancing, routing, service mesh...
  36. @lizrice eBPF enables next-gen service mesh: high performance, without any app or config changes
  37. @lizrice Thank you github.com/lizrice/ebpf-beginners ebpf.io | cilium.io | isovalent.com
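
Slides 15 and 16 describe the socket filter return value: 0 drops the packet copy, and a length shorter than the original trims it, while the original packet still reaches its socket intact. The following is a minimal userspace model of that rule in plain C, not an eBPF program and not code from the talk. The names `filter_fn`, `deliver_copy`, `drop_all` and `keep_headers` are hypothetical, and the 34-byte cut-off assumes an Ethernet header followed by an IPv4 header without options.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical filter type: returns how many bytes of the packet to keep.
 * 0 means "drop", a value below len means "trim to that length". */
typedef size_t (*filter_fn)(const unsigned char *pkt, size_t len);

/* Models delivery to an observing raw socket: the filter verdict is applied
 * to a COPY written into out; the original packet is only ever read.
 * Returns the number of bytes copied (0 if the copy was dropped). */
size_t deliver_copy(filter_fn filter, const unsigned char *pkt, size_t len,
                    unsigned char *out, size_t out_cap)
{
    size_t keep = filter(pkt, len);
    if (keep == 0)
        return 0;               /* verdict 0: the observer sees nothing */
    if (keep > len)
        keep = len;             /* cannot keep more than the packet holds */
    if (keep > out_cap)
        keep = out_cap;         /* never overflow the observer's buffer */
    memcpy(out, pkt, keep);
    return keep;
}

/* Example filter: drop every packet copy. */
size_t drop_all(const unsigned char *pkt, size_t len)
{
    (void)pkt;
    (void)len;
    return 0;
}

/* Example filter: keep only the first 34 bytes
 * (14-byte Ethernet header + 20-byte IPv4 header, an assumption). */
size_t keep_headers(const unsigned char *pkt, size_t len)
{
    (void)pkt;
    return len < 34 ? len : 34;
}
```

Because `deliver_copy` takes the packet as `const`, the model makes the "copy only" property visible in the type: whatever the filter returns, the caller's packet bytes are unchanged.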
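
Slide 22 lists what an XDP program can do with an inbound packet: pass, drop, manipulate or redirect it. As a rough sketch of the pass / drop decision, the plain C function below inspects a raw Ethernet frame and returns a verdict. It is a userspace model rather than a loadable eBPF program: the verdict values mirror `enum xdp_action` from linux/bpf.h, the `data` / `data_end` pair imitates the pointers an XDP program receives, and the function name `drop_icmp` and the "drop ICMP over IPv4" policy are assumptions chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Verdict values mirroring enum xdp_action in linux/bpf.h. */
enum xdp_verdict {
    XDP_ABORTED = 0,
    XDP_DROP = 1,
    XDP_PASS = 2,
    XDP_TX = 3,
    XDP_REDIRECT = 4
};

#define ETH_HDR_LEN     14      /* dst MAC (6) + src MAC (6) + EtherType (2) */
#define IPV4_MIN_HDR    20      /* IPv4 header without options */
#define ETHERTYPE_IPV4  0x0800
#define IP_PROTO_ICMP   1

/* Hypothetical policy: drop ICMP carried over IPv4, pass everything else.
 * data points at the first byte of the frame, data_end one past the last. */
int drop_icmp(const uint8_t *data, const uint8_t *data_end)
{
    /* Check the frame is long enough before reading any header field;
     * a real XDP program needs an equivalent bounds check to be accepted
     * by the eBPF verifier. */
    if ((size_t)(data_end - data) < ETH_HDR_LEN + IPV4_MIN_HDR)
        return XDP_PASS;

    /* EtherType sits in bytes 12-13 of the frame, in network byte order. */
    uint16_t ethertype = (uint16_t)((data[12] << 8) | data[13]);
    if (ethertype != ETHERTYPE_IPV4)
        return XDP_PASS;

    const uint8_t *ip = data + ETH_HDR_LEN;
    if ((ip[0] >> 4) != 4)      /* version nibble must be 4 */
        return XDP_PASS;

    /* The protocol field is byte 9 of the IPv4 header. */
    if (ip[9] == IP_PROTO_ICMP)
        return XDP_DROP;

    return XDP_PASS;
}
```

Passing short or unrecognised frames instead of dropping them is a design choice for this sketch: the filter only acts on traffic it has positively identified.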