Advanced Linux Performance Monitoring With eBPF – Heinrich Hartmann

Recent kernel versions (4.5+, Ubuntu 16.04) allow a fundamentally new way of instrumenting operating systems. Instead of reading data from /proc, a large variety of kernel events can be traced and aggregated inside the kernel with eBPF. In this talk we give a short overview of how to collect, store, and analyse high-frequency events such as I/O latencies, syscall counts, and scheduling latencies with a monitoring system.
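
Aggregating events where they occur, rather than exporting each one, can be illustrated with a small sketch. This is plain Python, not eBPF code: it only mimics the power-of-two bucketing that latency histograms commonly use. The function names and the sample values are made up for illustration and do not come from the talk.

```python
from collections import Counter


def log2_bucket(latency_us: int) -> int:
    """Return k with 2**k <= latency_us < 2**(k+1); values 0 and 1 share bucket 0."""
    return max(latency_us, 1).bit_length() - 1


def aggregate(latencies_us):
    """Fold a stream of latency samples into a bucket -> count table."""
    hist = Counter()
    for value in latencies_us:
        hist[log2_bucket(value)] += 1
    return hist


# Hypothetical I/O latencies in microseconds (made-up sample data).
samples = [3, 5, 9, 120, 130, 4000, 7, 1]
hist = aggregate(samples)

# Print one row per non-empty bucket with its value range and count.
for k in sorted(hist):
    print(f"{2**k:>6} .. {2**(k + 1) - 1:<6} us : {hist[k]}")
```

Only the small bucket table has to be stored or shipped to a monitoring system, which is what keeps high-frequency events affordable to track.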

DevOpsDays Zurich

May 03, 2018
Transcript

  1. Linux System Monitoring with eBPF. Heinrich Hartmann (Heinrich.Hartmann@Circonus.com), DevOpsDays Zurich, 2018-05-03

  2. System Monitoring is about Kernel & Hardware

  3. Best Practice: The USE Method (https://www.circonus.com/2017/08/system-monitoring-with-the-use-dashboard). Resources: CPU, Memory, Network, Disks. Metrics: Utilization, Saturation, Errors.

  5. Lots of unknowns remaining (https://www.circonus.com/2017/08/system-monitoring-with-the-use-dashboard). In the grid of CPU, Memory, Network, Disks against Utilization, Saturation, Errors, three cells are marked "?" and three are marked "~".

  6. eBPF allows unparalleled insights: https://github.com/iovisor/bcc
    Credits:
    - Brendan Gregg @ Netflix (Sun)
    - Sasha Goldshtein @ Sela, Microsoft
    - Brenden Blanco @ VMware
    - Linus Torvalds, et al.

  8. CPU: Scheduling Latency

  9. Disk: Block-I/O Latency

  11. Disk: Block-I/O Latency over time

  13. Don’t shout in the Datacenter. Brendan Gregg (2008): https://www.youtube.com/watch?v=tDacjrSCeq4

  14. System Calls: The Kernel API. Monitor: Rate, Errors, Duration. (Diagram label: System Call API)

  15. Syscalls: Rate / Count. sched_yield (2tn), clock_time (1.5tn), recvfrom (300bn). 394 Metrics.

  16. Syscalls: Duration (plot axis marks: 1 us, 10 us)

  17. Syscall durations span >8 orders of magnitude (plot axis marks: 1 s, 100 ms, 10 us; 1.5 tn events total)

  18. File System: Latency

  19. Memory: Allocation Latency

  20. Further Reading
    Slides: @HeinrichHartman / #DevOpsDaysZH
    Code: https://github.com/circonus-labs/nad/.../bccbpf
    Blog: http://www.circonus.com/2018/05/linux-system-monitoring-with-ebpf/
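
The Rate / Errors / Duration view of system calls named in the transcript can be sketched in a few lines. This is a minimal illustration in plain Python, not the code from the talk: the event tuples, the two-second window, and the helper names are all hypothetical, and treating a negative return value as an error is an assumption of this sketch.

```python
import math
from collections import defaultdict


def summarize(events, window_s):
    """Aggregate (name, duration_ns, return_value) records per syscall."""
    raw = defaultdict(lambda: {"count": 0, "errors": 0, "total_ns": 0})
    for name, duration_ns, ret in events:
        entry = raw[name]
        entry["count"] += 1
        entry["total_ns"] += duration_ns
        if ret < 0:  # assumption: a negative return value encodes an error
            entry["errors"] += 1
    return {
        name: {
            "rate_per_s": e["count"] / window_s,
            "errors": e["errors"],
            "mean_ns": e["total_ns"] / e["count"],
        }
        for name, e in raw.items()
    }


def orders_of_magnitude(durations_ns):
    """Decimal orders of magnitude between the shortest and longest duration."""
    return math.log10(max(durations_ns) / min(durations_ns))


# Made-up events observed over a hypothetical 2-second window.
events = [
    ("read", 1_200, 64),
    ("read", 800, 0),
    ("openat", 15_000, -2),
    ("futex", 1_000_000_000, 0),
    ("getpid", 100, 4242),
]
summary = summarize(events, window_s=2.0)
span = orders_of_magnitude([d for _, d, _ in events])

for name, s in sorted(summary.items()):
    print(name, s)
print(f"duration span: {span:.1f} orders of magnitude")
```

A mean hides most of the picture when durations spread over many orders of magnitude, which is one reason to keep a histogram per syscall rather than a single average.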