Performance Monitoring

Herein we explore a different approach to service level objectives and show how one can measure time with less overhead, enabling accounting of low-latency operations.

Theo Schlossnagle

September 23, 2016

Transcript

  1. PERFORMANCE MONITORING: AND NOW FOR SOMETHING ENTIRELY DIFFERENT @postwait

  2. None
  3. PERFORMANCE IMPACTS PEOPLE. REMEMBER WHY YOU DO THIS

  4. CONSIDER A GOAL: 99TH PERCENTILE AT 1500MS

  5. QUICK TL;DR ON PERCENTILES: THEY AREN’T HARD TO UNDERSTAND, JUST DECEPTIVE AT TIMES. • 99th percentile: q(0.99) • 99% of the samples are lower • 1% of the samples are higher • q(0.99) = 149μs • q(1) = 63ms
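As an aside to make the slide's definition concrete, here is a minimal sketch in C of a nearest-rank quantile over a small set of latency samples; the quantile() and cmp_double() helpers and the sample values are invented for illustration and are not from the talk or from libcircmetrics.

```c
/* Illustrative only: nearest-rank quantile over a sorted sample set. */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

static int cmp_double(const void *a, const void *b) {
  double x = *(const double *)a, y = *(const double *)b;
  return (x > y) - (x < y);
}

/* q(p): the smallest sample such that at least a fraction p of all
 * samples are less than or equal to it (nearest-rank definition). */
static double quantile(double *samples, size_t n, double p) {
  qsort(samples, n, sizeof(*samples), cmp_double);
  size_t rank = (size_t)ceil(p * (double)n);   /* 1-based rank = ceil(p*n) */
  if (rank > 0) rank--;                        /* convert to 0-based index */
  if (rank >= n) rank = n - 1;                 /* clamp for p == 1         */
  return samples[rank];
}

int main(void) {
  /* Hypothetical request latencies in milliseconds. */
  double lat_ms[] = { 0.9, 1.2, 1.4, 2.0, 3.1, 4.7, 9.0, 15.0, 40.0, 500.0 };
  size_t n = sizeof(lat_ms) / sizeof(lat_ms[0]);
  printf("q(0.99) = %g ms   q(1) = %g ms\n",
         quantile(lat_ms, n, 0.99), quantile(lat_ms, n, 1.0));
  return 0;
}
```

With only ten samples, q(0.99) and q(1) coincide at the maximum, which is exactly the kind of small-sample subtlety that makes percentiles "deceptive at times."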
  6. OCCUPY! PERFORMANCE OF THE 99%

  7. None
  8. NOW CONSIDER THE PEOPLE YOU’RE BLIND (1266ms, 860)

  9. PERCENTAGES ARE NOT PEOPLE

  10. None
  11. COMPARE YOUR SLA: TWO MOMENTARY VIOLATIONS VS. AN EPIC OUTAGE

  12. WITH THE ACTUAL TRAGEDY

  13. I KNOW IT SOUNDS CRAZY, BUT WHAT IF I TOLD YOU IT WAS OKAY TO CARE
  14. SYSTEMS: THEY’RE FASTER THAN USERS

  15. PROBE EFFECT: IT’S REAL important_op() vs. st := hrtime(); important_op(); fn := hrtime(); log(fn-st)
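A minimal runnable sketch of the slide's instrumented form, assuming hrtime() corresponds to a high-resolution monotonic clock: clock_gettime(CLOCK_MONOTONIC) stands in for it here, and important_op() is a hypothetical workload. The two extra clock reads and the log call are work the bare important_op() does not do, which is the probe effect the slide is naming.

```c
/* Sketch of the slide's pseudocode; clock_gettime(CLOCK_MONOTONIC)
 * stands in for hrtime(), important_op() is a stand-in workload. */
#include <stdint.h>
#include <stdio.h>
#include <time.h>

static uint64_t hrtime_ns(void) {
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

static void important_op(void) {
  volatile int x = 0;                 /* hypothetical low-latency work */
  for (int i = 0; i < 1000; i++) x += i;
}

int main(void) {
  uint64_t st = hrtime_ns();          /* st := hrtime() */
  important_op();                     /* important_op() */
  uint64_t fn = hrtime_ns();          /* fn := hrtime() */
  printf("important_op: %llu ns\n",   /* log(fn - st)   */
         (unsigned long long)(fn - st));
  return 0;
}
```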
  16. RULES OF PROBING • fixed O(1) operations • no latency bubbles • no allocations
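One way to honor these rules, sketched below as an illustration rather than as libcircmetrics' actual design: record each latency sample into a fixed, preallocated set of power-of-two buckets with a single relaxed atomic increment, so the hot path does bounded constant work, takes no locks that could introduce latency bubbles, and never allocates.

```c
/* Illustrative O(1), allocation-free recording path: preallocated
 * power-of-two latency buckets bumped with one atomic increment. */
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

#define NBUCKETS 64                          /* one bucket per power of two (ns) */

static _Atomic uint64_t buckets[NBUCKETS];   /* allocated up front, never grows */

static void record_latency_ns(uint64_t ns) {
  unsigned idx = 0;
  while (ns >>= 1) idx++;                    /* floor(log2(ns)); at most 63 steps */
  if (idx >= NBUCKETS) idx = NBUCKETS - 1;
  atomic_fetch_add_explicit(&buckets[idx], 1, memory_order_relaxed);
}

int main(void) {
  /* Hypothetical samples in nanoseconds. */
  uint64_t samples_ns[] = { 800, 1200, 1500, 95000, 1266000000ULL };
  for (size_t i = 0; i < sizeof(samples_ns) / sizeof(samples_ns[0]); i++)
    record_latency_ns(samples_ns[i]);
  for (unsigned i = 0; i < NBUCKETS; i++)
    if (buckets[i])
      printf("~2^%u ns: %llu sample(s)\n", i, (unsigned long long)buckets[i]);
  return 0;
}
```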
  17. TIME IS AN ILLUSION, LUNCHTIME DOUBLY SO - Douglas Adams

  18. TIME BUT FASTER: MTEV_TIME https://github.com/circonus-labs/libmtev
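As a hedged illustration of why a faster clock matters (this sketch does not use the mtev_time API), the loop below estimates the average per-call cost of clock_gettime(CLOCK_MONOTONIC); that per-probe overhead is what a cheaper time source aims to shrink.

```c
/* Estimate the per-call cost of clock_gettime(CLOCK_MONOTONIC); this is
 * the overhead every probe pays, and what a faster time source reduces. */
#include <stdint.h>
#include <stdio.h>
#include <time.h>

static uint64_t now_ns(void) {
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

int main(void) {
  const uint64_t iters = 10000000;       /* ten million clock reads */
  uint64_t sink = 0;                     /* keeps calls from being elided */
  uint64_t start = now_ns();
  for (uint64_t i = 0; i < iters; i++) sink += now_ns();
  uint64_t elapsed = now_ns() - start;
  printf("avg clock read: %.1f ns/call (checksum %llu)\n",
         (double)elapsed / (double)iters, (unsigned long long)(sink % 1000));
  return 0;
}
```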

  19. FAST & CORRECT: LIBCIRCMETRICS https://github.com/circonus-labs/libcircmetrics

  20. THE TECHNOLOGY OCTOPUS

  21. FIGHT THE OCTOPUS. GET OUT THERE - @postwait

  22. None