PERFORMANCE
MONITORING
AND NOW FOR SOMETHING
ENTIRELY DIFFERENT
@postwait
Slide 3
PERFORMANCE IMPACTS PEOPLE
REMEMBER WHY YOU DO THIS
Slide 4
CONSIDER A GOAL
99TH PERCENTILE AT 1500MS
Slide 5
QUICK TL;DR ON PERCENTILES
THEY AREN’T HARD TO UNDERSTAND, JUST DECEPTIVE AT TIMES.
• 99th percentile: q(0.99)
• 99% of the samples are lower
• 1% of the samples are higher
q(0.99) = 149μs
q(1) = 63ms
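To make the q(p) notation concrete, here is a minimal sketch of a nearest-rank quantile in Python. The function body and the sample data are assumptions made for illustration and do not come from the talk. Other quantile definitions interpolate between samples and will give slightly different answers.

```python
import math

def q(samples, p):
    """Nearest-rank quantile: the smallest sample such that at least
    a fraction p of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("q() needs at least one sample")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered)))  # 1-based rank
    return ordered[rank - 1]

# Hypothetical latencies in microseconds: 99 fast requests, one very slow.
latencies_us = [100 + i for i in range(99)] + [63_000]

print(q(latencies_us, 0.99))  # the 99th percentile ignores the outlier
print(q(latencies_us, 1.0))   # q(1) is simply the maximum
```

As on the slide, q(0.99) and q(1) can differ by orders of magnitude, which is why a percentile on its own can deceive.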
Slide 6
OCCUPY!
PERFORMANCE OF THE 99%
Slide 8
NOW CONSIDER THE PEOPLE
YOU’RE BLIND
1266ms
860
Slide 9
PERCENTAGES
ARE NOT PEOPLE
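One way to see the gap between a percentage and a head count is to compute both for the same threshold. This is a rough sketch: the 1500 ms threshold is the goal stated earlier in the deck, while the function names and the two traffic windows are hypothetical.

```python
def slow_requests(latencies_ms, threshold_ms=1500):
    """Count the requests that took longer than threshold_ms."""
    return sum(1 for latency in latencies_ms if latency > threshold_ms)

def slow_fraction(latencies_ms, threshold_ms=1500):
    """Fraction of requests that took longer than threshold_ms."""
    return slow_requests(latencies_ms, threshold_ms) / len(latencies_ms)

# Two hypothetical one-minute windows with the same share of slow requests.
quiet_minute = [200] * 99 + [2000]           # 100 requests
busy_minute = [200] * 9_900 + [2000] * 100   # 10,000 requests

for name, window in (("quiet", quiet_minute), ("busy", busy_minute)):
    print(name, slow_fraction(window), slow_requests(window))
```

Both windows report the same fraction over the threshold, yet the busy minute affects a hundred times as many requests.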
Slide 11
COMPARE YOUR SLA
TWO MOMENTARY VIOLATIONS VS. AN EPIC OUTAGE
Slide 12
WITH THE ACTUAL TRAGEDY
Slide 13
I KNOW IT SOUNDS CRAZY, BUT
WHAT IF I TOLD YOU IT WAS OKAY TO CARE
Slide 14
SYSTEMS
THEY’RE FASTER THAN USERS
Slide 15
PROBE EFFECT
IT’S REAL
// uninstrumented
important_op()
// instrumented: two timer reads and a log call are added to the path
st := hrtime()
important_op()
fn := hrtime()
log(fn-st)
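The instrumented pattern above can be sketched in runnable form, together with a rough way to estimate what the probe itself costs. This is an illustration in Python, not the talk's code: `important_op` is a hypothetical stand-in, and `time.perf_counter_ns()` plays the role of `hrtime()`.

```python
import time

def important_op():
    """Hypothetical stand-in for the operation being measured."""
    return sum(range(100))

def timed(op):
    """Run op once and return (result, elapsed nanoseconds)."""
    st = time.perf_counter_ns()
    result = op()
    fn = time.perf_counter_ns()
    return result, fn - st

def probe_overhead_ns(rounds=10_000):
    """Estimate the cost of the probe by timing an empty interval
    and keeping the smallest delta seen."""
    best = None
    for _ in range(rounds):
        st = time.perf_counter_ns()
        fn = time.perf_counter_ns()
        delta = fn - st
        if best is None or delta < best:
            best = delta
    return best

result, elapsed_ns = timed(important_op)
print("important_op took", elapsed_ns, "ns")
print("probe overhead is at least", probe_overhead_ns(), "ns")
```

When the operation takes about as long as the timer reads, the measurement mostly reports the probe, which is the effect the slide warns about.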
Slide 16
RULES OF PROBING
• fixed O(1) operations
• no latency bubbles
• no allocations
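A recorder that follows the shape of these rules might look like the sketch below: a fixed set of buckets allocated once, and a constant-time update per sample. The class, the power-of-two bucketing, and the bucket count are all assumptions for illustration; this is not the libcircmetrics API. Python's runtime may still allocate integer objects internally, so the "no allocations" rule is only approximated here. In C the same idea would be a fixed array of counters.

```python
NUM_BUCKETS = 64  # one bucket per power of two of nanoseconds

class LatencyHistogram:
    """Fixed-size histogram with an O(1) record() path."""

    def __init__(self):
        # All storage is created once, up front, never during record().
        self.counts = [0] * NUM_BUCKETS

    def record(self, elapsed_ns):
        """Count one integer sample in its power-of-two bucket."""
        # bit_length() is 0 for 0 and floor(log2(n)) + 1 for n > 0,
        # so the index is found without loops or searches.
        index = min(max(elapsed_ns, 0).bit_length(), NUM_BUCKETS - 1)
        self.counts[index] += 1

    def total(self):
        """Number of samples recorded so far."""
        return sum(self.counts)

hist = LatencyHistogram()
for sample_ns in (0, 1, 1000, 149_000, 63_000_000):
    hist.record(sample_ns)
print(hist.total())
```

The trade-off is resolution: each bucket spans a factor of two, so quantiles read from it are approximate.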
Slide 17
TIME
IS AN ILLUSION, LUNCHTIME DOUBLY SO
- Douglas Adams
Slide 18
TIME
BUT FASTER
MTEV_TIME
https://github.com/circonus-labs/libmtev
Slide 19
FAST
&
CORRECT
LIBCIRCMETRICS
https://github.com/circonus-labs/libcircmetrics