André Arko
September 17, 2016

# Lies, Damn Lies, and Metrics (Strange Loop 2016)

Metrics are great, and measuring things can provide tremendously useful insights. But there's a problem: metrics lie to you. Metrics just report the numbers that were measured. Analyzing those numbers is up to us, and that analysis can go wrong in so, so many ways. Learn how to arm yourself against human intuition, interpreter pauses, routing, instrumentation lag, and other issues. Don't get so caught up in instrumenting that you lose sight of why metrics exist! Make sure your metrics are telling you actionable information, instead of just accurate numbers.

## Transcript

1. Lies, Damn Lies,
and Metrics

2. André Arko
@indirect

3. Bundler
Managing application dependencies since 2009

4. Metrics

5. Metrics
are important

6. Metrics
tell you what
is happening

7. you rn →

8. Metrics
convince you
you understand

9. you later →

10. Averages
convince you
you understand

11. Averages
are lie-candy

12. “Normal”
[Chart: x-axis from -5 to 5, y-axis from 0 to 0.4]

13. “Normal”
[Chart: x-axis from -5 to 5, y-axis from 0 to 0.4]

14. Real Life
[Chart: x-axis from -5 to 5, y-axis from 0 to 0.4]

15. brendangregg.com

16. brendangregg.com

17. just heard
“we have a great average” →

18. The problem with averages:
If you put one hand in a bucket of ice
and the other in a bucket of hot coals,
on average, you’re comfortable.
Erik Michaels-Ober
@sferik

19. Averages

20. [Chart: x-axis from 0 to 10, y-axis from 0 to 250]

21. Graph
the median

22. [Chart: x-axis from 0 to 10, y-axis from 0 to 250]

23. Graph
95th percentile

24. [Chart: x-axis from 0 to 10, y-axis from 0 to 250]

25. Graph
99th percentile

26. [Chart: x-axis from 0 to 10, y-axis from 0 to 1000]
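
The slides above move from the average to the median, then to the 95th and 99th percentiles. As a rough illustration of why each tells a different story, here is a minimal Python sketch. The latency numbers are invented for the example, and `percentile` is a hypothetical helper using the simple nearest-rank definition (monitoring tools may interpolate differently).

```python
import statistics

def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, -(-(len(ordered) * pct) // 100))  # ceiling division
    return ordered[rank - 1]

# Hypothetical response times in milliseconds: mostly fast, a few very slow.
latencies_ms = [20] * 90 + [40] * 5 + [300] * 4 + [5000]

summary = {
    "mean": statistics.mean(latencies_ms),
    "median": statistics.median(latencies_ms),
    "p95": percentile(latencies_ms, 95),
    "p99": percentile(latencies_ms, 99),
}
print(summary)
```

With this made-up sample the mean is 82 ms, a value no request actually took, while the median is 20 ms and the 99th percentile is 300 ms. The single 5000 ms request only shows up if you also look at the maximum.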

27. Aggregate graphs
another average

28. Breakout graphs
show each source

29. Seriously, do it
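
To make the aggregate versus breakout point concrete, here is a small Python sketch. The server names and timings are hypothetical; the only claim is that a statistic computed over all sources combined can look healthy while one source is clearly not.

```python
import statistics

# Hypothetical per-server response times in milliseconds.
response_ms = {
    "web-1": [20, 22, 19, 21],
    "web-2": [18, 23, 20, 19],
    "web-3": [21, 20, 640, 700],  # one struggling server
}

# Aggregate view: every sample from every server in one pool.
all_samples = [ms for samples in response_ms.values() for ms in samples]
aggregate_median = statistics.median(all_samples)

# Breakout view: the same statistic, computed per source.
per_server_median = {
    name: statistics.median(samples) for name, samples in response_ms.items()
}

print("aggregate:", aggregate_median)
for name, value in per_server_median.items():
    print(name, value)
```

Here the aggregate median is 20.5 ms, which looks fine, while the breakout shows web-3 at 330.5 ms.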

30. graphic by Schutz and Avenue, CC-Attribution-ShareAlike, taken from the Wikipedia article on Anscombe's quartet

31. Average of X: 9 Average of X: 9
Average of X: 9 Average of X: 9

32. Average of Y: 7.50 Average of Y: 7.50
Average of Y: 7.50 Average of Y: 7.50

33. All four data sets have the same
Average of X
Average of Y
Variance of X
Variance of Y
Correlation of X and Y
Linear regression
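
The claim on these slides can be checked directly. The sketch below uses the published values of Anscombe's quartet and Python's standard `statistics` module to compare a few of the listed summary statistics. It also prints the median of Y, which is not on the slide, as one example of a number that does differ between the sets.

```python
import statistics

# Anscombe's quartet as published (Anscombe, 1973); sets I to III share x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

for name, (xs, ys) in quartet.items():
    print(
        name,
        "mean x:", statistics.mean(xs),
        "mean y:", round(statistics.mean(ys), 2),
        "var x:", statistics.variance(xs),
        "var y:", round(statistics.variance(ys), 2),
        "median y:", statistics.median(ys),
    )
```

The means and variances agree to about two decimal places across all four sets, while the medians of Y do not. Plotting the points, as the slide graphic does, is still the quickest way to see how different the sets are.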

than alive servers

35. site’s up if any
servers are up!

not all the servers

37. Servers

38. Servers
you have no idea
what is going on

39. really.

40. Runtime lag

41. Runtime lag
how do you tell you
lost consciousness?

42. Runtime lag
you have it.

43. Runtime lag
you have it.
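
One common way to notice pauses from inside a process is to ask to sleep for a fixed interval and record how late the wake-up actually is. The Python sketch below is an illustration of that idea, not the method used in the talk. The function name, interval, and sample count are all assumptions.

```python
import time

def measure_wakeup_lag(interval_s=0.01, samples=20):
    """Sleep repeatedly and return how much longer each sleep took than requested (seconds)."""
    lags = []
    for _ in range(samples):
        start = time.monotonic()
        time.sleep(interval_s)
        elapsed = time.monotonic() - start
        # Anything beyond the requested interval is time the process was not running:
        # scheduler delay, interpreter pauses, a busy host, and so on.
        lags.append(max(0.0, elapsed - interval_s))
    return lags

lags = measure_wakeup_lag()
print(f"worst wake-up lag: {max(lags) * 1000:.2f} ms")
```

A loop like this cannot say why the process was paused, only that it was. Running it in a background thread and reporting the worst lag per minute is one possible way to graph it.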

44. VM lag

45. VM lag
do you have it?

46. VM lag
do you check for it?

47. VM lag
do you know how
to check for it?
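
One possible answer to "how to check for it" on a Linux guest is CPU steal time, which the kernel reports as the eighth counter on the aggregate `cpu` line of `/proc/stat`. The talk does not name this technique, so treat the Python sketch below as an assumption about one way to look. `steal_fraction` is a hypothetical helper and the two sample lines are invented.

```python
def steal_fraction(before, after):
    """Fraction of CPU time stolen between two aggregate 'cpu' lines from /proc/stat."""
    def counters(line):
        parts = line.split()
        if parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line")
        return [int(value) for value in parts[1:]]

    deltas = [b - a for a, b in zip(counters(before), counters(after))]
    # Field order: user nice system idle iowait irq softirq steal guest guest_nice.
    # Guest time is already counted inside user/nice, so only the first eight are summed.
    total = sum(deltas[:8])
    return deltas[7] / total if total else 0.0

# Invented samples, standing in for two reads of /proc/stat a few seconds apart.
sample_before = "cpu 1000 0 500 8000 100 0 50 10 0 0"
sample_after = "cpu 1100 0 550 8750 110 0 60 90 0 0"
fraction = steal_fraction(sample_before, sample_after)
print(f"steal: {fraction:.1%}")
```

On a real Linux host the two lines would come from reading the first line of `/proc/stat` twice with a pause in between. A steal fraction that stays well above zero suggests the hypervisor is giving the CPU to someone else.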

48. Routing

49. Routing

50. Routing
how does it work?

51. Development
App
You

52. Production
People Router
Server
App
App
Router
Server
App
App
Router

53. Routing
how slow is it?

54. Routing
does it back up?
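
Whether the routing layer backs up can only be answered if something measures the time a request spends waiting before the app sees it. Some routers and proxies stamp each request with an arrival time, for example in an `X-Request-Start` header. The Python sketch below assumes that header is present and holds Unix epoch milliseconds, optionally prefixed with `t=`. Both the header format and the `queue_time_ms` helper are assumptions to verify against your own router.

```python
import time

def queue_time_ms(headers, now_s=None):
    """Milliseconds between the router's timestamp and now, or None if there is no timestamp."""
    raw = headers.get("X-Request-Start")
    if raw is None:
        return None
    if raw.startswith("t="):
        raw = raw[2:]
    if now_s is None:
        now_s = time.time()
    # Assumption: the header value is Unix epoch time in milliseconds.
    started_ms = float(raw)
    # Clock skew between router and app host can make this negative; clamp to zero.
    return max(0.0, now_s * 1000.0 - started_ms)

# Example with a fixed "now" so the result is predictable.
example = queue_time_ms({"X-Request-Start": "1474100000000"}, now_s=1474100000.25)
print(example)
```

Calling this at the very start of request handling and graphing the result per request gives a view of queueing that in-app timers never see.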

55. Request time

56. Request time
not the time
you measure

57. Request time
wall-clock time
from real clients

58. Request time
make requests from
around the world
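
Measuring request time the way a client experiences it means starting the clock before the request leaves and stopping it after the full response arrives. Here is a minimal Python sketch of that idea using only the standard library. `timed` and `fetch_url` are hypothetical helper names, and a real setup would run probes like this from several locations.

```python
import time
import urllib.request

def timed(fetch):
    """Run fetch() and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fetch()
    return result, time.perf_counter() - start

def fetch_url(url, timeout_s=10):
    """Download the whole response body, so the timing includes transfer time."""
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        return response.read()

# Usage (makes a real network request, so it is left commented out):
# body, seconds = timed(lambda: fetch_url("https://example.com/"))
# print(f"{len(body)} bytes in {seconds * 1000:.0f} ms")
```

The number this produces includes DNS, connection setup, routing, queueing, and transfer, which is exactly the part that server-side instrumentation leaves out.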

59. So, in the end
metrics are good

60. but
know what you
are measuring

61. @indirect
Questions?