A Deep Dive into Monitoring with Skyline

A talk I gave for the NYC Data Engineering Meetup at eBay. Video: http://g33ktalk.com/etsy-a-deep-dive-into-monitoring-with-skyline/

Abe Stanway

July 23, 2013

Transcript

  3. 4.

    41 shards, 24 API servers, 72 web servers, 42 Gearman boxes, 150-node Hadoop cluster, 15 memcached boxes, 60 search machines
  4. 5.

    41 shards, 24 API servers, 72 web servers, 42 Gearman boxes, 150-node Hadoop cluster, 15 memcached boxes, 60 search machines (plus a lot more for various services)
  5. 8.

    de•ploy /diˈploi/ (verb): To release your code for the world to see, hopefully without breaking the Internet.
  7. 24.

    [1358731200, 20] [1358731200, 20] [1358731200, 20] [1358731200, 20] [1358731200, 20] [1358731200, 20] [1358731200, 20] [1358731200, 20] [1358731200, 60] [1358731200, 20] [1358731200, 20]
  14. 50.

    Diagram: Graphite’s relay agent sends pickles to the original Graphite and a backup Graphite.

    Pickles: [statsd.numStats, [1365603422, 82345]] [statsd.numStats, [1365603432, 80611]] [statsd.numStats, [1365603412, 73421]]
  15. 51.

    Diagram: Graphite’s relay agent sends pickles to the original Graphite and to Skyline.

    Pickles: [statsd.numStats, [1365603422, 82345]] [statsd.numStats, [1365603432, 80611]] [statsd.numStats, [1365603412, 73421]]
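The bracketed pairs on slides 50 and 51 resemble the metric tuples that Graphite carries in its pickle protocol. Below is a minimal sketch of that framing, assuming the documented wire format: a 4-byte big-endian length header followed by a pickled list of `(path, (timestamp, value))` tuples. The helper names `encode_batch` and `decode_batch` are hypothetical and are not taken from the talk, Graphite, or Skyline.

```python
import pickle
import struct


def encode_batch(metrics):
    """Frame a list of (path, (timestamp, value)) tuples as one pickle message."""
    payload = pickle.dumps(metrics, protocol=2)
    # The header is the payload length as an unsigned 32-bit big-endian integer.
    header = struct.pack("!L", len(payload))
    return header + payload


def decode_batch(message):
    """Read the length header, then unpickle exactly that many payload bytes."""
    (length,) = struct.unpack("!L", message[:4])
    payload = message[4:4 + length]
    # pickle.loads can execute arbitrary code, so only decode trusted senders.
    return pickle.loads(payload)


# The three datapoints shown on the slides, in timestamp order.
batch = [
    ("statsd.numStats", (1365603412, 73421)),
    ("statsd.numStats", (1365603422, 82345)),
    ("statsd.numStats", (1365603432, 80611)),
]
message = encode_batch(batch)
decoded = decode_batch(message)
```

Because every destination of the relay receives the same framed bytes, a second consumer could in principle read this stream the same way a backup Graphite would, which seems to be the point of the two diagrams.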
  20. 80.

    Basic algorithm: “A metric is anomalous if its latest datapoint is over three standard deviations above its moving average.”
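The rule quoted on slide 80 can be sketched in a few lines. This is an illustration under stated assumptions, not Skyline's actual code: the function name `is_anomalous`, the `window` size of 10, and the choice to compute the moving average over the points before the latest one (rather than including it) are all assumptions made here.

```python
import statistics


def is_anomalous(values, window=10, sigmas=3.0):
    """Return True if the latest value exceeds the moving average of the
    preceding `window` values by more than `sigmas` standard deviations."""
    if len(values) < window + 1:
        # Not enough history to judge the latest point.
        return False
    history = values[-(window + 1):-1]  # the `window` points before the latest
    latest = values[-1]
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history)
    # With a perfectly flat history stdev is 0, so any increase is flagged.
    return latest > mean + sigmas * stdev


# Hypothetical series: steady values near 20, then a spike or a normal point.
steady = [20, 21, 19, 20, 22, 18, 20, 21, 19, 20]
spike_flagged = is_anomalous(steady + [60])
normal_flagged = is_anomalous(steady + [22])
```

Here `spike_flagged` is `True` and `normal_flagged` is `False`: the history has a mean of 20 and a standard deviation of about 1.1, so the threshold sits near 23.3. The rule only looks above the average, so a sudden drop would not be flagged by this sketch.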
  24. 107.

    !=
