crisis 9
respond to a page in the middle of the night
whatever alert sent the page isn't enough to solve the problem, or you would have automated it
if you can see it at a glance, it's faster than grepping through logs => reduce downtime
anomalies 10
notice things you haven't seen before
things you haven't planned for
prompt investigation into odd behavior
see the interactions between lots of things
jenny downing http://www.flickr.com/photos/jenny-pics/3761648387/ 22
There was one significant anomaly - none of the monks in the adjacent monastery contracted cholera. Investigation showed that this was not an anomaly, but further evidence, for they drank only beer, which they brewed themselves.
-- bucket the mean time per call into 50 ms bins and draw one '*' per statement in each bin
select ((total_time::float/calls)::int / 50) * 50 as ms,
       lpad('', count(*)::int, '*') || ' ' || count(*) as freq
from pg_stat_statements
group by 1
order by 1;
\watch 1
44
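
The bucketing in the query above can also be tried outside the database. This is a minimal Python sketch, assuming a plain list of mean call times in milliseconds. The function name `latency_histogram`, the `*` fill character, and the sample values are hypothetical and not taken from the talk.

```python
from collections import Counter

def latency_histogram(mean_ms, bucket=50, fill="*"):
    """Group mean call times (ms) into fixed-width buckets, one text bar per bucket."""
    # Round to whole milliseconds first, then floor to the bucket start,
    # mirroring the ::int cast followed by integer division in the query.
    counts = Counter((round(ms) // bucket) * bucket for ms in mean_ms)
    return [(start, fill * n + " " + str(n)) for start, n in sorted(counts.items())]

# Hypothetical sample: mean milliseconds per call for seven statements.
sample = [3.2, 12.0, 49.4, 51.0, 75.5, 120.1, 130.9]
for start, bar in latency_histogram(sample):
    print(f"{start:>5} {bar}")
```

Each row shows the bucket's lower bound in milliseconds followed by a bar whose length is the number of statements in that bucket.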
pg-extras https://github.com/heroku/heroku-pg-extras
cache hit ratio
index usage
running queries
blocked queries
locks
index size
unused indexes
sequential scans
bloat
48
data is not information 51
just having a metric in Graphite or Librato isn't enough
other people won't have the same context
you probably need to do conversions such as derivatives
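
As one illustration of such a conversion, a raw, ever-increasing counter can be turned into a per-second rate. This is a minimal Python sketch, assuming samples arrive as (timestamp in seconds, counter value) pairs. The function name `per_second_rate` and the sample data are hypothetical.

```python
def per_second_rate(samples):
    """Turn (timestamp_seconds, counter_value) samples into per-second rates."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0 or v1 < v0:
            # Skip out-of-order samples and counter resets (assumed policy).
            continue
        rates.append((t1, (v1 - v0) / dt))
    return rates

# Hypothetical counter sampled every 10 seconds.
samples = [(0, 1000), (10, 1500), (20, 1500), (30, 2700)]
print(per_second_rate(samples))
```

The raw counter only ever grows, which says little at a glance. The rate shows when activity stalled and when it spiked.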
color http://tinyurl.com/colorrules 54
gray or muted background
soft, natural colors for most things, bright colors sparingly
for similar categories: single hue, vary saturation
for different categories: vary hue, but keep saturation
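
The last two rules can be sketched with Python's standard `colorsys` module. This is only an illustration. The function names, the default hue, and the saturation and brightness values are assumptions, not values from the linked color rules.

```python
import colorsys

def to_hex(h, s, v):
    """Convert HSV components in the range 0..1 to a #rrggbb string."""
    r, g, b = colorsys.hsv_to_rgb(h, s, v)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))

def similar_palette(n, hue=0.58, value=0.85):
    """Similar categories: one hue, saturation stepped from soft to strong."""
    return [to_hex(hue, 0.25 + 0.6 * i / max(n - 1, 1), value) for i in range(n)]

def distinct_palette(n, saturation=0.45, value=0.85):
    """Different categories: evenly spaced hues at one fixed, fairly soft saturation."""
    return [to_hex(i / n, saturation, value) for i in range(n)]

print(similar_palette(4))
print(distinct_palette(4))
```

Keeping saturation moderate by default leaves fully saturated colors free for the few things that need attention.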
storage http://www.flickr.com/photos/lightmash/3183278318/ 58
storing time series data is a talk in and of itself => Ronald’s talk yesterday
no way I can do it justice as part of this talk
rollup, table rotation
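
To give a rough idea of what a rollup does, the sketch below collapses fine-grained points into coarser fixed-width time buckets. It is a minimal Python illustration under stated assumptions: points are (timestamp in seconds, value) pairs, and each bucket keeps the average. Real systems may keep the max, sum, or several aggregates instead. The function name `rollup` and the sample data are hypothetical.

```python
from collections import defaultdict

def rollup(points, width=60):
    """Average (timestamp_seconds, value) points into fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Key each point by the start of the bucket it falls into.
        buckets[ts - ts % width].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]

# Hypothetical points at irregular intervals, rolled up to one-minute buckets.
points = [(0, 10.0), (20, 20.0), (40, 30.0), (60, 5.0), (90, 15.0)]
print(rollup(points))
```

Older data can be kept at a coarser width than recent data, trading resolution for storage.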