the series
- record: job:http_requests:rate5m:avg_over_time_1w
  expr: avg_over_time(job:http_requests:rate5m[1w])

# Long-term standard deviation for the series
- record: job:http_requests:rate5m:stddev_over_time_1w
  expr: stddev_over_time(job:http_requests:rate5m[1w])
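As a rough illustration of what these two recorded series make possible, the sketch below computes a mean, a population standard deviation (which is what `stddev_over_time` returns), and a z-score over a handful of made-up samples. The function names mirror the PromQL functions for readability only; the sample values and the zero-deviation guard are assumptions, not part of the talk.

```python
import math


def avg_over_time(samples):
    """Arithmetic mean of the samples in the window."""
    return sum(samples) / len(samples)


def stddev_over_time(samples):
    """Population standard deviation of the samples in the window."""
    mean = avg_over_time(samples)
    variance = sum((x - mean) ** 2 for x in samples) / len(samples)
    return math.sqrt(variance)


def z_score(current, samples):
    """Number of standard deviations between `current` and the window mean."""
    deviation = stddev_over_time(samples)
    if deviation == 0:
        # Assumption: treat a perfectly flat window as "no anomaly".
        return 0.0
    return (current - avg_over_time(samples)) / deviation


# Hypothetical request rates standing in for one week of samples.
week = [100.0, 110.0, 90.0, 105.0, 95.0]
score = z_score(130.0, week)
print(f"z-score: {score:.2f}")
```

A score far from zero marks a value that is unusual relative to the past week, which is only meaningful when the underlying data is roughly normally distributed.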
  job:http_requests:rate5m offset 1w                    # Value from last period
+ job:http_requests:rate5m:avg_over_time_1w             # Add 1w growth trend
- job:http_requests:rate5m:avg_over_time_1w offset 1w
  avg_over_time(job:http_requests:rate5m[4h] offset 166h)  # Rounded value from last period
+ job:http_requests:rate5m:avg_over_time_1w                # Add 1w growth trend
- job:http_requests:rate5m:avg_over_time_1w offset 1w
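The arithmetic behind this prediction can be sketched outside Prometheus. The `offset 166h` shifts the 4h window so that it is centred on exactly one week (168h) ago: it spans 170h to 166h in the past. The example below is a simplified model, assuming one sample per hour, a hypothetical toy series with a linear trend plus a daily pattern, and a symmetric five-sample window in place of the 4h range; none of these names or numbers come from the talk.

```python
HOURS_PER_WEEK = 168


def mean(values):
    return sum(values) / len(values)


def avg_over_time_1w(series, t):
    """Mean of the 168 hourly samples ending at index t."""
    return mean(series[t - HOURS_PER_WEEK + 1 : t + 1])


def predict(series, t, half_window=2):
    """Smoothed value from one week ago plus the week-over-week growth."""
    centre = t - HOURS_PER_WEEK
    # Window centred on one week ago (stands in for `[4h] offset 166h`).
    smoothed = mean(series[centre - half_window : centre + half_window + 1])
    # Difference between this week's average and last week's average.
    growth = avg_over_time_1w(series, t) - avg_over_time_1w(series, centre)
    return smoothed + growth


def toy_series(n_hours):
    """Hypothetical data: linear growth plus a busy daytime plateau."""
    return [
        100.0 + 0.1 * h + (50.0 if 8 <= h % 24 <= 20 else 10.0)
        for h in range(n_hours)
    ]


series = toy_series(400)
t = 348  # an hour in the middle of the daytime plateau
predicted = predict(series, t)
actual = series[t]
print(f"predicted={predicted:.2f} actual={actual:.2f}")
```

On this idealised series the prediction matches the actual value because the seasonal pattern repeats exactly and the trend is linear; real traffic would leave a residual, and that residual is what the anomaly check compares against.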
• The right level of aggregation is the key to anomaly detection
• Z-scores will only work with normally distributed data
• Seasonal metrics are good for anomaly detection
Prometheus Recording Rules for Anomaly Detection
• https://gitlab.com/gitlab-com/runbooks
• #talk-andrew-newdigate
• All the code snippets: https://gitlab.com/snippets/1863717