• Handles millions of time series per instance

http_requests_total{path="/home", status="200", method="GET"} 9523
http_requests_total{path="/home", status="500", method="GET"} 233
http_requests_total{path="/settings", status="200", method="GET"} 512
http_requests_total{path="/settings", status="200", method="POST"} 68

• Targets constantly change ✓
• Need high-level overview (by namespace, service, …) ✓
• Need drill-down for investigation (down to pod and below) ✓
• AND: Make monitoring trivial to deploy & operate
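
The overview and drill-down requirements both come down to operations on labels. The sketch below is plain Python, not a real query engine: it reuses the four `http_requests_total` samples and their `path`, `status`, and `method` labels in place of namespace, service, or pod labels. The helper names `sum_by` and `select` are made up for illustration.

```python
from collections import defaultdict

# Values copied from the http_requests_total samples; one entry per series.
SAMPLES = [
    ({"path": "/home", "status": "200", "method": "GET"}, 9523),
    ({"path": "/home", "status": "500", "method": "GET"}, 233),
    ({"path": "/settings", "status": "200", "method": "GET"}, 512),
    ({"path": "/settings", "status": "200", "method": "POST"}, 68),
]


def sum_by(samples, *label_names):
    """High-level overview: sum values grouped by the given label names."""
    totals = defaultdict(int)
    for labels, value in samples:
        key = tuple(labels[name] for name in label_names)
        totals[key] += value
    return dict(totals)


def select(samples, **matchers):
    """Drill-down: keep only samples whose labels equal every matcher."""
    return [
        (labels, value)
        for labels, value in samples
        if all(labels.get(name) == want for name, want in matchers.items())
    ]


overview = sum_by(SAMPLES, "path")
drill_down = select(SAMPLES, path="/home", status="500")

print(overview)    # {('/home',): 9756, ('/settings',): 580}
print(drill_down)  # [({'path': '/home', 'status': '500', 'method': 'GET'}, 233)]
```

Grouping by fewer labels gives a coarser overview, and adding matchers narrows the view to a single series, so both needs are served by the same labeled data.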

    tier: frontend
spec:
  selector:
    matchLabels:
      tier: frontend
  endpoints:
  - port: web
    path: /metrics
    interval: 30s

• Declarative definition of how to monitor a group of services
• Loosely coupled via labels
• Part of your cluster’s API
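
"Loosely coupled via labels" can be pictured as a simple equality match. The sketch below assumes Kubernetes-style `matchLabels` semantics, where every key/value pair in the selector must be present on the target. The service names and their labels are hypothetical, and a real controller does considerably more than this (namespaces, ports, reconciliation).

```python
def matches(match_labels, service_labels):
    """Return True when every selector pair is present on the service."""
    return all(service_labels.get(key) == value
               for key, value in match_labels.items())


# Hypothetical services; names and labels are illustrative only.
SERVICES = {
    "web-ui": {"tier": "frontend", "app": "shop"},
    "api": {"tier": "backend", "app": "shop"},
    "landing": {"tier": "frontend"},
}

# Mirrors the matchLabels block in the definition above.
selector = {"tier": "frontend"}

selected = sorted(name for name, labels in SERVICES.items()
                  if matches(selector, labels))
print(selected)  # ['landing', 'web-ui']
```

Because the definition only names a label, a new service that carries `tier: frontend` is picked up without editing the monitoring definition itself.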