Monitoring Kubernetes Clusters with Prometheus

Fabian Reinartz, Software Engineer at CoreOS (@fabxc)

Monitoring Challenges
• A lot of targets to monitor
• Targets constantly change
• Need high-level overview (by namespace, service, …)
• Need drill-down for investigation (down to pod and below)

Prometheus Recap
• Pull-based monitoring system
• Multi-dimensional data model
• Handles millions of time series per instance

    http_requests_total{path="/home", status="200", method="GET"} 9523
    http_requests_total{path="/home", status="500", method="GET"} 233
    http_requests_total{path="/settings", status="200", method="GET"} 512
    http_requests_total{path="/settings", status="200", method="POST"} 68
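To make the multi-dimensional data model concrete, here is a minimal Python sketch that treats each of the four samples on this slide as a metric name plus a label set, then aggregates over one label, roughly what a query like `sum by(path) (http_requests_total)` would return. The parser is a simplification that only covers lines of this shape, not the full exposition format, and `parse_sample` and `sum_by` are illustrative names, not Prometheus APIs.

```python
import re
from collections import defaultdict

# The four samples shown on the slide.
SAMPLES = """\
http_requests_total{path="/home", status="200", method="GET"} 9523
http_requests_total{path="/home", status="500", method="GET"} 233
http_requests_total{path="/settings", status="200", method="GET"} 512
http_requests_total{path="/settings", status="200", method="POST"} 68
"""

LINE_RE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_sample(line):
    """Split one sample line into (metric name, label dict, value)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unsupported sample line: {line!r}")
    name, raw_labels, value = match.groups()
    return name, dict(LABEL_RE.findall(raw_labels)), float(value)


def sum_by(samples, label):
    """Sum sample values grouped by the value of a single label."""
    totals = defaultdict(float)
    for _name, labels, value in samples:
        totals[labels[label]] += value
    return dict(totals)


parsed = [parse_sample(line) for line in SAMPLES.splitlines() if line.strip()]

if __name__ == "__main__":
    print(sum_by(parsed, "path"))
    print(sum_by(parsed, "status"))
```

Every label is a dimension, so the same four series can be sliced by `path`, `status`, or `method` without changing how they are stored.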

Powerful Querying

99th percentile latency on API server operations per resource?

    histogram_quantile(0.99,
      sum by(path, le) (rate(request_latency_seconds_bucket[5m]))
    )

    {path="/status"} 0.012
    {path="/"} 0.43
    {path="/api/v1/topics/:topic"} 1.31
    {path="/api/v1/topics"} 0.192
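The quantile in this query is estimated from cumulative histogram buckets (the `le` label is each bucket's upper bound). The following Python sketch shows the general idea: find the bucket that contains the requested rank and interpolate linearly inside it. It is a simplified illustration, not the Prometheus implementation; it assumes non-negative bucket bounds and an explicit `+Inf` bucket, and the bucket counts are made up.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from (upper_bound, cumulative_count) pairs.

    Simplified sketch: assumes 0 < q <= 1, non-negative bounds, and that the
    list contains a +Inf bucket plus at least one finite bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    if len(buckets) < 2 or buckets[-1][0] != math.inf:
        raise ValueError("need at least one finite bucket and a +Inf bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan

    rank = q * total
    # First bucket whose cumulative count reaches the rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    upper, count = buckets[index]
    if upper == math.inf:
        # The rank falls beyond the last finite bound; report that bound.
        return buckets[-2][0]
    lower, previous = (0.0, 0.0) if index == 0 else buckets[index - 1]
    # Linear interpolation inside the selected bucket.
    return lower + (upper - lower) * (rank - previous) / (count - previous)


# Hypothetical bucket counts for one path (le in seconds).
EXAMPLE_BUCKETS = [(0.1, 50), (0.5, 90), (1.0, 99), (math.inf, 100)]

if __name__ == "__main__":
    for q in (0.5, 0.95, 0.999):
        print(q, histogram_quantile(q, EXAMPLE_BUCKETS))
```

Because the estimate is interpolated, its accuracy depends on how closely the bucket boundaries bracket the latencies of interest.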

Powerful Querying

Is any disk about to run full within 4 hours?

    ALERT DiskWillFillIn4Hours
      IF predict_linear(node_filesystem_free[1h], 4*3600) < 0

[Plot: axis marks 0, -1h, now, +4h]
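`predict_linear` fits a straight line through the samples in the given range and extrapolates it into the future. A minimal Python sketch of that idea, using ordinary least squares, is shown below. The sample values are hypothetical, timestamps are expressed in seconds relative to "now", and this function only mirrors the concept rather than the exact Prometheus code path.

```python
def predict_linear(samples, seconds_ahead):
    """Least-squares line through (t, value) samples, evaluated at t = seconds_ahead.

    `t` is in seconds relative to now (now == 0, the past is negative).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    if variance == 0:
        raise ValueError("samples must have distinct timestamps")
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    slope = covariance / variance
    intercept = mean_v - slope * mean_t  # fitted value at t == 0 (now)
    return intercept + slope * seconds_ahead


# Hypothetical free-space readings over the last hour (arbitrary units).
FREE_SPACE = [(-3600, 100.0), (-2400, 88.0), (-1200, 76.0), (0, 64.0)]

if __name__ == "__main__":
    predicted = predict_linear(FREE_SPACE, 4 * 3600)
    print(predicted, "-> alert" if predicted < 0 else "-> ok")
```

With these readings the disk loses 0.01 units per second, so the extrapolated value four hours out is negative and the alert condition `< 0` holds.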

Kubernetes Integration
• Always sync monitoring targets with Kubernetes API
• Use meta information to enrich metrics

[Diagram labels: API Server, sync monitoring targets, scrape metrics]
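The two bullets above can be sketched in Python as a pure function that turns a pod listing into scrape targets enriched with metadata, plus a diff that shows which targets appeared or disappeared between two syncs. The pod dictionaries are a hypothetical, simplified shape (not the real Kubernetes API objects), and `targets_from_pods` and `diff_targets` are illustrative names.

```python
def targets_from_pods(pods):
    """Map scrape address -> labels attached to every metric from that target."""
    targets = {}
    for pod in pods:
        # Only running pods with an IP can be scraped.
        if pod["phase"] != "Running" or not pod.get("ip"):
            continue
        address = f"{pod['ip']}:{pod['port']}"
        labels = {"namespace": pod["namespace"], "pod": pod["name"]}
        labels.update(pod.get("labels", {}))  # enrich with the pod's own labels
        targets[address] = labels
    return targets


def diff_targets(old, new):
    """Return (added, removed) scrape addresses between two syncs."""
    return sorted(new.keys() - old.keys()), sorted(old.keys() - new.keys())


def _pod(name, ip, phase="Running"):
    # Hypothetical helper that builds a simplified pod record.
    return {"name": name, "namespace": "default", "ip": ip, "port": 8080,
            "phase": phase, "labels": {"tier": "frontend"}}


BEFORE = [_pod("frontend-1", "10.0.0.1"), _pod("frontend-2", "10.0.0.2")]
AFTER = [_pod("frontend-2", "10.0.0.2"), _pod("frontend-3", "10.0.0.3"),
         _pod("frontend-4", None, phase="Pending")]

if __name__ == "__main__":
    old, new = targets_from_pods(BEFORE), targets_from_pods(AFTER)
    print(diff_targets(old, new))
    print(new["10.0.0.3:8080"])
```

Since the namespace and pod name travel with each target as labels, the same data supports both the high-level overview and the drill-down mentioned earlier.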

Metrics of Kubernetes

[Diagram labels: Node, node exporter, cAdvisor, API server, Kubelet, etcd0, etcd1, etcd2, ..., pods, kube-state-metrics]

Monitoring Challenges
• A lot of targets to monitor ✓
• Targets constantly change ✓
• Need high-level overview (by namespace, service, …) ✓
• Need drill-down for investigation (down to pod and below) ✓
• AND: Make monitoring trivial to deploy & operate

Managed Deployments

    apiVersion: prometheus.coreos.com/v1alpha1
    kind: Prometheus
    metadata:
      name: prometheus-k8s
    spec:
      replicas: 2
      version: v1.3.0

• Prometheus TPR defines desired Prometheus setup
• Operator deploys and manages Prometheus instances

[Diagram labels: Operator, deploy & manage, Prometheus Server, watch]
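The "watch, then deploy & manage" pattern is a reconciliation loop: compare the desired state from the resource with what is actually running and emit the actions that close the gap. The Python sketch below illustrates one reconciliation step under stated assumptions. The instance naming scheme (`<name>-<index>`) and the action tuples are hypothetical, and nothing here reflects how the actual operator is implemented.

```python
def reconcile(name, spec, running):
    """Compute actions that move `running` toward the desired `spec`.

    spec:    {"replicas": int, "version": str}   (desired state)
    running: {instance_name: version}            (observed state)
    Returns a list of (action, instance_name, version) tuples.
    """
    wanted = [f"{name}-{index}" for index in range(spec["replicas"])]
    actions = []
    for instance in wanted:
        if instance not in running:
            actions.append(("create", instance, spec["version"]))
        elif running[instance] != spec["version"]:
            actions.append(("upgrade", instance, spec["version"]))
    for instance in sorted(running):
        if instance not in wanted:
            # Instance is no longer part of the desired set.
            actions.append(("delete", instance, running[instance]))
    return actions


# Desired state taken from the manifest on this slide.
SPEC = {"replicas": 2, "version": "v1.3.0"}

if __name__ == "__main__":
    print(reconcile("prometheus-k8s", SPEC, {}))
    print(reconcile("prometheus-k8s", SPEC,
                    {"prometheus-k8s-0": "v1.2.0", "prometheus-k8s-2": "v1.3.0"}))
```

Once the observed state matches the spec, the function returns an empty list, which is what makes repeating the step on every watch event safe.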

github.com/coreos/kube-prometheus

    $ cluster-monitoring/deploy

[Diagram labels: etcd0, etcd1, etcd2, kube-state-metrics, pods]

Monitoring as a Cluster Feature

Operator (continued)

    apiVersion: monitoring.coreos.com/v1alpha1
    kind: ServiceMonitor
    metadata:
      name: frontend
      labels:
        tier: frontend
    spec:
      selector:
        matchLabels:
          tier: frontend
      endpoints:
      - port: web
        path: /metrics
        interval: 30s

• Declarative definition of how to monitor a group of services
• Loosely coupled via labels
• Part of your cluster’s API

Annotations on the manifest:
• spec.selector: Select applicable services by their labels
• spec.endpoints: Declare where these services expose metrics
• metadata.labels: Prometheus deployments include ServiceMonitors by their labels

Operator (continued)

    apiVersion: monitoring.coreos.com/v1alpha1
    kind: Prometheus
    metadata:
      name: prometheus-frontend
    spec:
      version: v1.3.0
      serviceMonitors:
      - selector:
          matchLabels:
            tier: frontend

Prometheus deployments include ServiceMonitors by their labels
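Label-based selection happens at two levels here: a Prometheus deployment selects ServiceMonitors, and each ServiceMonitor selects services. The Python sketch below models both steps with plain dictionaries to show how `matchLabels` couples them loosely. The object shapes and the service names are hypothetical and much simpler than the real resources.

```python
def matches(match_labels, labels):
    """True if every key/value in match_labels is present in labels.

    An empty selector matches everything in this sketch.
    """
    return all(labels.get(key) == value for key, value in match_labels.items())


def services_to_scrape(prometheus_selector, service_monitors, services):
    """Resolve Prometheus -> ServiceMonitors -> services via label selectors."""
    selected = set()
    for monitor in service_monitors:
        if not matches(prometheus_selector, monitor["labels"]):
            continue  # this Prometheus deployment ignores the monitor
        for service in services:
            if matches(monitor["selector"], service["labels"]):
                selected.add(service["name"])
    return sorted(selected)


# Hypothetical cluster contents.
SERVICE_MONITORS = [
    {"name": "frontend", "labels": {"tier": "frontend"}, "selector": {"tier": "frontend"}},
    {"name": "backend", "labels": {"tier": "backend"}, "selector": {"tier": "backend"}},
]
SERVICES = [
    {"name": "web", "labels": {"tier": "frontend"}},
    {"name": "shop", "labels": {"tier": "frontend", "team": "checkout"}},
    {"name": "db-proxy", "labels": {"tier": "backend"}},
]

if __name__ == "__main__":
    print(services_to_scrape({"tier": "frontend"}, SERVICE_MONITORS, SERVICES))
```

Adding a new service with the label `tier: frontend` is enough for it to be picked up; neither the ServiceMonitor nor the Prometheus resource needs to change.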

[Diagram labels: Service 1, Service 2, Service 3, Service 4, Service 5, ServiceMonitor 1, ServiceMonitor 2, Prometheus, Operator, deploy & manage, Prometheus Server, watch]

Try it out!
• github.com/coreos/prometheus-operator
• github.com/coreos/kube-prometheus

Thanks! Questions?
• [email protected] @fabxc
• We’re hiring: coreos.com/careers
• Let’s talk! #prometheus on Freenode
• More events: coreos.com/community
• Longer chat?
• also in Berlin!
