Slide 1

Slide 1 text

Setting up Monitoring for Kubernetes
Prathamesh Sonpatki
Last9.io

Slide 2

Slide 2 text

Slide 3

Slide 3 text

Monitoring is crucial
- Prometheus is King of Kubernetes Monitoring 🔥
- Kube-Prometheus-Stack
- Container level monitoring via cAdvisor
- Cluster level monitoring via Kube State Metrics

Slide 4

Slide 4 text

Story of optimizing cAdvisor metrics

Slide 5

Slide 5 text

cAdvisor
- https://github.com/google/cadvisor
- Analyzes resource usage and performance of running containers
- Metrics for specific hardware and software components such as disk, CPU, memory, network, process, TCP, and much more.

Slide 6

Slide 6 text

kubelet-hosted cAdvisor
- https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack
- Standard deployment using Helm

Slide 7

Slide 7 text

Everything is Good

Slide 8

Slide 8 text

Everything is Good.. Right?

Slide 9

Slide 9 text

But wait…

Slide 10

Slide 10 text

But wait…
- 91K samples per minute 😥 😨
- 21 nodes, 600 pods, 125 containers
- ~4B samples per month for cAdvisor metrics alone
- 80% of metrics are unused!
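The monthly figure follows from the per-minute rate. A minimal sketch of that arithmetic, assuming a 30-day month and a constant ingestion rate (neither is stated on the slide; the function name is illustrative):

```python
# Assumption: a 30-day month and a steady ingestion rate.
SAMPLES_PER_MINUTE = 91_000        # figure from the slide
MINUTES_PER_MONTH = 60 * 24 * 30   # 43,200 minutes in a 30-day month


def monthly_samples(samples_per_minute: int, minutes: int = MINUTES_PER_MONTH) -> int:
    """Scale a per-minute sample rate to a monthly total."""
    return samples_per_minute * minutes


total = monthly_samples(SAMPLES_PER_MINUTE)
print(f"{total:,} samples per month (~{total / 1e9:.1f}B)")
# prints: 3,931,200,000 samples per month (~3.9B)
```

That is roughly the ~4B quoted on the slide.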

Slide 11

Slide 11 text

The ratio of metric samples scanned/evaluated over those ingested is never 1:1.

Slide 12

Slide 12 text

Let’s take action
- We don’t need all of these:
  - Accelerator
  - Disk
  - diskIO
  - Network
  - ..
  - TCP
  - …
- Let’s disable these metrics with `--disable_metrics`
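cAdvisor's `--disable_metrics` flag takes a comma-separated list of metric groups. The sketch below only assembles that argument string from the groups named on the slide. The lowercase spellings are an assumption and should be checked against the option names accepted by the cAdvisor version in use; the helper name is hypothetical.

```python
# Metric groups named on the slide. Spellings are assumed to match cAdvisor's
# option names; verify them against your cAdvisor version.
UNNEEDED_GROUPS = ["accelerator", "disk", "diskIO", "network", "tcp"]


def disable_metrics_flag(groups: list[str]) -> str:
    """Build a --disable_metrics argument from a list of metric groups."""
    # De-duplicate while keeping the original order so the flag is stable.
    unique = list(dict.fromkeys(groups))
    return "--disable_metrics=" + ",".join(unique)


print(disable_metrics_flag(UNNEEDED_GROUPS))
# prints: --disable_metrics=accelerator,disk,diskIO,network,tcp
```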

Slide 13

Slide 13 text

But wait…
- We don’t need all of these:
  - Accelerator
  - Disk
  - diskIO
  - Network
  - ..
  - TCP
  - …
- Let’s disable these metrics with `--disable_metrics`

Slide 14

Slide 14 text

Alternate Strategy
- Disable the kubelet-hosted cAdvisor.

Slide 15

Slide 15 text

Alternate Strategy
- Use alternate helm chart

https://github.com/ckotzbauer/helm-charts/tree/main/charts/cadvisor

Slide 16

Slide 16 text

Alternate Strategy
- 65% of savings in samples collected!

https://github.com/ckotzbauer/helm-charts/tree/main/charts/cadvisor
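To put the 65% figure in absolute terms, here is a rough sketch. It assumes the saving applies to the 91K samples per minute baseline quoted earlier and uses a 30-day month; the slides do not state either explicitly.

```python
# Assumptions: the 65% saving is measured against the 91K/minute baseline,
# and a month is 30 days long.
BASELINE_PER_MINUTE = 91_000
SAVINGS_PERCENT = 65
MINUTES_PER_MONTH = 60 * 24 * 30


def remaining_samples(samples: int, savings_percent: int) -> int:
    """Samples left after a percentage saving, using integer arithmetic."""
    return samples * (100 - savings_percent) // 100


per_minute = remaining_samples(BASELINE_PER_MINUTE, SAVINGS_PERCENT)
per_month = per_minute * MINUTES_PER_MONTH
print(f"{per_minute:,} samples/minute, {per_month:,} samples/month")
# prints: 31,850 samples/minute, 1,375,920,000 samples/month
```

Under these assumptions the monthly volume drops from about 3.9B to about 1.4B samples.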

Slide 17

Slide 17 text

Prathamesh Sonpatki
Last9.io
Srestories.dev
o11y.wiki