Kubernetes monitoring 101

In this talk, I describe some common issues in a Kubernetes cluster and the metrics you should monitor to troubleshoot them.

Sergio Moya

July 16, 2018

Transcript

  1. Kubernetes Monitoring 101
    Contain the Complexity of Kubernetes
    Sergio Moya - Senior Software Engineer @ New Relic

  2. Agenda
    ● Why monitoring is a must
    ● What needs to be monitored in Kubernetes
    ● Metric sources
    ● How to monitor
    ● Q&A

  3. Why monitoring is a must
    Ephemerality

  4. What Needs to be Monitored in Kubernetes?
    ● Kubernetes Cluster
    ● Node
    ● Applications
    ● Pod/Deployments
    ● Containers
    ● And more...

  5. Cluster
    Cluster Admin:
    • What is the size of my Kubernetes cluster?
    • How many nodes, namespaces, deployments, pods, and containers do I have running in my cluster?

  6. Cluster
    MONITORING FOR: Cluster overview
    Cluster Admin:
    • What is the size of my Kubernetes cluster?
    • How many nodes, namespaces, deployments, pods, and containers do I have running in my cluster?
    WHAT
    • A snapshot of which objects are included in a cluster
    WHY
    • Kubernetes is managed by various teams (SREs, sysadmins, developers), so it can be difficult to keep track of the current state of a cluster
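
    As a rough illustration of the cluster-overview questions above, here is a minimal sketch that counts cluster objects through the Kubernetes API. It assumes the official `kubernetes` Python client and a working kubeconfig; kubectl or any monitoring agent would give you the same numbers.

      # Minimal sketch: counting cluster objects via the Kubernetes API.
      from kubernetes import client, config

      config.load_kube_config()  # use config.load_incluster_config() when running inside a pod
      core = client.CoreV1Api()
      apps = client.AppsV1Api()

      nodes = core.list_node().items
      namespaces = core.list_namespace().items
      deployments = apps.list_deployment_for_all_namespaces().items
      pods = core.list_pod_for_all_namespaces().items
      containers = sum(len(p.spec.containers) for p in pods)

      print(f"nodes={len(nodes)} namespaces={len(namespaces)} "
            f"deployments={len(deployments)} pods={len(pods)} containers={containers}")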

  7. Node
    Operations:
    • Do we have enough nodes in our cluster?
    • Are the resource requirements of the deployed applications overbooking the existing nodes?

  8. Node
    MONITORING FOR: Node resource consumption
    Operations:
    • Do we have enough nodes in our cluster?
    • Are the resource requirements of the deployed applications overbooking the existing nodes?
    WHAT
    • Resource consumption (used cores, used memory) for each Kubernetes node
    • Total memory vs. used memory
    WHY
    • Ensure that your cluster remains healthy
    • Ensure new deployments will succeed and are not blocked by a lack of resources
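
    One simple way to check whether the deployed applications are overbooking the nodes is to compare the CPU each node can allocate against the CPU requested by the pods scheduled on it. A minimal sketch, assuming the `kubernetes` Python client and ignoring the less common resource-quantity formats:

      from collections import defaultdict
      from kubernetes import client, config

      def cpu_to_millicores(value: str) -> int:
          # "500m" -> 500, "2" -> 2000 (other suffixes are not handled here)
          return int(value[:-1]) if value.endswith("m") else int(float(value) * 1000)

      config.load_kube_config()
      core = client.CoreV1Api()

      # Sum the CPU requests of the pods scheduled on each node.
      requested = defaultdict(int)
      for pod in core.list_pod_for_all_namespaces().items:
          if pod.spec.node_name:
              for c in pod.spec.containers:
                  reqs = (c.resources and c.resources.requests) or {}
                  requested[pod.spec.node_name] += cpu_to_millicores(reqs.get("cpu", "0"))

      # Compare against what each node can actually allocate.
      for node in core.list_node().items:
          allocatable = cpu_to_millicores(node.status.allocatable["cpu"])
          used = requested[node.metadata.name]
          print(f"{node.metadata.name}: {used}/{allocatable} millicores requested")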

  9. Pods
    Operations:
    • Are things working the way I expect them to?
    • Are my apps running and healthy?

  10. Pods/Deployment
    MONITORING FOR: Pods not running
    Operations:
    • Are things working the way I expect them to?
    • Are my apps running and healthy?
    WHAT
    • The number of current pods in a Deployment should be the same as the desired number
    WHY
    • Missing pods may indicate:
      ○ Insufficient resources to schedule a pod
      ○ Unhealthy pods: failing liveness or readiness probes, etc.
      ○ Others
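
    A minimal sketch of the "current vs. desired" check above, assuming the `kubernetes` Python client: list every Deployment and flag the ones whose available replicas fall short of the desired count.

      from kubernetes import client, config

      config.load_kube_config()
      apps = client.AppsV1Api()

      for d in apps.list_deployment_for_all_namespaces().items:
          desired = d.spec.replicas or 0
          available = d.status.available_replicas or 0
          if available < desired:
              print(f"{d.metadata.namespace}/{d.metadata.name}: "
                    f"{available}/{desired} pods available")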

  11. Containers
    DevOps:
    • Are my containers hitting their resource limits and affecting application performance?
    • Are there spikes in resource consumption?
    • Are there any containers in a restart loop?
    • How many container restarts have there been in X amount of time?

  12. Containers
    MONITORING FOR: Container resource usage
    DevOps:
    • Are my containers hitting their resource limits and affecting application performance?
    • Are there spikes in resource consumption?
    WHAT
    • Resource request: the minimum amount of a resource that the scheduler guarantees to the container
    • Resource limit: the maximum amount of the resource that the container is allowed to consume
    WHY
    • If a container hits its CPU limit, the application's performance will be affected
    • If a container hits its memory limit, K8s might terminate it or restart it
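
    For reference, this is roughly how requests and limits are declared on a container. The sketch below builds the spec with the `kubernetes` Python client; the image name and the values are illustrative, not a recommendation.

      from kubernetes import client

      container = client.V1Container(
          name="web",
          image="nginx:1.15",  # illustrative image
          resources=client.V1ResourceRequirements(
              # request: the minimum the scheduler guarantees to this container
              requests={"cpu": "250m", "memory": "128Mi"},
              # limit: the ceiling; CPU beyond it is throttled, memory beyond it
              # gets the container killed and restarted
              limits={"cpu": "500m", "memory": "256Mi"},
          ),
      )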

  13. Containers
    MONITORING FOR: Container restarts
    DevOps:
    • Are there any containers in a restart loop?
    • How many container restarts have there been in X amount of time?
    WHAT
    • A container can be restarted when it crashes or when its memory usage reaches the defined limit
    WHY
    • Under normal conditions, container restarts should not happen
    • A restart indicates an issue with either the container itself or the underlying host
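
    A minimal sketch for spotting containers in a restart loop, assuming the `kubernetes` Python client; the threshold of 3 restarts is arbitrary and would normally be an alert condition over a time window.

      from kubernetes import client, config

      config.load_kube_config()
      core = client.CoreV1Api()

      for pod in core.list_pod_for_all_namespaces().items:
          for status in (pod.status.container_statuses or []):
              if status.restart_count > 3:  # arbitrary threshold
                  print(f"{pod.metadata.namespace}/{pod.metadata.name}/{status.name}: "
                        f"{status.restart_count} restarts")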

  14. Others
    You:
    • What and how many services does my cluster have?
    • What is the current status of my Horizontal Pod Autoscalers?
    • Are my Persistent Volumes well provisioned?
    • Etc.

  15. Metric sources

  16. Metric sources
    ● Kubernetes API
    ● kube-state-metrics
    ● Heapster (deprecated)
    ● Metrics Server
    ● Kubelet and cAdvisor

  17. K8s API
    Pros:
    ● No third party
    ● Up to date
    Cons:
    ● Bottleneck
    ● Missing critical data, e.g. pod resources

  18. kube-state-metrics
    Pros:
    ● Tons of metrics
    ● Well supported
    ● Prometheus format
    Cons:
    ● No data about pods that have not been scheduled yet
    ● Only state, no resources
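
    kube-state-metrics exposes everything as plain Prometheus text, so it is easy to inspect without a full Prometheus setup. A minimal sketch, assuming the service has been port-forwarded locally (for example with `kubectl port-forward svc/kube-state-metrics 8080:8080`):

      import urllib.request

      KSM_URL = "http://localhost:8080/metrics"  # assumed port-forwarded endpoint

      with urllib.request.urlopen(KSM_URL) as resp:
          for line in resp.read().decode().splitlines():
              # e.g. keep only the available-replica gauges per Deployment
              if line.startswith("kube_deployment_status_replicas_available"):
                  print(line)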

  19. Heapster
    Pros:
    ● Tons of metrics
    ● Different backends (sinks)
    ● Exposes Prometheus format
    ● Plug & play
    Cons:
    ● No Prometheus backend (sink)
    ● Resource consumption
    ● Some sinks are not maintained
    ● Deprecated (k8s >= v1.13.0)

  20. Metrics Server
    Pros:
    ● Implements the K8s Metrics API standard
    ● Official
    Cons:
    ● Only a few metrics (CPU and memory)
    ● Early stage (incubator)
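
    Because Metrics Server implements the standard Metrics API (metrics.k8s.io), its data can be read back through the API server itself, which is what `kubectl top` does. A minimal sketch with the `kubernetes` Python client, assuming Metrics Server is installed in the cluster:

      from kubernetes import client, config

      config.load_kube_config()
      custom = client.CustomObjectsApi()

      node_metrics = custom.list_cluster_custom_object(
          group="metrics.k8s.io", version="v1beta1", plural="nodes")
      for item in node_metrics["items"]:
          usage = item["usage"]
          print(f'{item["metadata"]["name"]}: cpu={usage["cpu"]} memory={usage["memory"]}')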

  21. Kubelet + cAdvisor
    Pros:
    ● No third party
    ● All data regarding node, pod, and container resources
    ● Distributed by nature
    Cons:
    ● Only data about nodes, pods, and containers
    ● Some data inconsistency between the API and the Kubelet
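
    The Kubelet (which embeds cAdvisor) serves per-node, per-pod, and per-container usage on its Summary API. A minimal sketch that reads it through the API server's node proxy, assuming the `kubernetes` Python client and permission to proxy to nodes; depending on the client version the response body may need slightly different handling.

      import json
      from kubernetes import client, config

      config.load_kube_config()
      core = client.CoreV1Api()

      for node in core.list_node().items:
          # GET /api/v1/nodes/<name>/proxy/stats/summary
          raw = core.connect_get_node_proxy_with_path(
              name=node.metadata.name, path="stats/summary")
          summary = json.loads(raw) if isinstance(raw, str) else raw
          stats = summary["node"]
          print(f'{node.metadata.name}: '
                f'cpu={stats["cpu"]["usageNanoCores"]}n '
                f'memory={stats["memory"]["workingSetBytes"]}B')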

  22. Metric sources compared
    K8s API
      Pros: no third party; up to date
      Cons: bottleneck; missing critical data, e.g. pod resources
    kube-state-metrics
      Pros: tons of metrics; well supported; Prometheus format
      Cons: no data about pods that have not been scheduled yet; only state, no resources
    Heapster
      Pros: tons of metrics; different backends (sinks); exposes Prometheus format; plug & play
      Cons: no Prometheus backend (sink); resource consumption; some sinks are not maintained; deprecated (k8s >= v1.13.0)
    Metrics Server
      Pros: implements the K8s Metrics API standard; official
      Cons: only a few metrics (CPU and memory); early stage (incubator)
    Kubelet + cAdvisor
      Pros: no third party; all data regarding node, pod, and container resources; distributed by nature
      Cons: only data about nodes, pods, and containers; some data inconsistency between the API and the Kubelet

  23. How to monitor

  24. Heapster + InfluxDB + Grafana
    (architecture diagram; source: blog.couchbase.com)

  25. Custom solutions
    ● A Deployment of pods fetching metrics from any of the sources
    ● A DaemonSet fetching metrics from the Kubelet + cAdvisor on each node
    ● A combination of both
    ● Others?

  26. APM solutions

  27. How does the New Relic Kubernetes integration work under the hood?
    That is a topic for another talk.

  28. Q&A

  29. Thank you