Monitoring in Kubernetes with Prometheus and Grafana

Kubernetes makes it easy and reliable to deploy and run your services. But in order to optimize performance and scale them, you need to know more than just that they are running. In this talk you will learn how to set up a powerful monitoring infrastructure on Kubernetes with Prometheus and Grafana which lets you measure and analyze the latency, availability and resource usage of your complete system as well as individual services and requests.

Bastian Hofmann

July 21, 2020

Transcript

  1. © Copyright 2020 Rancher Labs. All Rights Reserved. Confidential
    Monitoring in Kubernetes
    with Prometheus and Grafana
    BASTIAN HOFMANN
    Field Engineer - DACH

  2. Rancher Technical Overview
    Rancher’s recipe for production quality Kubernetes at scale
    Rancher enables production quality Kubernetes operations everywhere
    [Diagram. Layers: Secure Application Deployment; Shared Tooling & Services; Security & Authentication; Simplified Cluster Operations & Infrastructure Management; Central Management.
    Secure Application Deployment labels: Routing, Autoscaling, Metrics, Load Balancing, Canary, Git Deployments; a "Beta" tag appears in this group.
    Capability labels: Policy management, Pod & network security policies, CIS benchmark monitoring, RBAC policies, Configuration enforcement, Visibility & diagnostics, Centralized audit, Monitoring & alerting, Kubernetes version management, Node pool management, Cluster provisioning.
    Infrastructure agnostic: Amazon EKS, Azure AKS, Google GKE; Cloud, Datacenter, Dev, Branch, Edge. Workloads: Containerized App 1, Containerized App 2, Containerized App 3.]

  3. Agenda
    • How to set up Prometheus and Grafana
    • How to get and visualize metrics from Kubernetes
    • How to get and visualize metrics from your own applications
    • How to add alerts
    • How to collect logs from your applications
    • How Service Meshes help with traffic observability

  4. Demos, Demos, Demos

  5. First, we need a Kubernetes Cluster

  6. Demo

  7. Extending Kubernetes

  8. Standardization of compute, network and service discovery

  9. Extensible API

  10. Easy integration of additional tools

  11. Custom Resource Definitions
    • Extend the Kubernetes API with additional Resource Definitions
      • Certificate
      • MySQLCluster
      • Prometheus
      • …
    • Deploy a controller into the cluster that listens for the creation, change and deletion of these resources and performs the necessary actions
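
The controller pattern described on this slide can be sketched as a loop that dispatches on watch events. This is a minimal in-memory sketch, not the Kubernetes client API: the event dictionaries only mimic the ADDED, MODIFIED and DELETED watch event types, and `reconcile`, `run_controller`, `desired_state` and the sample `Certificate` objects are hypothetical names chosen for illustration.

```python
from typing import Dict, Iterable, List

# In-memory stand-in for "the necessary actions" a real controller would
# perform (for example creating workloads or requesting certificates).
desired_state: Dict[str, dict] = {}


def reconcile(event: dict) -> str:
    """Handle one watch event for a custom resource and report the action."""
    kind = event["object"]["kind"]
    name = event["object"]["metadata"]["name"]
    key = f"{kind}/{name}"
    if event["type"] in ("ADDED", "MODIFIED"):
        # Creation and change are handled the same way: store the latest spec.
        desired_state[key] = event["object"].get("spec", {})
        return f"ensure {key}"
    if event["type"] == "DELETED":
        desired_state.pop(key, None)
        return f"cleanup {key}"
    return f"ignore {key}"


def run_controller(events: Iterable[dict]) -> List[str]:
    """Process a stream of events; a real controller would watch the API server."""
    return [reconcile(event) for event in events]


# Hypothetical sample events for a Certificate custom resource.
sample_events = [
    {"type": "ADDED", "object": {"kind": "Certificate", "metadata": {"name": "web"},
                                 "spec": {"dnsName": "example.org"}}},
    {"type": "MODIFIED", "object": {"kind": "Certificate", "metadata": {"name": "web"},
                                    "spec": {"dnsName": "www.example.org"}}},
    {"type": "DELETED", "object": {"kind": "Certificate", "metadata": {"name": "web"}}},
]

if __name__ == "__main__":
    for action in run_controller(sample_events):
        print(action)
```

Treating creation and change identically keeps the handler idempotent, which is the usual reason controllers are written as "make the world match the spec" rather than as one handler per event type.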

  12.

  13. Prometheus-operator

  14. Manages and Configures Prometheus, Alertmanager (and Grafana)

  15. CustomResourceDefinitions
    • Workload Management
      • Prometheus
      • Alertmanager
    • Prometheus Configuration
      • ServiceMonitor
      • PodMonitor
      • PrometheusRule
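
To make the configuration resources more concrete, here is a small sketch that builds a ServiceMonitor manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` also accepts. The resource name `my-app`, the `app` label and the port name `metrics` are assumptions for illustration and do not come from the talk; whether a Prometheus instance picks the resource up also depends on that instance's own selector settings.

```python
import json


def service_monitor(name: str, app_label: str, port: str = "metrics") -> dict:
    """Build a ServiceMonitor manifest as a plain dictionary."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {"name": name},
        "spec": {
            # Select the Services whose endpoints should be scraped.
            "selector": {"matchLabels": {"app": app_label}},
            # Scrape the named Service port every 30 seconds.
            "endpoints": [{"port": port, "path": "/metrics", "interval": "30s"}],
        },
    }


# Hypothetical application name and label, used only for this example.
manifest = service_monitor("my-app", "my-app")

if __name__ == "__main__":
    print(json.dumps(manifest, indent=2))
```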

  16.

  17. Demo

  18. Monitoring external resources

  19. Blackbox-monitor
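
Assuming this slide refers to black-box probing in the style of the Prometheus Blackbox exporter, the idea can be sketched as follows: call an endpoint from the outside and record whether it answered and how long it took. The function `probe`, the `FakeResponse` class and the URL are hypothetical; the result keys borrow the exporter's `probe_success` and `probe_duration_seconds` metric names, but this is not the exporter's implementation.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Dict


def probe(url: str, fetch: Callable = urllib.request.urlopen,
          timeout: float = 5.0) -> Dict[str, float]:
    """Probe a URL from the outside and return metric-style results."""
    start = time.monotonic()
    try:
        with fetch(url, timeout=timeout) as response:
            success = 1.0 if 200 <= response.status < 300 else 0.0
    except (urllib.error.URLError, OSError):
        # Connection errors, timeouts and HTTP error statuses count as failures.
        success = 0.0
    return {
        "probe_success": success,
        "probe_duration_seconds": time.monotonic() - start,
    }


class FakeResponse:
    """Hypothetical stand-in for an HTTP response, so the demo runs offline."""

    def __init__(self, status: int) -> None:
        self.status = status

    def __enter__(self) -> "FakeResponse":
        return self

    def __exit__(self, *exc_info) -> bool:
        return False


# Offline demo: inject a fake fetch function instead of calling the network.
demo_result = probe("http://service.example", fetch=lambda url, timeout: FakeResponse(200))

if __name__ == "__main__":
    print(demo_result)
```

Passing the fetch function as a parameter keeps the sketch testable without network access; calling `probe(url)` with the default uses `urllib.request.urlopen`.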

  20. Demo

  21. Central log management

  22. Several solutions
    • Commercial
    • Open source
      • Elasticsearch – Fluentd/Fluent Bit/Logstash – Kibana
      • Loki – Promtail – Grafana

  23. Demo

  24. Service Meshes

  25. Service Meshes
    • Observability
    • Traffic Control
    • Traffic Security
    • Connectivity
    • On top of the Kubernetes Pod network

  26. Service Meshes
    • Istio
    • Linkerd
    • Maesh
    • Kuma
    • …

  27.

  28. Service Mesh Features

  29. Traffic Security
    • Automatic mutual TLS encryption of traffic
    • Automatic certificate management
    • Optional certificate-based authentication of traffic
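
What "mutual" means in mutual TLS can be sketched with Python's standard `ssl` module: the server not only presents its own certificate but also refuses clients that do not present one. A service mesh does this transparently in its sidecar proxies; the helper name `mutual_tls_server_context` and its file parameters are hypothetical and only illustrate the setting involved.

```python
import ssl
from typing import Optional


def mutual_tls_server_context(ca_file: Optional[str] = None,
                              cert_file: Optional[str] = None,
                              key_file: Optional[str] = None) -> ssl.SSLContext:
    """Create a server-side TLS context that also demands a client certificate."""
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    # Reject clients that do not present a certificate signed by a trusted CA.
    context.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        # CA that signed the client certificates (the mesh's internal CA).
        context.load_verify_locations(cafile=ca_file)
    if cert_file:
        # The server's own certificate and private key.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


# Without file arguments the context is only configured, not usable for traffic.
context = mutual_tls_server_context()

if __name__ == "__main__":
    print(context.verify_mode)
```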

  30.

  31. Traffic Control
    • Advanced traffic management between services in the mesh
    • Blue/Green Deployments
    • Canary Deployments
    • A/B Testing
    • Fault injection
    • Circuit breakers
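
A canary deployment, one of the traffic management features listed on this slide, comes down to weighted routing between two versions of a service. The following sketch simulates that split in plain Python; the version names `stable` and `canary`, the 90/10 weights and the function names are assumptions for illustration, and a real mesh applies such weights in its proxies rather than in application code.

```python
import random
from collections import Counter
from typing import Dict


def pick_backend(weights: Dict[str, int], rng: random.Random) -> str:
    """Choose a backend version in proportion to its traffic weight."""
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


def simulate(weights: Dict[str, int], requests: int, seed: int = 0) -> Counter:
    """Route a number of simulated requests and count hits per version."""
    rng = random.Random(seed)
    return Counter(pick_backend(weights, rng) for _ in range(requests))


# Hypothetical split: roughly 10% of requests go to the new version.
canary_split = {"stable": 90, "canary": 10}
counts = simulate(canary_split, requests=10_000)

if __name__ == "__main__":
    for version, hits in counts.items():
        print(version, hits)
```

Raising the canary weight step by step while watching its success rate and response time is what ties this feature back to the monitoring setup.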

  32.

  33. Multi-cluster Service Mesh
    • Connect separate Kubernetes clusters
    • Securely handle traffic between clusters

  34. Observability
    • Monitor all TCP connections, HTTP and gRPC requests
      • Bytes in/out
      • Number of requests
      • Success rate
      • Response time
    • Visualize service communication
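
The metrics listed on this slide can all be derived from per-request records that the mesh proxies observe. The sketch below computes them for a handful of sample requests; `RequestRecord`, `summarize` and the sample values are hypothetical, and counting every response below 500 as a success is an assumption made here, not a definition taken from the talk.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List


@dataclass
class RequestRecord:
    status: int         # HTTP status code of the response
    duration_ms: float  # response time in milliseconds
    bytes_in: int       # request size
    bytes_out: int      # response size


def summarize(records: List[RequestRecord]) -> Dict[str, float]:
    """Aggregate request count, success rate, response time and traffic volume."""
    if not records:
        return {"requests": 0, "success_rate": 0.0, "median_ms": 0.0,
                "bytes_in": 0, "bytes_out": 0}
    # Assumption: server errors (5xx) are failures, everything else succeeds.
    successes = sum(1 for r in records if r.status < 500)
    return {
        "requests": len(records),
        "success_rate": successes / len(records),
        "median_ms": median(r.duration_ms for r in records),
        "bytes_in": sum(r.bytes_in for r in records),
        "bytes_out": sum(r.bytes_out for r in records),
    }


# Hypothetical sample traffic for one service.
sample = [
    RequestRecord(200, 10.0, 100, 1000),
    RequestRecord(200, 20.0, 100, 1000),
    RequestRecord(503, 30.0, 100, 0),
    RequestRecord(200, 40.0, 100, 1000),
]
summary = summarize(sample)

if __name__ == "__main__":
    print(summary)
```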

  35. Tracing
    • Trace a single incoming request through all services
    • Enhance with application spans through OpenTracing
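
Following one request through several services works because each hop forwards a shared trace identifier and records its own span. The sketch below shows that propagation in plain Python. The header names `x-trace-id` and `x-parent-span-id`, the service names and the helper `start_span` are hypothetical; real systems use an established header format and a tracing library instead.

```python
import uuid
from typing import Dict, List

# Collected spans; a real setup would send these to a tracing backend.
spans: List[dict] = []


def start_span(service: str, headers: Dict[str, str]) -> Dict[str, str]:
    """Record a span for this service and return headers for the next hop."""
    # Reuse the incoming trace id, or start a new trace at the edge.
    trace_id = headers.get("x-trace-id") or uuid.uuid4().hex
    span_id = uuid.uuid4().hex[:16]
    spans.append({
        "trace_id": trace_id,
        "span_id": span_id,
        "parent_id": headers.get("x-parent-span-id"),
        "service": service,
    })
    return {"x-trace-id": trace_id, "x-parent-span-id": span_id}


def handle_request() -> None:
    """Simulate one request passing through three hypothetical services."""
    headers: Dict[str, str] = {}
    for service in ("frontend", "cart", "payments"):
        headers = start_span(service, headers)


handle_request()

if __name__ == "__main__":
    for span in spans:
        print(span["service"], span["trace_id"], span["parent_id"])
```

The parent identifiers are what allow a tracing UI to draw the call tree; application-level spans add further children inside a single service.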

  36. Demo

  37. Thank you
    Bastian Hofmann
    Field Engineer, DACH