D2-2 Jorge Salamero Sanz - You’re Monitoring Containers Wrong

I hate to say it, but there’s a lot of bad advice out there on how to monitor your containers in production. In this talk I hope to add some much-needed clarity on how to best monitor containers to manage the health of your applications.

My advice is built on our experience creating container monitoring solutions and on what we’ve learned from working with hundreds of customers deploying Docker. I’ll cover key questions like:

Why is it so hard to get visibility into Docker containers?
How far can you get with the Docker stats API?
What metrics really matter for your containerized applications?
How does Kubernetes improve monitoring? How does it get in the way?
What open source tools can help with these challenges?

In addition to this theory, I’ll get into some real examples that will ground our discussion. You’ll walk away with a new appreciation of what it takes to monitor your environment right, as well as a few ideas that you can put into practice right away.

DevOpsDays Zurich

May 08, 2017

Transcript

  1. @bencerillo / @sysdig How to Monitor Microservices on Docker?
    Apps Infra Health Checks JVM/JMX Custom metrics Metrics Processing Unicorns, rainbows and cute dashboards
  2. % whoami
    Jorge Salamero Sanz <[email protected]>
    • Working on OSS for the last 12 years
    • Working on monitoring for the last 3 years
    • Containers gamer @sysdig
    @bencerillo @sysdig
  3. Agenda
    • Challenges of container infrastructures
    • Traditional monitoring limitations
    • Best practices for monitoring microservices
    • Sysdig, container-native monitoring & troubleshooting
  4. Traditional deployment
    Full host OS kernel systemd syslogd App services MySQL Nginx OpenSSL Java App A App B App C Ops Devs
  5. The Docker revolution
    • Servers: Unit: machine; Orchestration: no; Architecture: monolithic
    • Virtual Machines: Unit: machine; Orchestration: external; Architecture: monolithic
    • Containers: Unit: (micro)service; Orchestration: native; Architecture: distributed
  6. Containerized deployment
    Full host OS kernel + Docker MySQL App A Ops DevOps Nginx + OpenSSL App B Java 8.0 build XXX App C
  7. … but in reality
    Database App Cache/Frontend
    Computing node Computing node Computing node Computing node Computing node Computing node
  9. Container monitoring
    New challenges:
    1. How do we get the metrics?
    2. How do we shape this volume of metrics?
    3. Analysis and troubleshooting
    4. Teams on microservices infrastructure
  11. 1. Metric collection
    • We love containers, because they:
      – are simple
      – are small
      – are isolated
      – have fewer dependencies
    • … but they are an opaque black box
  12. “Workarounds”
    • Agent in the Docker container
    • Agent in the Kubernetes pod
    • Export metrics through an external agent
    App Agent App Agent App Agent App App App
    1. Complex instrumentation (x2 because just the monitoring) plus service monitoring configuration
    2. Limited and pre-established metric collection (Docker API, etc.)
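To make the "limited and pre-established" point concrete, here is a minimal sketch of what one sample from the Docker stats API can be turned into: a CPU percentage for the container as a whole, with nothing about the application running inside it. The field names (`cpu_stats`, `precpu_stats`, `system_cpu_usage`, `online_cpus`) follow the Docker Engine API stats response as I understand it; the `sample` values are made up for illustration, and `cpu_percent` is a hypothetical helper, not part of any Docker client library.

```python
def cpu_percent(stats: dict) -> float:
    """Compute container CPU usage (%) from a single Docker stats sample."""
    cpu = stats["cpu_stats"]
    precpu = stats["precpu_stats"]

    # CPU time consumed by the container between the two readings.
    cpu_delta = cpu["cpu_usage"]["total_usage"] - precpu["cpu_usage"]["total_usage"]
    # CPU time consumed by the whole host over the same interval.
    system_delta = cpu.get("system_cpu_usage", 0) - precpu.get("system_cpu_usage", 0)
    if cpu_delta <= 0 or system_delta <= 0:
        return 0.0

    # Fall back to the per-CPU list, then to 1, if online_cpus is absent.
    online_cpus = (
        cpu.get("online_cpus")
        or len(cpu["cpu_usage"].get("percpu_usage") or [])
        or 1
    )
    return cpu_delta / system_delta * online_cpus * 100.0


# Made-up sample, trimmed to the fields used above.
sample = {
    "cpu_stats": {
        "cpu_usage": {"total_usage": 400_000_000},
        "system_cpu_usage": 10_000_000_000,
        "online_cpus": 4,
    },
    "precpu_stats": {
        "cpu_usage": {"total_usage": 300_000_000},
        "system_cpu_usage": 8_000_000_000,
    },
}

print(round(cpu_percent(sample), 1))
```

The result is a resource counter per container. Request latency, error rates, or per-query timings are simply not in this payload, which is the limitation the slide points at.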
  13. Why is this cool?
    • Just one instrumentation per host:
      – spawning or destroying a container is instrumentation-less
    • Full visibility, all the system calls:
      – automatic service discovery
      – all metrics collection (no filtering)
      – application monitoring without instrumentation (magic of decoding protocols)
  15. Remember... but in reality:
    Database App Cache/Frontend
    Computing node Computing node Computing node Computing node Computing node Computing node
  16. 2. Information aggregation
    • Infrastructure monitoring should be transparent and automatic (no instrumentation, no configuration)
    • You should handle your custom/biz metrics
    • All metrics should be tagged automatically
    • All metrics should be aggregated and segmented at the service level
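A minimal sketch of what "tagged, then aggregated and segmented at the service level" can look like in practice. Everything here is illustrative: the `samples` data, the tag names (`service`, `host`), and the `aggregate_by` helper are assumptions made for the example, not the format of any particular monitoring tool.

```python
from collections import defaultdict

# Made-up per-container samples, already tagged with orchestrator metadata.
samples = [
    {"container": "web-1", "host": "node-a", "service": "frontend", "cpu_pct": 30.0},
    {"container": "web-2", "host": "node-b", "service": "frontend", "cpu_pct": 50.0},
    {"container": "db-1", "host": "node-a", "service": "database", "cpu_pct": 70.0},
]


def aggregate_by(samples, tag, metric):
    """Average `metric` across all samples that share the same value of `tag`."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample[tag]].append(sample[metric])
    return {key: sum(values) / len(values) for key, values in groups.items()}


# The same raw data, segmented two ways just by switching the tag.
print(aggregate_by(samples, "service", "cpu_pct"))
print(aggregate_by(samples, "host", "cpu_pct"))
```

Because the tags travel with every sample, the service view and the host view come from the same data. No per-service configuration is needed when containers move between nodes.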
  17. How Kubernetes changes monitoring
    • New layers
    • Services / containers not coupled with nodes
    • Number of instances can dynamically change
    • Killing a process is most probably fine
    • Understanding Kubernetes metadata is an absolute requirement
    • Tagging is cool, autodiscovery is better
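One way to picture why Kubernetes metadata matters: a service is defined by a label selector, not by a list of hosts, so a monitoring tool has to resolve "which pods belong to this service right now" from labels. The sketch below mimics equality-based label selection in plain Python. The pod and service data are made up, and `selects` and `pods_by_service` are hypothetical helpers, not Kubernetes client calls.

```python
def selects(selector: dict, labels: dict) -> bool:
    """Equality-based selection: every selector key/value must be present in the labels.

    An empty selector matches nothing here, mirroring a Service defined without a selector.
    """
    return bool(selector) and all(labels.get(k) == v for k, v in selector.items())


def pods_by_service(services: dict, pods: list) -> dict:
    """Map each service name to the names of the pods its selector currently matches."""
    return {
        name: [pod["name"] for pod in pods if selects(selector, pod["labels"])]
        for name, selector in services.items()
    }


# Made-up cluster state: pods land on whatever node the scheduler picks.
pods = [
    {"name": "web-7f9c", "node": "node-a", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "web-2b1d", "node": "node-c", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "redis-0", "node": "node-a", "labels": {"app": "redis", "tier": "cache"}},
]
services = {"web": {"app": "web"}, "redis": {"app": "redis"}}

print(pods_by_service(services, pods))
```

Rerunning the lookup after a scale-up or a reschedule picks up the new pods automatically, which is the difference between static tagging and autodiscovery.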
  18. The 5 layers to monitor in Kubernetes
    • Application
    • Services
    • Kubernetes deployment
    • Kubernetes internals
    • Host / node
  20. 3. Analysis & troubleshooting
    • Imagine: strace + wireshark + htop + lsof + iostat + vmstat + *
    • These tools are not available on containers and don’t understand namespaces
    • Metrics and logs can bite you in the ass, system calls have all the truth
    • Infrastructure gets more complex and volatile
  22. 4. Teams by service
    • Tags/Metadata from the orchestration platform, e.g. Kubernetes:
      – namespaces (dev, prod)
      – services, deployments, RCs, pods
      – custom tags
    • ACLs out of the box (dashboards, alerts, etc.) on multi-tenant/PaaS scenarios
  24. Sysdig
    • 100% open-source
    • 1M+ downloads
    • Host analysis
    • sysdig.org
    • SaaS & on-prem
    • 200+ customers
    • Cluster analysis: Kubernetes, OpenShift, Swarm, Mesos, etc.
    • Dashboards, alerts, events, teams
    • sysdig.com