Monitoring Containers Correctly

Michael
October 01, 2018

Michael Kehoe walks you through building a small monitoring utility for cgroup containers to illustrate best practices in container monitoring. You'll explore various cgroup constraints and learn how to specifically monitor for each of them to ensure that your application is behaving as expected. Along the way, Michael shares tricks and tips about monitoring containerized applications.

Transcript

  1. Getting Started
     • Set up your workshop platform: https://app.strigo.io/event/QXDpmTiRAufQ4LBis
     • Token: F7C7
     • Background slides: https://bit.ly/2NcEBQN
     • Code repo: https://github.com/michael-kehoe/container-monitoring-workshop
     • Please let me know ASAP if you’re having problems
  2. Today’s agenda
     1 Introductions
     2 Container Primitives
     3 What we’ll monitor
     4 Cgroup interface file formats
     5 Exercises
  3. Today’s agenda: Exercises
     100 CPU Basics
     101 CPU Enhanced
     102 CPU Advanced
     200 Memory Basics
     201 Memory Enhanced
     300 IO Basics
     400 PID
  4. $ WHOAMI: Michael Kehoe
     • Staff Site Reliability Engineer @ LinkedIn
     • Production-SRE Team
     • Funny accent = Australian + 4 years American
     • Worked on:
       • Networks
       • Micro-services
       • Traffic Engineering
       • Databases
  5. $ WHOAMI: Production-SRE Team @ LinkedIn
     • Disaster Recovery – Planning & Automation
     • Incident Response – Process & Automation
     • Visibility Engineering – Making use of operational data
     • Reliability Principles – Defining best practice & automating it
  6. Containers
     • cgroups: Limiting the resources that can be used by a process or set of processes
     • Namespaces: Isolating filesystem resources
     • Copy on Write: Implicit sharing or shadowing
     • Linux Security Modules: Locking down container privileges
  7. Cgroup
     • Abbreviation for ‘Control Groups’
     • Provides:
       • Resource Limiting
       • Prioritization
       • Accounting
       • Control
  8. What we’ll monitor: CPU
     • 100: Basic cgroup CPU utilization
     • 101: Enhanced cgroup CPU utilization (with percentiles)
     • 102: cgroup throttles
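
The three CPU exercises can be sketched roughly as follows. This is a minimal illustration, not the workshop's code: it assumes the cgroup v1 layout, where `cpuacct.usage` reports cumulative CPU time in nanoseconds and `cpu.stat` exposes `nr_periods`, `nr_throttled`, and `throttled_time`. The function names are hypothetical, and the percentile uses the simple nearest-rank method.

```python
import math


def parse_flat_keyed(text):
    """Parse 'key value' lines (e.g. the contents of cpu.stat) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def cpu_percent(usage_start_ns, usage_end_ns, interval_s):
    """CPU utilization as a percentage of one core over the sampling interval.

    The inputs are two readings of cpuacct.usage (assumed cgroup v1, nanoseconds).
    """
    return (usage_end_ns - usage_start_ns) / (interval_s * 1e9) * 100.0


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def throttle_ratio(cpu_stat):
    """Fraction of CFS enforcement periods in which the cgroup was throttled."""
    periods = cpu_stat.get("nr_periods", 0)
    if periods == 0:
        return 0.0
    return cpu_stat.get("nr_throttled", 0) / periods


if __name__ == "__main__":
    # Illustrative values only; on a real host these would be read from the
    # container's cpu and cpuacct cgroup directories.
    stat = parse_flat_keyed("nr_periods 200\nnr_throttled 50\nthrottled_time 1234567\n")
    print("throttle ratio:", throttle_ratio(stat))
    print("cpu %:", cpu_percent(0, 500_000_000, 1.0))
    print("p90:", percentile([10, 20, 30, 40, 50], 90))
```

Tracking a high percentile alongside the average helps surface short CPU bursts that an average hides, and the throttle ratio shows how often the container hit its CPU quota.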
  9. What we’ll monitor: Memory
     • 200: Memory Basics
       • Cgroup utilization
     • 201: Enhanced Memory Metrics
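
A rough sketch of the memory checks, again assuming the cgroup v1 interface files `memory.usage_in_bytes`, `memory.limit_in_bytes`, and `memory.stat`. The helper names and the `UNLIMITED_THRESHOLD` cut-off are hypothetical choices for illustration, not taken from the workshop repository.

```python
from pathlib import Path

# Assumption: cgroup v1 reports a very large number (close to 2**63) when no
# memory limit is set, so any limit above this threshold is treated as unlimited.
UNLIMITED_THRESHOLD = 1 << 60


def parse_memory_stat(text):
    """Parse the 'key value' lines of memory.stat into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def memory_percent(usage_bytes, limit_bytes):
    """Usage as a percentage of the limit, or None when the cgroup is unlimited."""
    if limit_bytes <= 0 or limit_bytes >= UNLIMITED_THRESHOLD:
        return None
    return usage_bytes / limit_bytes * 100.0


def sample_memory(cgroup_dir):
    """Read the basic and enhanced memory metrics from one cgroup directory."""
    base = Path(cgroup_dir)
    usage = int((base / "memory.usage_in_bytes").read_text().strip())
    limit = int((base / "memory.limit_in_bytes").read_text().strip())
    stat = parse_memory_stat((base / "memory.stat").read_text())
    return {
        "usage_bytes": usage,
        "limit_bytes": limit,
        "percent": memory_percent(usage, limit),
        "rss_bytes": stat.get("rss"),
        "cache_bytes": stat.get("cache"),
    }
```

Splitting usage into `rss` and `cache` matters because page cache counts toward the cgroup's usage but can usually be reclaimed, so a container near its limit is not necessarily close to an out-of-memory kill.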