Monitor your containers with the Elastic Stack

Monica Sarbu
November 15, 2016

Containers and orchestration systems like Kubernetes are quickly gaining popularity as the preferred tools for deploying and running microservices. Although they are easier to deploy and isolate, containerized applications create new challenges for logging and monitoring systems.

One popular solution for logging and monitoring is the Elastic Stack composed of Elasticsearch, Logstash, Kibana, and Beats. This talk shows you how to use the Elastic Stack, and in particular the Beats lightweight shippers, to collect logs and metrics from your containers.

The session includes details about how to:
fetch the logs of the containers with Filebeat
collect container metrics with Metricbeat
monitor the network traffic exchanged between containers with Packetbeat
automatically discover metadata from Docker containers
visualize the collected data with predefined Kibana dashboards
scale Logstash deployments

Transcript

  1. Multiple data types, one place
     • Docker metrics • Docker logs • MySQL logs • MySQL transactions • Redis metrics • Redis logs • Redis transactions • Apache logs • HTTP transactions • flows • CPU % • memory % • diskIO • filesystem
  2. Filebeat
     • Tails log files, without parsing them
     • “At least once” guarantees, handles backpressure
     • Extra powers:
       • Multiline
       • JSON logs
       • Filtering
  3. This means...
     • Filebeat automatically adapts its speed to what the next stage can process
     • But: be aware of this when benchmarking
  4. When the next stage is down...
     • Filebeat patiently waits
     • Log lines are not lost
     • It doesn’t allocate memory, it doesn’t buffer things on disk
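The waiting behaviour described on the two slides above can be sketched in a few lines of Python. This is only an illustration of the general idea (remember a read offset and advance it only once the next stage has accepted a batch), not Filebeat's actual implementation; `ship`, `send`, and `batch_size` are hypothetical names introduced for the example.

```python
import io


def ship(source, send, offset=0, batch_size=2):
    """Forward complete lines from `source`, starting at byte `offset`.

    The offset only advances after `send` accepts a batch, so lines are
    never skipped ("at least once"). When `send` refuses, we stop and
    return the unchanged offset instead of buffering anything.
    """
    while True:
        source.seek(offset)
        batch, consumed = [], 0
        for _ in range(batch_size):
            line = source.readline()
            if not line.endswith(b"\n"):
                break  # end of file or partial line: wait for more data
            batch.append(line.rstrip(b"\n"))
            consumed += len(line)
        if not batch:
            return offset  # nothing new to ship
        if not send(batch):
            return offset  # next stage is down: keep our position, retry later
        offset += consumed


# Hypothetical demo: the next stage is down first, then comes back.
log = io.BytesIO(b"line 1\nline 2\nline 3\n")
delivered = []


def stage_down(batch):
    return False


def stage_up(batch):
    delivered.extend(batch)
    return True


pos_while_down = ship(log, stage_down)            # nothing is acknowledged
pos_after_retry = ship(log, stage_up, pos_while_down)
```

Because the only state is the offset into the file, a slow or unavailable next stage simply slows the reader down; the log file itself acts as the buffer.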
  5. Centralize Docker logs: option 1/5
     • Use the Docker gelf driver and the Logstash-gelf-input
     • Pros:
       • No shipper to install, send directly to Logstash
     • Cons:
       • UDP based, no delivery guarantees, no congestion control
  6. Centralize Docker logs: option 2/5
     • Use the Docker JSON driver, use Filebeat with the JSON support
     • Pros:
       • Simple (default driver)
       • Easy to add container metadata (name, labels, etc.)
       • `docker logs` works
     • Cons:
       • JSON driver can slow down Docker
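To make option 2 more concrete: the Docker JSON driver writes one JSON object per log line, with `log`, `stream`, and `time` keys. The Python sketch below decodes such a line and attaches container metadata. It only illustrates the data shape; the output field names and the `parse_docker_json_line` helper are assumptions made for this example and are not Filebeat's schema.

```python
import json


def parse_docker_json_line(raw, metadata):
    """Decode one line written by the Docker JSON log driver.

    `metadata` stands in for container details (name, labels, ...) that
    the shipper would add; how it is obtained is out of scope here.
    """
    entry = json.loads(raw)
    return {
        "message": entry["log"].rstrip("\n"),  # the driver keeps the newline
        "stream": entry["stream"],             # "stdout" or "stderr"
        "@timestamp": entry["time"],
        "docker": metadata,
    }


# Hypothetical sample line and metadata, made up for illustration.
sample = '{"log":"GET /health 200\\n","stream":"stdout","time":"2016-11-15T10:00:00.000000000Z"}'
event = parse_docker_json_line(sample, {"name": "web-1", "labels": {"env": "demo"}})
```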
  7. Centralize Docker logs: option 3/5
     • Use the Docker syslog driver, and a local syslog server, then Filebeat for shipping
     • Pros:
       • Good control over the path where the files are written, rotation strategies, etc.
     • Cons:
       • you need to manage the syslog server
       • metadata is serialized as string, needs to be deserialized again (opportunity for mistakes)
       • multiline is difficult because data from containers can be mixed
  8. Centralize Docker logs: option 4/5
     • Use the Docker journald driver then Filebeat for shipping
     • Pros:
       • journald is often already available
       • convenient support for metadata
       • `docker logs` works
     • Cons:
       • Filebeat doesn’t yet support journald (a Journalbeat exists, however)
  9. Centralize Docker logs: option 5/5
     • Mount a volume and have your app write logs into the volume
     • Pros:
       • If your app can rotate its own logs, it’s very easy to set up
       • Scales well
     • Cons:
       • Difficult to pass metadata
  10. Centralize Docker logs: conclusion
     • The json driver, syslog driver, and shared volume are pretty good options today
     • The journald driver might be a better option in the future
  11. Querying the Docker API (in progress)
     • Dedicated Docker module
     • Has access to container names and labels
     • Easy to set up
     • Offers:
       • CPU and memory
       • Docker container information
       • network (in/out bytes, dropped)
       • diskIO (reads/writes)
       • status of containers (# of stopped, running, etc.)
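As a rough illustration of how a CPU percentage can be derived from the Docker API, the Python sketch below works on one sample of the container stats response. The field names (`cpu_stats`, `precpu_stats`, `system_cpu_usage`, `online_cpus`) follow the Docker Engine stats payload as commonly documented and should be checked against your API version. The `cpu_percent` helper and the sample numbers are made up for this example, and this is not a claim about how the Metricbeat Docker module computes its values.

```python
def cpu_percent(stats):
    """Estimate container CPU usage (%) from one Docker stats sample.

    Compares the container's CPU time delta with the host's CPU time
    delta between the previous and the current reading.
    """
    cpu, pre = stats["cpu_stats"], stats["precpu_stats"]
    cpu_delta = cpu["cpu_usage"]["total_usage"] - pre["cpu_usage"]["total_usage"]
    system_delta = cpu["system_cpu_usage"] - pre["system_cpu_usage"]
    if system_delta <= 0 or cpu_delta < 0:
        return 0.0  # first sample or counter reset: no meaningful delta
    return cpu_delta / system_delta * cpu.get("online_cpus", 1) * 100.0


# Hypothetical sample, trimmed to the fields used above.
sample_stats = {
    "cpu_stats": {
        "cpu_usage": {"total_usage": 1500},
        "system_cpu_usage": 12000,
        "online_cpus": 2,
    },
    "precpu_stats": {
        "cpu_usage": {"total_usage": 1000},
        "system_cpu_usage": 10000,
    },
}
usage = cpu_percent(sample_stats)
```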
  12. Reading cgroup data from /proc/
     • Doesn’t require access to the Docker API (which can be a security issue)
     • Works for any container runtime (Docker, rkt, runC, LXD, etc.)
     • Part of the system module
     • Automatically enhances process data with cgroup information
     • Cannot get the container name and labels
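The cgroup approach can be illustrated with a short Python sketch that parses the contents of a `/proc/<pid>/cgroup` file, where each line has the form `hierarchy-ID:controller-list:cgroup-path`. The helper names and the sample text are hypothetical, and the `/docker/<id>` path layout is an assumption that varies with the runtime and cgroup driver. Consistent with the slide, only an ID can be recovered this way, not the container name or labels.

```python
import re

# Docker-style container IDs are 64 hexadecimal characters.
CONTAINER_ID = re.compile(r"[0-9a-f]{64}")


def parse_cgroup(text):
    """Map each cgroup controller to its path, given /proc/<pid>/cgroup text."""
    paths = {}
    for line in text.splitlines():
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue  # skip malformed or empty lines
        _hierarchy, controllers, path = parts
        for controller in controllers.split(","):
            paths[controller] = path
    return paths


def container_id(paths):
    """Return the first container-like ID found in any cgroup path, if any."""
    for path in paths.values():
        match = CONTAINER_ID.search(path)
        if match:
            return match.group(0)
    return None


# Hypothetical sample; on a real host, read open(f"/proc/{pid}/cgroup").read().
cid = "0123456789abcdef" * 4
sample = f"11:memory:/docker/{cid}\n10:cpu,cpuacct:/docker/{cid}\n"
cgroups = parse_cgroup(sample)
found = container_id(cgroups)
```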
  13. Elasticsearch BKD trees
     • Added for Geo-points
     • faster to index
     • faster to query
     • more disk-efficient
     • more memory-efficient
  14. Float values
     • half floats
     • scaled floats (using a scaling factor) - great for things like percentage points
     [Chart: on-disk usage in kB (points and doc_values) for float, half float, scaled float (factor = 4000), and scaled float (factor = 100)]
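The idea behind the two compact float types can be shown with plain Python arithmetic. A scaled float is stored as an integer (the value multiplied by the scaling factor and rounded), while a half float uses 16 bits instead of 32. The helper names are hypothetical, and Python's `round` only approximates the rounding Elasticsearch applies; the sketch is meant to show the precision trade-off, not the on-disk format.

```python
import struct


def to_scaled(value, factor):
    """Store a float as an integer: value * factor, rounded."""
    return round(value * factor)


def from_scaled(stored, factor):
    """Recover the (possibly less precise) float from the stored integer."""
    return stored / factor


cpu_pct = 0.4237  # e.g. a CPU usage ratio

# Factor 100 keeps two decimal places, enough for percentage points.
stored = to_scaled(cpu_pct, 100)
restored = from_scaled(stored, 100)

# Half float: 2 bytes instead of 4, at the cost of precision.
half_bytes = struct.pack("<e", cpu_pct)
half_value = struct.unpack("<e", half_bytes)[0]
float_bytes = struct.pack("<f", cpu_pct)
```

A larger factor such as 4000 preserves more decimal places but produces larger integers, which is one plausible reading of why the two scaled-float bars in the chart differ.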
  15. Why Elasticsearch for time series
     • Horizontal scalability. Mature and battle-tested cluster support.
     • Flexible aggregations (incl. moving averages & Holt-Winters)
     • One system for both logs and metrics
     • Timelion UI, Grafana
     • Great ecosystem: e.g. alerting tools
  16. Unknown traffic, use flows
     • Look into data for which we don’t understand the application layer protocol
       • TLS
       • Protocols we don’t yet support
     • Get data about IP / TCP / UDP layers
       • number of packets & bytes
       • retransmissions
       • inter-arrival time
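A flow, in the sense used on this slide, can be sketched as an aggregation of packets that share the same protocol, addresses, and ports. The Python example below counts packets and bytes per flow and derives a mean inter-arrival time. The packet dictionaries and the `aggregate_flows` helper are hypothetical; Packetbeat's actual flow handling (for example bidirectional matching and retransmission tracking) is not modelled here.

```python
from collections import defaultdict


def aggregate_flows(packets):
    """Group packets by (proto, src, sport, dst, dport) and sum counters."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0, "first": None, "last": None})
    for pkt in packets:
        key = (pkt["proto"], pkt["src"], pkt["sport"], pkt["dst"], pkt["dport"])
        flow = flows[key]
        flow["packets"] += 1
        flow["bytes"] += pkt["size"]
        if flow["first"] is None:
            flow["first"] = pkt["ts"]
        flow["last"] = pkt["ts"]
    return dict(flows)


def mean_inter_arrival(flow):
    """Average gap between packets of one flow, or None for a single packet."""
    if flow["packets"] < 2:
        return None
    return (flow["last"] - flow["first"]) / (flow["packets"] - 1)


# Hypothetical capture: two TCP packets in one flow, one UDP packet in another.
packets = [
    {"ts": 0.0, "proto": "tcp", "src": "10.0.0.1", "sport": 40000, "dst": "10.0.0.2", "dport": 443, "size": 100},
    {"ts": 0.2, "proto": "udp", "src": "10.0.0.3", "sport": 5353, "dst": "10.0.0.2", "dport": 53, "size": 60},
    {"ts": 0.5, "proto": "tcp", "src": "10.0.0.1", "sport": 40000, "dst": "10.0.0.2", "dport": 443, "size": 200},
]
flows = aggregate_flows(packets)
tcp_flow = flows[("tcp", "10.0.0.1", 40000, "10.0.0.2", 443)]
```

None of this requires decoding the application layer, which is why flows remain useful for TLS traffic and for protocols that are not yet supported.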