Slide 1

Slide 1 text

Monitoring In Motion
Challenges in Monitoring Kubernetes & Containers
Cloud Native SF Meetup
Feb 25, 2016
Ilan Rabinovitch
Director, Community
Datadog

Slide 2

Slide 2 text

About Me
● Longtime Datadog user.
● Prior to Datadog, built automation and monitoring tooling at Ooyala and Edmunds.com.
● SCALE and TXLF Co-Founder.
Ilan Rabinovitch
Datadog
[email protected]
@irabinovitch

Slide 3

Slide 3 text

Agenda
• Monitoring 101 - Crash Course
• Challenges in Monitoring Dynamic Infrastructure
• Demo Time
• Questions?

Slide 4

Slide 4 text

Monitoring Everything

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

@honest_update on Twitter

Slide 7

Slide 7 text

Quick Overview of Datadog
• Monitoring for modern applications.
• Time series storage of metrics and events.
• Trending, alerting and anomaly detection.
• Hundreds of integrations out of the box.

Slide 8

Slide 8 text

Monitoring 101: Categorization
More at: http://goo.gl/t1Rgcg

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

Monitoring 101: Focus on symptoms
More at: http://goo.gl/t1Rgcg

Slide 11

Slide 11 text

Recurse until you find root cause.
More at: http://goo.gl/t1Rgcg

Slide 12

Slide 12 text

Container Monitoring Challenges

Slide 13

Slide 13 text

https://www.datadoghq.com/docker-adoption/

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

Operational Complexity
• Average containers per host: N (N=4, 10/2015)
• N-times as many “hosts” to manage
• Affects everything

Slide 17

Slide 17 text

Operational Complexity: Scale
100 instances
400 containers

Slide 18

Slide 18 text

Operational Complexity: Scale
160 metrics per host
640 metrics per host

Slide 19

Slide 19 text

Operational Complexity: Scale
100 instances
64,000 metrics
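
The scale figures on the preceding slides can be reproduced with a few lines of arithmetic. This is a minimal sketch, not something shown in the talk: it assumes that each of the N=4 containers on a host contributes roughly as many metrics as the 160-metric host baseline, which is one reading that yields the 640 and 64,000 figures. The function and variable names are illustrative.

```python
# Figures from the slides; the multiplication rule itself is an assumption.
INSTANCES = 100
CONTAINERS_PER_HOST = 4        # N=4 as of 10/2015
BASE_METRICS_PER_HOST = 160


def scale(instances, containers_per_host, base_metrics):
    """Return (total containers, metrics per host, total metrics)."""
    containers = instances * containers_per_host
    metrics_per_host = base_metrics * containers_per_host
    total_metrics = instances * metrics_per_host
    return containers, metrics_per_host, total_metrics


containers, per_host, total = scale(
    INSTANCES, CONTAINERS_PER_HOST, BASE_METRICS_PER_HOST
)
print(f"{containers} containers, {per_host} metrics per host, {total} metrics")
```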

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

Host Centric vs Service Centric

Slide 22

Slide 22 text

Host Centric vs Service Centric

Slide 23

Slide 23 text

Query Based Monitoring

Slide 24

Slide 24 text

Query Based Monitoring
• Use tags, labels, etc. on your hosts and metrics.
• Pull in existing labels from your infrastructure (Region, Docker Images, K8S Tags, ...)
By using tags, auto-adapt!
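
To illustrate the "auto-adapt" point, here is a minimal sketch of a query expressed over tags rather than host names. It is not Datadog's query language; the `Host` class, the `select` helper, and the tag values are all hypothetical. The idea is only that a tag query keeps matching as hosts come and go, with no list of host names to maintain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    tags: frozenset


def select(hosts, query):
    """Return the names of hosts that carry every tag in the query."""
    return sorted(h.name for h in hosts if query <= h.tags)


# Hypothetical fleet; tags could come from the region, Docker image, or K8S labels.
fleet = [
    Host("i-01", frozenset({"region:us-east", "image:nginx", "role:web"})),
    Host("i-02", frozenset({"region:us-west", "image:nginx", "role:web"})),
    Host("i-03", frozenset({"region:us-east", "image:redis", "role:cache"})),
]

query = frozenset({"role:web", "region:us-east"})
before = select(fleet, query)

# A new host comes up with the same tags; the query is unchanged.
fleet.append(Host("i-04", frozenset({"region:us-east", "image:nginx", "role:web"})))
after = select(fleet, query)

print(before, after)
```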

Slide 25

Slide 25 text

Where is my application running?
What’s the total throughput of App X?
What’s its response time per tag (pod, version, DC)?
What’s the distribution of 5xx from Nginx per pod?
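
Questions like these become simple group-by operations once every sample carries tags. The sketch below is illustrative only: the metric names (`app_x.requests`, `nginx.5xx`), the sample values, and the helper functions are hypothetical and do not come from the talk or from any real API.

```python
from collections import defaultdict

# Hypothetical samples: (metric name, value, tags).
SAMPLES = [
    ("app_x.requests", 120, {"pod": "web-1", "version": "1.2", "dc": "sf", "host": "i-01"}),
    ("app_x.requests", 80, {"pod": "web-2", "version": "1.2", "dc": "sf", "host": "i-02"}),
    ("app_x.requests", 50, {"pod": "web-3", "version": "1.3", "dc": "ny", "host": "i-03"}),
    ("nginx.5xx", 3, {"pod": "web-1", "version": "1.2", "dc": "sf", "host": "i-01"}),
    ("nginx.5xx", 0, {"pod": "web-2", "version": "1.2", "dc": "sf", "host": "i-02"}),
    ("nginx.5xx", 7, {"pod": "web-3", "version": "1.3", "dc": "ny", "host": "i-03"}),
]


def sum_by(samples, metric, tag_key):
    """Sum a metric's values, grouped by the value of one tag."""
    totals = defaultdict(int)
    for name, value, tags in samples:
        if name == metric:
            totals[tags[tag_key]] += value
    return dict(totals)


def where_running(samples, metric):
    """List the hosts currently reporting a metric."""
    return sorted({tags["host"] for name, _, tags in samples if name == metric})


hosts = where_running(SAMPLES, "app_x.requests")
total_throughput = sum(sum_by(SAMPLES, "app_x.requests", "pod").values())
by_version = sum_by(SAMPLES, "app_x.requests", "version")
errors_per_pod = sum_by(SAMPLES, "nginx.5xx", "pod")

print(hosts, total_throughput, by_version, errors_per_pod)
```

The same `sum_by` call answers the per-pod, per-version, and per-DC variants by changing only the tag key.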

Slide 26

Slide 26 text

Auto Discovery

Slide 27

Slide 27 text

[Diagram: auto discovery architecture]
Components: Monitoring Agent, Docker API, Kubelet API, Config Backend
Containers: A = Application Container, O = Off-The-Shelf Application (Redis, PostgreSQL, …)
Labels: Containers List, Metadata; Additional Metadata (Pod names, RC, …); Integration Configurations; Host Level Metrics
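
The flow in this diagram can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the agent's actual implementation: the container list stands in for what an agent might obtain from the Docker API, `pod_metadata` stands in for additional metadata from the Kubelet API, and `CONFIG_TEMPLATES` stands in for a config backend. All names, fields, and values are hypothetical, and no real API is called.

```python
# Hypothetical config backend: image name -> integration configuration.
CONFIG_TEMPLATES = {
    "redis": {"check": "redis", "port": 6379},
    "postgres": {"check": "postgres", "port": 5432},
}


def image_name(image):
    """Strip the registry path and tag: 'example/redis:3.0' -> 'redis'."""
    return image.rsplit("/", 1)[-1].split(":", 1)[0]


def discover(containers, pod_metadata, templates):
    """Build one check per container whose image has a known template."""
    checks = []
    for c in containers:
        template = templates.get(image_name(c["image"]))
        if template is None:
            continue  # e.g. an application container with no template
        meta = pod_metadata.get(c["id"], {})
        checks.append({
            "check": template["check"],
            "host": c["ip"],
            "port": template["port"],
            "tags": sorted(f"{k}:{v}" for k, v in meta.items()),
        })
    return checks


# Hypothetical inputs standing in for the containers list and pod metadata.
containers = [
    {"id": "c1", "image": "redis:3.0", "ip": "10.0.0.5"},
    {"id": "c2", "image": "example/my-app:1.4", "ip": "10.0.0.6"},
    {"id": "c3", "image": "postgres:9.5", "ip": "10.0.0.7"},
]
pod_metadata = {
    "c1": {"pod": "cache-1", "rc": "cache"},
    "c3": {"pod": "db-1", "rc": "db"},
}

checks = discover(containers, pod_metadata, CONFIG_TEMPLATES)
for check in checks:
    print(check)
```

Matching on image name means a newly scheduled off-the-shelf container gets monitored without any per-host configuration, which is the point of the slide.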

Slide 28

Slide 28 text

Some Pictures
Dashboards and Metrics
Alerts
Sharing

Slide 29

Slide 29 text

Demo time