Monitoring nginx

Presented at nginx.conf 2014

Alexis Lê-Quôc

October 22, 2014

Transcript

  1. Monitoring nginx Alexis Lê-Quôc, Datadog @alq

  2. Agenda
    • Dramatis personae
    • Observations
    • Monitoring 1 nginx (plus) with logs
    • Monitoring 1 nginx (plus) with metrics
    • Monitoring N nginx effectively
  3. @alq CTO at Datadog

  4. Datadog == monitoring
    • Monitoring as a service
    • Works really well with large, dynamic environments (e.g. clouds)
    • Aggregate performance metrics
    • Correlate nginx performance with the rest of your infrastructure
  5. None
  6. None
  7. Observations From the field

  8. Some stats
    • Across all monitored servers
    • nginx ~10%
    • Apache ~5%
    • CPU and CPU/$ is the dominant resource
  9. % of instances per core count [bar chart; x-axis: core count (1, 2, 4, 8, 12, 16, 24, 32); y-axis: % of instances (0% to 40%); bar labels as extracted: 10%, 1%, 3%, 10%, 30%, 7%, 39%, 10%]
  10. % of instances per type (AWS only) [bar chart; x-axis: EC2 type (c3.l, c3.2xl, c1.xl, c3.8xl, m3.l, c3.xl, m3.m, cc2.8xl, t2.m, c3.4xl, rest); y-axis: % of instances (0% to 30%); bar labels as extracted: 8.6%, 3.1%, 4.4%, 4.5%, 4.7%, 5%, 5.3%, 7.6%, 13%, 14%, 30%]
  11. Monitoring nginx
    1. Monitoring with logs
    2. Monitoring with status
    3. Monitoring with statsd
  12. Monitoring with logs
    • Canonical example of log indexers
    • Your choice of:
      • logstash
      • splunk
      • logentries, sumologic, loggly, etc.
    Pipeline: nginx → log forwarder → indexer → UI
  13. Monitoring with logs (nginx → log forwarder → indexer → UI)
    Strengths: forensics & anomalies; content-driven analysis
    Weaknesses: low signal-to-noise ratio; “black box”
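As a hedged illustration of content-driven log analysis, the sketch below parses access-log lines and summarizes them. It assumes a hypothetical `log_format`: the standard "combined" format with `$request_time` appended as the last field. The sample lines and the names `parse_line` and `summarize` are invented for the example.

```python
import re

# Assumption: "combined" log format with $request_time appended at the end.
LINE_RE = re.compile(
    r'(?P<addr>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)" (?P<request_time>\d+\.\d+)$'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["bytes"] = int(record["bytes"])
    record["request_time"] = float(record["request_time"])
    return record

def summarize(lines):
    """Return (request count, 5xx count, mean request time in seconds)."""
    records = [r for r in map(parse_line, lines) if r is not None]
    if not records:
        return (0, 0, 0.0)
    errors = sum(1 for r in records if r["status"] >= 500)
    mean_latency = sum(r["request_time"] for r in records) / len(records)
    return (len(records), errors, mean_latency)

# Invented sample lines in the assumed format.
SAMPLE_LINES = [
    '203.0.113.7 - - [22/Oct/2014:10:00:00 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.38.0" 0.004',
    '203.0.113.8 - - [22/Oct/2014:10:00:01 +0000] "GET /api HTTP/1.1" 502 172 "-" "curl/7.38.0" 0.250',
]

if __name__ == "__main__":
    print(summarize(SAMPLE_LINES))
```

A real pipeline would leave this work to the log forwarder and indexer; the point is only that latency and status codes are recoverable from log content.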
  14. Monitoring with metrics
    • open-source: ngx_http_stub_status_module
      • bare-bone metrics
      • human-readable text presentation
    • plus: ngx_http_status_module
      • a lot more metrics for each function
      • json format
    • Your choice of…
      • Datadog, Nagios, Zabbix, etc. for open-source
      • Datadog for nginx plus
    Pipeline: nginx → status collector → aggregator → UI/alerts
  15. Monitoring with metrics (nginx → status collector → aggregator → UI/alerts)
    Strengths: lightweight & real-time; “white box”
    Weaknesses: no insight into content
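To make the "human-readable text presentation" of ngx_http_stub_status_module concrete, here is a minimal sketch of what a collector might do with that page. The sample text follows the layout shown in the module's documentation; the function name `parse_stub_status` and the output key names are assumptions made for this example.

```python
import re

# Sample stub_status page, in the layout shown in the nginx documentation.
SAMPLE = "\n".join([
    "Active connections: 291",
    "server accepts handled requests",
    " 16630948 16630948 31070465",
    "Reading: 6 Writing: 179 Waiting: 106",
    "",
])

def parse_stub_status(text):
    """Turn the stub_status text page into a dict of integers."""
    active = re.search(r"Active connections:\s+(\d+)", text)
    counters = re.search(
        r"^[ \t]*(\d+)[ \t]+(\d+)[ \t]+(\d+)[ \t]*$", text, re.MULTILINE
    )
    states = re.search(
        r"Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)", text
    )
    if not (active and counters and states):
        raise ValueError("unrecognized stub_status output")
    return {
        "active": int(active.group(1)),       # gauge
        "accepts": int(counters.group(1)),    # counter
        "handled": int(counters.group(2)),    # counter
        "requests": int(counters.group(3)),   # counter
        "reading": int(states.group(1)),      # gauge
        "writing": int(states.group(2)),      # gauge
        "waiting": int(states.group(3)),      # gauge
    }

if __name__ == "__main__":
    print(parse_stub_status(SAMPLE))
```

In practice the text would be fetched over HTTP from whichever location the `stub_status` directive is configured on; that URL is site-specific and not shown here.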
  16. Simple metrics taxonomy
    1. What it measures
      • Work or resource
      • Focus on work because work == value
      • Resource analysis is useful to understand performance
      • Use Brendan Gregg’s USE method
        • Utilization (% over time)
        • Saturation (queue length)
        • Errors (count over time)
    2. Type
      • Gauge: sample
      • Counter: accumulated sample, needs to be derived (turned into a rate) to be meaningful
    http://www.brendangregg.com/usemethod.html
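The gauge/counter distinction can be sketched in a few lines: a gauge is usable as sampled, while a counter only becomes meaningful as a rate between two samples. The function name `rate` and the choice to return `None` on a counter reset are assumptions of this sketch, not something prescribed by the talk.

```python
def rate(prev_value, prev_ts, curr_value, curr_ts):
    """Per-second rate between two counter samples.

    Returns None when no rate can be computed: non-increasing timestamps,
    or a counter that went backwards (for example after a restart).
    """
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        return None
    delta = curr_value - prev_value
    if delta < 0:
        return None
    return delta / elapsed

if __name__ == "__main__":
    # Two samples of a "requests" counter taken 10 seconds apart.
    print(rate(31070465, 0.0, 31071065, 10.0))
```

With the two samples above, 600 requests over 10 seconds yields 60 requests per second.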
  17. Open-source metrics
      Class                 Type     Resource/Work  Notes
      Current connections   Gauge    Resource       reading, writing, idle
      Accepted connections  Counter  Resource
      Handled connections   Counter  Resource       <= accepted if resource limit
      Requests              Counter  Work           True purpose of the server
    • Latency must be measured using logs or statsd.
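Because handled connections fall below accepted connections only when a resource limit is hit, the gap between the two counters is a saturation signal. The sketch below derives it from two samples; the dict keys mirror the stub_status counter names, while `connection_health` and the sample numbers are hypothetical.

```python
def connection_health(prev, curr):
    """Compare two samples of the accepts/handled/requests counters.

    Returns the number of connections dropped in the interval and the
    average number of requests served per handled connection.
    """
    accepted = curr["accepts"] - prev["accepts"]
    handled = curr["handled"] - prev["handled"]
    requests = curr["requests"] - prev["requests"]
    return {
        "dropped": accepted - handled,
        "requests_per_connection": requests / handled if handled else 0.0,
    }

if __name__ == "__main__":
    # Invented samples: 100 connections accepted, 90 handled, 450 requests.
    before = {"accepts": 1000, "handled": 1000, "requests": 3000}
    after = {"accepts": 1100, "handled": 1090, "requests": 3450}
    print(connection_health(before, after))
```

A non-zero `dropped` value is worth alerting on, since it means nginx turned connections away during the interval.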
  18. Key “plus” metrics
      Class                      Type     Resource/Work  Notes
      5xx Errors                 Counter  Work           without log analysis
      5xx/sum(Nxx)               Gauge    Work           error rate %
      idle/dropped connections   Gauge    Resource       saturation
      active/total connections   Gauge    Resource       upstream capacity
      Requests                   Counter  Work           true purpose of the server
    • Latency must be measured using logs or statsd.
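The `5xx/sum(Nxx)` error rate can be sketched as follows. The input dicts hold per-class response counters keyed `1xx` through `5xx`, which is how the nginx plus status JSON is understood to report them; treat that shape, the function name `error_rate_percent`, and the sample values as assumptions.

```python
RESPONSE_CLASSES = ("1xx", "2xx", "3xx", "4xx", "5xx")

def error_rate_percent(prev, curr):
    """Percentage of 5xx responses among all responses between two samples.

    Returns None if any counter went backwards (for example after a
    restart) and 0.0 if no responses were served in the interval.
    """
    deltas = {c: curr[c] - prev[c] for c in RESPONSE_CLASSES}
    if any(d < 0 for d in deltas.values()):
        return None
    total = sum(deltas.values())
    if total == 0:
        return 0.0
    return 100.0 * deltas["5xx"] / total

if __name__ == "__main__":
    # Invented counter samples: 1000 responses in the interval, 30 of them 5xx.
    before = {"1xx": 0, "2xx": 9000, "3xx": 500, "4xx": 400, "5xx": 100}
    after = {"1xx": 0, "2xx": 9900, "3xx": 540, "4xx": 430, "5xx": 130}
    print(error_rate_percent(before, after))
```

Computing the ratio on deltas rather than on the raw counters keeps the gauge responsive: a ratio of lifetime totals would barely move during a short burst of errors.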
  19. Monitoring with statsd (nginx → statsd → UI/alerts)
    Strengths: lightweight, real-time, standard; custom metrics, content-aware
    Weaknesses: not comprehensive
    https://github.com/zebrafishlabs/nginx-statsd
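Part of what makes statsd lightweight is its wire format: one short text datagram per metric, sent over UDP with no reply expected. The sketch below shows that format from Python; it does not use or configure the nginx-statsd module linked above, and the metric names, helper names, and default host/port are assumptions for illustration.

```python
import socket

def format_metric(name, value, metric_type):
    """Build a statsd datagram body, e.g. 'nginx.requests:1|c'."""
    return f"{name}:{value}|{metric_type}"

def send_metric(name, value, metric_type, host="127.0.0.1", port=8125):
    """Send one metric over UDP; fire-and-forget, no response is read."""
    payload = format_metric(name, value, metric_type).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))

if __name__ == "__main__":
    # "c" is a counter increment, "ms" is a timer in milliseconds.
    send_metric("nginx.requests", 1, "c")
    send_metric("nginx.request_time", 250, "ms")
```

The timer type is what makes statsd a route to latency, which the status modules do not report.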
  20. Example

  21. Monitoring nginx
    1. Logs for content analysis (forensics, anomalies, marketing)
    2. Status for (white box) performance monitoring
    3. statsd for custom metrics
    No single method gives you everything you need.
  22. Monitoring a lot of nginx
    1. Requires aggregation
    2. It’s all about metadata (“pet-to-cattle” mindset)
    3. Correlation
  23. Aggregation
    • By default for log-based monitoring
    • Not by default for metric-based monitoring
  24. Metadata
    • Analyze by properties that are not the host identity
    • Find anomalies that are not obvious
    • Pet-to-cattle evolution: hosts don’t matter, services do
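Analyzing by properties other than host identity amounts to grouping per-host samples by a tag. A minimal sketch follows, with invented host names, tag keys (`role`, `az`), and values; the function name `aggregate_by` is likewise hypothetical.

```python
from collections import defaultdict

def aggregate_by(samples, tag_key):
    """Sum a per-host metric over every host sharing the same tag value."""
    totals = defaultdict(float)
    for sample in samples:
        group = sample["tags"].get(tag_key, "untagged")
        totals[group] += sample["value"]
    return dict(totals)

# Invented per-host request rates (requests per second) with metadata tags.
SAMPLES = [
    {"host": "web-1", "value": 120.0, "tags": {"role": "frontend", "az": "us-east-1a"}},
    {"host": "web-2", "value": 100.0, "tags": {"role": "frontend", "az": "us-east-1b"}},
    {"host": "api-1", "value": 80.0, "tags": {"role": "api", "az": "us-east-1a"}},
    {"host": "api-2", "value": 5.0, "tags": {"role": "api", "az": "us-east-1b"}},
]

if __name__ == "__main__":
    print(aggregate_by(SAMPLES, "role"))
    print(aggregate_by(SAMPLES, "az"))
```

Grouping the same samples by a different tag changes the question being asked without touching any host-specific configuration, which is the sense in which hosts stop mattering and services take over.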
  25. Correlation • nginx is only one piece of the infrastructure

  26. #plug www.datadog.com

  27. Thank you! Questions/Comments? @alq