The Telegraf Toolbelt (InfluxDays SF, 2019)

Telegraf is an agent for collecting, processing, aggregating, and writing metrics. With over 200 plugins, Telegraf can fetch metrics from a variety of sources, allowing you to build aggregations and write those metrics to InfluxDB, Prometheus, Kafka, and more.

In this talk, we will look at some of the lesser-known but awesome plugins that are often overlooked, as well as how to use Telegraf to monitor cloud native systems.

David McKay

October 02, 2019

Transcript

  1. The Telegraf Toolbelt: It can do that, really?

  2. David McKay

    Developer Advocate at InfluxData; @rawkode; Scottish; Has 9 Pets; Esoteric Programming Languages; ☸ Kubernetes Release Team
  3. The Plugin Ecosystem: Establishing a Baseline

  4. © 2019 InfluxData. All rights reserved. 4 Over 200 Plugins

    ➔ 169 Inputs ➔ 35 Outputs ➔ 15 Processors ➔ 14 Parsers ➔ 9 Serializers ➔ 8 Aggregators
  5. V1.10 (May)

    ➔ Inputs ◆ Google Cloud PubSub ◆ Kinesis Consumer ◆ Kube Inventory ◆ Neptune Apex ◆ Nginx Upstream Checks ◆ Multifile ◆ Stackdriver ➔ Outputs ◆ Google Cloud PubSub ➔ Serializers ◆ Nowmetric ◆ Carbon2
  6. V1.11 (June)

    ➔ Inputs ◆ bind9 ◆ Cisco GNMI ◆ Cisco MDT ◆ ECS & Fargate ◆ GitHub ◆ OpenWeatherMap ◆ PowerDNS ➔ Outputs ◆ Health ◆ Syslog ➔ Serializers ◆ Wavefront ➔ Aggregators ◆ Final
  7. V1.12 (September)

    ➔ Inputs ◆ apcupsd ◆ Docker Logs ◆ Fireboard ◆ Logstash ◆ MarkLogic ◆ OpenNTPD ◆ uWSGI ➔ Outputs ◆ Exec ➔ Parsers ◆ Form ➔ Processors ◆ Date ◆ Pivot ◆ Unpivot ◆ Tag Limit
  8. © 2019 InfluxData. All rights reserved. 8

  9. There are 3249 telegraf.conf files on GitHub

  10. © 2019 InfluxData. All rights reserved. 10

  11. I could only get 1000 telegraf.conf files from GitHub

  12. I used a sample of 1000 telegraf.conf files from GitHub

  13. Interval

    73% use 10s (default) ➔ 5.6% use 1s ➔ 4% use 5s ➔ 2% use 1m ➔ 1% use 30s
  14. Round Interval: 90% True

  15. Jitter: 90% None

  16. Omit Hostname: 90% False

  17. Output Plugins

    ➔ 71% 1 Output ➔ 5% 2 Outputs ➔ 2% 0 Outputs ➔ 0.6% 3 Outputs
  18. Output Plugins

    ➔ 72% InfluxDB ➔ 5% File ➔ 2% Prometheus Client ➔ 0.9% Graphite ➔ 0.6% Kafka
  19. Input Plugins

    ➔ 17% 1 Input ➔ 12% 9 Inputs ➔ 10% 8 Inputs ➔ 5% 10 Inputs ➔ 5% 11 Inputs ➔ 5% 6 Inputs ➔ 5% 7 Inputs ➔ 1 56 Inputs ?!?!
  20. Input Plugins

    ➔ 58% CPU ➔ 53% Mem ➔ 52% Disk ➔ 51% System ➔ 47% DiskIO ➔ 47% Swap ➔ 40% Process ➔ 31% Kernel ➔ 28% Docker ➔ 23% Net
  21. 12% Invalid

  22. Multiple Outputs (5.6%): Dual Write to InfluxDB 1 and 2

  23. Multiple Outputs

    [[outputs.influxdb]]
    urls = ["http://influxdb:8086"]

    [[outputs.influxdb_v2]]
    urls = ["http://influxdb2:9999"]
  24. Remote Configuration (v1.10): Fetching Configuration at Runtime

  25. Remote Configuration

    telegraf --config <http_uri>
  26. Remote Configuration

    telegraf --config https://raw.githubusercontent.com/influxdata/telegraf/master/etc/telegraf.conf
  27. Remote Configuration

    SOME_VAR=abc123 telegraf --config <http_uri>
  28. Remote Configuration

    [agent]
    interval = "${INTERVAL}"

    [[outputs.influxdb_v2]]
    token = "${TOKEN}"
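The `${INTERVAL}` and `${TOKEN}` placeholders on this slide are filled in from the environment of the Telegraf process. As a rough illustration of the idea only (this is not Telegraf's implementation, and the `render_config` helper and the sample values are hypothetical), the same substitution can be sketched with Python's standard library:

```python
from string import Template

# Config text with the same placeholders as the slide.
CONFIG = '''
[agent]
  interval = "${INTERVAL}"

[[outputs.influxdb_v2]]
  token = "${TOKEN}"
'''


def render_config(text: str, env: dict) -> str:
    """Replace ${NAME} placeholders with values from env.

    Hypothetical helper for illustration; raises KeyError if a
    referenced variable is missing from env.
    """
    return Template(text).substitute(env)


# Sample values are made up; in practice they would come from os.environ.
rendered = render_config(CONFIG, {"INTERVAL": "10s", "TOKEN": "abc123"})
print(rendered)
```

Keeping secrets such as the token out of the remotely hosted file is what makes it reasonable to serve one shared config to many agents.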
  29. Output Resiliency (v0.10, 2015!): Handling failures without losing metrics

  30. Output Resiliency

    Use metric_buffer_limit to handle downtime of your outputs
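To make the trade-off behind a buffer limit concrete, here is a small sketch of a bounded buffer that discards the oldest metric once it is full. It illustrates the concept only and is not Telegraf's code; the `MetricBuffer` class, its `dropped` counter, and the sample metrics are hypothetical.

```python
from collections import deque


class MetricBuffer:
    """Bounded buffer that discards the oldest metric when full (illustrative)."""

    def __init__(self, limit: int):
        self._queue = deque(maxlen=limit)
        self.dropped = 0

    def add(self, metric: str) -> None:
        if len(self._queue) == self._queue.maxlen:
            self.dropped += 1  # the deque evicts the oldest item on append
        self._queue.append(metric)

    def flush(self, batch_size: int) -> list:
        """Remove and return up to batch_size metrics, oldest first."""
        batch = []
        while self._queue and len(batch) < batch_size:
            batch.append(self._queue.popleft())
        return batch


buf = MetricBuffer(limit=3)
for i in range(5):  # the output is "down", so nothing is flushed in between
    buf.add(f"cpu value={i}")

batch = buf.flush(batch_size=2)  # the output is back: write the oldest survivors
```

A larger limit rides out a longer outage at the cost of more memory; once the limit is reached, every new metric pushes out an old one.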
  31. [Diagram: InfluxDB, Telegraf, Application, Software Dependencies, Bare Metal / VMs, Network, Buffer, Batch]
  32. [Diagram: Telegraf, Application, Software Dependencies, Bare Metal / VMs, Network, Buffer, Batch]
  33. [Diagram: InfluxDB, Telegraf, Application, Software Dependencies, Bare Metal / VMs, Network, Buffer, Batch]
  34. Output Resiliency

    influxdb_listener http_listener v1.9 / 0.01% / It was me!
  35. Output Resiliency: influxdb_listener

    Allows Telegraf to serve as a proxy for the /write endpoint of the InfluxDB HTTP API.
  36. influxdb_listener

    [[inputs.influxdb_listener]]
    service_address = ":8086"
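Because the listener accepts the InfluxDB 1.x `/write` API, an application can point its existing writes at Telegraf instead of at InfluxDB. The following is a minimal sketch using only the standard library; the host `localhost`, the `db` value, the sample metric, and the `build_write_request` helper are assumptions for illustration, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_write_request(base_url: str, db: str, lines: list) -> Request:
    """Build a POST to the InfluxDB 1.x /write endpoint with a line protocol body."""
    url = f"{base_url}/write?{urlencode({'db': db})}"
    body = "\n".join(lines).encode("utf-8")
    return Request(url, data=body, method="POST")


# Hypothetical target: a Telegraf influxdb_listener on port 8086.
req = build_write_request(
    "http://localhost:8086",
    "example",
    ["web-metrics,host=web01 response_time=0.120"],
)
# urllib.request.urlopen(req) would send it to a running listener.
```

The application does not need to know whether Telegraf or InfluxDB is on the other end, which is what lets Telegraf's buffering sit in between.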

  37. © 2019 InfluxData. All rights reserved. 37

  38. v1 Client Libraries

  39. Output Resiliency: http_listener_v2

    Allows Telegraf to accept metrics over HTTP in any supported format
  40. http_listener_v2

    [[inputs.http_listener_v2]]
    service_address = ":8080"
    data_format = "json"

  41. JSON

    [[inputs.http_listener_v2]]
    data_format = "json"
    json_name_key = "name"
    tag_keys = ["go_version"]

    Docs
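As a rough picture of what `json_name_key` and `tag_keys` do to an incoming document, the sketch below splits a flat JSON object into a metric name, tags, and numeric fields. It is a simplified illustration under the assumption that only numeric values become fields; it is not Telegraf's parser, and the `json_to_metric` helper and the sample payload are hypothetical.

```python
import json


def json_to_metric(payload: str, name_key: str, tag_keys: list) -> dict:
    """Split a flat JSON object into name, tags, and numeric fields.

    Simplified illustration: nested objects and arrays are not handled.
    """
    doc = json.loads(payload)
    metric = {"name": doc.get(name_key, ""), "tags": {}, "fields": {}}
    for key, value in doc.items():
        if key == name_key:
            continue
        if key in tag_keys:
            metric["tags"][key] = str(value)
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            metric["fields"][key] = float(value)
        # any other strings and booleans are skipped in this sketch
    return metric


# Made-up payload of the kind an application might POST to the listener.
payload = json.dumps(
    {"name": "app", "go_version": "1.13", "requests": 42, "status": "ok"}
)
metric = json_to_metric(payload, name_key="name", tag_keys=["go_version"])
```

Here `go_version` ends up as a tag, `requests` as a field, and `status` is left out because it is a string that was not listed as a tag.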
  42. Cloud Native Telegraf: Bring Your Own Telegraf

  43. Plugins come at a cost

  44. ➜ docker image ls

    byot-sample   19.3MB
    telegraf      254MB

  45. Bring Your Own Telegraf

    FROM rawkode/telegraf:byo AS build

    FROM alpine:3.7 AS telegraf
    COPY --from=build /etc/telegraf /etc/telegraf
    COPY --from=build /binary /bin/telegraf
    ENTRYPOINT [ "/bin/telegraf" ]
  46. Bring Your Own Telegraf

    Links: BYOT, BYOT Example
  47. [Diagram: InfluxDB, Telegraf, Application, Software Dependencies, Bare Metal / VMs, Network, Buffer, Batch]
  48. [Diagram: InfluxDB, Application, Software Dependencies, Bare Metal / VMs, Network, Telegraf ×4]
  49. Internal (v1.2 / 0.1%): Telegraf Exposes Metrics Too!

  50. Internal

    ➔ Keep an eye on buffer_size ➔ Make sure you alert on metrics_dropped
  51. Health Output (v1.11 / 0%): Sophisticated Health Checks

  52. @app.route("/health")
      def healthcheck():
          return "OK"

  53. @app.route("/health")
      def healthcheck():
          return "OK"

  54. Health Output

    [[outputs.health]]
    service_address = "http://:5559"
    namepass = ["web-metrics"]

    [[outputs.health.compares]]
    field = "response_time"
    lt = 0.300
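In contrast to a handler that always returns "OK", the `compares` block ties the health result to an observed metric. The sketch below mimics that check for a single field. It is an illustration rather than the plugin's code: the `health_status` helper is hypothetical, and treating a missing field as unhealthy is an assumption made here.

```python
def health_status(fields: dict, field: str, lt: float) -> int:
    """Return HTTP 200 if fields[field] is below lt, otherwise 503 (illustrative)."""
    value = fields.get(field)
    if value is None:
        return 503  # assumption: a missing field counts as unhealthy
    return 200 if value < lt else 503


# Same threshold as the slide: response_time must be under 0.300.
fast = health_status({"response_time": 0.120}, "response_time", lt=0.300)
slow = health_status({"response_time": 0.450}, "response_time", lt=0.300)
```

A load balancer or orchestrator polling this endpoint then reacts to how the service is actually performing, not merely to whether the process is up.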
  55. Controlling Ingress: Kafka In, Kafka Out

  56. [Diagram: InfluxDB, Application, Software Dependencies, Bare Metal / VMs, Network, Telegraf ×4, Kafka, Telegraf ×2]
  57. Demo

  58. We’ve covered less than 10 of the 200 plugins

  59. @rawkode Thank you