INTERFACE by apidays 2023 - Data Collection Basics, Anais Dotis-Georgiou, InfluxData

INTERFACE by apidays 2023
APIs for a “Smart” economy. Embedding AI to deliver Smart APIs and turn into an exponential organization
June 28 & 29, 2023

Data Collection Basics
Anais Dotis-Georgiou, Lead Developer Advocate at InfluxData

------

Check out our conferences at https://www.apidays.global/

Do you want to sponsor or talk at one of our conferences?
https://apidays.typeform.com/to/ILJeAaV8

Learn more on APIscene, the global media made by the community for the community:
https://www.apiscene.io

Explore the API ecosystem with the API Landscape:
https://apilandscape.apiscene.io/

apidays

July 11, 2023

Transcript

  1. INFLUXDB UNIVERSITY
    Data Collection Basics
    Getting Started Training Series
    Anais Dotis-Georgiou, Developer Advocate, InfluxData
  2. Brought to you by InfluxDB University
    InfluxDB University offers free live and self-paced training on:
    • InfluxDB
    • Telegraf
    • Flux
    • Kapacitor
    • and more
    Scan to explore the course catalog: influxdbu.com
  3. Agenda
    • What is Telegraf
    • Plugin Ecosystem
    • Getting Started with Telegraf
    • Extending the Ecosystem
  4. Characteristics of the data
    • Time-stamped
    • Generated at regular (metric) and irregular (event) intervals
    • Huge volumes
    • Real time and time sensitive
  5. InfluxDB is 3 things
    • Time Series Engine: HIGH PERFORMANCE, for real-time data workloads
    • API & Toolset: POWERFUL, for real-time apps
    • Community & Ecosystem: MASSIVE, of cloud & open source developers
  6. Data Collection Options
    Agent-based Push (aka Telegraf)
    • 300+ Telegraf plugins
    • Regular cadence of releases
    • Why use it?
      ◦ No code
      ◦ Large community
      ◦ Lightweight but powerful
      ◦ Customizable
    Client Libraries
    • 12 libraries: Python, C#, Java, Go, JavaScript/Node.js, Ruby, PHP, et al.
    • Handle batching, chunking, setting the right headers, etc.
    • Why use them?
      ◦ Easy way to get started
      ◦ Needed when building custom applications
    Agentless Pull (aka Scrapers)
    • Prometheus scraper (OSS only)
    • Flux prometheus.from
    • Flux csv.from(url)
    • Why use them?
      ◦ Get data in quickly
      ◦ No agent download required on the monitored device
    Native/Ecosystem
    • Source system speaks line protocol
    • Examples: JMeter, NiFi, Vector, Fluentd
    • Influx CLI CSV import
    • Why use them?
      ◦ You know what you want to monitor and want quick and easy integration
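
The Native/Ecosystem option above depends on the source system speaking line protocol, and the client libraries generate it on your behalf. To give a feel for that text format, here is a minimal Python sketch that serializes a single point. The function name `to_line_protocol` and the sample measurement, tags, and fields are hypothetical, and the sketch leaves out what a real client library adds (batching, chunking, headers, retries).

```python
def _escape(value: str, chars: str) -> str:
    """Backslash-escape each character in `chars` that appears in `value`."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def _format_field(value) -> str:
    """Render a field value using line protocol type conventions."""
    if isinstance(value, bool):      # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"           # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    # Strings are double-quoted; escape backslashes first, then quotes.
    return '"' + _escape(str(value), '\\"') + '"'


def to_line_protocol(measurement, tags, fields, timestamp_ns=None) -> str:
    """Build: measurement[,tag=value...] field=value[,field=value...] [timestamp]"""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = _escape(measurement, ", ")
    for key in sorted(tags):         # sorting tag keys is a common recommendation
        head += f",{_escape(key, ',= ')}={_escape(str(tags[key]), ',= ')}"
    body = ",".join(
        f"{_escape(name, ',= ')}={_format_field(value)}"
        for name, value in fields.items()
    )
    line = f"{head} {body}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


# Hypothetical sample point.
print(to_line_protocol(
    "cpu",
    {"host": "server01", "region": "us west"},
    {"usage_idle": 97.5, "cores": 8},
    1688000000000000000,
))
```

Note how the space in the tag value is escaped and the integer field gets an `i` suffix; these are the details a client library normally takes care of.
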
  7. Telegraf provides the benefits of…
    • Low/No code
    • Robust scheduler
    • High-speed ingestion
    • Full-streaming support
    • Metric routing
    • Flexible parsing, formatting, serializing
    • Customizable and extensible (ExecD plugins, Starlark)
    Instead of…
    • Writing long data scraping scripts
    • Worrying about unreliable data collection
    • Having trouble scaling your data collection
    • Ending up with messy data
    • Having a lot of unnecessary data in your database
  8. Telegraf: Agent for Collecting Metrics & Events
    Plugin-driven server agent for collecting and reporting metrics
    • Written in Go
    • Single binary, no external dependencies
    • Minimal memory footprint
    • Optimized for writing to InfluxDB
    • Optimized for streaming data
    [Diagram: HTTP, Syslog, Kubernetes, Apache Kafka, AWS Kinesis, Azure Event Hubs, GCP PubSub → Telegraf (300+ plugins: collect, downsample, transform) → InfluxDB, purpose-built time series database]
  9. Core Telegraf functionality
    • Robust scheduler
    • Adjustments for clock drift
    • Adjustments for job scheduling issues that may occur
    • In-memory metric buffers
    • Metric tracking with flow back-pressure in plugins like Kafka
    • Full-streaming support
    • Metric routing: pass and drop filters on metric names and fields
    • Flexible parsing, formatting, serializing
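
The metric routing item above (pass and drop on names and fields) can be pictured with a short Python sketch. It approximates glob-style filtering along the lines of Telegraf's `namepass`, `namedrop`, `fieldpass`, and `fielddrop` options; it is not Telegraf's implementation, and the `route` function and the dictionary shape used for a metric are hypothetical.

```python
from fnmatch import fnmatchcase


def _matches(name, patterns):
    """True if `name` matches at least one glob pattern."""
    return any(fnmatchcase(name, pattern) for pattern in patterns)


def route(metric, namepass=(), namedrop=(), fieldpass=(), fielddrop=()):
    """Return the filtered metric, or None if the metric is dropped."""
    name, fields = metric["name"], metric["fields"]
    # Name filters decide whether the whole metric continues.
    if namepass and not _matches(name, namepass):
        return None
    if namedrop and _matches(name, namedrop):
        return None
    # Field filters keep or remove individual fields.
    kept = {
        key: value
        for key, value in fields.items()
        if (not fieldpass or _matches(key, fieldpass))
        and not _matches(key, fielddrop)
    }
    if not kept:                     # assumption: a metric with no fields left is dropped
        return None
    return {**metric, "fields": kept}


# Hypothetical metric for illustration.
cpu = {
    "name": "cpu",
    "tags": {"cpu": "cpu-total"},
    "fields": {"usage_idle": 97.5, "usage_user": 1.5, "time_guest": 0.0},
}
print(route(cpu, namepass=["cpu*"], fieldpass=["usage_*"]))
print(route(cpu, namedrop=["cpu"]))
```

In a real deployment these filters are set per plugin in the configuration file rather than in code, which is what lets one agent send different subsets of data to different outputs.
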
  10. Telegraf Plugin Ecosystem
    Input Plugins
    • collect metrics from system, services, or third-party APIs
    Output Plugins
    • write data to various destinations
    Processors
    • transform, decorate, and/or filter metrics
    Aggregators
    • create aggregate metrics (e.g. mean, min, max, quantiles, etc.)
  11. Telegraf Plugin Types
    • Input: 200+
    • Output: 50+
    • Processors: 25+
    • Aggregators: 5+
    Covering: Applications / Build and Deploy / Logging / Messaging / Networking / IoT / Systems
  12. Input Plugins
    activemq aerospike amqp_consumer apache apcupsd aurora azure_storage_queue bcache
    beanstalkd bind bond burrow cassandra ceph cgroup chrony cisco_telemetry_mdt clickhouse cloud_pubsub cloud_pubsub_push cloudwatch conntrack consul couchbase couchdb cpu dcos disk diskio disque dmcache dns_query docker docker_log dovecot ecs elasticsearch ethtool eventhub_consumer exec execd fail2ban fibaro file filecount filestat fireboard fluentd github gnmi graylog haproxy hddtemp http http_listener_v2 http_response httpjson icinga2 infiniband influxdb
  13. Input Plugins (2)
    influxdb_listener influxdb_v2_listener intel_rdt internal interrupts ipmi_sensor ipset
    iptables ipvs jenkins jolokia jolokia2 jti_openconfig_telemetry kafka_consumer kafka_consumer_legacy kapacitor kernel kernel_vmstat kibana kinesis_consumer kube_inventory kubernetes lanz leofs linux_sysctl_fs logparser logstash lustre2 mailchimp marklogic mcrouter mem memcached mesos minecraft modbus mongodb monit mqtt_consumer multifile mysql nats nats_consumer neptune_apex net net_response nginx nginx_plus nginx_plus_api nginx_sts nginx_upstream_check nginx_vts nsd nsq nsq_consumer nstat ntpq nvidia_smi opcua openldap
  14. Input Plugins (3)
    openntpd opensmtpd openweathermap passenger pf pgbouncer phpfpm
    ping postfix postgresql postgresql_extensible powerdns powerdns_recursor processes procstat prometheus proxmox puppetagent rabbitmq raindrops ras redfish redis rethinkdb riak salesforce sensors sflow smart snmp snmp_legacy snmp_trap socket_listener solr sqlserver stackdriver statsd suricata swap synproxy syslog sysstat system systemd_units tail tcp_listener teamspeak temp tengine tomcat trig twemproxy udp_listener unbound uwsgi varnish vsphere webhooks win_eventlog win_perf_counters win_services wireguard wireless x509_cert zfs zipkin zookeeper
  15. Processor Plugins
    clone converter date dedup defaults enum execd filepath
    ifname override parser pivot port_name printer regex rename reverse_dns s2geo starlark strings tag_limit template topk unpivot
  16. Output Plugins
    amon amqp application_insights azure_monitor cloud_pubsub cloudwatch cratedb datadog
    discard dynatrace elasticsearch exec execd file graphite graylog health http influxdb influxdb_v2 instrumental kafka kinesis librato logzio mqtt nats newrelic nsq opentsdb prometheus_client riemann riemann_legacy socket_writer stackdriver sumologic syslog timestream warp10 wavefront yandex_cloud_monitoring
  17. One Telegraf, Multiple Plugins
    [Diagram: Data Sources → Telegraf (Input → Process → Aggregate → Output; collect, downsample, transform) → Data Systems, including InfluxDB, purpose-built time series database]
    • Data Sources (Input): CPU, Mem, Disk, Docker, Kubernetes, /metrics, Kafka, MySQL, CloudWatch
    • Process: transform, decorate, filter
    • Aggregate: mean, min, max, count, variance, stddev
    • Data Systems (Output): InfluxDB, File, Kafka, CloudWatch
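
The aggregate stage listed above (mean, min, max, count, variance, stddev) boils down to summarizing each field over a window of metrics. The Python sketch below shows the idea for one window. The `aggregate` function, the `_mean`-style output names, and the use of sample variance are assumptions made for illustration, not a description of how Telegraf's aggregator plugins are written.

```python
from collections import defaultdict
from statistics import mean, stdev, variance


def aggregate(metrics):
    """Summarize every numeric field seen in one window of metrics."""
    values = defaultdict(list)
    for metric in metrics:
        for field, value in metric["fields"].items():
            values[field].append(value)

    summary = {}
    for field, window in values.items():
        summary[f"{field}_count"] = len(window)
        summary[f"{field}_min"] = min(window)
        summary[f"{field}_max"] = max(window)
        summary[f"{field}_mean"] = mean(window)
        if len(window) > 1:          # sample variance needs at least two values
            summary[f"{field}_variance"] = variance(window)
            summary[f"{field}_stddev"] = stdev(window)
    return summary


# Hypothetical window of three readings collected during one period.
window = [
    {"name": "cpu", "fields": {"usage_user": 2.0}},
    {"name": "cpu", "fields": {"usage_user": 4.0}},
    {"name": "cpu", "fields": {"usage_user": 6.0}},
]
print(aggregate(window))
```

Emitting one summary per period instead of every raw reading is what makes downsampling at the agent possible before data reaches the database.
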
  18. Can’t find the plugin you need?
    Telegraf is 100% open source with a strong community of contributors.
    It’s easy to write your own Telegraf plugin!
    1. Follow the contribution guide for Go
    2. Write your plugin in any language and run it externally with ExecD
  19. What’s in a configuration file?
    • The Telegraf agent needs a configuration file to operate properly.
    • It contains the setup for the agent, global tags, and the enabled outputs (plugins you do not need are disabled by commenting out or removing their lines)
  20. You can extend Telegraf by:
    • Working with the open-source community to submit upgrades or enhancements for existing plugins
    • Using ExecD to write a plugin in Go or a language of your choice
    • Using the Starlark processor, which calls a Starlark function for each matched metric, allowing for custom programmatic metric processing:
      ◦ Math operations
      ◦ String operations
      ◦ Renaming tags
      ◦ Logic operations
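
The four kinds of Starlark processing listed above can be sketched with a per-metric `apply` function. Starlark's syntax closely resembles Python, so the example below is written in plain Python so that it can run anywhere. The `Metric` class is a hypothetical stand-in for the object Telegraf hands to the script, and the tag and field names are invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Metric:
    """Hypothetical stand-in for the metric object passed to the processor."""
    name: str
    tags: dict = field(default_factory=dict)
    fields: dict = field(default_factory=dict)


def apply(metric):
    """Called once per matched metric; returns the metric, or None to drop it."""
    total = metric.fields.get("total", 0)

    # Logic operation: drop metrics that report no capacity at all.
    if total == 0:
        return None

    # Math operation: derive a percentage from two existing fields.
    metric.fields["used_percent"] = 100.0 * metric.fields.get("used", 0) / total

    # String operation: normalize the host tag to lower case.
    if "host" in metric.tags:
        metric.tags["host"] = metric.tags["host"].lower()

    # Renaming a tag: move the value from "dc" to "datacenter".
    if "dc" in metric.tags:
        metric.tags["datacenter"] = metric.tags.pop("dc")

    return metric


# Hypothetical metrics for illustration.
disk = Metric("disk", {"host": "Server01", "dc": "eu1"}, {"used": 50, "total": 200})
print(apply(disk))
print(apply(Metric("disk", {}, {"used": 0, "total": 0})))
```

Because the function runs inside the agent for every matched metric, it suits small, cheap transformations; heavier logic is usually a better fit for an external plugin.
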
  21. External plugins via ExecD
    • Plugin runs in its own process
    • Requires line protocol
    • Avoids the need for review by the Telegraf team
    • Supports the same API as an internal plugin
    • Can be used for non-Go plugins
    • Can be used for licensed software plugins
    • Can be used for any type of plugin (input, output, processor, aggregator)
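
As a rough picture of an external input plugin, the Python sketch below runs as its own process, waits for a newline on standard input, and answers each one with a metric in line protocol on standard output. It assumes the agent is configured to signal the process over STDIN at every collection interval; the measurement name, tag, and value are hypothetical.

```python
import io
import sys
import time


def collect(now_ns):
    """Return one metric as a line protocol string (hypothetical measurement)."""
    return f"example_plugin,source=sketch value=42i {now_ns}"


def run(signal_stream=sys.stdin, out=sys.stdout):
    """Emit one metric for every newline received on the signal stream."""
    for _ in signal_stream:
        out.write(collect(time.time_ns()) + "\n")
        out.flush()                  # flush so the agent sees the metric right away


# Demonstration with in-memory streams standing in for the agent:
# two newlines in, two metrics out.
demo_out = io.StringIO()
run(signal_stream=io.StringIO("\n\n"), out=demo_out)
print(demo_out.getvalue(), end="")
```

When used as an actual plugin, the script would call `run()` with its defaults and be launched by the `execd` input, with the signal option set to STDIN and the data format set to `influx`. The same pattern works in any language that can read standard input and write standard output.
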
  22. Customer Quotes
    “Telegraf is like a swiss army knife for connecting various MQTT sources and OPC UA sources.”
    - Fr. Ant. Niedermayr
    “Our next-generation pipeline takes advantage of Kafka and the Telegraf streaming service to create a more robust data topology. Essentially this allows us to explicitly implement the four R’s: routability, retention, resilience, and redundancy.”
    - Wayfair
  23. Get involved with Telegraf
    • Telegraf GitHub: github.com/influxdata/telegraf
    • Community Slack: influxdata.com/slack
      ◦ #telegraf
      ◦ #telegraf-dev
    • Community Website: community.influxdata.com
  24. References
    • Getting started with Telegraf: https://docs.influxdata.com/telegraf/latest/introduction/getting-started/
    • Telegraf plugins: https://docs.influxdata.com/telegraf/latest/plugins/
    • Telegraf GitHub Page: https://github.com/influxdata/telegraf
    • External Plugins Guide: https://github.com/influxdata/telegraf/blob/master/docs/EXTERNAL_PLUGINS.md
  25. Keep Learning with InfluxDB University
    Gain skills and earn shareable badges from InfluxDB University:
    • InfluxDB
    • Telegraf
    • Flux
    • Kapacitor
    • and more