TICK_Monitoring

praveen
February 15, 2017

Transcript

  1. TICK Talk: Monitoring with InfluxData’s TICK stack. By @_praveenshukla, System Engineer @gojek
  2. Monitoring?
    • Collecting, processing, aggregating, and displaying real-time quantitative data about a system, such as query counts and types, error counts and types, processing times, and server lifetimes.
  3. Why monitoring?
    • Know when things go wrong
      ◦ To call a human to prevent a business-level issue
    • Be able to debug issues
    • See trends over time and use them to drive technical and business decisions
    • Feed into other systems (e.g. automation, security)
    • Alerting
    • Building dashboards
  4. First principle of monitoring
    • Collecting the right data
      ▪ Receive meaningful info
      ▪ Quickly investigate and get to the bottom of performance issues
  5. Metrics
    There are two important categories of metrics:
    • Work metrics
    • Resource metrics
  6. What good data looks like
    1. Well-understood
    2. Granular
    3. Tagged by scope
    4. Long-lived
  7. Monitoring architectures
    • Push: services push metrics to the monitoring system
    • Pull: the monitoring system pulls metrics from services
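The two architectures on this slide can be contrasted with a small in-memory sketch. It is an illustration only: `PushMonitor`, `PullMonitor`, `report`, `register`, and `scrape` are hypothetical names chosen here, not part of any tool mentioned in the talk, and a real system would do this over the network.

```python
from typing import Callable, Dict, List, Tuple

Sample = Tuple[str, float]  # (metric name, value)


class PushMonitor:
    """Push model: each service calls the monitor when it has data."""

    def __init__(self) -> None:
        self.received: List[Sample] = []

    def report(self, name: str, value: float) -> None:
        # The service decides when to send; the monitor only listens.
        self.received.append((name, value))


class PullMonitor:
    """Pull model: the monitor knows its targets and reads them itself."""

    def __init__(self) -> None:
        self.targets: Dict[str, Callable[[], float]] = {}
        self.received: List[Sample] = []

    def register(self, name: str, read: Callable[[], float]) -> None:
        # The monitor must discover or be told about every target.
        self.targets[name] = read

    def scrape(self) -> None:
        # The monitor decides when to collect, one pass over all targets.
        for name, read in self.targets.items():
            self.received.append((name, read()))
```

In the push sketch the service needs to know where the monitor is; in the pull sketch the monitor needs to know where every service is, which is why discovery appears among the trade-offs.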
  8. Pull-based architecture (diagram labels: Monitoring, Application Server)
  9. • Discovery
    • Scalability
    • Security
    • Operational complexity
    • Flexibility
    • Other
  10. TICK monitoring stack: the platform for time-series data
  11. Overview of the TICK stack
    • Quickly describe what time series data is
    • Describe the InfluxDB data model
    • Dive into the internal components
  12. None
  13. None
  14. Time series data is...
    A time series is a sequence of data points, typically consisting of successive measurements made from the same source over a time interval.
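As a minimal illustration of that definition, a time series can be modeled as a list of timestamped values from one source. The `Point` type, the `window_mean` helper, and the sample readings below are invented for this sketch.

```python
from typing import List, NamedTuple, Optional


class Point(NamedTuple):
    timestamp: int  # Unix time in seconds (an assumption for this sketch)
    value: float


def window_mean(series: List[Point], start: int, end: int) -> Optional[float]:
    """Mean of the values with start <= timestamp < end, or None if empty."""
    values = [p.value for p in series if start <= p.timestamp < end]
    return sum(values) / len(values) if values else None


# Successive CPU readings from the same host, ten seconds apart.
cpu_usage = [
    Point(1445555000, 20.0),
    Point(1445555010, 30.0),
    Point(1445555020, 40.0),
]
```

Most questions asked of such data are windowed like this one: a value is interesting relative to a time range, not on its own.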
  15. None
  16. Time series database is...
    A database where we manage and store time series data.
  17. Why can’t I just use any traditional database?
    • Yes, you can.
    • But you will end up building a time-series database rather than a solution to your monitoring problem.
  18. None
  19. InfluxDB basics
    • How do we represent points textually?
    • Using the line protocol
    • Measurement, tag set, field set, timestamp
      ◦ cpu,host=H1 values=20 1445555009
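The layout of a line-protocol point (measurement, optional tags, fields, optional timestamp) can be sketched with a small formatter. This is a simplified illustration, not InfluxDB client code: `to_line` and its helpers are names made up here, and it covers only the common escaping and value-type rules.

```python
from typing import Dict, Optional, Union

FieldValue = Union[bool, int, float, str]


def _escape(text: str, chars: str) -> str:
    """Backslash-escape each character in `chars` that occurs in `text`."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _format_field(value: FieldValue) -> str:
    if isinstance(value, bool):  # bool is a subclass of int, so test it first
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an `i` suffix
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'  # strings are double-quoted


def to_line(measurement: str, tags: Dict[str, str],
            fields: Dict[str, FieldValue],
            timestamp: Optional[int] = None) -> str:
    if not fields:
        raise ValueError("a point needs at least one field")
    head = _escape(measurement, ", ")
    for key in sorted(tags):  # sorted tag keys give a stable series key
        head += "," + _escape(key, ",= ") + "=" + _escape(tags[key], ",= ")
    body = ",".join(_escape(k, ",= ") + "=" + _format_field(v)
                    for k, v in fields.items())
    line = f"{head} {body}"
    if timestamp is not None:
        line += f" {timestamp}"
    return line
```

Calling `to_line("cpu", {"host": "H1"}, {"values": 20.0}, 1445555009)` produces a line of the same shape as the slide's example. The timestamp is interpreted at whatever precision the write request specifies, which is nanoseconds unless stated otherwise.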
  20. (Diagram labels: Measurement, Tags, FieldSet, Timestamp, Series; CPU USAGE; S1, S2, S3)
  21. None
  22. None
  23. Telegraf
    • Examples of Telegraf plugins
    • How to use Telegraf
    • Input and output plugin architectures
  24. Telegraf
    Telegraf is an agent written in Go for collecting metrics from local and remote sources.
    - Designed for a minimal footprint
    - Ingests metrics from
      - The host system
      - Common services
      - Third-party APIs
      - Custom endpoints
    - Writes to multiple outputs at the same time
  25. None
  26. What can Telegraf do?
    • Inputs
      ◦ Gather local system metrics
      ◦ Run status checks on processes and services
      ◦ Collect data from remote APIs over HTTP
      ◦ Test HTTP responsiveness
      ◦ Parse log files using patterns to collect metrics
      ◦ Run custom scripts at regular intervals
    • Outputs
      ◦ Convert metric formats
      ◦ Buffer metrics
      ◦ Reroute metrics
      ◦ Batch metrics
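The buffering and batching listed under outputs can be sketched as a bounded queue that is flushed in fixed-size chunks. This is a conceptual illustration and not Telegraf's implementation: `BatchingBuffer` and its parameters are hypothetical, loosely modeled on the idea behind the agent's batch-size and buffer-limit settings.

```python
from collections import deque
from typing import Callable, Deque, List


class BatchingBuffer:
    """Holds metrics in memory and sends them in batches (illustrative only)."""

    def __init__(self, batch_size: int, buffer_limit: int,
                 send: Callable[[List[str]], None]) -> None:
        self.batch_size = batch_size
        self.send = send
        # A bounded deque silently drops the oldest metric once it is full.
        self.pending: Deque[str] = deque(maxlen=buffer_limit)

    def add(self, metric: str) -> None:
        self.pending.append(metric)

    def flush(self) -> int:
        """Send pending metrics batch by batch; return how many were sent.

        On the first failed send the remaining metrics stay buffered,
        so a later flush can retry them.
        """
        sent = 0
        while self.pending:
            size = min(self.batch_size, len(self.pending))
            batch = [self.pending[i] for i in range(size)]
            try:
                self.send(batch)
            except OSError:
                break
            for _ in batch:
                self.pending.popleft()
            sent += len(batch)
        return sent
```

Batching reduces the number of writes to the backend, while the bounded buffer lets the agent ride out a short outage without growing its memory use indefinitely.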
  27. Telegraf plugins
    • Input plugins
      ◦ Services that Telegraf can collect data from
      ◦ | cpu, mem, disk, diskio, docker |
    • Output plugins
      ◦ Services that Telegraf can write data to
      ◦ | InfluxDB, Graphite, Kafka, Datadog |
    • Service plugins
      ◦ Services that can push data to Telegraf
      ◦ | TCP, UDP, statsd, kafka_consumer |
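The input/output split can be pictured as two small interfaces and a loop that connects them. The sketch below is written in Python for illustration only; Telegraf itself is written in Go, and `Input`, `Output`, `StaticCpu`, `MemoryOutput`, and `run_once` are hypothetical names, not Telegraf's API.

```python
from typing import Dict, List, Protocol

Metric = Dict[str, object]


class Input(Protocol):
    def gather(self) -> List[Metric]: ...


class Output(Protocol):
    def write(self, metrics: List[Metric]) -> None: ...


class StaticCpu:
    """Stand-in input plugin that returns one fixed CPU reading."""

    def gather(self) -> List[Metric]:
        return [{"name": "cpu", "usage": 20.0}]


class MemoryOutput:
    """Stand-in output plugin that keeps what it is given in a list."""

    def __init__(self) -> None:
        self.written: List[Metric] = []

    def write(self, metrics: List[Metric]) -> None:
        self.written.extend(metrics)


def run_once(inputs: List[Input], outputs: List[Output]) -> int:
    """One collection cycle: gather from every input, fan out to every output."""
    metrics = [m for plugin in inputs for m in plugin.gather()]
    for out in outputs:
        out.write(list(metrics))  # each output receives its own copy of the list
    return len(metrics)
```

Because inputs and outputs only meet through the list of metrics, any input can be combined with any output, which is how one agent can write the same data to several backends at once.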
  28. How does Telegraf work?
  29. None
  30. Demo

  31. Questions?