Combining Logs, Metrics, and Traces for Unified Observability @ Infoshare 2020

Learn how Elasticsearch efficiently combines data in a single store and how Kibana is used to analyze it. Plus, see how recent developments help identify, troubleshoot, and resolve operational issues faster.

Sebastian Grodzicki

September 25, 2020

Transcript

  1. Combining Logs, Metrics, and Traces for Unified Observability
    Sebastian Grodzicki, Engineering Manager @ Elastic
    Infoshare 2020
  2. #ObservaBLT Observability = + +

  3. None
  4. Evolving Architectures ~↑ Monitoring Complexity
    Hardware & software trends are evolving in tandem
    Higher resource utilization increases monitoring complexity
    • Orchestration/Hypervisor
    • Dynamic/ephemeral jobs
    • You can no longer "point" to where that job lives
    Shift to cloud-native yields maintainable code, with costs
    • Traditional licensing models don't scale as well as your applications
    • Hurdles with autoscaling
  5. None
  6. Diagram: each layer of the stack (Applications; VMs/Containers; Other DBs, Services & Middleware; Orchestration; Infrastructure) is labeled with some combination of APM, Metrics, Logs, and Uptime data
  7. Status Quo: Siloed Collection of Tools
    Teams: Development Team; Ops: Log Monitoring; Ops: Infra Monitoring; Ops: Service Monitoring
    Uptime Tool: Uptime, Response Time
    Log Tool: Web Logs, App Logs, Database Logs, Container Logs
    APM Tool: Real User Monitoring, Txn Perf Monitoring, Distributed Tracing
    Metrics Tool: Container Metrics, Host Metrics, Database Metrics, Network Metrics, Storage Metrics
  8. Observability is a search use case

  9. Elastic Approach to Observability
    Uptime Data: Uptime, Response Time, Correctness, Certificate Validation
    Log Data: Web Logs, App Logs, Database Logs, Container Logs
    APM Data: Real User Monitoring, Txn Perf Monitoring, Distributed Tracing, Dependency Mapping
    Metrics Data: Host/Container Metrics, Database Metrics, Network Metrics, Storage Metrics
    Dev & Ops Teams
  10. Unified User Interface: Same UI for KPI summaries and root-cause analysis
  11. Unified Machine Learning: Correlate multiple data sources for more intelligent anomaly detection
  12. Unified Alerting: Trigger off any operational data to provide unified SLA monitoring
  13. Unified Licensing Model: Pricing aligned with business value
    PER AGENT $$$$ PER HOST $$$$ PER INGEST $$$$ PER MONITOR $$$$ PER ADD-ON $$$$
    • Intuitive: Single, unified pricing model. No add-ons.
    • Cloud native: No problem using with container workloads and serverless.
    • Future proof: You pay for capacity and are not locked into a specific use case.
  14. Elastic Stack for logs

  15. Logs: Logs are chronological records of events
    For each event, print out what happened.
    64.242.88.10 - - [07/Jan/2019:16:10:02 -0800] "GET /mailman/listinfo/hsdivision HTTP/1.1" 200 6291
    64.242.88.10 - - [07/Jan/2019:16:10:02 -0800] "POST /twiki/bin/view/TWiki/WikiSyntax HTTP/1.1" 404 7352
    64.242.88.10 - - [07/Jan/2019:16:10:02 -0800] "GET /twiki/bin/view/Main/DCCAndPostFix HTTP/1.1" 200 5253
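The access-log lines on slide 15 can be turned into structured events before they are indexed. The following is a minimal sketch in plain Python, not the parser that Filebeat modules or ingest node use; the names `LOG_PATTERN` and `parse_log_line` are made up for this illustration, and the pattern assumes the exact layout shown on the slide.

```python
import re
from datetime import datetime

# Layout assumed from the slide:
# client identity user [timestamp] "METHOD path PROTOCOL" status bytes
LOG_PATTERN = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+)'
)


def parse_log_line(line):
    """Return one access-log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    event = match.groupdict()
    # Convert the string fields that are really a timestamp and numbers.
    event["timestamp"] = datetime.strptime(
        event["timestamp"], "%d/%b/%Y:%H:%M:%S %z"
    )
    event["status"] = int(event["status"])
    event["size"] = int(event["size"])
    return event


sample = (
    '64.242.88.10 - - [07/Jan/2019:16:10:02 -0800] '
    '"POST /twiki/bin/view/TWiki/WikiSyntax HTTP/1.1" 404 7352'
)
event = parse_log_line(sample)
```

Once a line is a dict with typed fields, filtering on `status` or aggregating `size` becomes an ordinary query rather than a text search.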
  16. Ongoing investment in log ingest & long-term retention 2015 2016

    2018 2017 2019 ELK Stack is born Logstash and Kibana released, forming an OSS logging alternative 2011-12
  17. Ongoing investment in log ingest & long-term retention 2015 2016

    2018 Elastic welcomes Beats to the family, introducing light-weight data shippers 2017 2019 ELK Stack is born Logstash and Kibana released, forming an OSS logging alternative 2011-12 Filebeat: Lightweight log shipper
  18. Ongoing investment in log ingest & long-term retention 2015 2016

    2018 Filebeat: Lightweight log shipper Elastic welcomes Beats to the family, introducing light-weight data shippers 2017 2019 Simplified ingest architecture with Filebeat modules & ingest node ELK Stack is born Logstash and Kibana released, forming an OSS logging alternative 2011-12 Modules: Out-of-the-box log parsers
  19. Elastic welcomes Beats to the family, introducing light-weight data shippers

    Ongoing investment in log ingest & long-term retention 2015 2016 Hosted Logging in Elastic Cloud & ECE Introduction of ECE enabling log clusters with index curation, hot/warm templates 2018 Filebeat: Lightweight log shipper 2017 2019 Modules: Out-of-the-box log parsers Simplified ingest architecture with Filebeat modules & ingest node ELK Stack is born Logstash and Kibana released, forming an OSS logging alternative 2011-12
  20. Ongoing investment in log ingest & long-term retention 2015 2016

    Hosted Logging in Elastic Cloud & ECE Introduction of ECE enabling log clusters with index curation, hot/warm templates 2018 2017 Cold storage for logging: Frozen Indices & ILM Curated log-based troubleshooting, improved cold storage efficiency and index lifecycle management 2019 Modules: Out-of-the-box log parsers Simplified ingest architecture with Filebeat modules & ingest node Hot. Warm. Cold. Delete. ELK Stack is born Logstash and Kibana released, forming an OSS logging alternative 2011-12 Elastic welcomes Beats to the family, introducing light-weight data shippers Filebeat: Lightweight log shipper
  21. Ongoing investment in log ingest & long-term retention 2015 2016

    Hosted Logging in Elastic Cloud & ECE Introduction of ECE enabling log clusters with index curation, hot/warm templates 2018 2017 Cold storage for logging: Frozen Indices & ILM Curated log-based troubleshooting, improved cold storage efficiency and index lifecycle management 2019 Modules: Out-of-the-box log parsers Simplified ingest architecture with Filebeat modules & ingest node ELK Stack is born Logstash and Kibana released, forming an OSS logging alternative 2011-12 Logs UI: Integrating Logs with Metrics and APM Logging libraries support Elastic Common Schema, trace-id in logs, workflow from Logs to APM Elastic welcomes Beats to the family, introducing light-weight data shippers Filebeat: Lightweight log shipper
  22. None
  23. Elastic Stack for metrics

  24. Metrics vs Logs
    Metrics are periodic measurements of numeric KPIs
    Every x minutes, measure the CPU load, print it out, and annotate with meta-data.
    07/Jan/2019 16:10:00 all 2.58 0.00 0.70 1.12 0.05 95.55 server1 containerX regionA
    07/Jan/2019 16:20:00 all 2.56 0.00 0.69 1.05 0.04 95.66 server2 containerY regionB
    07/Jan/2019 16:30:00 all 2.64 0.00 0.65 1.15 0.05 95.50 server2 containerZ regionC
    Logs are chronological records of events
    For each event, print out what happened.
    64.242.88.10 - - [07/Jan/2019:16:10:02 -0800] "GET /mailman/listinfo/hsdivision HTTP/1.1" 200 6291
    64.242.88.10 - - [07/Jan/2019:16:10:02 -0800] "POST /twiki/bin/view/TWiki/WikiSyntax HTTP/1.1" 404 7352
    64.242.88.10 - - [07/Jan/2019:16:10:02 -0800] "GET /twiki/bin/view/Main/DCCAndPostFix HTTP/1.1" 200 5253
  25. Evolution of Elastic Stack to a Metrics Store
    Timeline years shown: 2012, 2014, 2016, 2018
    • BKD trees: Data structures optimized for numerical time series analysis.
    • Columnar storage: Structured data storage, resulting in compact storage and faster analytics
    • Rollups: Aggregate older data into bigger time buckets
    • Aggregations framework: Analytics features to slice and dice data along various dimensions
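The rollup idea on slide 25, aggregating older data into bigger time buckets, can be illustrated with a few lines of plain Python. This is only a sketch of the concept and does not use the Elasticsearch rollup feature; the function name `rollup` is hypothetical, the sample values are the last column of the three measurement lines on slide 24, and treating their timestamps as UTC is an assumption.

```python
from collections import defaultdict
from datetime import datetime, timezone


def rollup(samples, bucket_seconds):
    """Average (timestamp, value) samples into fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts, value in samples:
        epoch = int(ts.timestamp())
        # Align each sample to the start of its bucket.
        buckets[epoch - epoch % bucket_seconds].append(value)
    return {
        datetime.fromtimestamp(start, tz=timezone.utc): sum(values) / len(values)
        for start, values in sorted(buckets.items())
    }


# Ten-minute samples; timestamps assumed to be UTC for this example.
samples = [
    (datetime(2019, 1, 7, 16, 10, tzinfo=timezone.utc), 95.55),
    (datetime(2019, 1, 7, 16, 20, tzinfo=timezone.utc), 95.66),
    (datetime(2019, 1, 7, 16, 30, tzinfo=timezone.utc), 95.50),
]

# Roll the ten-minute samples up into one-hour buckets.
hourly = rollup(samples, bucket_seconds=3600)
```

The three samples collapse into a single bucket starting at 16:00, which is the trade the slide describes: less storage for older data in exchange for coarser time resolution.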
  26. Elastic as an Infrastructure Metrics Solution 201? 2017 Users start

    putting metrics in Elastic Need for high-cardinality aggregations, and correlating metrics and logs 2016 2018 2019
  27. Elastic as an Infrastructure Metrics Solution 201? 2017 Users start

    putting metrics in Elastic Need for high-cardinality aggregations, and correlating metrics and logs 2016 2018 2019 Metricbeat: Turnkey metric collection Metricbeat is introduced for turnkey metrics collection
  28. Elastic as an Infrastructure Metrics Solution 201? 2017 Users start

    putting metrics in Elastic Need for high-cardinality aggregations, and correlating metrics and logs 2016 2018 2019 Metricbeat: Turnkey metric collection Metricbeat is introduced for turnkey metrics collection Time Series Visual Builder UI for advanced metrics visualization, working with pipeline aggregations
  29. Elastic as an Infrastructure Metrics Solution 201? 2017 Users start

    putting metrics in Elastic Need for high-cardinality aggregations, and correlating metrics and logs 2016 2018 2019 Metricbeat: Turnkey metric collection Metricbeat is introduced for turnkey metrics collection Time Series Visual Builder UI for advanced metrics visualization, working with pipeline aggregations Prometheus / OpenMetrics integration Enables turnkey collection in Kubernetes ecosystem and beyond
  30. Elastic as an Infrastructure Metrics Solution 201? 2017 Users start

    putting metrics in Elastic Need for high-cardinality aggregations, and correlating metrics and logs 2016 2018 2019 Metricbeat: Turnkey metric collection Metricbeat is introduced for turnkey metrics collection Time Series Visual Builder UI for advanced metrics visualization, working with pipeline aggregations Prometheus / OpenMetrics integration Enables turnkey collection in Kubernetes ecosystem and beyond Infrastructure Metrics UI Containers, hosts, services, cloud monitoring, ad-hoc metrics exploration
  31. None
  32. Elastic Stack for APM

  33. Why APM? Example: Slow response or load times
    03:43:45 Request "GET cyclops.ESProductDetailView"
    03:43:57 Response "cyclops.ESProductDetailView 200 OK"
    12 seconds - zZzzZZz
  34. Why APM? Example: Errors & Exceptions
    03:43:59 Request "POST /api/checkout"
    03:43:59 Response "/api/checkout 500 ERROR"
  35. Distributed Tracing: Single Transaction
    HTTP request, Transaction (Span, Span, Span), Response
  36. Distributed Tracing: Multiple Services
    Trace A: Transaction 1 (Span, Span), Transaction 2 (Span), Transaction 3 (Span, Span, Span)
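The trace, transaction, and span hierarchy from slides 35 and 36 can be sketched as a small data model. This is an illustration only and not the Elastic APM agent API; the class names, service names, span names, and durations are all hypothetical, and only the shape (one trace, three transactions with 2, 1, and 3 spans) is taken from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    name: str
    duration_ms: float


@dataclass
class Transaction:
    service: str
    name: str
    spans: List[Span] = field(default_factory=list)

    def span_time_ms(self) -> float:
        """Total time spent in this transaction's spans."""
        return sum(span.duration_ms for span in self.spans)


@dataclass
class Trace:
    trace_id: str
    transactions: List[Transaction] = field(default_factory=list)

    def services(self) -> List[str]:
        """Names of the services this trace passed through."""
        return sorted({t.service for t in self.transactions})


# "Trace A" from the slide; every name and number below is made up.
trace_a = Trace(
    trace_id="trace-a",
    transactions=[
        Transaction("frontend", "GET /product", [Span("render", 40.0), Span("http call", 120.0)]),
        Transaction("catalog", "lookup", [Span("db query", 80.0)]),
        Transaction("pricing", "quote", [Span("cache", 5.0), Span("db query", 30.0), Span("serialize", 2.0)]),
    ],
)

total_spans = sum(len(t.spans) for t in trace_a.transactions)
```

A shared `trace_id` is what ties the transactions of different services together, which is the point of the "Multiple Services" slide.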
  37. Evolution of Elastic Stack to Open Source APM
    2017: Elastic joins forces with Opbeat. A next-generation APM solution designed for developers
    6.1: Elastic APM beta release. Including APM Server and curated APM UI native to Kibana
    6.2: Elastic APM GA. Agents for Python, Node.js, Ruby, JavaScript; Real User Monitoring
    6.4: Search for APM + more agents. Enabled search & Machine Learning for APM, Java agents GA, RUM GA
    6.6: Support for open tracing enabled with Distributed tracing, added Go Agent, integrated UI with Logs & Metrics
    Beyond
  38. APM Agents
    Language Support
    • Java • Go • .NET • JavaScript (React / Angular) • RUM (Real User Monitoring) • Python • Ruby • Node.js
    Auto-instrumentation of common programming frameworks
    • Easy to add to your applications
    • Designed to be lightweight
    • Open source
    • Support distributed tracing
    • OpenTracing compliant
  39. Distributed Tracing & OpenTracing: End-to-end transaction tracking with auto-instrumentation or OpenTracing IDs
  40. None
  41. Elastic Common Schema (ECS)
    Supports ad-hoc analysis in Kibana Dashboards
    Benefits
    • Correlate data from different sources
    • Ability to re-use analysis content
    • Ability to re-use Elastic-provided content
    Status
    • Published at: github.com/elastic/ecs
    • Supported in Beats and APM since 7.0
    • Community feedback welcome!
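The first ECS benefit, correlating data from different sources, follows from every source using the same field names. The sketch below shows the idea in plain Python rather than with an Elasticsearch query: the field names (`trace.id`, `host.name`, `service.name`, `event.dataset`, `http.response.status_code`) are ECS fields, while the `correlate` helper and all document values are invented for this example.

```python
from collections import defaultdict


def correlate(documents, field_name):
    """Group documents from any source by one shared field."""
    grouped = defaultdict(list)
    for doc in documents:
        value = doc.get(field_name)
        if value is not None:
            grouped[value].append(doc)
    return dict(grouped)


# Hypothetical documents from three sources, all using ECS field names.
log_doc = {
    "event.dataset": "apache.access",
    "http.response.status_code": 500,
    "host.name": "server1",
    "trace.id": "abc123",
}
apm_doc = {
    "service.name": "checkout",
    "host.name": "server1",
    "trace.id": "abc123",
}
metric_doc = {
    "event.dataset": "system.cpu",
    "host.name": "server1",
}

docs = [log_doc, apm_doc, metric_doc]
by_trace = correlate(docs, "trace.id")   # log line and APM transaction together
by_host = correlate(docs, "host.name")   # all three sources for one host
```

Because the log and APM documents agree on `trace.id`, and all three agree on `host.name`, the same grouping logic works across sources without per-source field mapping.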
  42. Demo

  43. What now? Try it yourself!

  44. While you observe, why not protect? Elastic SIEM & Endpoint

  45. Thank you!

  46. Questions?