
Combining Logs, Metrics, and Traces for Unified Observability @ Infoshare 2020


Learn how Elasticsearch efficiently combines data in a single store and how Kibana is used to analyze it. Plus, see how recent developments help identify, troubleshoot, and resolve operational issues faster.

Sebastian Grodzicki

September 25, 2020


Transcript

  1. Sebastian Grodzicki
    Engineering Manager @ Elastic
    Infoshare 2020
    Combining Logs, Metrics, and Traces
    for Unified Observability


  2. #ObservaBLT
    Observability = Logs + Metrics + Traces


  3. View Slide

  4. Monitoring Complexity
    Hardware & software trends are evolving in tandem
    Evolving Architectures ~↑ Monitoring Complexity

    Higher resource utilization increases monitoring complexity
    • Orchestration/Hypervisor
    • Dynamic/ephemeral jobs
    • You can no longer "point" to where that job lives

    Shift to cloud-native yields maintainable code, with costs
    • Traditional licensing models don't scale as well as your applications
    • Hurdles with autoscaling


  5. View Slide

  6. Applications VMs/Containers
    Other DBs,
    Services &
    Middleware
    Orchestration Infrastructure
    APM
    Metrics
    Logs
    Uptime
    Uptime
    APM Metrics
    APM Logs
    APM
    APM
    Metrics
    Logs
    Uptime
    Metrics
    Logs
    Uptime
    APM


  7. Development
    Team
    Ops: Log
    Monitoring
    Uptime
    Response Time
    Uptime Tool
    Ops: Infra
    Monitoring
    Web Logs
    App Logs
    Database Logs
    Container Logs
    Log Tool
    Ops: Service
    Monitoring
    Real User Monitoring
    Txn Perf Monitoring
    Distributed Tracing
    APM Tool
    Container Metrics
    Host Metrics
    Database Metrics
    Network Metrics
    Storage Metrics
    Metrics Tool
    Status Quo: Siloed Collection of Tools


  8. Observability is a search use case


  9. APM Data Uptime Data
    Metrics Data
    Log Data
    Elastic Approach to Observability
    Uptime
    Response Time
    Correctness
    Certificate
    Validation
    Web Logs
    App Logs
    Database Logs
    Container Logs
    Real User Monitoring
    Txn Perf Monitoring
    Distributed Tracing
    Dependency Mapping
    Host/Container Metrics
    Database Metrics
    Network Metrics
    Storage Metrics
    Dev & Ops Teams


  10. Unified User Interface
    Same UI for KPI summaries and root-cause analysis


  11. Correlate multiple data sources for more intelligent anomaly detection
    Unified Machine Learning


  12. Trigger off any operational data to provide unified SLA monitoring
    Unified Alerting


  13. Pricing aligned with business value
    Unified Licensing Model
    PER AGENT $$$$
    PER HOST $$$$
    PER INGEST $$$$
    PER MONITOR $$$$
    PER ADD-ON $$$$
    • Intuitive: single, unified pricing model. No add-ons.
    • Cloud native: works with container workloads and serverless.
    • Future proof: you pay for capacity and are not locked into a specific use case.


  14. Elastic Stack for logs


  15. Logs
    64.242.88.10 - - [07/Jan/2019:16:10:02 -0800] "GET /mailman/listinfo/hsdivision HTTP/1.1" 200 6291

    64.242.88.10 - - [07/Jan/2019:16:10:02 -0800] "POST /twiki/bin/view/TWiki/WikiSyntax HTTP/1.1" 404 7352

    64.242.88.10 - - [07/Jan/2019:16:10:02 -0800] "GET /twiki/bin/view/Main/DCCAndPostFix HTTP/1.1" 200 5253

    For each event, print out what happened.
    Logs are chronological records of events
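
The access-log lines above can be turned into structured events before indexing. The following is a minimal sketch, assuming the lines follow the Common Log Format without referrer or user-agent fields; the pattern, the function name `parse_access_log`, and the output field names are illustrative and are not taken from the talk.

```python
import re
from datetime import datetime

# Assumed layout: Common Log Format (client, identity, user, [time], "request", status, size).
LOG_PATTERN = re.compile(
    r'(?P<client_ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_access_log(line):
    """Turn one access-log line into a dict, or return None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    event = match.groupdict()
    # Example timestamp: 07/Jan/2019:16:10:02 -0800
    event["timestamp"] = datetime.strptime(event["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    event["status"] = int(event["status"])
    event["size"] = 0 if event["size"] == "-" else int(event["size"])
    return event


sample = (
    '64.242.88.10 - - [07/Jan/2019:16:10:02 -0800] '
    '"POST /twiki/bin/view/TWiki/WikiSyntax HTTP/1.1" 404 7352'
)
event = parse_access_log(sample)
print(event["method"], event["path"], event["status"])
```

Once each line is a dictionary, filtering on a field such as `status` no longer depends on string matching against the raw text.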


  16. Ongoing investment in log ingest & long-term retention
    2015
    2016
    2018
    2017
    2019
    ELK Stack is born
    Logstash and Kibana released, forming an
    OSS logging alternative
    2011-12


  17. Ongoing investment in log ingest & long-term retention
    2015
    2016
    2018
    Elastic welcomes Beats to the family,
    introducing light-weight data shippers
    2017
    2019
    ELK Stack is born
    Logstash and Kibana released, forming an
    OSS logging alternative
    2011-12
    Filebeat: Lightweight log shipper


  18. Ongoing investment in log ingest & long-term retention
    2015
    2016
    2018
    Filebeat: Lightweight log shipper
    Elastic welcomes Beats to the family,
    introducing light-weight data shippers
    2017
    2019
    Simplified ingest architecture with Filebeat
    modules & ingest node
    ELK Stack is born
    Logstash and Kibana released, forming an
    OSS logging alternative
    2011-12
    Modules: Out-of-the-box log parsers


  19. Elastic welcomes Beats to the family,
    introducing light-weight data shippers
    Ongoing investment in log ingest & long-term retention
    2015
    2016
    Hosted Logging in Elastic Cloud & ECE
    Introduction of ECE enabling log clusters with index
    curation, hot/warm templates
    2018
    Filebeat: Lightweight log shipper
    2017
    2019
    Modules: Out-of-the-box log parsers
    Simplified ingest architecture with Filebeat
    modules & ingest node
    ELK Stack is born
    Logstash and Kibana released, forming an
    OSS logging alternative
    2011-12


  20. Ongoing investment in log ingest & long-term retention
    2015
    2016
    Hosted Logging in Elastic Cloud & ECE
    Introduction of ECE enabling log clusters with index
    curation, hot/warm templates
    2018
    2017
    Cold storage for logging: Frozen Indices & ILM
    Curated log-based troubleshooting, improved cold
    storage efficiency and index lifecycle management
    2019
    Modules: Out-of-the-box log parsers
    Simplified ingest architecture with Filebeat
    modules & ingest node
    Hot. Warm. Cold. Delete.
    ELK Stack is born
    Logstash and Kibana released, forming an
    OSS logging alternative
    2011-12
    Elastic welcomes Beats to the family,
    introducing light-weight data shippers
    Filebeat: Lightweight log shipper
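
The "Hot. Warm. Cold. Delete." lifecycle can be written down as an index lifecycle management policy. The sketch below builds a request body of the kind accepted by `PUT _ilm/policy/<name>`; all thresholds are hypothetical, and the `freeze` action is assumed to be available in the Elasticsearch version in use, so check the ILM documentation for your release before relying on it.

```python
import json

# Hypothetical thresholds; tune them to your own retention requirements.
ilm_policy = {
    "policy": {
        "phases": {
            # Hot: actively written; roll over to a new index when it grows or ages.
            "hot": {"actions": {"rollover": {"max_size": "50gb", "max_age": "1d"}}},
            # Warm: no longer written; merge segments to cut search overhead.
            "warm": {"min_age": "7d", "actions": {"forcemerge": {"max_num_segments": 1}}},
            # Cold: rarely searched; freeze to reduce memory use (assumed available).
            "cold": {"min_age": "30d", "actions": {"freeze": {}}},
            # Delete: retention period is over.
            "delete": {"min_age": "90d", "actions": {"delete": {}}},
        }
    }
}

print(json.dumps(ilm_policy, indent=2))
```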


  21. Ongoing investment in log ingest & long-term retention
    2015
    2016
    Hosted Logging in Elastic Cloud & ECE
    Introduction of ECE enabling log clusters with index
    curation, hot/warm templates
    2018
    2017
    Cold storage for logging: Frozen Indices & ILM
    Curated log-based troubleshooting, improved cold
    storage efficiency and index lifecycle management
    2019
    Modules: Out-of-the-box log parsers
    Simplified ingest architecture with Filebeat
    modules & ingest node
    ELK Stack is born
    Logstash and Kibana released, forming an
    OSS logging alternative
    2011-12
    Logs UI: Integrating Logs with Metrics and APM
    Logging libraries support Elastic Common Schema,
    trace-id in logs, workflow from Logs to APM
    Elastic welcomes Beats to the family,
    introducing light-weight data shippers
    Filebeat: Lightweight log shipper


  22. View Slide

  23. Elastic Stack for metrics


  24. Metrics vs Logs
    Metrics are periodic measurements of numeric KPIs
    07/Jan/2019 16:10:00 all 2.58 0.00 0.70 1.12 0.05 95.55 server1 containerX regionA

    07/Jan/2019 16:20:00 all 2.56 0.00 0.69 1.05 0.04 95.66 server2 containerY regionB

    07/Jan/2019 16:30:00 all 2.64 0.00 0.65 1.15 0.05 95.50 server2 containerZ regionC


    Every x minutes, measure the CPU load, print it out, and annotate it with metadata.

    64.242.88.10 - - [07/Jan/2019:16:10:02 -0800] "GET /mailman/listinfo/hsdivision HTTP/1.1" 200 6291
    64.242.88.10 - - [07/Jan/2019:16:10:02 -0800] "POST /twiki/bin/view/TWiki/WikiSyntax HTTP/1.1" 404 7352
    64.242.88.10 - - [07/Jan/2019:16:10:02 -0800] "GET /twiki/bin/view/Main/DCCAndPostFix HTTP/1.1" 200 5253
    For each event, print out what happened.
    Logs are chronological records of events
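
A metric sample like the ones above is a timestamp, a set of numeric values, and metadata labels. The following is a minimal sketch of parsing one; the slide does not name the columns, so the `sar -u` style order (user, nice, system, iowait, steal, idle) and the label names are assumptions.

```python
from datetime import datetime

# Assumed column order; the slide does not label the numeric columns.
CPU_FIELDS = ["user", "nice", "system", "iowait", "steal", "idle"]


def parse_cpu_sample(line):
    """Split one sample line into a timestamp, CPU percentages, and metadata labels."""
    parts = line.split()
    timestamp = datetime.strptime(" ".join(parts[0:2]), "%d/%b/%Y %H:%M:%S")
    # parts[2] is the CPU selector ("all"); the six percentages follow it.
    values = [float(v) for v in parts[3:9]]
    host, container, region = parts[9:12]
    return {
        "timestamp": timestamp,
        "cpu": dict(zip(CPU_FIELDS, values)),
        "labels": {"host": host, "container": container, "region": region},
    }


sample = parse_cpu_sample(
    "07/Jan/2019 16:10:00 all 2.58 0.00 0.70 1.12 0.05 95.55 server1 containerX regionA"
)
print(sample["labels"]["host"], sample["cpu"]["idle"])
```

Unlike a log event, the sample exists whether or not anything happened; the labels are what later allow slicing by host, container, or region.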


  25. Evolution of Elastic Stack to a Metrics Store
    BKD trees
    Data structures optimized for numerical time
    series analysis.
    Columnar storage
    Structured data storage, resulting in compact
    storage and faster analytics
    Rollups
    Aggregate older data into bigger time buckets
    Aggregations framework
    Analytics features to slice and dice data along
    various dimensions
    2012
    2016
    2014
    2018
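
The rollup idea, aggregating older data into bigger time buckets, can be illustrated in a few lines. This is a conceptual sketch only and not the Elasticsearch rollup API; the function name `rollup`, the choice of averaging, and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def rollup(samples, bucket):
    """Average (timestamp, value) samples into fixed-size, epoch-aligned time buckets."""
    bucket_seconds = int(bucket.total_seconds())
    grouped = defaultdict(list)
    for ts, value in samples:
        epoch = int(ts.timestamp())
        # Truncate the timestamp down to the start of its bucket.
        grouped[epoch - epoch % bucket_seconds].append(value)
    return {
        datetime.fromtimestamp(start, tz=timezone.utc): sum(vals) / len(vals)
        for start, vals in sorted(grouped.items())
    }


utc = timezone.utc
samples = [
    (datetime(2019, 1, 7, 16, 10, tzinfo=utc), 2.58),
    (datetime(2019, 1, 7, 16, 20, tzinfo=utc), 2.56),
    (datetime(2019, 1, 7, 16, 30, tzinfo=utc), 2.64),
    (datetime(2019, 1, 7, 17, 5, tzinfo=utc), 3.00),
]
hourly = rollup(samples, timedelta(hours=1))
for start, average in hourly.items():
    print(start.isoformat(), round(average, 2))
```

Four ten-minute samples collapse into two hourly values, which is the storage trade-off a rollup makes: less detail for old data in exchange for less space.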


  26. Elastic as an Infrastructure Metrics Solution
    201?
    2017
    Users start putting metrics in Elastic
    Need for high-cardinality aggregations, and
    correlating metrics and logs
    2016
    2018
    2019


  27. Elastic as an Infrastructure Metrics Solution
    201?
    2017
    Users start putting metrics in Elastic
    Need for high-cardinality aggregations, and
    correlating metrics and logs
    2016
    2018
    2019
    Metricbeat: Turnkey metric collection
    Metricbeat is introduced for turnkey metrics
    collection


  28. Elastic as an Infrastructure Metrics Solution
    201?
    2017
    Users start putting metrics in Elastic
    Need for high-cardinality aggregations, and
    correlating metrics and logs
    2016
    2018
    2019
    Metricbeat: Turnkey metric collection
    Metricbeat is introduced for turnkey metrics
    collection
    Time Series Visual Builder
    UI for advanced metrics visualization, working
    with pipeline aggregations


  29. Elastic as an Infrastructure Metrics Solution
    201?
    2017
    Users start putting metrics in Elastic
    Need for high-cardinality aggregations, and
    correlating metrics and logs
    2016
    2018
    2019
    Metricbeat: Turnkey metric collection
    Metricbeat is introduced for turnkey metrics
    collection
    Time Series Visual Builder
    UI for advanced metrics visualization, working
    with pipeline aggregations
    Prometheus / OpenMetrics integration
    Enables turnkey collection in Kubernetes
    ecosystem and beyond


  30. Elastic as an Infrastructure Metrics Solution
    201?
    2017
    Users start putting metrics in Elastic
    Need for high-cardinality aggregations, and
    correlating metrics and logs
    2016
    2018
    2019
    Metricbeat: Turnkey metric collection
    Metricbeat is introduced for turnkey metrics
    collection
    Time Series Visual Builder
    UI for advanced metrics visualization, working
    with pipeline aggregations
    Prometheus / OpenMetrics integration
    Enables turnkey collection in Kubernetes
    ecosystem and beyond
    Infrastructure Metrics UI
    Containers, hosts, services, cloud monitoring,
    ad-hoc metrics exploration


  31. View Slide

  32. Elastic Stack for APM


  33. Why APM?
    03:43:45 Request "GET cyclops.ESProductDetailView"
    03:43:57 Response "cyclops.ESProductDetailView 200 OK"
    12 seconds - zZzzZZz
    Example: Slow response or load times
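
The 12-second figure is simply the gap between the request and response timestamps. As a small sketch, assuming both timestamps fall on the same day (the helper name `response_time_seconds` is hypothetical):

```python
from datetime import datetime


def response_time_seconds(request_time, response_time):
    """Return the seconds elapsed between two HH:MM:SS timestamps on the same day."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(response_time, fmt) - datetime.strptime(request_time, fmt)
    return delta.total_seconds()


elapsed = response_time_seconds("03:43:45", "03:43:57")
print(elapsed)  # 12.0
```

Two log lines give the total duration but not where the time went, which is the gap that transaction and span data is meant to fill.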


  34. Why APM?
    03:43:59 Request "POST /api/checkout"
    03:43:59 Response "/api/checkout 500 ERROR"
    Example: Errors & Exceptions


  35. Distributed Tracing
    Span
    Span
    Span
    HTTP request Response
    Transaction
    Single Transaction


  36. Distributed Tracing
    Trace A
    Transaction 1
    Span
    Span
    Transaction 2
    Span
    Transaction 3
    Span
    Span
    Span
    Multiple Services
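
The trace, transaction, and span hierarchy on this slide can be modeled as nested records. This is a simplified sketch of the data model, not the APM agents' actual classes; the service names, span names, and durations are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    """One timed operation inside a transaction, for example a database query."""
    name: str
    duration_ms: float


@dataclass
class Transaction:
    """The work one service does for a request; it groups that service's spans."""
    service: str
    name: str
    spans: List[Span] = field(default_factory=list)


@dataclass
class Trace:
    """All transactions that share one trace ID across services."""
    trace_id: str
    transactions: List[Transaction] = field(default_factory=list)

    def span_count(self) -> int:
        return sum(len(t.spans) for t in self.transactions)

    def services(self) -> List[str]:
        return [t.service for t in self.transactions]


# Mirrors the slide: Trace A with three transactions holding 2, 1, and 3 spans.
trace_a = Trace(
    trace_id="trace-a",
    transactions=[
        Transaction("frontend", "GET /product",
                    [Span("render", 12.0), Span("HTTP call to catalog", 40.0)]),
        Transaction("catalog", "GET /api/product",
                    [Span("SELECT product", 25.0)]),
        Transaction("checkout", "POST /api/checkout",
                    [Span("validate cart", 3.0), Span("INSERT order", 18.0),
                     Span("HTTP call to payment", 60.0)]),
    ],
)
print(trace_a.span_count(), trace_a.services())
```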


  37. Evolution of Elastic Stack to Open Source APM
    Elastic joins forces with Opbeat
    A next-generation APM solution designed for
    developers
    2017
    6.1
    Search for APM + more agents
    Enabled search & Machine Learning for APM,
    Java agents GA, RUM GA
    6.4
    Elastic APM beta release
    Including APM Server and curated APM UI
    native to Kibana
    6.2
    OpenTracing support enabled with
    distributed tracing, added Go agent,
    integrated UI with Logs & Metrics
    6.6
    Elastic APM GA
    Agents for Python, Node.js, Ruby, JavaScript;
    Real User Monitoring
    Beyond


  38. APM Agents
    Auto-instrumentation of common programming frameworks
    Language Support
    ● Java
    ● Go
    ● .NET
    ● JavaScript (React / Angular)
    ● RUM (Real User Monitoring)
    ● Python
    ● Ruby
    ● Node.js
    • Easy to add to your applications
    • Designed to be lightweight
    • Open source
    • Support distributed tracing
    • OpenTracing compliant


  39. Distributed Tracing & OpenTracing
    End-to-end transaction tracking with auto-instrumentation or OpenTracing IDs


  40. View Slide

  41. Elastic Common Schema (ECS)
    Supports ad-hoc analysis in Kibana Dashboards
    Benefits
    • Correlate data from different sources
    • Ability to re-use analysis content
    • Ability to re-use Elastic-provided content
    Status
    • Published at: github.com/elastic/ecs
    • Supported in Beats and APM since 7.0
    • Community feedback welcome!
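
Correlation across sources works because every source uses the same field names. The sketch below joins log events to APM transactions on a shared `trace.id`; the documents are invented, the helper names are hypothetical, and the field names are modeled on ECS and Elastic APM documents, so verify them against the published schema.

```python
from collections import defaultdict

# Hypothetical documents using ECS-style dotted field names.
log_events = [
    {"@timestamp": "2019-01-07T16:10:02Z", "service.name": "checkout",
     "log.level": "error", "message": "payment gateway timeout", "trace.id": "abc123"},
    {"@timestamp": "2019-01-07T16:10:03Z", "service.name": "frontend",
     "log.level": "info", "message": "rendered product page", "trace.id": "def456"},
    {"@timestamp": "2019-01-07T16:10:04Z", "service.name": "checkout",
     "log.level": "warn", "message": "retrying payment", "trace.id": "abc123"},
]
apm_transactions = [
    {"trace.id": "abc123", "service.name": "checkout", "transaction.name": "POST /api/checkout"},
    {"trace.id": "def456", "service.name": "frontend", "transaction.name": "GET /product"},
]


def logs_by_trace(logs):
    """Index log events by trace ID, skipping events that carry none."""
    index = defaultdict(list)
    for log_event in logs:
        trace_id = log_event.get("trace.id")
        if trace_id is not None:
            index[trace_id].append(log_event)
    return index


def attach_logs(transactions, logs):
    """Return copies of the transactions with their matching log events attached."""
    index = logs_by_trace(logs)
    return [{**txn, "logs": index.get(txn["trace.id"], [])} for txn in transactions]


enriched = attach_logs(apm_transactions, log_events)
for txn in enriched:
    print(txn["transaction.name"], [entry["message"] for entry in txn["logs"]])
```

If the log shipper and the APM agent named this field differently, the join would need a per-source mapping; a common schema removes that step.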


  42. Demo


  43. What now?
    Try it yourself!


  44. While you observe, why not protect?
    Elastic SIEM & Endpoint


  45. Thank you!


  46. Questions?
