
Elastic{ON} 2018 - Logs, Metrics, and APM: The Holy Trinity of Operations

Elastic Co
March 01, 2018


Transcript

  1. Logs, Metrics, and APM: The Holy Trinity of Operations

    Tanya Bragin, Senior Director, Product
    Elastic, March 1, 2018
  2. Logs vs Metrics

    64.242.88.10 - - [07/Mar/2017:16:10:02 -0800] "GET /mailman/listinfo/hsdivision HTTP/1.1" 200 6291
    64.242.88.10 - - [07/Mar/2017:16:11:58 -0800] "POST /twiki/bin/view/TWiki/WikiSyntax HTTP/1.1" 404 7352
    64.242.88.10 - - [07/Mar/2017:16:20:55 -0800] "GET /twiki/bin/view/Main/DCCAndPostFix HTTP/1.1" 200 5253

    For each event, print out what happened.

    07/Mar/2017 16:10:00 all 2.58 0.00 0.70 1.12 0.05 95.55 server1 containerX regionA
    07/Mar/2017 16:20:00 all 2.56 0.00 0.69 1.05 0.04 95.66 server2 containerY regionB
    07/Mar/2017 16:30:00 all 2.64 0.00 0.65 1.15 0.05 95.50 server2 containerZ regionC

    Every x minutes, measure the CPU load and print it out.

    Metrics are periodic measurements of some KPIs.
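To make the contrast concrete, here is a minimal Python sketch that turns one sample log line and one sample metric line from this slide into structured fields. The field names, and the reading of the six numeric columns as sar-style CPU percentages, are assumptions for illustration; the slide does not label them.

```python
import re
from datetime import datetime

# Regex for the access-log lines shown on the slide (Apache common log format).
# Group names are illustrative choices, not taken from the deck.
ACCESS_LOG = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+)'
)

# Assumed column names: the metric sample resembles sar-style CPU output.
CPU_COLUMNS = ("user", "nice", "system", "iowait", "steal", "idle")


def parse_access_log(line):
    """Parse one access-log line into a dict of typed fields, or return None."""
    match = ACCESS_LOG.match(line)
    if match is None:
        return None
    event = match.groupdict()
    event["timestamp"] = datetime.strptime(event["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    event["status"] = int(event["status"])
    event["size"] = int(event["size"])
    return event


def parse_cpu_sample(line):
    """Parse one periodic CPU measurement line into a dict, or return None."""
    parts = line.split()
    if len(parts) != 12:
        return None
    sample = {
        "timestamp": datetime.strptime(parts[0] + " " + parts[1], "%d/%b/%Y %H:%M:%S"),
        "cpu": parts[2],
        "host": parts[9],
        "container": parts[10],
        "region": parts[11],
    }
    for name, value in zip(CPU_COLUMNS, parts[3:9]):
        sample[name] = float(value)
    return sample


log_event = parse_access_log(
    '64.242.88.10 - - [07/Mar/2017:16:10:02 -0800] '
    '"GET /mailman/listinfo/hsdivision HTTP/1.1" 200 6291'
)
cpu_sample = parse_cpu_sample(
    "07/Mar/2017 16:10:00 all 2.58 0.00 0.70 1.12 0.05 95.55 server1 containerX regionA"
)
```

Once both kinds of data are structured this way, each record is a timestamp plus a set of typed fields, which is what allows logs and metrics to be stored and queried side by side.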
  3. Structured logs and metrics analyzed together

    • Metric: CPU (avg, per interval)
    • Logs: response time (avg, per interval)
  4. APM data looks a lot like logs and metrics

    • APM agents look at:
        • Transaction durations
        • Application errors
    • Send across text-heavy, rich metadata:
        • Code path executions (“spans”)
        • Code statements associated with the error
    • Could also get metrics, e.g. in-app memory usage
  5. Elasticsearch beginnings

    Circa 2010
    • Elasticsearch primarily used for application search
    • Lucene data structure: Inverted index
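For readers unfamiliar with the inverted index mentioned on this slide, the following is a toy Python sketch of the idea: a map from each term to the set of documents that contain it. It illustrates the concept only and is not how Lucene implements it; the function names and the whitespace tokenizer are assumptions.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {1: "quick brown fox", 2: "lazy brown dog", 3: "quick dog"}
index = build_inverted_index(docs)
```

A full-text query then becomes a set lookup per term followed by an intersection, rather than a scan over every document.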
  6. From Elasticsearch to ELK

    ~ 2010 to 2014
    • 2011: Logstash is open-sourced; key part is “grok” for structuring logs
    • 2012: Kibana is open-sourced; ELK is widely used to search structured logs and create operational dashboards
  7. From Elasticsearch to ELK

    ~ 2010 to 2014
    • 2011: Logstash is open-sourced; key part is “grok” for structuring logs
    • 2012: Kibana is open-sourced; ELK is widely used to search structured logs and create operational dashboards
    • 2014: Kibana adds dashboarding; ELK stack gains prominence for log analytics
  8. Elasticsearch evolving to support analytics

    ~ 2010 to 2014
    • 2010: Elasticsearch adds support for “fielddata” (column-oriented view of the data in memory)
    https://www.elastic.co/blog/elasticsearch-as-a-column-store
  9. Elasticsearch evolving to support analytics

    ~ 2010 to 2014
    • 2010: Elasticsearch adds support for “fielddata” (column-oriented view of the data in memory)
    • 2012: Lucene introduces off-heap columnar store for numbers (“doc values”)
    • 2014: Elasticsearch 1.0 adds support for “doc values” (column store)
    https://www.elastic.co/blog/elasticsearch-as-a-column-store
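The value of a column-oriented view for analytics can be sketched in a few lines of Python. This is a conceptual illustration only, assuming hypothetical helper names; it does not reflect how fielddata or doc values are laid out in memory or on disk.

```python
def to_columns(docs, fields):
    """Pivot a list of documents (rows) into one list of values per field."""
    return {field: [doc.get(field) for doc in docs] for field in fields}


def average(column):
    """Average a numeric column, skipping documents that lack the field."""
    values = [value for value in column if value is not None]
    return sum(values) / len(values) if values else None


# Row-oriented view: one dict per document.
docs = [
    {"host": "server1", "cpu": 2.58},
    {"host": "server2", "cpu": 2.56},
    {"host": "server2", "cpu": 2.64},
    {"host": "server3"},  # a document without the "cpu" field
]

# Column-oriented view: all values of a field are stored together.
columns = to_columns(docs, ["host", "cpu"])
cpu_average = average(columns["cpu"])
```

An aggregation such as an average only needs to read the one column it operates on, which is why a columnar layout suits sorting and aggregations better than looking values up document by document.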
  10. Elasticsearch storage efficiencies

    2014 to Present
    • 2017: Elasticsearch 6.0 improves Lucene sparse values storage efficiency (41.5% in Metricbeat index size)
    https://www.elastic.co/blog/minimize-index-storage-size-elasticsearch-6-0
  11. Elasticsearch storage efficiencies

    2014 to Present
    • 2015: Elasticsearch 2.0 more aggressive text compression (with DEFLATE)
    https://www.elastic.co/blog/store-compression-in-lucene-and-elasticsearch
  12. Elasticsearch storage efficiencies

    2014 to Present
    • 2016: Elasticsearch 5.0 adds more data structures for efficient storing and querying numbers (BKD Trees)
    Figure panels: 1-Dimension, 2-Dimensions
    https://www.elastic.co/blog/lucene-points-6.0
  13. Elasticsearch query efficiencies

    2014 to Present
    • 2016: Elasticsearch 5.0 adds more data structures for efficient storing and querying numbers (BKD Trees)
    Figure panels: 1-Dimension, 2-Dimensions
    https://www.elastic.co/blog/lucene-points-6.0
  14. Elasticsearch query efficiencies

    2014 to Present
    • Speed up common queries and aggregations
        • 2014: Per-shard result cache
        • 2016: Advanced query rewriting
    https://www.elastic.co/blog/instant-aggregations-rewriting-queries-for-fun-and-profit
  15. Elasticsearch query efficiencies

    2014 to Present
    • Reduce memory usage of complex filters
        • 201?: Filter cache
        • 2015: Roaring bitmaps
    https://www.elastic.co/blog/frame-of-reference-and-roaring-bitmaps
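As a rough illustration of why roaring bitmaps reduce the memory cost of cached filters, the Python sketch below splits document ids into blocks by their upper 16 bits and stores each block either as a sorted array (sparse blocks) or as a fixed-size bitmap (dense blocks). It is a simplified sketch under stated assumptions: the 4096 threshold follows the general roaring bitmap design, and the function names and container layout are hypothetical, not Lucene's implementation.

```python
from bisect import bisect_left

ARRAY_LIMIT = 4096  # above this many ids per block, a bitmap is smaller than an array


def to_roaring(doc_ids):
    """Group ids by their upper 16 bits and choose a container per block."""
    blocks = {}
    for doc_id in sorted(set(doc_ids)):
        blocks.setdefault(doc_id >> 16, []).append(doc_id & 0xFFFF)
    containers = {}
    for high, lows in blocks.items():
        if len(lows) <= ARRAY_LIMIT:
            # Sparse block: keep the sorted 16-bit values as-is.
            containers[high] = ("array", lows)
        else:
            # Dense block: one bit per possible 16-bit value (8 KiB).
            bitmap = bytearray(65536 // 8)
            for low in lows:
                bitmap[low >> 3] |= 1 << (low & 7)
            containers[high] = ("bitmap", bitmap)
    return containers


def contains(containers, doc_id):
    """Check whether doc_id is present in the encoded set."""
    entry = containers.get(doc_id >> 16)
    if entry is None:
        return False
    kind, data = entry
    low = doc_id & 0xFFFF
    if kind == "array":
        position = bisect_left(data, low)
        return position < len(data) and data[position] == low
    return bool(data[low >> 3] & (1 << (low & 7)))


# 5000 ids in the first block (dense) and two ids in the second block (sparse).
encoded = to_roaring(list(range(5000)) + [70000, 70005])
```

Choosing the container per block keeps both very sparse and very dense filters compact, instead of paying for a full bitmap or a full id list in every case.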
  16. Elasticsearch for search and numerical analytics

    • Inverted Index for full-text search
    • Columnar store for structured data
    • BKD Trees for numerical operations
    • Caches: shard-level request/result caches, filter cache, etc.
  17. Elastic evolving ingest story

    Diagram labels: CENTRALIZED COLLECTION; Logstash; Elasticsearch; Transform; Store; ingest node; data node; network devices; DISTRIBUTED COLLECTION; Beats; servers, containers
  18. Immediate insights with modules

    • Turnkey experience for specific data types
    • Data to dashboard in just one step
    • Automated parsing and enrichment
    • Default dashboards, alerts, ML jobs
    Logging Metrics Security
    Available with
  19. Logging modules

    System • Linux / MacOS • Windows Events
    Containers • Docker • Kubernetes
    Infrastructure
    Applications
    Databases • MySQL • PostgreSQL
    Queues • Kafka • Redis
    Web servers • Apache • Nginx
    Audit data • Filesystem • System calls
    WINLOGBEAT FILEBEAT AUDITBEAT
  20. Metrics modules

    System • Linux • MacOS • Windows • Perfmon
    Infrastructure
    Cloud • AWS • GCP • Azure • DigitalOcean
    Containers • Docker • Kubernetes
    Virtualization • vSphere
    PACKETBEAT METRICBEAT
    Network • Netflow • Packets • TLS Envelope
    Storage • Ceph
    LOGSTASH
  21. Metrics modules

    Applications
    Datastores • MySQL • PostgreSQL • MongoDB • Couchbase • Aerospike • Graphite
    Web servers • Apache • Nginx
    Other • HAProxy • Zookeeper
    Queues • Kafka • Redis • RabbitMQ
    Caches • Memcached
    Uptime • Heartbeat
    Custom apps • JMX/Jolokia • PHP-FPM • Golang
    PACKETBEAT METRICBEAT LOGSTASH HEARTBEAT
  22. Elastic APM

    APM adds end-user experience and application-level monitoring to the stack
    • First open-source alternative to traditional APM tools
    • Focused on areas underserved by traditional vendors
    • Active roadmap to expand programming language support
  23. Elastic APM: How it works

    Diagram labels: Kibana; Beats; Logstash; Elasticsearch; APM Server; APM Agents; Logs; Metrics; Packets; ...; Datastore; JMX
  24. Elasticsearch as APM datastore

    The Journey
    • Opbeat migrated from a combination of Cassandra and Redis to Elasticsearch
    • Much of the data that was pre-aggregated before is now stored as raw documents in Elasticsearch
    • Ad-hoc querying flexibility for the user
    • New feature development agility for engineering
  25. New operational data sources

    It all starts with the data
    • New Beats and Logstash inputs and modules
    • Improved dashboards and ML jobs / alerts for existing modules
    • Agentless shippers
    • Distributed tracing
  26. Elastic Common Schema

    Benefits
    • Correlate data from different sources
    • Ability to re-use analysis content
    • Ability to re-use Elastic-provided content
    Status
    • Preliminary review
    • Working closely with the community
    • Will provide more information via usual channels
  27. Rollup support

    • Caveat: Lose ability to query individual events on rolled-up data
    • Recommended for long retention use cases, such as capacity planning
    • Can accomplish this today with Watcher-enabled rollups
    • Built-in rollup support in active development
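The trade-off named in the caveat above can be sketched in Python: a rollup replaces the raw events in each time bucket with a small summary, so long-term trends stay queryable while the individual events do not. This is a minimal sketch of the general idea, not the Watcher-based approach or the built-in feature; the function name, the epoch-second timestamps, and the choice of summary statistics are assumptions.

```python
def rollup(events, interval_seconds, field):
    """Summarise raw events into one min/max/sum/count record per time bucket."""
    buckets = {}
    for event in events:
        # Align the event's epoch-second timestamp to the start of its bucket.
        start = event["timestamp"] - event["timestamp"] % interval_seconds
        value = event[field]
        summary = buckets.setdefault(
            start, {"min": value, "max": value, "sum": 0.0, "count": 0}
        )
        summary["min"] = min(summary["min"], value)
        summary["max"] = max(summary["max"], value)
        summary["sum"] += value
        summary["count"] += 1
    return buckets


raw_events = [
    {"timestamp": 0, "cpu": 1.0},
    {"timestamp": 60, "cpu": 3.0},
    {"timestamp": 3599, "cpu": 2.0},
    {"timestamp": 3600, "cpu": 5.0},
]
hourly = rollup(raw_events, 3600, "cpu")
first_hour_average = hourly[0]["sum"] / hourly[0]["count"]
```

Four raw events shrink to two hourly summaries here. Averages, minimums, and maximums per hour can still be answered, but a question such as "what was the CPU load at second 60" cannot, which is why rollups fit long-retention use cases like capacity planning.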
  28. How do I get started?

    Practical initial deployment and migration strategies
    • Instrument newer projects built on new frameworks and technologies
    • For legacy projects, start with unifying most important KPIs and events
    • During re-architecture efforts, consider consolidating datastores / tools