Slide 1

Slide 1 text

Logging, Metrics, and APM: The Holy Trinity of Operations
Thomas Watson, Mar 2019, @wa7son

Slide 2

Slide 2 text

Logs Metrics APM

Slide 3

Slide 3 text

Who am I?
• Thomas Watson
• Open Source developer at github.com/watson
• Principal Software Engineer at Elastic
• Node.js Core Member
• Tweets as @wa7son

Slide 4

Slide 4 text

Benefits of Logs + Metrics + APM in one stack

Slide 5

Slide 5 text

Unified Dashboards
Same UI for KPI summaries and root cause analysis

Slide 6

Slide 6 text

Unified Alerting
Trigger off any operational data to provide unified SLA monitoring

Slide 7

Slide 7 text

Unified Machine Learning
Correlate multiple data sources for more intelligent anomaly detection

Slide 8

Slide 8 text

Operational gains
Single technology for operational data saves on administrative costs

Slide 9

Slide 9 text

Elastic Stack for logs

Slide 10

Slide 10 text

Logs

64.242.88.10 - - [07/Mar/2017:16:10:02 -0800] "GET /mailman/listinfo/hsdivision HTTP/1.1" 200 6291
64.242.88.10 - - [07/Mar/2017:16:11:58 -0800] "POST /twiki/bin/view/TWiki/WikiSyntax HTTP/1.1" 404 7352
64.242.88.10 - - [07/Mar/2017:16:20:55 -0800] "GET /twiki/bin/view/Main/DCCAndPostFix HTTP/1.1" 200 5253

For each event, print out what happened.
Logs are chronological records of events
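As a rough illustration of what "automated parsing" of such access-log lines involves, here is a minimal sketch in Python. The regular expression and the field names (`client`, `method`, `path`, `status`, `size`) are assumptions made for this example; they are not the parsing rules shipped with any Elastic module.

```python
import re
from datetime import datetime

# Assumed pattern for the access-log layout shown on the slide.
LOG_PATTERN = re.compile(
    r'(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_log_line(line):
    """Turn one access-log line into a structured event, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    event = match.groupdict()
    event["time"] = datetime.strptime(event["time"], "%d/%b/%Y:%H:%M:%S %z")
    event["status"] = int(event["status"])
    event["size"] = None if event["size"] == "-" else int(event["size"])
    return event

line = ('64.242.88.10 - - [07/Mar/2017:16:11:58 -0800] '
        '"POST /twiki/bin/view/TWiki/WikiSyntax HTTP/1.1" 404 7352')
event = parse_log_line(line)
print(event["method"], event["path"], event["status"])
```

Once each line is a structured event, fields such as the status code can be filtered and aggregated instead of being searched as free text.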

Slide 11

Slide 11 text

Making logging more turnkey with ‘modules’
• Turnkey experience for specific data types
• Data to dashboard in just one step
• Automated parsing and enrichment
• Default dashboards, alerts, ML jobs

Slide 12

Slide 12 text

Logging modules
WINLOGBEAT • FILEBEAT • AUDITBEAT
Infrastructure • Applications
System • Linux / macOS • Windows Events
Containers • Docker • Kubernetes
Databases • MySQL • PostgreSQL
Queues • Kafka • Redis
Web servers • Apache • Nginx
Audit data • Filesystem • System calls

Slide 13

Slide 13 text

Ad-hoc log search and visualization
Kibana Discover, Visualize, Dashboard

Slide 14

Slide 14 text

Elastic Stack for metrics

Slide 15

Slide 15 text

Metrics vs Logs

Logs:
64.242.88.10 - - [07/Mar/2017:16:10:02 -0800] "GET /mailman/listinfo/hsdivision HTTP/1.1" 200 6291
64.242.88.10 - - [07/Mar/2017:16:11:58 -0800] "POST /twiki/bin/view/TWiki/WikiSyntax HTTP/1.1" 404 7352
64.242.88.10 - - [07/Mar/2017:16:20:55 -0800] "GET /twiki/bin/view/Main/DCCAndPostFix HTTP/1.1" 200 5253
For each event, print out what happened.
Logs are chronological records of events

Metrics:
07/Mar/2017 16:10:00 all 2.58 0.00 0.70 1.12 0.05 95.55 server1 containerX regionA
07/Mar/2017 16:20:00 all 2.56 0.00 0.69 1.05 0.04 95.66 server2 containerY regionB
07/Mar/2017 16:30:00 all 2.64 0.00 0.65 1.15 0.05 95.50 server2 containerZ regionC
Every x minutes, measure the CPU load and print it out, and annotate with metadata.
Metrics are periodic measurements of numeric KPIs
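To make the contrast with logs concrete, a metric sample can be read as a timestamp, a handful of numbers, and some metadata. The sketch below parses one of the CPU lines above. The column names (`user`, `nice`, `system`, `iowait`, `steal`, `idle`) are an assumption based on the usual `sar`-style layout; the slide itself does not label the columns.

```python
from datetime import datetime

# Assumed meaning of the six numeric columns (sar-style CPU percentages).
CPU_FIELDS = ["user", "nice", "system", "iowait", "steal", "idle"]

def parse_metric_line(line):
    """Split one CPU sample into timestamp, numeric values, and metadata."""
    parts = line.split()
    sample = {
        "timestamp": datetime.strptime(" ".join(parts[:2]), "%d/%b/%Y %H:%M:%S"),
        "cpu": parts[2],
    }
    # Six numeric measurements follow the CPU label.
    sample.update(zip(CPU_FIELDS, (float(v) for v in parts[3:9])))
    # The trailing columns are the metadata annotations.
    sample["host"], sample["container"], sample["region"] = parts[9:12]
    return sample

sample = parse_metric_line(
    "07/Mar/2017 16:10:00 all 2.58 0.00 0.70 1.12 0.05 95.55 server1 containerX regionA"
)
print(sample["host"], sample["idle"])
```

The numeric fields are what gets charted and aggregated; the metadata fields are the dimensions to slice by.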

Slide 16

Slide 16 text

Evolution of Elasticsearch into a Metrics Store

Slide 17

Slide 17 text

Elasticsearch beginnings
Primarily used for application search
2010: Search engine. The inverted index is the primary data structure, and is great for search
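For readers unfamiliar with the term, an inverted index maps each term to the documents that contain it, much like the index at the back of a book. The following is a toy sketch of the idea only; it is not how Lucene or Elasticsearch implement it, and the sample documents are made up.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercase term to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return the IDs of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

docs = {1: "quick brown fox", 2: "lazy brown dog", 3: "quick red fox"}
index = build_inverted_index(docs)
print(search(index, "quick fox"))
```

Looking up a term is a single dictionary access, which is why this structure suits search well but is less natural for summing or averaging a numeric field across many documents.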

Slide 18

Slide 18 text

Source: Computer Graphics - Principles and Practice

Slide 19

Slide 19 text

Source: Computer Graphics - Principles and Practice

Slide 20

Slide 20 text


Slide 21

Slide 21 text

Elasticsearch beginnings
Primarily used for application search
2010: Search engine. The inverted index is the primary data structure, and is great for search

Slide 22

Slide 22 text

Elasticsearch evolves to support analytics
Columnar Store, Built on Lucene "doc values"
https://www.elastic.co/blog/elasticsearch-as-a-column-store
2010: Search engine. Inverted index primary data structure, and is great for search
2012: Columnar storage. Structured data storage, resulting in compact storage and faster analytics
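The benefit of columnar storage for analytics can be sketched with plain Python lists: instead of keeping each document as a row, every field is stored as one array indexed by document ID. This is a conceptual illustration only, not the on-disk doc values format, and the field names are made up.

```python
def to_columns(rows):
    """Pivot row-oriented documents into one list per field, indexed by doc ID."""
    columns = {}
    for doc_id, row in enumerate(rows):
        for field, value in row.items():
            # Missing values stay None so every column has one slot per document.
            columns.setdefault(field, [None] * len(rows))[doc_id] = value
    return columns

rows = [
    {"host": "server1", "cpu_idle": 95.55},
    {"host": "server2", "cpu_idle": 95.66},
    {"host": "server2", "cpu_idle": 95.50},
]
columns = to_columns(rows)

# An average now touches only the one column it needs.
values = [v for v in columns["cpu_idle"] if v is not None]
average_idle = sum(values) / len(values)
print(round(average_idle, 2))
```

Because values of the same type sit next to each other, a column is also easier to compress than a mix of fields stored row by row.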

Slide 23

Slide 23 text

Aggregation Framework
Out-of-this-world aggregations
https://www.elastic.co/blog/out-of-this-world-aggregations
2010: Search engine. Inverted index primary data structure, and is great for search
2012: Columnar storage. Structured data storage, resulting in compact storage and faster analytics
2014: Aggregation Framework. Analytics features to slice and dice data along various dimensions
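"Slice and dice along various dimensions" boils down to bucketing documents by a field and computing a metric per bucket. The sketch below does this in plain Python and, for comparison, shows a request body that expresses the same idea with a `terms` bucket and an `avg` sub-aggregation. The field names `host` and `cpu_idle` and the aggregation names are assumptions for this example.

```python
from collections import defaultdict

samples = [
    {"host": "server1", "region": "regionA", "cpu_idle": 95.55},
    {"host": "server2", "region": "regionB", "cpu_idle": 95.66},
    {"host": "server2", "region": "regionC", "cpu_idle": 95.50},
]

def avg_by(samples, dimension, metric):
    """Bucket samples by one dimension and average a metric in each bucket."""
    buckets = defaultdict(list)
    for sample in samples:
        buckets[sample[dimension]].append(sample[metric])
    return {key: sum(values) / len(values) for key, values in buckets.items()}

print(avg_by(samples, "host", "cpu_idle"))

# The same question phrased as an Elasticsearch search request body
# (hypothetical field and aggregation names).
request_body = {
    "size": 0,
    "aggs": {
        "by_host": {
            "terms": {"field": "host"},
            "aggs": {"avg_idle": {"avg": {"field": "cpu_idle"}}},
        }
    },
}
```

Swapping `"host"` for `"region"` changes the dimension without touching the data, which is the point of the framework.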

Slide 24

Slide 24 text

Elasticsearch storage efficiencies
BKD Trees & Sparse Fields (1-Dimension, 2-Dimensions, Sparse Data)
https://www.elastic.co/blog/searching-numb3rs-in-5.0
2010: Search engine. Inverted index primary data structure, and is great for search
2012: Columnar storage. Structured data storage, resulting in compact storage and faster analytics
2014: Aggregation Framework. Analytics features to slice and dice data along various dimensions
2016: BKD trees and sparse fields. Data structures optimized for numbers. Faster analytics, lower storage footprint

Slide 25

Slide 25 text

Rollup support for long-term retention
Added in Elasticsearch 6.3
https://www.elastic.co/blog/data-rollups-in-elasticsearch-you-know-for-saving-space
2010: Search engine. Inverted index primary data structure, and is great for search
2012: Columnar storage. Structured data storage, resulting in compact storage and faster analytics
2014: Aggregation Framework. Analytics features to slice and dice data along various dimensions
2016: BKD trees and sparse fields. Data structures optimized for numbers. Faster analytics, lower storage footprint
2018: Rollups. Roll up or aggregate older data into bigger time buckets and save on disk space
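The rollup idea can be sketched in a few lines: many fine-grained samples are replaced by one summary per larger time bucket. This is a conceptual illustration, not the Elasticsearch rollup API; the sample values and the choice of summary statistics (count, sum, min, max) are assumptions.

```python
from datetime import datetime, timezone

def rollup(samples, bucket_seconds):
    """Summarise (timestamp, value) samples into fixed-size time buckets."""
    buckets = {}
    for ts, value in samples:
        epoch = int(ts.timestamp())
        start = epoch - epoch % bucket_seconds  # align to the bucket boundary
        bucket = buckets.setdefault(
            start, {"count": 0, "sum": 0.0, "min": value, "max": value}
        )
        bucket["count"] += 1
        bucket["sum"] += value
        bucket["min"] = min(bucket["min"], value)
        bucket["max"] = max(bucket["max"], value)
    return {
        datetime.fromtimestamp(start, tz=timezone.utc): summary
        for start, summary in buckets.items()
    }

utc = timezone.utc
samples = [
    (datetime(2017, 3, 7, 16, 10, tzinfo=utc), 2.58),
    (datetime(2017, 3, 7, 16, 20, tzinfo=utc), 2.56),
    (datetime(2017, 3, 7, 16, 30, tzinfo=utc), 2.64),
    (datetime(2017, 3, 7, 17, 5, tzinfo=utc), 3.00),
]
hourly = rollup(samples, 3600)
for start, summary in sorted(hourly.items()):
    print(start.isoformat(), summary)
```

Keeping count and sum rather than a precomputed average means hourly buckets can later be merged into daily ones without losing accuracy.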

Slide 26

Slide 26 text

Elastic Stack as a Metrics Solution

Slide 27

Slide 27 text

Metrics modules: Infrastructure
PACKETBEAT • METRICBEAT • HEARTBEAT
System • Linux • macOS • Windows • Perfmon
Cloud • AWS • GCP • Azure • DigitalOcean • Alibaba
Containers • Docker • Kubernetes
Virtualization • vSphere
Network • Netflow • Packets • TLS Envelope
Storage • Ceph

Slide 28

Slide 28 text

Metrics modules: Applications
PACKETBEAT • METRICBEAT • HEARTBEAT
Datastores • MySQL • PostgreSQL • MongoDB • Couchbase • Aerospike • Graphite
Web servers • Apache • Nginx
Queues • Kafka • Redis • RabbitMQ
Caches • Memcached
Uptime • Heartbeat
Custom apps • JMX/Jolokia • PHP-FPM • Golang
Other • HAProxy • Zookeeper

Slide 29

Slide 29 text

Heartbeat: Uptime Monitoring

Slide 30

Slide 30 text

Heartbeat: Uptime Monitoring

Slide 31

Slide 31 text

Functionbeat: Serverless data shipper
CloudWatch • CloudWatch Logs

Slide 32

Slide 32 text

Functionbeat: Serverless data shipper

Slide 33

Slide 33 text

Elastic Common Schema
Correlation between logs, metrics, and APM
Benefits
• Correlate data from different sources
• Ability to re-use analysis content
• Ability to re-use Elastic-provided content
Status
• v1.0.0 published: github.com/elastic/ecs
• Integrating into Elastic products in progress
• Community feedback welcome!
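A small sketch of why a common schema helps correlation: when a log event and a metric event use the same field name for the host, grouping them together is a one-liner. The field names below follow ECS conventions such as `host.name` and `http.response.status_code`, while the events themselves, the `server1` host assignment, and the `system.cpu.idle.pct` metric field are illustrative assumptions.

```python
from collections import defaultdict

log_event = {
    "@timestamp": "2017-03-07T16:11:58-08:00",
    "host.name": "server1",  # hypothetical: the slide's log lines do not name a host
    "http.request.method": "POST",
    "url.path": "/twiki/bin/view/TWiki/WikiSyntax",
    "http.response.status_code": 404,
}
metric_event = {
    "@timestamp": "2017-03-07T16:10:00-08:00",
    "host.name": "server1",
    "system.cpu.idle.pct": 0.9555,  # illustrative metric field
}

def correlate(events, field="host.name"):
    """Group events from any source by a shared field."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event.get(field)].append(event)
    return dict(grouped)

by_host = correlate([log_event, metric_event])
print(len(by_host["server1"]))
```

Without the shared name, the same join would need a per-source mapping (for example `hostname` in one source and `server` in another) before any dashboard could combine them.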

Slide 34

Slide 34 text

Visualizing time series data
Time Series Visual Builder

Slide 35

Slide 35 text

Visualizing time series data
Annotations

Slide 36

Slide 36 text

Elastic Stack for APM

Slide 37

Slide 37 text

What is APM?
Example
08:32:10 Request "/api/checkout"
08:32:11 Response "/api/checkout 500 ERROR"

Slide 38

Slide 38 text

What is APM?
Example
08:32:10 Request "/api/products/top"
08:32:17 Response "/api/products/top 200 OK"
7 seconds - zZzzZZz
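The two examples above (an error response and a slow response) are the kind of thing an APM agent records automatically. The sketch below imitates that with a hand-written decorator that captures duration and outcome per call. It is an illustration of the concept only, not the Elastic APM agent API; `traced`, `recorded`, and the handler functions are hypothetical names.

```python
import time
from functools import wraps

recorded = []  # stand-in for data an agent would send to a server

def traced(name):
    """Record duration and outcome of each call to the wrapped function."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            outcome = "error"
            try:
                result = func(*args, **kwargs)
                outcome = "ok"
                return result
            finally:
                recorded.append({
                    "name": name,
                    "duration_ms": (time.perf_counter() - start) * 1000,
                    "outcome": outcome,
                })
        return wrapper
    return decorator

@traced("GET /api/products/top")
def top_products():
    time.sleep(0.01)  # simulate slow work
    return ["a", "b"]

@traced("POST /api/checkout")
def checkout():
    raise RuntimeError("payment backend unavailable")

top_products()
try:
    checkout()
except RuntimeError:
    pass

for entry in recorded:
    print(entry["name"], entry["outcome"], round(entry["duration_ms"], 1))
```

A real agent hooks into the web framework and libraries so that no decorators are needed, but the recorded shape (name, duration, outcome) is the same idea.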

Slide 39

Slide 39 text

How does APM work?
• Agents: Browser Agent (×3), Web server Agent (×3)
• Data processor: apm-server
• Data storage: elasticsearch
• UI: kibana

Slide 40

Slide 40 text

Elastic APM
APM adds end-user experience and application-level monitoring to the stack
• Focuses on search experience on top of APM data
• ‘Just another index’ in Elastic Stack
Language support
● Python
● Node.js
● Ruby
● RUM
● Java
● Go
● .NET (in dev)

Slide 41

Slide 41 text

APM is another index in Elasticsearch
Need another visualization? Build a dashboard, no need to wait for your vendor

Slide 42

Slide 42 text

Distributed Tracing
Single transaction
HTTP request → Transaction (Span, Span, Span) → Response

Slide 43

Slide 43 text

Distributed Tracing
Distributed tracing example
Trace A
• Transaction 1: Span, Span, Span
• Transaction 2: Span
• Transaction 3: Span, Span
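The trace, transaction, and span hierarchy in this example can be written down as a small data model. This is a sketch of the concepts only, not the document format Elastic APM stores; the class names, service names, and durations are invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Span:
    name: str
    duration_ms: float

@dataclass
class Transaction:
    name: str
    trace_id: str  # shared by every transaction in the same trace
    spans: List[Span] = field(default_factory=list)

def group_by_trace(transactions: List[Transaction]) -> Dict[str, List[Transaction]]:
    """Reassemble traces from transactions reported by separate services."""
    traces = defaultdict(list)
    for transaction in transactions:
        traces[transaction.trace_id].append(transaction)
    return dict(traces)

# Trace A from the slide: three transactions with 3, 1, and 2 spans.
transactions = [
    Transaction("frontend GET /checkout", "trace-a",
                [Span("render", 12.0), Span("call cart", 40.0), Span("call payment", 95.0)]),
    Transaction("cart GET /items", "trace-a", [Span("SELECT items", 30.0)]),
    Transaction("payment POST /charge", "trace-a",
                [Span("validate", 5.0), Span("call bank", 80.0)]),
]
traces = group_by_trace(transactions)
span_count = sum(len(t.spans) for t in traces["trace-a"])
print(len(traces["trace-a"]), span_count)
```

Each service only knows about its own transaction; the shared trace ID is what lets the pieces be stitched back together.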

Slide 44

Slide 44 text

Distributed Tracing
Trace and map across multiple services
• See the end-to-end view and navigate to individual transactions
• Based on the notion of an end-to-end Trace ID across services
• Investigating compatibility with OpenTracing API and aligning with W3C trace context spec
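To show how an end-to-end Trace ID can travel between services, here is a minimal sketch that builds and forwards a header in the W3C Trace Context `traceparent` layout (`version-traceid-parentid-flags`). It illustrates the header format only and is not taken from any Elastic agent; the function names are hypothetical.

```python
import re
import secrets

# traceparent layout: 2 hex version, 32 hex trace ID, 16 hex parent ID, 2 hex flags.
TRACEPARENT = re.compile(
    r"^(?P<version>[0-9a-f]{2})-(?P<trace_id>[0-9a-f]{32})"
    r"-(?P<parent_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})$"
)

def new_traceparent():
    """Start a new trace: fresh trace ID, fresh span ID, sampled flag set."""
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-01"

def child_traceparent(incoming):
    """Keep the caller's trace ID but issue a new span ID for the outgoing call."""
    match = TRACEPARENT.match(incoming)
    if match is None:
        return new_traceparent()  # unusable header: start a new trace
    return f"00-{match['trace_id']}-{secrets.token_hex(8)}-{match['flags']}"

incoming = new_traceparent()
outgoing = child_traceparent(incoming)
print(incoming)
print(outgoing)
```

The trace ID stays constant across every hop while the span ID changes, which is what allows transactions from different services to be joined into one trace.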

Slide 45

Slide 45 text

DEMO

Slide 46

Slide 46 text

What now? Try it yourself!

Slide 47

Slide 47 text

What now? Try it yourself!

Slide 48

Slide 48 text

Questions?
You can always send me a tweet at @wa7son