Get real-time insights from your application with
Packetbeat & Elasticsearch
Slide 2
www.elastic.co
In today’s program
• Lots of JSON objects
• A cow, an elk, a violin,
a guitar and a set of
drums
• Anomaly detection
via moving averages
Slide 3
I am
Tudor Golubenco
twitter.com/tudor_g
github.com/tsg
Slide 4
I work for
Slide 5
the company behind
Elasticsearch
Logstash
Kibana
Slide 6
also known as the ELK stack
Photo credit: https://www.flickr.com/photos/lsmith2010/8215026548
Slide 7
Open source culture
Image credit: https://www.flickr.com/photos/tappnel/5798812875
• We live in GitHub
• We talk Pull Requests
• Conferences
• Community
• Blog posts
Slide 8
Started an open source project
Packetbeat
Slide 9
We wanted to make monitoring and troubleshooting tools for
complex applications and infrastructures
Slide 10
Idea
look at the communication between
services
Slide 11
Capture network packets
• Visibility into the infrastructure by passively listening to
network packets
• It doesn’t add latency
• It cannot break your application
Image credit: https://www.flickr.com/photos/bigdrumthump/3223280727
Slide 12
Packet capturing
1. Using port mirroring
2. As an “agent”
Slide 13
Sniffing from a technical PoV
• libpcap (used by tcpdump) supports all Unix-like systems
• WinPcap supports Windows
• For Go, gopacket provides bindings and more
• High-speed API for packet capturing on Linux: AF_PACKET
Image credit: https://www.flickr.com/photos/57881779@N04/7930362242/
Slide 14
Decoding
Slide 15
Matching requests and responses
• Pipelining complicates
matching the requests with
the responses.
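To illustrate why pipelining matters, here is a minimal sketch of request-response matching on a single connection. It assumes HTTP/1.1-style pipelining, where responses come back in the same order as the requests, so a FIFO queue is enough. The class name, the field names and the millisecond timestamps are hypothetical; this is not Packetbeat's actual implementation.

```python
from collections import deque


class ConnectionMatcher:
    """Pairs responses with requests on one connection, in FIFO order."""

    def __init__(self):
        # Requests that were seen but not yet answered.
        self.pending = deque()

    def on_request(self, ts_ms, method, path):
        self.pending.append({"ts_ms": ts_ms, "method": method, "path": path})

    def on_response(self, ts_ms, code):
        if not self.pending:
            # Response without a captured request (e.g. capture started late).
            return None
        req = self.pending.popleft()
        return {
            "method": req["method"],
            "path": req["path"],
            "code": code,
            "responsetime": ts_ms - req["ts_ms"],
        }


matcher = ConnectionMatcher()
matcher.on_request(0, "GET", "/a")
matcher.on_request(2, "GET", "/b")  # pipelined: sent before /a is answered
first = matcher.on_response(10, 200)
second = matcher.on_response(15, 404)
```

Without the queue, the second request would overwrite the first and the response to `/a` would be attributed to `/b`. Protocols that allow out-of-order responses need an explicit request identifier instead.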
Slide 16
Create a JSON object for each request-response pair
HTTP transaction
GET method
Response code
Response time
Slide 17
SQL example
SQL method
Query
Error message
Bandwidth
information
Slide 18
DNS example
Query type
Domain name
Response time
Slide 19
Packetbeat: Overview
Slide 20
There’s more to apps than packets
Packetbeat
Listens to the “beat” of the network packets.
Topbeat
Listens to the “beat” of the operating system metrics.
Filebeat
Listens to the “beat” of logs.
Metricsbeat
Listens to the internal “beat” of systems via APIs.
Image credits:
https://www.flickr.com/photos/7147684@N03/921738874/
https://www.flickr.com/photos/bigdrumthump/3223280727
https://www.flickr.com/photos/jadeashleyphotography/6584949945/
https://www.flickr.com/photos/mitosettembremusica/2839965900/
Slide 21
Topbeat
• Like the Unix top
command but
sending the data
periodically to
Elasticsearch
• Also works on
Windows
Slide 22
Topbeat system-wide and per-process stats
CPU “steal” time
Total / used / free
memory
CPU stats
Per process stats
CPU time
consumed
Process pid, name,
parent pid, etc.
Memory used
Slide 23
Topbeat output objects
File system stats
Mount point
Device name
Total, used, free
disk space
Slide 24
Filebeat
• A “Beat” based on the Logstash-Forwarder
source code
• Do one thing well:
• Send log files to Logstash & Elasticsearch
• Light on consumed resources
• Easy to deploy on multiple platforms
Slide 25
Filebeat JSON output
The log message
The timestamp
The log level
Slide 26
Beats have libbeat in common
• Go library
• Provides common things for all
Beats:
• logging, service handling,
configuration file handling,
CLI flags
• Outputs and filters
Dev guide for creating a new Beat: https://www.elastic.co/guide/en/beats/libbeat/current/index.html
Slide 27
Deployment: directly to ES
• Option 1: Insert
directly into
Elasticsearch via
the bulk API
• Security can be
provided via
Shield and HTTPS
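The bulk API takes a newline-delimited body: one action line followed by one document line per event, with a trailing newline at the end. The sketch below only builds that body and sends nothing. The index name and the document fields are hypothetical, and Elasticsearch versions that use mapping types also expect a `_type`, either in the action line or in the request URL.

```python
import json


def bulk_body(index, docs):
    """Build a newline-delimited bulk request body that indexes each doc."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                           # document line
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical transaction objects, for illustration only.
docs = [
    {"type": "http", "responsetime": 12},
    {"type": "mysql", "responsetime": 3},
]
body = bulk_body("packetbeat-example", docs)
```

Batching many events into one request is what keeps the per-event overhead low when inserting directly.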
Slide 28
Deployment: Send to Logstash
• Option 2: Insert via
Logstash
• Uses the Lumberjack
protocol which offers
security
• Gives the opportunity to enrich or
modify the data
Slide 29
Getting insights from the data
• Elasticsearch aggregations
• Split the data into
buckets
• Apply a function over the
data
• Freely combine them by
nesting
• Work with multiple shards
Image credit: https://www.flickr.com/photos/sheeprus/4551642374/
Slide 30
Date histogram
• Splits data into time buckets
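The bucketing idea can be sketched locally: each timestamp is rounded down to the start of its interval and the events per bucket are counted. This is a simplified model with fixed-width intervals, not the Elasticsearch implementation; the function name and the sample timestamps are made up for the example.

```python
from collections import Counter


def date_histogram(timestamps, interval_s):
    """Count events per fixed-width time bucket, keyed by bucket start."""
    counts = Counter(ts - ts % interval_s for ts in timestamps)
    return dict(sorted(counts.items()))


# Hypothetical event times, in seconds since the epoch.
events = [0, 15, 59, 60, 61, 185]
per_minute = date_histogram(events, 60)
```

In this sketch a bucket with no events (here the one starting at 120) simply does not appear in the result.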
Slide 31
Date histogram response
Slide 32
Percentiles aggregation
The 95th percentile is the value below which 95% of the
values fall
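As a concrete reading of that definition, here is an exact percentile computed with the nearest-rank method on a small list. This is only one of several common percentile definitions, and it is not what Elasticsearch runs, since the aggregation returns approximate values. The function name is hypothetical.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile: smallest value with at least p% of values <= it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times of 1..100 ms, one request each.
response_times = list(range(1, 101))
p95 = percentile(response_times, 95)
p50 = percentile([10, 20, 30, 40], 50)
```

A single very slow request barely moves the median but shows up in the high percentiles, which is why they are useful for latency monitoring.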
Slide 33
Percentile aggregation response
Slide 34
Percentile aggregation
• Approximate values
• t-digest algorithm by Ted Dunning
• Accurate for small sets of values
• More accurate for extreme percentiles
Slide 35
Date histogram nested with percentiles
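A request of this shape nests a percentiles aggregation inside each date histogram bucket. The sketch below only builds the JSON body. The field names `@timestamp` and `responsetime`, the aggregation names and the chosen percents are assumptions for illustration, and `interval` is the parameter name from the Elasticsearch versions of that era (newer versions use `fixed_interval` or `calendar_interval`).

```python
import json

# Assumed field names; adjust to the actual mapping of the indexed documents.
query = {
    "size": 0,  # only the aggregation results are needed, not the hits
    "aggs": {
        "over_time": {
            "date_histogram": {"field": "@timestamp", "interval": "1m"},
            "aggs": {
                # Computed separately inside every time bucket.
                "latency": {
                    "percentiles": {
                        "field": "responsetime",
                        "percents": [50, 95, 99],
                    }
                }
            },
        }
    },
}

body = json.dumps(query, indent=2)
```

The response then contains one entry per time bucket, each carrying its own set of percentile values.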
Slide 38
Histogram by response time
• Splits data into buckets by response time
• [0ms, 10ms), [10ms, 20ms), …
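The half-open buckets can be modelled by rounding each value down to a multiple of the interval. This is a local sketch of the bucketing rule rather than an Elasticsearch call; the function name and the sample values are made up.

```python
from collections import Counter


def histogram(values, interval):
    """Count values per bucket [key, key + interval), keyed by lower bound."""
    counts = Counter((v // interval) * interval for v in values)
    return dict(sorted(counts.items()))


# Hypothetical response times in milliseconds.
latencies = [3, 9, 10, 14, 25, 29]
buckets = histogram(latencies, 10)
```

Note that 10 lands in the bucket starting at 10, not the one starting at 0, because the upper bound of each bucket is exclusive.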
Slide 39
Latency histogram
Slide 40
Add a date histogram
Slide 41
Response time distribution
Slide 42
Kibana config
Slide 43
Slowest RPC methods
• Combines terms and percentiles aggregations
Slide 44
Terms aggregation
• Buckets are dynamically built: one per unique value
• By default: top 10 by document count
• Approximate because each shard can have a different
Slide 45
Order by 99th percentile
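What "slowest methods, ordered by 99th percentile" computes can be sketched locally: group the transactions by method (the terms part), compute a percentile per group (the percentiles part), then sort the groups by that value. This is an exact, single-node model with hypothetical names and data; Elasticsearch performs the same steps approximately and across shards.

```python
import math
from collections import defaultdict


def percentile(values, p):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def slowest_methods(transactions, p=99, size=10):
    """Return up to `size` (method, percentile) pairs, slowest first."""
    by_method = defaultdict(list)
    for method, responsetime in transactions:
        by_method[method].append(responsetime)
    ranked = sorted(
        ((m, percentile(v, p)) for m, v in by_method.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:size]


# Hypothetical (method, response time in ms) pairs.
transactions = [
    ("getUser", 12), ("getUser", 15),
    ("search", 40), ("search", 900),
    ("ping", 1), ("ping", 2),
]
top = slowest_methods(transactions, size=2)
```

Ordering by a high percentile rather than by document count surfaces methods that are rarely called but occasionally very slow.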
Slide 46
Pipeline aggregations
• New in Elasticsearch 2.0 (currently in beta)
• Work on the results of other aggregations
Slide 47
Derivative aggregation
• Metric constantly growing
• Take the first-order derivative to see the speed of growth
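For evenly spaced buckets, the first-order derivative is just the difference between each bucket value and the previous one. A minimal sketch, with a hypothetical function name and made-up counter values:

```python
def derivative(values):
    """Difference between consecutive bucket values; the first has none."""
    return [None] + [cur - prev for prev, cur in zip(values, values[1:])]


# Hypothetical ever-growing counter sampled once per bucket (e.g. bytes sent).
total_bytes = [100, 130, 190, 200]
growth = derivative(total_bytes)
```

The raw counter only ever goes up, while the derivative shows that growth peaked in the third bucket and then slowed down.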
Slide 49
Exponentially weighted moving average
• Older values become exponentially less important
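The usual recursive form of an exponentially weighted moving average is `avg = alpha * value + (1 - alpha) * previous_avg`, so a value that is k steps old contributes with weight `alpha * (1 - alpha)**k`. The sketch below is a textbook version with a hypothetical `alpha` parameter, seeded with the first value; it does not claim to match the defaults of the Elasticsearch moving average aggregation.

```python
def ewma(values, alpha):
    """Exponentially weighted moving average, seeded with the first value."""
    avg = None
    smoothed = []
    for value in values:
        if avg is None:
            avg = value
        else:
            avg = alpha * value + (1 - alpha) * avg
        smoothed.append(avg)
    return smoothed


smoothed = ewma([10, 20, 30], alpha=0.5)
```

A larger `alpha` follows recent values more closely, while a smaller one gives a smoother but slower-reacting curve.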
Slide 50
Moving average - dynamic thresholds
• yellow - measured values
• purple - moving average (ewma)
• green - threshold, mean + (3 * standard deviation)
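The threshold rule can be sketched as follows. Note the simplification: instead of an EWMA of the mean and standard deviation, this sketch uses a plain trailing window of the previous `window` values, which is easier to follow but not the setup described on the slide. The function name, parameters and data are hypothetical.

```python
import statistics


def anomalies(values, window, k=3.0):
    """Indexes whose value exceeds mean + k * stddev of the preceding window."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        threshold = statistics.mean(history) + k * statistics.pstdev(history)
        if values[i] > threshold:
            flagged.append(i)
    return flagged


# Hypothetical response times with one obvious spike at index 7.
measured = [10, 11, 9, 10, 12, 10, 11, 50, 10]
spikes = anomalies(measured, window=5)
```

Because the threshold is derived from recent data, it adapts to the normal level of each metric without anyone having to pick a fixed limit by hand.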
Slide 51
Request
Extended stats agg for mean and std deviation
Moving average aggs for mean and std deviation
Bucket script agg
Details: https://www.elastic.co/blog/staying-in-control-with-moving-averages-part-1
Slide 52
Cyclic trends - anomalies
• EWMA lags behind too much
• The values constantly hit the threshold
Slide 53
Cyclic trends - anomalies
• Holt-Winters (triple exponential) model works better for
seasonal data
• Requires two periods to bootstrap the algorithm
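To show why two periods are needed, here is a textbook additive Holt-Winters sketch: the first period initialises the level and the seasonal offsets, and the difference between the first and second period initialises the trend. The function name, the initialisation scheme and the parameters `alpha`, `beta` and `gamma` are assumptions for illustration; the Elasticsearch model may differ in its details.

```python
def holt_winters_additive(series, period, alpha, beta, gamma):
    """One-step-ahead forecasts using additive triple exponential smoothing."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full periods to bootstrap")
    first, second = series[:period], series[period:2 * period]
    level = sum(first) / period
    # Average per-step change between the first two periods.
    trend = sum(b - a for a, b in zip(first, second)) / period ** 2
    seasonal = [x - level for x in first]

    forecasts = []
    for i, x in enumerate(series):
        s = seasonal[i % period]
        # Forecast for this point, made before the actual value x is used.
        forecasts.append(level + trend + s)
        last_level = level
        level = alpha * (x - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - last_level) + (1 - beta) * trend
        seasonal[i % period] = gamma * (x - level) + (1 - gamma) * s
    return forecasts


# Hypothetical metric with a cycle of 4 buckets and no trend.
series = [10, 20, 30, 20] * 3
forecasts = holt_winters_additive(series, 4, alpha=0.5, beta=0.5, gamma=0.5)
```

On perfectly cyclic data the forecasts follow the cycle instead of lagging behind it, so a threshold built around them is not hit at every peak.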
Slide 54
Thanks
• Live demo: http://demo.elastic.co/packetbeat/
• Twitter: @tudor_g
• Come by the booth, we have stickers!