Maximizing Logstash Performance

Presentation at OpenWest 2017, accompanied by live demonstrations at the command-line and in Kibana.

Aaron Mildenstein

July 14, 2017

Transcript

  1. Define the problem
     • Logstash is perceived as slow
     • No insight into performance bottlenecks
     • No idea how to gain insight
  2. pv
     -r, --rate    show data transfer rate counter
     -W, --wait    display nothing until first byte transferred

     $ bin/logstash -f mytest.conf | pv -Wr > /dev/null
     [42.0KiB/s]
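The mytest.conf fed to pv above is not shown in the deck; a minimal sketch of a config you could benchmark this way (the generator count and message are illustrative assumptions, not from the talk) might be:

```
# mytest.conf — hypothetical throughput-test pipeline
input {
  generator {
    count   => 1000000                 # assumed event count
    message => "sample benchmark line" # assumed payload
  }
}
output {
  stdout { codec => line }             # pv measures this stream
}
```

`pv -Wr` reports the byte rate of the stream; `pv -l` would report lines (roughly, events) per second instead.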
  3. Toolset 2
     • Elasticsearch, Logstash, & Kibana (+ X-Pack)
     • Configure Logstash to send monitoring data
     • View in Kibana
  4. Configure Kibana

     vi config/kibana.yml

     # If your Elasticsearch is protected with basic authentication,
     # these settings provide the username and password that the
     # Kibana server uses to perform maintenance on the Kibana
     # index at startup. Your Kibana users still need to
     # authenticate with Elasticsearch, which is proxied through
     # the Kibana server.
     elasticsearch.username: "elastic"
     elasticsearch.password: "changeme"
  5. Configure Logstash

     vi config/logstash.yml

     # Periodically check if the configuration has changed and
     # reload the pipeline.
     # This can also be triggered manually through the SIGHUP signal
     # config.reload.automatic: true
     xpack.monitoring.elasticsearch.url: "http://localhost:9200"
     xpack.monitoring.elasticsearch.username: elastic
     xpack.monitoring.elasticsearch.password: changeme
  6. Pipeline Truths
     • At most, Logstash can only move data as fast as it comes in
     • Unless dropped or eliminated by a conditional, every event exits through every output
     • If a filter or output plugin is slow or blocked, the entire pipeline backs up
     • Filters will slow the pipeline: some a little, some a lot
     • Logstash can only ship data as fast as its slowest output
     • No, really. Not kidding.
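The "dropped or eliminated by a conditional" case above looks like this in practice; a minimal sketch (the field name and value are invented for illustration):

```
filter {
  # hypothetical field and value; matching events are discarded here
  if [loglevel] == "DEBUG" {
    drop { }
  }
}
```

Dropped events never reach any output, so they cost only the filter time spent up to the drop.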
  7. Parallel pipeline example

     # Shipper instance: no filtering, just forward to the broker
     input { ... }
     filter { # NONE }
     output { redis { ... } }

     # redis acts as the broker between the tiers

     # Three identical consumer instances pull from redis in parallel:
     input { redis { ... } }
     filter { # all }
     output { plugin1 { ... } }
  8. Staged pipeline example

     # Stage 1: ship to Elasticsearch (ES) and to the redis broker
     output {
       elasticsearch { ... }
       redis { ... }
     }

     # Stage 2: consume from redis and feed the slow output
     input { redis { ... } }
     output { slow_output { ... } }
  9. Future methods
     • Multiple pipelines from 1 JVM
     • Definable in logstash.yml
     • Each with auto-reload
     • Pipeline viewer (may be only in X-Pack at release)
     • See throughput not just as a sum of input/output, but at each plugin and conditional
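For reference, when multiple pipelines per JVM shipped in Logstash 6.0, they ended up being defined in a separate pipelines.yml rather than in logstash.yml as the slide anticipated; a sketch (the ids and paths are illustrative):

```
# pipelines.yml — one entry per pipeline, all running in a single JVM
- pipeline.id: ingest                       # illustrative id
  path.config: "/etc/logstash/ingest.conf"  # illustrative path
- pipeline.id: slow-output
  path.config: "/etc/logstash/slow.conf"
```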
  10. Conclusion
      • If you can't measure it, you can't improve it, so monitor it. A lot.
      • Use grok and regular expressions...
      • ...as sparingly as possible
      • Don't put all of your pipeline eggs in one basket...
      • ...unless you've measured it and it meets your expectations
      • Parallelize and stage your pipeline with brokers FTW