Maximizing Logstash Performance

Presentation at OpenWest 2017, accompanied by live demonstrations at the command-line and in Kibana.

Aaron Mildenstein

July 14, 2017

Transcript

  1. Define the problem
     • Logstash is perceived as slow
     • No insight into performance bottlenecks
     • No idea how to gain insight
  2. pv
     -r, --rate    show data transfer rate counter
     -W, --wait    display nothing until first byte transferred

     $ bin/logstash -f mytest.conf | pv -Wr > /dev/null
     [42.0KiB/s]
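The mytest.conf fed to pv above is not shown in the deck; a minimal sketch of a config you could benchmark this way (the generator count and message are illustrative assumptions, not from the talk) might be:

```
# mytest.conf — hypothetical throughput-test pipeline
input {
  generator {
    count   => 1000000                 # assumed event count
    message => "sample benchmark line" # assumed payload
  }
}
output {
  stdout { codec => line }             # pv measures this stream
}
```

`pv -Wr` reports the byte rate of the stream; `pv -l` would report lines (roughly, events) per second instead.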
  3. Toolset 2
     • Elasticsearch, Logstash, & Kibana (+ X-Pack)
     • Configure Logstash to send monitoring data
     • View in Kibana
  4. Configure Kibana

     vi config/kibana.yml

     # If your Elasticsearch is protected with basic authentication,
     # these settings provide the username and password that the
     # Kibana server uses to perform maintenance on the Kibana
     # index at startup. Your Kibana users still need to
     # authenticate with Elasticsearch, which is proxied through
     # the Kibana server.
     elasticsearch.username: "elastic"
     elasticsearch.password: "changeme"
  5. Configure Logstash

     vi config/logstash.yml

     # Periodically check if the configuration has changed and
     # reload the pipeline.
     # This can also be triggered manually through the SIGHUP signal
     # config.reload.automatic: true
     xpack.monitoring.elasticsearch.url: "http://localhost:9200"
     xpack.monitoring.elasticsearch.username: elastic
     xpack.monitoring.elasticsearch.password: changeme
  6. Pipeline Truths
     • At most, Logstash can only move data as fast as it comes in
     • Unless dropped or eliminated by a conditional, every event exits through every output
     • If a filter or output plugin is slow or blocked, the entire pipeline backs up
     • Filters will slow the pipeline: some a little, some a lot
     • Logstash can only ship data as fast as its slowest output
     • No, really. Not kidding.
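The "dropped or eliminated by a conditional" case above looks like this in practice; a minimal sketch (the field name and value are invented for illustration):

```
filter {
  # hypothetical field and value; matching events are discarded here
  if [loglevel] == "DEBUG" {
    drop { }
  }
}
```

Dropped events never reach any output, so they cost only the filter time spent up to the drop.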
  7. Parallel pipeline example

     # Shipper instance: no filtering, just forward to the broker
     input { ... }
     filter { # NONE }
     output { redis { ... } }

     # redis acts as the broker between the tiers

     # Three identical consumer instances pull from redis in parallel:
     input { redis { ... } }
     filter { # all }
     output { plugin1 { ... } }
  8. Staged pipeline example

     # Stage 1: ship to Elasticsearch (ES) and to the redis broker
     output {
       elasticsearch { ... }
       redis { ... }
     }

     # Stage 2: consume from redis and feed the slow output
     input { redis { ... } }
     output { slow_output { ... } }
  9. Future methods
     • Multiple pipelines from 1 JVM
     • Definable in logstash.yml
     • Each with auto-reload
     • Pipeline viewer (may be only in X-Pack at release)
     • See throughput not just as a sum of input/output, but at each plugin and conditional
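For reference, when multiple pipelines per JVM shipped in Logstash 6.0, they ended up being defined in a separate pipelines.yml rather than in logstash.yml as the slide anticipated; a sketch (the ids and paths are illustrative):

```
# pipelines.yml — one entry per pipeline, all running in a single JVM
- pipeline.id: ingest                       # illustrative id
  path.config: "/etc/logstash/ingest.conf"  # illustrative path
- pipeline.id: slow-output
  path.config: "/etc/logstash/slow.conf"
```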
  10. Conclusion
      • If you can't measure it, you can't improve it, so monitor it. A lot.
      • Use grok and regular expressions...
      • ...as sparingly as possible
      • Don't put all of your pipeline eggs in one basket...
      • ...unless you've measured it and it meets your expectations
      • Parallelize and stage your pipeline with brokers FTW