The ELK Stack in a DevOps Environment

This deck was presented by Kurt Hurtado, Elasticsearch/Logstash software engineer, at the Cloud Mafia / SFO DevOps meetup at New Relic on November 7, 2014.

It contains hard-won strategies gained over years of organizational experience managing large-scale Elasticsearch, Logstash and Kibana installations.

DevOps ELK users should find some valuable information on installing, maintaining, configuring and scaling their existing infrastructures, as well as suggestions for building out clusters from scratch.

Elasticsearch Inc

November 05, 2014

Transcript

  1. Many Options to Choose From
     • Old-school: grep! perl! sed! awk!
     • Image libraries: GDImage, ImageMagick, CxImage
     • Tools: MRTG / Cricket / RRD
     • Graylog
     • Graphite
     • JavaScript: HighCharts, flot, jQuery

     Copyright Elasticsearch 2014. Copying, publishing and/or distributing without written permission is strictly prohibited.
  2. ELK can help!
     • Visualize your Jenkins build data
     • Visualize Puppet runs
     • Monitor the Heartbleed bug in real time </3
     • Use the pagerduty output / email devs on app ERRORs
  3. History of the ELK Stack
     • Elasticsearch
       - First released in 2010 by Shay Banon, as a distributed search engine
       - Built on top of Lucene. JSON-based. Rich APIs
     • Logstash
       - Started in 2009 by Jordan Sissel, as a method to stash logs
     • Kibana
       - Project begun in 2011 by Rashid Khan, to visualize event data
     • Conceived as separate projects
  4. Synergy!
     • Elasticsearch (the company) was founded in 2012
     • Rashid joined Elasticsearch in January 2013
     • Jordan joined Elasticsearch in August 2013
     • Much of the development on all three projects is now done in-house, in addition to open source contributions
     • Cross-team projects and cooperation
     • Federated QA effort helps all products, separately and as a complete stack
  5. Getting started is easy!
     • The basic ELK stack is almost trivial to install: download, untar and run!
     • A POC can be created in an hour or two
       - Multiple skill sets help in building the stack
     • Bigger clusters and higher ingestion rates require beefier architectures and better configurations
       - Multiple tiers, hardware upgrades, software tuning
     • Non-ELK software adds value to a robust system
       - nginx, Redis, RabbitMQ, Kafka, etc.
  6. Manual Installation Methods
     • Download tarballs or zip files
       - http://www.elasticsearch.org/overview/elkdownloads/
     • Install Linux packages
       - deb
       - rpm
     • git clone, fork, etc.
       - Contains most recent updates
       - Contribute back to the project!
       - http://github.com/elasticsearch
     • http://github.com/kurtado/quick-elk
  7. Configuration Management
     • Official Puppet modules (Puppet Approved!)
       - https://forge.puppetlabs.com/elasticsearch
       - Active development, in-house and on GitHub
       - Many configuration options:
         ES: node level, e.g. cluster.name, discovery.zen (anything with full dot notation); also plugins, templates, client bindings, Java installation, etc.
         LS: version, upgrade status, service options, configuration files or configuration file snippets, plugins, patterns, etc.
     • Chef cookbook
       - https://github.com/elasticsearch/cookbook-elasticsearch
     • Docker, SaltStack, Ansible, etc!
  8. Elasticsearch Configurations
     • Set the $ES_HEAP_SIZE env var to 1/2 of RAM (but < 32GB)
     • Make sure the process does not swap (bootstrap.mlockall)
     • Set the user's file ulimit to unlimited (tricky: reboot to check!)
       - Check with an API call to '/_nodes/process'
     • Don't overconfigure your cluster settings!!!
       - Elasticsearch default settings work for "most" clusters
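
The heap rule and the '/_nodes/process' check on the slide above can be sketched in a few lines of Python. This is only an illustration, not part of the deck: the function names, the 31 GB ceiling and the 64000 descriptor threshold are assumptions, and the response shape (a "nodes" map whose entries carry a "process" object with "mlockall" and "max_file_descriptors") is assumed to match what a 1.x cluster returns.

```python
def recommended_heap_gb(total_ram_gb, ceiling_gb=31):
    """Half of RAM, kept below 32 GB (ceiling_gb is an assumed safe value)."""
    return max(1, min(total_ram_gb // 2, ceiling_gb))


def check_process_info(nodes_process, min_fds=64000):
    """Return warnings for an already-parsed '/_nodes/process' response.

    min_fds is a hypothetical threshold; pick one that suits your cluster.
    """
    problems = []
    for node_id, node in nodes_process.get("nodes", {}).items():
        name = node.get("name", node_id)
        proc = node.get("process", {})
        if not proc.get("mlockall", False):
            problems.append("%s: mlockall is not enabled" % name)
        if proc.get("max_file_descriptors", 0) < min_fds:
            problems.append("%s: max_file_descriptors below %d" % (name, min_fds))
    return problems


if __name__ == "__main__":
    # Hand-written sample data, not output from a real cluster.
    sample = {
        "nodes": {
            "a1": {"name": "es-data-1",
                   "process": {"mlockall": True, "max_file_descriptors": 65535}},
            "b2": {"name": "es-data-2",
                   "process": {"mlockall": False, "max_file_descriptors": 4096}},
        }
    }
    print(recommended_heap_gb(64))
    for line in check_process_info(sample):
        print(line)
```

In practice the dictionary would come from an HTTP GET against a node's '/_nodes/process' endpoint, run after a reboot so the ulimit that actually applies to the service is the one reported.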
  9. Elasticsearch Configurations
     • Use unicast discovery mode
     • Run 3 lower-resource, master-eligible nodes in large clusters
       - Due to the distributed nature of an Elasticsearch cluster
     • Add lightweight client nodes (no data)
     • Use snapshot and restore
       - This is very useful, but different from replication
     • Use persistent (keepalive) HTTP connections
       - Monitor using /_nodes/stats/http
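
One way to act on the keepalive advice above is to sample '/_nodes/stats/http' twice and look at how fast connections are being opened. The sketch below assumes each node entry carries an "http" object with a "total_opened" counter; the function name and the sample numbers are made up for illustration.

```python
def http_open_rate(sample_before, sample_after, interval_s):
    """Connections opened per second, per node, between two parsed
    '/_nodes/stats/http' responses taken interval_s seconds apart."""
    rates = {}
    before_nodes = sample_before.get("nodes", {})
    for node_id, node in sample_after.get("nodes", {}).items():
        prev = before_nodes.get(node_id)
        if prev is None:
            continue  # node joined between samples; nothing to compare
        delta = node["http"]["total_opened"] - prev["http"]["total_opened"]
        rates[node.get("name", node_id)] = delta / float(interval_s)
    return rates


if __name__ == "__main__":
    # Hand-written sample data, not output from a real cluster.
    before = {"nodes": {"a1": {"name": "es-client-1",
                               "http": {"current_open": 12, "total_opened": 100}}}}
    after = {"nodes": {"a1": {"name": "es-client-1",
                              "http": {"current_open": 14, "total_opened": 700}}}}
    print(http_open_rate(before, after, 60))
```

A rate that stays high while "current_open" stays small suggests clients are opening a fresh connection per request instead of reusing persistent ones.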
  10. Logstash
      • Watch out for grok filter overhead (hello GREEDYDATA)
      • Test configs with logstash -e 'input { … } … output { … }'
      • Use the -w flag to utilize multiple cores
        - e.g. '-w 8' on an eight-core machine
      • Use the generator input for benchmarking
      • Use --debug for far too much logging output
        - Don't forget to turn it off when you're done!
      • Use top -H to see thread names in the JVM
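
The -e, -w and generator tips above combine naturally into a throwaway benchmark command. The Python sketch below only assembles the command line; the helper name, the bin/logstash path, the event count and the dots codec on stdout are assumptions chosen for illustration, and since -w sets the number of filter workers, a meaningful run would pass in the filters under test.

```python
import os
import shlex


def benchmark_command(event_count=1000000, workers=None,
                      filter_block="", logstash_bin="bin/logstash"):
    """Build an argument list for a quick Logstash throughput test.

    filter_block is an optional 'filter { ... }' snippet to measure.
    """
    if workers is None:
        workers = os.cpu_count() or 1  # one filter worker per core
    parts = ["input { generator { count => %d } }" % event_count]
    if filter_block:
        parts.append(filter_block)
    parts.append("output { stdout { codec => dots } }")
    return [logstash_bin, "-w", str(workers), "-e", " ".join(parts)]


if __name__ == "__main__":
    cmd = benchmark_command(
        workers=8,
        filter_block='filter { grok { match => [ "message", "%{GREEDYDATA:rest}" ] } }',
    )
    # Print a copy-pasteable shell command rather than running it.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Piping the dots output through a rate meter, or timing the whole run, gives a rough events-per-second figure for comparing filter changes.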
  11. Kibana
      • Tune queries on the Elasticsearch side for max. performance
        - Field data circuit breaker, in particular
      • Configure the number of threads in each pool
        - In particular, set search low
      • Save and export dashboards as JSON files
      • Deploy a proxy
      • Use as an exploration tool
        - But watch out for over-eager users taxing your cluster!
        - Marketing will love you because…
  12. Access Control / Security
      • Security is very high on our radar
      • Use nginx, Apache, lighttpd, etc.
      • You can block POST / PUT / DELETE method requests
      • Disable dynamic scripting, in versions < 1.2
        - script.disable_dynamic: true
      • Disable destructive actions
        - action.destructive_requires_name: true
      • Use aliases to allow users access to subsets of indices
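
The method-blocking rule above would normally live in the proxy's own configuration; the Python sketch below only illustrates the decision such a rule makes. The function name and the optional exception for POSTed searches are assumptions, not something the deck prescribes (Elasticsearch does accept search bodies via POST to _search, so a strict block may be too strict for some clients).

```python
READ_ONLY_METHODS = {"GET", "HEAD", "OPTIONS"}


def is_request_allowed(method, path, allow_search_post=False):
    """Decide whether a read-only proxy should forward the request.

    allow_search_post is a hypothetical switch that lets POST through
    only when the path targets a _search endpoint.
    """
    method = method.upper()
    if method in READ_ONLY_METHODS:
        return True
    if allow_search_post and method == "POST":
        # Ignore any query string before checking the endpoint name.
        return path.split("?", 1)[0].rstrip("/").endswith("/_search")
    return False


if __name__ == "__main__":
    for method, path in [("GET", "/logs/_search"),
                         ("POST", "/logs/_search?pretty"),
                         ("DELETE", "/logs")]:
        print(method, path, is_request_allowed(method, path, allow_search_post=True))
```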
  13. Elasticsearch Shield plug-in
      • Role-Based Access Controls
        - Granular control over permissions
        - Cluster, index, and alias-level permissions for each user
      • Authentication System Support
        - Integrates with LDAP-based authentication systems
        - Integrates with Active Directory
        - Native authentication system for those who want to manage all access within Elasticsearch
  14. Elasticsearch Shield (cont.)
      • Encrypted Communications
        - Node-to-node encryption protects in-flight data from intruders
        - Certificate-based SSL/TLS encryption
        - Secure client communications with HTTPS
        - Shield keeps data traveling over the wire protected
      • Audit Logging
        - Ensure compliance
        - Monitor security-related activity
        - Record login failures
        - Record attempts to access unauthorized information
  15. VM vs Metal
      • VMs are convenient
      • Metal is more configurable
      • Metal can utilize SSDs
      • Cloud VMs can suffer from "noisy neighbors"
      • Start with what you're most familiar with!
  16. Disks, disks, disks
      • Spinning disks are cheaper per GB
      • SSDs have better IOPS
      • SSDs are cheaper per IOPS
      • SSD manufacturing tolerance can vary
      • SSD write amplification is lessened with Elasticsearch
      • SAN / NAS can work, if IOPS are sufficient
      • You don't necessarily need RAID; ES handles redundancy
        - But striping can help with performance
  17. ELK can help!
      • Backfill log data, enrich in "new" ways
      • A substitute for Graphite, other plotting software
      • Collect / analyze application logs
      • Analyze BI data
        - C-level folks will love you for it, too.
      • Twitter Logstash input
      • Public data sets: IMDB, Enron, sports, Wikipedia
  18. Questions?
      • Google Groups
      • GitHub issues / read the code
      • Elasticsearch Core Training (Public and Private)
      • Elasticsearch Getting Started Workshop
  19. The legal bits
      This presentation is Copyright Elasticsearch 2014.
      The lumberjack and elk photos are used under a Creative Commons license:
      https://www.flickr.com/photos/pasukaru76/5081167157
      https://www.flickr.com/photos/usfwsmtnprairie/8534525072
      https://creativecommons.org/licenses/by/2.0/