ElasticSearch for Logging

Brad Lhotsky
September 20, 2013

A brief overview of the landscape of logging data with ElasticSearch, followed by a number of lessons learned. By the end of the talk you should want to use ElasticSearch for logging and know enough to avoid shooting yourself in the foot.

Transcript

  1. ElasticSearch for Logging: One Man's Sordid Journey of Discovery
     Brad Lhotsky
     http://twitter.com/reyjrar
     http://github.com/reyjrar
  2. ‣Agile Development (for Structure!)
     ‣Test everything (mostly in production)
     ‣Failure is encouraged
     ‣IT Budget for taking the site down
     ‣Amazing Business Monitoring
     ‣KPIs for IT tied to business metrics
     ‣ElasticSearch was successful for Front-End
  3. LogStash
     ‣Many Input / Filter / Output Plugins
     ‣Thriving Community
     ‣Daily Index Layout
     ‣Front-end? Not so much.
  4. Daily Schema (logstash-YYYY.MM.DD)
     ‣Dealing with "days" makes sense
     ‣Maintenance Operations Easy: Delete, Optimize, Close, Open
     ‣Results in a higher number of shards
     Capacity Schema (Graylog2)
     ‣Which indexes do I search for 1 week of data?
     ‣Maintenance Operations Expensive
     ‣Potentially lower number of shards and even index sizes
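With the daily schema, "which indexes do I search for 1 week of data?" and "which indexes can I delete?" both reduce to date arithmetic on index names. The following is a minimal sketch, assuming the `logstash-YYYY.MM.DD` naming from the slide; the function names `daily_indexes` and `expired_indexes` are hypothetical helpers, not part of LogStash or any client library.

```python
from datetime import date, datetime, timedelta

# Assumed prefix and date pattern, matching the logstash-YYYY.MM.DD naming on the slide.
INDEX_PREFIX = "logstash-"
DATE_FORMAT = "%Y.%m.%d"


def daily_indexes(end, days):
    """Return the daily index names covering `days` days ending on `end`, oldest first."""
    return [
        INDEX_PREFIX + (end - timedelta(days=offset)).strftime(DATE_FORMAT)
        for offset in range(days - 1, -1, -1)
    ]


def expired_indexes(names, today, keep_days):
    """Return the daily index names dated more than `keep_days` days before `today`."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        if not name.startswith(INDEX_PREFIX):
            continue  # not a daily index, leave it alone
        try:
            day = datetime.strptime(name[len(INDEX_PREFIX):], DATE_FORMAT).date()
        except ValueError:
            continue  # prefix matched but the suffix is not a date
        if day < cutoff:
            expired.append(name)
    return expired
```

Joining the result of `daily_indexes` with commas gives a multi-index target that can be used in a search URL, and each name returned by `expired_indexes` is a candidate for a delete or close operation.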
  5. Roll Your Own!
     Perl: ElasticSearch.pm
     Python: pyes
     Ruby: tire
     JavaScript: Elastic.js
     http://www.elasticsearch.org/guide/clients/
  6. index.auto_expand_replicas
     ‣Clustering order of operations issue
     ‣Can cause enormous data transfers between nodes leaving and entering a cluster
     ‣Defaults to true
     ‣You should set it to false
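One way to apply the advice above to an existing index is through the index settings endpoint. This is a minimal sketch that only builds the HTTP request; the helper name `disable_auto_expand_request`, the base URL, and the index name in the usage note are assumptions for illustration, and the exact settings behavior may differ between ElasticSearch versions.

```python
import json
from urllib.request import Request


def disable_auto_expand_request(base_url, index):
    """Build (but do not send) a PUT to the index settings endpoint.

    `base_url` and `index` are supplied by the caller; nothing here is
    specific to a particular cluster.
    """
    body = json.dumps({"index": {"auto_expand_replicas": False}}).encode("utf-8")
    return Request(
        url="{}/{}/_settings".format(base_url.rstrip("/"), index),
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

Passing the result to `urllib.request.urlopen`, for example with `disable_auto_expand_request("http://localhost:9200", "logstash-2013.09.20")`, would send the change to a node assumed to be listening locally.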
  7. A Perl programmer learns about Java memory management.
     ‣There are no query killers!
     ‣Memory is limited.
     ‣Aggressive caching by default consumes the heap.
     ‣This is normally good
     ‣Except when it's not
     ‣Thread pools are malleable by default, and maintaining buffers for them can also cost memory.
  8. Thread Pool Management (less relevant since 0.90.0)
     threadpool:
       index:
         type: fixed
         size: 30
         queue_size: 1000
         reject_policy: caller
  9. There are no solutions, aside from firewalls.
     ‣If you can search, you can search any data in the cluster.
     ‣If you can search, you can modify or delete data from that index.
  10. ElasticSearch is not a System of Record
      ‣Not legit for Legal Uses
      ‣That's O.K., we can handle that use case cheaply.
  11. ElasticSearch, Graphite of Logging?
      ‣Composable investigations with Kibana
      ‣Easy access to everything for everyone
      ‣Simple API (REST) and data format (JSON)
      ‣We can get pretty pictures from it!
      ‣Encourages interaction with data
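To illustrate the "Simple API (REST) and data format (JSON)" point, here is a minimal sketch of a search request built with the Python standard library alone. The helper name `search_request`, the base URL, the index names, and the query text are all assumptions for illustration; the body uses a `query_string` query and a `size` limit.

```python
import json
from urllib.request import Request


def search_request(base_url, indexes, query, size=10):
    """Build (but do not send) a search against one or more indexes.

    `indexes` is a list of index names; they are joined with commas so a
    single request covers all of them.
    """
    body = json.dumps({
        "query": {"query_string": {"query": query}},
        "size": size,
    }).encode("utf-8")
    return Request(
        url="{}/{}/_search".format(base_url.rstrip("/"), ",".join(indexes)),
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```

The response to such a request is a JSON document as well, so `json.loads` on the response body is enough to start working with the hits.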