Training your cluster to take care of itself (and let you eat your dinner in peace)

Brian Cline

May 13, 2014

Transcript

  1. Training your cluster to take care of itself (and let you eat your dinner in peace)

    OpenStack Summit · Juno
    Atlanta, Georgia, US
    Tuesday 13 May 2014
    Brian Cline
    EMAIL [email protected]
    GITHUB https://github.com/briancline
    IRC briancline
    TWITTER @briancline
  2. Hong Kong Recap

    • Combining Telemetry and Logs (%@*#$!)
    • Metric and Log Processing via Riemann
    • Basic Classification via CRM-114
    • Alerting and automation
  3. End Goal: Smarter Systems

    • Reactively resolve errors
    • Proactively resolve degradations
    • Start with common, low-risk annoyances
    • Ease into recurring, reproducible problems
    • Easily automatable and testable
  4. Classifying the Important Stuff

    • logstash filters
    • (custom): Invoke CRM-114 for text classification
    • add_tag with the resulting classification
    • …and just about anything else imaginable
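The classify-then-tag step on this slide can be sketched in Python. This is a minimal illustration, not the speaker's filter: `keyword_classifier` is a hypothetical stand-in for a trained classifier such as CRM-114, and the event is assumed to be a plain dict with `message` and `tags` keys, loosely mirroring a logstash event.

```python
from typing import Callable, Dict

# Hypothetical keyword table standing in for a trained text classifier.
# A real CRM-114 setup would be trained on labelled log samples instead.
KEYWORD_CLASSES: Dict[str, str] = {
    "I/O error": "disk_error",
    "Out of memory": "memory_pressure",
}


def keyword_classifier(message: str) -> str:
    """Return a class label for a log message (stand-in, not CRM-114)."""
    lowered = message.lower()
    for needle, label in KEYWORD_CLASSES.items():
        if needle.lower() in lowered:
            return label
    return "unclassified"


def classify_and_tag(
    event: dict, classifier: Callable[[str], str] = keyword_classifier
) -> dict:
    """Return a copy of the event with the classification added as a tag."""
    tagged = dict(event)
    tags = list(tagged.get("tags", []))
    label = classifier(tagged.get("message", ""))
    if label not in tags:
        tags.append(label)
    tagged["tags"] = tags
    return tagged
```

Because the classifier is passed in as a callable, the keyword stand-in could later be swapped for a wrapper around whatever classifier is actually in use without changing the tagging logic.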
  5. Sending to Other Systems

    • logstash outputs
    • elasticsearch is really, really great
    • riemann is really, really great for moving-window shared stats (and queryable!)
    • exec helps fire off easy fixes
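The idea of a moving-window stat can be illustrated with a small Python sketch. It is not Riemann's API (Riemann is configured in Clojure); the `MovingWindowCounter` class and its method names are assumptions made for illustration, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class MovingWindowCounter:
    """Count events seen within the last `window_seconds` seconds."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._timestamps: deque = deque()

    def record(self, timestamp: float) -> None:
        """Register one event at the given time (seconds)."""
        self._timestamps.append(timestamp)
        self._expire(timestamp)

    def count(self, now: float) -> int:
        """Return how many recorded events are still inside the window."""
        self._expire(now)
        return len(self._timestamps)

    def _expire(self, now: float) -> None:
        # Drop events that are a full window or more older than `now`.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
```

A counter like this could feed a threshold check, for example firing a fix only when several disk errors land inside the same minute rather than on every single log line.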
  6. A Swift Rise of the Machines

    • Disk I/O error? /dev/sdx: sanity-check, umount, or xfs_check; potentially decrease device weight in rings until 0
    • New disk hot-swapped in? mkfs.xfs if empty, add to ring, gradually increase device weight
    • Running low on memory? Notify you via email, text, SMS (or via beeper, if that’s your thing)
    • (Along the way, generate an event each time to show in the data when an automatic action took place, and its purpose)
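The gradual weight change and the "generate an event each time" advice can be sketched together in Python. This is a planning sketch under stated assumptions: `weight_steps`, `plan_weight_change`, the step size, and the event fields are all hypothetical, and actually applying each weight to the ring is left to whatever tooling the cluster already uses.

```python
from typing import Dict, List


def weight_steps(current: float, target: float, step: float) -> List[float]:
    """Return the intermediate weights from `current` to `target`."""
    if step <= 0:
        raise ValueError("step must be positive")
    steps: List[float] = []
    weight = current
    while weight != target:
        # Clamp to the target so the final step never overshoots it.
        if weight > target:
            weight = max(target, weight - step)
        else:
            weight = min(target, weight + step)
        steps.append(weight)
    return steps


def plan_weight_change(
    device: str, current: float, target: float, step: float, reason: str
) -> List[Dict[str, object]]:
    """Build one event per weight change so each action is visible in the data."""
    return [
        {
            "device": device,
            "action": "set_weight",
            "weight": weight,
            "reason": reason,
            "automatic": True,  # marks the change as machine-initiated
        }
        for weight in weight_steps(current, target, step)
    ]
```

Draining a failing disk would use a target of 0, while a freshly formatted replacement would start at 0 and ramp upward; shipping each event to the log pipeline keeps a record of what the automation did and why.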
  7. Graphs: Very Fun AND Exciting

    • Kibana works very well with this stack
    • Real-time visualization
    • Search queries are very intuitive
    • Hate Logstash? Kibana works with Fluentd and Flume, too
  8. Where to get all of this stuff

    • Logstash: http://elasticsearch.org/overview/logstash/
    • Elasticsearch: http://elasticsearch.org/overview/elasticsearch/
    • Kibana: http://elasticsearch.org/overview/kibana/
    • CRM-114: apt-get install crm114
    • Riemann: http://riemann.io/