
monitoring: it gets better

Talk on Sensu and monitoring given by Adam Horwich, Systems Engineer, at MetaBroadcast on October 16th, 2013

MetaBroadcast

October 16, 2013

Transcript

  1. old skool • Limited suite of free applications: Nagios, Cacti, Ganglia • Static, Inflexible, Single-purpose • Commercial applications try to achieve everything (badly) • Limited scope to enhance!
  2. hashtag monitoringsucks • A couple of years ago, sysadmins despaired • https://github.com/monitoringsucks • The notion that monitoring products are ill-suited to DevOps needs • And that most are outdated and misaligned with Cloud architectures • The one-tool-to-solve-all-problems model is a con
  3. my first year • My history is with large infrastructure, static, rented/owned Datacentre • Started at MetaBroadcast and spent my time hacking Nagios to play well with AWS • We knew what we needed in terms of quality monitoring but it was an uphill struggle • Automated infra changes • Metrics gathering and graphing • I cry
  4. cloud surfing • Basically, it’s because of IaaS Clouds and DevOps that we need to rethink our models • No longer static hardware with sentimental names and birthdays • Virtual Infrastructure is abstracted. Harder to monitor ‘switches’ and ‘routers’ • VPC Simulates the DC model, but still virtualised • More flexibility but less accessible data
  5. inspiration • Obfuscurity - Tasseo creator • https://speakerdeck.com/obfuscurity/the-state-of-open-source-monitoring • Finding the components to build a better way of life • Modular, open-source, framework-led design
  6. monitoring manifesto 1. Proactive, not reactive, metrics and alerts 2. Focus on instrumentation, not thresholds 3. Flexible architecture, built for a flexible infrastructure 4. Tie in with configuration management tools (Puppet)
  7. be sensu my beating heart • Router, Queue, Scheduler • Connects everything together • Designed for the Cloud • High Availability Model
  8. kairosdb • Originally Nimrod, but as a developer-focused tool, it couldn’t cope with the volume of logs • Proof that architecture principles make components easily interchangeable • Apache performance monitoring • APDEX, Aggregation, and Comparison
  9. graphite and tasseo • Taking data and making it accessible and pretty and stuff • Dashboards, comparisons, oh my
  10. silver linings • AWS CloudWatch: Amazon’s metrics engine • We collect ‘privileged’ and AWS-specific metrics into our Graphite for comparison and persistence • Spot prices! Billing! All can be monitored in AWS or with your own services • Can push metrics to AWS too. Great for Auto Scaling (spot price ;) ) • CloudWatch isn’t an adequate component in itself (API Speed, Cost, Flexibility, Integration)
  11. monitoring still kinda sucks • Sensu is great but has a poor dashboard, visibility, historical access. • Alerts are still based on thresholds rather than trends and patterns • Monitoring the monitoring with more monitoring • Actively developed and improving every day • Can’t escape from that list of RED services. But it’s next to do!
  12. hashtag monitoringlove • But, monitoring is as hard as you want to make it • By choosing the right components, they can seamlessly interact • Focus on what you want to achieve, not what you want to monitor • Don’t be afraid of building zords
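
Slide 8 mentions Apdex for Apache performance monitoring, and slides 9 and 10 describe collecting metrics into Graphite. The slides do not show how either was done, so what follows is only a minimal sketch under stated assumptions: it uses the standard Apdex formula (satisfied requests plus half of the tolerating ones, divided by the total, where "tolerating" means up to four times the threshold) and Graphite's plaintext line format (`path value timestamp`). The function names, the metric path `apache.web01.apdex`, the 0.5 second threshold and the sample response times are all hypothetical and not taken from the talk.

```python
import time


def apdex(response_times, threshold):
    """Standard Apdex: (satisfied + tolerating / 2) / total samples.

    satisfied:  t <= threshold
    tolerating: threshold < t <= 4 * threshold
    Returns None when there are no samples.
    """
    if not response_times:
        return None
    satisfied = sum(1 for t in response_times if t <= threshold)
    tolerating = sum(
        1 for t in response_times if threshold < t <= 4 * threshold
    )
    return (satisfied + tolerating / 2) / len(response_times)


def graphite_line(path, value, timestamp=None):
    """Format one metric as a Graphite plaintext line: path, value, epoch seconds."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


# Made-up response times in seconds; a real setup would parse these from logs.
samples = [0.12, 0.30, 0.45, 0.80, 1.50, 2.50]
score = apdex(samples, threshold=0.5)

# Hypothetical metric path and an arbitrary fixed timestamp for illustration.
line = graphite_line("apache.web01.apdex", round(score, 3), timestamp=1381881600)
```

A line in this format can be written to a Carbon listener over TCP (port 2003 by default); whatever check or handler produces the samples is left out here because the talk does not describe it.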