In this talk, I describe how Yelp uses the dynamic monitoring abilities of Sensu to monitor services that are dynamically deployed by Mesos and dynamically routed by SmartStack.
Services: “Also Nagios” • Alerts probably go to ops anyway • Probably just making sure the LB is up • Very little developer visibility • Hard to articulate to Nagios what you want
and execute checks, but just put the results on the queue • Servers take results off the queue and route them to things like email, PagerDuty, JIRA, etc. • Also API, CLI, check history, silencing, dashboard, etc.
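The client/server split above can be sketched in a few lines. This is only an illustration, not Sensu's code: it assumes an in-memory `queue.Queue` standing in for the real message queue, and the names `run_check`, `handle_results`, and `email_handler` are hypothetical.

```python
import queue

# Hypothetical in-memory stand-in for the message queue between clients and servers.
results = queue.Queue()


def run_check(name, check_fn):
    """Client side: execute a check and publish the result. No routing happens here."""
    status, output = check_fn()
    results.put({"name": name, "status": status, "output": output})


def handle_results(handlers):
    """Server side: drain the queue and pass every non-OK result to each handler.

    Simplification: only failing results (status != 0) are routed.
    """
    handled = []
    while not results.empty():
        event = results.get()
        if event["status"] != 0:
            for handler in handlers:
                handled.append(handler(event))
    return handled


def email_handler(event):
    """Hypothetical handler: a real one would send mail, page someone, or file a ticket."""
    return f"email: {event['name']} is failing: {event['output']}"
```

The point of the design is that the client never needs to know who gets notified; adding PagerDuty or JIRA routing would only mean adding another handler on the server side.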
to receive arbitrary events • We already know which team owns each service (started documenting that with the soa-configs) • We already know where services are deployed and what latency zones they are in
simulate a failure of an AZ • We got a replication alert because one of the latency zones didn’t meet our expected replication count (0 out of 3) • We decided to “remediate” it by expanding our latency zone to “region” • PaaSTA “made it so”, our alert resolved, and the status command reflected the fact that we are now expecting 6 in that one region
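A rough sketch of the kind of per-zone replication check that would produce the "0 out of 3" alert above. It assumes we already have one zone name per healthy instance; the function name `check_replication` and the zone names are made up for illustration, and this is not PaaSTA's actual implementation.

```python
from collections import Counter


def check_replication(instance_zones, expected_per_zone, zones):
    """Return a Nagios-style (status, output) pair: 2 if any zone is under-replicated.

    instance_zones: one zone name per healthy instance (hypothetical input shape).
    zones: every latency zone the service is expected to run in.
    """
    counts = Counter(instance_zones)
    under = {z: counts.get(z, 0) for z in zones if counts.get(z, 0) < expected_per_zone}
    if under:
        detail = ", ".join(
            f"{zone}: {n} out of {expected_per_zone}" for zone, n in sorted(under.items())
        )
        return 2, f"CRITICAL: under-replicated in {detail}"
    return 0, f"OK: at least {expected_per_zone} instances in every zone"
```

Checking each zone separately is what makes the AZ failure visible: a single global count could still look healthy while one zone has nothing running.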
for the “Teams” metadata hash • PaaSTA checks HAProxy in each latency zone because it can read the same SOA configs that SmartStack does! • PaaSTA “knows” which team owns each service because we told it in SOA configs! • Sensu just processes the event like normal
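As a sketch of what "sending an event" can look like: the Sensu client accepts JSON check results on a local socket (port 3030 by default), so any program can report a result without being a scheduled check. The `team` field and the helper names `build_event` and `send_event` below are assumptions for illustration; the real glue described in this talk may differ.

```python
import json
import socket


def build_event(name, status, output, team):
    """Build a check result with ownership metadata attached.

    "name", "status", and "output" are the standard check result fields;
    "team" is a custom field (an assumption here) that handlers can route on.
    """
    return {"name": name, "status": status, "output": output, "team": team}


def send_event(event, host="127.0.0.1", port=3030):
    """Send the event as JSON to the local Sensu client socket (default port 3030)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(json.dumps(event).encode("utf-8"))
```

For example, `send_event(build_event("replication_myservice", 2, "0 out of 3 in zone-b", "search"))` would report a critical result owned by a hypothetical "search" team, provided a Sensu client is listening locally.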
process arbitrary events for easy integration (Sensu) • Keep service metadata in an easy-to-access place for pieces to integrate easily (SOA configs) • Monitor the exact thing you care about (replication in each latency zone)
A. To describe how cool Sensu is B. To make viewers feel inadequate about their own Nagios installation C. To tease viewers about Sensu glue that is not open source yet D. To inspire viewers to build their own dynamic monitoring based on some of these ideas! E. Other?