Working in operations in 2014 is hard.
The infrastructures we manage are growing rapidly, and responsibility is being divided up across multiple teams.
Then something breaks. Your on-call engineer receives 900 SMS messages in 30 seconds. Her phone melts. You can’t distinguish the signal from the noise. It takes an hour to fix the problem.
Enter Flapjack: an event-processing and alert-routing system for monitoring. Flapjack sits at the end of your monitoring pipeline and sends alerts to the right person.
You should be interested in Flapjack if:
- You want to configure, from within Puppet, how your on-call engineers are notified.
- You want to identify failures faster by rolling up your alerts across multiple monitoring systems.
- You monitor infrastructures that have multiple teams responsible for keeping them up.
In this talk you will learn how to set up Flapjack with Puppet, how to drive Flapjack configuration from data with Hiera, and how to use Puppet's metadata to route alerts to the people who can solve the problem.
## Credits
http://www.flickr.com/photos/lizadaly/4373330774
http://www.flickr.com/photos/meltwater/420749031
http://www.flickr.com/photos/whatknot/8642836187
http://www.flickr.com/photos/jonmould/5393395335
http://vmfarms.com/static/img/logos/ruby-logo.png
http://www.flickr.com/photos/l1v32r1d3bmx/3985457584
http://www.flickr.com/photos/thomasforsyth/4313764488
http://www.flickr.com/photos/rubodewig/5161937181
http://www.flickr.com/photos/ronwls/7001551988
http://www.flickr.com/photos/sparktography/83217827
http://www.flickr.com/photos/sdphotography/1570906849
http://tosbourn.com/wp-content/uploads/2013/12/redis-logo.png?e0df77
http://www.flickr.com/photos/derekskey/9530097369
http://giphy.com/gifs/yeUxljCJjH1rW
http://en.wikipedia.org/wiki/Broadcast_delay
http://www.flickr.com/photos/karen_d/8448507872
http://www.flickr.com/photos/buzzhoffman/4127280540
http://i.imgur.com/2UduUZ5.gif