New monitoring systems
(and why you should experiment
with them)
Matt Cottingham
@mattrco
Slide 2
Slide 2 text
Web Operations (Allspaw et al.)
● Great overview of approaches to ensuring
the uptime of your services
● Useful when I co-founded a startup
Slide 3
Slide 3 text
● Monitor systems, services and applications
● Learn what is expected for a system
● See trends and patterns
● Discover and alert on problems
Monitoring metrics
Slide 4
Slide 4 text
What’s changed since?
Slide 5
Slide 5 text
What’s changed since?
● Multiple deploys per day
● Application Performance Management
● Breadth of IaaS/PaaS offerings
● More applications (Microservices)
● Anomaly detection
● Containers...
Slide 6
Slide 6 text
What needs improving?
● Handle ephemeral nodes
● Thresholds are still a pain
● Manipulating data is still hard
● Make monitoring useful for others in the business?
Slide 7
Slide 7 text
Some notable Go projects
Heka (by Mozilla)
● Data collection and processing in use at
Mozilla
● Large number of input and output plugins
● Logs as well as metrics
● Lua sandbox for experimentation
Slide 8
Slide 8 text
Some notable Go projects
Prometheus (by SoundCloud)
● Tagged time series
● Query DSL
Slide 9
Slide 9 text
Some notable Go projects
Bosun (by Stack Exchange)
● Similarities to Prometheus
● Run alerts against historical data!
● OpenTSDB datastore
Slide 10
Slide 10 text
Some notable Go projects
InfluxDB
● Time series database
● Based on LevelDB
Slide 11
Slide 11 text
An experiment
Anode (github.com/mattrco/anode)
● Setting thresholds is boring, a computer
should do it
● Inspired by Heka and Etsy’s Skyline
● Thrown together in a few evenings
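One way a computer can set a threshold for you is to derive it from recent data. The sketch below flags a value that lies more than k standard deviations from the mean of a history window. This is an illustrative rule only, not necessarily what Anode or Skyline do; the names meanStddev and isAnomalous and the sample values are assumptions made up for the example.

```go
package main

import (
	"fmt"
	"math"
)

// meanStddev returns the mean and population standard deviation of xs.
// It returns 0, 0 for an empty slice.
func meanStddev(xs []float64) (mean, stddev float64) {
	if len(xs) == 0 {
		return 0, 0
	}
	for _, x := range xs {
		mean += x
	}
	mean /= float64(len(xs))

	var sumSquares float64
	for _, x := range xs {
		d := x - mean
		sumSquares += d * d
	}
	return mean, math.Sqrt(sumSquares / float64(len(xs)))
}

// isAnomalous reports whether latest lies more than k standard deviations
// away from the mean of history (hypothetical helper for this sketch).
// With a perfectly flat history, any value other than the mean is flagged.
func isAnomalous(history []float64, latest, k float64) bool {
	mean, stddev := meanStddev(history)
	return math.Abs(latest-mean) > k*stddev
}

func main() {
	// Made-up response times in milliseconds.
	history := []float64{10, 11, 9, 10, 10, 11, 9, 10}
	for _, latest := range []float64{10.5, 25} {
		fmt.Printf("value %.1f anomalous: %v\n", latest, isAnomalous(history, latest, 3))
	}
}
```

The threshold moves with the data, so nobody has to pick a fixed number per metric. A real detector would also need to cope with seasonality and trends.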
Slide 12
Slide 12 text
Building Anode
Channels are a good fit for input, processing,
output
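A minimal sketch of the input, processing, output shape using channels. This is not Anode's actual code; the Metric type, the stage functions and the limit value are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// Metric is a hypothetical data point type used only in this sketch.
type Metric struct {
	Name  string
	Value float64
}

// input emits the given points on a channel and closes it when done.
func input(points []Metric) <-chan Metric {
	out := make(chan Metric)
	go func() {
		defer close(out)
		for _, p := range points {
			out <- p
		}
	}()
	return out
}

// process forwards only the metrics whose value exceeds limit.
func process(in <-chan Metric, limit float64) <-chan Metric {
	out := make(chan Metric)
	go func() {
		defer close(out)
		for m := range in {
			if m.Value > limit {
				out <- m
			}
		}
	}()
	return out
}

// output drains the channel and collects whatever reached the final stage.
func output(in <-chan Metric) []Metric {
	var got []Metric
	for m := range in {
		got = append(got, m)
	}
	return got
}

func main() {
	points := []Metric{{"cpu", 0.42}, {"cpu", 0.97}, {"cpu", 0.51}}
	for _, m := range output(process(input(points), 0.9)) {
		fmt.Printf("%s=%.2f exceeded limit\n", m.Name, m.Value)
	}
}
```

Each stage owns the channel it returns and closes it when its input is exhausted, so shutdown propagates down the pipeline without extra signalling.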
Slide 13
Slide 13 text
Building Anode
https://github.com/dgryski/go-change
Slide 14
Slide 14 text
Inspiration
Slide 15
Slide 15 text
Heka in more detail
● Sandbox allows you to implement certain
plugin types at runtime
● Change the .lua file, reload
● Resources are constrained
● Low memory footprint (16KiB/plugin)
Slide 16
Slide 16 text
Go runtime statistics
● expvar (in the standard library) publishes
variables such as memstats as JSON at /debug/vars
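A small sketch of reading Go runtime statistics through expvar. Importing the package registers a handler for /debug/vars on http.DefaultServeMux and publishes cmdline and memstats by default. The requests counter and the snapshot helper are hypothetical names for this example, and a throwaway httptest server stands in for a real listener so the program terminates on its own.

```go
package main

import (
	"encoding/json"
	"expvar"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// requests is a hypothetical application counter published via expvar.
var requests = expvar.NewInt("requests")

// snapshot serves the default mux on a temporary local server, fetches
// /debug/vars and decodes the top-level JSON object.
func snapshot() (map[string]json.RawMessage, error) {
	srv := httptest.NewServer(http.DefaultServeMux)
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/debug/vars")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	vars := make(map[string]json.RawMessage)
	err = json.NewDecoder(resp.Body).Decode(&vars)
	return vars, err
}

func main() {
	requests.Add(3)

	vars, err := snapshot()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("requests:", string(vars["requests"]))
	fmt.Println("memstats published:", vars["memstats"] != nil)
}
```

In a long-running service you would typically call http.ListenAndServe with a nil handler instead, and let a collector poll /debug/vars.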
Slide 17
Slide 17 text
Where we want to be
● Adrian Cockcroft’s Velocity keynote is full of
good suggestions: https://vimeo.com/95064249