Monitoring and Tracing Your Go Services - GothamGo 2017

“If a Go microservice falls down in the middle of a server farm, does my pager make a sound?”

If your service is automatically monitored, then the answer is “yes!”. But what if your service isn’t monitored yet? Or what if your monitors alert you when the server is offline, but not on subtler problems like latency spikes or CPU load?

Fortunately, there’s a quick and easy way to get high-resolution metrics for monitoring your services. The Go standard library now contains the basic building blocks for application tracing. When you combine these tools with Veneur, a pure Go distributed metrics aggregator, you can easily answer the questions you care about, like “Which servers are currently running near maximum capacity?”, or “Can our infrastructure handle tomorrow’s product launch?”.

Aditya Mukerjee

October 05, 2017

Transcript

  1. Monitoring and Tracing Your Go Services
    Aditya Mukerjee, Observability Engineer at Stripe
    GothamGo 2017, New York City
  2. Go is used to build…
    • Distributed systems
    • Reliable software
    • “The Cloud™”
    @chimeracoder
  3. 1. What should I monitor?
     2. How do I monitor those things in Go?
     3. What does the future of Go observability look like?
  4. Let’s Create an API
    • Return a list of all Twitter followers
    • Record a copy to the database
    • Distributed!
    [Diagram: API, API, API, DB]
  5. Service-Level Agreement: What we promise our clients
    Service-Level Indicators: Data used to evaluate the SLA
  6. Service Indicators
    • Rate: Number of requests received
    • Errors: Number of responses written, broken down by HTTP status
    • Duration: Distribution of response latency
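The three indicators on this slide can be collected in one place with an HTTP middleware. The following is a minimal sketch using only the standard library, not code from the talk: the `indicators` type, the `instrument` method, the `runDemo` helper, and the `/followers` and `/fail` paths are all hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// indicators collects the three service indicators named on the slide.
// The type and field names are hypothetical, chosen for this sketch.
type indicators struct {
	mu        sync.Mutex
	requests  int             // Rate: number of requests received
	byStatus  map[int]int     // Errors: responses written, keyed by HTTP status
	durations []time.Duration // Duration: raw latencies, summarized elsewhere
}

// statusRecorder remembers the status code that the wrapped handler writes.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (r *statusRecorder) WriteHeader(code int) {
	r.status = code
	r.ResponseWriter.WriteHeader(code)
}

// instrument wraps next and records rate, status, and duration per request.
func (in *indicators) instrument(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		start := time.Now()
		// Default to 200, which net/http uses when a handler never calls WriteHeader.
		rec := &statusRecorder{ResponseWriter: w, status: http.StatusOK}
		next.ServeHTTP(rec, req)

		in.mu.Lock()
		defer in.mu.Unlock()
		in.requests++
		in.byStatus[rec.status]++
		in.durations = append(in.durations, time.Since(start))
	})
}

// runDemo sends three in-memory requests through an instrumented handler
// and returns the request count and the number of 500 responses.
func runDemo() (requests, serverErrors int) {
	in := &indicators{byStatus: make(map[int]int)}
	h := in.instrument(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/fail" {
			w.WriteHeader(http.StatusInternalServerError)
			return
		}
		fmt.Fprintln(w, "ok") // implicit 200
	}))
	for _, path := range []string{"/followers", "/followers", "/fail"} {
		h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest("GET", path, nil))
	}
	return in.requests, in.byStatus[http.StatusInternalServerError]
}

func main() {
	requests, serverErrors := runDemo()
	fmt.Printf("requests=%d status_500=%d\n", requests, serverErrors)
}
```

In a real service the raw values would be shipped to a metrics aggregator rather than held in memory; the in-process slice here only keeps the sketch self-contained.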
  7. Every monitor involves a service-level indicator*
    *for sufficiently broad definitions of “service”
  8. Tool #1: Logs
  9. Logging in Go
    • Use structured logging (e.g. logrus) instead of the standard library
  10. Logging in Go
    • Preserve contextual data: don’t just “check, log, and return”
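One way to read the two logging slides: attach context to an error as it travels up the call stack, then log it once with structured fields. The sketch below makes several assumptions. It uses `log/slog`, which entered the standard library in Go 1.21, well after this talk, as a stand-in for a structured logger such as logrus. The `saveFollowers` and `recordFollowers` functions, the `request_id` field, and the sample values are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"log/slog"
	"os"
)

var errTimeout = errors.New("connection timed out")

// saveFollowers is a hypothetical database write that always fails in this sketch.
func saveFollowers(userID string, count int) error {
	return errTimeout
}

// recordFollowers wraps the error with the data needed to debug it,
// instead of logging it here and returning it bare.
func recordFollowers(userID string, count int) error {
	if err := saveFollowers(userID, count); err != nil {
		return fmt.Errorf("recording %d followers for user %s: %w", count, userID, err)
	}
	return nil
}

func main() {
	// Structured output: each field is a separate key, so it can be searched and aggregated.
	logger := slog.New(slog.NewJSONHandler(os.Stderr, nil))

	// Log once, at the top of the call stack, where the request context is known.
	if err := recordFollowers("alice", 120); err != nil {
		logger.Error("request failed", "request_id", "req-42", "error", err)
	}
}
```

Because the wrap uses `%w`, callers can still test for the underlying cause with `errors.Is(err, errTimeout)`.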
  12. Tool #2: Metrics
  13. Statsd protocol
    • Local service listening for metrics over UDP
    • Metric aggregation
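For readers who have not seen the statsd wire format: each metric is a short plain-text line of the form `name:value|type`, sent as a UDP datagram to a local listener. The following is a hedged sketch rather than a client library. The function names and metric names are hypothetical, and the address `127.0.0.1:8125` assumes the conventional statsd port.

```go
package main

import (
	"fmt"
	"net"
)

// counterLine formats a statsd counter increment, e.g. "api.requests:1|c".
func counterLine(name string, n int64) string {
	return fmt.Sprintf("%s:%d|c", name, n)
}

// timerLine formats a statsd timer in milliseconds, e.g. "api.response_ms:42|ms".
func timerLine(name string, ms int64) string {
	return fmt.Sprintf("%s:%d|ms", name, ms)
}

// send writes each line as its own UDP datagram. UDP is fire-and-forget:
// a successful write does not mean a listener received the metric.
func send(addr string, lines ...string) error {
	conn, err := net.Dial("udp", addr)
	if err != nil {
		return err
	}
	defer conn.Close()
	for _, line := range lines {
		if _, err := conn.Write([]byte(line)); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	// Assumed address: a statsd-compatible listener on the conventional port.
	err := send("127.0.0.1:8125",
		counterLine("api.requests", 1),
		timerLine("api.response_ms", 42),
	)
	if err != nil {
		fmt.Println("send failed:", err)
	}
}
```

The listener, not the application, is responsible for aggregating these samples before forwarding them.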
  15. Aggregation Caveats
    • Cardinality: No aggregation by IP address (or even /24 subnets)
    • Host-local or fault tolerant: pick one!
  16. https://veneur.org

  17. • Distributed statsd
    • Global metric aggregation (cross-server analysis)
    • Horizontally scalable
    • Fault-tolerant
    • Written in Go
    • Higher throughput
    • Tunable
  18. Tool #3: Request Traces
  19. [Diagram: API, API, API, DB]
  21. Tracing Your Context
    • Like profiling, but across servers
    • Take a snapshot of a request and inspect each function
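The idea of following one request through each function can be sketched with `context.Context` from the standard library. This is an illustrative, in-process toy and not the tracing API used in the talk: the `span` and `tracer` types, the `trace-N` ID scheme, and the function names are hypothetical, and the tracer is not safe for concurrent use. Crossing servers would additionally require carrying the trace and span IDs in the request, for example in headers.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// span is a hypothetical record of one timed operation within a request.
type span struct {
	TraceID  string
	ID       int
	ParentID int // 0 means this is the root span
	Name     string
	Duration time.Duration
}

// tracer collects finished spans in memory; not safe for concurrent use.
type tracer struct {
	nextID int
	spans  []span
}

type spanKey struct{}

// start opens a span as a child of whatever span is stored in ctx, if any,
// and returns a context carrying the new span plus a function that finishes it.
func (t *tracer) start(ctx context.Context, name string) (context.Context, func()) {
	t.nextID++
	s := &span{ID: t.nextID, Name: name}
	if parent, ok := ctx.Value(spanKey{}).(*span); ok {
		s.TraceID, s.ParentID = parent.TraceID, parent.ID
	} else {
		s.TraceID = fmt.Sprintf("trace-%d", s.ID) // hypothetical ID scheme
	}
	begin := time.Now()
	finish := func() {
		s.Duration = time.Since(begin)
		t.spans = append(t.spans, *s)
	}
	return context.WithValue(ctx, spanKey{}, s), finish
}

func handleRequest(ctx context.Context, t *tracer) {
	ctx, finish := t.start(ctx, "GET /followers")
	defer finish()
	fetchFollowers(ctx, t)
	saveFollowers(ctx, t)
}

func fetchFollowers(ctx context.Context, t *tracer) {
	_, finish := t.start(ctx, "fetch followers")
	defer finish()
	time.Sleep(2 * time.Millisecond) // stand-in for real work
}

func saveFollowers(ctx context.Context, t *tracer) {
	_, finish := t.start(ctx, "save to database")
	defer finish()
	time.Sleep(1 * time.Millisecond) // stand-in for real work
}

// runTrace traces one request and returns its spans in finish order.
func runTrace() []span {
	t := &tracer{}
	handleRequest(context.Background(), t)
	return t.spans
}

func main() {
	for _, s := range runTrace() {
		fmt.Printf("%s span=%d parent=%d %q took %v\n", s.TraceID, s.ID, s.ParentID, s.Name, s.Duration)
	}
}
```

Child spans finish before the root, so the root span appears last in the returned slice.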
  22. Putting it all together: Logs, Metrics, and Traces
  24. Does it really have to be so complicated?
  25. [Diagram: Application, logs, metrics, traces]
  26. What’s the difference?
    • If you squint, it’s hard to tell them apart
    • A log is a metric with “longer” information
    • A trace is a metric that allows “inner joins”
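One possible reading of this slide, sketched in code: a single record type can act as a metric, a log line, or part of a trace, depending on which fields are filled in. This is an illustration only and is not the Standard Sensor Format. The `sample` type, its fields, `joinByTrace`, and the demo values are all hypothetical.

```go
package main

import "fmt"

// sample is a hypothetical unified record, not a real wire format.
type sample struct {
	Name    string
	Value   float64
	Message string // "longer" information: a sample with a message reads like a log line
	TraceID string // join key: samples sharing a trace ID belong to one request
}

// joinByTrace groups samples by trace ID, dropping samples that have none,
// in the spirit of the "inner join" on the slide.
func joinByTrace(samples []sample) map[string][]sample {
	joined := make(map[string][]sample)
	for _, s := range samples {
		if s.TraceID == "" {
			continue
		}
		joined[s.TraceID] = append(joined[s.TraceID], s)
	}
	return joined
}

// demoSamples returns one plain metric, one log-like sample, and one more traced sample.
func demoSamples() []sample {
	return []sample{
		{Name: "api.requests", Value: 1},
		{Name: "api.error", Value: 1, Message: "database write timed out", TraceID: "trace-1"},
		{Name: "db.write_ms", Value: 950, TraceID: "trace-1"},
	}
}

func main() {
	for traceID, group := range joinByTrace(demoSamples()) {
		fmt.Println(traceID, "has", len(group), "samples")
	}
}
```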
  27. Standard Sensor Format
  29. The future of distributed systems is being written in Go
    The future of observability will be written in Go, too
  30. What does the future of observability, written in Go, look like?
  31. Thank you! Aditya Mukerjee @chimeracoder https://veneur.org #veneur on Freenode