
Monitoring and Tracing Your Go Services - GothamGo 2017

“If a Go microservice falls down in the middle of a server farm, does my pager make a sound?”

If your service is automatically monitored, then the answer is “yes!” But what if your service isn’t monitored yet? Or what if your monitors alert you when the server is offline, but not to subtler problems like latency spikes or high CPU load?

Fortunately, there’s a quick and easy way to get high-resolution metrics for monitoring your services. The Go standard library now contains the basic building blocks for application tracing. When you combine these tools with Veneur, a pure Go distributed metrics aggregator, you can easily answer the questions you care about, like “Which servers are currently running near maximum capacity?” or “Can our infrastructure handle tomorrow’s product launch?”

Aditya Mukerjee

October 05, 2017

Transcript

  1. Monitoring and Tracing Your Go Services
    Aditya Mukerjee
    Observability Engineer at Stripe
    GothamGo 2017
    New York City

  2. Go is used to build…
    •Distributed systems
    •Reliable software
    •“The Cloud™”
    @chimeracoder

  3. 1. What should I monitor?
    2. How do I monitor those things in Go?
    3. What does the future of Go observability look like?

  4. Let’s Create an API
    •Return a list of all Twitter followers
    •Record a copy to the database
    •Distributed!
    [Diagram labels: API, API, API, DB]

  5. Service-Level Agreement: What we promise our clients
    Service-Level Indicators: Data used to evaluate the SLA

  6. Service Indicators
    •Rate: Number of requests received
    •Errors: Number of responses written, broken down by HTTP status
    •Duration: Distribution of response latency

  7. Every monitor involves a service-level indicator*
    *for sufficiently broad definitions of “service”

  8. Tool #1: Logs

  9. Logging in Go
    •Use structured logging (e.g. logrus) instead of the standard library logger

  10. Logging in Go
    •Preserve contextual data – don’t just “check, log, and return”

  11. [No text transcribed for this slide]

  12. Tool #2: Metrics

  13. Statsd protocol
    •Local service listening for metrics over UDP
    •Metric aggregation

  14. [No text transcribed for this slide]

  15. Aggregation Caveats
    •Cardinality: No aggregation by IP address (or even /24 subnets)
    •Host-local or fault tolerant: pick one!

  16. https://veneur.org

  17. •Distributed statsd
    •Global metric aggregation (cross-server analysis)
    •Horizontally scalable
    •Fault-tolerant
    •Written in Go
    •Higher throughput
    •Tunable

  18. Tool #3: Request Traces

  19. [Diagram labels: API, API, API, DB]

  20. [No text transcribed for this slide]

  21. Tracing Your Context
    •Like profiling, but across servers
    •Take a snapshot of a request and inspect each function

  22. Putting it all together: Logs, Metrics, and Traces

  23. [No text transcribed for this slide]

  24. Does it really have to be so complicated?

  25. [Diagram labels: Application, logs, metrics, traces]

  26. What’s the difference?
    •If you squint, it’s hard to tell them apart
    •A log is a metric with “longer” information
    •A trace is a metric that allows “inner joins”
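
A rough way to picture these three bullets is a single record type in which a metric is the base case, a log adds a message, and a trace adds a shared key. The `sample` type below is purely hypothetical and is not Veneur's wire format; grouping by `TraceID` is used as a loose stand-in for the "inner join" the slide mentions.

```go
package main

import "fmt"

// sample is a hypothetical unified record, invented for this illustration.
type sample struct {
	Name    string
	Value   float64
	Message string // "longer" information: with this set, the sample reads like a log line
	TraceID string // a shared key: samples with the same ID can be joined into a trace
}

// joinByTrace groups samples that share a TraceID and skips those without one.
func joinByTrace(samples []sample) map[string][]sample {
	out := make(map[string][]sample)
	for _, s := range samples {
		if s.TraceID != "" {
			out[s.TraceID] = append(out[s.TraceID], s)
		}
	}
	return out
}

func main() {
	samples := []sample{
		{Name: "api.requests", Value: 1},                                     // plain metric
		{Name: "api.error", Value: 1, Message: "followers lookup timed out"}, // log-like
		{Name: "api.handler", Value: 120, TraceID: "t1"},                     // trace span
		{Name: "db.insert", Value: 35, TraceID: "t1"},                        // trace span
	}
	fmt.Println(len(joinByTrace(samples)["t1"])) // prints 2
}
```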

  27. Standard Sensor Format

  28. [No text transcribed for this slide]

  29. The future of distributed systems is being written in Go
    The future of observability will be written in Go, too

  30. What does the future of observability, written in Go, look like?

  31. Thank you!
    Aditya Mukerjee
    @chimeracoder
    https://veneur.org
    #veneur on Freenode
