Building Resilient Services in Go

If your service is automatically monitored, then the answer is “yes!”. But what if your service isn’t monitored yet? Or what if your monitors alert you when the server is offline, but not on subtler problems like latency spikes or CPU load?

Fortunately, there’s a quick and easy way to get the high-resolution metrics you need to observe your services and keep them scalable and resilient in Go. When you combine the tools from the Go standard library with an intelligent observability pipeline, you can easily answer the questions you care about, like “Which servers are currently running near maximum capacity?” or “Can our infrastructure handle tomorrow’s product launch?”

Aditya Mukerjee

January 30, 2019

Transcript

  1. Observability measures how well internal states of a system can be inferred from knowledge of its external outputs @chimeracoder
  2. 1. What should I monitor? 2. How do I monitor those things in Go? 3. What does the future of Go observability look like?
  3. Let’s Create an API
     •Return a list of all Twitter followers
     •Record a copy to the database
     •Distributed!
     (Diagram: three API instances and one DB)
  4. Service-Level Agreement: What we promise our clients
     Service-Level Indicators: Data used to evaluate the SLA
     Service-Level Objective: What we target internally
  5. Service Indicators
     •Rate: Number of requests received
     •Errors: Number of responses written, broken down by HTTP status
     •Duration: Distribution of response latency
  6. Metrics, logs, and request traces are used to provide greater visibility beyond our service indicators
  7. Aggregation Caveats
     •Cardinality: No aggregation by IP address (or even /24 subnets)
     •Host-local or fault tolerant: pick one!
  8. Tracing Your Context
     •Like profiling, but across servers
     •Take a snapshot of a request and inspect each function
  9. What’s the difference?
     •If you squint, it’s hard to tell them apart
     •A log is a metric with “longer” information
     •A trace is a metric that allows “inner joins”
  10. The future of distributed systems is being written in Go. The future of observability will be written in Go, too