Building Resilient Services in Go

Aditya Mukerjee
Observability Engineer at Stripe
GoDays Berlin

Observability measures how well internal states of a system can
be inferred from knowledge of its external outputs @chimeracoder

Go is used to build…
• Distributed systems
• Reliable software
• “The Cloud™”

1. What should I monitor?
2. How do I monitor those things in Go?
3. What does the future of Go observability look like?

Let’s Create an API
• Return a list of all Twitter followers
• Record a copy to the database
• Distributed!
[Diagram: API, API, API, DB]

Service-Level Agreement: What we promise our clients
Service-Level Indicators: Data used to evaluate the SLA
Service-Level Objective: What we target internally

Service Indicators
• Rate: Number of requests received
• Errors: Number of responses written, broken down by HTTP status
• Duration: Distribution of response latency

Every monitor involves a service-level indicator*
*for sufficiently broad definitions of “service”

Metrics, logs, and request traces are used to provide
greater visibility beyond our service indicators

Tool #1: Logs

Logging in Go
• Use structured logging (e.g. logrus) instead of the standard library's log package

Logging in Go
• Preserve contextual data – don’t just “check, log, and return”

Tool #2: Metrics

Statsd protocol
• Local service listening for metrics over UDP
• Metric aggregation

Aggregation Caveats
• Cardinality: No aggregation by IP address (or even /24 subnets)
• Host-local or fault tolerant: pick one!

https://veneur.org

• Distributed statsd
• Global metric aggregation (cross-server analysis)
• Horizontally scalable
• Fault-tolerant
• Written in Go
• Higher throughput
• Tunable

Tool #3: Request Traces

[Diagram: API, API, API, DB]

Tracing Your Context
• Like profiling, but across servers
• Take a snapshot of a request and inspect each function

Putting it all together: Logs, Metrics, and Traces

Does it really have to be so complicated?

[Diagram: Application; logs, metrics, traces]

What’s the difference?
• If you squint, it’s hard to tell them apart
• A log is a metric with “longer” information
• A trace is a metric that allows “inner joins”
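
To make the comparison concrete, here is a toy record that can play all three roles. `Sample` and `joinByTrace` are hypothetical illustrations of the bullets above, not the layout of any real wire format.

```go
package main

import "fmt"

// Sample is a hypothetical unified record.
type Sample struct {
	Name    string
	Value   float64
	Message string // "longer" information: filling this in makes it a log
	TraceID string // join key: filling this in makes it part of a trace
}

// joinByTrace groups samples that share a trace ID, which is the
// "inner join" that turns individual samples into a request trace.
func joinByTrace(samples []Sample) map[string][]Sample {
	out := make(map[string][]Sample)
	for _, s := range samples {
		if s.TraceID == "" {
			continue // plain metrics and logs have nothing to join on
		}
		out[s.TraceID] = append(out[s.TraceID], s)
	}
	return out
}

func main() {
	samples := []Sample{
		{Name: "api.requests", Value: 1},                                             // metric
		{Name: "api.error", Value: 1, Message: "upstream timed out after 3 retries"}, // log
		{Name: "api.handler", Value: 120, TraceID: "trace-1"},                        // trace span
		{Name: "db.insert", Value: 35, TraceID: "trace-1"},                           // trace span
	}
	for id, group := range joinByTrace(samples) {
		fmt.Println(id, len(group)) // prints "trace-1 2"
	}
}
```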

Standard Sensor Format

Define and measure your service indicator metrics

The future of distributed systems is being written in Go
The future of observability will be written in Go, too

What does the future of observability, written in Go, look like?

Thank you! Aditya Mukerjee @chimeracoder https://veneur.org #veneur on Freenode