Slide 1

Slide 1 text

tracing for granularity JBD, Google (@rakyll)

Slide 2

Slide 2 text

@rakyll

Slide 3

Slide 3 text

@rakyll tracing? What is tracing and why do we trace?

Slide 4

Slide 4 text

@rakyll

Slide 5

Slide 5 text

@rakyll clogged?

Slide 6

Slide 6 text

@rakyll leaking?

Slide 7

Slide 7 text

@rakyll path and direction?

Slide 8

Slide 8 text

@rakyll 100% availability (is a lie)

Slide 9

Slide 9 text

“ @rakyll A service is available if users cannot tell there was an outage.

Slide 10

Slide 10 text

@rakyll Without an SLO, your team has no principled way of saying what level of downtime is acceptable. ● Error rate ● Latency or throughput expectations Service Level Objectives (SLOs)

Slide 11

Slide 11 text

@rakyll 28 ms 100 ms 172 ms 56 ms 356 ms what user sees what else we can see sec.Check auth.AccessToken cache.Lookup spanner.Query GET /messages

Slide 12

Slide 12 text

@rakyll 182 ms 56 ms 245 ms what user sees what else we can see sec.Check auth.AccessToken GET /messages 7 ms cache.Lookup

Slide 13

Slide 13 text

@rakyll latency...

Slide 14

Slide 14 text

@rakyll Go is the language to write servers. Many runtime activities occur during the program execution: ● scheduling ● memory allocation ● garbage collection Hard to associate a request with its impact on the runtime.

Slide 15

Slide 15 text

@rakyll clogged?

Slide 16

Slide 16 text

“ @rakyll There is no easy way to tell why latency is high for certain requests. Is it due to GC, scheduler or syscalls? Can you review the code and tell us why? -SRE

Slide 17

Slide 17 text

@rakyll Execution tracer $ go tool trace ● Reports fine-grained runtime events in the lifetime of a goroutine. ● Reports utilization of CPU cores. But cannot easily tell how handling a request impacts the runtime.

Slide 18

Slide 18 text

@rakyll 28 ms 100 ms 172 ms 56 ms 356 ms GET /messages auth.AccessToken cache.Lookup spanner.Query GET /messages

Slide 19

Slide 19 text

@rakyll 5 68µs 8 123µs networking serialization + deserialization garbage collection blocking syscall what actually happens 172 ms auth.AccessToken

Slide 20

Slide 20 text

@rakyll 5 68µs 8 123µs epoll executing sys gc netwrite

Slide 21

Slide 21 text

@rakyll How? ● Mark sections in code using runtime/trace. ● Enable execution tracer temporarily and record data. ● Examine the recorded data.

Slide 22

Slide 22 text

@rakyll Go 1.11 introduces... ● User regions, tasks and annotations. ● Association between user code and runtime. ● Association with distributed traces.

Slide 23

Slide 23 text

@rakyll Go 1.11 runtime/trace import “runtime/trace” ctx, task := trace.NewTask(ctx, “myHandler”) defer task.End() // Handler code here....

Slide 24

Slide 24 text

@rakyll region #1 task #1 Go 1.11 runtime/trace region #2 region #3 region #4 region #5 goroutine #1 goroutine #4 goroutine #5

Slide 25

Slide 25 text

@rakyll import _ "net/http/pprof" go func() { log.Println(http.ListenAndServe("localhost:6060", nil)) }()

Slide 26

Slide 26 text

@rakyll $ curl http://server:6060/debug/pprof/trace?seconds=5 -o trace.out $ go tool trace trace.out 2018/05/04 10:39:59 Parsing trace... 2018/05/04 10:39:59 Splitting trace... 2018/05/04 10:39:59 Opening browser. Trace viewer is listening on http://127.0.0.1:51803

Slide 27

Slide 27 text

Execution tracer tasks for RPCs (/usertasks)

Slide 28

Slide 28 text

Execution tracer tasks for RPCs (/usertasks)

Slide 29

Slide 29 text

RPCs overlapping with garbage collection

Slide 30

Slide 30 text

Execution tracer regions (/userregions)

Slide 31

Slide 31 text

Region summary for conn.ready

Slide 32

Slide 32 text

@rakyll Record in production $ curl http://server/debug/pprof/trace?seconds=5 -o trace.out $ go tool trace trace.out

Slide 33

Slide 33 text

@rakyll Try It! Install the Go 1.11 beta1! golang.org/dl

Slide 34

Slide 34 text

@rakyll $ go get go.opencensus.io/trace import rt “runtime/trace” ctx, span := trace.StartSpan(ctx, “/messages”) defer span.End() rt.WithRegion(ctx, “foo”, func(ctx) { // Do something... })

Slide 35

Slide 35 text

@rakyll Limitations ● Execution tracer cannot do accounting for cross-goroutine operations automatically. ● Exposition format is hard to parse if `go trace tool` is not used.

Slide 36

Slide 36 text

thank you! JBD, Google [email protected] @rakyll