Tracing for Granularity
JBD
June 02, 2018
Transcript
tracing for granularity JBD, Google (@rakyll)
tracing? What is tracing and why do we trace?
clogged?
leaking?
path and direction?
100% availability (is a lie)
“A service is available if users cannot tell there was an outage.”
Service Level Objectives (SLOs)
• Error rate
• Latency or throughput expectations
Without an SLO, your team has no principled way of saying what level of downtime is acceptable.
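The two SLO dimensions on this slide, error rate and latency, can be checked mechanically. The following is a minimal sketch, not anything from the talk: the `Request` and `SLO` types, the nearest-rank percentile, the sample data and the thresholds are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Request is a hypothetical record of one served request.
type Request struct {
	Latency time.Duration
	Failed  bool
}

// SLO holds illustrative targets; real targets are a product decision.
type SLO struct {
	MaxErrorRate float64
	MaxP99       time.Duration
}

// errorRate returns the fraction of failed requests.
func errorRate(reqs []Request) float64 {
	if len(reqs) == 0 {
		return 0
	}
	failed := 0
	for _, r := range reqs {
		if r.Failed {
			failed++
		}
	}
	return float64(failed) / float64(len(reqs))
}

// percentile returns the p-th percentile latency (1 <= p <= 100)
// using the nearest-rank method.
func percentile(reqs []Request, p int) time.Duration {
	if len(reqs) == 0 {
		return 0
	}
	lat := make([]time.Duration, len(reqs))
	for i, r := range reqs {
		lat[i] = r.Latency
	}
	sort.Slice(lat, func(i, j int) bool { return lat[i] < lat[j] })
	rank := (p*len(lat) + 99) / 100 // integer ceil(p*n/100)
	return lat[rank-1]
}

// meets reports whether the observed requests satisfy the SLO.
func meets(slo SLO, reqs []Request) bool {
	return errorRate(reqs) <= slo.MaxErrorRate && percentile(reqs, 99) <= slo.MaxP99
}

// demo builds 100 synthetic requests (1 ms to 100 ms, one failure)
// and checks them against made-up targets.
func demo() (time.Duration, bool) {
	reqs := make([]Request, 0, 100)
	for i := 1; i <= 100; i++ {
		reqs = append(reqs, Request{
			Latency: time.Duration(i) * time.Millisecond,
			Failed:  i == 100,
		})
	}
	slo := SLO{MaxErrorRate: 0.02, MaxP99: 200 * time.Millisecond}
	return percentile(reqs, 99), meets(slo, reqs)
}

func main() {
	p99, ok := demo()
	fmt.Println("p99:", p99, "meets SLO:", ok)
}
```

A check like this says whether an objective was missed, not why; the rest of the talk is about finding the why.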
[Figure: trace of GET /messages. What the user sees: 356 ms in total. What else we can see: spans sec.Check, auth.AccessToken, cache.Lookup and spanner.Query, with durations 28 ms, 100 ms, 172 ms and 56 ms.]
[Figure: trace of GET /messages. What the user sees: 245 ms in total. What else we can see: spans sec.Check, auth.AccessToken and cache.Lookup, with durations 182 ms, 56 ms and 7 ms.]
latency...
Go is the language to write servers.
Many runtime activities occur during the program execution:
• scheduling
• memory allocation
• garbage collection
Hard to associate a request with its impact on the runtime.
clogged?
“There is no easy way to tell why latency is high for certain requests. Is it due to GC, scheduler or syscalls? Can you review the code and tell us why?” -SRE
Execution tracer
$ go tool trace
• Reports fine-grained runtime events in the lifetime of a goroutine.
• Reports utilization of CPU cores.
But cannot easily tell how handling a request impacts the runtime.
[Figure: trace of GET /messages, 356 ms in total, with spans auth.AccessToken, cache.Lookup and spanner.Query; durations shown: 28 ms, 100 ms, 172 ms and 56 ms.]
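[Figure: what actually happens inside the 172 ms auth.AccessToken span: networking, serialization + deserialization, garbage collection, blocking syscall. Values shown: 5, 68µs, 8, 123µs.]
[Figure: the same span broken down by runtime event: epoll, executing, sys, gc, netwrite. Values shown: 5, 68µs, 8, 123µs.]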
How?
• Mark sections in code using runtime/trace.
• Enable execution tracer temporarily and record data.
• Examine the recorded data.
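The three steps above can be sketched in a single program. This is a minimal illustration, not code from the talk: `handle` and `recordHandle` are hypothetical names, and the trace is written to an in-memory buffer where a real service would more likely use a file or the HTTP endpoint.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"log"
	"runtime/trace"
	"time"
)

// handle stands in for request-handling code. The name and the sleep are
// hypothetical and only give the tracer something to record.
func handle(ctx context.Context) {
	// Step 1: mark a section of code as a region.
	defer trace.StartRegion(ctx, "handle").End()
	time.Sleep(time.Millisecond)
}

// recordHandle enables the execution tracer only while handle runs and
// returns the recorded trace data.
func recordHandle() ([]byte, error) {
	var buf bytes.Buffer
	// Step 2: enable the tracer temporarily.
	if err := trace.Start(&buf); err != nil {
		return nil, err
	}
	handle(context.Background())
	trace.Stop() // returns once all trace data has been written to buf
	return buf.Bytes(), nil
}

func main() {
	data, err := recordHandle()
	if err != nil {
		log.Fatal(err)
	}
	// Step 3: save data to a file and open it with `go tool trace`.
	fmt.Printf("recorded %d bytes of trace data\n", len(data))
}
```

Only one trace can be active per process, so `trace.Start` returns an error if tracing is already enabled.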
Go 1.11 introduces...
• User regions, tasks and annotations.
• Association between user code and runtime.
• Association with distributed traces.
Go 1.11 runtime/trace

    import "runtime/trace"

    // Inside a handler that already has a context.Context named ctx:
    ctx, task := trace.NewTask(ctx, "myHandler")
    defer task.End()
    // Handler code here....
[Figure: Go 1.11 runtime/trace. Task #1 contains regions #1 to #5, which run on goroutines #1, #4 and #5.]
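One task can group regions that run on several goroutines, as long as the task's context is passed to each goroutine explicitly. The sketch below assumes a hypothetical `processOrder` task with three made-up steps; only the `runtime/trace` calls are real API.

```go
package main

import (
	"context"
	"fmt"
	"runtime/trace"
	"sync"
	"sync/atomic"
)

// processOrder creates one task and runs each step as a region on its own
// goroutine. The task and step names are hypothetical.
func processOrder(ctx context.Context) int64 {
	ctx, task := trace.NewTask(ctx, "processOrder")
	defer task.End()

	var done int64
	var wg sync.WaitGroup
	for _, name := range []string{"validate", "charge", "notify"} {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			// The region is tied to the task through ctx, not through
			// the goroutine it happens to run on.
			trace.WithRegion(ctx, name, func() {
				trace.Log(ctx, "step", name) // annotation: category, message
				atomic.AddInt64(&done, 1)
			})
		}(name)
	}
	wg.Wait()
	return done
}

// runDemo runs the task once and reports how many regions completed.
func runDemo() int64 {
	return processOrder(context.Background())
}

func main() {
	fmt.Println("regions completed:", runDemo())
}
```

When the execution tracer is not running, these calls record nothing, so annotations like this can stay in production code.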
    import (
        "log"
        "net/http"
        _ "net/http/pprof" // registers /debug/pprof/* on the default mux
    )

    go func() {
        log.Println(http.ListenAndServe("localhost:6060", nil))
    }()
$ curl http://server:6060/debug/pprof/trace?seconds=5 -o trace.out
$ go tool trace trace.out
2018/05/04 10:39:59 Parsing trace...
2018/05/04 10:39:59 Splitting trace...
2018/05/04 10:39:59 Opening browser. Trace viewer is listening on http://127.0.0.1:51803
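The trace endpoint used by the curl command can also be mounted on a dedicated mux instead of the default one, which keeps it off the public listener. The following is a sketch under that assumption: `newDebugMux` and `fetchTrace` are hypothetical helpers, and the in-process test server stands in for a real deployment.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newDebugMux exposes only the execution trace endpoint.
func newDebugMux() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
	return mux
}

// fetchTrace starts a throwaway server, requests a one-second trace the
// same way the curl command does, and returns the size of the response.
func fetchTrace() (int, error) {
	srv := httptest.NewServer(newDebugMux())
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/debug/pprof/trace?seconds=1")
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return 0, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	data, err := io.ReadAll(resp.Body)
	return len(data), err
}

func main() {
	n, err := fetchTrace()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("fetched %d bytes of trace data\n", n)
}
```

The response body is the same format that `go tool trace` reads from trace.out.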
Execution tracer tasks for RPCs (/usertasks)
RPCs overlapping with garbage collection
Execution tracer regions (/userregions)
Region summary for conn.ready
Record in production
$ curl http://server/debug/pprof/trace?seconds=5 -o trace.out
$ go tool trace trace.out
Try It! Install the Go 1.11 beta1! golang.org/dl
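$ go get go.opencensus.io/trace

    import (
        rt "runtime/trace"

        "go.opencensus.io/trace"
    )

    // Start a distributed-trace span, then mark a runtime/trace region
    // with the same context.
    ctx, span := trace.StartSpan(ctx, "/messages")
    defer span.End()

    rt.WithRegion(ctx, "foo", func() {
        // Do something...
    })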
Limitations
• Execution tracer cannot do accounting for cross-goroutine operations automatically.
• Exposition format is hard to parse if `go tool trace` is not used.
thank you! JBD, Google jbd@google.com @rakyll