go tool trace
for correct and effective concurrency
Slide 2
Slide 2 text
Let's talk about Go
Slide 3
Slide 3 text
Let's talk about go
Slide 4
Slide 4 text
A special tool for Go's needs
* and Chromium
Slide 5
Slide 5 text
goroutine scheduling
instrumented
Slide 6
Slide 6 text
Concurrency,
and how to manage it
Slide 7
Slide 7 text
Parallelism,
and how to exploit it
Slide 8
Slide 8 text
$ go doc runtime/trace
package trace // import "runtime/trace"
Go execution tracer. The tracer captures
a wide range of execution events ...
A trace can be analyzed later with
'go tool trace' command.
Slide 9
Slide 9 text
$ go doc cmd/trace
Trace is a tool for viewing trace files.
...
View the trace in a web browser:
go tool trace trace.out
Slide 10
Slide 10 text
pp-quiet-10.first.trace.redacted.html
Slide 11
Slide 11 text
grpc.pre-1005-5.exectrace
Slide 12
Slide 12 text
The tool in three demos:
1. A timing-dependent bug
2. What it doesn't show
3. Latency during GC
Slide 13
Slide 13 text
#1: A race condition
Slide 14
Slide 14 text
go test -race
go build -race
go install -race
Slide 15
Slide 15 text
==================
WARNING: DATA RACE
Read at 0x00c420141500 by goroutine 27:
...
Previous write at 0x00c420141500 by
goroutine 26:
...
Slide 16
Slide 16 text
a logical race, not a data race
Slide 17
Slide 17 text
gRPC, HTTP/2,
and flow control
Slide 18
Slide 18 text
grpc.pre-1005-5.exectrace
stuff
happening
stuff
not happening
Slide 19
Slide 19 text
$ go tool trace trace.out
$ go test -trace=trace.out
- func (qb *quotaPool) cancel() {
- qb.mu.Lock(); defer qb.mu.Unlock()
-
- select {
- case n := <-qb.c:
- qb.quota += n
- default:
- }
- }
Slide 51
Slide 51 text
func (qb *quotaPool) add(v int) {
qb.mu.Lock(); defer qb.mu.Unlock()
select {
case n := <-qb.c:
qb.quota += n
default:
}
if qb.quota += v; qb.quota > 0 {
qb.c <- qb.quota
qb.quota = 0
}
}
Slide 52
Slide 52 text
func (qb *quotaPool) add(v int) {
qb.mu.Lock(); defer qb.mu.Unlock()
select {
case n := <-qb.c:
qb.quota += n
default:
}
if qb.quota += v; qb.quota > 0 {
qb.c <- qb.quota
qb.quota = 0
}
}
Slide 53
Slide 53 text
#2: It's not a panacea
Slide 54
Slide 54 text
Three ways to get data:
1. Testing with -trace flag
2. Direct runtime/trace use
3. net/http/pprof handlers
Slide 55
Slide 55 text
pp-quiet-10.first.trace.redacted.html
suspicious gaps
every five seconds
garbage
collection
unlike other
goroutines
Slide 56
Slide 56 text
pp-quiet-10.first.trace.redacted.html
nothing else runs
during these 90ms
Slide 57
Slide 57 text
runtime.ReadMemStats
• and Go 1.8 or earlier
• and large (40GB) heap
• leads to long pauses
(fixed in Go 1.9)
Slide 58
Slide 58 text
pp-quiet-10.first.trace.redacted.html
what does this goroutine do?
how does it spend its CPU time?
can it run more efficiently?
can it finish sooner (parallelism)?
Slide 59
Slide 59 text
pp-quiet-10.first.trace.redacted.html
net/http.(*conn).serve
what request did it serve?
which handler func ran?
what status code was returned?
Slide 60
Slide 60 text
#3: The GC is still improving
Slide 61
Slide 61 text
tc.1.third.trace.redacted.html
patchy network
long run queue
some behavior is
different during GC
* this is probably
bad, but how good
should we expect?
Slide 62
Slide 62 text
Go garbage collection timeline:
Go 1: program fully stopped
Go 1.1: GC uses parallel threads
Go 1.4: precise collection
Go 1.5: global pauses <10ms
Go 1.8: goroutine pauses <100µs
Slide 63
Slide 63 text
garbage
collection
Go 1 GC
whole program
stopped ("STW")
user code ("mutator")
Slide 64
Slide 64 text
user code ("mutator")
garbage
collection
Go 1.1 GC
whole program
stopped ("STW")
Slide 65
Slide 65 text
garbage
collection
Go 1.5 GC
user code ("mutator")
whole program
stopped ("STW")
Slide 66
Slide 66 text
garbage
collection
Go 1.8 GC
whole program
stopped ("STW")
user code ("mutator")
Slide 67
Slide 67 text
#3A: Stop-The-World pauses
Slide 68
Slide 68 text
Everything stops when the GC
begins and ends mark phase
Stopping everything takes time
Slide 69
Slide 69 text
ta.nopreempt-2.fourth.trace.redacted.html
GC begins
stragglers
GC really begins
3.6ms
unused CPU
Slide 70
Slide 70 text
Goroutines can stop when...
✓ allocating memory
✓ calling functions
✓ communicating
✗ crunching numbers in a loop
Slide 71
Slide 71 text
Seen in:
• encoding/base64
• encoding/json
• .../golang/protobuf/proto
it's measurable for 1MB values
... if you're looking for it
* check code lines before "End Stack Trace"
Slide 72
Slide 72 text
go func() {
for i := 0; i < 1e9; i++ {
}
}
Or write your own tight loop:
Slide 73
Slide 73 text
golang.org/issue/10958
The Go 1.10 compiler should
have a general, permanent fix
Workaround available now
Slide 74
Slide 74 text
ta.preempt-2.second.trace.redacted.html
quick pauses, less waste
Slide 75
Slide 75 text
#3B: Other awkward pauses
Slide 76
Slide 76 text
A mark/sweep GC:
• "mark" finds in-use memory
• "sweep" reclaims the rest
Slide 77
Slide 77 text
GC needs to make progress
User code works against that
User code is forced to help out
Slide 78
Slide 78 text
tc.1.third.trace.redacted.html
read(2) syscall
* hold 'shift' to draw
a selection box
user code in goroutines
runtime
work below
helping
the GC
Slide 79
Slide 79 text
tc.1.third.trace.redacted.html
this assist ran for 4.4ms
Slide 80
Slide 80 text
tc.1.third.trace.redacted.html
this assist ran for 15.6ms
... and didn't even finish
Slide 81
Slide 81 text
Most assists are well-deserved
But they start suddenly
Sweeping requires assists too
Don't allocate in critical paths?
Slide 82
Slide 82 text
go tool trace
http://.../debug/pprof/trace
Slide 83
Slide 83 text
How can the tool help you?
1. See time-dependent issues
2. Complements other profiles
3. Find latency improvements
Slide 84
Slide 84 text
Be prepared:
practice using the tools
Thank you!
@rhyshiltner