Slide 1

Slide 1 text

Diving into the details with DTrace! Colin Jones @trptcolin

Slide 2

Slide 2 text

systems applications

Slide 3

Slide 3 text

problems solutions

Slide 4

Slide 4 text

problems solutions

Slide 5

Slide 5 text

What even is DTrace?

Slide 6

Slide 6 text

DTrace is… strace

Slide 7

Slide 7 text

DTrace is… strace++

Slide 8

Slide 8 text

strace++ $ sudo dtrace -qn ' syscall:::entry { @[probefunc, execname] = count(); } dtrace:::END { trunc(@, 10); printa(@); }'

Slide 9

Slide 9 text

strace++ $ sudo dtrace -qn 'syscall:::entry { @[probefunc, execname] = count(); } dtrace:::END { trunc(@, 10); printa(@); }' ^C read Slack 1993 select Slack 2017 kevent_qos Slack 2660 kevent_qos WindowServer 2801 psynch_cvsignal Box Sync 2873 psynch_cvwait Box Sync 2873 select Box Sync 4780 madvise Slack 5783 bsdthread_ctl Slack 6910 workq_kernreturn Slack 10622

Slide 10

Slide 10 text

DTrace is… strace++ system resources

Slide 11

Slide 11 text

system resources $ sudo dtrace -qn ' vminfo:::maj_fault { @[execname] = sum(arg0); }'

Slide 12

Slide 12 text

system resources $ sudo dtrace -qn 'vminfo:::maj_fault { @[execname] = sum(arg0); }' ^C 2BUA8C4S2C.com.a 1 mdworker 2 Google Chrome 12 mds_stores 116 Google Chrome He 252

Slide 13

Slide 13 text

DTrace is… strace++ system resources statically-defined tracing

Slide 14

Slide 14 text

statically-defined tracing $ sudo dtrace -qn ' mysql*::: { @[probename] = count(); } dtrace:::END { trunc(@, 10); printa(@); }'

Slide 15

Slide 15 text

statically-defined tracing $ sudo dtrace -qn 'mysql*::: { @[probename] = count(); } dtrace:::END { trunc(@, 10); printa(@); }' ^C query-exec-start 3 query-parse-done 3 query-parse-start 3 query-start 3 select-done 3 select-start 3 net-read-done 18 net-read-start 18 net-write-done 18 net-write-start 18

Slide 16

Slide 16 text

DTrace is… strace++ system resources statically-defined tracing dynamic tracing

Slide 17

Slide 17 text

Dynamic tracing $ sudo dtrace -qn ' fbt:mach_kernel::entry { self->in = timestamp; } fbt:mach_kernel::return /self->in/ { @ = quantize(timestamp - self->in) self->in = 0; }'

Slide 18

Slide 18 text

Dynamic tracing $ sudo dtrace -n 'fbt:mach_kernel::entry { self->in = timestamp; } fbt:mach_kernel::return /self->in/ { @ = quantize(timestamp - self->in); self->in = 0; }' dtrace: description 'fbt:mach_kernel::entry ' matched 24482 probes ^C value ------------- Distribution ------------- count 256 | 0 512 |@@@@@@@@@@@@@@@@@@ 13521 1024 |@@@@@@@@@@@@@@@@@@@ 14961 2048 |@@@ 2172 4096 | 79 8192 | 71 16384 | 7 32768 | 3 65536 | 3 131072 | 2 262144 | 0

Slide 19

Slide 19 text

DTrace is… strace++ system resources statically-defined tracing dynamic tracing

Slide 20

Slide 20 text

Weird slowness… it happens

Slide 21

Slide 21 text

Slooooooooow tests (…sometimes)

Slide 22

Slide 22 text

Multiple tests, multiple machines

Slide 23

Slide 23 text

Affects some teammates worse than others

Slide 24

Slide 24 text

annoying productivity-sapping stressful

Slide 25

Slide 25 text

Hypotheses?

Slide 26

Slide 26 text

Use DTrace!

Slide 27

Slide 27 text

Rule out the usual suspect(s)? ruby*:::gc-mark-begin { self->mark_begin = timestamp; } ruby*:::gc-mark-end /self->mark_begin/ { self->duration = (timestamp - self->mark_begin) / 1000000; self->phase = "mark"; self->mark_begin = 0; } ruby*:::gc-*-end /self->duration/ { printf("%-8d %-8d %-8s\n", pid, self->duration, self->phase); }

Slide 28

Slide 28 text

Garbage collection? $ sudo ruby-gc.d PID TIME(ms) PHASE 30152 7 mark 30152 33 mark 30152 7 mark 30152 9 mark 30152 38 mark 30152 17 mark 30152 16 mark 30152 17 mark 30152 22 mark 30152 71 mark 30152 22 mark 30152 6 sweep 30152 3 sweep 30152 4 sweep 30152 1 sweep 30152 23 mark

Slide 29

Slide 29 text

System resources CPU Memory Disk Network

Slide 30

Slide 30 text

CPU? $ sudo dtrace -qn ' profile-997 { @[execname] = count(); } '

Slide 31

Slide 31 text

CPU? $ sudo dtrace -qn 'profile-997 { @[execname] = count(); }' ^C [...] Box Sync 3 Google Drive 3 ruby 3 Google Chrome 9 systemstatsd 11 iTerm 14 Google Chrome He 24 WindowServer 25 hidd 26 Box Sync Monitor 47 kernel_task 9527

Slide 32

Slide 32 text

CPU? $ sudo dtrace -qn ' profile-997 /execname=="kernel_task"/ { @[stack()] = count(); } '

Slide 33

Slide 33 text

CPU? $ sudo dtrace -qn 'profile-997 /execname=="kernel_task"/ { @[stack()] = count(); }' ^C [...] kernel`0xffffff80010f3d30+0x358 kernel`0xffffff800158d890+0x793 kernel`kevent+0x44 kernel`unix_syscall64+0x251 kernel`hndl_unix_scall64+0x16 116 kernel`processor_idle+0x107 121 kernel`machine_idle+0x2e0 kernel`call_continuation+0x17 82967

Slide 34

Slide 34 text

CPU $ sudo dtrace -qn 'profile-997 { @[execname] = count(); }' ^C [...] Box Sync 3 Google Drive 3 ruby 3 Google Chrome 9 systemstatsd 11 iTerm 14 Google Chrome He 24 WindowServer 25 hidd 26 Box Sync Monitor 47 kernel_task 9527

Slide 35

Slide 35 text

System resources CPU Memory Disk Network

Slide 36

Slide 36 text

Network connections https://github.com/brendangregg/DTrace-book-scripts/blob/master/Chap6/soconnect_mac.d

Slide 37

Slide 37 text

Network connections https://github.com/brendangregg/DTrace-book-scripts/blob/master/Chap6/soconnect_mac.d $ sudo soconnect_mac.d PID PROCESS FAM ADDRESS PORT LAT(us) RESULT 88161 ruby 2 127.0.0.1 5432 144 Success 88161 ruby 2 127.0.0.1 5432 171 Success 88161 ruby 2 127.0.0.1 5432 150 Success 114 AirPlayXPCHelper 2 192.168.1.27 7000 1762 In progress 88161 ruby 2 127.0.0.1 5432 141 Success 88161 ruby 2 127.0.0.1 5432 179 Success 88161 ruby 2 127.0.0.1 5432 137 Success 88161 ruby 2 72.52.4.119 80 29977 Success 88161 ruby 2 72.52.4.119 80 42121 Success 88161 ruby 2 72.52.4.119 80 29471 Success 88161 ruby 2 72.52.4.119 80 29360 Success 88161 ruby 2 72.52.4.119 80 34731 Success 88161 ruby 2 72.52.4.119 80 28824 Success

Slide 38

Slide 38 text

DNS pid$target::getaddrinfo:entry { self->start = timestamp; self->host = copyinstr(arg0); } pid$target::getaddrinfo:return /self->start/ { this->now = (timestamp - timezero) / 1000000; this->time = (timestamp - self->start) / 1000000; printf("%-8d %-8d %s\n", this->now, this->time, self->host); self->start = 0; self->host = 0; }

Slide 39

Slide 39 text

DNS $ sudo dns_latency.d -p 41161 Password: TICK ms HOST 74450 274 redclay.local 74734 0 redclay.local 105006 30075 someplace.com 145171 30075 someplace.com 145191 1 redclay.local 145192 0 redclay.local 177055 0 example.com 252314 2 redclay.local 252315 0 redclay.local 284392 0 example.com

Slide 40

Slide 40 text

Now we know what to fix!

Slide 41

Slide 41 text

What did we learn?

Slide 42

Slide 42 text

Specific vs. General

Slide 43

Slide 43 text

DTrace is… strace++ system resources statically-defined tracing dynamic tracing

Slide 44

Slide 44 text

You can do it!

Slide 45

Slide 45 text

Where can we learn more?

Slide 46

Slide 46 text

Brendan Gregg's blog: http://www.brendangregg.com/dtrace.html The DTrace guide http://dtrace.org/guide DTrace: Dynamic Tracing in Oracle Solaris, Mac OS X, and FreeBSD http://amzn.to/2bqBczw Read

Slide 47

Slide 47 text

Scripts that ship with OS X find /usr/bin -name "*.d" more DTrace scripts https://github.com/brendangregg/DTrace-book-scripts Examples

Slide 48

Slide 48 text

DTrace: not just performance performance: not just DTrace Extrapolate

Slide 49

Slide 49 text

Try it out!

Slide 50

Slide 50 text

DTrace & SIP on El Capitan: http://internals.exposed/blog/dtrace-vs-sip.html Julia Evans' strace zine: http://jvns.ca/blog/2015/04/14/strace-zine/ Jeff Hodges "Notes on Distributed Systems for Young Bloods": http://bit.ly/2b2mxLf Brendan Gregg's blog: http://www.brendangregg.com/dtrace.html The DTrace guide: http://dtrace.org/guide DTrace book: http://amzn.to/2bqBczw DTrace Toolkit: https://github.com/opendtrace/toolkit DTrace book scripts: https://github.com/brendangregg/DTrace-book-scripts SystemTap (Linux): https://sourceware.org/systemtap/ bcc: https://github.com/iovisor/bcc Resources

Slide 51

Slide 51 text

Thanks! Colin Jones @trptcolin [email protected]