Upgrade to Pro — share decks privately, control downloads, hide ads and more …

strace'ing everything

VM Farms Inc.
November 13, 2012

strace'ing everything

strace is a fantastic tool for troubleshooting issues. This presentation demonstrates some of its power.

VM Farms Inc.

November 13, 2012
Tweet

More Decks by VM Farms Inc.

Other Decks in Technology

Transcript

  1. what is strace? is a diagnostic and debugging tool to

    Linux. monitors the user/kernel boundary and intercepts any system calls. ---------------------- | ----------------------------------- | Application | | | Kernel | ---------------------- | ----------------------------------- f = open('fileA', 'r') ------|----> open("fileA", O_RDONLY) = 3 f.read() <-----|----- read(3, "file stuff"..., 40) = 40 f.close() ------|----> close(3) | [ strace ]
  2. why use strace? readily available on all Linux distros. quick

    and dirty way to “capture” a running process to see what it’s doing. does not require any code changes, debugging flags, or even a restart.
  3. what to use it for? identifying current problems - i.e.

    production problems. helps you determine if it’s your app or the system. find slow responding services (network, DNS, IO). inspect SQL queries being issued. permission issues (files, folders, etc...). find which file/module a certain app chokes on.
  4. what it’s not good for? profiling your code - there

    are better tools for this. finding coding hot spots.
  5. pitfalls? it impacts performance. use with caution. on a busy

    app, output can be enormous - use an output file.
  6. how to use it? (basic) $ strace ls execve("/bin/ls", ["ls"],

    [/* 39 vars */]) = 0 brk(0) = 0x192f000 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f362864f000 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) open("/etc/ld.so.cache", O_RDONLY) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=51877, ...}) = 0 mmap(NULL, 51877, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f3628642000 close(3) = 0 ... statfs("/selinux", {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=4875805, f_bfree=1151405, f_bavail=905070, f_files=1233024, f_ffree=784428, f_fsid={-1347478562, 1449920506}, ... open("/proc/filesystems", O_RDONLY) = 3 ... write(1, "10gen-mms-agent.tar.gz\t\t fr"..., 17810gen-mms-agent.tar.gz freeipmi-1.2.1-1.fc18.src.rpm nginx-1.0.15-1.el6.ngx.src.rpm pdf-renderer-0-0.12.128svn. 20110310.fc19.src.rpm rubygem-passenger-3.0.11-9.src.rpm ) = 178 ... close(1) = 0 close(2) = 0 exit_group(0) = ? 133 lines...
  7. how to use it? (specific calls) $ strace -e open

    ls open("/etc/ld.so.cache", O_RDONLY) = 3 open("/lib64/libselinux.so.1", O_RDONLY) = 3 open("/lib64/librt.so.1", O_RDONLY) = 3 open("/lib64/libcap.so.2", O_RDONLY) = 3 open("/lib64/libacl.so.1", O_RDONLY) = 3 open("/lib64/libc.so.6", O_RDONLY) = 3 open("/lib64/libdl.so.2", O_RDONLY) = 3 open("/lib64/libpthread.so.0", O_RDONLY) = 3 open("/lib64/libattr.so.1", O_RDONLY) = 3 open("/proc/filesystems", O_RDONLY) = 3 open("/usr/lib/locale/locale-archive", O_RDONLY) = 3 open(".", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
  8. how to use it? (multiple calls) $ strace -e trace=open,read

    ls open("/etc/ld.so.cache", O_RDONLY) = 3 open("/lib64/libselinux.so.1", O_RDONLY) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 X\240\2229\0\0\0"..., 832) = 832 open("/lib64/librt.so.1", O_RDONLY) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@!\300\3222\0\0\0"..., 832) = 832 open("/lib64/libcap.so.2", O_RDONLY) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000\23\200\2001\0\0\0"..., 832) = 832 open("/lib64/libacl.so.1", O_RDONLY) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000\35\0\1771\0\0\0"..., 832) = 832 open("/lib64/libc.so.6", O_RDONLY) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360\355\301{1\0\0\0"..., 832) = 832 open("/lib64/libdl.so.2", O_RDONLY) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\340\r\340\2249\0\0\0"..., 832) = 832 open("/lib64/libpthread.so.0", O_RDONLY) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 \\@|1\0\0\0"..., 832) = 832 open("/lib64/libattr.so.1", O_RDONLY) = 3 read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\23@\2001\0\0\0"..., 832) = 832 open("/proc/filesystems", O_RDONLY) = 3 read(3, "nodev\tsysfs\nnodev\trootfs\nnodev\tb"..., 1024) = 323 read(3, "", 1024) = 0 open("/usr/lib/locale/locale-archive", O_RDONLY) = 3 open(".", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
  9. how to use it? (output file) $ strace -o output.txt

    ls 10gen-mms-agent.tar.gz freeipmi-1.2.1-1.fc18.src.rpm nginx-1.0.15-1.el6.ngx.src.rpm ovaldi-5.9.1-5.fc18.src.rpm rpmbuild $ cat output.txt execve("/bin/ls", ["ls"], [/* 39 vars */]) = 0 brk(0) = 0x22a0000 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ffa13573000 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) open("/etc/ld.so.cache", O_RDONLY) = 3 ...
  10. how to use it? (perms) # cat test_module/__init__.py def testing():

    print "success!" # ls -la test_module total 16 drwx------ 2 root root 4096 Nov 14 14:52 . drwx------. 4 vmfarms vmfarms 4096 Nov 14 14:53 .. -rw------- 1 root root 34 Nov 14 14:52 __init__.py -rw------- 1 root root 245 Nov 14 14:52 __init__.pyc $ python -c 'import test_module' Traceback (most recent call last): File "<string>", line 1, in <module> ImportError: No module named test_module $ strace -e stat python -c 'import test_module' ... stat("test_module", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0 stat("test_module/__init__.py", 0x7fff5d9ca580) = -1 EACCES (Permission denied) stat("test_module/__init__.pyc", 0x7fff5d9ca580) = -1 EACCES (Permission denied) ...
  11. how to use it? (tracing pids) $ strace -p <PID>

    -o output.txt Process <PID> attached - interrupt to quit ^CProcess <PID> detached $ less output.txt ... write(9, "\324\0\0\0\3SELECT (1) AS `a` FROM `fake_table1` WHERE (`fake_table1`.`id` = 36 AND `fake_table1`.`other_id` = 199 AND `fake_table1`.`date` = '2012-10-18' ) LIMIT 1", 216) = 216 read(9, "\1\0\0\1\1\27\0\0\2\3def\0\0\0\1a\0\f?", 16384) = 50 gettimeofday({1352845540, 332140}, NULL) = 0 poll([{fd=9, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout) write(9, "\1\0\0\0\16", 5) = 5 read(9, "\7\0\0\1\0\0\0\1\0\0\0", 16384) = 11 poll([{fd=9, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout) write(9, "\241\0\0\0\3SELECT (1) AS `a` FROM `fake_table1` WHERE (`fake_table1`.`id` = 36 AND `fake_table1`.`other_id` = 199 ) LIMIT 1", 165) = 165 read(9, "\1\0\0\1\1\27\0\0\2\3def\0\0\0\1a\0\f?", 16384) = 56 poll([{fd=9, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout) write(9, "\1\0\0\0\16", 5) = 5 read(9, "\7\0\0\1\0\0\0\1\0\0\0", 16384) = 11 ... $ lsof -p <PID> COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME ... python <PID> deploy 9u IPv4 14675804 0t0 TCP <LOCAL_IP>:45064-><REMOTE_IP>:mysql (ESTABLISHED) ...
  12. how to use it? (all together) $ strace -s 5000

    -p <PID> -t -e trace=write,read -o output2.txt $ less output2.txt ... 17:39:41 write(9, "\324\0\0\0\3SELECT (1) AS `a` FROM `fake_table1` WHERE (`fake_table1`.`id` = 36 AND `fake_table1`.`other_id` = 199 AND `fake_table1`.`date` = '2012-10-23' ) LIMIT 1", 216) = 216 17:39:45 read(9, "\1\0\0\1\1\27\0\0\2\3def\0\0\0\1a\0\f? \0\1\0\0\0\10\201\0\0\0\0\5\0\0\3\376\0\0\1\0\5\0\0\4\376\0\0\1\0", 16384) = 50 17:39:45 write(9, "\1\0\0\0\16", 5) = 5 17:39:45 read(9, "\7\0\0\1\0\0\0\1\0\0\0", 16384) = 11 17:39:45 write(9, "\241\0\0\0\3SELECT (1) AS `a` FROM `fake_table1` WHERE (`fake_table1`.`id` = 36 AND `fake_table1`.`other_id` = 199 ) LIMIT 1", 165) = 165 17:39:49 read(9, "\1\0\0\1\1\27\0\0\2\3def\0\0\0\1a\0\f? \0\1\0\0\0\10\201\0\0\0\0\5\0\0\3\376\0\0\1\0\2\0\0\4\0011\5\0\0\5\376\0\0\1\0", 16384) = 56
  13. how to use it? (fork me) $ strace -f -p

    <PID> -o output.txt Process <PID> attached with 18 threads - interrupt to quit Process <PID2> attached Process <PID3> attached ... $ less output.txt 31673 gettimeofday({1352847681, 719431}, NULL) = 0 31673 setsockopt(40, SOL_SOCKET, SO_RCVTIMEO, "\200p\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0 31673 read(40, <unfinished ...> 31674 <... read resumed> "A\0\0\1", 4) = 4 31674 read(42, <unfinished ...> 31673 <... read resumed> "H\0\0\0", 4) = 4 31674 <... read resumed> "\205\242\2\0\0\0\0@ \10\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65) = 65 31673 read(40, <unfinished ...> 31674 write(42, "\7\0\0\2\0\0\0\2\0\0\0", 11 <unfinished ...> 31673 <... read resumed> "\3SELECT option_name, option_valu"..., 72) = 72 31674 <... write resumed> ) = 11 31673 setsockopt(40, SOL_SOCKET, SO_RCVTIMEO, "\36\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16 <unfinished ...> 31674 setsockopt(42, SOL_SOCKET, SO_RCVTIMEO, "\36\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0 31673 <... setsockopt resumed> ) = 0 31674 setsockopt(42, SOL_SOCKET, SO_SNDTIMEO, "<\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16 <unfinished ...> 31673 gettimeofday( <unfinished ...> 31674 <... setsockopt resumed> ) = 0 31673 <... gettimeofday resumed> {1352847681, 720320}, NULL) = 0 31674 gettimeofday({1352847681, 720417}, NULL) = 0
  14. how to use it? (fork me again) $ strace -ff

    -p <PID> -o output.txt Process <PID> attached with 18 threads - interrupt to quit Process <PID2> attached Process <PID3> attached ... $ ls output* output.<PID> output.<PID2> output.<PID3> $ less output.<PID2> write(40, "ed=$matches[2]\";s:32:\"tag/([^/]+"..., 8584) = 8584 pread(54, "\v\0\241\0\0\0\0\0\t'T;i:1;s:86", 20, 444276) = 20 pread(54, ":\"www.example.com/products-"..., 152, 444296) = 152 pread(54, "\v\0029\0\0\0\0\0\7\372@orthern-g", 20, 524172) = 20 pread(54, "love/.needs-rebuild\";i:1;s:92:\"w"..., 560, 524192) = 560 pread(54, "\v\1\31\0\0\0\0\0\th\2007:\"www.ex", 20, 522816) = 20 pread(54, "ample.com/people/book/.need"..., 272, 522836) = 272 pread(54, "\v\0\235\0\0\0\0\0\10\332xild\";i:1;", 20, 68168) = 20 pread(54, "s:59:\"www.example.com/colle"..., 148, 68188) = 148 pread(54, "\v\0\31\0\0\0\0\0\10((ions/pola", 20, 449192) = 20 pread(54, "r-bears-internat", 16, 449212) = 16 pread(54, "\v\1\315\0\0\0\0\0\0-\374ional/.ne", 20, 534568) = 20 pread(54, "eds-rebuild\";i:1;s:41:\"www.examp"..., 452, 534588) = 452 pread(54, "\v\0\335\0\0\0\0\0\tj\34-page/pro", 20, 11772) = 20 pread(54, "ducts/tremblant-jacket/\";i:1;s:6"..., 212, 11792) = 212
  15. how to use it? (stats) $ strace -c -p <PID>

    Process <PID> attached - interrupt to quit ^CProcess <PID> detached % time seconds usecs/call calls errors syscall ------ ----------- ----------- --------- --------- ---------------- 51.71 0.008147 4 2235 recvfrom 41.03 0.006465 3 2581 read 3.71 0.000585 0 2235 sendto 1.75 0.000276 0 7051 gettimeofday 0.95 0.000150 0 2582 write 0.50 0.000078 1 95 17 futex 0.34 0.000054 0 2582 poll 0.00 0.000000 0 7 brk ------ ----------- ----------- --------- --------- ---------------- 100.00 0.015755 19368 17 total