$30 off During Our Annual Pro Sale. View Details »

strace'ing everything - a.k.a. why is my app so slow?

Hany Fahim
November 15, 2012

strace'ing everything - a.k.a. why is my app so slow?

strace is one of those tools that you probably never heard of, but can't live without it once you start using it.

Hany Fahim

November 15, 2012
Tweet

More Decks by Hany Fahim

Other Decks in Technology

Transcript

  1. strace’ing everything
    a.k.a. why is my app so slow?
    @vmfarms
    Hany Fahim
    November 13, 2012

    View Slide

  2. what is strace?
    is a diagnostic and debugging tool to Linux.
    monitors the user/kernel boundary and intercepts any
    system calls.
    ---------------------- | -----------------------------------
    | Application | | | Kernel |
    ---------------------- | -----------------------------------
    f = open('fileA', 'r') ------|----> open("fileA", O_RDONLY) = 3
    f.read() <-----|----- read(3, "file stuff"..., 40) = 40
    f.close() ------|----> close(3)
    |
    [ strace ]

    View Slide

  3. why use strace?
    readily available on all Linux distros.
    quick and dirty way to “capture” a running process to see
    what it’s doing.
    does not require any code changes, debugging flags, or even
    a restart.

    View Slide

  4. what to use it for?
    identifying current problems - i.e. production problems.
    helps you determine if it’s your app or the system.
    find slow responding services (network, DNS, IO).
    inspect SQL queries being issued.
    permission issues (files, folders, etc...).
    find which file/module a certain app chokes on.

    View Slide

  5. what it’s not good for?
    profiling your code - there are better tools for this.
    finding coding hot spots.

    View Slide

  6. pitfalls?
    it impacts performance. use with caution.
    on a busy app, output can be enormous - use an output file.

    View Slide

  7. how to use it? (basic)
    $ strace ls
    execve("/bin/ls", ["ls"], [/* 39 vars */]) = 0
    brk(0) = 0x192f000
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f362864f000
    access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
    open("/etc/ld.so.cache", O_RDONLY) = 3
    fstat(3, {st_mode=S_IFREG|0644, st_size=51877, ...}) = 0
    mmap(NULL, 51877, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f3628642000
    close(3) = 0
    ...
    statfs("/selinux", {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=4875805,
    f_bfree=1151405, f_bavail=905070, f_files=1233024, f_ffree=784428, f_fsid={-1347478562,
    1449920506},
    ...
    open("/proc/filesystems", O_RDONLY) = 3
    ...
    write(1, "10gen-mms-agent.tar.gz\t\t fr"..., 17810gen-mms-agent.tar.gz
    freeipmi-1.2.1-1.fc18.src.rpm nginx-1.0.15-1.el6.ngx.src.rpm pdf-renderer-0-0.12.128svn.
    20110310.fc19.src.rpm rubygem-passenger-3.0.11-9.src.rpm
    ) = 178
    ...
    close(1) = 0
    close(2) = 0
    exit_group(0) = ?
    133 lines...

    View Slide

  8. how to use it? (specific calls)
    $ strace -e open ls
    open("/etc/ld.so.cache", O_RDONLY) = 3
    open("/lib64/libselinux.so.1", O_RDONLY) = 3
    open("/lib64/librt.so.1", O_RDONLY) = 3
    open("/lib64/libcap.so.2", O_RDONLY) = 3
    open("/lib64/libacl.so.1", O_RDONLY) = 3
    open("/lib64/libc.so.6", O_RDONLY) = 3
    open("/lib64/libdl.so.2", O_RDONLY) = 3
    open("/lib64/libpthread.so.0", O_RDONLY) = 3
    open("/lib64/libattr.so.1", O_RDONLY) = 3
    open("/proc/filesystems", O_RDONLY) = 3
    open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
    open(".", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3

    View Slide

  9. how to use it? (multiple calls)
    $ strace -e trace=open,read ls
    open("/etc/ld.so.cache", O_RDONLY) = 3
    open("/lib64/libselinux.so.1", O_RDONLY) = 3
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 X\240\2229\0\0\0"..., 832) = 832
    open("/lib64/librt.so.1", O_RDONLY) = 3
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@!\300\3222\0\0\0"..., 832) = 832
    open("/lib64/libcap.so.2", O_RDONLY) = 3
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000\23\200\2001\0\0\0"..., 832) = 832
    open("/lib64/libacl.so.1", O_RDONLY) = 3
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000\35\0\1771\0\0\0"..., 832) = 832
    open("/lib64/libc.so.6", O_RDONLY) = 3
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360\355\301{1\0\0\0"..., 832) = 832
    open("/lib64/libdl.so.2", O_RDONLY) = 3
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\340\r\340\2249\0\0\0"..., 832) = 832
    open("/lib64/libpthread.so.0", O_RDONLY) = 3
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 \\@|1\0\0\0"..., 832) = 832
    open("/lib64/libattr.so.1", O_RDONLY) = 3
    read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\23@\2001\0\0\0"..., 832) = 832
    open("/proc/filesystems", O_RDONLY) = 3
    read(3, "nodev\tsysfs\nnodev\trootfs\nnodev\tb"..., 1024) = 323
    read(3, "", 1024) = 0
    open("/usr/lib/locale/locale-archive", O_RDONLY) = 3
    open(".", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3

    View Slide

  10. how to use it? (output file)
    $ strace -o output.txt ls
    10gen-mms-agent.tar.gz freeipmi-1.2.1-1.fc18.src.rpm
    nginx-1.0.15-1.el6.ngx.src.rpm ovaldi-5.9.1-5.fc18.src.rpm rpmbuild
    $ cat output.txt
    execve("/bin/ls", ["ls"], [/* 39 vars */]) = 0
    brk(0) = 0x22a0000
    mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7ffa13573000
    access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
    open("/etc/ld.so.cache", O_RDONLY) = 3
    ...

    View Slide

  11. how to use it? (perms)
    # cat test_module/__init__.py
    def testing():
    print "success!"
    # ls -la test_module
    total 16
    drwx------ 2 root root 4096 Nov 14 14:52 .
    drwx------. 4 vmfarms vmfarms 4096 Nov 14 14:53 ..
    -rw------- 1 root root 34 Nov 14 14:52 __init__.py
    -rw------- 1 root root 245 Nov 14 14:52 __init__.pyc
    $ python -c 'import test_module'
    Traceback (most recent call last):
    File "", line 1, in
    ImportError: No module named test_module
    $ strace -e stat python -c 'import test_module'
    ...
    stat("test_module", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
    stat("test_module/__init__.py", 0x7fff5d9ca580) = -1 EACCES (Permission denied)
    stat("test_module/__init__.pyc", 0x7fff5d9ca580) = -1 EACCES (Permission denied)
    ...

    View Slide

  12. how to use it? (tracing pids)
    $ strace -p -o output.txt
    Process attached - interrupt to quit
    ^CProcess detached
    $ less output.txt
    ...
    write(9, "\324\0\0\0\3SELECT (1) AS `a` FROM `fake_table1` WHERE (`fake_table1`.`id` = 36
    AND `fake_table1`.`other_id` = 199 AND `fake_table1`.`date` = '2012-10-18' ) LIMIT 1", 216)
    = 216
    read(9, "\1\0\0\1\1\27\0\0\2\3def\0\0\0\1a\0\f?", 16384) = 50
    gettimeofday({1352845540, 332140}, NULL) = 0
    poll([{fd=9, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
    write(9, "\1\0\0\0\16", 5) = 5
    read(9, "\7\0\0\1\0\0\0\1\0\0\0", 16384) = 11
    poll([{fd=9, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
    write(9, "\241\0\0\0\3SELECT (1) AS `a` FROM `fake_table1` WHERE (`fake_table1`.`id` = 36
    AND `fake_table1`.`other_id` = 199 ) LIMIT 1", 165) = 165
    read(9, "\1\0\0\1\1\27\0\0\2\3def\0\0\0\1a\0\f?", 16384) = 56
    poll([{fd=9, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
    write(9, "\1\0\0\0\16", 5) = 5
    read(9, "\7\0\0\1\0\0\0\1\0\0\0", 16384) = 11
    ...
    $ lsof -p
    COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
    ...
    python deploy 9u IPv4 14675804 0t0 TCP :45064->:mysql (ESTABLISHED)
    ...

    View Slide

  13. how to use it? (all together)
    $ strace -s 5000 -p -t -e trace=write,read -o output2.txt
    $ less output2.txt
    ...
    17:39:41 write(9, "\324\0\0\0\3SELECT (1) AS `a` FROM `fake_table1` WHERE (`fake_table1`.`id`
    = 36 AND `fake_table1`.`other_id` = 199 AND `fake_table1`.`date` = '2012-10-23' ) LIMIT 1",
    216) = 216
    17:39:45 read(9, "\1\0\0\1\1\27\0\0\2\3def\0\0\0\1a\0\f?
    \0\1\0\0\0\10\201\0\0\0\0\5\0\0\3\376\0\0\1\0\5\0\0\4\376\0\0\1\0", 16384) = 50
    17:39:45 write(9, "\1\0\0\0\16", 5) = 5
    17:39:45 read(9, "\7\0\0\1\0\0\0\1\0\0\0", 16384) = 11
    17:39:45 write(9, "\241\0\0\0\3SELECT (1) AS `a` FROM `fake_table1` WHERE (`fake_table1`.`id`
    = 36 AND `fake_table1`.`other_id` = 199 ) LIMIT 1", 165) = 165
    17:39:49 read(9, "\1\0\0\1\1\27\0\0\2\3def\0\0\0\1a\0\f?
    \0\1\0\0\0\10\201\0\0\0\0\5\0\0\3\376\0\0\1\0\2\0\0\4\0011\5\0\0\5\376\0\0\1\0", 16384) = 56

    View Slide

  14. how to use it? (fork me)
    $ strace -f -p -o output.txt
    Process attached with 18 threads - interrupt to quit
    Process attached
    Process attached
    ...
    $ less output.txt
    31673 gettimeofday({1352847681, 719431}, NULL) = 0
    31673 setsockopt(40, SOL_SOCKET, SO_RCVTIMEO, "\200p\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
    31673 read(40,
    31674 <... read resumed> "A\0\0\1", 4) = 4
    31674 read(42,
    31673 <... read resumed> "H\0\0\0", 4) = 4
    31674 <... read resumed> "\205\242\2\0\0\0\0@
    \10\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65) = 65
    31673 read(40,
    31674 write(42, "\7\0\0\2\0\0\0\2\0\0\0", 11
    31673 <... read resumed> "\3SELECT option_name, option_valu"..., 72) = 72
    31674 <... write resumed> ) = 11
    31673 setsockopt(40, SOL_SOCKET, SO_RCVTIMEO, "\36\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16

    31674 setsockopt(42, SOL_SOCKET, SO_RCVTIMEO, "\36\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
    31673 <... setsockopt resumed> ) = 0
    31674 setsockopt(42, SOL_SOCKET, SO_SNDTIMEO, "<\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0", 16

    31673 gettimeofday(
    31674 <... setsockopt resumed> ) = 0
    31673 <... gettimeofday resumed> {1352847681, 720320}, NULL) = 0
    31674 gettimeofday({1352847681, 720417}, NULL) = 0

    View Slide

  15. how to use it? (fork me again)
    $ strace -ff -p -o output.txt
    Process attached with 18 threads - interrupt to quit
    Process attached
    Process attached
    ...
    $ ls output*
    output. output. output.
    $ less output.
    write(40, "ed=$matches[2]\";s:32:\"tag/([^/]+"..., 8584) = 8584
    pread(54, "\v\0\241\0\0\0\0\0\t'T;i:1;s:86", 20, 444276) = 20
    pread(54, ":\"www.example.com/products-"..., 152, 444296) = 152
    pread(54, "\v\0029\0\0\0\0\0\7\372@orthern-g", 20, 524172) = 20
    pread(54, "love/.needs-rebuild\";i:1;s:92:\"w"..., 560, 524192) = 560
    pread(54, "\v\1\31\0\0\0\0\0\th\2007:\"www.ex", 20, 522816) = 20
    pread(54, "ample.com/people/book/.need"..., 272, 522836) = 272
    pread(54, "\v\0\235\0\0\0\0\0\10\332xild\";i:1;", 20, 68168) = 20
    pread(54, "s:59:\"www.example.com/colle"..., 148, 68188) = 148
    pread(54, "\v\0\31\0\0\0\0\0\10((ions/pola", 20, 449192) = 20
    pread(54, "r-bears-internat", 16, 449212) = 16
    pread(54, "\v\1\315\0\0\0\0\0\0-\374ional/.ne", 20, 534568) = 20
    pread(54, "eds-rebuild\";i:1;s:41:\"www.examp"..., 452, 534588) = 452
    pread(54, "\v\0\335\0\0\0\0\0\tj\34-page/pro", 20, 11772) = 20
    pread(54, "ducts/tremblant-jacket/\";i:1;s:6"..., 212, 11792) = 212

    View Slide

  16. how to use it? (stats)
    $ strace -c -p
    Process attached - interrupt to quit
    ^CProcess detached
    % time seconds usecs/call calls errors syscall
    ------ ----------- ----------- --------- --------- ----------------
    51.71 0.008147 4 2235 recvfrom
    41.03 0.006465 3 2581 read
    3.71 0.000585 0 2235 sendto
    1.75 0.000276 0 7051 gettimeofday
    0.95 0.000150 0 2582 write
    0.50 0.000078 1 95 17 futex
    0.34 0.000054 0 2582 poll
    0.00 0.000000 0 7 brk
    ------ ----------- ----------- --------- --------- ----------------
    100.00 0.015755 19368 17 total

    View Slide

  17. questions?

    View Slide