Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Peeking Into Ruby

Peeking Into Ruby

Your Ruby app is in production, but something isn’t quite right. It worked locally, it passed CI… why’s the running app acting weird?

If this sounds familiar, you’re in luck. Multiple tools exist for grappling with a running Ruby app. This talk will introduce a variety of tools and techniques for peeking into what your Ruby app is doing. From Ruby-level method tracing using rbtrace, all the way down to watching kernel syscalls with strace, you can see what your app is doing, and I’ll show you how.

Don’t let your production system go unwatched!

92e7389893670a1920a4fd98aec0d246?s=128

Jason R Clark

October 01, 2015
Tweet

Transcript

  1. https://flic.kr/p/8RePxD Peeking Into Ruby Tracing Running Code @jasonrclark http://github.com/jasonrclark/ peeking_into_ruby

  2. https://flic.kr/p/aVcE68 rbtrace 2 strace tcpdump

  3. https://flic.kr/p/8Kj2EC Dig Deeper. 3

  4. The Case of The Missing Config File. 4

  5. 5 https://flic.kr/p/pGP9hG

  6. 6 https://flic.kr/p/tRhQNP

  7. 7 https://flic.kr/p/shT4Lr

  8. 8 $ ./data_streamer ******************************** Loading data stream processor Requiring top_secret

    gem... ******************************** Hmmm, that's not how I should be configured -- Carl
  9. 9 https://flic.kr/p/eimYHo

  10. https://flic.kr/p/aVcE68 strace 10

  11. 11 $ strace ./data_streamer

  12. None
  13. ETOOMUCHOUTPUT 13

  14. execve("/bin/echo", ["echo", "boo"], [/* 14 vars */]) = brk(0) =

    0x20f0000 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFREG|0644, st_size=30820, ...}) = mmap(NULL, 30820, PROT_READ, MAP_PRIVATE, 3, 0) = close(3) = 0 access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOE read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 \34\2\0\0\0\0\0"..., 832) = 832 fstat(3, {st_mode=S_IFREG|0755, st_size=1729984, ...}) mmap(NULL, 3836448, PROT_READ|PROT_EXEC, MAP_PRIVATE|MA mprotect(0x7f676f4e1000, 2097152, PROT_NONE) = 0
  15. 15 open(“/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3

  16. 16 open(“/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3

  17. 17 open(“/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3

  18. 18 open(“/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3

  19. A Brief Interlude. 19

  20. 20 Kernel User

  21. 21 https://flic.kr/p/5J1zrp

  22. 22 https://flic.kr/p/4L4zvH

  23. 23 Kernel User syscalls

  24. exec fork exit 24 open read write close

  25. 25 $ man open 2

  26. OPEN(2) Linux Programmer's Manual NAME open, openat - open and

    possibly create a file SYNOPSIS #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> int open(const char *pathname, int flags); int open(const char *pathname, int flags); DESCRIPTION Given a pathname for a file, open() returns a file descriptor, a small, nonnegative integer for use i subsequent system calls (read(2), write(2), lseek( fcntl(2), etc.). The file descriptor returned by successful call will be the lowest-numbered file descriptor not currently open for the process.
  27. 27 Kernel User syscalls strace

  28. https://flic.kr/p/8Kj2EC Look For Landmarks. 28

  29. 29 $ ./data_streamer ******************************** Loading data stream processor Requiring top_secret

    gem... ******************************** Hmmm, that's not how I should be configured -- Carl
  30. 30 getcwd("/usr/src/app", 4096) = 13 lstat("/usr", {st_mode=S_IFDIR|0755, lstat("/usr/src", {st_mode=S_IFDIR|0755, lstat("/usr/src/app",

    {st_mode=S_IFDIR|0755, lstat("/usr/src/app/missing_config", {st_mo getcwd("/usr/src/app", 4096) = 13 write(1,"****************************"..., 50*****************************) = 50 write(1,"\n", 1) = 1 write(1,"Loading data stream processor", 29Loading data stream processor) = 29 fstat(7, {st_mode=S_IFREG|0644, st_size=127, close(7) = 0 getuid() = 0 geteuid() = 0 getgid() = 0 getegid() = 0
  31. 31 fstat(7, {st_mode=S_IFREG|0644, st_size=12 ioctl(7, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_-1 ENOTTY (Inappropriate

    read(7, "begin\n File.read(\”/usr/local/sl read(7, "", 8192) = 0 close(7) = 0 lstat("/usr", {st_mode=S_IFDIR|0755, st_size= lstat("/usr/src", {st_mode=S_IFDIR|0755, st write(1, "Hmmm, that's not how I"..., 51Hmmm, that's not how I...) = 51 write(1, "\n", 1) = 1 rt_sigaction(SIGINT, {SIG_IGN, [], SA_RESTO rt_sigaction(SIGINT, {SIG_DFL, [],SA_RESTO write(4, "!", 1) = 1 munmap(0x7fdaf91a4000, 1052672) = 0 exit_group(666) = ? +++ exited with 154 +++
  32. 32 open(“/usr/local/slurp.confi”, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or

    directory)
  33. 33 open(“/usr/local/slurp.confi”, O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or

    directory)
  34. slurp.confi? 34

  35. Resources 35 Julia Evans: blog and zine http://jvns.ca/blog/categories/strace/ James Golick:

    How to Debug Anything https://www.youtube.com/watch?v=VV7b7fs4VI8
  36. The Case of The Unused Method 36

  37. … Or Was It? 37

  38. 38 https://flic.kr/p/dmb7sW

  39. 39 https://flic.kr/p/a4VW2z

  40. 40 add_method_tracer

  41. 41 https://flic.kr/p/3LP9cm

  42. https://flic.kr/p/8Kj2EC Proper Preparation. 42

  43. https://flic.kr/p/aVcE68 rbtrace 43

  44. 44 Ruby VM Our Code rbtrace

  45. 45 $ rbtrace --methods puts --exec ./script.rb puts puts ...

  46. 46 # code.rb require ‘rbtrace’ $ ruby code.rb & $

    rbtrace -p PID --methods puts puts puts
  47. 47 $ rbtrace -p PID --methods Custom#goo > custom_goo.log

  48. 48 ls -la custom_goo.log -rw-r--r-- 1 0 Sep 30 22:12

    goo.log
  49. The Case of The Occasional Lag 49

  50. 50 Heartbeats Received background job starts

  51. https://flic.kr/p/aVcE68 rbtrace 51

  52. 52 $ rbtrace -p PID --gc -t 05:54:28.10610 Running background

    job 05:54:28.11113 garbage_collect 05:54:28.56664 garbage_collect 05:54:33.72790 garbage_collect
  53. 53

  54. Resources 54 Aman Gupta: github https://github.com/tmm1/rbtrace Justin Weiss: How to

    Debug Ruby Performance Problems in Production http://www.justinweiss.com/blog/2015/04/06/how-to- debug-ruby-performance-problems-in-production/
  55. The Case of The Delicate Negotiations 55

  56. 56 https://flic.kr/p/d8Kvyo

  57. 57 url = "https://www.google.com/" uri = URI.parse(url) response = Net::HTTP.get_response(uri)

    ... response = Net::HTTP.get_response(uri)
  58. 58 url = "https://www.google.com/" uri = URI.parse(url) conn = Net::HTTP.new(uri.host,

    uri.port) conn.use_ssl = true conn.keep_alive_timeout = 60 conn.start request = Net::HTTP::Get.new('/') response = conn.request(request) ... response = conn.request(request)
  59. 59 url = "https://www.google.com/" uri = URI.parse(url) conn = Net::HTTP.new(uri.host,

    uri.port) conn.use_ssl = true conn.keep_alive_timeout = 60 conn.start request = Net::HTTP::Get.new('/') response = conn.request(request) ... response = conn.request(request)
  60. 60 url = "https://www.google.com/" uri = URI.parse(url) conn = Net::HTTP.new(uri.host,

    uri.port) conn.use_ssl = true conn.keep_alive_timeout = 60 conn.start request = Net::HTTP::Get.new('/') response = conn.request(request) ... response = conn.request(request)
  61. https://flic.kr/p/aVcE68 tcpdump 61

  62. 62 Network Stack Our Code tcpdump

  63. https://flic.kr/p/8Kj2EC Find Boundaries. 63

  64. 64 Network Stack Our Code Kernel

  65. 65 $ tcpdump

  66. 66 17:29:36.569760 IP a6a0111ae486.38923 > nuq05s02-in- f16.1e100.net.https: Flags [S], seq

    290609768, win 29200, options [mss 1460,sackOK,TS val 486029 ecr 0,nop,wscale 7], length 0 17:29:36.595921 IP nuq05s02-in-f16.1e100.net.https > a6a0111ae486.38923: Flags [S.], seq 651712001, ack 290609769, win 65535, options [mss 1460], length 0 17:29:36.595996 IP a6a0111ae486.38923 > nuq05s02-in- f16.1e100.net.https: Flags [.], ack 1, win 29200, length 0 17:29:36.596658 IP a6a0111ae486.38923 > nuq05s02-in- f16.1e100.net.https: Flags [P.], seq 1:251, ack 1, win 29200, length 250 17:29:36.596937 IP nuq05s02-in-f16.1e100.net.https > a6a0111ae486.38923: Flags [.], ack 251, win 65535, length 0
  67. ETOOMUCHOUTPUT 67

  68. 68 17:29:36.569760 IP a6a0111ae486.38923 > nuq05s02-in- f16.1e100.net.https: Flags [S], seq

    290609768, win 29200, options [mss 1460,sackOK,TS val 486029 ecr 0,nop,wscale 7], length 0 17:29:36.595921 IP nuq05s02-in-f16.1e100.net.https > a6a0111ae486.38923: Flags [S.], seq 651712001, ack 290609769, win 65535, options [mss 1460], length 0 17:29:36.595996 IP a6a0111ae486.38923 > nuq05s02-in- f16.1e100.net.https: Flags [.], ack 1, win 29200, length 0 17:29:36.596658 IP a6a0111ae486.38923 > nuq05s02-in- f16.1e100.net.https: Flags [P.], seq 1:251, ack 1, win 29200, length 250 17:29:36.596937 IP nuq05s02-in-f16.1e100.net.https > a6a0111ae486.38923: Flags [.], ack 251, win 65535, length 0
  69. 69 17:29:36.569760 IP a6a0111ae486.38923 > nuq05s02-in-f16.1e100.net.https: Flags [S], seq 290609768,

    win 29200, options [mss 1460,sackOK,TS val 486029 ecr 0,nop,wscale 7], length 0
  70. 70 17:29:36.569760 IP a6a0111ae486.38923 > nuq05s02-in-f16.1e100.net.https: Flags [S], seq 290609768,

    win 29200, options [mss 1460,sackOK,TS val 486029 ecr 0,nop,wscale 7], length 0
  71. 71 17:29:36.569760 IP a6a0111ae486.38923 > nuq05s02-in-f16.1e100.net.https: Flags [S], seq 290609768,

    win 29200, options [mss 1460,sackOK,TS val 486029 ecr 0,nop,wscale 7], length 0 a6a0111ae486.38923 == me nuq05s02-in-f16.1e100.net.https == google
  72. 72 17:29:36.569760 IP a6a0111ae486.38923 > nuq05s02-in-f16.1e100.net.https: Flags [S], seq 290609768,

    win 29200, options [mss 1460,sackOK,TS val 486029 ecr 0,nop,wscale 7], length 0
  73. 73 Flags [S] - SYN (Start Connection) [.] - No

    Flag Set [P] - PSH (Push Data) [F] - FIN (Finish Connection)
  74. 74 17:29:36.569760 IP a6a0111ae486.38923 > nuq05s02-in-f16.1e100.net.https: Flags [S], seq 290609768,

    win 29200, options [mss 1460,sackOK,TS val 486029 ecr 0,nop,wscale 7], length 0
  75. 75 url = "https://www.google.com/" uri = URI.parse(url) response = Net::HTTP.get_response(uri)

  76. 76 17:29:36 IP me > google: Flags [S], seq 290609768,

    win 17:29:36 IP google > me: Flags [S.], seq 651712001, ac 17:29:36 IP me > google: Flags [.], ack 1, win 29200, 17:29:36 IP me > google: Flags [P.], seq 1:251, ack 1, 17:29:36 IP google > me: Flags [.], ack 251, win 65535 17:29:36 IP google > me: Flags [P.], seq 1:1289, ack 2 17:29:36 IP me > google: Flags [.], ack 1289, win 3091 17:29:36 IP google > me: Flags [P.], seq 1289:2577, ac … 24 times 17:29:36 IP google > me: Flags [.], seq 10110:11530, a 17:29:36 IP me > google: Flags [.], ack 11530, win 568 17:29:36 IP google > me: Flags [P.], seq 11530:12169, 17:29:36 IP me > google: Flags [P.], seq 535:566, ack 17:29:36 IP me > google: Flags [F.], seq 566, ack 1216 17:29:36 IP google > me: Flags [.], ack 566, win 65535 17:29:36 IP google > me: Flags [.], ack 567, win 65535 17:29:36 IP google > me: Flags [F.], seq 12169, ack 56 17:29:36 IP me > google: Flags [.], ack 12170, win 596
  77. 77 17:29:36 IP me > google: Flags [S], seq 290609768,

    win 17:29:36 IP google > me: Flags [S.], seq 651712001, ac 17:29:36 IP me > google: Flags [.], ack 1, win 29200, 17:29:36 IP me > google: Flags [P.], seq 1:251, ack 1, 17:29:36 IP google > me: Flags [.], ack 251, win 65535 17:29:36 IP google > me: Flags [P.], seq 1:1289, ack 2 17:29:36 IP me > google: Flags [.], ack 1289, win 3091 17:29:36 IP google > me: Flags [P.], seq 1289:2577, ac … 24 times 17:29:36 IP google > me: Flags [.], seq 10110:11530, a 17:29:36 IP me > google: Flags [.], ack 11530, win 568 17:29:36 IP google > me: Flags [P.], seq 11530:12169, 17:29:36 IP me > google: Flags [P.], seq 535:566, ack 17:29:36 IP me > google: Flags [F.], seq 566, ack 1216 17:29:36 IP google > me: Flags [.], ack 566, win 65535 17:29:36 IP google > me: Flags [.], ack 567, win 65535 17:29:36 IP google > me: Flags [F.], seq 12169, ack 56 17:29:36 IP me > google: Flags [.], ack 12170, win 596
  78. 78 17:29:36 IP me > google: Flags [S], seq 290609768,

    win 17:29:36 IP google > me: Flags [S.], seq 651712001, ac 17:29:36 IP me > google: Flags [.], ack 1, win 29200, 17:29:36 IP me > google: Flags [P.], seq 1:251, ack 1, 17:29:36 IP google > me: Flags [.], ack 251, win 65535 17:29:36 IP google > me: Flags [P.], seq 1:1289, ack 2 17:29:36 IP me > google: Flags [.], ack 1289, win 3091 17:29:36 IP google > me: Flags [P.], seq 1289:2577, ac … 24 times 17:29:36 IP google > me: Flags [.], seq 10110:11530, a 17:29:36 IP me > google: Flags [.], ack 11530, win 568 17:29:36 IP google > me: Flags [P.], seq 11530:12169, 17:29:36 IP me > google: Flags [P.], seq 535:566, ack 17:29:36 IP me > google: Flags [F.], seq 566, ack 1216 17:29:36 IP google > me: Flags [.], ack 566, win 65535 17:29:36 IP google > me: Flags [.], ack 567, win 65535 17:29:36 IP google > me: Flags [F.], seq 12169, ack 56 17:29:36 IP me > google: Flags [.], ack 12170, win 596
  79. 79 17:29:36 IP me > google: Flags [S], seq 290609768,

    win 17:29:36 IP google > me: Flags [S.], seq 651712001, ac 17:29:36 IP me > google: Flags [.], ack 1, win 29200, 17:29:36 IP me > google: Flags [P.], seq 1:251, ack 1, 17:29:36 IP google > me: Flags [.], ack 251, win 65535 17:29:36 IP google > me: Flags [P.], seq 1:1289, ack 2 17:29:36 IP me > google: Flags [.], ack 1289, win 3091 17:29:36 IP google > me: Flags [P.], seq 1289:2577, ac … 24 times 17:29:36 IP google > me: Flags [.], seq 10110:11530, a 17:29:36 IP me > google: Flags [.], ack 11530, win 568 17:29:36 IP google > me: Flags [P.], seq 11530:12169, 17:29:36 IP me > google: Flags [P.], seq 535:566, ack 17:29:36 IP me > google: Flags [F.], seq 566, ack 1216 17:29:36 IP google > me: Flags [.], ack 566, win 65535 17:29:36 IP google > me: Flags [.], ack 567, win 65535 17:29:36 IP google > me: Flags [F.], seq 12169, ack 56 17:29:36 IP me > google: Flags [.], ack 12170, win 596
  80. 80 url = "https://www.google.com/" uri = URI.parse(url) conn = Net::HTTP.new(uri.host,

    uri.port) conn.use_ssl = true conn.keep_alive_timeout = 60 conn.start request = Net::HTTP::Get.new('/') response = conn.request(request) ... response = conn.request(request)
  81. 81 17:58:12 IP me > google: Flags [S], seq 290609768,

    win 17:58:12 IP google > me: Flags [S.], seq 651712001, ac 17:58:12 IP me > google: Flags [.], ack 1, win 29200, 17:58:12 IP me > google: Flags [P.], seq 1:251, ack 1, 17:58:12 IP google > me: Flags [.], ack 251, win 65535 17:58:12 IP google > me: Flags [P.], seq 1:1289, ack 2 17:58:12 IP me > google: Flags [.], ack 1289, win 3091 17:58:12 IP google > me: Flags [P.], seq 1289:2577, ac … 24 times 17:58:12 IP google > me: Flags [.], seq 10110:11530, a 17:58:12 IP me > google: Flags [.], ack 11530, win 568 17:58:12 IP google > me: Flags [P.], seq 11530:12169, 17:58:12 IP me > google: Flags [P.], seq 535:566, ack 17:58:12 IP me > google: Flags [.], ack 12170, win 596 No Flags [F]!
  82. 82 17:48:18 IP me > google: Flags [P.], seq 693:851,

    ack 17:48:18 IP google > me: Flags [.], ack 851, win 65535 17:48:18 IP google > me: Flags [P.], seq 20688:21976, 17:48:18 IP me > google: Flags [.], ack 21976, win 653 17:48:18 IP google > me: Flags [.], seq 21976:23396, a 17:48:18 IP me > google: Flags [.], ack 23396, win 653 17:48:18 IP google > me: Flags [.], seq 23396:24816, a 17:48:18 IP me > google: Flags [.], ack 24816, win 653 17:48:18 IP google > me: Flags [P.], seq 24816:25840, 17:48:18 IP me > google: Flags [.], ack 25840, win 653 17:48:18 IP google > me: Flags [.], seq 25840:27260, a 17:48:18 IP me > google: Flags [.], ack 27260, win 653 17:48:18 IP google > me: Flags [.], seq 27260:28680, a 17:48:18 IP me > google: Flags [.], ack 28680, win 653 17:48:18 IP google > me: Flags [P.], seq 28680:29109, 17:48:18 IP me > google: Flags [.], ack 29109, win 653
  83. 83 17:48:18 IP me > google: Flags [P.], seq 693:851,

    ack 17:48:18 IP google > me: Flags [.], ack 851, win 65535 17:48:18 IP google > me: Flags [P.], seq 20688:21976, 17:48:18 IP me > google: Flags [.], ack 21976, win 653 17:48:18 IP google > me: Flags [.], seq 21976:23396, a 17:48:18 IP me > google: Flags [.], ack 23396, win 653 17:48:18 IP google > me: Flags [.], seq 23396:24816, a 17:48:18 IP me > google: Flags [.], ack 24816, win 653 17:48:18 IP google > me: Flags [P.], seq 24816:25840, 17:48:18 IP me > google: Flags [.], ack 25840, win 653 17:48:18 IP google > me: Flags [.], seq 25840:27260, a 17:48:18 IP me > google: Flags [.], ack 27260, win 653 17:48:18 IP google > me: Flags [.], seq 27260:28680, a 17:48:18 IP me > google: Flags [.], ack 28680, win 653 17:48:18 IP google > me: Flags [P.], seq 28680:29109, 17:48:18 IP me > google: Flags [.], ack 29109, win 653 ~15 packets No New Connection
  84. 84

  85. Resources 85 Benjamin Cane: A Quick and Practical Reference for

    tcpdump http://bencane.com/2014/10/13/quick-and-practical- reference-for-tcpdump/ Jonan Scheffler: Lifecycle of a Web Request https://www.youtube.com/watch?v=U-4md8lYosk
  86. 86 https://flic.kr/p/ahkHU8

  87. https://flic.kr/p/aVcE68 rbtrace 87 strace tcpdump

  88. https://flic.kr/p/8Kj2EC Dig Deeper. 88

  89. https://flic.kr/p/8Kj2EC Look For Landmarks. 89

  90. https://flic.kr/p/8Kj2EC Proper Preparations. 90

  91. https://flic.kr/p/8Kj2EC Find Boundaries. 91

  92. Case Closed! 92 https://github.com/jasonrclark/ peeking_into_ruby @jasonrclark