Don't reboot, debug!

Joshua Thijssen
September 18, 2015

Transcript

  1. Don't reboot, debug! A medic first aid course in debugging your server. Joshua Thijssen, @JayTaph
  5. Other causes:
    ➡ Apache / PHP / nginx/php-fpm
    ➡ Monitoring / backup
    ➡ Hanging cron jobs & runaway tools
    ➡ Connectivity / DNS problems
  8. ➡ Isolated user space.
    ➡ PID (process id) and state.
    ➡ Kernel “preempts”, or process yields.
    ➡ Multitasking.
  10. ➡ R: Running or runnable
    ➡ S: Interruptible sleep
    ➡ D: Uninterruptible sleep
    ➡ Z: Defunct process (zombie)
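The state letters above are also what Linux reports in `/proc/<pid>/stat`. As a minimal sketch, assuming the usual `pid (comm) state ...` layout of that file, the state can be pulled out like this. The sample line and the helper names `parse_state` and `describe` are made up for illustration.

```python
# The state letters listed on the slide and their meaning.
STATE_NAMES = {
    "R": "running or runnable",
    "S": "interruptible sleep",
    "D": "uninterruptible sleep",
    "Z": "defunct (zombie)",
}


def parse_state(stat_line):
    """Return the one-letter state from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces or
    # parentheses, so split after the LAST closing parenthesis.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def describe(stat_line):
    """Translate the state letter into the wording used on the slide."""
    state = parse_state(stat_line)
    return STATE_NAMES.get(state, "other state " + state)


# Hypothetical sample line, cut off after the first few fields.
sample = "1234 (php-fpm: pool www) S 1 1234 1234 0 -1"
print(describe(sample))
```

On a live Linux system the same functions could be fed the contents of `/proc/<pid>/stat` for any process id.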
  14. ➡ Most processes are sleeping.
    ➡ External processes (and the kernel) can “wake up” a process at any time by sending “signals”.
    ➡ Fire signals with “kill”.
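What "fire signals with kill" amounts to can be sketched in a few lines. This is an illustration for a POSIX system, not part of the talk: it starts a sleeping child process, sends it SIGTERM (what `kill <pid>` sends by default) and reports how the child ended. The function name `terminate_sleeper` is hypothetical.

```python
import signal
import subprocess
import sys


def terminate_sleeper():
    """Start a sleeping child, send it SIGTERM, and return its exit status."""
    # The child does nothing but sleep, like most processes on a server.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    # Same effect as running `kill -TERM <pid>` from a shell.
    child.send_signal(signal.SIGTERM)
    # On POSIX, a negative return code -N means "killed by signal N".
    return child.wait(timeout=10)


if __name__ == "__main__":
    print(terminate_sleeper())
```

The child never calls anything that checks for signals itself: the kernel delivers SIGTERM while the process sleeps and the default action ends it.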
  18. ➡ Uninterruptible means it won’t handle signals (directly), but waits on its task to finish (it must wake up by itself).
    ➡ Used for high-performance loops that need to focus (like I/O).
    ➡ Can still be preempted by the kernel.
  21. ➡ Zombies aren’t bad.
    ➡ It’s just bad programming or administration that creates zombies.
    ➡ But there shouldn’t be many.
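The "bad programming" in question is usually a parent that never collects the exit status of its children. A minimal sketch of the well-behaved version on a POSIX system follows; the function name `run_and_reap` is made up for this example.

```python
import os


def run_and_reap(exit_code):
    """Fork a child that exits at once, then reap it so no zombie is left."""
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately, skipping the parent's cleanup handlers.
        os._exit(exit_code)
    # Parent: until waitpid() is called, the finished child stays in state Z.
    _, status = os.waitpid(pid, 0)
    # Decode the exit code the child passed to _exit().
    return os.WEXITSTATUS(status)
```

Leaving out the `os.waitpid()` call in a long-running parent is what makes defunct entries pile up in the process list.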
  24. ➡ 1-, 5- and 15-minute averages.
    ➡ Calculated as the number of runnable processes (but has more sources nowadays).
    ➡ Depends on the number of CPUs!
  29. 14:57:22 up 35 days, 18:57, 1 user, load average: 1.52, 0.66, 0.27
    ➡ 1.52 average runnable processes in the last minute.
    ➡ 0.66 average in the last 5 minutes.
    ➡ 0.27 average in the last 15 minutes.
    ➡ Single CPU: 52% more than it can handle.
    ➡ Quad-core system: not doing very much.
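The single-CPU versus quad-core reading of that line is a simple division. A small sketch, using the `uptime` line from the slide as input; the helper names `parse_load_averages` and `load_per_cpu` are illustrative, and the parsing assumes a locale that prints a decimal point.

```python
def parse_load_averages(uptime_line):
    """Return the 1, 5 and 15 minute load averages from an `uptime` line."""
    values = uptime_line.split("load average:")[1].split(",")
    one, five, fifteen = (float(value) for value in values)
    return one, five, fifteen


def load_per_cpu(load, cpus):
    """Load relative to capacity: 1.0 means every CPU is exactly busy."""
    return load / cpus


line = "14:57:22 up 35 days, 18:57, 1 user, load average: 1.52, 0.66, 0.27"
one_minute = parse_load_averages(line)[0]
for cpus in (1, 4):
    print(cpus, "CPU(s):", round(load_per_cpu(one_minute, cpus), 2))
```

With one CPU the result is 1.52, so 52% more work is queued than the machine can run. With four CPUs it is 0.38, well under capacity.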
  30. Q: How much memory does this process use? This is a REALLY hard question to answer! It depends on many factors!
  33. ➡ Virtual memory (VIRT)
    ➡ Shared memory (SHR, SHRD)
    ➡ Resident memory (RES or RSS)
    ➡ Swapped memory (SWP, SWAP)
  35. ➡ Each process has 4GB of usable memory space (on a 32-bit system).
    ➡ Even if you have less memory installed.
    ➡ 1GB is reserved for the kernel.
  36. Allocating memory
    ➡ New phone book entries are created.
    ➡ VIRT will increase.
    ➡ Allocating memory != using memory.
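The difference between allocating and using can be observed from inside a process. The sketch below is Linux-only and rests on an assumption not stated in the slides: that the `VmSize` and `VmRSS` fields of `/proc/self/status` roughly correspond to the VIRT and RES columns. The helper names are made up for this example.

```python
import mmap


def status_kb(status_text, field):
    """Return a field such as VmSize or VmRSS (in kB) from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith(field + ":"):
            return int(line.split()[1])
    raise KeyError(field)


def read_own_status():
    """Linux-specific: the kernel's view of the current process."""
    with open("/proc/self/status") as handle:
        return handle.read()


def demo(size=64 * 1024 * 1024):
    """Print VmSize and VmRSS before allocating, after allocating, after use."""
    before = read_own_status()
    block = mmap.mmap(-1, size)  # allocate: only address space is set up
    allocated = read_own_status()
    for offset in range(0, size, mmap.PAGESIZE):
        block[offset] = 1  # use: write one byte into every page
    touched = read_own_status()
    for label, text in (("before", before), ("allocated", allocated), ("touched", touched)):
        print(label, status_kb(text, "VmSize"), status_kb(text, "VmRSS"))
    block.close()
```

Calling `demo()` on a Linux machine should show the virtual size growing right after the allocation, while the resident size only follows once the pages have been written to.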
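  38. <?php
    // pcntl_fork() returns the child's PID in the parent, 0 in the child, -1 on failure.
    $pid = pcntl_fork();
    if ($pid === -1) {
        die("Could not fork\n");
    } elseif ($pid) {
        echo "Hello, this is the parent process\n";
    } else {
        echo "Hello, this is the child process\n";
    }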
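  39. [Diagram: fork() => two virtual address spaces, one with pages A1, B1, C1 and one with pages A1`, B1`, C1`, next to a single set of physical pages A1, B1, C1.]
  40. [Diagram: same layout after fork(), but one virtual page B1` is now B2 and a physical page B2 has been added.]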
  43. $ free -m
                 total       used       free     shared    buffers     cached
    Mem:          3963       3500        462          0        722       1263
    -/+ buffers/cache:       1515       2448
    Swap:          400         20        379
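The `-/+ buffers/cache` row is derived from the `Mem:` row, which is why a machine with only 462 MB "free" still has plenty of room. A short sketch of that arithmetic with the numbers from the slide; the function name `buffers_cache_line` is illustrative.

```python
def buffers_cache_line(used, free, buffers, cached):
    """Recompute the '-/+ buffers/cache' row of the older `free` output."""
    # Memory held by applications, with disk buffers and cache subtracted.
    really_used = used - buffers - cached
    # Memory that can be handed out on demand, since cache can be dropped.
    really_free = free + buffers + cached
    return really_used, really_free


print(buffers_cache_line(used=3500, free=462, buffers=722, cached=1263))
```

This gives 1515 and 2447. The slide shows 2448 for the second value, a difference of 1 MB that is most likely due to `free` rounding each column separately.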
  46. With monitoring you have an excellent idea of:
    ➡ what is happening
    ➡ what happened
    ➡ what will likely happen
  47. ➡ syslog
    ➡ files
    ➡ mail
    ➡ slack / hipchat / irc
    ➡ logstash
    $ php composer.phar require monolog/monolog
  49. $ strace -ff -p <pid>
    ....
    socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 20
    fcntl(20, F_GETFL) = 0x2 (flags O_RDWR)
    fcntl(20, F_SETFL, O_RDWR|O_NONBLOCK) = 0
    connect(20, {sa_family=AF_INET, sin_port=htons(11211), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
    poll([{fd=20, events=POLLOUT}], 1, -1) = 1 ([{fd=20, revents=POLLOUT}])
    write(20, "get ez_client1/acls/"..., 44) = 44
    read(20, "END\r\n", 8196) = 5
    write(20, "get ez_client1/acl/g"..., 40) = 40
    read(20, "END\r\n", 8196) = 5
    write(20, "quit\r\n", 6) = 6
    shutdown(20, 2 /* send and receive */) = 0
    close(20) = 0
    mkdir("/tmp/smarty", 0777) = -1 EEXIST (File exists)
    chmod("/tmp/smarty", 0777) = 0
    mkdir("/tmp/smarty", 0777) = -1 EEXIST (File exists)
    chmod("/tmp/smarty", 0777) = 0
    access("/userdata/client1/user/templates/nl/block-right-last.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/userdata/client1/user/templates/block-right-last.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/userdata/client1/theme/templates/nl/block-right-last.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/userdata/client1/theme/templates/block-right-last.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/etc/noxlogic/root/themes/ezshopping/templates/nl/block-right-last.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/etc/noxlogic/root/themes/ezshopping/templates/block-right-last.tpl", F_OK) = -1 ENOENT (No such file or directory)
    mkdir("/tmp/smarty", 0777) = -1 EEXIST (File exists)
    chmod("/tmp/smarty", 0777) = 0
    access("/userdata/client1/user/templates/nl/block-right.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/userdata/client1/user/templates/block-right.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/userdata/client1/theme/templates/nl/block-right.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/userdata/client1/theme/templates/block-right.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/etc/noxlogic/root/themes/ezshopping/templates/nl/block-right.tpl", F_OK) = -1 ENOENT (No such file or directory)
    access("/etc/noxlogic/root/themes/ezshopping/templates/block-right.tpl", F_OK) = -1 ENOENT (No such file or directory)
    mkdir("/tmp/smarty", 0777) = -1 EEXIST (File exists)
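A trace like this gets long quickly, so it can help to count which calls keep failing. The following is a rough sketch, not a full strace parser: the regular expression only covers simple one-line entries like the ones above, and the name `count_failures` is made up for this example.

```python
import re
from collections import Counter

# name(args) = result [ERRNO ...]; only simple one-line entries are covered.
CALL = re.compile(r"^(\w+)\(.*\)\s+=\s+(-?\d+|0x[0-9a-f]+)(?:\s+(E[A-Z0-9]+))?")


def count_failures(trace_lines):
    """Count (syscall, errno) pairs for every call that reported an errno."""
    failures = Counter()
    for line in trace_lines:
        match = CALL.match(line.strip())
        if match and match.group(3):
            failures[(match.group(1), match.group(3))] += 1
    return failures


# A few lines copied from the trace on the slide.
SAMPLE = [
    'mkdir("/tmp/smarty", 0777) = -1 EEXIST (File exists)',
    'chmod("/tmp/smarty", 0777) = 0',
    'access("/userdata/client1/user/templates/block-right.tpl", F_OK) = -1 ENOENT (No such file or directory)',
    'access("/userdata/client1/theme/templates/block-right.tpl", F_OK) = -1 ENOENT (No such file or directory)',
]

for (syscall, errno_name), count in count_failures(SAMPLE).most_common():
    print(syscall, errno_name, count)
```

Run over the whole trace, a summary like this makes the repeated `access` misses on template paths and the repeated `mkdir` on an existing directory stand out. Note that an entry such as `connect` with `EINPROGRESS` is counted too, even though it is normal for a non-blocking socket.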
  50. $ strace ping www.google.com
    ....
    mprotect(0xb757f000, 4096, PROT_READ) = 0
    munmap(0xb76d8000, 44104) = 0
    stat64("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=59, ...}) = 0
    socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 3
    connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.178.4")}, 16) = 0
    gettimeofday({1347446161, 382120}, NULL) = 0
    poll([{fd=3, events=POLLOUT}], 1, 0) = 1 ([{fd=3, revents=POLLOUT}])
    send(3, "u\205\1\0\0\1\0\0\0\0\0\0\3www\6google\3com\0\0\1\0\1", 32, MSG_NOSIGNAL) = 32
    poll([{fd=3, events=POLLIN}], 1, 5000
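The 32 bytes passed to `send()` on port 53 are a plain DNS query, and the trace stops in the `poll()` that waits for the answer. As a sketch of how to read such a payload, the snippet below decodes the first question from the bytes shown by strace. It assumes the standard DNS wire format with an uncompressed name, and `decode_dns_question` is a hypothetical helper.

```python
import struct


def decode_dns_question(packet):
    """Return (query name, query type, query class) of the first question."""
    # The fixed DNS header is 12 bytes: id, flags and four 16-bit counters.
    _ident, _flags, questions, _an, _ns, _ar = struct.unpack("!6H", packet[:12])
    if questions < 1:
        raise ValueError("packet carries no question")
    labels = []
    offset = 12
    # The name is a sequence of length-prefixed labels ending with a zero byte.
    while packet[offset] != 0:
        length = packet[offset]
        labels.append(packet[offset + 1 : offset + 1 + length].decode("ascii"))
        offset += 1 + length
    qtype, qclass = struct.unpack("!2H", packet[offset + 1 : offset + 5])
    return ".".join(labels), qtype, qclass


# The payload exactly as strace printed it (octal escapes).
PAYLOAD = b"u\205\1\0\0\1\0\0\0\0\0\0\3www\6google\3com\0\0\1\0\1"
print(len(PAYLOAD), decode_dns_question(PAYLOAD))
```

The decoded question is `www.google.com` with type 1 (an A record) and class 1 (IN), which confirms that `ping` is waiting on name resolution rather than on the ICMP reply.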
  51. ➡ Unobtrusive probes inside the kernel.
    ➡ Scripts written in the D language.
    ➡ SUN / Solaris only (licensing).
  52. ➡ SystemTap
    ➡ “GPL” version of DTrace
    ➡ Awesome, but complex
    ➡ But you need / want debug info packages
  53. ➡ There are some “providers” in the PHP core (zend_dtrace.{c,h,d})
    ➡ file / line
    ➡ function entry / exit
    ➡ exception caught / thrown
  55. Find me on twitter: @jaytaph
    Find me for development and training: www.noxlogic.nl
    Find me on email: [email protected]
    Find me for blogs: www.adayinthelifeof.nl
    Thank You!
    https://joind.in/talk/view/15191