Are you out of memory, or have plenty to spare?

Joshua Thijssen
November 14, 2015

Transcript

  1. Out of memory or plenty to spare?
    Joshua Thijssen
    jaytaph
    The fine details of reading memory consumption

  2. Disclaimers:
    No PHP (but we will still talk about it).
    Pretty advanced stuff.
    Simplified.

  3. Q: How much memory is our server using?

  4. $ free -m
                 total       used       free     shared    buffers     cached
    Mem:          3963       3500        462          0        722       1263
    -/+ buffers/cache:       1515       2448
    Swap:          400         20        379
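
The `-/+ buffers/cache` row of `free -m` is plain arithmetic on the `Mem:` row: buffers and cache are subtracted from `used` and added to `free`. A minimal sketch in C, assuming the column layout shown in the `free -m` output; the struct and function names are made up for illustration and are not part of any tool:

```c
#include <assert.h>

/* One "Mem:" row of `free -m`, in megabytes (hypothetical helper type). */
struct mem_row {
    long total_mb, used_mb, free_mb, shared_mb, buffers_mb, cached_mb;
};

/* Memory used by applications: "used" minus what the kernel merely caches. */
long used_without_cache(const struct mem_row *m) {
    assert(m->used_mb >= m->buffers_mb + m->cached_mb);
    return m->used_mb - m->buffers_mb - m->cached_mb;
}

/* Memory available to applications: "free" plus reclaimable buffers/cache. */
long free_with_cache(const struct mem_row *m) {
    return m->free_mb + m->buffers_mb + m->cached_mb;
}
```

With the numbers from the slide (used 3500, free 462, buffers 722, cached 1263) this gives 1515 MB used and 2447 MB free. The 1 MB difference from the 2448 printed by `free` is presumably a rounding effect, since `free` converts each column to megabytes separately.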

  8. [Diagram: a memory bar with the segments Free, Used, and Buffers / Cache; the build-up adds a second Free label.]

  9. slabtop
    Active / Total Objects (% used)    : 2187767 / 2283870 (95.8%)
    Active / Total Slabs (% used)      : 261417 / 261421 (100.0%)
    Active / Total Caches (% used)     : 114 / 192 (59.4%)
    Active / Total Size (% used)       : 1013948.21K / 1024061.72K (99.0%)
    Minimum / Average / Maximum Object : 0.02K / 0.45K / 4096.00K

      OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
    825808 825378  99%    0.98K 206452        4    825808K ext4_inode_cache
    522921 512677  98%    0.19K  24901       21     99604K dentry
    394290 364506  92%    0.10K  10110       39     40440K buffer_head
    166914 142785  85%    0.04K   1686       99      6744K ext4_extent_status
    143756 142975  99%    0.05K   1732       83      6928K jbd2_inode
     63700  62887  98%    0.56K   9100        7     36400K radix_tree_node
     42273  29552  69%    0.06K    671       63      2684K kmalloc-64
     23870  19345  81%    0.12K    770       31      3080K kmalloc-96
     16884  14623  86%    0.06K    268       63      1072K anon_vma_chain
     12144  12024  99%    0.12K    368       33      1472K kernfs_node_cache
     11424  10582  92%    0.19K    544       21      2176K vm_area_struct
     10044   7897  78%    0.03K     81      124       324K kmalloc-32
      9331   9135  97%    0.56K   1333        7      5332K inode_cache
      7434   6552  88%    0.06K    118       63       472K anon_vma
      6528   6528 100%    0.62K   1088        6      4352K proc_inode_cache
      3312   2330  70%    0.25K    207       16       828K filp
      2490   2458  98%    0.05K     30       83       120K ftrace_event_field
      2294   2207  96%    0.12K     74       31       296K kmalloc-128
      1596   1440  90%    0.19K     76       21       304K kmalloc-192
      1568   1476  94%    0.07K     28       56       112K Acpi-Operand
      1112   1068  96%    1.00K    278        4      1112K kmalloc-1024
      1104   1076  97%    0.09K     24       46        96K ftrace_event_file
       882    518  58%    0.19K     42       21       168K cred_jar
       880    644  73%    0.25K     55       16       220K skbuff_head_cache
       828    802  96%    0.65K    138        6       552K shmem_inode_cache
       648    552  85%    0.11K     18       36        72K jbd2_journal_head
       608    523  86%    0.25K     38       16       152K kmalloc-256

  10. Q: How much memory is our application using?

  13. Processes
    ➡ Completely isolated from each other.
    ➡ Act like they own the place.
    ➡ Must ask permission for pretty much everything.

  14. [Diagram: the operating system (kernel) sits between Process 1 to Process 4 and the hardware: Network, Disk I/O, Screen.]

  17. ➡ Every process can access up to 4 GB of memory (on a 32-bit machine).
    ➡ Even if your computer does not have 4 GB of memory.
    ➡ Even if your computer has MORE than 4 GB of memory.

  18. ➡ On 64-bit machines:
    ➡ Theoretically: 16 exabytes (for comparison: the size of the internet in 2013 was estimated at 672 exabytes).
    ➡ But in practice, usually only between 8 and 128 TB of that can be addressed.
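
The 4 GB and 16 exabyte figures follow directly from the address width: an address of n bits can name 2^n distinct bytes. A small sketch of that arithmetic; the function name is hypothetical, and 47 bits is used only as an example of a narrower width that yields 128 TB:

```c
#include <assert.h>
#include <stdint.h>

/* Highest byte address reachable with `bits` address bits (1 <= bits <= 64). */
uint64_t max_address(unsigned bits) {
    assert(bits >= 1 && bits <= 64);
    /* Shifting a 64-bit value by 64 is undefined in C, so handle it apart. */
    return bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}
```

`max_address(32)` is 0xFFFFFFFF, so a 32-bit process sees 4 GB of addresses regardless of how much RAM is installed; `max_address(47) + 1` is 128 TB, and 64 bits give 2^64 bytes, which is the 16 exabytes mentioned on the slide.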

  20. [Diagram: Physical Memory next to Virtual Memory.]

  26. [Diagram: the 32-bit virtual address space of a process, from 0x00000000 to 0xFFFFFFFF, with marks at 0x00010000 and 0xC0000000 and a split into a 3 GB and a 1 GB part. The build-up adds the regions "program + data", "shared", and "stack".]

  27. [Diagram: memory divided into pages of 4 KB each]
    0x12340000  4 KB
    0x12341000  4 KB
    0x12342000  4 KB
    0x12343000  4 KB
    0x12344000  4 KB
    0x12345000  4 KB
    0x12346000  4 KB

  30. ➡ Every process gets its own page table.
    ➡ Only pages that are actually used are filled in!
    ➡ If all pages are filled in (i.e. the full 4 GB is mapped), the page table would be 4 MB in size.
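
The 4 MB figure can be checked by hand: 4 GB divided into 4 KB pages gives 1,048,576 pages, and with one entry per page the table grows linearly with the number of mapped pages. The sketch below assumes a flat table and 4 bytes per entry; the entry size is an assumption that fits a 32-bit machine and is not stated on the slide, and the function name is hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for a flat page table that maps `address_space` bytes. */
uint64_t page_table_bytes(uint64_t address_space, uint64_t page_size,
                          uint64_t entry_size) {
    assert(page_size > 0 && address_space % page_size == 0);
    uint64_t entries = address_space / page_size; /* one entry per page */
    return entries * entry_size;
}
```

`page_table_bytes(4ull << 30, 4096, 4)` yields 4,194,304 bytes, i.e. 4 MB. A process that maps only a handful of pages needs correspondingly fewer entries, which is why every process can start with a small page table.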

  31. [Page table: virtual to physical page addresses]
    Virt        Phys
    0x12340000  0x00001000
    0x12341000  0x00522000
    0x12342000  0x00852000
    0x12346000  0x00633000

  32. ➡ Every virtual address that is used MUST be converted to an actual physical address through the page tables.
    ➡ The CPU caches these translations in the Translation Lookaside Buffer (TLB).
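
To make the translation step concrete, here is a toy lookup that uses the four Virt/Phys pairs from the page table slide. It is a simplified sketch: a linear search over a flat array stands in for the hardware page walk, there is no TLB, and all names are invented for illustration. The address is split into a page part (upper bits) and an offset within the 4 KB page (lower 12 bits); only the page part is translated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u

struct mapping {
    uint32_t virt_page; /* start address of the virtual page  */
    uint32_t phys_page; /* start address of the physical page */
};

/* The four filled entries from the Virt/Phys table on the slide. */
static const struct mapping page_table[] = {
    {0x12340000u, 0x00001000u},
    {0x12341000u, 0x00522000u},
    {0x12342000u, 0x00852000u},
    {0x12346000u, 0x00633000u},
};

/* Translates a virtual address; returns 0 on success, -1 if unmapped. */
int translate(uint32_t virt, uint32_t *phys) {
    assert(phys != NULL);
    uint32_t offset = virt & (SKETCH_PAGE_SIZE - 1);
    uint32_t page = virt & ~(SKETCH_PAGE_SIZE - 1);
    for (size_t i = 0; i < sizeof page_table / sizeof page_table[0]; i++) {
        if (page_table[i].virt_page == page) {
            *phys = page_table[i].phys_page + offset;
            return 0;
        }
    }
    return -1; /* a real CPU would raise a page fault here */
}
```

For example, virtual address 0x12341234 lies in page 0x12341000 at offset 0x234 and translates to 0x00522234, while 0x12343000 has no entry in the table and is reported as unmapped.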

  41. [Diagram, built up over slides 33 to 41: the virtual pages of processes 1 to 4 (1a-1c, 2a-2c, 3a-3c, 4a-4c) are mapped onto scattered pages of Physical memory. In the last steps a Swap column is added: pages 2a and 1c appear under Swap, and 4a-4c are placed in Physical memory.]

  42. SWAP
    ➡ The CPU tells the OS when a page is not loaded in memory (page fault).
    ➡ The OS loads the page from swap.
    ➡ The OS returns control to the process.
    ➡ The process is NEVER aware of this.
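
A process cannot notice a page fault while it happens, but on Linux and other POSIX systems the kernel keeps counters that can be read afterwards with `getrusage()`. The sketch below counts the faults caused by touching freshly mapped memory. These are faults served without disk access (minor faults); a page that has to come back from swap would be counted in `ru_majflt` instead. The function names are hypothetical, and `MAP_ANONYMOUS` is assumed to be available, as it is on Linux.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Page faults (minor + major) this process has taken so far, or -1. */
static long fault_count(void) {
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) != 0)
        return -1;
    return usage.ru_minflt + usage.ru_majflt;
}

/* Maps `pages` fresh pages, writes one byte to each, and returns how many
 * page faults that caused, or -1 on error. */
long faults_from_touching(size_t pages) {
    assert(pages > 0);
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;
    size_t length = pages * (size_t)page_size;
    volatile char *mem = mmap(NULL, length, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    long before = fault_count();
    for (size_t i = 0; i < length; i += (size_t)page_size)
        mem[i] = 1; /* first write to a page triggers a fault */
    long after = fault_count();

    munmap((void *)mem, length);
    if (before < 0 || after < 0)
        return -1;
    return after - before;
}
```

On a typical Linux system a call such as `faults_from_touching(64)` usually reports about one fault per page. The code that does the writing contains nothing to handle them, which is the point of the slide.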

  43. Quick recap
    ➡ Every process can use up to 4 GB.
    ➡ Every process gets its own page table.
    ➡ Every process starts with the smallest possible page table.
    ➡ Pages can be swapped in and out of memory by the CPU/OS without the process being aware of it.

  44. What happens when a process wants (more) memory?

  46. [Diagram: the 32-bit address space layout again, from 0x00000000 to 0xFFFFFFFF, split into a 3 GB and a 1 GB part, with the regions "program + data", "shared", and "stack".]

  47. Ask the OS: (s)brk() or mmap()

  48. (s)brk() changes the data segment size.

  50. [Diagram: sbrk(10000) adds a Heap region to the address space.]

  54. ➡ sbrk() does not ALLOCATE (physical) memory.
    ➡ sbrk() only reserves virtual address space; page table entries are filled in when the pages are first used.
    ➡ Physical memory usage stays the same.
    ➡ Virtual memory usage goes up.
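
The effect of `sbrk()` on the data segment can be observed directly, because `sbrk(0)` returns the current end of the data segment (the program break) without changing it. A minimal sketch, assuming a glibc-based Linux system (some C libraries, such as musl, refuse non-zero increments); the function name is hypothetical:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Grows the data segment by `increment` bytes and returns how far the
 * program break actually moved, or -1 if sbrk() failed. */
long grow_data_segment(intptr_t increment) {
    assert(increment >= 0);
    char *before = sbrk(0); /* sbrk(0) only reports the current break */
    if (sbrk(increment) == (void *)-1)
        return -1;
    char *after = sbrk(0);
    return (long)(after - before);
}
```

Calling `grow_data_segment(10000)` moves the break by 10000 bytes where `sbrk()` is supported. Nothing has been written to the new range at that point, so only the virtual size of the process has changed.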

  56. #include <unistd.h>

    int main(void) {
        /* Move the program break up by 1 GB: this only reserves virtual memory. */
        sbrk(1024 * 1024 * 1024);
        return 0;
    }

    top - 11:11:28 up 8 min,  2 users,  load average: 0.01, 0.01, 0.00
    Tasks: 141 total,   1 running, 140 sleeping,   0 stopped,   0 zombie
    Cpu(s):  0.0%us,  0.9%sy,  0.0%ni, 99.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Mem:   3922404k total,   492688k used,  3429716k free,    81944k buffers
    Swap:  1675260k total,        0k used,  1675260k free,   171296k cached

      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
     2132 jthijsse  20   0 1027m  400  324 S  0.0  0.0   0:00.00 memory
     1669 jthijsse  20   0  128m 5896 2064 S  0.0  0.2   0:00.83 zsh
     1871 jthijsse  20   0  125m 5220 1860 S  0.0  0.1   0:00.18 zsh
     1668 jthijsse  20   0 95844 1740  800 S  0.0  0.0   0:00.38 sshd
     1870 jthijsse  20   0 95844 1736  796 S  0.0  0.0   0:00.21 sshd
     2143 jthijsse  20   0 15028 1328  984 R  6.9  0.0   0:00.09 top

  57. ➡ Actual memory usage only goes up when we access the virtual memory.
    ➡ Only pages that are accessed are created or loaded into physical memory.

  58. ➡ Allocating 1 GB of memory is cheap and fast.
    ➡ Iterating over 1 GB of memory is not.

  60. #include <stdlib.h>

    int main(void) {
        int j = 1024 * 1024 * 1024;  /* 1 GB */
        char *a = malloc(j);
        if (a == NULL) {
            return 1;
        }
        /* Writing to every byte forces each page into physical memory. */
        for (int i = 0; i != j; i++) {
            a[i] = 1;
        }
        return 0;
    }

    top - 11:09:49 up 7 min,  2 users,  load average: 0.07, 0.02, 0.00
    Tasks: 141 total,   1 running, 140 sleeping,   0 stopped,   0 zombie
    Cpu(s):  0.0%us,  0.2%sy,  0.0%ni, 99.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
    Mem:   3922404k total,  1543680k used,  2378724k free,    81940k buffers
    Swap:  1675260k total,        0k used,  1675260k free,   171264k cached

      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
     2095 jthijsse  20   0 1027m 1.0g  352 S  0.0 26.7   0:03.46 memory2
     1669 jthijsse  20   0  128m 5896 2064 S  0.0  0.2   0:00.83 zsh
     1871 jthijsse  20   0  125m 5220 1860 S  0.0  0.1   0:00.15 zsh
     1668 jthijsse  20   0 95844 1740  800 S  0.0  0.0   0:00.38 sshd
     1870 jthijsse  20   0 95844 1736  796 S  0.0  0.0   0:00.17 sshd
     2059 jthijsse  20   0 15028 1332  992 R  0.0  0.0   0:00.69 top
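
The VIRT and RES columns that `top` shows can also be read by the process itself. The following sketch is Linux-specific: it assumes `/proc/self/status` with its `VmSize` and `VmRSS` fields, and all type and function names are invented for illustration. It takes three samples: before allocating, after `malloc()`, and after writing to every page.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct mem_sample {
    long virt_kb; /* VmSize: virtual size, like VIRT in top  */
    long rss_kb;  /* VmRSS: resident size, like RES in top   */
};

/* Reads one "Key:   <n> kB" field from /proc/self/status, or -1. */
static long status_kb(const char *key) {
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    long value = -1;
    size_t key_len = strlen(key);
    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, key, key_len) == 0 && line[key_len] == ':') {
            sscanf(line + key_len + 1, "%ld", &value);
            break;
        }
    }
    fclose(f);
    return value;
}

static struct mem_sample take_sample(void) {
    struct mem_sample s = { status_kb("VmSize"), status_kb("VmRSS") };
    return s;
}

/* out[0]: before, out[1]: after malloc, out[2]: after touching each page.
 * Returns 0 on success, -1 on error. */
int measure_lazy_allocation(size_t size, struct mem_sample out[3]) {
    assert(size > 0);
    out[0] = take_sample();
    volatile char *buf = malloc(size);
    if (buf == NULL)
        return -1;
    out[1] = take_sample();
    for (size_t i = 0; i < size; i += 4096)
        buf[i] = 1; /* volatile, so the writes cannot be optimized away */
    out[2] = take_sample();
    free((void *)buf);
    for (int i = 0; i < 3; i++)
        if (out[i].virt_kb < 0 || out[i].rss_kb < 0)
            return -1;
    return 0;
}
```

For a 32 MB request, the virtual size typically jumps by the full 32 MB between the first and second sample while the resident size barely moves; only the third sample shows the resident size catching up. This is the same pattern as the two `top` listings, on a smaller scale.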

  61. ➡ mmap() maps files into memory.
    ➡ Lazy loading: mmap() only creates virtual pages; parts of the file are loaded into physical memory and connected when needed.
    ➡ mmap() can also be used to allocate memory (without connecting to files).
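
As an illustration of the file-mapping case, the sketch below maps a file read-only and scans it through a plain pointer. There is no `read()` call: the kernel brings pages of the file into physical memory as the loop reaches them. The function name is hypothetical, and error handling is kept to the minimum needed for a correct example.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Counts how often `needle` occurs in the file at `path`, reading the
 * file through a memory mapping. Returns -1 on error. */
long count_byte_in_file(const char *path, unsigned char needle) {
    assert(path != NULL);
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) { /* mmap() rejects a length of 0 */
        close(fd);
        return 0;
    }

    size_t length = (size_t)st.st_size;
    const unsigned char *data =
        mmap(NULL, length, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (data == MAP_FAILED)
        return -1;

    long count = 0;
    for (size_t i = 0; i < length; i++) /* pages are loaded on first access */
        if (data[i] == needle)
            count++;

    munmap((void *)data, length);
    return count;
}
```

A call such as `count_byte_in_file("/etc/hostname", '\n')` counts the lines of that file; the path is only an example. For a large file that is scanned only partially, the untouched part never has to be loaded at all.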

  62. a = "Hello World"
    if (a == "foobar") {
        // Do something
    }

    [root@localhost ~]# pmap -x 1271
    1271:   php-fpm: master process (/etc/php-fpm.conf)
    Address   Kbytes     RSS   Dirty Mode   Mapping
    001ee000     248       8       0 r-x--  libgssapi_krb5.so.2.2
    0022c000       4       4       4 r----  libgssapi_krb5.so.2.2
    0022d000       4       4       4 rw---  libgssapi_krb5.so.2.2
    0022f000      28       4       0 r-x--  libcrypt-2.12.so
    00236000       4       4       4 r----  libcrypt-2.12.so
    00237000       4       4       4 rw---  libcrypt-2.12.so
    ....
    08048000    3400     204       0 r-x--  php-fpm
    0839a000     328     140      20 rw---  php-fpm
    083ec000      96      32      32 rw---  [ anon ]
    092b4000    1316    1176    1176 rw---  [ anon ]
    ...
    af483000       4       4       4 rw-s-  zero (deleted)
    af484000      28       0       0 r--s-  gconv-modules.cache
    af48b000  131072       0       0 rw-s-  zero (deleted)
    b748b000     160       4       4 rw---  [ anon ]
    b74b3000    2048       8       0 r----  locale-archive
    b76b3000    1312     124     124 rw---  [ anon ]
    b77fb000       4       4       4 rw-s-  zero (deleted)
    b77fc000       4       4       4 rw-s-  zero (deleted)
    b77fd000       4       4       4 rw-s-  zero (deleted)
    b77fe000       4       4       4 rw-s-  zero (deleted)
    b77ff000       4       4       4 rw---  [ anon ]
    bf876000      84      48      48 rw---  [ stack ]
    -------- ------- ------- ------- -------
    total kB  155544       -       -       -

  63. [Diagram: Process 1, Process 2, and Process 3.]

  66. [Diagram: the pages of Process 1 (1a-1c) and Process 2 (2a-2c) next to Physical memory; the build-up adds pages a, b, and c.]

  68. [Diagram: fork() => Process 1 becomes Process 1 and Process 2.]

  69. [Diagram: fork() => two Virtual address spaces with pages 1a-1c and 1'a-1'c, next to Physical memory with pages a, b, and c.]

  70. [Diagram: fork() => in the second Virtual address space page 1'b has become 2b, and 2b also appears in Physical memory next to 1a, 1b, and 1c.]
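
After `fork()`, parent and child initially share their physical pages, and the kernel copies a page only when one of them writes to it (copy-on-write). The sharing itself cannot be observed from portable C, but its visible consequence can: a write in the child never changes the parent's copy. A minimal sketch with a hypothetical function name:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the parent's buffer is unchanged after the child wrote to
 * its own copy, 0 if it changed, -1 on error. */
int parent_copy_survives_child_write(void) {
    static char page[4096];
    memset(page, 'a', sizeof page);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        page[0] = 'b'; /* this write makes the kernel copy the page */
        _exit(page[0] == 'b' ? 0 : 1);
    }

    assert(pid > 0); /* only the parent gets here */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return page[0] == 'a' ? 1 : 0;
}
```

The function returns 1: the child saw its own `'b'`, while the parent still sees `'a'`. Until that write, both processes were backed by the same physical page, which is why adding up the resident sizes of forked processes overstates their real memory use.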

  71. Recap
    ➡ Don't worry about high virtual memory usage.
    ➡ The resident set size (RSS) is key.
    ➡ When fork()'ing, memory usage (even RSS) becomes hard to manage and detect.
    ➡ Don't swap!

  72. 303 See Other
    ➡ https://techtalk.intersec.com/2013/07/memory-part-2-understanding-process-memory/
    ➡ http://locklessinc.com/articles/memory_usage/
    ➡ http://rhaas.blogspot.nl/2012/01/linux-memory-reporting.html
    ➡ http://people.freebsd.org/~lstewart/articles/cpumemory.pdf
    ➡ http://deathbytape.com/post/110371790629/intro-virtual-memory
    ➡ http://nikic.github.com/2011/12/12/How-big-are-PHP-arrays-really-Hint-BIG.html

  73. http://farm1.static.flickr.com/73/163450213_18478d3aa6_d.jpg

  74. Find me on twitter: @jaytaph
    Find me for development and training: www.noxlogic.nl
    Find me on email: [email protected]
    Find me for blogs: www.adayinthelifeof.nl