Building a Compacting GC

Aaron Patterson
September 20, 2017

This is a talk about building a compacting GC that I presented at RubyKaigi 2017.

Transcript

  1. Exploring Memory in Ruby
    Building a Compacting GC

  2. Aaron Patterson

  3. Ruby Core && Rails Core

  4. 草生える
    Grass Grows
    !!!
    * Note: English speakers, please ask me about this slide; it cannot be translated ❤

  5. Mark / Compact GC

  6. Exploring Memory in Ruby
    Copy on Write
    Building a Compacting GC (in MRI)
    Memory Inspection Tools

  7. What is "Copy on Write"?
    What is "Compaction"?
    I can do this!

  8. Experienced People:

  9. Algorithms
    Implementation Details

  10. Copy on Write
    Optimization

  11. What is CoW?

  12. Ruby String
    require 'objspace'
    str = "x" * 9000
    p ObjectSpace.memsize_of(str)  # => 9041 (initial string)
    str2 = str.dup
    p ObjectSpace.memsize_of(str2) # => 40 (no copy)
    str2[1] = 'l'
    p ObjectSpace.memsize_of(str2) # => 9041 (copied "on write")

  13. Ruby Array
    require 'objspace'
    array = ["x"] * 9000
    p ObjectSpace.memsize_of(array)  # => 72040 (initial array)
    array2 = array.dup
    p ObjectSpace.memsize_of(array2) # => 40 (no copy)
    array2[1] = 'l'
    p ObjectSpace.memsize_of(array2) # => 72040 (copied "on write")

  14. Ruby Hash
    require 'objspace'
    hash = ('a'..'zzz').each_with_object({}) { |k, h| h[k] = :hello }
    p ObjectSpace.memsize_of(hash)  # => 917600 (initial hash)
    hash2 = hash.dup
    p ObjectSpace.memsize_of(hash2) # => 917600 (did copy)

  15. No Observable Difference

  16. Operating System

  17. Ruby Fork
    string = "x" * 90000    # initial string
    p PARENT_PID: $$
    gets
    child_pid = fork do
      p CHILD_PID: $$       # no copy yet
      gets
      string[1] = 'y'       # copied "on write"
      gets
    end
    Process.waitpid child_pid

  18. OS Memory Copy
    [Diagram: the parent process holds four 4k pages of "xxxx"; the child process shares them, and only the page it writes to ("xyxx") is copied.]
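
The page-granular copy in the diagram can be sketched in a few lines of Ruby. This is only an illustration: the 4096-byte page size is taken from the "4k" labels on the slide, the buffer is assumed to start on a page boundary, and `dirty_pages` is a hypothetical helper name.

```ruby
# Sketch: which pages would the OS have to copy for a set of byte writes?
PAGE_SIZE = 4096 # assumption: matches the "4k" pages drawn on the slide

# Returns the sorted, de-duplicated page indexes touched by the offsets.
def dirty_pages(offsets, page_size = PAGE_SIZE)
  offsets.map { |offset| offset / page_size }.uniq.sort
end

# A single write near the start (like string[1] = 'y') dirties one page.
p dirty_pages([1])                    # => [0]

# Writes spread over a 16 KiB buffer dirty one page each.
p dirty_pages([1, 5000, 9000, 13000]) # => [0, 1, 2, 3]
```

However small the write, the whole page that contains it is copied, which is why scattered writes are more expensive than clustered ones.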

  19. "CoW Page Fault"

  20. Why is CoW Important?

  21. Unicorn is a forking webserver

  22. Unicorn
    [Diagram: a parent Unicorn process with three child Unicorn processes]

  23. Unicorn
    [Diagram: a parent Unicorn process with three child Unicorn processes]

  24. Reduce Boot Time

  25. Decreases Memory Usage

  26. This is how it works today.

  27. Reducing Page Faults

  28. What causes page faults?

  29. Mutating Shared Memory

  30. Garbage Collector

  31. Object Allocation

  32. Object Allocation
    Ruby Objects
    Empty
    Filled
    Parent
    Process
    Memory
    Child
    Process
    Memory

  33. How can we reduce this space?

  34. GC compaction

  35. Compact Before Fork
    Ruby Objects
    Empty
    Filled
    Parent
    Process
    Memory
    Page 1 Page 2

  36. Compact Before Fork
    Ruby Objects
    Empty
    Filled
    Parent
    Process
    Memory
    Page 1 Page 2
    Child
    Process
    Memory

  37. GC Compaction

  38. What is "compaction"?

  39. Compaction
    Ruby Objects
    Empty
    Filled
    Parent
    Process
    Memory
    Page 1 Page 2

  40. Why compact?

  41. Reduce Memory Usage

  42. "Impossible"

  43. Compaction Algorithms

  44. Two Finger Compaction
    ☝ ☝

  45. Disadvantages
    • It’s slow

    • Objects move to random places

  46. Advantage
    • It’s EASY!

  47. Algorithm
    Object Movement
    Reference Updating

  48. Object Movement
    Address:  1     2     3     4    5     6    7    8    9     a     b
    Heap:     Free  Free  Free  Obj  Free  Obj  Obj  Obj  Free  Free  Obj
    ☝ Free Pointer   ☝ Scan Pointer
    Objects move to addresses 1, 2, 3, 5
    Done!
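
The movement step on this slide can be sketched on a toy heap. This is a minimal model, not the code in gc.c: the heap is a plain Array, `nil` stands for a free slot, indexes are 0-based (so slide address 1 is index 0), and `two_finger_move` is a hypothetical name.

```ruby
# Sketch of the "object movement" half of two-finger compaction.
def two_finger_move(heap)
  heap = heap.dup
  forwarding = {}          # old index => new index
  free = 0                 # free pointer: walks right looking for holes
  scan = heap.length - 1   # scan pointer: walks left looking for objects

  loop do
    free += 1 while free < scan && !heap[free].nil?
    scan -= 1 while scan > free && heap[scan].nil?
    break if free >= scan  # the two fingers met, so we are done

    heap[free] = heap[scan]  # move the object into the hole
    heap[scan] = nil
    forwarding[scan] = free  # remember where it went
  end

  [heap, forwarding]
end

# The heap from the slide: Free Free Free Obj Free Obj Obj Obj Free Free Obj
heap = [nil, nil, nil, :d, nil, :f, :g, :h, nil, nil, :k]
compacted, forwarding = two_finger_move(heap)
p compacted # => [:k, :h, :g, :d, :f, nil, nil, nil, nil, nil, nil]
```

In this run the forwarding table maps index 10 to 0, 7 to 1, 6 to 2, and 5 to 4, which corresponds to the slide's destination addresses 1, 2, 3, and 5 once converted to its 1-based numbering.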

  49. Reference Updating
    Before Compaction
    Address:  1     2     3     4    5     6    7    8    9     a     b
    Heap:     Free  Free  Free  Obj  Free  Obj  Obj  Obj  Free  Free  Obj
    Ruby:     a = { c: 'd' }
    Objects:  {}  :c  'd'

  50. Reference Updating
    After Compaction
    Address:  1    2    3    4    5    6  7  8  9     a     b
    Heap:     Obj  Obj  Obj  Obj  Obj  5  3  2  Free  Free  1
    Ruby:     a = { c: 'd' }
    Objects:  {}  :c  'd'

  51. Reference Updating
    After Compaction
    Address:  1    2    3    4    5    6  7  8  9     a     b
    Heap:     Obj  Obj  Obj  Obj  Obj  5  3  2  Free  Free  1
    Ruby:     a = { c: 'd' }
    Objects:  {}  :c  'd'

  52. Reference Updating
    After Compaction
    Address:  1    2    3    4    5    6  7  8  9     a     b
    Heap:     Obj  Obj  Obj  Obj  Obj  5  3  2  Free  Free  1
    Ruby:     a = { c: 'd' }
    Objects:  {}  :c  'd'
    Overlay:  Free  Free  Free  Free (the four slots that held forwarding addresses)
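
Reference updating can be sketched the same way. In this toy model each object is a Hash with a `:refs` list of slot indexes, and `forwarding` plays the role of the forwarding addresses left behind in the moved-from slots; the object layout, the sample addresses, and the name `update_references` are all assumptions made for illustration.

```ruby
# Sketch: rewrite every reference that points at a moved slot.
def update_references(heap, forwarding)
  heap.each do |obj|
    next if obj.nil?
    # Keep the address if the target did not move, else use its new home.
    obj[:refs] = obj[:refs].map { |addr| forwarding.fetch(addr, addr) }
  end
  heap
end

# After movement, `{}` sits in slot 0 but still points at the old
# addresses (6 and 7) of :c and 'd', which now live in slots 2 and 1.
heap = [
  { name: "{}",  refs: [6, 7] },
  { name: "'d'", refs: [] },
  { name: ":c",  refs: [] },
]
forwarding = { 10 => 0, 7 => 1, 6 => 2 }

update_references(heap, forwarding)
p heap[0][:refs] # => [2, 1]
```

Once every live object has been visited, nothing points at the old slots any more, so they can be handed back as free slots.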

  53. Implementation Details

  54. Code:
    https://github.com/github/ruby/tree/gc-compact

  55. Usage
    # Parent unicorn process
    load_all_of_rails_and_dependencies
    load_all_of_application
    GC.compact
    N.times do
      fork do
        # Child worker processes
        # handle requests
      end
    end

  56. Changes to gc.c

  57. `gc_move`
    1 2 3
    Free Obj
    Address
    Heap 1
    Obj

  58. `T_MOVED`
    1 2 3
    Obj
    Address
    Heap 1
    T_MOVED
    Obj

  59. `gc_compact_heap`
    1 2 3
    Obj
    Address
    Heap
    ☝ ☝
    Free Obj

  60. `gc_update_object_references`
    1 2 3
    Obj
    Address
    Heap Free Obj
    1 2 3
    Obj
    Address
    Heap 1
    Obj

  61. Reference Update Helpers
    • gc_ref_update_array

    • gc_ref_update_object

    • hash_foreach_replace

    • gc_ref_update_method_entry

    • etc.

  62. `pinned_bits[];`

  63. What objects can move?

  64. What can move?
    • Everything

  65. Finding References

  66. Pure Ruby References
    class Foo
      def initialize(obj)
        @bar = obj
      end
    end
    class Bar
    end
    bar = Bar.new
    foo = Foo.new(bar)
    [Diagram: Foo references Bar through @bar]

  67. C References
    class Bar
    end
    bar = Bar.new
    foo = Foo.new(bar)
    [Diagram: Foo references Bar through an unknown C reference (???)]

  68. C References
    class Bar
    end
    bar = Bar.new
    foo = Foo.new(bar)
    [Diagram: Foo marks Bar with rb_gc_mark( ); Bar's slot is T_MOVED]

  69. C References
    class Bar
    end
    bar = Bar.new
    foo = Foo.new(bar)
    [Diagram: Foo marks Bar with rb_gc_mark( ); the reference cannot be updated]

  70. rb_gc_mark
    1. Mark the object

    2. Pin the object in `pinned_bits` table

  71. GC.compact
    1. Full GC (so objects get pinned)

    2. Compact objects

    3. Update references
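
The effect of pinning on the compaction step can be sketched with a toy heap. Everything here is an assumption made for illustration: the heap is an Array with `nil` for free slots, `pinned` is a Set standing in for the `pinned_bits` table filled during the full GC, and `compact_with_pins` is a hypothetical name, not a function in gc.c.

```ruby
require 'set'

# Sketch: two-finger movement that leaves pinned slots where they are.
def compact_with_pins(heap, pinned)
  heap = heap.dup
  forwarding = {}
  free = 0
  scan = heap.length - 1

  loop do
    free += 1 while free < scan && !heap[free].nil?
    # Skip free slots and pinned objects when looking for something to move.
    scan -= 1 while scan > free && (heap[scan].nil? || pinned.include?(scan))
    break if free >= scan

    heap[free] = heap[scan]
    heap[scan] = nil
    forwarding[scan] = free  # consumed later by the reference-update step
  end

  [heap, forwarding]
end

heap   = [nil, :a, nil, :b, :c]
pinned = Set[4]  # pretend :c was marked through rb_gc_mark
compacted, forwarding = compact_with_pins(heap, pinned)
p compacted # => [:b, :a, nil, nil, :c]
```

Only `:b` moves (index 3 to index 0); the pinned `:c` stays at the end of the heap, so the hole in front of it cannot be closed.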

  72. What can move?
    • Everything

    • Except objects marked with `rb_gc_mark`

  73. Movement Problems

  74. Hashing
    Object
    hash_key( )
    = memory address

  75. Fix: cache hash key
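
A small sketch of the problem and the fix, under loud assumptions: `ToyObject` and its `address` attribute are hypothetical stand-ins for a heap slot, and the two hash methods are illustrations rather than MRI's implementation.

```ruby
# Sketch: an address-based hash key changes when an object moves,
# while a cached hash key does not.
class ToyObject
  attr_accessor :address

  def initialize(address)
    @address = address
  end

  # Hash key = memory address, so it changes whenever the object moves.
  def address_hash
    @address
  end

  # Cached hash key: taken from the address once, then remembered.
  def cached_hash
    @cached_hash ||= @address
  end
end

obj = ToyObject.new(0x7fcc6e01a198)
before_address_hash = obj.address_hash
before_cached_hash  = obj.cached_hash

obj.address = 0x7fcc6e000028 # simulate the compactor moving the object

p obj.address_hash == before_address_hash # => false
p obj.cached_hash == before_cached_hash   # => true
```

A hash table that bucketed the object by `address_hash` would no longer find it after the move, whereas the cached key keeps pointing at the same bucket.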

  76. What can move?
    • Everything

    • Except objects marked with `rb_gc_mark`

    • and hash keys

  77. Dual References

  78. Dual References
    Foo
    Bar
    Baz
    T_MOVED

  79. Dual References
    Foo
    T_MOVED
    Baz
    Bar

  80. Fix: Call rb_gc_mark,
    or use only Ruby
    https://github.com/msgpack/msgpack-ruby/pull/135

  81. What can move?
    • Everything

    • Except objects marked with `rb_gc_mark`

    • and hash keys

    • and dual referenced objects

  82. Global Variables

  83. Global Variables (in C)
    #include <ruby.h>

    VALUE cFoo; /* C global holding a VALUE; the GC cannot update it if the object moves */

    void Init_foo(void) {
        cFoo = rb_define_class("Foo", rb_cObject);
    }

  84. Fix: use heuristics to pin objects

  85. What can move?
    • Everything

    • Except objects marked with `rb_gc_mark`

    • and hash keys

    • and dual referenced objects

    • and objects created with `rb_define_class`

  86. String Literals

  87. String Literals
    def foo
      puts "hello world"
    end
    [Diagram: the ISeq's literals array and its bytecode both reference "hello world"]

  88. Updating bytecode is hard

  89. What can move?
    • Everything

    • Except objects marked with `rb_gc_mark`

    • and hash keys

    • and dual referenced objects

    • and objects created with `rb_define_class`

    • and string literals

  90. It seems like nothing can move

  91. Most can be fixed

  92. 46% can move!

  93. Before Compaction
    [Chart: axes "Pages" and "Number of slots" (0 to 400); legend: F, U, P]

  94. After Compaction
    [Chart: axes "Pages" and "Number of slots" (0 to 400); legend: F, U, P]

  95. Inspecting Memory

  96. ObjectSpace.dump_all

  97. ObjectSpace.dump_all
    require "objspace"
    File.open("out.json", "w") { |f|
      ObjectSpace.dump_all(output: f)
    }

  98. Measuring Rails Boot
    $ RAILS_ENV=production \
      bin/rails r \
      'require "objspace"; GC.compact; File.open("out.json", "w") { |f|
        ObjectSpace.dump_all(output: f)
      }'

  99. Output
    {"address":"0x7fcc6e01a198", "type":"OBJECT", "class":"0x7fcc6c93d420", "ivars":3,
    "references":["0x7fcc6e01bed0"], "memsize":40, "flags":{"wb_protected":true, "old":true,
    "uncollectible":true, "marked":true}}

  100. Output
    {
      "address": "0x7fcc6e01a198",
      "type": "OBJECT",
      "class": "0x7fcc6c93d420",
      "ivars": 3,
      "references": [
        "0x7fcc6e01bed0"
      ],
      "memsize": 40,
      "flags": {
        "wb_protected": true,
        "old": true,
        "uncollectible": true,
        "marked": true
      }
    }
    Callouts: address, references, size (memsize)
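
Because `ObjectSpace.dump_all` writes one JSON document per line, the dump can be processed line by line. The sketch below tallies object count and total `memsize` per `type`. `summarize_dump` is a hypothetical helper name, the first sample line is abbreviated from the slide, and the second sample line is made up for the example.

```ruby
require 'json'
require 'stringio'

# Sketch: summarize a heap dump by object type.
def summarize_dump(io)
  summary = Hash.new { |h, k| h[k] = { count: 0, memsize: 0 } }
  io.each_line do |line|
    next if line.strip.empty?
    obj = JSON.parse(line)
    entry = summary[obj["type"]]
    entry[:count] += 1
    entry[:memsize] += obj.fetch("memsize", 0) # some entries have no memsize
  end
  summary
end

sample = StringIO.new(<<~DUMP)
  {"address":"0x7fcc6e01a198", "type":"OBJECT", "class":"0x7fcc6c93d420", "ivars":3, "references":["0x7fcc6e01bed0"], "memsize":40}
  {"address":"0x7fcc6e01bed0", "type":"STRING", "memsize":40}
DUMP

summary = summarize_dump(sample)
summary.each do |type, stats|
  puts "#{type}: #{stats[:count]} object(s), #{stats[:memsize]} bytes"
end
```

For a real dump, pass `File.open("out.json")` instead of the `StringIO` sample.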

  101. Address = Location
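
Since an address is a location, dumped addresses can be bucketed by heap page to see how full each page is. A minimal sketch: the 16 KiB page alignment is an assumption about MRI's heap pages that the slides do not state, `page_of` and `group_by_page` are hypothetical helper names, and the third sample address is made up.

```ruby
# Sketch: group object addresses by the page that contains them.
PAGE_ALIGN = 16 * 1024 # assumption: heap pages are 16 KiB aligned

def page_of(address)
  address & ~(PAGE_ALIGN - 1) # clear the low bits to get the page start
end

def group_by_page(hex_addresses)
  hex_addresses
    .map { |hex| Integer(hex, 16) }
    .group_by { |addr| page_of(addr) }
end

addresses = ["0x7fcc6e01a198", "0x7fcc6e01bed0", "0x7fcc6e01c010"]
pages = group_by_page(addresses)
pages.each do |page, addrs|
  puts format("page 0x%x holds %d object(s)", page, addrs.size)
end
```

With these sample addresses the first two objects land on the same page and the third on the next one.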

  102. Heap Fragmentation
    Object
    Empty

  103. Heap Fragmentation
    Object
    Empty

  104. Heap Fragmentation
    Pinned
    Empty
    Moves

  105. Heap Fragmentation
    Pinned
    Empty
    Moves

  106. https://github.com/tenderlove/heap-utils

  107. Inspecting CoW Memory

  108. /proc/{PID}/smaps

  109. /proc/{PID}/smaps
    55a92679a000-55a926b53000 rw-p 00000000 00:00 0 [heap]
    Size: 3812 kB
    Rss: 3620 kB
    Pss: 3620 kB
    Shared_Clean: 0 kB
    Shared_Dirty: 0 kB
    Private_Clean: 0 kB
    Private_Dirty: 3620 kB
    Referenced: 3620 kB
    Anonymous: 3620 kB
    AnonHugePages: 0 kB
    Shared_Hugetlb: 0 kB
    Private_Hugetlb: 0 kB
    Swap: 0 kB
    SwapPss: 0 kB
    KernelPageSize: 4 kB
    MMUPageSize: 4 kB
    Locked: 0 kB
    Callouts: address range, RSS & PSS, shared, dirty

  110. RSS
    Shared_Clean + Shared_Dirty + Private_Clean + Private_Dirty
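
The RSS formula can be checked against one smaps mapping with a few lines of Ruby. This is a sketch only: `parse_smaps_fields` is a hypothetical helper, the sample text is abbreviated from the `[heap]` mapping shown on the earlier slide, and on a real Linux system you would read `/proc/<pid>/smaps` instead.

```ruby
# Sketch: parse "Name: N kB" fields from one smaps mapping.
def parse_smaps_fields(text)
  text.each_line.with_object({}) do |line, fields|
    if (m = line.match(/\A\s*(\w+):\s+(\d+) kB/))
      fields[m[1]] = Integer(m[2])
    end
  end
end

sample = <<~SMAPS
  55a92679a000-55a926b53000 rw-p 00000000 00:00 0 [heap]
  Size: 3812 kB
  Rss: 3620 kB
  Pss: 3620 kB
  Shared_Clean: 0 kB
  Shared_Dirty: 0 kB
  Private_Clean: 0 kB
  Private_Dirty: 3620 kB
SMAPS

f = parse_smaps_fields(sample)
rss = f["Shared_Clean"] + f["Shared_Dirty"] + f["Private_Clean"] + f["Private_Dirty"]
p rss == f["Rss"] # => true
```

The header line with the address range is skipped because it does not match the `Name: N kB` pattern.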

  111. PSS
    ((Shared_Clean + Shared_Dirty) / Number of Processes) +
    Private_Clean + Private_Dirty

  112. RSS vs PSS
                      RSS        PSS
    Unicorn Parent    3620 kB    1840 kB
    Unicorn Child     3620 kB    1840 kB
    Total usage is 3620 kB, not 7240 kB
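
Why summing RSS over-counts can be shown with a short calculation. The numbers below are made up for illustration (they are not the measurements from the slide), and the sketch assumes every shared page is mapped by every process.

```ruby
# Sketch: RSS counts shared pages once per process, PSS splits them.
shared_kb  = 3000       # pages still shared by all processes after fork
private_kb = [100, 100] # pages private to the parent and to the child
processes  = private_kb.size

rss = private_kb.map { |priv| shared_kb + priv }
pss = private_kb.map { |priv| shared_kb / processes + priv }

p rss.sum # => 6200, counts the shared pages twice
p pss.sum # => 3200, the memory actually in use
```

Summing PSS across the processes gives the real footprint, which is why PSS is the more useful number for forking servers.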

  113. Copying Memory
    x = "x" * 9000
    p PID: $$
    gets
    child_pid = fork do
      puts "forked"
      9000.times do |i|
        puts("I: #{i}") || gets if i % 1000 == 0
        x[i] = 109.chr
      end
      puts "done"
      gets
    end
    Process.waitpid child_pid

  114. Shared_Dirty, PSS, RSS
    [Chart: Shared_Dirty, PSS, and RSS in kB (0 to 4000) against number of writes (0 to 9000)]

  115. Compaction Impact
    p PID: $$
    arry = []
    GC.start
    gets
    GC.compact if ENV["COMPACT"]
    child_pid = fork do
      # Fill heap
      pages = GC.stat :heap_allocated_pages
      while pages == GC.stat(:heap_allocated_pages)
        arry << Object.new
      end
      puts "done"
      gets
    end
    Process.waitpid child_pid

  116. No Compaction PSS: 2684 kB
    Compaction PSS: 2530 kB

  117. Compaction Savings: 154 kB

  118. Compaction Savings:
    Unknown

  119. Use `ObjectSpace`

  120. /proc/{PID}/smaps

  121. Why compact?

  122. "Impossible"

  123. Question Your Assumptions

  124. We’ve entered grass