
Building a Compacting GC

Aaron Patterson
September 20, 2017

This is a talk about building a compacting GC that I presented at RubyKaigi 2017.


Transcript

  1. Exploring Memory in Ruby
    Building a Compacting GC


  2. Aaron Patterson


  3. @tenderlove


  4. View Slide

  5. Ruby Core && Rails Core


  6. 草生える
    Grass Grows
    !!!
    * Note: English speakers please ask me about this slide, it cannot be translated ❤


  7. X GitHub


  8. git push -f


  9. Two Cats


  10. View Slide

  11. View Slide

  12. Mark / Compact GC


  13. Exploring Memory in Ruby
    Copy on Write
    Building a Compacting GC (in MRI)
    Memory Inspection Tools


  14. Low Level


  15. New People:


  16. What is "Copy on Write"?
    What is "Compaction"?
    I can do this!


  17. Experienced People:


  18. Algorithms
    Implementation Details


  19. Copy on Write
    Optimization


  20. CoW


  21. What is CoW?


  22. Ruby String
    require 'objspace'
    str = "x" * 9000
    p ObjectSpace.memsize_of(str)  # => 9041 (initial string)
    str2 = str.dup
    p ObjectSpace.memsize_of(str2) # => 40 (no copy)
    str2[1] = 'l'
    p ObjectSpace.memsize_of(str2) # => 9041 (copied "on write")


  23. Ruby Array
    require 'objspace'
    array = ["x"] * 9000
    p ObjectSpace.memsize_of(array)  # => 72040 (initial array)
    array2 = array.dup
    p ObjectSpace.memsize_of(array2) # => 40 (no copy)
    array2[1] = 'l'
    p ObjectSpace.memsize_of(array2) # => 72040 (copied "on write")


  24. Ruby Hash
    require 'objspace'
    hash = ('a'..'zzz').each_with_object({}) { |k,h| h[k] = :hello }
    p ObjectSpace.memsize_of(hash)  # => 917600 (initial hash)
    hash2 = hash.dup
    p ObjectSpace.memsize_of(hash2) # => 917600 (did copy)


  25. No Observable Difference

    View Slide
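
A minimal sketch of what "no observable difference" means for the string example above: whether or not Ruby shares the buffer internally, a write to the copy never shows up in the original. The variable names are illustrative and not from the talk.

```ruby
original = "x" * 9000
copy = original.dup # Ruby may share the underlying buffer at this point

# Writing to the copy must not change the original, whether or not
# the buffer was shared before the write.
copy[1] = 'l'

original_untouched = (original == "x" * 9000)
copy_changed       = (copy[0, 3] == "xlx")

puts "original untouched: #{original_untouched}"
puts "copy changed:       #{copy_changed}"
```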

  26. Operating System


  27. `fork`


  28. Ruby Fork
    string = "x" * 90000   # initial string
    p PARENT_PID: $$
    gets
    child_pid = fork do
      p CHILD_PID: $$      # no copy yet
      gets
      string[1] = 'y'      # copied "on write"
      gets
    end
    Process.waitpid child_pid


  29. OS Memory Copy
    Parent Process: xxxx xxxx xxxx xxxx  (four 4k pages)
    Child Process:  xxxx xxxx xxxx xxxx  (the same four pages)
    After the child writes, one page is copied and becomes: xyxx


  30. "CoW Page Fault"

    View Slide
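
The page-level copy described above can be modeled in a few lines of Ruby. This is only a toy simulation under stated assumptions: `ToyProcess`, `fork_child`, and `write` are hypothetical names, pages are plain Ruby strings, and a real kernel does this work in its page-fault handler.

```ruby
PAGE_SIZE = 4096 # bytes; matches the 4k pages drawn on the slide

# Toy model of a process address space: an array of page buffers.
class ToyProcess
  attr_reader :pages, :page_faults

  def initialize(pages)
    @pages = pages
    @page_faults = 0
  end

  # "fork": the child gets its own page table pointing at the SAME pages.
  def fork_child
    ToyProcess.new(@pages.dup)
  end

  # Write one byte. If the page is still shared, copy it first.
  def write(offset, byte, shared_with:)
    index = offset / PAGE_SIZE
    if @pages[index].equal?(shared_with.pages[index])
      @pages[index] = @pages[index].dup # the simulated "CoW page fault"
      @page_faults += 1
    end
    @pages[index][offset % PAGE_SIZE] = byte
  end
end

parent = ToyProcess.new(Array.new(4) { "x" * PAGE_SIZE })
child  = parent.fork_child
child.write(1, "y", shared_with: parent)

shared_pages = parent.pages.zip(child.pages).count { |a, b| a.equal?(b) }
puts "page faults in child: #{child.page_faults}"
puts "pages still shared:   #{shared_pages}"
puts "parent page 0 starts: #{parent.pages[0][0, 4]}"
puts "child page 0 starts:  #{child.pages[0][0, 4]}"
```

Only the page that was written is duplicated; the other three stay shared between the two toy processes.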

  31. Why is CoW Important?


  32. Unicorn is a forking webserver


  33. Unicorn Parent
    Unicorn Child
    Unicorn Child
    Unicorn Child


  34. Unicorn Parent
    Unicorn Child
    Unicorn Child
    Unicorn Child


  35. Reduce Boot Time


  36. Decreases Memory Usage


  37. This is how it works today.


  38. Reducing Page Faults


  39. What causes page faults?


  40. Mutating Shared Memory


  41. Garbage Collector


  42. Object Allocation


  43. Object Allocation
    [Diagram: Ruby object slots, each empty or filled, shown in parent process memory and in child process memory]


  44. How can we reduce this space?


  45. GC compaction


  46. Compact Before Fork
    [Diagram: Ruby object slots, each empty or filled, in parent process memory across Page 1 and Page 2]


  47. Compact Before Fork
    [Diagram: Ruby object slots, each empty or filled, across Page 1 and Page 2, shown in parent process memory and in child process memory]

    View Slide
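
A rough way to see why compacting before fork helps: count how many pages a child has to write to when it fills empty slots. The sketch below is a toy model, not MRI's allocator. `pages_touched` is a hypothetical helper, each page has four slots, and the child is assumed to fill empty slots in page order.

```ruby
# Each inner array is one memory page; true = filled slot, false = empty slot.
# Returns how many pages are written to (and so copied by the OS) when
# `count` new objects are placed into empty slots, scanning pages in order.
def pages_touched(pages, count)
  touched = 0
  pages.each do |page|
    break if count.zero?
    free = page.count(false)
    next if free.zero?
    touched += 1
    count -= [free, count].min
  end
  touched
end

fragmented = [
  [true, false, true, false],
  [false, true, false, true],
]
compacted = [
  [true, true, true, true],
  [false, false, false, false],
]

frag = pages_touched(fragmented, 3)
comp = pages_touched(compacted, 3)
puts "fragmented heap: #{frag} pages written"
puts "compacted heap:  #{comp} pages written"
```

Both heaps hold four objects, but in the compacted layout the full page is never written by the child, so it can stay shared with the parent.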

  48. GC Compaction


  49. X GitHub


  50. What is "compaction"?


  51. Compaction
    [Diagram: Ruby object slots, each empty or filled, in parent process memory across Page 1 and Page 2]


  52. Why compact?


  53. Reduce Memory Usage


  54. "Impossible"


  55. Compaction Algorithms


  56. Two Finger Compaction
    ☝ ☝


  57. Disadvantages
    • It’s slow

    • Objects move to random places


  58. Advantage
    • It’s EASY!


  59. Algorithm
    Object Movement
    Reference Updating


  60. Object Movement
    Address: 1    2    3    4    5    6    7    8    9    a    b
    Heap:    Free Free Free Obj  Free Obj  Obj  Obj  Free Free Obj
    ☝ Free Pointer, ☝ Scan Pointer
    New addresses assigned as objects move: 1, 2, 3, 5
    Done!

    View Slide
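
The object movement phase can be sketched in Ruby as follows. This is a toy model under stated assumptions, not the code in gc.c: the heap is an array, `nil` marks a free slot, indexes are 0-based (the slide counts from 1), and `two_finger_move` is a hypothetical name. The sketch records forwarding addresses in a separate table instead of leaving them in the old slots.

```ruby
# Heap slots: nil means free, any other value is a live object.
# Returns the compacted heap and a forwarding table (old index => new index).
def two_finger_move(heap)
  heap = heap.dup
  forwarding = {}
  free = 0
  scan = heap.length - 1

  loop do
    # Free pointer: advance from the left to the next free slot.
    free += 1 while free < scan && !heap[free].nil?
    # Scan pointer: retreat from the right to the next live object.
    scan -= 1 while scan > free && heap[scan].nil?
    break if free >= scan # the two fingers met

    heap[free] = heap[scan]
    heap[scan] = nil
    forwarding[scan] = free
  end

  [heap, forwarding]
end

# Same layout as the slide: Free Free Free Obj Free Obj Obj Obj Free Free Obj
heap = [nil, nil, nil, :d, nil, :f, :g, :h, nil, nil, :k]
compacted, forwarding = two_finger_move(heap)

p compacted
p forwarding
```

In the slide's 1-based addresses the moves are b to 1, 8 to 2, 7 to 3, and 6 to 5, which matches the forwarding addresses shown on the following slides.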

  61. Reference Updating
    Before Compaction
    Ruby:    a = { c: 'd' }
    Address: 1    2    3    4    5    6    7    8    9    a    b
    Heap:    Free Free Free Obj  Free Obj  Obj  Obj  Free Free Obj
    Objects shown: {}, :c, 'd'


  62. Reference Updating
    After Compaction
    Ruby:    a = { c: 'd' }
    Address: 1    2    3    4    5    6    7    8    9    a    b
    Heap:    Obj  Obj  Obj  Obj  Obj  5    3    2    Free Free 1
    Objects shown: {}, :c, 'd'


  63. Reference Updating
    After Compaction
    Ruby:    a = { c: 'd' }
    Address: 1    2    3    4    5    6    7    8    9    a    b
    Heap:    Obj  Obj  Obj  Obj  Obj  5    3    2    Free Free 1
    Objects shown: {}, :c, 'd'


  64. Reference Updating
    After Compaction
    Ruby:    a = { c: 'd' }
    Address: 1    2    3    4    5    6    7    8    9    a    b
    Heap:    Obj  Obj  Obj  Obj  Obj  5    3    2    Free Free 1
    Objects shown: {}, :c, 'd'
    The four slots holding forwarding addresses (6, 7, 8, b) become Free

    View Slide
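
The reference updating phase can be sketched the same way. This is a toy model, not the real implementation: slots are Ruby hashes, `moved_to` stands in for the forwarding address left in a moved slot, indexes are 0-based, and `update_references` is a hypothetical name.

```ruby
# A toy heap after objects have moved. Each slot is one of:
#   { refs: [...] }   a live object holding the slot indexes it references
#   { moved_to: n }   a forwarding marker left at an object's old slot
#   nil               a free slot
def update_references(heap)
  # Pass 1: rewrite every reference that points at a forwarding marker.
  heap.each do |slot|
    next unless slot && slot[:refs]
    slot[:refs].map! do |ref|
      target = heap[ref]
      target && target[:moved_to] ? target[:moved_to] : ref
    end
  end
  # Pass 2: the markers are no longer needed, so their slots become free.
  heap.map { |slot| slot && slot[:moved_to] ? nil : slot }
end

heap = [
  { refs: [4, 5] }, # 0: the hash, still pointing at the OLD slots
  { refs: [] },     # 1: :c, moved here from slot 4
  { refs: [] },     # 2: 'd', moved here from slot 5
  nil,              # 3: free
  { moved_to: 1 },  # 4: forwarding marker
  { moved_to: 2 },  # 5: forwarding marker
]

updated = update_references(heap)
p updated[0][:refs]
p updated
```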

  65. Done!!


  66. Implementation Details


  67. Code:
    https://github.com/github/ruby/tree/gc-compact


  68. GC.compact


  69. Usage
    # Parent unicorn process
    load_all_of_rails_and_dependencies
    load_all_of_application
    GC.compact
    N.times do
      fork do
        # Child worker processes
        # handle requests
      end
    end

    View Slide
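
A runnable variation of the usage pattern above, as a sketch only. `compact_then_fork` is a hypothetical helper, the workers do nothing and exit immediately, and the `respond_to?` guard is an assumption for running on Rubies without `GC.compact` (the method shipped in Ruby 2.7, after this talk). It also assumes a platform where `fork` is available.

```ruby
# Hypothetical helper: compact the parent's heap (when supported),
# then fork workers and wait for them.
def compact_then_fork(worker_count)
  # Compact once in the parent, after everything is loaded and before forking.
  GC.compact if GC.respond_to?(:compact)

  pids = Array.new(worker_count) do
    fork do
      # A real worker would handle requests here.
      exit!(0)
    end
  end

  # Wait for every child and collect its Process::Status.
  pids.map { |pid| Process.wait2(pid).last }
end

statuses = compact_then_fork(2)
puts "workers exited cleanly: #{statuses.all?(&:success?)}"
```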

  70. Changes to gc.c


  71. `gc_move`
    1 2 3
    Free Obj
    Address
    Heap 1
    Obj


  72. `T_MOVED`
    1 2 3
    Obj
    Address
    Heap 1
    T_MOVED
    Obj


  73. `gc_compact_heap`
    1 2 3
    Obj
    Address
    Heap
    ☝ ☝
    Free Obj


  74. `gc_update_object_references`
    1 2 3
    Obj
    Address
    Heap Free Obj
    1 2 3
    Obj
    Address
    Heap 1
    Obj


  75. Reference Update Helpers
    • gc_ref_update_array

    • gc_ref_update_object

    • hash_foreach_replace

    • gc_ref_update_method_entry

    • ….. etc


  76. `pinned_bits[];`


  77. What objects can move?


  78. What can move?
    • Everything


  79. Finding References


  80. Pure Ruby References
    class Foo
      def initialize obj
        @bar = obj
      end
    end
    class Bar
    end
    bar = Bar.new
    foo = Foo.new(bar)
    [Diagram: Foo references Bar through @bar]


  81. C References
    class Bar
    end
    bar = Bar.new
    foo = Foo.new(bar)
    Foo
    Bar
    ???


  82. C References
    class Bar
    end
    bar = Bar.new
    foo = Foo.new(bar)
    Foo
    Bar
    rb_gc_mark( )
    T_MOVED


  83. C References
    class Bar
    end
    bar = Bar.new
    foo = Foo.new(bar)
    Foo
    Bar
    rb_gc_mark( )
    Cannot update


  84. rb_gc_mark
    1. Mark the object

    2. Pin the object in `pinned_bits` table


  85. GC.compact
    1. Full GC (so objects get pinned)

    2. Compact objects

    3. Update references


  86. What can move?
    • Everything

    • Except objects marked with `rb_gc_mark`

    View Slide
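
Pinning fits into the two-finger pass as one extra check on the scan side. The sketch below is a toy model under stated assumptions: a Ruby `Set` stands in for the `pinned_bits` table, slot indexes are 0-based, and `compact_with_pins` is a hypothetical name rather than a function in gc.c.

```ruby
require 'set'

# heap: array of slots (nil = free).
# pinned: Set of slot indexes that must not move.
# Returns the compacted heap and a table of moves (old index => new index).
def compact_with_pins(heap, pinned)
  heap = heap.dup
  moved = {}
  free = 0
  scan = heap.length - 1

  loop do
    free += 1 while free < scan && !heap[free].nil?
    # Skip free slots AND pinned objects when looking for something to move.
    scan -= 1 while scan > free && (heap[scan].nil? || pinned.include?(scan))
    break if free >= scan

    heap[free] = heap[scan]
    heap[scan] = nil
    moved[scan] = free
  end

  [heap, moved]
end

heap   = [nil, :a, nil, :b, :c_ext, :d]
pinned = Set[4] # in this toy example, slot 4 was marked through rb_gc_mark

compacted, moved = compact_with_pins(heap, pinned)
p compacted
p moved
```

The pinned object stays at index 4 while the movable objects around it are packed toward the front, which leaves a hole the compactor cannot close.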

  87. Movement Problems


  88. Hash Tables


  89. Hashing
    Object
    hash_key( )
    = memory address


  90. Fix: cache hash key

    View Slide
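
To see why address-based hashing breaks when objects move, and why caching the hash helps, here is a toy hash table. Everything in it is an assumption made for illustration: `ToyTable` and `Obj` are hypothetical, the `address` field is a made-up integer, and Ruby's own Hash works differently internally.

```ruby
# Toy hash table with a fixed number of buckets and a caller-supplied hash function.
class ToyTable
  def initialize(&hasher)
    @hasher = hasher
    @buckets = Array.new(8) { [] }
  end

  def store(key, value)
    bucket(key) << [key, value]
  end

  def fetch(key)
    pair = bucket(key).find { |k, _| k.equal?(key) }
    pair && pair[1]
  end

  private

  def bucket(key)
    @buckets[@hasher.call(key) % @buckets.size]
  end
end

# A stand-in for a heap object: a simulated address plus an optional cached hash.
Obj = Struct.new(:address, :cached_hash)

by_address = ToyTable.new { |o| o.address }                      # hash = current address
by_cache   = ToyTable.new { |o| o.cached_hash ||= o.address }    # hash computed once, then cached

a = Obj.new(1)
b = Obj.new(1)
by_address.store(a, :found)
by_cache.store(b, :found)

# Simulate the compactor moving both objects to a new address.
a.address = 2
b.address = 2

lost  = by_address.fetch(a) # looks in the bucket for address 2, but was stored under 1
found = by_cache.fetch(b)   # cached hash still leads to the original bucket

p lost
p found
```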

  91. What can move?
    • Everything

    • Except objects marked with `rb_gc_mark`

    • and hash keys


  92. Dual References


  93. Dual References
    Foo
    Bar
    Baz
    T_MOVED


  94. Dual References
    Foo
    T_MOVED
    Baz
    Bar


  95. Fix: Call rb_gc_mark,
    or use only Ruby
    https://github.com/msgpack/msgpack-ruby/pull/135


  96. What can move?
    • Everything

    • Except objects marked with `rb_gc_mark`

    • and hash keys

    • and dual referenced objects


  97. Global Variables


  98. Global Variables (in C)
    #include <ruby.h>

    /* Global reference to the class object; the GC cannot see or update it */
    VALUE cFoo;
    void Init_foo(void) {
      cFoo = rb_define_class("Foo", rb_cObject);
    }


  99. Fix: use heuristics to pin objects


  100. What can move?
    • Everything

    • Except objects marked with `rb_gc_mark`

    • and hash keys

    • and dual referenced objects

    • and objects created with `rb_define_class`


  101. String Literals


  102. String Literals
    def foo
      puts "hello world"
    end
    [Diagram: ISeq, its literals (array), its bytecode, and the string "hello world"]


  103. Updating bytecode is hard


  104. What can move?
    • Everything

    • Except objects marked with `rb_gc_mark`

    • and hash keys

    • and dual referenced objects

    • and objects created with `rb_define_class`

    • and string literals


  105. It seems like nothing can move


  106. Most can be fixed


  107. 46% can move!


  108. Before Compaction
    [Chart: number of slots (0 to 400) per heap page; legend: F, U, P]


  109. After Compaction
    [Chart: number of slots (0 to 400) per heap page; legend: F, U, P]


  110. Inspecting Memory


  111. ObjectSpace.dump_all


  112. ObjectSpace.dump_all
    require "objspace"
    File.open("out.json", "w") { |f|
      ObjectSpace.dump_all(output: f)
    }


  113. Measuring Rails Boot
    $ RAILS_ENV=production \
      bin/rails r \
      'require "objspace"; GC.compact; File.open("out.json", "w") { |f|
        ObjectSpace.dump_all(output: f)
      }'


  114. Output
    {"address":"0x7fcc6e01a198", "type":"OBJECT", "class":"0x7fcc6c93d420", "ivars":3,
    "references":["0x7fcc6e01bed0"], "memsize":40, "flags":{"wb_protected":true, "old":true,
    "uncollectible":true, "marked":true}}


  115. Output
    {
      "address": "0x7fcc6e01a198",
      "type": "OBJECT",
      "class": "0x7fcc6c93d420",
      "ivars": 3,
      "references": [
        "0x7fcc6e01bed0"
      ],
      "memsize": 40,
      "flags": {
        "wb_protected": true,
        "old": true,
        "uncollectible": true,
        "marked": true
      }
    }
    Highlighted: address, references, size


  116. Address = Location

    View Slide
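
A small sketch of reading the dump back in, assuming one JSON document per line as in the output shown above. It writes to a temporary file rather than `out.json`, and the tallies (`counts`, `total_memsize`) are illustrative names, not part of any tool from the talk.

```ruby
require 'objspace'
require 'json'
require 'tempfile'

counts = Hash.new(0) # object type => number of objects
total_memsize = 0    # sum of the "memsize" fields, in bytes

Tempfile.create('heap-dump') do |f|
  ObjectSpace.dump_all(output: f)
  f.flush

  File.foreach(f.path) do |line|
    obj = JSON.parse(line)
    next unless obj['address'] # entries without an address (such as roots) are skipped
    counts[obj['type']] += 1
    total_memsize += obj.fetch('memsize', 0)
  end
end

# Print the five most common object types.
counts.sort_by { |_, n| -n }.first(5).each do |type, n|
  puts format('%-10s %d', type, n)
end
puts "total memsize: #{total_memsize} bytes"
```

Because each address is the object's location, the same loop could group objects by address range to look at fragmentation.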

  117. Heap Fragmentation
    Object
    Empty


  118. Heap Fragmentation
    Object
    Empty


  119. Heap Fragmentation
    Pinned
    Empty
    Moves


  120. Heap Fragmentation
    Pinned
    Empty
    Moves


  121. https://github.com/tenderlove/heap-utils


  122. Inspecting CoW Memory


  123. /proc/{PID}/smaps


  124. /proc/{PID}/smaps
    55a92679a000-55a926b53000 rw-p 00000000 00:00 0 [heap]
    Size: 3812 kB
    Rss: 3620 kB
    Pss: 3620 kB
    Shared_Clean: 0 kB
    Shared_Dirty: 0 kB
    Private_Clean: 0 kB
    Private_Dirty: 3620 kB
    Referenced: 3620 kB
    Anonymous: 3620 kB
    AnonHugePages: 0 kB
    Shared_Hugetlb: 0 kB
    Private_Hugetlb: 0 kB
    Swap: 0 kB
    SwapPss: 0 kB
    KernelPageSize: 4 kB
    MMUPageSize: 4 kB
    Locked: 0 kB
    Address
    Range
    RSS & PSS
    Shared
    Dirty


  125. RSS
    Shared_Clean + Shared_Dirty + Private_Clean + Private_Dirty


  126. PSS
    (Shared_Dirty / Number of Processes) + Shared_Clean +
    Private_Clean + Private_Dirty


  127. RSS vs PSS
                     RSS       PSS
    Unicorn Parent   3620 kB   1840 kB
    Unicorn Child    3620 kB   1840 kB
    Total Usage is 3620 kB not 7240 kB

    View Slide
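
The RSS and PSS formulas can be checked against the smaps entry shown earlier. This is a sketch: `parse_smaps_entry`, `rss_kb`, and `pss_kb` are hypothetical helpers, and `pss_kb` follows the formula as written on the PSS slide. As far as I know, the kernel divides every shared page, clean or dirty, by the number of processes mapping it, so treat the slide's version as a simplification.

```ruby
# Parse the "Name: 123 kB" lines of one smaps entry into a Hash of integers (kB).
def parse_smaps_entry(text)
  text.each_line.with_object({}) do |line, fields|
    match = line.match(/\A\s*(\w+):\s+(\d+) kB/)
    fields[match[1]] = match[2].to_i if match
  end
end

# RSS as defined on the slide.
def rss_kb(f)
  f['Shared_Clean'] + f['Shared_Dirty'] + f['Private_Clean'] + f['Private_Dirty']
end

# PSS as written on the slide; process_count is how many processes share the pages.
def pss_kb(f, process_count)
  (f['Shared_Dirty'] / process_count) + f['Shared_Clean'] +
    f['Private_Clean'] + f['Private_Dirty']
end

entry = <<~SMAPS
  55a92679a000-55a926b53000 rw-p 00000000 00:00 0 [heap]
  Size: 3812 kB
  Rss: 3620 kB
  Pss: 3620 kB
  Shared_Clean: 0 kB
  Shared_Dirty: 0 kB
  Private_Clean: 0 kB
  Private_Dirty: 3620 kB
SMAPS

fields = parse_smaps_entry(entry)
puts "computed RSS: #{rss_kb(fields)} kB (reported #{fields['Rss']} kB)"
puts "computed PSS: #{pss_kb(fields, 1)} kB (reported #{fields['Pss']} kB)"
```

To inspect a live process on Linux, the same parser could be fed sections of `/proc/{PID}/smaps`.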

  128. Copying Memory
    x = "x" * 9000
    p PID: $$
    gets
    child_pid = fork do
      puts "forked"
      9000.times do |i|
        # Pause every 1000 writes so memory can be inspected
        puts("I: #{i}") || gets if i % 1000 == 0
        x[i] = 109.chr
      end
      puts "done"
      gets
    end
    Process.waitpid child_pid


  129. Shared_Dirty, PSS, RSS
    [Chart: Shared_Dirty, PSS, and RSS; y-axis: memory in kB (0 to 4000); x-axis: number of writes (0 to 9000)]


  130. Compaction Impact
    p PID: $$
    arry = []
    GC.start
    gets
    GC.compact if ENV["COMPACT"]
    child_pid = fork do
      pages = GC.stat :heap_allocated_pages
      # Fill heap: allocate until Ruby has to add a page
      while pages == GC.stat(:heap_allocated_pages)
        arry << Object.new
      end
      puts "done"
      gets
    end
    Process.waitpid child_pid


  131. No Compaction PSS: 2684 kB
    Compaction PSS: 2530 kB


  132. Compaction Savings: 154 kB


  133. Conclusion


  134. Compaction Savings:
    Unknown


  135. Use `ObjectSpace`


  136. /proc/{PID}/smaps


  137. Why compact?


  138. "Impossible"


  139. Question Your Assumptions


  140. We’ve entered grass



  141. Thank you!
