Building a Compacting GC

Aaron Patterson
September 20, 2017

This is a talk about building a compacting GC that I presented at RubyKaigi 2017.

Transcript

  1. Exploring Memory in Ruby: Building a Compacting GC

  2. Aaron Patterson

  3. @tenderlove

  4. (no text on this slide)
  5. Ruby Core && Rails Core

  6. 草生える Grass Grows !!! * Note: English speakers please ask me about this slide, it cannot be translated ❤
  7. X GitHub

  8. git push -f

  9. Two Cats

  10. (no text on this slide)
  11. (no text on this slide)
  12. Mark / Compact GC

  13. Exploring Memory in Ruby: Copy on Write; Building a Compacting GC (in MRI); Memory Inspection Tools
  14. Low Level

  15. New People:

  16. What is "Copy on Write"? What is "Compaction"? I can do this!
  17. Experienced People:

  18. Algorithms Implementation Details

  19. Copy on Write Optimization

  20. CoW

  21. What is CoW?

  22. Ruby String

        require 'objspace'

        str = "x" * 9000
        p ObjectSpace.memsize_of(str)  # => 9041 (initial string)
        str2 = str.dup
        p ObjectSpace.memsize_of(str2) # => 40 (no copy)
        str2[1] = 'l'
        p ObjectSpace.memsize_of(str2) # => 9041 (copied "on write")
  23. Ruby Array

        require 'objspace'

        array = ["x"] * 9000
        p ObjectSpace.memsize_of(array)  # => 72040 (initial array)
        array2 = array.dup
        p ObjectSpace.memsize_of(array2) # => 40 (no copy)
        array2[1] = 'l'
        p ObjectSpace.memsize_of(array2) # => 72040 (copied "on write")
  24. Ruby Hash

        require 'objspace'

        hash = ('a'..'zzz').each_with_object({}) { |k,h| h[k] = :hello }
        p ObjectSpace.memsize_of(hash)  # => 917600 (initial hash)
        hash2 = hash.dup
        p ObjectSpace.memsize_of(hash2) # => 917600 (did copy)
  25. No Observable Difference

  26. Operating System

  27. `fork`

  28. Ruby Fork

        string = "x" * 90000   # initial string
        p PARENT_PID: $$
        gets

        child_pid = fork do    # no copy
          p CHILD_PID: $$
          gets
          string[1] = 'y'      # copied "on write"
          gets
        end

        Process.waitpid child_pid
  29. OS Memory Copy (diagram: Parent Process and Child Process memory, four 4k pages of "xxxx" each; one written page reads "xyxx")
  30. "CoW Page Fault"

  31. Why is CoW Important?

  32. Unicorn is a forking webserver

  33. Unicorn Parent Unicorn Child Unicorn Child Unicorn Child

  34. Unicorn Parent Unicorn Child Unicorn Child Unicorn Child

  35. Reduce Boot Time

  36. Decreases Memory Usage

  37. This is how it works today.

  38. Reducing Page Faults

  39. What causes page faults?

  40. Mutating Shared Memory

  41. Garbage Collector

  42. Object Allocation

  43. Object Allocation (diagram: Ruby Objects, legend Empty / Filled; Parent Process Memory; Child Process Memory)
  44. How can we reduce this space?

  45. GC compaction

  46. Compact Before Fork (diagram: Ruby Objects, legend Empty / Filled; Parent Process Memory; Page 1, Page 2)
  47. Compact Before Fork (diagram: Ruby Objects, legend Empty / Filled; Parent Process Memory; Page 1, Page 2; Child Process Memory)
  48. GC Compaction

  49. X GitHub

  50. What is "compaction"?

  51. Compaction (diagram: Ruby Objects, legend Empty / Filled; Parent Process Memory; Page 1, Page 2)
  52. Why compact?

  53. Reduce Memory Usage

  54. "Impossible"

  55. Compaction Algorithms

  56. Two Finger Compaction ☝ ☝

  57. Disadvantages • It’s slow • Objects move to random places

  58. Advantage • It’s EASY!

  59. Algorithm: Object Movement, Reference Updating

  60. Object Movement

        Address: 1    2    3    4    5    6    7    8    9    a    b
        Heap:    Free Free Free Obj  Free Obj  Obj  Obj  Free Free Obj

      ☝ Free Pointer, ☝ Scan Pointer. Animation labels: 1, 2, 3, 5, Done!
  61. Reference Updating (Before Compaction)

        Address: 1    2    3    4    5    6    7    8    9    a    b
        Heap:    Free Free Free Obj  Free Obj  Obj  Obj  Free Free Obj

      Ruby: a = { c: 'd' } (objects: {}, :c, 'd')
  62. Reference Updating (After Compaction)

        Address: 1    2    3    4    5    6    7    8    9    a    b
        Heap:    Obj  Obj  Obj  Obj  Obj  5    3    2    Free Free 1

      Ruby: a = { c: 'd' } (objects: {}, :c, 'd')
  63. Reference Updating (After Compaction; same heap as slide 62, with a ☝ pointer)
  64. Reference Updating (After Compaction; same heap as slide 62, with labels Free Free Free Free)
  65. Done!!
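
The Object Movement and Reference Updating slides above can be sketched in a few lines of Ruby. This is a toy model on an array, not the code in gc.c: `Obj`, `Moved` (standing in for `T_MOVED`), and `two_finger_compact` are hypothetical names, and slots are 0-indexed, so the slides' address 1 is index 0 here.

```ruby
# Toy heap: each slot is nil (free), an Obj, or a Moved forwarding marker.
Obj   = Struct.new(:name, :refs) # refs: slot indexes of referenced objects
Moved = Struct.new(:to)          # hypothetical stand-in for T_MOVED

def two_finger_compact(heap)
  free = 0               # free pointer: walks right looking for empty slots
  scan = heap.size - 1   # scan pointer: walks left looking for objects

  loop do
    free += 1 while free < scan && !heap[free].nil?
    scan -= 1 while scan > free && !heap[scan].is_a?(Obj)
    break if free >= scan

    heap[free] = heap[scan]       # move the object into the free slot
    heap[scan] = Moved.new(free)  # leave a forwarding address behind
  end

  # Reference updating: rewrite references that point at forwarding markers.
  heap.each do |slot|
    next unless slot.is_a?(Obj)
    slot.refs.map! { |i| heap[i].is_a?(Moved) ? heap[i].to : i }
  end

  # Forwarding slots are no longer needed and become free.
  heap.map! { |slot| slot.is_a?(Moved) ? nil : slot }
end

# Same layout as the slides: objects at addresses 4, 6, 7, 8 and b.
heap = Array.new(11)
heap[3]  = Obj.new("{}", [7, 10]) # the hash references :c and 'd'
heap[5]  = Obj.new("x", [])
heap[6]  = Obj.new("y", [])
heap[7]  = Obj.new(":c", [])
heap[10] = Obj.new("'d'", [])

two_finger_compact(heap)
p heap.map { |slot| slot&.name }
# => ["'d'", ":c", "y", "{}", "x", nil, nil, nil, nil, nil, nil]
p heap[3].refs # => [1, 0]
```

The forwarding addresses left behind in this run (index 4, 2, 1 and 0) correspond to the 5, 3, 2 and 1 shown in the After Compaction heap.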

  66. Implementation Details

  67. Code: https://github.com/github/ruby/tree/gc-compact

  68. GC.compact

  69. Usage

        # Parent unicorn process
        load_all_of_rails_and_dependencies
        load_all_of_application
        GC.compact
        N.times do
          fork do
            # Child worker processes
            # handle requests
          end
        end
  70. Changes to gc.c

  71. `gc_move` (diagram labels: Address 1 2 3; Heap; Free, Obj; 1)
  72. `T_MOVED` (diagram labels: Address 1 2 3; Heap; Obj, T_MOVED, 1)
  73. `gc_compact_heap` (diagram labels: Address 1 2 3; Heap; Obj, Free; two ☝ pointers)
  74. `gc_update_object_references` (two heap diagrams, each labeled Address 1 2 3 and Heap; Free, Obj; 1)
  75. Reference Update Helpers • gc_ref_update_array • gc_ref_update_object • hash_foreach_replace • gc_ref_update_method_entry • ….. etc
  76. `pinned_bits[];`

  77. What objects can move?

  78. What can move? • Everything

  79. Finding References

  80. Pure Ruby References

        class Foo
          def initialize obj
            @bar = obj
          end
        end

        class Bar
        end

        bar = Bar.new
        foo = Foo.new(bar)

      Diagram: Foo references Bar through @bar
  81. C References

        class Bar
        end

        bar = Bar.new
        foo = Foo.new(bar)

      Diagram: Foo, Bar, ???
  82. C References (same code as slide 81). Diagram: Foo, Bar, rb_gc_mark( ), T_MOVED
  83. C References (same code as slide 81). Diagram: Foo, Bar, rb_gc_mark( ), Cannot update
  84. rb_gc_mark 1. Mark the object 2. Pin the object in `pinned_bits` table
  85. GC.compact 1. Full GC (so objects get pinned) 2. Compact objects 3. Update references
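
Step 1 above (a full GC so that objects get pinned) can be illustrated with a toy mark pass. This is a sketch under stated assumptions, not MRI's implementation: `ToyObj`, `ruby_refs`, `c_refs`, and `mark_and_pin` are hypothetical names, and a plain `Set` stands in for the `pinned_bits` table.

```ruby
require 'set'

# Hypothetical toy objects. ruby_refs are references the GC can see and rewrite;
# c_refs are references a C extension reports by calling rb_gc_mark.
ToyObj = Struct.new(:name, :ruby_refs, :c_refs)

# Walk everything reachable from the roots, recording marks and pins.
def mark_and_pin(roots)
  marked = Set.new
  pinned = Set.new
  worklist = roots.dup

  until worklist.empty?
    obj = worklist.pop
    next unless marked.add?(obj.name)            # skip objects already marked

    obj.c_refs.each { |ref| pinned << ref.name } # rb_gc_mark style: mark and pin
    worklist.concat(obj.ruby_refs + obj.c_refs)
  end

  [marked, pinned]
end

bar = ToyObj.new("bar", [], [])
baz = ToyObj.new("baz", [], [])
foo = ToyObj.new("foo", [baz], [bar]) # foo holds bar through C and baz through Ruby

marked, pinned = mark_and_pin([foo])
movable = marked - pinned

p pinned.to_a       # => ["bar"]
p movable.to_a.sort # => ["baz", "foo"]
```

Only the objects left in `movable` would be candidates for the compaction and reference-updating steps.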
  86. What can move? • Everything • Except objects marked with `rb_gc_mark`
  87. Movement Problems

  88. Hash Tables

  89. Hashing Object hash_key( ) = memory address

  90. Fix: cache hash key
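
The hashing problem and the caching fix can be shown with a small simulation. The slides do not show how the branch caches the key, so this is only an illustration: `ToyObject`, `address_hash`, and `cached_hash` are hypothetical names, and the "address" is a plain integer attribute.

```ruby
# Hypothetical toy object whose hash key is derived from its (simulated) address.
class ToyObject
  attr_accessor :address

  def initialize(address)
    @address = address
    @cached_hash = nil
  end

  # No caching: the key follows the address, so it changes when the object moves.
  def address_hash
    address
  end

  # Caching: the first key computed is remembered and survives a move.
  def cached_hash
    @cached_hash ||= address
  end
end

obj = ToyObject.new(0xb)
plain_before  = obj.address_hash
cached_before = obj.cached_hash

obj.address = 0x1 # compaction moves the object

p obj.address_hash == plain_before # => false
p obj.cached_hash == cached_before # => true
```

With the uncached key, a table that filed the object under its old address would no longer find it after the move.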

  91. What can move? • Everything • Except objects marked with `rb_gc_mark` • and hash keys
  92. Dual References

  93. Dual References Foo Bar Baz T_MOVED

  94. Dual References Foo T_MOVED Baz Bar

  95. Fix: Call rb_gc_mark, or use only Ruby https://github.com/msgpack/msgpack-ruby/pull/135

  96. What can move? • Everything • Except objects marked with `rb_gc_mark` • and hash keys • and dual referenced objects
  97. Global Variables

  98. Global Variables (in C)

        VALUE cFoo;

        void Init_foo() {
          cFoo = rb_define_class("Foo", rb_cObject);
        }
  99. Fix: use heuristics to pin objects

  100. What can move? • Everything • Except objects marked with `rb_gc_mark` • and hash keys • and dual referenced objects • and objects created with `rb_define_class`
  101. String Literals

  102. String Literals

        def foo
          puts "hello world"
        end

      Diagram: ISeq, literals (array), "hello world", bytecode
  103. Updating bytecode is hard

  104. What can move? • Everything • Except objects marked with `rb_gc_mark` • and hash keys • and dual referenced objects • and objects created with `rb_define_class` • and string literals
  105. It seems like nothing can move

  106. Most can be fixed

  107. 46% can move!

  108. Before Compaction (chart: legend F, U, P; axes: Pages, Number of slots from 0 to 400)
  109. After Compaction (chart: legend F, U, P; axes: Pages, Number of slots from 0 to 400)
  110. Inspecting Memory

  111. ObjectSpace.dump_all

  112. ObjectSpace.dump_all

        require "objspace"
        File.open("out.json", "w") { |f| ObjectSpace.dump_all(output: f) }

  113. Measuring Rails Boot

        $ RAILS_ENV=production \
            bin/rails r \
            'require "objspace"; GC.compact; File.open("out.json", "w") { |f| ObjectSpace.dump_all(output: f) }'
  114. Output

        {"address":"0x7fcc6e01a198", "type":"OBJECT", "class":"0x7fcc6c93d420", "ivars":3, "references":["0x7fcc6e01bed0"], "memsize":40, "flags":{"wb_protected":true, "old":true, "uncollectible":true, "marked":true}}
  115. Output

        {
          "address": "0x7fcc6e01a198",
          "type": "OBJECT",
          "class": "0x7fcc6c93d420",
          "ivars": 3,
          "references": [
            "0x7fcc6e01bed0"
          ],
          "memsize": 40,
          "flags": {
            "wb_protected": true,
            "old": true,
            "uncollectible": true,
            "marked": true
          }
        }

      Callouts: address, references, size
  116. Address = Location
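
Because each record carries its address, a dump line can be mapped to a location in the heap. The sketch below parses a shortened version of the record from the Output slides and rounds the address down to a page boundary. `PAGE_SIZE` and `page_start` are assumptions made for illustration: the slides do not state the heap page size, and 16 KiB is used here only as a plausible value.

```ruby
require 'json'

# A shortened version of the record shown on the Output slides.
LINE = '{"address":"0x7fcc6e01a198", "type":"OBJECT", ' \
       '"references":["0x7fcc6e01bed0"], "memsize":40}'

PAGE_SIZE = 16 * 1024 # assumption: 16 KiB heap pages

# Round an address down to the start of the page that contains it.
def page_start(address, page_size = PAGE_SIZE)
  address - (address % page_size)
end

record  = JSON.parse(LINE)
address = record["address"].hex

puts format("object at 0x%x, page starts at 0x%x", address, page_start(address))
# prints: object at 0x7fcc6e01a198, page starts at 0x7fcc6e018000

# How many of its references live on the same (assumed) page?
same_page = record["references"].count do |ref|
  page_start(ref.hex) == page_start(address)
end
p same_page # => 1
```

Grouping every record of a dump by `page_start` gives a per-page picture of which slots are filled, which is the kind of view a fragmentation chart needs.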

  117. Heap Fragmentation Object Empty

  118. Heap Fragmentation Object Empty

  119. Heap Fragmentation Pinned Empty Moves

  120. Heap Fragmentation Pinned Empty Moves

  121. https://github.com/tenderlove/heap-utils

  122. Inspecting CoW Memory

  123. /proc/{PID}/smaps

  124. /proc/{PID}/smaps

        55a92679a000-55a926b53000 rw-p 00000000 00:00 0    [heap]
        Size:               3812 kB
        Rss:                3620 kB
        Pss:                3620 kB
        Shared_Clean:          0 kB
        Shared_Dirty:          0 kB
        Private_Clean:         0 kB
        Private_Dirty:      3620 kB
        Referenced:         3620 kB
        Anonymous:          3620 kB
        AnonHugePages:         0 kB
        Shared_Hugetlb:        0 kB
        Private_Hugetlb:       0 kB
        Swap:                  0 kB
        SwapPss:               0 kB
        KernelPageSize:        4 kB
        MMUPageSize:           4 kB
        Locked:                0 kB

      Callouts: Address Range, RSS & PSS, Shared Dirty
  125. RSS = Shared_Clean + Shared_Dirty + Private_Clean + Private_Dirty

  126. PSS = (Shared_Dirty / Number of Processes) + Shared_Clean + Private_Clean + Private_Dirty
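
The two formulas above can be applied to the fields of an smaps entry. The sketch below uses the values from the /proc/{PID}/smaps slide and implements the formulas exactly as written on the slides; the kernel already reports `Rss` and `Pss` itself, so this is only meant to show where the numbers come from. `parse_smaps_fields`, `rss_kb`, and `pss_kb` are hypothetical helper names.

```ruby
# Fields copied from the /proc/{PID}/smaps slide.
SMAPS_SAMPLE = <<~SMAPS
  Rss:                3620 kB
  Pss:                3620 kB
  Shared_Clean:          0 kB
  Shared_Dirty:          0 kB
  Private_Clean:         0 kB
  Private_Dirty:      3620 kB
SMAPS

# Turn "Name:   123 kB" lines into a { "Name" => 123 } hash.
def parse_smaps_fields(text)
  text.each_line.with_object({}) do |line, fields|
    match = line.match(/\A(\w+):\s+(\d+) kB/)
    fields[match[1]] = match[2].to_i if match
  end
end

# RSS as defined on the slide: the sum of the four clean/dirty counters.
def rss_kb(fields)
  fields.values_at("Shared_Clean", "Shared_Dirty",
                   "Private_Clean", "Private_Dirty").sum
end

# PSS as defined on the slide: shared dirty memory is split between processes.
def pss_kb(fields, process_count)
  fields["Shared_Dirty"] / process_count.to_f +
    fields["Shared_Clean"] + fields["Private_Clean"] + fields["Private_Dirty"]
end

fields = parse_smaps_fields(SMAPS_SAMPLE)
p rss_kb(fields)    # => 3620
p pss_kb(fields, 1) # => 3620.0
```

For this single-process sample everything is private, so both results match the `Rss` and `Pss` lines in the entry.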
  127. RSS vs PSS

                          RSS       PSS
        Unicorn Parent    3620 kB   1840 kB
        Unicorn Child     3620 kB   1840 kB

      Total Usage is 3620 kB not 7240 kB
  128. Copying Memory

        x = "x" * 9000
        p PID: $$
        gets

        child_pid = fork do
          puts "forked"
          9000.times do |i|
            puts("I: #{i}") || gets if i % 1000 == 0
            x[i] = 109.chr
          end
          puts "done"
          gets
        end

        Process.waitpid child_pid
  129. Shared_Dirty, PSS, RSS (chart: Memory in Kb, 0 to 4000, against Number of Writes, 0 to 9000; series Shared_Dirty, PSS, RSS)
  130. Compaction Impact

        p PID: $$
        arry = []
        GC.start
        gets
        GC.compact if ENV["COMPACT"]

        child_pid = fork do
          pages = GC.stat :heap_allocated_pages
          # fill heap
          while pages == GC.stat(:heap_allocated_pages)
            arry << Object.new
          end
          puts "done"
          gets
        end

        Process.waitpid child_pid
  131. No Compaction PSS: 2684Kb; Compaction PSS: 2530Kb

  132. Compaction Savings: 154Kb

  133. Conclusion

  134. Compaction Savings: Unknown

  135. Use `ObjectSpace`

  136. /proc/{PID}/smaps

  137. Why compact?

  138. "Impossible"

  139. Question Your Assumptions

  140. We’ve entered grass 草に入った

  141. Thank you!