Compacting GC in MRI v2

Compacting GC in MRI v2

About compacting GC in MRI (v2). For Ruby Kaigi 2019

F29327647a9cff5c69618bae420792ea?s=128

Aaron Patterson

April 18, 2019
Tweet

Transcript

  1. 4.
  2. 5.
  3. 6.
  4. 18.
  5. 25.
  6. 26.
  7. 29.

    CoW Friendliness Computer Memory Allocated Memory Allocated Memory Allocated Memory

    Free Memory Free Memory Allocated Memory Allocated Memory Allocated Memory Free Memory Free Memory Parent Process Child Process
  8. 30.

    CoW Friendliness Computer Memory Allocated Memory Allocated Memory Allocated Memory

    Free Memory Free Memory Allocated Memory Allocated Memory Allocated Memory Free Memory Free Memory Parent Process Child Process
  9. 31.

    CoW Friendliness Computer Memory Allocated Memory Allocated Memory Allocated Memory

    Free Memory Free Memory Allocated Memory Allocated Memory Allocated Memory Free Memory Free Memory Parent Process Child Process
  10. 32.

    CoW Friendliness Computer Memory Allocated Memory Allocated Memory Allocated Memory

    Free Memory Free Memory Allocated Memory Allocated Memory Allocated Memory Free Memory Free Memory Parent Process Child Process
  11. 36.
  12. 39.

    Ruby Heaps System Memory Malloc Heap Ruby’s Object Heap String.new

    "The Quick Brown Fox Jumps Over The Lazy Dog"
  13. 45.

    Ruby’s Heap Layout 40 bytes Each chunk is a "slot"

    Em pty Filled Empty Filled Moved
  14. 49.

    Ruby’s Heap Layout Each slot has a unique address 1

    2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
  15. 53.

    Moving Objects A B C D E F Free Scan

    1 2 3 4 5 6 7 8 9 10 Empty Filled Moved 4 5
  16. 54.

    Updating References A B C D E F 1 2

    3 4 5 6 7 8 9 10 Empty Filled Moved Before Moving Objects
  17. 55.

    Updating References A B C D 5 4 1 2

    3 4 5 6 7 8 9 10 Empty Filled F Moved E After Moving Objects
  18. 56.

    Updating References A B C D 5 4 1 2

    3 4 5 6 7 8 9 10 Empty Filled F Moved E After Moving Objects
  19. 57.

    Object Movement def compact heap = [ ... ] #

    many slots left = 0 right = heap.length - 1 while left < right left_slot = heap[left] right_slot = heap[right] if is_empty?(left_slot) && !is_empty?(right_slot) && can_move?(right_slot) swap(left, right) heap[right] = T_MOVED.new(left) # leave forwarding address end while !is_empty?(heap[left]) left += 1 end while is_empty?(heap[right]) || !can_move?(heap[right]) right -= 1 end end end Pointers Met? Copy / Forward Advance "free" Retreat "scan"
  20. 58.

    Reference Updating def update_references heap.each do |slot| next if is_empty?(slot)

    || is_moved?(slot) slot.references.each_with_index do |child, i| if is_moved?(child) slot.set_reference(i, child.new_location) end end end end How are references stored?
  21. 59.

    Finding References • How do Hashes hold references? • How

    do Arrays hold references? • How do Objects hold references? • … • … • …
  22. 60.

    Reference Updating static void gc_ref_update_array(rb_objspace_t * objspace, VALUE v) {

    long i, len; if (FL_TEST(v, ELTS_SHARED)) return; len = RARRAY_LEN(v); if (len > 0) { VALUE *ptr = (VALUE *)RARRAY_CONST_PTR_TRANSIENT(v); for(i = 0; i < len; i++) { UPDATE_IF_MOVED(objspace, ptr[i]); } } } static void gc_ref_update_object(rb_objspace_t * objspace, VALUE v) { uint32_t i, len = ROBJECT_NUMIV(v); VALUE *ptr = ROBJECT_IVPTR(v); for (i = 0; i < len; i++) { UPDATE_IF_MOVED(objspace, ptr[i]); } } static int hash_replace_ref(st_data_t *key, st_data_t *value, st_data_t argp, int existing)
  23. 71.

    Yajl typedef struct { VALUE builderStack; VALUE parse_complete_callback; int nestedArrayLevel;

    int nestedHashLevel; int objectsFound; int symbolizeKeys; yajl_handle parser; } yajl_parser_wrapper; C Code (yajl_ext.h) malloc(yajl_parser_wrapper) Ruby Object T_DATA Ruby Object Ruby Object builderStack parse_complete_callback
  24. 72.

    Yajl typedef struct { VALUE builderStack; VALUE parse_complete_callback; int nestedArrayLevel;

    int nestedHashLevel; int objectsFound; int symbolizeKeys; yajl_handle parser; } yajl_parser_wrapper; C Code (yajl_ext.h) malloc(yajl_parser_wrapper) Ruby Object T_DATA Ruby Object Ruby Object builderStack parse_complete_callback GC: "idk, " MOVED!
  25. 73.

    Yajl Mark Function void yajl_parser_wrapper_mark(void * wrapper) { yajl_parser_wrapper *

    w = wrapper; if (w) { rb_gc_mark(w->builderStack); rb_gc_mark(w->parse_complete_callback); } } malloc(yajl_parser_wrapper) Ruby Object T_DATA Ruby Object Ruby Object rb_gc_mark(builderStack) rb_gc_mark(parse_complete_callback)
  26. 75.

    Pinning Bits 1 2 3 4 5 6 7 8

    9 10 Yajl [ ] ? "foo" "bar" ? Address Content Pinned x = [ "foo", "bar" ] y = Yajl.new Ruby Code rb_gc_m ark rb_gc_m ark gc_m ark_no_pin gc_m ark_no_pin
  27. 76.

    Pinning Bits 1 2 3 4 5 6 7 8

    9 10 Yajl [ ] ? ? Address Content Pinned x = [ "foo", "bar" ] y = Yajl.new Ruby Code Free Scan "bar" "foo" 4 5 Move Step
  28. 77.

    Pinning Bits 1 2 3 4 5 6 7 8

    9 10 Yajl [ ] ? ? Address Content Pinned x = [ "foo", "bar" ] y = Yajl.new Ruby Code 4 5 Reference Update Step "bar" "foo" Update
  29. 84.

    Compaction Callback static const rb_data_type_t yajl_parser_type = { "Yajl/parser", {yajl_parser_wrapper_mark,

    yajl_parser_wrapper_free, NULL,}, 0, 0, RUBY_TYPED_FREE_IMMEDIATELY, }; Mark No Compaction Callback static const rb_data_type_t yajl_parser_type = { "Yajl/parser", {yajl_parser_wrapper_mark, yajl_parser_wrapper_free, NULL, yajl_parser_compact}, 0, 0, RUBY_TYPED_FREE_IMMEDIATELY, }; Compact With Compaction Callback Sweep
  30. 85.

    "No Pin" Marking void yajl_parser_wrapper_mark(void * wrapper) { yajl_parser_wrapper *

    w = wrapper; if (w) { rb_gc_mark(w->builderStack); rb_gc_mark(w->parse_complete_callback); } } No Compaction Support void yajl_parser_wrapper_mark(void * wrapper) { yajl_parser_wrapper * w = wrapper; if (w) { rb_gc_mark_no_pin(w->builderStack); rb_gc_mark_no_pin(w->parse_complete_callback); } } With Compaction Support
  31. 86.

    Compaction Callback void yajl_parser_compact(void *wrapper) { yajl_parser_wrapper * w =

    wrapper; if (w) { w->builderStack = rb_gc_new_location(w->builderStack); w->parse_complete_callback = rb_gc_new_location(w->parse_complete_callback); } } New Location
  32. 88.

    Problem Object Graph Object Implemented in Ruby Object Implemented in

    C Some Object Automatically Marked!! (gc_mark_no_pin) Not Marked
  33. 89.

    Compaction 1 2 3 4 5 6 7 8 9

    10 Ruby Obj C Obj ? 4 5 3
  34. 91.

    RubyVM Instruction Sequence ISeq Object (in C) def foo "bar"

    end Code Mark Array (Ruby) "bar" Marked Marked NOT Marked
  35. 92.

    "Direct Marking" in Ruby 2.6 ISeq Object (in C) def

    foo "bar" end Code "bar" Marked https://bugs.ruby-lang.org/issues/14370
  36. 93.

    MsgPack Object Implemented in Ruby Object Implemented in C Some

    Object Automatically Marked!! (gc_mark) Not Marked
  37. 98.

    object_id is based on location 1 2 3 4 5

    6 7 8 9 10 Ruby Obj Ruby Obj Ruby Obj 5 object#object_id => 1 object#object_id => 2 object#object_id => 9 object#object_id => ?
  38. 101.

    "Seen" Object IDs $seen_object_id = {} class Object def object_id

    $seen_object_id[memory_location] ||= memory_location end end
  39. 102.

    Object ID After Move x = Object.new x.object_id GC.compact x.object_id

    1 2 3 4 X Heap Object ID Table Memory Location Object ID 4 4 Updated Object ID Table Memory Location Object ID 1 4
  40. 103.

    Object ID Collisions x = Object.new x.object_id GC.compact x.object_id y

    = Object.new y.object_id 1 2 3 4 X Heap Object ID Table Memory Location Object ID 4 4 Updated Object ID Table Memory Location Object ID 1 4 y.object_id => ??? x.object_id => 4 Y
  41. 104.

    Collision Resolution $seen_object_id = {} $location_to_object_id = {} class Object

    def object_id id = memory_location while $seen_object_id[id] id += 1 end $seen_object_id[id] = id $location_to_object_id[memory_location] = id end end
  42. 105.

    Object ID Collisions x = Object.new x.object_id GC.compact x.object_id y

    = Object.new y.object_id 1 2 3 4 X Heap Object ID Table Memory Location Object ID 4 4 Updated Object ID Table Memory Location Object ID 1 4 y.object_id => 5 x.object_id => 4 Y Updated Object ID Table- Memory Location Object ID 1 4 4 5
  43. 106.

    GC Cleanup $seen_object_id = {} $location_to_object_id = {} def free_obj(obj)

    if $location_to_object_id[obj.memory_location] id = $location_to_object_id.delete(obj.memory_location) $seen_object_id.delete(id) end end
  44. 108.
  45. 117.

    Sliding Compaction 1 2 3 4 5 6 7 8

    9 10 Yajl [ ] ? "foo" "bar" ? Address Content