Slide 1

Slide 1 text

Exploring Memory in Ruby: Building a Compacting GC

Slide 2

Slide 2 text

Aaron Patterson

Slide 3

Slide 3 text

@tenderlove

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

Ruby Core && Rails Core

Slide 6

Slide 6 text

草生える (Grass Grows) !!!
* Note: English speakers, please ask me about this slide; it cannot be translated ❤

Slide 7

Slide 7 text

X GitHub

Slide 8

Slide 8 text

git push -f

Slide 9

Slide 9 text

Two Cats

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

No content

Slide 12

Slide 12 text

Mark / Compact GC

Slide 13

Slide 13 text

Exploring Memory in Ruby
• Copy on Write
• Building a Compacting GC (in MRI)
• Memory Inspection Tools

Slide 14

Slide 14 text

Low Level

Slide 15

Slide 15 text

New People:

Slide 16

Slide 16 text

What is "Copy on Write"? What is "Compaction"? I can do this!

Slide 17

Slide 17 text

Experienced People:

Slide 18

Slide 18 text

Algorithms Implementation Details

Slide 19

Slide 19 text

Copy on Write Optimization

Slide 20

Slide 20 text

CoW

Slide 21

Slide 21 text

What is CoW?

Slide 22

Slide 22 text

Ruby String

require 'objspace'

str = "x" * 9000
p ObjectSpace.memsize_of(str)   # => 9041   (Initial String)

str2 = str.dup
p ObjectSpace.memsize_of(str2)  # => 40     (No Copy)

str2[1] = 'l'
p ObjectSpace.memsize_of(str2)  # => 9041   (Copied "on write")

Slide 23

Slide 23 text

Ruby Array

require 'objspace'

array = ["x"] * 9000
p ObjectSpace.memsize_of(array)   # => 72040   (Initial Array)

array2 = array.dup
p ObjectSpace.memsize_of(array2)  # => 40      (No Copy)

array2[1] = 'l'
p ObjectSpace.memsize_of(array2)  # => 72040   (Copied "on write")

Slide 24

Slide 24 text

Ruby Hash

require 'objspace'

hash = ('a'..'zzz').each_with_object({}) { |k,h| h[k] = :hello }
p ObjectSpace.memsize_of(hash)   # => 917600   (Initial Hash)

hash2 = hash.dup
p ObjectSpace.memsize_of(hash2)  # => 917600   (Did Copy)

Slide 25

Slide 25 text

No Observable Difference

Slide 26

Slide 26 text

Operating System

Slide 27

Slide 27 text

`fork`

Slide 28

Slide 28 text

Ruby Fork

string = "x" * 90000      # Initial String
p PARENT_PID: $$
gets

child_pid = fork do       # No Copy
  p CHILD_PID: $$
  gets
  string[1] = 'y'         # Copied "on write"
  gets
end

Process.waitpid child_pid

Slide 29

Slide 29 text

OS Memory Copy

Parent Process: xxxx xxxx xxxx xxxx   (four 4k pages)
Child Process:  xxxx xxxx xxxx xxxx   plus one copied page: xyxx
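The page-level copy on this slide can be sketched as a toy model. This is only an illustration under stated assumptions: `ToyProcess`, `fork_child`, and `write` are hypothetical names, pages are 4-character Strings to mirror the diagram, and a real operating system does this with page tables rather than Ruby objects.

```ruby
# Toy model of copy-on-write pages. ToyProcess and its methods are
# hypothetical names for illustration; a real kernel does this in its
# page tables, not in Ruby objects.
class ToyProcess
  attr_reader :pages, :faults

  def initialize(pages, shared: false)
    @pages  = pages                           # one String per "page"
    @shared = Array.new(pages.size, shared)   # true while a page is shared
    @faults = 0                               # number of simulated page copies
  end

  # "fork": the child receives the same page objects, so nothing is copied.
  def fork_child
    @shared.fill(true)
    ToyProcess.new(@pages.dup, shared: true)
  end

  # The first write to a shared page copies that page; later writes do not.
  def write(page, offset, char)
    if @shared[page]
      @pages[page]  = @pages[page].dup
      @shared[page] = false
      @faults += 1
    end
    @pages[page][offset] = char
  end
end

parent = ToyProcess.new(Array.new(4) { "x" * 4 })
child  = parent.fork_child
child.write(0, 1, "y")

p child.pages    # the first page is now a private copy containing "xyxx"
p parent.pages   # the parent still sees four pages of "xxxx"
p child.faults   # only one page was copied
```

Only the page that was written to gets duplicated; the other three stay shared between the two toy processes, which is the saving a forking server relies on.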

Slide 30

Slide 30 text

"CoW Page Fault"

Slide 31

Slide 31 text

Why is CoW Important?

Slide 32

Slide 32 text

Unicorn is a forking webserver

Slide 33

Slide 33 text

Unicorn Parent Unicorn Child Unicorn Child Unicorn Child

Slide 34

Slide 34 text

Unicorn Parent Unicorn Child Unicorn Child Unicorn Child

Slide 35

Slide 35 text

Reduce Boot Time

Slide 36

Slide 36 text

Decreases Memory Usage

Slide 37

Slide 37 text

This is how it works today.

Slide 38

Slide 38 text

Reducing Page Faults

Slide 39

Slide 39 text

What causes page faults?

Slide 40

Slide 40 text

Mutating Shared Memory

Slide 41

Slide 41 text

Garbage Collector

Slide 42

Slide 42 text

Object Allocation

Slide 43

Slide 43 text

Object Allocation Ruby Objects Empty Filled Parent Process Memory Child Process Memory

Slide 44

Slide 44 text

How can we reduce this space?

Slide 45

Slide 45 text

GC compaction

Slide 46

Slide 46 text

Compact Before Fork Ruby Objects Empty Filled Parent Process Memory Page 1 Page 2

Slide 47

Slide 47 text

Compact Before Fork Ruby Objects Empty Filled Parent Process Memory Page 1 Page 2 Child Process Memory

Slide 48

Slide 48 text

GC Compaction

Slide 49

Slide 49 text

X GitHub

Slide 50

Slide 50 text

What is "compaction"?

Slide 51

Slide 51 text

Compaction Ruby Objects Empty Filled Parent Process Memory Page 1 Page 2

Slide 52

Slide 52 text

Why compact?

Slide 53

Slide 53 text

Reduce Memory Usage

Slide 54

Slide 54 text

"Impossible"

Slide 55

Slide 55 text

Compaction Algorithms

Slide 56

Slide 56 text

Two Finger Compaction ☝ ☝

Slide 57

Slide 57 text

Disadvantages
• It’s slow
• Objects move to random places

Slide 58

Slide 58 text

Advantage • It’s EASY!

Slide 59

Slide 59 text

Algorithm Object Movement Reference Updating

Slide 60

Slide 60 text

Object Movement

Address: 1    2    3    4    5    6    7    8    9    a    b
Heap:    Free Free Free Obj  Free Obj  Obj  Obj  Free Free Obj

☝ Free Pointer    ☝ Scan Pointer

Objects move into free addresses 1, 2, 3, 5. Done!

Slide 61

Slide 61 text

Reference Updating (Before Compaction)

Ruby:    a = { c: 'd' }
Objects: {}   :c   'd'

Address: 1    2    3    4    5    6    7    8    9    a    b
Heap:    Free Free Free Obj  Free Obj  Obj  Obj  Free Free Obj

Slide 62

Slide 62 text

Reference Updating (After Compaction)

Ruby:    a = { c: 'd' }
Objects: {}   :c   'd'

Address: 1    2    3    4    5    6    7    8    9    a    b
Heap:    Obj  Obj  Obj  Obj  Obj  5    3    2    Free Free 1

Slide 63

Slide 63 text

Reference Updating (After Compaction)

Ruby:    a = { c: 'd' }
Objects: {}   :c   'd'

Address: 1    2    3    4    5    6    7    8    9    a    b
Heap:    Obj  Obj  Obj  Obj  Obj  5    3    2    Free Free 1

☝

Slide 64

Slide 64 text

Reference Updating (After Compaction)

Ruby:    a = { c: 'd' }
Objects: {}   :c   'd'

Address: 1    2    3    4    5    6    7    8    9    a    b
Heap:    Obj  Obj  Obj  Obj  Obj  5    3    2    Free Free 1

The four slots holding forwarding addresses (6, 7, 8, b) become: Free Free Free Free
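The two phases on these slides, object movement and reference updating, can be sketched in plain Ruby. This is a simulation under assumptions, not the code in gc.c: `Obj`, `Moved`, `compact!`, and `update_references!` are hypothetical names, the heap is an Array with 0-based indexes (index 0 is address 1 on the slides), and the choice of which slots hold `{}`, `:c`, and `'d'` is made up for the example.

```ruby
# Simulated heap slots: nil is a free slot, Obj is a live object, and
# Moved is a forwarding address left behind after a move.
Obj   = Struct.new(:name, :refs)   # refs: heap indexes this object points to
Moved = Struct.new(:to)            # to: index the object was moved to

# Phase 1: object movement with two "fingers".
def compact!(heap)
  free = 0
  scan = heap.size - 1
  loop do
    free += 1 while free < scan && !heap[free].nil?        # next free slot from the left
    scan -= 1 while scan > free && !heap[scan].is_a?(Obj)  # next live object from the right
    break if free >= scan                                  # fingers met: done
    heap[free] = heap[scan]
    heap[scan] = Moved.new(free)                           # leave a forwarding address
  end
  heap
end

# Phase 2: follow forwarding addresses, then free the forwarding slots.
def update_references!(heap)
  heap.each do |slot|
    next unless slot.is_a?(Obj)
    slot.refs.map! { |addr| heap[addr].is_a?(Moved) ? heap[addr].to : addr }
  end
  heap.map! { |slot| slot.is_a?(Moved) ? nil : slot }
end

# Same layout as the slide: Free Free Free Obj Free Obj Obj Obj Free Free Obj
heap = Array.new(11)
heap[3]  = Obj.new("{}", [6, 10])   # assumed: the hash points at :c and 'd'
heap[5]  = Obj.new("other-1", [])
heap[6]  = Obj.new(":c", [])
heap[7]  = Obj.new("other-2", [])
heap[10] = Obj.new("'d'", [])

compact!(heap)
update_references!(heap)
p heap.map { |slot| slot ? slot.name : :free }
p heap[3].refs   # the hash's references now point at the new locations
```

The scan finger fills free slots in the order it meets objects, which is why objects end up in "random places" relative to each other: the algorithm is simple, but it does not preserve allocation order.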

Slide 65

Slide 65 text

Done!!

Slide 66

Slide 66 text

Implementation Details

Slide 67

Slide 67 text

Code: https://github.com/github/ruby/tree/gc-compact

Slide 68

Slide 68 text

GC.compact

Slide 69

Slide 69 text

Usage

# Parent unicorn process
load_all_of_rails_and_dependencies
load_all_of_application

GC.compact

N.times do
  fork do
    # Child worker processes
    # handle requests
  end
end

Slide 70

Slide 70 text

Changes to gc.c

Slide 71

Slide 71 text

`gc_move` 1 2 3 Free Obj Address Heap 1 Obj

Slide 72

Slide 72 text

`T_MOVED` 1 2 3 Obj Address Heap 1 T_MOVED Obj

Slide 73

Slide 73 text

`gc_compact_heap` 1 2 3 Obj Address Heap ☝ ☝ Free Obj

Slide 74

Slide 74 text

`gc_update_object_references` 1 2 3 Obj Address Heap Free Obj 1 2 3 Obj Address Heap 1 Obj

Slide 75

Slide 75 text

Reference Update Helpers
• gc_ref_update_array
• gc_ref_update_object
• hash_foreach_replace
• gc_ref_update_method_entry
• ….. etc

Slide 76

Slide 76 text

`pinned_bits[];`

Slide 77

Slide 77 text

What objects can move?

Slide 78

Slide 78 text

What can move?
• Everything

Slide 79

Slide 79 text

Finding References

Slide 80

Slide 80 text

Pure Ruby References

class Foo
  def initialize obj
    @bar = obj
  end
end

class Bar
end

bar = Bar.new
foo = Foo.new(bar)

[Diagram: Foo references Bar through @bar]

Slide 81

Slide 81 text

C References

class Bar
end

bar = Bar.new
foo = Foo.new(bar)

[Diagram: Foo references Bar through ???]

Slide 82

Slide 82 text

C References

class Bar
end

bar = Bar.new
foo = Foo.new(bar)

[Diagram: Foo, Bar, rb_gc_mark( ), T_MOVED]

Slide 83

Slide 83 text

C References

class Bar
end

bar = Bar.new
foo = Foo.new(bar)

[Diagram: Foo, Bar, rb_gc_mark( ): Cannot update]

Slide 84

Slide 84 text

rb_gc_mark
1. Mark the object
2. Pin the object in `pinned_bits` table

Slide 85

Slide 85 text

GC.compact
1. Full GC (so objects get pinned)
2. Compact objects
3. Update references
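How pinning interacts with the compaction step can be sketched as a small simulation. It assumes the pin information has already been collected (step 1) and is passed in as an Array of booleans, a hypothetical stand-in for the `pinned_bits` table; `compact_with_pins` and the slot encoding are likewise made up for illustration.

```ruby
# heap slots: nil = free, a Symbol = live object,
# [:moved, index] = forwarding address left after a move.
# pinned[i] is true when the object in slot i must not move.
def compact_with_pins(heap, pinned)
  free = 0
  scan = heap.size - 1
  loop do
    # Advance to the next free slot from the left.
    free += 1 while free < scan && heap[free]
    # Retreat to the next live, unpinned object from the right.
    scan -= 1 while scan > free && (!heap[scan].is_a?(Symbol) || pinned[scan])
    break if free >= scan
    heap[free] = heap[scan]
    heap[scan] = [:moved, free]
  end
  heap
end

heap   = [nil, :a, nil, :b, :c]
pinned = [false, false, false, true, false]   # :b was marked via rb_gc_mark

p compact_with_pins(heap, pinned)
```

In this sketch `:c` moves into the first free slot while `:b` stays where it is, leaving a hole at index 2 that a pinned object prevents from being filled. Reference updating (step 3) would then follow the `[:moved, index]` entries.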

Slide 86

Slide 86 text

What can move?
• Everything
• Except objects marked with `rb_gc_mark`

Slide 87

Slide 87 text

Movement Problems

Slide 88

Slide 88 text

Hash Tables

Slide 89

Slide 89 text

Hashing

hash_key(Object) = memory address

Slide 90

Slide 90 text

Fix: cache hash key
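Why caching helps can be shown with a small model. `MovableObject`, `address_key`, and `cached_key` are hypothetical names, and the "address" is just an Integer attribute; this is a sketch of the idea on the slide, not how MRI stores hash values.

```ruby
# MovableObject is a hypothetical stand-in for a heap object whose default
# hash key is derived from its memory address.
class MovableObject
  attr_accessor :address

  def initialize(address)
    @address = address
    @cached_key = nil
  end

  # Naive key: changes whenever the object moves.
  def address_key
    @address
  end

  # Cached key: taken from the address once, then kept across moves.
  def cached_key
    @cached_key ||= @address
  end
end

obj = MovableObject.new(0x10)
by_address = { obj.address_key => :value }
by_cache   = { obj.cached_key  => :value }

obj.address = 0x20                 # simulate compaction moving the object

p by_address[obj.address_key]      # the lookup misses: the key changed with the address
p by_cache[obj.cached_key]         # the lookup still finds :value
```

The cached key only has to be stable for the lifetime of the object, so it no longer matters where the object currently lives.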

Slide 91

Slide 91 text

What can move?
• Everything
• Except objects marked with `rb_gc_mark`
• and hash keys

Slide 92

Slide 92 text

Dual References

Slide 93

Slide 93 text

Dual References Foo Bar Baz T_MOVED

Slide 94

Slide 94 text

Dual References Foo T_MOVED Baz Bar

Slide 95

Slide 95 text

Fix: Call rb_gc_mark, or use only Ruby
https://github.com/msgpack/msgpack-ruby/pull/135

Slide 96

Slide 96 text

What can move?
• Everything
• Except objects marked with `rb_gc_mark`
• and hash keys
• and dual referenced objects

Slide 97

Slide 97 text

Global Variables

Slide 98

Slide 98 text

Global Variables (in C)

VALUE cFoo;

void Init_foo() {
  cFoo = rb_define_class("Foo", rb_cObject);
}

Slide 99

Slide 99 text

Fix: use heuristics to pin objects

Slide 100

Slide 100 text

What can move?
• Everything
• Except objects marked with `rb_gc_mark`
• and hash keys
• and dual referenced objects
• and objects created with `rb_define_class`

Slide 101

Slide 101 text

String Literals

Slide 102

Slide 102 text

String Literals

def foo
  puts "hello world"
end

[Diagram: ISeq, literals (array), "hello world", bytecode]

Slide 103

Slide 103 text

Updating bytecode is hard

Slide 104

Slide 104 text

What can move?
• Everything
• Except objects marked with `rb_gc_mark`
• and hash keys
• and dual referenced objects
• and objects created with `rb_define_class`
• and string literals

Slide 105

Slide 105 text

It seems like nothing can move

Slide 106

Slide 106 text

Most can be fixed

Slide 107

Slide 107 text

46% can move!

Slide 108

Slide 108 text

Before Compaction
[Chart: number of slots (0 to 400) per page; legend: F, U, P]

Slide 109

Slide 109 text

After Compaction
[Chart: number of slots (0 to 400) per page; legend: F, U, P]

Slide 110

Slide 110 text

Inspecting Memory

Slide 111

Slide 111 text

ObjectSpace.dump_all

Slide 112

Slide 112 text

ObjectSpace.dump_all

require "objspace"

File.open("out.json", "w") { |f|
  ObjectSpace.dump_all(output: f)
}

Slide 113

Slide 113 text

Measuring Rails Boot

$ RAILS_ENV=production \
  bin/rails r \
  'require "objspace"; GC.compact; File.open("out.json", "w") { |f| ObjectSpace.dump_all(output: f) }'

Slide 114

Slide 114 text

Output

{"address":"0x7fcc6e01a198", "type":"OBJECT", "class":"0x7fcc6c93d420", "ivars":3, "references":["0x7fcc6e01bed0"], "memsize":40, "flags":{"wb_protected":true, "old":true, "uncollectible":true, "marked":true}}

Slide 115

Slide 115 text

Output

{
  "address": "0x7fcc6e01a198",
  "type": "OBJECT",
  "class": "0x7fcc6c93d420",
  "ivars": 3,
  "references": [
    "0x7fcc6e01bed0"
  ],
  "memsize": 40,
  "flags": {
    "wb_protected": true,
    "old": true,
    "uncollectible": true,
    "marked": true
  }
}

Callouts: address, references, size
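A dump like this has one JSON object per line, so it can be summarized with a few lines of Ruby. The following is a minimal sketch: `summarize_dump` is a hypothetical helper (it is not part of `objspace` or heap-utils), and the sample input is the single record shown on the slide.

```ruby
require "json"

# Tally object count and total memsize per "type" from dump lines.
# summarize_dump is an illustrative helper, not an ObjectSpace API.
def summarize_dump(lines)
  summary = Hash.new { |hash, type| hash[type] = { count: 0, memsize: 0 } }
  lines.each do |line|
    next if line.strip.empty?
    record = JSON.parse(line)
    entry = summary[record["type"]]
    entry[:count]   += 1
    entry[:memsize] += record.fetch("memsize", 0)   # some records may lack memsize
  end
  summary
end

SAMPLE_DUMP = <<~DUMP
  {"address":"0x7fcc6e01a198", "type":"OBJECT", "class":"0x7fcc6c93d420", "ivars":3, "references":["0x7fcc6e01bed0"], "memsize":40, "flags":{"wb_protected":true, "old":true, "uncollectible":true, "marked":true}}
DUMP

p summarize_dump(SAMPLE_DUMP.each_line)
```

For a real dump, passing `File.foreach("out.json")` instead of the sample string should work the same way, since it also yields one line at a time.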

Slide 116

Slide 116 text

Address = Location

Slide 117

Slide 117 text

Heap Fragmentation Object Empty

Slide 118

Slide 118 text

Heap Fragmentation Object Empty

Slide 119

Slide 119 text

Heap Fragmentation Pinned Empty Moves

Slide 120

Slide 120 text

Heap Fragmentation Pinned Empty Moves

Slide 121

Slide 121 text

https://github.com/tenderlove/heap-utils

Slide 122

Slide 122 text

Inspecting CoW Memory

Slide 123

Slide 123 text

/proc/{PID}/smaps

Slide 124

Slide 124 text

/proc/{PID}/smaps

55a92679a000-55a926b53000 rw-p 00000000 00:00 0 [heap]
Size:               3812 kB
Rss:                3620 kB
Pss:                3620 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:         0 kB
Private_Dirty:      3620 kB
Referenced:         3620 kB
Anonymous:          3620 kB
AnonHugePages:         0 kB
Shared_Hugetlb:        0 kB
Private_Hugetlb:       0 kB
Swap:                  0 kB
SwapPss:               0 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB

Callouts: Address Range, RSS & PSS, Shared Dirty

Slide 125

Slide 125 text

RSS = Shared_Clean + Shared_Dirty + Private_Clean + Private_Dirty

Slide 126

Slide 126 text

PSS (Shared_Dirty / Number of Processes) + Shared_Clean + Private_Clean + Private_Dirty

Slide 127

Slide 127 text

RSS vs PSS

                  RSS        PSS
Unicorn Parent    3620 kB    1840 kB
Unicorn Child     3620 kB    1840 kB

Total usage is 3620 kB, not 7240 kB

Slide 128

Slide 128 text

Copying Memory

x = "x" * 9000
p PID: $$
gets

child_pid = fork do
  puts "forked"
  9000.times do |i|
    puts("I: #{i}") || gets if i % 1000 == 0
    x[i] = 109.chr
  end
  puts "done"
  gets
end

Process.waitpid child_pid

Slide 129

Slide 129 text

Shared_Dirty, PSS, RSS
[Chart: memory in kB (0 to 4000) versus number of writes (0 to 9000); series: Shared_Dirty, PSS, RSS]

Slide 130

Slide 130 text

Compaction Impact

p PID: $$
arry = []
GC.start
gets
GC.compact if ENV["COMPACT"]

child_pid = fork do
  pages = GC.stat :heap_allocated_pages
  while pages == GC.stat(:heap_allocated_pages)
    arry << Object.new     # Fill Heap
  end
  puts "done"
  gets
end

Process.waitpid child_pid

Slide 131

Slide 131 text

No Compaction PSS: 2684 kB
Compaction PSS: 2530 kB

Slide 132

Slide 132 text

Compaction Savings: 154 kB

Slide 133

Slide 133 text

Conclusion

Slide 134

Slide 134 text

Compaction Savings: Unknown

Slide 135

Slide 135 text

Use `ObjectSpace`

Slide 136

Slide 136 text

/proc/{PID}/smaps

Slide 137

Slide 137 text

Why compact?

Slide 138

Slide 138 text

"Impossible"

Slide 139

Slide 139 text

Question Your Assumptions

Slide 140

Slide 140 text

We’ve entered grass
草に入った

Slide 141

Slide 141 text

Thank you!