The Hitchhiker's Guide to Ruby GC

A brief history of garbage collection in MRI Ruby.

Eric Weinstein

November 16, 2015

Transcript

  1. The Hitchhiker’s Guide to Ruby GC # Eric Weinstein #

    RubyConf # 16 November 2015 # San Antonio, TX
  2. Koichi should be giving this talk

  3. Oh god what am I doing

  4. About eric_weinstein = { employer: 'Condé Nast', github: 'ericqweinstein', knife_throwing_champ:

    true, twitter: 'ericqweinstein', website: 'ericweinste.in' }
  5. Ruby Wizardry RUBYCONF2015 (40% off!)

  6. Don’t Panic!

  7. Part 0: Ruby is Not Slow

  8. Okay, yes, sort of • But not for the reason(s)

    you think! • Database queries (N + 1) • Superlinear time complexity • Ruby being an interpreted language
  9. For better (and worse)

  10. Everything is an object

  11. Part 1: History

  12. We’re talking MRI • Not Rubinius or JRuby, which have

    different garbage collectors • I may make comparisons where appropriate, though, so the above could make guest appearances
  13. 1.8.7 • Ruby GC is tracing (as opposed to reference

    counting, e.g. Python) • Mark & sweep • M&S was invented by John McCarthy for LISP in 1959
  14. 1.8.7 (cont’d) • Ruby allocates some memory (more on this

    later) • When there’s no more free memory, Ruby marks all active objects, then combines (sweeps) inactive objects into a single list
  15. 1.8.7 (cont’d) [diagram: a row of objects marked M]

  16. Everything stops

  17. 1.9.3 • Lazy mark & sweep: sweep in phases to

    reduce the length of each sweep pause • Like 1.8.7, subverts native copy-on-write (this will be on the quiz)
  18. 2.0 • Bitmap marking: we no longer mark objects directly,

    so we’re free to leverage copy-on-write! • This will be important later
  19. 2.1 • Generational GC (two generations: young and old) •

    If you survive 3 collections, you become old (as of Ruby 2.2; in 2.1, surviving a single collection was enough) • Because most objects die young, you perform (fast) minor GC frequently and (slower) stop-the-world major GC rarely • The RGenGC algorithm (Koichi Sasada, EuRuKo)
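A rough way to watch the minor/major split on your own interpreter is to diff `GC.stat` around a burst of short-lived allocations. This is only a sketch: it assumes MRI 2.1 or later (where the `:minor_gc_count` and `:major_gc_count` keys and the `full_mark:` keyword exist), and the workload is made up for illustration.

```ruby
# Count minor vs. major collections around a burst of short-lived allocations.
before = GC.stat

# Hypothetical workload: strings that become garbage immediately ("die young").
200_000.times { "temp-#{rand(1000)}" }

after = GC.stat
minor = after[:minor_gc_count] - before[:minor_gc_count]
major = after[:major_gc_count] - before[:major_gc_count]
puts "minor GCs: #{minor}, major GCs: #{major}"

# Either kind can also be requested explicitly.
GC.start(full_mark: false) # ask for a minor GC
GC.start(full_mark: true)  # ask for a major GC
```

The exact counts depend on heap state and Ruby version, so treat the numbers as indicative only.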
  20. Part 2: (….2.2)

  21. 2.2 • Symbol GC: no more symbol DoS • Incremental

    major GC (the RincGC algorithm) • Tricolor marking: white (unmarked), gray (marked, but may refer to white objects), and black (marked, without references to white objects)
  22. But there’s a

  23. Wild White Object Appears! • We create a new (white)

    object, but there are no gray objects with references to it (since there are only black and white objects) • We reclaim our live object by mistake!
  24. Write Barriers • They’re super effective! • We also have

    write barrier protected and write barrier unprotected objects • Pause time is proportional to the number of living write barrier unprotected objects • Most objects (String, Array, Hash, or user-defined POROs) are write barrier protected, so the pause time attributable to unprotected objects is acceptable
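The lost-object problem and the write barrier fix can be illustrated with a toy model. This is not MRI's implementation: `TriNode`, `ToyMarker`, and `survives?` are hypothetical names, and the barrier here simply turns the black parent gray again, which is one of several possible barrier designs.

```ruby
# Toy model of incremental tri-color marking (illustrative only, not MRI's code).
class TriNode
  attr_reader :refs
  attr_accessor :color

  def initialize
    @refs = []
    @color = :white # unmarked
  end
end

class ToyMarker
  def initialize(root, barrier:)
    @barrier = barrier
    root.color = :gray
    @gray = [root]
  end

  # One incremental step: shade the children of a gray object, then blacken it.
  def mark_step
    node = @gray.shift
    return false unless node
    node.refs.each { |child| shade(child) }
    node.color = :black
    true
  end

  def finish_marking
    nil while mark_step
  end

  # The program stores a reference from parent to child between mark steps.
  def write(parent, child)
    parent.refs << child
    return unless @barrier
    # Write barrier: a black object now points at a white one, so re-scan it.
    if parent.color == :black && child.color == :white
      parent.color = :gray
      @gray << parent
    end
  end

  private

  def shade(node)
    return unless node.color == :white
    node.color = :gray
    @gray << node
  end
end

def survives?(barrier)
  root = TriNode.new
  marker = ToyMarker.new(root, barrier: barrier)
  marker.mark_step           # root is now black
  fresh = TriNode.new        # a wild white object appears
  marker.write(root, fresh)  # black -> white reference
  marker.finish_marking
  fresh.color != :white      # white objects would be swept
end

puts survives?(false) # => false (live object reclaimed by mistake)
puts survives?(true)  # => true
```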
  25. GC Tuning • Don’t do it! • (Experts only): Don’t

    do it yet
  26. Turning it to 11 • RUBY_GC_HEAP_GROWTH_FACTOR • RUBY_GC_HEAP_GROWTH_MAX_SLOTS • RUBY_GC_MALLOC_LIMIT

    • RUBY_GC_MALLOC_LIMIT_MAX • RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR
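Before changing any of these, it helps to know what the current process was actually started with. The sketch below just reports the five variables named on the slide; `gc_tuning_report` is a hypothetical helper, and since MRI reads these variables at interpreter startup, changing `ENV` from inside a running process is not expected to retune the GC.

```ruby
# The tuning variables listed on the slide.
GC_TUNING_VARS = %w[
  RUBY_GC_HEAP_GROWTH_FACTOR
  RUBY_GC_HEAP_GROWTH_MAX_SLOTS
  RUBY_GC_MALLOC_LIMIT
  RUBY_GC_MALLOC_LIMIT_MAX
  RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR
].freeze

# Returns a hash of variable name => configured value (or a placeholder).
def gc_tuning_report(env = ENV)
  GC_TUNING_VARS.map { |name| [name, env.fetch(name, '(unset)')] }.to_h
end

gc_tuning_report.each { |name, value| puts "#{name}=#{value}" }
```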
  27. This is not a silver bullet

  28. Part 3: Case Study

  29. Let’s talk about memory

  30. The Memory Model • Ruby objects are 40-byte RValue structures,

    which Ruby allocates into heaps of 16KB each • You get ~400 Ruby objects per heap (16_000 / 40) • Ruby initially allocates ~150 heaps
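The slide's arithmetic can be checked directly. The sketch below uses the slide's figures (40-byte RValues, 16 KB heaps, roughly 150 initial heaps) and treats 16 KB as 16 × 1024 bytes, which is an assumption; the slide rounds to 16_000.

```ruby
RVALUE_SIZE   = 40        # bytes per object slot, per the slide
HEAP_SIZE     = 16 * 1024 # bytes per heap, assuming 16 KB means 16 * 1024
INITIAL_HEAPS = 150       # approximate, per the slide

slots_per_heap = HEAP_SIZE / RVALUE_SIZE
initial_slots  = slots_per_heap * INITIAL_HEAPS
initial_mb     = (HEAP_SIZE * INITIAL_HEAPS / 1024.0 / 1024).round(2)

puts slots_per_heap # => 409
puts initial_slots  # => 61350
puts initial_mb     # => 2.34
```

The 408 objects per heap measured on the next slide is slightly below 409, presumably because each heap also spends a few bytes on its own bookkeeping.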
  31. > require 'objspace' => true > ObjectSpace.count_objects[:TOTAL] / GC.stat[:heap_used] =>

    408
  32. We Were Making A $#!% Tonne $ bxr console Loading

    development console... > GC.start => nil > GC.stat => {:count=>132, :heap_used=>1456, :heap_length=>2619, :heap_increment=>1163, :heap_live_num=>513174, :heap_free_num=>81988, :heap_final_num=>0}
  33. :heap_live_num => 513174 Half a million objects
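The `GC.stat` key for live objects has been renamed over time, so a session like the one above will not work unchanged on newer Rubies. The sketch below tries the names I believe were used in 2.2+ (`:heap_live_slots`), 2.1 (`:heap_live_slot`), and 2.0 (`:heap_live_num`); `live_object_count` is a hypothetical helper.

```ruby
# Candidate key names, newest first (version mapping is an assumption).
LIVE_KEYS = %i[heap_live_slots heap_live_slot heap_live_num].freeze

# Returns the live object count from a GC.stat-style hash, or nil if unknown.
def live_object_count(stat = GC.stat)
  key = LIVE_KEYS.find { |k| stat.key?(k) }
  key ? stat[key] : nil
end

GC.start
puts "live objects: #{live_object_count.inspect}"
```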

  34. RValue • Fields: Flags, Value, Next • The RValue contains:

    FL_MARK, a pointer to the next RStruct, and the object contents
  35. Heaps on Heaps (1.9.3)

  36. Mark && Sweep • Ruby heaps comprise linked lists of

    RValues • When there are no more free RValues, Ruby 1.9 sets FL_MARK on all active Ruby objects (marks)... • ...then relinks inactive objects (sweeps) into a single linked list (the “free list”)
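The mark and sweep steps above can be sketched as a toy model. This is not MRI's C implementation: `ToySlot`, `toy_mark`, and `toy_sweep` are hypothetical names, and the `marked` field stands in for the FL_MARK flag stored on the object itself.

```ruby
# A slot holds a value, references to other slots, a mark flag, and a free-list link.
ToySlot = Struct.new(:value, :refs, :marked, :next_free)

# Mark: set the flag on every slot reachable from the roots.
def toy_mark(roots)
  stack = roots.dup
  until stack.empty?
    slot = stack.pop
    next if slot.marked
    slot.marked = true # writes to the object itself, like FL_MARK
    stack.concat(slot.refs)
  end
end

# Sweep: relink every unmarked slot into a single free list, clearing marks as we go.
def toy_sweep(slots)
  free_list = nil
  slots.each do |slot|
    if slot.marked
      slot.marked = false
    else
      slot.value = nil
      slot.refs = []
      slot.next_free = free_list
      free_list = slot
    end
  end
  free_list
end

a = ToySlot.new('a', [], false, nil)
b = ToySlot.new('b', [], false, nil)
c = ToySlot.new('c', [], false, nil) # unreachable
d = ToySlot.new('d', [], false, nil) # unreachable
a.refs << b

toy_mark([a])
free_list = toy_sweep([a, b, c, d])

free_count = 0
slot = free_list
while slot
  free_count += 1
  slot = slot.next_free
end
puts free_count # => 2
```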
  37. The Free List [diagram: a row of objects marked M]

  38. Copy on Write • When our production processes call fork,

    the new child process shares all memory with the parent • Copies are only made when changes are written, but marking objects as live on the objects themselves forces a write • Ruby 1.9.3 GC subverts native copy on write
  39. Let us not burden our remembrances with / A heaviness that's

    gone. — The Tempest, Act V, Scene i
  40. Memory in Ruby 2.0 [diagram: heap header, a bitmap of

    0/1 mark bits, and objects labeled M]
  41. Bitmap Marking • Ruby 2.0 includes a header at the

    beginning of each heap that contains a pointer to a bitmap representing the state of each object (1 == marked, 0 == unmarked) • The GC mark phase no longer modifies objects, allowing child processes to appropriately leverage copy on write
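The idea of keeping marks outside the objects can be sketched with an Integer used as a bitmap. This is a toy model with hypothetical names (`BitmapHeap`, `mark_in_bitmap`); the objects are frozen only to make it obvious that marking never writes to them.

```ruby
# Toy heap: the mark bits live in a separate bitmap, not in the objects.
BitmapHeap = Struct.new(:objects, :bitmap)

# Set bit i for each live object index; the objects themselves are untouched.
def mark_in_bitmap(heap, live_indexes)
  live_indexes.each { |i| heap.bitmap |= (1 << i) }
end

objects = %w[a b c d e].map(&:freeze).freeze
heap = BitmapHeap.new(objects, 0)

mark_in_bitmap(heap, [0, 2, 3])

# Integer#[] reads a single bit: 1 == marked, 0 == unmarked.
marks = objects.each_index.map { |i| heap.bitmap[i] }
puts marks.inspect # => [1, 0, 1, 1, 0]
```

Because the mark phase only writes to the bitmap, the memory holding the objects stays unchanged and can remain shared with forked children.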
  42. How This Affects Unicorn • Unicorn creates N workers by

    forking itself N times (which is not a Bad Thing™) • Without copy on write, each worker spends time and memory making copies of objects that are marked by the GC but are otherwise identical • As N increases, the problem gets worse
  43. Time Trials • Loading the app in 1.9.3 invoked 122

    GC runs and took approximately 4.4 seconds • Loading the app in 2.0 invoked 66 GC runs and took approximately 3.0 seconds • In this case, loading under 1.9.3 took about 47% longer and invoked about 85% more GC runs
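The percentages follow from the figures on the slide. Note that 47% is the ratio of the two load times (4.4 s vs. 3.0 s), while the GC run counts differ by a larger margin; the variable names below are made up for illustration.

```ruby
runs_193, secs_193 = 122, 4.4 # Ruby 1.9.3: GC runs, load time in seconds
runs_20,  secs_20  = 66, 3.0  # Ruby 2.0:   GC runs, load time in seconds

extra_time = ((secs_193 / secs_20 - 1) * 100).round
extra_runs = ((runs_193.to_f / runs_20 - 1) * 100).round

puts extra_time # => 47
puts extra_runs # => 85
```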
  44. What did we do?

  45. #1: Upgraded to 2.0 • Leveraged copy on write •

    require is faster • Cool new features like Module#prepend, lazy Enumerators, refinements, and UTF-8 by default • It was about damned time
  46. #2: Profiled • Find and eliminate SOBs (sources of bloat)

    • Native GC profiling: http://ruby-doc.org/core-2.0/GC.html • ruby-prof: https://github.com/ruby-prof/ruby-prof
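As a starting point for the native profiling mentioned above, `GC::Profiler` can be wrapped around a piece of code to see how much time went to collection. The workload below is a made-up stand-in for whatever you actually want to measure.

```ruby
GC::Profiler.enable

# Hypothetical workload standing in for "loading the app".
rows = Array.new(200_000) { |i| "row-#{i}" }
GC.start

gc_seconds = GC::Profiler.total_time
puts format('allocated %d rows, GC time: %.4f s', rows.size, gc_seconds)
GC::Profiler.report # prints a per-run table to stdout

GC::Profiler.disable
GC::Profiler.clear
```

Timings vary between runs and machines, so compare several samples before drawing conclusions.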
  47. #3: Tuned the GC • RUBY_GC_MALLOC_LIMIT (controls when we perform

    a full GC run; defaults to 8MB) • RUBY_HEAP_MIN_SLOTS (controls the initial number of heap slots, defaults to 10_000) • RUBY_HEAP_SLOTS_GROWTH_FACTOR (controls heap allocation rate, defaults to 1.8x)
  48. Credits && Further Reading • http://patshaughnessy.net/2012/3/23/why-you-should-be-excited-about-garbage-collection-in-ruby-2-0 • http://samsaffron.com/archive/2013/11/22/demystifying-the-ruby-gc

    • Ruby Under a Microscope (Pat Shaughnessy) • Ruby Performance Optimization (Alexander Dymo) • Koichi Sasada (http://youtube.com/watch/?v=92zMKGt7Qlk)
  49. puts 'Thanks!' end

  50. Questions? eric_weinstein = { employer: 'Condé Nast', github: 'ericqweinstein', twitter:

    'ericqweinstein', website: 'ericweinste.in' } RUBYCONF2015 (40% off!)