GoRuCo 2013

My talk at GoRuCo 2013

Michael Bernstein

June 08, 2013


  1. To Know A Garbage Collector
    Michael R. Bernstein
    Gotham Ruby Conference, New York, New York
    June 8th, 2013
  2. Talk Outline
    • Who Am I?
    • Introduce Our Goals
    • Share Some Influences
    • Pursue Our Goals
    • Conclusion
  3. Who Am I?
    • This is my 6th GoRuCo
    • Professional “Programmer”
    • Former Computer Science Teacher
    • MFA From Parsons in Design & Technology
    • Salad Thought Leader @ Paperless Post
  4. Introduction: Goals
    • Get excited about GC, hopefully learn a few things
    • Think about the connection between programming languages and GC
  5. Garbage Collection is a form of automatic memory management which gives a program the appearance of infinite memory by reclaiming allocated objects which are no longer in use.
  6. Heap
    • A data structure in which objects may be allocated or deallocated in any order
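The heap definition above can be made concrete with a small sketch. `ToyHeap` and its methods are hypothetical names invented for this illustration, not part of any Ruby implementation; the sketch assumes fixed-size slots and a simple free list.

```ruby
# A toy heap: a fixed number of slots plus a free list of unused slot indexes.
# ToyHeap is a hypothetical class for illustration only.
class ToyHeap
  def initialize(size)
    @slots = Array.new(size)       # slot contents, nil when unused
    @free_list = (0...size).to_a   # indexes of slots available for allocation
  end

  # Returns the address (slot index) of the new object, or nil if the heap is full.
  def allocate(value)
    address = @free_list.shift
    return nil if address.nil?
    @slots[address] = value
    address
  end

  # Returns a slot to the free list; slots may be freed in any order.
  def free(address)
    @slots[address] = nil
    @free_list.push(address)
  end

  def free_count
    @free_list.size
  end
end

heap = ToyHeap.new(3)
a = heap.allocate("a")
b = heap.allocate("b")
c = heap.allocate("c")
heap.free(b)               # free the middle object first
d = heap.allocate("d")     # reuses the slot that b occupied
```

Because `b` is freed before `a` or `c`, this is the "any order" property that separates a heap from a stack.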
  7. Mutator

        class Book < ActiveRecord::Base
          has_many :authors
          has_many :citations
          has_many :references
          has_many :subscriptions
          has_many :subscribers
        end
  8. Collector

        static void
        gc_mark_locations(rb_objspace_t *objspace, VALUE *start, VALUE *end)
        {
            long n;

            if (end <= start) return;
            n = end - start;
            mark_locations_array(objspace, start, n);
        }

        void
        rb_gc_mark_locations(VALUE *start, VALUE *end)
        {
            gc_mark_locations(&rb_objspace, start, end);
        }

        #define rb_gc_mark_locations(start, end) gc_mark_locations(objspace, (start), (end))

        struct mark_tbl_arg {
            rb_objspace_t *objspace;
        };
  9. Garbage collection is automatic memory management. While the mutator runs, it routinely allocates memory from the heap. If the mutator needs more memory than is available, the collector reclaims unused memory and returns it to the heap.
  10. 1960: A Good Year For Garbage Collectors
    • John McCarthy, Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I, 1960
    • George Collins, A Method for Overlapping and Erasure of Lists, 1960
  11. 1960: A Good Year For Garbage Collectors
    • McCarthy - Mark and Sweep (Tracing)
    • Collins - Reference Counting
  12. Mark and Sweep (Tracing)

        def new
          ref = allocate
          if ref.nil?
            mark
            sweep
            ref = allocate
            if ref.nil?
              raise "Out of memory"
            end
          end
          ref
        end
  13. Mark and Sweep (Tracing)

        def mark
          worklist = Worklist.new
          heap_roots.each do |root|
            ref = root.address
            if ref && !ref.is_marked?
              ref.set_marked
              worklist << ref
              recursive_mark(worklist)
            end
          end
        end
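  14. Mark and Sweep (Tracing)

        def sweep
          object_cursor = heap_start
          while object_cursor < heap_end
            if object_cursor.is_marked?
              object_cursor.unset_marked   # survivor: clear the mark for the next cycle
            else
              object_cursor.free           # unmarked: unreachable, so reclaim it
            end
            object_cursor = object_cursor.next   # advance to the next object
          end
        end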
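The `new`, `mark` and `sweep` pseudocode above can be turned into a small runnable sketch. `Cell` and `MarkSweepHeap` are hypothetical names, and the "heap" is just a Ruby array with a capacity limit; this is an illustration of the technique under those assumptions, not how MRI implements it.

```ruby
# Hypothetical toy object: holds references to other objects and a mark bit.
class Cell
  attr_reader :name, :refs
  attr_accessor :marked

  def initialize(name)
    @name = name
    @refs = []
    @marked = false
  end
end

class MarkSweepHeap
  attr_reader :objects, :roots

  def initialize(capacity)
    @capacity = capacity
    @objects = []   # every allocated object
    @roots = []     # objects the mutator can reach directly
  end

  # Allocation collects only when the heap is full, then gives up if that did not help.
  def allocate(name)
    collect if full?
    raise "Out of memory" if full?
    cell = Cell.new(name)
    @objects << cell
    cell
  end

  def collect
    mark
    sweep
  end

  private

  def full?
    @objects.size >= @capacity
  end

  # Mark phase: trace everything reachable from the roots using a worklist.
  def mark
    worklist = []
    @roots.each do |root|
      next if root.marked
      root.marked = true
      worklist << root
    end
    until worklist.empty?
      cell = worklist.pop
      cell.refs.each do |child|
        next if child.marked
        child.marked = true
        worklist << child
      end
    end
  end

  # Sweep phase: drop unmarked objects and clear the mark on survivors.
  def sweep
    @objects.select!(&:marked)
    @objects.each { |cell| cell.marked = false }
  end
end

heap = MarkSweepHeap.new(4)
a = heap.allocate("a")
b = heap.allocate("b")
c = heap.allocate("c")
d = heap.allocate("d")
heap.roots << a
a.refs << b              # a -> b is reachable from the root
c.refs << d              # c <-> d is an unreachable cycle
d.refs << c
e = heap.allocate("e")   # heap is full, so this allocation triggers a collection
survivors = heap.objects.map(&:name)   # → ["a", "b", "e"]
```

The unreachable cycle between `c` and `d` is reclaimed because tracing only follows references that start from the roots.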
  15. Barrier
    • Code that runs as a result of accessing or mutating an object on the heap
  16. Reference Counting

        def new
          ref = allocate
          if ref.nil?
            raise "Out of memory"
          end
          ref.ref_count = 0
          ref
        end

        def write(src, i, ref) # A Write Barrier
          add_reference(ref)
          delete_reference(src[i])
          src[i] = ref
        end
  17. Reference Counting

        def add_reference(ref)
          ref.ref_count = ref.ref_count + 1
        end

        def delete_reference(ref)
          ref.ref_count = ref.ref_count - 1
          if ref.ref_count == 0
            # Nothing refers to this object any more: release its children, then free it
            ref.pointers.each do |field|
              delete_reference(field.address)
            end
            free(ref)
          end
        end
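The reference-counting pseudocode above can be made runnable with a few assumptions. `RcObject` and `RcHeap` are hypothetical names, "freeing" is simulated by recording the object's name, and `nil` stands for an empty field.

```ruby
# Hypothetical counted object with a fixed number of pointer fields.
class RcObject
  attr_reader :name, :fields
  attr_accessor :ref_count

  def initialize(name, field_count)
    @name = name
    @fields = Array.new(field_count)   # all fields start out nil
    @ref_count = 0
  end
end

class RcHeap
  attr_reader :freed

  def initialize
    @freed = []   # names of freed objects, in the order they were freed
  end

  def allocate(name, field_count = 1)
    RcObject.new(name, field_count)
  end

  # Write barrier: every pointer store keeps the counts up to date.
  def write(src, i, ref)
    add_reference(ref)
    delete_reference(src.fields[i])
    src.fields[i] = ref
  end

  def add_reference(ref)
    ref.ref_count += 1 unless ref.nil?
  end

  def delete_reference(ref)
    return if ref.nil?
    ref.ref_count -= 1
    return unless ref.ref_count.zero?
    ref.fields.each { |child| delete_reference(child) }   # release children first
    @freed << ref.name                                     # then "free" the object
  end
end

heap = RcHeap.new
root = heap.allocate("root")
a = heap.allocate("a")
b = heap.allocate("b")
heap.write(root, 0, a)     # a.ref_count is now 1
heap.write(a, 0, b)        # b.ref_count is now 1
heap.write(root, 0, nil)   # a drops to 0; b is released and freed, then a is freed
heap.freed                 # → ["b", "a"]
```

Memory is reclaimed during the pointer store itself, with no separate collection phase.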
  18. Pros And Cons
    • Pro: Reference Counting is incremental. It frees memory as it works
    • Con: Reference Counting cannot easily collect cycles, that is, objects on the heap which reference themselves directly or through other objects
    • Pro: Mark & Sweep can collect cycles
    • Con: Mark & Sweep can exhibit long pauses and poor locality
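The cycle problem is easy to reproduce in a few lines. In this sketch `Node`, `set_target` and `release` are hypothetical helpers, each node has a single pointer field, and the root starts with a count of 1 to stand in for a reference held by the mutator.

```ruby
# Hypothetical counted node with one pointer field.
Node = Struct.new(:name, :ref_count, :target)

# Write barrier for the single :target field; freed collects names of freed nodes.
def set_target(node, new_target, freed)
  new_target.ref_count += 1 if new_target
  old = node.target
  node.target = new_target
  release(old, freed) if old
end

def release(node, freed)
  node.ref_count -= 1
  return unless node.ref_count.zero?
  freed << node.name
  release(node.target, freed) if node.target
end

freed = []
root = Node.new("root", 1, nil)
a = Node.new("a", 0, nil)
b = Node.new("b", 0, nil)

set_target(root, a, freed)     # root -> a, a.ref_count = 1
set_target(a, b, freed)        # a -> b,    b.ref_count = 1
set_target(b, a, freed)        # b -> a,    a.ref_count = 2 (cycle)
set_target(root, nil, freed)   # drop the only outside reference to the cycle

freed                          # → []
[a.ref_count, b.ref_count]     # → [1, 1]
```

Neither `a` nor `b` is reachable from `root` any more, yet each keeps the other's count at 1, so plain reference counting never frees them.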
  19. The Unified Theory
    • Tracing and Reference Counting are “duals” of the same operation
    • In optimized form, they are very similar
    • Most successful GCs are hybrid Tracer-Counters
    • Formalized Garbage Collectors with a uniform cost-model
  20. The Unified Theory
    • Subtly tweak Reference Counting by buffering calls to free()
    • Subtly tweak Mark & Sweep by maintaining a true reference count instead of a “live” bit in the mark phase
    • Hybrid Example: Generational GC
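As a rough illustration of the first tweak, the sketch below buffers objects whose count reaches zero instead of freeing them immediately, then processes the buffer in one batch. `BufferedRcHeap`, `Obj`, `release` and `drain` are hypothetical names, and this is only one possible reading of "buffering calls to free()", not the formulation from the paper.

```ruby
class BufferedRcHeap
  # Hypothetical counted object: a name, a count, and a list of children.
  Obj = Struct.new(:name, :ref_count, :children)

  attr_reader :pending, :freed

  def initialize
    @pending = []   # objects whose count hit zero but are not freed yet
    @freed = []     # names of objects actually freed
  end

  # Decrement the count; defer the free instead of doing it right away.
  def release(obj)
    obj.ref_count -= 1
    @pending << obj if obj.ref_count.zero?
  end

  # Process the buffer in one batch, releasing children as objects are freed.
  def drain
    until @pending.empty?
      obj = @pending.shift
      @freed << obj.name
      obj.children.each { |child| release(child) }
    end
  end
end

heap = BufferedRcHeap.new
b = BufferedRcHeap::Obj.new("b", 1, [])
a = BufferedRcHeap::Obj.new("a", 1, [b])

heap.release(a)
pending_before = heap.pending.map(&:name)   # → ["a"]
freed_before = heap.freed.dup               # → []
heap.drain
heap.freed                                  # → ["a", "b"]
```

Once frees are batched, the drain step starts to look like a small tracing pass over dead objects, which is the kind of similarity the slides point to.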
  21. The Unified Theory
    • Design of Garbage Collectors can be made more methodical
    • Three main decisions:
      • Partition
      • Traversal
      • Trade-offs
  22. Programming Languages
    • How are they developed?
    • How are they designed?
    • How do they interact with system memory?
    • What aspects of their design are pertinent to the discussion of GC?
  23. Ruby
    • Dynamic, Multiple Implementations, we all <3 it
    • MRI - Simple Beginnings, Advanced Future
    • Rubinius - Thoroughly Modern
    • JRuby - The Power of the JVM
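For readers who want to watch the collector from the mutator's side, Ruby exposes a small `GC` module. The sketch below assumes MRI; other implementations may report different statistics or treat `GC.start` only as a hint.

```ruby
before = GC.count                # collections run so far in this process
100_000.times { Object.new }     # mutator work: allocate short-lived objects
GC.start                         # request a full collection
after = GC.count
stats = GC.stat                  # Hash of implementation-specific counters

puts "collections: #{before} -> #{after}"
puts "stat keys: #{stats.keys.first(5).inspect}"
```

The exact keys returned by `GC.stat` vary between Ruby versions, so treat them as diagnostic output, not a stable interface.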
  24. Java
    • Massive amounts of research into the JVM’s GC
    • Adaptive
    • Tunable
    • Hybrid approach
  25. Haskell
    • Strictly, statically typed
    • Compiler informs GC
    • Design of language makes certain aspects of GC simpler
    • Design of GHC has allowed incremental improvements to the GC
  26. Programming Languages
    • Most great programming languages have worked on their GCs over time
    • The design of the language heavily influences what is possible in GC
  27. “GC is not a generic solution for memory leaks, but a (correct) GC is a generic solution for 'dangling pointers'. Just as there is no general solution for 'loops' (due to undecidability), there is no general solution for 'leaks'.” - Henry Baker
  28. In Conclusion
    • Garbage Collection is a fascinating discipline
    • Deep knowledge of your tools is very helpful
    • If we understand GC better, we can make Ruby better
  29. Works Cited
    • David F. Bacon, Perry Cheng, and V. T. Rajan. A unified theory of garbage collection. In OOPSLA 2004, 2004, pages 50-68.
    • Cooper et al. Teaching Garbage Collection without Implementing Compilers or Interpreters. In SIGCSE 2013, 2013, pages 385-390.
    • Robby Findler. “The Many Faces of Dr. Scheme”. http://www.eecs.northwestern.edu/~robby/logos/
    • Richard Jones, Antony Hosking, and Eliot Moss. The Garbage Collection Handbook: The Art of Automatic Memory Management. CRC Applied Algorithms and Data Structures. Chapman & Hall, August 2012, pages 375-416.
    • Richard Jones. Garbage Collection Bibliography. http://www.cs.kent.ac.uk/people/staff/rej/gcbib/gcbib.html
    • Henry Lieberman and Carl E. Hewitt. A real-time garbage collector based on the lifetimes of objects. AI Memo 569a, MIT, April 1981.