GoRuCo 2013

My talk at GoRuCo 2013

Michael Bernstein

June 08, 2013

Transcript

  1. To Know A
    Garbage Collector
    Michael R. Bernstein
    Gotham Ruby Conference
    New York, New York
    June 8th, 2013

  2. Talk Outline
    • Who Am I?
    • Introduce Our Goals
    • Share Some Influences
    • Pursue Our Goals
    • Conclusion

  3. Who Am I?
    • This is my 6th GoRuCo
    • Professional “Programmer”
    • Former Computer Science Teacher
    • MFA From Parsons in Design &
    Technology
    • Salad Thought Leader @ Paperless Post

  4. I’m obsessed.

  5. Introduction: Goals
    • Get excited about GC, hopefully learn a
    few things
    • Think about the connection between
    programming languages and GC

  6. Influences

    The Garbage Collection Handbook by
    Jones et al.

  7. Influences
    "The undecidability of liveness is a
    corollary of the halting problem"
    - Jones et al.

  8. Influences

    Teaching Garbage Collection without Implementing
    Compilers or Interpreters by Cooper et al.
    [Findler et al]

  9. Influences

    A Unified Theory of Garbage Collection by
    Bacon et al.

  10. Get Excited About Garbage Collection

  11. Garbage Collection is a form of automatic
    memory management which gives a
    program the appearance of infinite
    memory by reclaiming allocated objects
    which are no longer in use.

  12. Terminology
    • Garbage Collection
    • Heap
    • Mutator
    • Collector
    • Roots
    • Barriers

  13. Heap

    A data structure in which objects may be
    allocated or deallocated in any order
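
The definition above can be sketched as a toy heap. This is only an illustration under simple assumptions: a fixed number of equally sized slots and a free list of unused slot indexes. `ToyHeap`, `allocate`, `free`, and `free_slots` are hypothetical names, not part of MRI or any other Ruby implementation.

```ruby
# Toy heap: a fixed array of slots plus a free list of unused slot indexes.
# All names here are illustrative assumptions, not taken from a real VM.
class ToyHeap
  def initialize(size)
    @slots = Array.new(size)       # nil means "slot is empty"
    @free_list = (0...size).to_a   # indexes available for allocation
  end

  # Returns the address (slot index) of the new object, or nil when full.
  def allocate(value)
    address = @free_list.shift
    return nil if address.nil?
    @slots[address] = value
    address
  end

  # Releases a slot so that a later allocation can reuse it.
  def free(address)
    @slots[address] = nil
    @free_list.push(address)
  end

  def free_slots
    @free_list.size
  end
end

heap = ToyHeap.new(3)
a = heap.allocate(:a)
b = heap.allocate(:b)
c = heap.allocate(:c)
heap.free(b)            # deallocate in an order unrelated to allocation
d = heap.allocate(:d)   # reuses the slot that b occupied
```

Returning nil from `allocate` when no slot is free mirrors the `ref.nil?` checks in the pseudocode later in the talk.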

  14. Heap

  15. Mutator

    The part of a running program which
    executes application code

  16. Mutator
    class Book < ActiveRecord::Base
      has_many :authors
      has_many :citations
      has_many :references
      has_many :subscriptions
      has_many :subscribers
    end

  17. Collector

    The part of a running program
    responsible for Garbage Collection

  18. Collector
    static void
    gc_mark_locations(rb_objspace_t *objspace, VALUE *start, VALUE *end)
    {
        long n;
        if (end <= start) return;
        n = end - start;
        mark_locations_array(objspace, start, n);
    }
    void
    rb_gc_mark_locations(VALUE *start, VALUE *end)
    {
        gc_mark_locations(&rb_objspace, start, end);
    }
    #define rb_gc_mark_locations(start, end) gc_mark_locations(objspace, (start), (end))
    struct mark_tbl_arg {
        rb_objspace_t *objspace;
    };

  19. Garbage collection is automatic memory
    management. While the mutator runs, it
    routinely allocates memory from the heap. If
    it needs more memory than is available, the
    collector reclaims unused memory and returns
    it to the heap.

  20. 1960: A Good Year For Garbage Collectors
    • John McCarthy, Recursive Functions of
    Symbolic Expressions and Their Computation
    by Machine, Part I 1960
    • George Collins, A Method for Overlapping
    and Erasure of Lists 1960

  21. 1960: A Good Year For Garbage Collectors
    • McCarthy - Mark and Sweep (Tracing)
    • Collins - Reference Counting

  22. Roots

    References that are directly accessible to
    the Mutator without going through
    other objects
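
To make the role of roots concrete, here is a small sketch that computes which objects are reachable from a root set. The object graph, the root list, and the `reachable` helper are all made up for illustration; a real collector discovers roots in places such as the stack, registers, and global variables.

```ruby
require "set"

# Hypothetical object graph: each key is an object, each value lists the
# objects it references.
graph = {
  a: [:b],
  b: [:c],
  c: [],
  d: [:e],   # d and e reference each other, but no root reaches them
  e: [:d]
}

roots = [:a]   # stands in for, say, a local variable the mutator holds

# Everything reachable by following references outward from the roots.
def reachable(roots, graph)
  seen = Set.new
  pending = roots.dup
  until pending.empty?
    obj = pending.pop
    next unless seen.add?(obj)   # add? returns nil if obj was already seen
    pending.concat(graph.fetch(obj))
  end
  seen
end

live = reachable(roots, graph)
garbage = graph.keys - live.to_a
```

Here `live` holds a, b and c, while d and e are garbage even though each is still referenced by the other.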

  23. Mark and Sweep (Tracing)
    def new
      ref = allocate
      if ref.nil?
        # Allocation failed: collect garbage, then retry once
        mark
        sweep
        ref = allocate
        if ref.nil?
          raise "Out of memory"
        end
      end
      ref
    end

  24. Mark and Sweep (Tracing)
    def mark
      worklist = Worklist.new
      heap_roots.each do |root|
        ref = root.address
        if ref && !ref.is_marked?
          ref.set_marked
          worklist << ref
          recursive_mark(worklist) # trace everything reachable from ref
        end
      end
    end

  25. Mark and Sweep (Tracing)
    def sweep
      object_cursor = heap_start
      while object_cursor < heap_end
        if object_cursor.is_marked?
          object_cursor.unset_marked # survivor: reset for the next cycle
        else
          object_cursor.free
        end
        object_cursor = object_cursor.next # advance to the next object
      end
    end
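
The `new`, `mark` and `sweep` pseudocode can be combined into a small runnable sketch. It rests on several assumptions that are not in the talk: the heap is a fixed array of slots, objects are `Cell` structs carrying their own mark flag, and the mutator registers its roots by hand. `Cell`, `TinyHeap`, `new_cell` and `live_count` are hypothetical names, and none of this reflects MRI's actual data structures.

```ruby
# Toy mark-and-sweep heap, for illustration only.
Cell = Struct.new(:value, :refs, :marked)

class TinyHeap
  attr_reader :roots

  def initialize(size)
    @slots = Array.new(size)   # nil marks a free slot
    @roots = []                # cells the mutator can reach directly
  end

  # Allocate; on failure collect once and retry, as in the slide's `new`.
  def new_cell(value)
    index = @slots.index(nil)
    if index.nil?
      mark
      sweep
      index = @slots.index(nil)
      raise "Out of memory" if index.nil?
    end
    @slots[index] = Cell.new(value, [], false)
  end

  # Mark every cell reachable from the roots, using an explicit worklist.
  def mark
    worklist = @roots.dup
    until worklist.empty?
      cell = worklist.pop
      next if cell.marked
      cell.marked = true
      worklist.concat(cell.refs)
    end
  end

  # Walk the whole heap: clear marks on survivors, free everything else.
  def sweep
    @slots.each_with_index do |cell, i|
      next if cell.nil?
      if cell.marked
        cell.marked = false
      else
        @slots[i] = nil
      end
    end
  end

  def live_count
    @slots.compact.size
  end
end

heap = TinyHeap.new(3)
a = heap.new_cell(:a)
b = heap.new_cell(:b)
heap.new_cell(:unreachable)   # nothing ever points at this cell
a.refs << b
heap.roots << a
d = heap.new_cell(:d)         # heap is full, so this triggers mark and sweep
```

The last allocation succeeds only because the collector reclaims the slot of the unreachable cell; a and b survive because a is a root and b is reachable from it.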

  26. Barrier

    Code that runs as a result of accessing or
    mutating an object on the heap

  27. Reference Counting
    def new
      ref = allocate
      if ref.nil?
        raise "Out of memory"
      end
      ref.ref_count = 0
      ref
    end
    def write(src, i, ref) # A Write Barrier
      add_reference(ref)
      delete_reference(src[i])
      src[i] = ref
    end

  28. Reference Counting
    def add_reference(ref)
      ref.ref_count = ref.ref_count + 1
    end
    def delete_reference(ref)
      ref.ref_count = ref.ref_count - 1
      if ref.ref_count == 0
        ref.pointers.each do |field|
          delete_reference(field.address)
        end
        free(ref)
      end
    end

  29. Pros And Cons
    • Pro: Reference Counting is incremental. It frees memory as
    it works, as soon as an object's count reaches zero
    • Con: Reference Counting cannot easily collect cycles: groups of
    objects on the heap which reference each other, or themselves
    • Pro: Mark & Sweep can collect cycles
    • Con: Mark & Sweep can cause long pauses and
    exhibit poor locality
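
The first two points can be seen in a toy version of the reference counting pseudocode. The sketch assumes objects keep their outgoing references in a hash of named fields and that "freeing" just records the object's name. `RcObject`, `FREED` and the nil checks are additions made for this illustration, not part of the talk.

```ruby
# Toy reference-counted objects, for illustration only.
FREED = []

class RcObject
  attr_reader :name, :fields
  attr_accessor :ref_count

  def initialize(name)
    @name = name
    @fields = {}
    @ref_count = 0
  end
end

def add_reference(ref)
  ref.ref_count += 1 unless ref.nil?
end

def delete_reference(ref)
  return if ref.nil?
  ref.ref_count -= 1
  return unless ref.ref_count.zero?
  ref.fields.each_value { |child| delete_reference(child) }
  FREED << ref.name   # stand-in for returning memory to the heap
end

# Write barrier: every pointer store keeps the counts up to date.
def write(src, field, ref)
  add_reference(ref)
  delete_reference(src.fields[field])
  src.fields[field] = ref
end

root = RcObject.new(:root)   # stands in for a root; never freed here
root.ref_count = 1

# Acyclic garbage is reclaimed the moment the last reference disappears.
leaf = RcObject.new(:leaf)
write(root, :child, leaf)
write(root, :child, nil)     # leaf's count drops to 0, so it is freed

# A cycle keeps itself alive after the mutator lets go of it.
x = RcObject.new(:x)
y = RcObject.new(:y)
write(root, :pair, x)
write(x, :other, y)
write(y, :other, x)
write(root, :pair, nil)      # x is still referenced by y, so nothing is freed
```

After the last write, x and y are unreachable from the root, yet each still has a count of 1, so plain reference counting never reclaims them.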

  30. The Unified Theory
    • Tracing and Reference Counting are “duals” of
    the same operation
    • In optimized form, they are very similar
    • Most successful GCs are hybrid Tracer-Counters
    • Formalized Garbage Collectors with a
    uniform cost-model

  31. The Unified Theory
    • Subtly tweak Reference Counting by
    buffering calls to free()
    • Subtly tweak Mark & Sweep by maintaining a
    true reference count instead of a “live” bit in
    the mark phase
    • Hybrid Example: Generational GC
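
One way to picture the Mark & Sweep tweak is a trace that increments a counter each time it reaches an object, rather than setting a single bit. The sketch below is a simplified reading of that idea and not the formulation used by Bacon et al.; the graph, the roots, and `trace_counts` are hypothetical.

```ruby
# Hypothetical heap: object => list of objects it references.
graph = {
  a: [:b, :c],
  b: [:c],
  c: [],
  d: [:c]   # d is unreachable, so its reference to c must not be counted
}
roots = [:a]

# Tracing variant that records a count per object instead of a mark bit.
# An object's fields are scanned only on its first visit (count 0 -> 1).
def trace_counts(roots, graph)
  counts = Hash.new(0)
  worklist = roots.dup
  until worklist.empty?
    obj = worklist.pop
    counts[obj] += 1
    worklist.concat(graph.fetch(obj)) if counts[obj] == 1
  end
  counts
end

counts = trace_counts(roots, graph)
```

Each count ends up as the number of references from roots and live objects. A plain reference counter would instead report 3 for c, because it also counts the reference from the dead object d.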

  32. The Unified Theory
    • Design of Garbage Collectors can be made
    more methodical
    • Three main decisions:
      • Partition
      • Traversal
      • Trade-offs

  33. Exhale.
    I know that was exciting.

  34. Programming Languages and GC

  35. Programming Languages
    • How are they developed?
    • How are they designed?
    • How do they interact with system
    memory?
    • What aspects of their design are
    pertinent to the discussion of GC?

  36. Ruby
    • Dynamic, Multiple Implementations, we
    all <3 it
    • MRI - Simple Beginnings, Advanced
    Future
    • Rubinius - Thoroughly Modern
    • JRuby - The Power of the JVM

  37. Java
    • Massive amounts of research into the
    JVM’s GC
    • Adaptive
    • Tunable
    • Hybrid approach

  38. Haskell
    • Strongly, statically typed
    • Compiler informs GC
    • Design of language makes certain
    aspects of GC simpler
    • Design of GHC has allowed incremental
    improvements to the GC

  39. Programming Languages
    • Most great programming languages
    have worked on their GCs over time
    • The design of the language heavily
    influences what is possible in GC

  40. “GC is not a generic solution for memory
    leaks, but a (correct) GC is a generic
    solution for 'dangling pointers'. Just as
    there is no general solution for 'loops' (due
    to undecidability), there is no general
    solution for 'leaks'.”
    - Henry Baker

  41. In Conclusion
    • Garbage Collection is a fascinating
    discipline
    • Deep knowledge of your tools is very
    helpful
    • If we understand GC better, we can
    make Ruby better

  42. Works Cited
    • David F. Bacon, Perry Cheng, and V. T. Rajan. A unified theory of garbage collection. In OOPSLA
    2004, pages 50-68.
    • Cooper et al. Teaching Garbage Collection without Implementing Compilers or Interpreters. In SIGCSE
    2013, pages 385-390.
    • Robby Findler. “The Many Faces of Dr. Scheme”. http://www.eecs.northwestern.edu/~robby/logos/
    • Richard Jones, Antony Hosking, and Eliot Moss. The Garbage Collection Handbook: The Art of
    Automatic Memory Management. CRC Applied Algorithms and Data Structures. Chapman & Hall,
    August 2012, pages 375-416.
    • Richard Jones. Garbage Collection Bibliography. http://www.cs.kent.ac.uk/people/staff/rej/gcbib/gcbib.html
    • Henry Lieberman and Carl E. Hewitt. A real-time garbage collector based on the lifetimes of objects. AI
    Memo 569a, MIT, April 1981.

  43. Thanks!
    w: michaelrbernste.in
    @: mrb_bk
    gh: mrb
