Slide 1

Slide 1 text

A Unified Theory of Garbage Collection
Michael R. Bernstein
Papers We Love #1
NYC, 24 February 2014
Monday, February 24, 14
- I am SO EXCITED to be here!
- Many thanks to Zeeshan, Clint, and our hosts at Viggle

Slide 2

Slide 2 text

Obsessed
- Hello, my name is Michael R. Bernstein, and I’m Obsessed.
- Not with anything in particular, just in general
- I want you to be obsessed too.

Slide 3

Slide 3 text

About Me
• Former Computer Science educator
• Currently a “professional programmer”
• Got interested in Garbage Collection because I used Ruby professionally
• This meetup is basically my dream come true
- A little bit about me

Slide 4

Slide 4 text

“A Unified Theory of Garbage Collection”
Bacon, Cheng, and Rajan, 2004
- The paper for tonight’s discussion

Slide 5

Slide 5 text

Outline
• Garbage Collection
• This paper’s contributions
• My favorite parts
• Discussion
- Here’s what we’ll cover

Slide 6

Slide 6 text

Show of Hands
• Studied GC?
• Read the paper?
• Hacked on a GC?
• Here for the beer and pizza?
- Show of hands

Slide 7

Slide 7 text

Garbage Collection
- Let’s talk about Garbage Collection

Slide 8

Slide 8 text

Terminology
• Garbage Collection
• Heap
• Mutator
• Collector
• Roots
• Barriers
- Who has good definitions?

Slide 9

Slide 9 text

“Garbage collection is automatic memory management. While the mutator runs, it routinely allocates memory from the heap. If more memory than available is needed, the collector reclaims unused memory and returns it to the heap.”
- And a definition

Slide 10

Slide 10 text

GC Algorithms
- Let’s discuss the two algorithms that are the starting point for the discussion in the paper

Slide 11

Slide 11 text

“The incremental nature of reference counting is generally considered to be its fundamental advantage. However, the cost of updating reference counts every time a new pointer is loaded into a register is typically much too high for high-performance applications.”
Reference Counting
- Description
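The per-mutation cost described in the quote can be made concrete with a toy model. This is a minimal sketch, not any real runtime's implementation: the `Obj` class, the `write` barrier, and the `freed` log are all hypothetical names invented for illustration. Every pointer store pays for one increment and one decrement, and an object is reclaimed the moment its count hits zero.

```python
class Obj:
    """A toy heap object with an explicit reference count (hypothetical model)."""
    def __init__(self, name):
        self.name = name
        self.rc = 0
        self.fields = {}

freed = []  # names of reclaimed objects, in reclamation order

def inc(obj):
    if obj is not None:
        obj.rc += 1

def dec(obj):
    if obj is None:
        return
    obj.rc -= 1
    if obj.rc == 0:
        # Reclaim immediately, releasing everything obj pointed to first.
        for child in obj.fields.values():
            dec(child)
        obj.fields.clear()
        freed.append(obj.name)

def write(src, field, new):
    """Pointer store: this bookkeeping runs on every mutation."""
    old = src.fields.get(field)
    inc(new)                 # increment first so re-storing the same object is safe
    src.fields[field] = new
    dec(old)

root = Obj("root")
root.rc = 1                  # assume the root is pinned by the mutator
a, b = Obj("a"), Obj("b")
write(root, "x", a)          # root -> a
write(a, "y", b)             # a -> b
write(root, "x", None)       # drop the only reference to a
print(freed)                 # ['b', 'a']
```

Dropping the last reference to `a` frees both objects right away, which is the incremental behavior the quote calls the fundamental advantage. Note that a cycle of objects would never reach a count of zero in this model.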

Slide 12

Slide 12 text

“As a result, some form of deferred reference counting is used...the result is delayed collection, floating garbage, and longer pauses - the typical characteristics of tracing collectors!”
Reference Counting
- RC

Slide 13

Slide 13 text

“...its fundamental advantages are the lack of per-mutation overhead and the natural collection of cyclic garbage. However, a fundamental disadvantage of tracing is that freeing of dead objects is delayed...resulting in long pause times.”
Mark & Sweep
- M&S
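For contrast, here is a minimal mark-and-sweep sketch under simplifying assumptions: the heap is a plain dictionary from object name to the names it points to, and `mark_sweep` is a hypothetical helper, not code from the paper. Nothing happens on pointer writes; all the work is done in one pass from the roots, and the unreachable cycle is collected along with everything else.

```python
def mark_sweep(heap, roots):
    """heap maps each object name to the list of names it points to.

    Returns (marked, dead) and deletes the dead objects from heap.
    """
    # Mark phase: everything reachable from the roots is live.
    marked = set()
    worklist = list(roots)
    while worklist:
        name = worklist.pop()
        if name in marked:
            continue
        marked.add(name)
        worklist.extend(heap[name])

    # Sweep phase: everything left unmarked is garbage.
    dead = sorted(n for n in heap if n not in marked)
    for n in dead:
        del heap[n]
    return marked, dead

heap = {
    "root": ["a"],
    "a": ["b"],
    "b": [],
    "c": ["d"],   # c and d form an unreachable cycle
    "d": ["c"],
}
live, dead = mark_sweep(heap, ["root"])
print(sorted(live))  # ['a', 'b', 'root']
print(dead)          # ['c', 'd']
```

The cost is that `c` and `d` stay allocated until the next collection runs, and the whole reachable graph is visited in one go, which is where the pause comes from.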

Slide 14

Slide 14 text

“One of the first optimizations that is typically applied is generational collection. This reduces the average pause time...but also introduces per-mutation overhead -- thus it takes on both some of the positive and negative aspects of reference counting collection.”
Mark & Sweep
- M&S
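The per-mutation overhead that generational collection introduces usually takes the form of a write barrier feeding a remembered set. The sketch below is a simplified illustration under assumed names (`Node`, `write`, `remembered`, `minor_collect` are all hypothetical): a minor collection traces only young objects, so old-to-young pointers have to be recorded when they are written.

```python
class Node:
    def __init__(self, name, old=False):
        self.name = name
        self.old = old        # True if the object lives in the old generation
        self.fields = {}

remembered = set()  # old objects that may point into the nursery

def write(src, field, target):
    # Write barrier: the extra work done on every pointer store.
    if src.old and target is not None and not target.old:
        remembered.add(src)
    src.fields[field] = target

def young_children(node):
    return [t for t in node.fields.values() if t is not None and not t.old]

def minor_collect(nursery, roots):
    """Trace young objects only, starting from roots plus the remembered set."""
    work = [r for r in roots if not r.old]
    for holder in remembered:
        work.extend(young_children(holder))
    live = set()
    while work:
        n = work.pop()
        if n in live:
            continue
        live.add(n)
        work.extend(young_children(n))
    return [n for n in nursery if n in live]

old_obj = Node("old", old=True)
y1, y2, y3 = Node("y1"), Node("y2"), Node("y3")
write(old_obj, "f", y1)   # old -> young: recorded by the barrier
write(y1, "g", y2)        # young -> young: no record needed
survivors = minor_collect([y1, y2, y3], roots=[old_obj])
print([n.name for n in survivors])  # ['y1', 'y2']
```

The old generation is never scanned during the minor collection, which is what shortens the pause; the price is the check inside `write`.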

Slide 15

Slide 15 text

This Paper’s Structure
• Qualitative Comparison
• The Algorithmic Duals
• Hybrid Collectors
• Cycle Collection
• Multi-Heap Collectors
• Cost Analysis
• Space-Time Tradeoffs
- Formal framework
- Fixed point formulation
- Shows that what we think is different is similar

Slide 16

Slide 16 text

This Paper’s Contributions
• Formal framework for GC algorithms
• Breaks down cargo-culted beliefs
• Nearly all production GCs are hybrids
• Design recommendations
- Formal framework
- Fixed point formulation
- Shows that what we think is different is similar

Slide 17

Slide 17 text

"Our first-hand experience of (and frustration with) the convergence of optimized forms of reference counting and tracing collectors led directly to a deeper study of the algorithms in the hope of finding fundamental similarities that seem to be appearing in practice."
- Awesome quote

Slide 18

Slide 18 text

“...tracing operates on live objects, or 'matter', while reference counting operates on dead objects, or 'anti-matter'. For every operation performed by the tracing collector, there is a precisely corresponding anti-operation performed by the reference counting collector.”
- Awesome quote
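The matter/anti-matter picture can be sketched in code. This is a simplified illustration in the spirit of the paper's fixed-point formulation, not the authors' algorithms; `trace_counts`, `rc_counts`, and the example graph are assumptions made for the demo. Tracing starts every count at zero and counts up from the roots; reference counting starts from the full in-degree and counts down from objects already known to be dead.

```python
def trace_counts(edges, roots):
    """Tracing: start from 0 everywhere and increment while walking live objects."""
    rho = {v: 0 for v in edges}
    work = list(roots)
    while work:
        v = work.pop()
        rho[v] += 1
        if rho[v] == 1:              # first visit: scan outgoing pointers
            work.extend(edges[v])
    return rho

def rc_counts(edges, roots):
    """Reference counting: start from in-degree and decrement while walking dead objects."""
    rho = {v: 0 for v in edges}
    for r in roots:
        rho[r] += 1
    for v in edges:
        for w in edges[v]:
            rho[w] += 1
    work = [v for v in edges if rho[v] == 0]
    while work:
        v = work.pop()
        for w in edges[v]:
            rho[w] -= 1
            if rho[w] == 0:
                work.append(w)
    return rho

edges = {
    "root": ["a", "b"],
    "a": ["b"],
    "b": [],
    "x": ["a"],   # dead object still pointing at a live one
    "c": ["d"],   # unreachable cycle
    "d": ["c"],
}
t = trace_counts(edges, ["root"])
r = rc_counts(edges, ["root"])
print(t)  # {'root': 1, 'a': 1, 'b': 2, 'x': 0, 'c': 0, 'd': 0}
print(r)  # {'root': 1, 'a': 1, 'b': 2, 'x': 0, 'c': 1, 'd': 1}
print(sorted(v for v in edges if t[v] != r[v]))  # ['c', 'd']
```

Both procedures agree on every object except the cycle: tracing reaches the smallest consistent assignment of counts, reference counting stops at a larger one, and the gap between them is exactly the cyclic garbage.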

Slide 19

Slide 19 text

- The comparison, revisited

Slide 20

Slide 20 text

- Awesome diagrams!

Slide 21

Slide 21 text

Why I Love It
• Emphasizes tradeoffs
• Specifies areas of focus for design
• Really neat and novel
- Tradeoffs: algorithms that are good for distributed or disk-backed systems would be terrible for VMs, and vice versa
- Authors provide a list of properties of GC algorithms that should be considered in design

Slide 22

Slide 22 text

Discussion
• Other algorithmic duals?
• Details of fixed-point formulation
• Partition, traversal, and trade-offs
• Automatically tuned GCs
- Others: not quite the same, but FP vs. OOP

Slide 23

Slide 23 text

Thank You
• michaelrbernste.in
• twitter.com/mrb_bk
- You’re awesome!
- Find me online