Slide 1

Slide 1 text

Garbage Collection in Java
“Heap storage for objects is reclaimed by an automatic storage management system. Objects are never explicitly de-allocated”
–Java Virtual Machine Specification [1]

Slide 2

Slide 2 text

Java Variable Flavours
• Stack
  – Local variables holding primitives.
  – Local variables of reference type, which point at heap memory.
• Heap
  – Objects.
  – Primitive fields of an object.
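The split between stack and heap can be illustrated with a small sketch. The class, field, and method names here (`VariableFlavours`, `Point`, `sum`) are made up for illustration and do not come from the slides.

```java
// Illustrative sketch only: all names are hypothetical.
class VariableFlavours {
    // A Point instance lives on the heap, and so do its primitive fields x and y.
    static final class Point {
        int x;
        int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static int sum() {
        int a = 3;                  // local primitive: held in the stack frame of sum()
        Point p = new Point(a, 4);  // the reference p is on the stack, the Point object is on the heap
        Point q = p;                // a second stack reference to the same heap object
        q.x = 10;                   // visible through p too, because both refer to one object
        return p.x + p.y;           // 10 + 4
    }

    public static void main(String[] args) {
        System.out.println(sum());  // prints 14
    }
}
```

When `sum()` returns, its stack frame (holding `a`, `p`, and `q`) disappears, and the `Point` object becomes unreachable and eligible for collection.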

Slide 3

Slide 3 text

Mark and Sweep Approach
• Simplest garbage collection algorithm.
• Recovers and reuses heap memory that is no longer in use.
• Algorithm.
  – Stop-the-world. Non-deterministic pauses stop all running application threads.
  – Start. Beginning from objects known to be live, walk through the tree of references.
  – Mark. Any object reached along the way is marked as live.
  – Sweep. Everything left unmarked is garbage and can be collected.
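The mark and sweep steps can be sketched on a toy object graph. This is a simplified model under stated assumptions, not how a JVM implements collection: the "heap" is a plain list, `Node` stands in for an object, and the stop-the-world pause is not modelled. All names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model: a list plays the role of the heap, Node plays the role of an object.
class MarkSweepSketch {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Mark: walk the reference graph from the roots and record every object reached.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> live = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (live.add(n)) {          // first visit only, so cycles terminate
                pending.addAll(n.refs);
            }
        }
        return live;
    }

    // Sweep: remove every unmarked object from the toy heap and report what was collected.
    static List<String> sweep(List<Node> heap, Set<Node> live) {
        List<String> collected = new ArrayList<>();
        heap.removeIf(n -> {
            if (live.contains(n)) return false;
            collected.add(n.name);
            return true;
        });
        return collected;
    }

    static List<String> demo() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c"), d = new Node("d");
        a.refs.add(b);   // a -> b, reachable from the root
        c.refs.add(d);   // c <-> d form a cycle that no root reaches
        d.refs.add(c);
        List<Node> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        return sweep(heap, mark(Arrays.asList(a)));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [c, d]
    }
}
```

Note that the unreachable cycle `c <-> d` is collected, which is one reason tracing collectors are preferred over plain reference counting.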

Slide 4

Slide 4 text

Generational Garbage Collector
• Mark-and-sweep, improved.
• Weak generational hypothesis [2].
  – Most objects become unreachable quickly.
  – Few references from older to younger objects exist.
• Areas of memory.
  – Eden. Most new objects are allocated here (very large objects go directly to the old generation).
  – Survivor. Holds objects that survived a collection; of the two spaces, one holds objects and the other is empty.
  – Tenured. Longer-lived objects that have been promoted.
  – PermGen. Not strictly part of the heap; holds internal structures (e.g. class definitions).
[Diagram: Young Generation (Eden, Survivor From, Survivor To); Old Generation (Tenured); PermGen]
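The memory areas of a running JVM can be listed with the standard `java.lang.management` API. The following is a minimal sketch; the class name `MemoryPools` is made up, and the pool names printed depend on the JVM version and the collector in use (for instance, Java 8 and later report Metaspace where older versions reported a permanent generation).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

// Lists the memory pools the current JVM exposes; names vary by JVM and collector.
class MemoryPools {
    static List<String> poolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // getType() distinguishes heap pools from non-heap pools.
            names.add(pool.getName() + " (" + pool.getType() + ")");
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : poolNames()) {
            System.out.println(name);
        }
    }
}
```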

Slide 5

Slide 5 text

Escape Analysis
• Recent change. Java 6u23.
• Local variables. Only used inside the method.
  – Not passed into other methods.
  – Not returned.
• No heap. Object created on the method stack frame.
  – Reduces the number of objects in young collections.
  – Memory used is freed when the method returns.
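A sketch of the two cases, assuming a HotSpot JVM with escape analysis enabled: in `lengthSquared` the `Vec` object never leaves the method, so the JIT compiler may avoid allocating it on the heap; in `make` it escapes through the return value. The names are hypothetical, and whether the optimisation actually fires depends on the JIT; the code behaves the same either way.

```java
// Hypothetical example: Vec and both methods exist only to illustrate escaping.
class EscapeSketch {
    static final class Vec {
        final double x;
        final double y;
        Vec(double x, double y) { this.x = x; this.y = y; }
    }

    // v does not escape: it is not returned, not stored in a field,
    // and not passed to another method. A candidate for escape analysis.
    static double lengthSquared(double x, double y) {
        Vec v = new Vec(x, y);
        return v.x * v.x + v.y * v.y;
    }

    // The new object escapes through the return value, so it must outlive the call.
    static Vec make(double x, double y) {
        return new Vec(x, y);
    }

    public static void main(String[] args) {
        double total = 0;
        // Call the method many times so the JIT has a reason to compile it.
        for (int i = 0; i < 1_000_000; i++) {
            total += lengthSquared(i, 1);
        }
        System.out.println(total);
        System.out.println(make(3, 4).x);
    }
}
```

On HotSpot, running with `-XX:-DoEscapeAnalysis` and comparing GC activity is one way to observe the effect.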

Slide 6

Slide 6 text

Concurrent Mark-Sweep in Action
• Two short pauses per GC cycle: initial mark and remark.
• Initial mark. Identifies the set of objects immediately reachable from outside the old generation.
• Concurrent marking phase. Marks all live objects transitively reachable from this set.
• Object graph can change. Not all live objects are guaranteed to be marked.
• Pre-cleaning. Revisits objects modified concurrently with the marking phase.
• Second pause (remark). Revisits objects modified during the concurrent marking phase.
• Concurrent sweep phase. Deallocates garbage objects without relocating the live ones.
[Timeline diagram: Initial Mark, Marking/Pre-cleaning, Remark, Sweeping; legend: running application thread, running GC thread]
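The phases above happen inside the JVM, but collector activity can be observed from application code through the standard `GarbageCollectorMXBean` interface. The sketch below is collector-agnostic and the class name `CollectorInfo` is made up. On HotSpot versions that still ship CMS (it was removed in JDK 14), the collector is selected with `-XX:+UseConcMarkSweepGC`; the bean names printed depend on the JVM.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Reports the collectors active in this JVM with their cumulative counts and times.
class CollectorInfo {
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": " + gc.getCollectionCount() + " collections, "
                    + gc.getCollectionTime() + " ms");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Allocate short-lived garbage so the counters have a chance to move.
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
        }
        for (String line : describe()) {
            System.out.println(line);
        }
    }
}
```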

Slide 7

Slide 7 text

CMS Pros/Cons
• Advantages.
  – Two pauses. No one single long pause.
  – Concurrency. App and GC run in parallel.
• Disadvantages.
  – Extra overhead.
  – Free lists. Free space is not contiguous.
  – Large Java heap required. The marking cycle lasts longer than a stop-the-world collection, and space is reclaimed only at the end.
  – App runs concurrently. Old generation occupancy can grow during the marking phase.
  – Floating garbage. Not all garbage objects are guaranteed to be collected.
  – Fragmentation. Without compaction, free space may not be used efficiently [3].
[Diagram: Start sweeping, End sweeping]
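The fragmentation problem can be shown with a toy free list. This is a simplified model, not CMS's actual allocator: the chunk sizes are invented, and the only point is that total free space can exceed a request while no single chunk can satisfy it.

```java
// Toy free list: each entry is the size of one non-contiguous free chunk.
class FreeListSketch {
    static int totalFree(int[] freeChunks) {
        int total = 0;
        for (int chunk : freeChunks) {
            total += chunk;
        }
        return total;
    }

    // Without compaction, an allocation needs one chunk that is large enough on its own.
    static boolean canAllocate(int[] freeChunks, int request) {
        for (int chunk : freeChunks) {
            if (chunk >= request) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] free = {16, 8, 24, 16};               // hypothetical chunk sizes after a sweep
        System.out.println(totalFree(free));        // prints 64
        System.out.println(canAllocate(free, 32));  // prints false: 64 free, but no chunk >= 32
        System.out.println(canAllocate(free, 24));  // prints true
    }
}
```

A compacting collector would merge the four chunks into one block of 64, and the request for 32 would succeed.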

Slide 8

Slide 8 text

Garbage-First GC (G1) in Action
• Features. Parallel, concurrent, incrementally compacting, low-pause.
• Heap layout. Split into regions.
• Region. Equal-sized chunks.
• Pause goal. How long the app can pause for GC while running (e.g. 20 ms every 5 min).
• Pause. Objects are evacuated from one or more regions to a single region.
• Statistics. Average time a region takes to collect.
• Why G1? Knows which regions are mostly empty.
  – Collects. In these regions first.
  – Concentrates. Collection effort goes to areas likely to be full of garbage.
[Timeline diagram: GC pauses and a marking phase; legend: running application thread, running GC thread]
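The "garbage first" idea can be sketched as a greedy selection: order regions by how little live data they hold, then add regions to the collection set until the estimated cost would exceed the pause goal. This is a heavily simplified illustration, not G1's real algorithm; `Region`, `liveBytes`, `costMs`, and the numbers are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Simplified model of picking regions to collect within a pause budget.
class GarbageFirstSketch {
    static final class Region {
        final String id;
        final int liveBytes;   // live data that would have to be evacuated
        final double costMs;   // estimated evacuation time, e.g. from past statistics
        Region(String id, int liveBytes, double costMs) {
            this.id = id;
            this.liveBytes = liveBytes;
            this.costMs = costMs;
        }
    }

    static List<String> chooseCollectionSet(List<Region> regions, double pauseTargetMs) {
        List<Region> sorted = new ArrayList<>(regions);
        // Mostly empty regions (least live data) come first: most garbage for least work.
        sorted.sort(Comparator.comparingInt((Region r) -> r.liveBytes));

        List<String> chosen = new ArrayList<>();
        double budget = pauseTargetMs;
        for (Region r : sorted) {
            if (r.costMs > budget) {
                break;          // the next region would overshoot the pause goal
            }
            chosen.add(r.id);
            budget -= r.costMs;
        }
        return chosen;
    }

    static List<String> demo() {
        List<Region> regions = Arrays.asList(
                new Region("r0", 900, 9.0),
                new Region("r1", 100, 1.0),
                new Region("r2", 300, 3.0),
                new Region("r3", 50, 0.5));
        return chooseCollectionSet(regions, 5.0);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [r3, r1, r2]
    }
}
```

With a 5 ms target, the three emptiest regions fit (0.5 + 1.0 + 3.0 ms) and the nearly full region `r0` is left for a later cycle.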

Slide 9

Slide 9 text

G1 Pros/Cons
• Advantages.
  – High performance.
  – Pause time goals. Prevent interruptions proportional to heap or live-data size.
  – Compact and free up memory. Works continuously to reduce fragmentation.
  – Concurrent global marking phase. Determines liveness of the objects.
  – Pause prediction model.
    • Meets a user-defined pause time target with high probability.
    • Selects the number of regions to collect based on the time target.
• Disadvantages.
  – Target. Multi-processor machines with large memories.

Slide 10

Slide 10 text

Comparison
• G1.
  – Compacting collector.
  – Avoids free lists, relies on regions.
  – More predictable pauses.
  – User can specify pause targets.
• Switch to G1 when:
  – More than 50% of the heap is occupied with live data.
  – Allocation and promotion rates vary significantly.
  – Collection or compaction pauses are undesirably long (0.5 s to 1 s).
• CMS.
  – No compaction.
  – No control over pause times.
• ParallelOld.
  – Whole-heap compaction, long pauses.
  – No control over pause times.

Slide 11

Slide 11 text

References
[1] Tim Lindholm and Frank Yellin. Java Virtual Machine Specification, Second Edition. Addison-Wesley, Reading, MA, 1999.
[2] Charlie Hunt and Binu John. Java Performance, First Edition. Pearson, 2011.
[3] Richard Jones and Rafael Lins. Garbage Collection. John Wiley & Sons, Ltd., West Sussex, PO19 1UD, England, 1996.
[4] Memory Management in the Java HotSpot Virtual Machine. Sun Microsystems, April 2006.
[5] B.J. Evans and M. Verburg. Well-Grounded Java Developer. Manning, 2012.

Slide 12

Slide 12 text

Questions?