Slide 1

Slide 1 text

The Garbage-First Garbage Collector
Tony Printezis, Sun Microsystems
Paul Ciciora, Chicago Board Options Exchange
#TS-5419

Slide 2

Slide 2 text

2008 JavaOneSM Conference | java.sun.com/javaone
Trademarks And Abbreviations (to get them out of the way...)
Java™ Platform, Standard Edition (Java SE)
Java HotSpot™ Virtual Machine (HotSpot JVM™)
Solaris™ Operating System (Solaris OS)
Garbage Collection (GC)
Concurrent Mark-Sweep Garbage Collector (CMS)
Garbage-First Garbage Collector (G1)

Slide 3

Slide 3 text

Who Are These Guys?
Tony Printezis
• Staff Engineer, HotSpot JVM GC Group
• Sun Microsystems, Burlington, MA
Paul Ciciora
• “The Adventurous Customer”
• Dept Head, Object Oriented Infrastructure
• Chicago Board Options Exchange (CBOE), Chicago, IL

Slide 4

Slide 4 text

Learn how Garbage-First (aka G1), the new low-latency garbage collector in the HotSpot JVM, works and what it will mean for your application in the future.

Slide 5

Slide 5 text

Agenda
Garbage-First Attributes
Garbage-First Operation
First Impressions: CBOE
Final Thoughts

Slide 6

Slide 6 text

Agenda
Garbage-First Attributes
Garbage-First Operation
First Impressions: CBOE
Final Thoughts

Slide 7

Slide 7 text

The Garbage-First Garbage Collector (G1)
Future CMS Replacement
Server “Style” Garbage Collector
Parallel
Concurrent
Generational
Good Throughput
Compacting
Improved ease-of-use
Predictable (though not hard real-time)

Slide 8

Slide 8 text

The Garbage-First Garbage Collector (G1)
Future CMS Replacement
Server “Style” Garbage Collector
Parallel
Concurrent
Generational
Good Throughput
Compacting
Improved ease-of-use
Predictable (though not hard real-time)
Main differences between Garbage-First and CMS

Slide 9

Slide 9 text

GC Work: Parallelism & Concurrency
[Diagram legend: GC Thread, Application Thread]

Slide 10

Slide 10 text

Predictability
Cannot guarantee hard real-time behavior
• OS scheduling
• Other processes
• Application behavior (without analysis)
Soft real-time
• Achieve a good level of predictable behavior
• With high probability... but no hard guarantees
If you want hard real-time guarantees, please consider using Sun Microsystems' Java Real-Time System
We are expecting G1 to be more predictable than CMS

Slide 11

Slide 11 text

Compaction
Consolidates free space in large chunk(s)
Battles fragmentation
• Important for long-running applications
• We can sleep better at night... and in the daytime too. :-)
Enables fast allocation
• Linear, no free lists
• TLABs (thread-local allocation buffers)
• Infrequent synchronization at allocation time
  • Only at the slow path
No free lunch!
• Copying is the largest contributor to pause times
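The fast-allocation bullets can be illustrated with a small sketch. This is not HotSpot code: the class names `SharedHeap` and `Tlab`, the use of plain `long` offsets as addresses, and the policy of sending oversized requests straight to the shared heap are all assumptions made for illustration. It shows linear (bump-pointer) allocation with no free lists, where threads synchronize only on the slow path that refills a thread-local buffer.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical shared heap: hands out address ranges by bumping one pointer. */
final class SharedHeap {
    private final AtomicLong top = new AtomicLong(0);
    private final long end;

    SharedHeap(long sizeBytes) {
        this.end = sizeBytes;
    }

    /** Slow path: the only place where threads contend (a CAS loop). Returns the start offset, or -1 if full. */
    long allocateChunk(long bytes) {
        while (true) {
            long start = top.get();
            long newTop = start + bytes;
            if (newTop > end) {
                return -1;
            }
            if (top.compareAndSet(start, newTop)) {
                return start;
            }
        }
    }
}

/** Hypothetical thread-local allocation buffer; one instance per thread, so no locking is needed here. */
final class Tlab {
    private final SharedHeap heap;
    private final long tlabBytes;
    private long cursor = 0;
    private long limit = 0;
    int refills = 0;

    Tlab(SharedHeap heap, long tlabBytes) {
        this.heap = heap;
        this.tlabBytes = tlabBytes;
    }

    /** Returns the offset of the new object, or -1 if the heap is exhausted. */
    long allocate(long bytes) {
        if (bytes > tlabBytes) {
            // Assumption: requests larger than a TLAB go directly to the shared heap.
            return heap.allocateChunk(bytes);
        }
        if (cursor + bytes > limit) {
            // Slow path: fetch a fresh buffer from the shared heap.
            long start = heap.allocateChunk(tlabBytes);
            if (start < 0) {
                return -1;
            }
            cursor = start;
            limit = start + tlabBytes;
            refills++;
        }
        // Fast path: bump the thread-local cursor, no synchronization.
        long addr = cursor;
        cursor += bytes;
        return addr;
    }
}
```

With a 64-byte buffer and 16-byte objects, four allocations in a row take the fast path after a single refill, and the fifth triggers the next refill.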

Slide 12

Slide 12 text

Weak Generational Hypothesis
Two observations
• “Most newly-allocated objects will die young.”
• “There are few old-to-young references.”
Split the heap into “generations”
• Usually two: young generation / old generation
Concentrate collection effort on the young generation
• Good payoff (a lot of space reclaimed) for your collection effort
• Lower GC overhead
• Most pauses are short
Reduced allocation rate into the old generation
• Young generation acts as a “filter”

Slide 13

Slide 13 text

Why Generational?
Most Java applications
• Conform to the weak generational hypothesis
• Really benefit from generational GC
• Performance-wise, generational GC is hard to beat in most cases
All GCs in the HotSpot JVM are generational

Slide 14

Slide 14 text

Agenda
Garbage-First Attributes
Garbage-First Operation
First Impressions: CBOE
Final Thoughts

Slide 15

Slide 15 text

Color Key
Young Generation
Old Generation
Recently Copied in Young Generation
Recently Copied in Old Generation
Non-Allocated Space

Slide 16

Slide 16 text

Young GCs in CMS (i)
Young generation, split into
• Eden
• Survivor spaces
Old generation
• In-place de-allocation
• Managed by free lists
[Diagram: CMS]

Slide 17

Slide 17 text

Young GCs in CMS (ii)
During a young generation GC
• Survivors from the young generation are evacuated to
  • Other survivor space
  • Old generation
[Diagram: CMS]

Slide 18

Slide 18 text

Young GCs in CMS (iii)
End of young generation GC
[Diagram: CMS]

Slide 19

Slide 19 text

Young GCs in G1 (i)
Heap split into regions
• Currently 1MB regions
Young generation
• A set of regions
Old generation
• A set of regions
[Diagram: G1]

Slide 20

Slide 20 text

Young GCs in G1 (ii)
During a young generation GC
• Survivors from the young regions are evacuated to:
  • Survivor regions
  • Old regions
[Diagram: G1]

Slide 21

Slide 21 text

Young GCs in G1 (iii)
End of young generation GC
[Diagram: G1]

Slide 22

Slide 22 text

Summary: Young GCs in G1
Single physical heap, split into regions
• Set of contiguous regions allocated for large (“humongous”) objects
No physically separate young generation
• A set of (non-contiguous) regions
• Very easy to resize
Young GCs
• Done with “evacuation pauses”
• Stop-the-world
• Parallel
• Evacuate surviving objects from one set of regions to another
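The region layout summarized above might be modeled roughly as follows. This is a sketch under stated assumptions, not G1's data structure: `RegionKind`, `RegionHeap`, and the first-fit search are hypothetical names and choices. It shows why the young generation is easy to resize (it is just whichever regions are currently tagged young) and why a humongous object needs a contiguous run of free regions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical tags for what a region currently holds. */
enum RegionKind { FREE, EDEN, SURVIVOR, OLD, HUMONGOUS }

/** Hypothetical model of a heap split into fixed-size regions. */
final class RegionHeap {
    static final long REGION_BYTES = 1L << 20; // 1 MB, the region size quoted on the slides

    final RegionKind[] regions;

    RegionHeap(int regionCount) {
        regions = new RegionKind[regionCount];
        Arrays.fill(regions, RegionKind.FREE);
    }

    /** Tags the first free region with the given kind; returns its index, or -1 if none is free. */
    int claim(RegionKind kind) {
        for (int i = 0; i < regions.length; i++) {
            if (regions[i] == RegionKind.FREE) {
                regions[i] = kind;
                return i;
            }
        }
        return -1;
    }

    /** Claims a contiguous run of free regions for a large object; returns the first index, or -1. */
    int claimHumongous(long objectBytes) {
        if (objectBytes <= 0) {
            throw new IllegalArgumentException("objectBytes must be positive");
        }
        int needed = (int) ((objectBytes + REGION_BYTES - 1) / REGION_BYTES);
        int run = 0;
        for (int i = 0; i < regions.length; i++) {
            run = (regions[i] == RegionKind.FREE) ? run + 1 : 0;
            if (run == needed) {
                int first = i - needed + 1;
                for (int j = first; j <= i; j++) {
                    regions[j] = RegionKind.HUMONGOUS;
                }
                return first;
            }
        }
        return -1;
    }

    /** The young generation is simply the (possibly non-contiguous) set of eden and survivor regions. */
    List<Integer> youngRegions() {
        List<Integer> young = new ArrayList<>();
        for (int i = 0; i < regions.length; i++) {
            if (regions[i] == RegionKind.EDEN || regions[i] == RegionKind.SURVIVOR) {
                young.add(i);
            }
        }
        return young;
    }
}
```

Growing or shrinking the young generation in this model only means tagging more or fewer free regions as `EDEN`; nothing has to move.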

Slide 23

Slide 23 text

Old GCs in CMS (Sweeping After Marking) (i)
Concurrent marking phase
• Two stop-the-world pauses
  • Initial mark
  • Remark
• Marks reachable (live) objects
• Unmarked objects are deduced to be unreachable (dead)
[Diagram: CMS]

Slide 24

Slide 24 text

Old GCs in CMS (Sweeping After Marking) (ii)
Concurrent sweeping phase
• Sweeps over the heap
• In-place de-allocates unmarked objects
[Diagram: CMS]

Slide 25

Slide 25 text

Old GCs in CMS (Sweeping After Marking) (iii)
End of concurrent sweeping phase
• All unmarked objects are de-allocated
[Diagram: CMS]

Slide 26

Slide 26 text

Old GCs in G1 (After Marking) (i)
Concurrent marking phase
• One stop-the-world pause
  • Remark
  • (Initial mark piggybacked on an evacuation pause)
• Calculates liveness information per region
• Empty regions can be reclaimed immediately
[Diagram: G1]

Slide 27

Slide 27 text

Old GCs in G1 (After Marking) (ii)
End of remark phase
[Diagram: G1]

Slide 28

Slide 28 text

Old GCs in G1 (After Marking) (iii)
Reclaiming old regions
• Pick regions with low live ratio
• Collect them piggy-backed on young GCs
• Only a few old regions collected per such GC
[Diagram: G1]
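One way to picture "pick regions with low live ratio" is a simple sort-and-cap selection over the per-region liveness data produced by marking. The sketch below is an illustration only: `RegionInfo`, `CollectionSetChooser`, the `maxLiveRatio` cutoff, and the `maxRegions` cap are hypothetical, and the slides do not say how G1 actually ranks or limits candidates.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical per-region liveness record, as computed by concurrent marking. */
final class RegionInfo {
    final int index;
    final long liveBytes;
    final long capacityBytes;

    RegionInfo(int index, long liveBytes, long capacityBytes) {
        this.index = index;
        this.liveBytes = liveBytes;
        this.capacityBytes = capacityBytes;
    }

    double liveRatio() {
        return (double) liveBytes / capacityBytes;
    }
}

/** Hypothetical chooser: emptiest old regions first, and only a few per pause. */
final class CollectionSetChooser {
    static List<RegionInfo> chooseOld(List<RegionInfo> oldRegions, double maxLiveRatio, int maxRegions) {
        List<RegionInfo> candidates = new ArrayList<>(oldRegions);
        // Regions with the least live data are the cheapest to evacuate and free the most space.
        candidates.sort(Comparator.comparingDouble(RegionInfo::liveRatio));
        List<RegionInfo> chosen = new ArrayList<>();
        for (RegionInfo region : candidates) {
            if (chosen.size() >= maxRegions || region.liveRatio() > maxLiveRatio) {
                break; // the list is sorted, so every later region is at least as full
            }
            chosen.add(region);
        }
        return chosen;
    }
}
```

Regions that stay above the cutoff are simply left alone, which matches the later point that some garbage may remain in regions with a very high live ratio.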

Slide 29

Slide 29 text

Old GCs in G1 (After Marking) (iv)
We might leave some garbage objects in the heap
• In regions with very high live ratio
• We might collect them later
[Diagram: G1]

Slide 30

Slide 30 text

Summary: Old GCs in G1
Concurrent marking phase
• Calculates liveness information per region
  • Identifies best regions for subsequent evacuation pauses
• No corresponding sweeping phase
• Different marking algorithm than CMS
  • Snapshot-at-the-beginning (SATB)
  • Achieves shorter remarks
Old regions reclaimed by
• Remark (when totally empty)
• Evacuation pauses
Most reclamation happens with evacuation pauses
• Compaction
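The core idea of snapshot-at-the-beginning marking is that, while marking is in progress, the collector must not lose objects that were reachable when marking started. A common way to achieve this is a pre-write barrier that records the old value of a reference field before it is overwritten. The sketch below shows that idea only; `SatbBarrier`, the `Object[]` stand-in for an object's fields, and the single unsynchronized queue are simplifications assumed for illustration, not HotSpot's implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical SATB pre-write barrier: remembers overwritten references while marking runs. */
final class SatbBarrier {
    private boolean markingActive = false;
    private final Deque<Object> satbQueue = new ArrayDeque<>();

    void startMarking() {
        markingActive = true;
    }

    void finishMarking() {
        markingActive = false;
        satbQueue.clear();
    }

    /** Stands in for a reference store "holder.field = newValue"; slot plays the role of the field. */
    void writeField(Object[] holder, int slot, Object newValue) {
        if (markingActive) {
            Object oldValue = holder[slot];
            if (oldValue != null) {
                // The old target was part of the snapshot; queue it so the marker still visits it.
                satbQueue.add(oldValue);
            }
        }
        holder[slot] = newValue;
    }

    int pendingCount() {
        return satbQueue.size();
    }

    /** The marker drains queued objects and treats them as live for this cycle; null when empty. */
    Object poll() {
        return satbQueue.poll();
    }
}
```

Because objects reachable at the start are preserved this way, the final remark pause mostly has to drain what is left in such queues, which is consistent with the shorter remarks noted above.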

Slide 31

Slide 31 text

G1: “One-And-A-Half” GCs
CMS
• Young generation GCs
• Old generation concurrent marking and sweeping
G1
• Evacuation pauses
  • Both for young and old regions
• Only concurrent marking

Slide 32

Slide 32 text

Remembered Sets
During evacuation pauses
• Need to identify “roots” from other regions
We maintain “remembered sets”
• One per region
• Keeps track of all heap locations with references into that region
We can pick any region to collect
• Without sweeping the whole heap to find references into it
Remembered set maintenance
• Write barrier + concurrent processing
Remembered set footprint
• <5% of the heap
[Diagram labels: A, B, C, D]
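The bookkeeping described above can be sketched as one set of incoming locations per region, filled in by a write barrier. The names (`RememberedSets`, `recordWrite`, `rootsInto`) and the use of `long` offsets as addresses are assumptions for illustration. The sketch also updates the set synchronously inside the barrier, whereas the slide says the real maintenance combines a write barrier with concurrent processing.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical remembered sets: for each region, the heap locations that hold references into it. */
final class RememberedSets {
    static final long REGION_BYTES = 1L << 20; // 1 MB regions, as on the slides

    private final Map<Long, Set<Long>> incoming = new HashMap<>();

    static long regionOf(long address) {
        return address / REGION_BYTES;
    }

    /** Write-barrier hook: the field at fieldAddress now points to the object at targetAddress. */
    void recordWrite(long fieldAddress, long targetAddress) {
        long from = regionOf(fieldAddress);
        long to = regionOf(targetAddress);
        if (from == to) {
            return; // references within one region are found when that region itself is scanned
        }
        incoming.computeIfAbsent(to, key -> new HashSet<>()).add(fieldAddress);
    }

    /** Locations to treat as roots when collecting the given region; no whole-heap scan needed. */
    Set<Long> rootsInto(long region) {
        return incoming.getOrDefault(region, Collections.emptySet());
    }
}
```

Collecting region 2, for example, would start from `rootsInto(2)` plus the usual thread and static roots.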

Slide 33

Slide 33 text

Pause Prediction Model
User-defined pause goal
• Goal, not a promise or a guarantee!
Stop-the-world pauses
• Evacuation pauses are the main “bottleneck”
  • Highly application-dependent
• Remark pauses are short
Pause prediction model
• Keep stats on pause behavior
• Predict the maximum amount of work that does not violate the goal
  • e.g., young generation size
• Works best for “steady-state” applications
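A toy version of such a model helps show how "keep stats" turns into "how many young regions can the next pause afford". Everything specific here is an assumption: the class name `PausePredictor`, the split of a pause into a fixed cost plus a per-region copy cost, and the exponentially decaying average. The slides do not describe which statistics or formula G1 uses.

```java
/** Hypothetical pause model: pause = fixed cost + per-region copy cost * regions collected. */
final class PausePredictor {
    private final double alpha; // weight of the newest sample in the decaying average
    private double fixedMs;
    private double perRegionMs;
    private boolean seeded = false;

    PausePredictor(double alpha) {
        this.alpha = alpha;
    }

    /** Records one observed evacuation pause. */
    void record(double fixedCostMs, double copyCostMs, int regionsCollected) {
        if (regionsCollected <= 0) {
            throw new IllegalArgumentException("regionsCollected must be positive");
        }
        double perRegion = copyCostMs / regionsCollected;
        if (!seeded) {
            fixedMs = fixedCostMs;
            perRegionMs = perRegion;
            seeded = true;
            return;
        }
        fixedMs = alpha * fixedCostMs + (1 - alpha) * fixedMs;
        perRegionMs = alpha * perRegion + (1 - alpha) * perRegionMs;
    }

    /** Predicted pause length if the next collection covers the given number of regions. */
    double predictMs(int regions) {
        return fixedMs + perRegionMs * regions;
    }

    /** Largest region count whose predicted pause stays within the goal (at least 1, so progress is made). */
    int maxRegionsFor(double goalMs) {
        if (!seeded || perRegionMs <= 0) {
            return 1;
        }
        int regions = (int) Math.floor((goalMs - fixedMs) / perRegionMs);
        return Math.max(1, regions);
    }
}
```

If copying gets more expensive, the per-region estimate rises and the affordable young generation shrinks. This is also why such a model suits steady-state applications: past pauses only predict future ones when behavior is stable.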

Slide 34

Slide 34 text

CMS vs. G1 Comparison
[Diagram labels: G1, CMS]

Slide 35

Slide 35 text

Agenda
Garbage-First Attributes
Garbage-First Operation
First Impressions: CBOE
Final Thoughts

Slide 36

Slide 36 text

CBOE: Leading the Options Industry for 35 Years
Founded in 1973, CBOE is the largest U.S. options marketplace
Original marketplace for standardized, exchange-traded options
Industry leader in product innovation
Powered by CBOEdirect, Hybrid Trading System integrates electronic and open outcry trading, offering unparalleled trading choice
Launched CBOE Futures Exchange (CFE) in March 2004

Slide 37

Slide 37 text

CBOE Leads the Industry in Market Share
CBOE is the number 1 U.S. options marketplace, handling one-third of total industry volume in 2007
CBOE 2007 market share numbers:
• Total options: 33.0%
• Equity options: 25.7%
• Multiply-listed index/ETF options: 36.7%
• Cash index options: 86.1%
[Chart: 2007 Total Market Share By Exchange. CBOE 33%, ISE 28%, PHLX 14%, AMEX 8%, NYSE 12%, BOX 5%]

Slide 38

Slide 38 text

Profile of the Adventurous Customer
...in a hyper-competitive industry!
Started tuning GC with 1.2 SemiSpaces
Early adopter of Parallel Young Gen GC
Switched to CMS with 1.4.2
Started testing the Java platform on the Solaris OS 10 x86 in 2005
Solaris OS 10 x86 in production since May 2006
• Zones
• Processor sets
• FX scheduling
Currently running 6u4
• First customer in production
Leveraging open source

Slide 39

Slide 39 text

Keeping Current: Effort and Reward
No Pain, No Gain vs. Lots of Pain, but Lots of Gain

Slide 40

Slide 40 text

CBOE Performance Test Environment
How to mitigate the risk of being on the bleeding edge
Two full production slices, including hardware redundancy
• Pretesting production software / hardware upgrades
Two dedicated “wind tunnels”
• Software / hardware proof-of-concept + GC tuning
• Heavily virtualized (flexibility)
~150 servers
Personnel
• 8 full-time staff
• Running round the clock, on a 20-hour cycle

Slide 41

Slide 41 text

GC Logging in Production
Turn on the headlights!
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintHeapAtGC
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCApplicationConcurrentTime
-verbose:gc
-XX:+PrintGCDateStamps
-XX:+PrintGCTaskTimeStamps
-XX:PrintCMSStatistics=1
-XX:+PrintTenuringDistribution
-XX:PrintFLSStatistics=1
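The flags above write GC activity to a log. As a complement that is not from the talk, the same JVM also exposes cumulative GC counters in-process through the standard `java.lang.management` API. The sketch below reads them; the `GcStats` class name and the output format are assumptions, while `ManagementFactory.getGarbageCollectorMXBeans()` and the `GarbageCollectorMXBean` getters are part of the Java SE platform.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that summarizes the JVM's collectors via the platform MXBeans. */
final class GcStats {
    /** Returns one line per collector with its cumulative collection count and time. */
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters may return -1 when the value is undefined for a collector.
            lines.add(String.format("%s: count=%d, totalTimeMs=%d",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }
}
```

Calling `GcStats.snapshot()` periodically and diffing successive values gives a coarse view of collection frequency and cost, though with far less detail than the log flags provide.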

Slide 42

Slide 42 text

The GC Challenge
Main objectives
• Lowering the frequency of the collections
• Lowering the time of the collections
• Avoiding a Full GC

Slide 43

Slide 43 text

The CBOE Problem Set
Worst-case scenario
An average young GC promotes ~6 MB in ~100 ms, once every ~4 secs
CMS cycles occur every ~3-4 mins
• It usually takes ~12-13 secs for a cycle to complete
• If it does not complete in ~45-50 secs “we lose the race”
Tuning allows for more throughput, which just brings us back to the original problem: more garbage!
The challenge
• Reduce young GC times to under 50 ms
• Maintain intervals of > 5 secs
• Never lose the race
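The "race" can be made concrete with a little arithmetic: promotion fills the old generation at some rate, and a CMS cycle has to finish before the remaining free space runs out. The sketch below uses the slide's figures (~6 MB promoted every ~4 secs, i.e. ~1.5 MB/s) together with a free-space figure of 72 MB that is purely hypothetical, chosen so that the time to fill lands in the ~45-50 sec window. The `PromotionRace` class and its methods are illustrative names, and the model ignores fragmentation and space freed during the cycle.

```java
/** Hypothetical back-of-the-envelope model of the promotion vs. CMS cycle race. */
final class PromotionRace {
    /** Average promotion rate into the old generation, in MB per second. */
    static double promotionRateMbPerSec(double mbPromotedPerYoungGc, double secsBetweenYoungGcs) {
        return mbPromotedPerYoungGc / secsBetweenYoungGcs;
    }

    /** Seconds until the given amount of free old-generation space is consumed by promotion. */
    static double secondsUntilFull(double freeOldGenMb, double rateMbPerSec) {
        return freeOldGenMb / rateMbPerSec;
    }

    /** True if a concurrent cycle of the given length finishes before the old generation fills up. */
    static boolean cycleWinsRace(double cycleSecs, double freeOldGenMb, double rateMbPerSec) {
        return cycleSecs < secondsUntilFull(freeOldGenMb, rateMbPerSec);
    }
}
```

With these inputs, `promotionRateMbPerSec(6, 4)` gives 1.5 MB/s and 72 MB of headroom lasts 48 secs, so a typical ~12-13 sec cycle wins comfortably while a cycle stretched to 50 secs would lose.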

Slide 44

Slide 44 text

CMS GC Pauses: GC Thread Stop Time
[Chart: pause time (ms, axis 0 to 160) against time stamp (min, :00 to :58). Source: CBOE]
~1.7 GB processed
~533 MB promoted
57 GCs in 60 secs

Slide 45

Slide 45 text

The CMS Obstacle Course: Promotion Size / Old Generation Occupancy
[Chart: promotion size (KB, axis 0 to 20000) and old generation occupancy (%, axis 30 to 70) against time stamp (min, :00 to :58). Annotations: Cycle Initiating Threshold (60%), (~11 sec), CMS start, CMS end. Source: CBOE]

Slide 46

Slide 46 text

The CMS Obstacle Course: Promotion Size / Old Generation Occupancy
[Chart: promotion size (KB, axis 0 to 20000) and old generation occupancy (%, axis 30 to 70) against time stamp (min, :00 to :58). Annotations: CMS start, Heading towards Full GC. Source: CBOE]

Slide 47

Slide 47 text

CMS vs. G1 Results
Homogeneous stress test at 3,000 TPS steady
• Sun Fire™ X4450 server, 16-way, 32 GB memory, 32-bit version
Preliminary Results
• Achieved a good measure of stability
  • We can run a lot longer than we could 3 months ago. :-)
• Currently, pause times are considerably higher than with CMS
  • Due to some “band-aids” to address some concurrency-related bugs
• SATB works as advertised
  • Remarks consistently down to 12 ms from 50-60 ms with CMS
• The adventurous customer is very satisfied with the G1 GC output
  • Very detailed breakdown to identify tuning opportunities
• GC team very receptive to “constructive criticism”

Slide 48

Slide 48 text

Agenda
Garbage-First Attributes
Garbage-First Operation
First Impressions: CBOE
Final Thoughts

Slide 49

Slide 49 text

G1 Summary
Server-Style Low-Latency GC
• Parallel
• Concurrent
• Compacting
• Soft Real-Time
Future replacement for CMS

Slide 50

Slide 50 text

Future Directions
Reduce
• Stop-the-world pause times
• GC overhead
Deal with mostly-static heap subsets
• Automatically discover a mostly-static set of regions
• Reduce marking overhead for that set
• Target: large caches
Keep taking advantage of Paul's “wind tunnels” :-)

Slide 51

Slide 51 text

Reading Material
D. L. Detlefs, C. H. Flood, S. Heller, and T. Printezis. Garbage-First Garbage Collection. In A. Diwan, editor, Proceedings of the 2004 International Symposium on Memory Management (ISMM 2004), pages 37-48, Vancouver, Canada, October 2004. ACM Press.
T. Printezis and D. L. Detlefs. A Generational Mostly-Concurrent Garbage Collector. In A. L. Hosking, editor, Proceedings of the 2000 International Symposium on Memory Management (ISMM 2000), pages 134-154, Minneapolis, MN, USA, October 2000. ACM Press.

Slide 52

Slide 52 text

Tony Printezis, Sun Microsystems
Paul Ciciora, CBOE
#TS-5419