Tuning Java for Virtual

SFJUG: Tuning Java for Virtual
Ben Corrie, Staff Engineer, Java performance engineering, 15th March 2012
© 2009 VMware Inc. All rights reserved

2 Agenda §  Introduction §  Virtualization 101 §  Existing Java
best practices §  Hypervisor memory management §  Java memory management §  Java memory management when running virtual §  Over-committing memory and the performance pitfalls §  Possible workarounds to over-committing Java §  Elastic Memory For Java §  Future topics and demo §  Appendix

3 Introduction §  About me: •  Worked on Java since
1998 •  Spent 10 years at IBM, much of that time working on the J9 JVM •  Joined SpringSource in 2008 as a Consultant •  VMware acquired SpringSource in 2009 •  Relocated to California in 2010 to work in partnership with VMware on Java performance optimizations for vSphere •  Keep up with developments: @bensdoings §  About this talk: •  Look at existing Java best practice •  Why is Java a special case? In-depth look at memory management •  What VMware is doing to better support Java workloads

5 What is Virtualization?

6 Three Properties of Virtualization
Partitioning •  Run multiple operating systems on one physical machine •  Fully utilize server resources •  Support high availability by clustering virtual machines
Encapsulation •  Encapsulate the entire state of the virtual machine in hardware-independent files •  Save the virtual machine state as a snapshot in time •  Re-use or transfer whole virtual machines with a simple file copy
Isolation •  Isolate faults and security at the virtual-machine level •  Dynamically control CPU, memory, disk and network resources per virtual machine •  Guarantee service levels

7 Initial Virtualization Benefits: Consolidation
BEFORE VMware: Servers: 1,000; Storage: direct attach; Network: 3,000 cables/ports; Facilities: 200 racks, 400 power whips
AFTER VMware: Servers: 80; Storage: tiered SAN and NAS; Network: 400 cables/ports; Facilities: 10 racks, 20 power whips

8 Next Benefit: Simpler Management VMotion Technology VMotion Technology moves
running virtual machines from one host to another while maintaining continuous service availability - Enables Resource Pools - Enables High Availability

9 Pooling of resources Resource Pool Resource Pool Resource Pool
Pools replace hosts as the primary compute abstraction

11 The Best Hardware §  Hardware-assisted virtualization •  1st Generation
CPU virtualization appeared in 2006 •  VT-x from Intel and AMD-V from AMD •  Eliminated the need for binary translation •  2nd Generation MMU virtualization •  EPT in Intel and RVI in AMD •  Memory management support that maps guest memory to host memory •  Supported in ESX 4.0+ §  Useful reference papers •  “Performance Best Practices for VMware vSphere 5.0” •  Goes into detail on general virtualization performance questions •  http://www.vmware.com/pdf/Perf_Best_Practices_vSphere5.0.pdf •  “Enterprise Java Applications on VMware Best Practices Guide” •  http://www.vmware.com/files/pdf/techpaper/Enterprise-Java-Applications-on-VMware-Best-Practices-Guide.pdf

12 Existing Java Best Practices §  Memory •  Make sure
Java always has 100% of the memory it needs (no over-commit!) •  Try to reduce the Java heap if possible to avoid wasting memory •  Use large pages (wherever supported) for better performance (see Appendix B) §  CPUs •  Match the number of vCPUs to the number of GC threads •  Don’t use more vCPUs than you actually need: 2-4 is usually plenty §  Timekeeping •  Synchronize the host and VMs with external NTP sources •  Lower the clock interrupt rate on Linux (see Appendix A) •  Use the Java features for lower resolution timing as supplied by your JVM (Windows/Sun JVM example: -XX:+ForceTimeHighResolution) §  Vertical and Horizontal scalability •  High availability, inter-tier configuration etc.
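The vCPU advice above can be checked from inside the guest. The following is a minimal sketch, not part of the original talk: the class name `VmSizingCheck` and the helper `gcThreadsFlag` are hypothetical, and it assumes a HotSpot JVM, where `-XX:ParallelGCThreads` sets the number of parallel GC threads.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper: compares the vCPUs the JVM can see with the GC thread setting.
class VmSizingCheck {
    /** Builds the HotSpot flag that sets parallel GC threads to the given vCPU count. */
    static String gcThreadsFlag(int vcpus) {
        if (vcpus < 1) {
            throw new IllegalArgumentException("vcpus must be >= 1");
        }
        return "-XX:ParallelGCThreads=" + vcpus;
    }

    public static void main(String[] args) {
        // Number of processors (vCPUs inside a VM) visible to this JVM.
        int vcpus = Runtime.getRuntime().availableProcessors();
        // Command-line options the JVM was actually started with.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();

        System.out.println("vCPUs visible to the JVM: " + vcpus);
        System.out.println("Flag matching GC threads to vCPUs: " + gcThreadsFlag(vcpus));
        System.out.println("JVM arguments in effect: " + jvmArgs);
    }
}
```

Running it inside each VM shows quickly whether a GC thread count copied from a physical server still matches the vCPU count of the guest.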

14 Virtual Memory §  Creates uniform memory address space • 
Operating system maps application virtual addresses to physical addresses •  Gives the operating system memory management abilities that are transparent to the application •  Hypervisor adds extra level of indirection •  Maps guest’s physical addresses to machine addresses •  Gives the hypervisor memory management abilities that are transparent to the guest [Diagram: “virtual” memory to “physical” memory (guest) to “machine” memory (hypervisor)]

15 Virtual Memory
[Diagram: the application’s “virtual” memory is mapped by the guest operating system to “physical” memory, which the hypervisor maps to “machine” memory; shown as Application / Operating System / Hypervisor layers inside and beneath a VM]

16 Application Memory Management §  Starts with no memory § 
Allocates memory through syscall to operating system §  Often frees memory voluntarily through syscall §  Explicit memory allocation interface with operating system [Diagram: Hypervisor, OS, App]

17 Operating System Memory Management §  Assumes it owns all
physical memory §  No memory allocation interface with hardware •  Does not explicitly allocate or free physical memory §  Defines semantics of “allocated” and “free” memory •  Maintains “free” list and “allocated” lists of physical memory •  Memory is “free” or “allocated” depending on which list it resides Hyper visor OS App

18 VM Memory “Allocation” §  VM starts with no machine
memory “allocated” to it § No memory is used when powered off § Small footprint after power on and boot §  Machine memory is lazily allocated on demand § When applications or the OS read and write to physical memory, machine memory is provided to “back” it. [Diagram: Hypervisor, OS, App]

19 VM Memory Reclamation §  Guest physical memory not “freed”
in the typical sense •  Guest OS moves memory to its “free” list •  Data in “freed” memory may not have been modified §  Hypervisor isn’t aware when guest frees memory •  Freed memory state unchanged •  No access to guest’s “free” list •  Unsure when to reclaim “freed” guest memory [Diagram: Hypervisor, OS, App, guest free list]

20 VM Memory Reclamation Cont’d §  Guest OS (inside the
VM) •  Allocates and frees… •  And allocates and frees… •  And allocates and frees… §  VM •  Allocates… •  And allocates… •  And allocates… Hypervisor can’t reclaim memory through guest frees! [Diagram: Hypervisor, VM, OS, App and guest free list inside the VM]

21 Reclamation techniques: Transparent Page Sharing (TPS) §  Simple idea:
why maintain many copies of the same thing? •  If 4 Windows VMs are running, there are 4 copies of Windows code •  Only one copy is needed §  Share memory between VMs when possible •  Background hypervisor thread identifies identical sets of memory •  Points all VMs at one set of memory, frees the others •  VMs are unaware of the change [Diagrams: VM 1, VM 2 and VM 3 on the hypervisor, before and after sharing]
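The sharing idea above can be illustrated with a toy model. This is only a sketch under stated assumptions: the class `PageSharingSketch`, its methods and the 4-byte "pages" are invented for illustration and say nothing about how the hypervisor actually detects identical memory.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of page sharing: identical guest pages collapse onto one machine page.
class PageSharingSketch {
    /** Returns how many machine pages are needed once identical pages are shared. */
    static int machinePagesAfterSharing(List<byte[]> guestPages) {
        Set<ByteBuffer> machinePages = new HashSet<>();
        for (byte[] page : guestPages) {
            // ByteBuffer compares by content, so equal pages map to one entry.
            machinePages.add(ByteBuffer.wrap(page));
        }
        return machinePages.size();
    }

    /** Builds demo pages: each VM has one identical "OS code" page and one unique page. */
    static List<byte[]> demoPages(int vms) {
        List<byte[]> pages = new ArrayList<>();
        for (int vm = 0; vm < vms; vm++) {
            pages.add(new byte[] {1, 2, 3, 4});         // same content in every VM
            pages.add(new byte[] {(byte) vm, 9, 9, 9}); // content unique to this VM
        }
        return pages;
    }

    public static void main(String[] args) {
        List<byte[]> pages = demoPages(4);
        System.out.println("Guest pages: " + pages.size());
        System.out.println("Machine pages after sharing: " + machinePagesAfterSharing(pages));
    }
}
```

With four VMs the model holds eight guest pages but needs only five machine pages, because the four identical "OS code" pages are backed once.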

22 Reclamation techniques: Ballooning §  Hypervisor wants to reclaim memory
§  Guest OS is not aware of this •  Thinks it owns all physical memory •  Sits inside its own “box”, unaware it’s running in a VM or that other VMs are running §  Goal: make the guest aware so it frees up some of its memory §  Solution: artificially create memory pressure inside the VM •  “Push” memory pressure from the hypervisor into the VM •  Use “balloon” driver inside the VM to create memory pressure [Diagram: hypervisor pushing memory pressure into VM 1 and VM 2, each with OS and App]

23 Reclamation techniques: Ballooning - Inflating Balloon
1.  Balloon driver allocates memory 2.  Balloon driver pins allocated memory 3.  Guest may reclaim other memory 4.  Balloon driver tells hypervisor what memory it allocated 5.  Hypervisor frees machine memory backing memory allocated by balloon driver 6.  Hypervisor now has more free machine memory
Before inflation: Hypervisor: 8 pieces of memory are allocated; Guest: 3 pieces of memory are allocated. After inflation: Hypervisor: 5 pieces of memory are allocated; Guest: 6 pieces of memory are allocated
[Diagram: Hypervisor, OS, Guest App, guest free list, balloon driver, list of memory]
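The inflation steps can be traced with a small accounting model. This is a sketch only: the class `BalloonSketch` and its fields are hypothetical, it counts memory in "pieces" as the slide does, it assumes every non-balloon allocated piece is backed by machine memory, and it does not model guest swapping.

```java
// Toy accounting model of balloon inflation, counting memory in "pieces".
class BalloonSketch {
    final int totalGuestPages; // size of the guest's "physical" memory
    int guestAllocated;        // pieces on the guest's "allocated" list (balloon included)
    int machineBacked;         // guest pieces currently backed by machine memory
    int ballooned;             // pieces pinned by the balloon driver

    BalloonSketch(int totalGuestPages, int guestAllocated, int machineBacked) {
        this.totalGuestPages = totalGuestPages;
        this.guestAllocated = guestAllocated;
        this.machineBacked = machineBacked;
    }

    /** Inflates the balloon; returns the machine pieces the hypervisor gets back. */
    int inflate(int requestedPages) {
        int guestFree = totalGuestPages - guestAllocated;
        // Assumption: all allocated pieces outside the balloon are backed.
        int backedFree = Math.max(0, machineBacked - (guestAllocated - ballooned));
        // Steps 1-2: the balloon driver allocates and pins pieces from the free list.
        int taken = Math.min(requestedPages, guestFree);
        guestAllocated += taken;
        ballooned += taken;
        // Steps 4-5: the hypervisor frees the machine memory backing those pieces.
        int reclaimed = Math.min(taken, backedFree);
        machineBacked -= reclaimed;
        return reclaimed;
    }

    public static void main(String[] args) {
        BalloonSketch vm = new BalloonSketch(8, 3, 8);
        int reclaimed = vm.inflate(3);
        System.out.println("Reclaimed by hypervisor: " + reclaimed);
        System.out.println("Hypervisor pieces allocated: " + vm.machineBacked);
        System.out.println("Guest pieces allocated: " + vm.guestAllocated);
    }
}
```

Starting from 8 pieces backed by the hypervisor and 3 allocated in the guest, inflating the balloon by 3 leaves the hypervisor with 5 and the guest with 6, which matches the figures on the slide.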

24 Reclamation techniques: Ballooning - Inflating Balloon (Cont’d)
Guest OS swapping, a possible side effect of ballooning. Two possibilities for guest free memory: 1. VM has lots of free memory: no swapping necessary! 2. VM doesn’t have much free memory: may need to swap! Guest OS chooses whether to swap or not!
[Diagram: Hypervisor, OS, Guest App, balloon driver, and the guest free list in each of the two cases]

25 Reclamation techniques: Compression and Swapping
§  If memory cannot be reclaimed using sharing or ballooning… §  Memory compression § Suitable memory pages can be unmapped from the VM and compressed § Pages are still in memory and uncompressed on demand § Up to 100x faster than swapping! §  Host swapping § Last resort: expensive disk latencies § Swaps out to a per-VM swap file
[Diagram: Hypervisor, OS, App]

26 Policies: When to reclaim and which VMs? §  When
to reclaim? •  Page sharing will occur as a background thread (TPS) •  Ballooning, compression and swapping only occur if there is memory pressure •  A resource pool containing VMs can have an arbitrary memory limit •  A physical host containing VMs can have a real physical memory limit •  Once allocation requests risk violating available memory, reclamation begins §  Which VMs? •  The hypervisor estimates how much memory is “active” in a VM •  VMs which are the least “active” are the primary reclamation targets

28 Java Memory Management §  Interesting parallels! •  The “black
box” relationship between the hypervisor and the operating system is very similar to the relationship between the JVM and the OS •  Unlike native applications which allocate and free memory from the OS, the JVM manages its object heap as a “black box” •  The OS allocates pages lazily to the heap, but they are never freed (except in the rare case where the heap shrinks) [Diagram: OS, JVM heap and guest free list as the JVM starts up, allocates some objects, allocates more objects, and runs a garbage collection]

29 Java Memory Management: Partially vs fully committed heap
•  Java gives you the option to set a “min” and “max” heap size •  Partially committed heap •  Min < max (-Xms < -Xmx) •  Heap grows to the high water mark of live data through incremental growth •  Typically won’t shrink back down •  More frequent, but shorter GCs •  Fully committed heap •  Min == max (-Xms == -Xmx) •  No heap growth, so data gradually fills the heap until it’s full •  Less frequent, but longer GCs •  Potentially wasteful for apps with small amounts of med/long lived data
[Diagrams: GC count and garbage in the OS/JVM for a partially committed heap and for a fully committed heap]
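Whether a running JVM has a partially or fully committed heap can be observed from the standard `Runtime` API. The sketch below is an illustration rather than anything shown in the talk; the class name `HeapCommitCheck` is hypothetical, and the sizes the JVM reports can differ slightly from the `-Xms`/`-Xmx` values depending on the collector.

```java
// Reports how much heap is committed compared with the maximum the JVM may use.
class HeapCommitCheck {
    /** True when the heap can still grow, i.e. committed size is below the maximum. */
    static boolean isPartiallyCommitted(long committedBytes, long maxBytes) {
        return committedBytes < maxBytes;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long committed = rt.totalMemory(); // heap currently committed by the JVM
        long max = rt.maxMemory();         // upper bound the heap may grow to
        long used = committed - rt.freeMemory();

        System.out.println("Used heap (bytes):      " + used);
        System.out.println("Committed heap (bytes): " + committed);
        System.out.println("Max heap (bytes):       " + max);
        System.out.println(isPartiallyCommitted(committed, max)
                ? "Heap can still grow (partially committed)"
                : "Heap is at its maximum (fully committed)");
    }
}
```

Running it once with `-Xms256m -Xmx1g` and once with `-Xms1g -Xmx1g` makes the difference between the two configurations visible.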

30 Java Memory Management: Large Pages §  What are large
pages? •  Operating system memory is managed in pages (typically 4096 bytes) •  Indirection between virtual and physical pages can be expensive to manage •  Some operating systems have support for large pages, which are 2MB each •  Using large pages significantly reduces the overhead to manage this indirection §  How does this apply to the JVM? •  The JVM can be configured to request large pages for its heap §  Advantages •  Performance improvements due to more efficient memory management •  Hypervisor is clever enough to break a large page into small pages behind the scenes if it thinks there’s significant page sharing benefits §  Disadvantages •  It’s a pain to configure (see Appendix B)

32 Java Memory Management When Running Virtual §  Recap § So
far, we’ve seen that the guest OS memory is a black box to the hypervisor § We’ve also seen that the JVM memory is a black box to the OS it’s running on § We also saw that the most efficient way of reclaiming memory from the guest is to use a balloon, which forcibly takes free memory from the guest OS § But this ballooning mechanism presumes that the application is going to actually free memory when it doesn’t need it, so that the balloon can then take it… §  …Oops! § As we’ve seen, Java’s memory management breaks this model entirely § Java is LAZY about its memory management §  It will try to avoid incurring a full GC until the absolute last minute §  A memory spike that largely fills the heap will leave behind garbage which is then wasted memory which neither the OS, nor the hypervisor can reclaim § With ballooning, the hypervisor has a way of transferring memory pressure into the OS, but the OS has no way of transferring memory pressure into the JVM

33 Memory Reclamation with Java
§  So what happens if you use the balloon with Java? § Well, it depends! § There are some cases where it may work just fine § There are other cases where it can cause a sudden collapse in performance § The provisos are too complex to offer guarantees, hence the best practice
[Diagrams: OS, JVM and balloon with a partially committed heap and with a fully committed heap]

34 Memory Reclamation with Java Cont’d §  Large or small
pages? § The behavior of the balloon can vary significantly depending on whether you use large or small pages § Large pages (in current versions of Linux) are pinned in memory and cannot be swapped to disk. These pages are also off-limits to the balloon § If there is a lot of large page memory in the guest, this can limit the balloon size and therefore the memory that can be reclaimed §  What about Transparent Page Sharing? § Very little of the JVM’s memory is sharable, unless the JVM is specifically designed to arrange immutable memory in a way that can be shared

36 Over-committing memory §  What is “memory over-commitment”? § A significant
benefit of virtualization is that the hypervisor can manage resources such as CPU and memory more efficiently § Over-committing means deliberately giving the hypervisor fewer physical resources than are potentially available to all the VMs running on it §  E.g. 6 VMs, each with 8GB RAM and 4 vCPUs = 48GB vRAM and 24 vCPUs total §  Running all these VMs on a 32GB host with 16 CPUs = 50% over-committed § By over-committing memory on the hypervisor, we expect it to reclaim memory from VMs that are consuming less memory and give it to those consuming more § The hypervisor reclaims the most memory from the least active VMs § As we’ve seen, it reclaims memory using page sharing, ballooning, compression and swapping
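The over-commit percentage in the example is the configured total divided by the physical total, minus one. A minimal sketch of that arithmetic follows; the class and method names are hypothetical.

```java
// Computes how far a host is over-committed for a given resource.
class OvercommitCalc {
    /**
     * Returns the over-commit percentage: how much the resources configured
     * across all VMs exceed what the host physically has.
     */
    static double overcommitPercent(double configuredTotal, double physicalTotal) {
        if (physicalTotal <= 0) {
            throw new IllegalArgumentException("physicalTotal must be positive");
        }
        return (configuredTotal / physicalTotal - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        int vms = 6;
        double vramGb = vms * 8.0; // 6 VMs with 8GB each = 48GB configured
        double vcpus = vms * 4.0;  // 6 VMs with 4 vCPUs each = 24 vCPUs configured

        System.out.println("Memory over-commit: " + overcommitPercent(vramGb, 32.0) + "%");
        System.out.println("CPU over-commit:    " + overcommitPercent(vcpus, 16.0) + "%");
    }
}
```

Both resources come out at 50% for the slide's example: 48GB on a 32GB host and 24 vCPUs on 16 CPUs.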

37 Over-committing memory – the cost §  Relative reclamation costs
§ TPS runs all the time in a very low priority background thread – minimal cost § Ballooning runs in the guest OS and is an efficient way to reclaim memory § Compression is generally going to be less efficient than ballooning § Swapping (in this case host swapping) is very expensive due to disk latencies §  Swapping to SSD memory is much more efficient than swapping to disk §  Guest vs Host swapping § As we’ve seen, if the guest balloon inflates to a large enough size, it can force the guest OS to start swapping memory to disk § Even though this can be expensive, the guest knows best which memory to swap § The hypervisor has no idea which memory is best to swap, so it’s fairly random § Host swapping therefore typically carries a worse performance penalty

39 Possible workarounds to over-committing with Java §  Tune the
Java heap so that it collects garbage more proactively § By forcing the JVM to be more proactive about cleaning up, it is possible to… §  Prevent the heap from expanding further than it needs to: -Xms < -Xmx §  Encourage the heap to shrink some time after a spike: -XX:MaxHeapFreeRatio § The problem with this is that it cannot currently be synchronized with the hypervisor’s requirements for memory – you pay the GC cost all the time §  Use a Concurrent GC algorithm (CMS) § By constantly doing small amounts of Garbage Collection, CMS… §  Gives the hypervisor a more consistent and realistic active memory estimate §  If guest or host swapping does occur, the swapping back in of memory is more progressive and there is no sudden “freeze” of the JVM § The problem with this is that there is a constant throughput penalty §  Use a reservation to limit the balloon size § If you want to limit the reclamation size (and therefore balloon size) of a VM, set a reservation of (<vm size> - <max reclamation size>) § This doesn’t really solve the problem of getting Java to give back memory though
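The reservation rule above is a simple subtraction, sketched here with hypothetical names and example sizes that do not come from the talk.

```java
// Computes the memory reservation that caps how much can be reclaimed from a VM.
class ReservationCalc {
    /** Reservation = VM size minus the most memory you are willing to have reclaimed. */
    static long reservationMb(long vmSizeMb, long maxReclamationMb) {
        if (maxReclamationMb < 0 || maxReclamationMb > vmSizeMb) {
            throw new IllegalArgumentException("maxReclamationMb must be between 0 and vmSizeMb");
        }
        return vmSizeMb - maxReclamationMb;
    }

    public static void main(String[] args) {
        // Example values only: an 8GB VM from which at most 2GB may be reclaimed.
        System.out.println("Reservation (MB): " + reservationMb(8192, 2048));
    }
}
```

For an 8192MB VM with a 2048MB reclamation limit, the reservation works out to 6144MB.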

41 What is EM4J? §  EM4J is a balloon that
lives in the Java heap § The hypervisor inflates and deflates the balloon inside the Java heap § Balloon inflation cleans up garbage in the heap and hands memory back § When the JVM needs the full heap memory, it can kick out the balloon § EM4J disables the VMware guest tools balloon §  EM4J works with your existing configuration settings § A JVM agent designed to work with all Hotspot 1.6.0 GC policies § EM4J ballooning works just as well with large or small pages, fully-committed or partially-committed heap §  Let’s revisit our earlier scenarios…

42 Memory Reclamation with EM4J
§  VMware tools balloon disabled §  EM4J balloon starts to inflate § Sharable memory is written to the Java heap §  EM4J balloon inflates more § More sharable memory is written and the pages are consolidated § The previously used memory is now free to be used elsewhere §  EM4J balloon inflates even more § The same thing happens again §  EM4J can respond very quickly to balloon inflation requests § Memory can be reclaimed at up to 500MB/s
[Diagram: OS, JVM with partially committed heap, balloon, hypervisor]

43 JVM §  This scenario is wasting more memory on
the hypervisor § Remember how previously balloon inflation caused swapping to disk? §  EM4J balloon inflates… § No need for swap! §  A Garbage collection occurs § The balloon works with the GC §  The JVM allocates more memory § The balloon is not overwritten §  The JVM shuts down § The balloon memory is still shared OS JVM Fully committed heap Memory Reclamation with EM4J Cont’d Balloon Hypervisor

44 The meaning of “Elastic memory”
§  Memory is “elastic” when it is taken from less active VMs and given to more active VMs
[Diagram: two guests on one hypervisor, each with an OS and JVMs; load and balloons shift between them]

45 Evaluation on a Larger System (host memory)

46 Evaluation on a Larger System (32 VMs at 40%
overcommit)

47 Evaluation on a Larger System (32 VMs at 40%
overcommit)

48 Benefits of EM4J §  You can now safely over-commit
memory with Java! § There are no longer complex provisos to worry about § The amount you can over-commit depends on how much live data you have § We have good lab results up to 40% over-commit §  No need to reduce heap sizes as memory is not wasted § You can even go the other way and increase heap sizes to improve headroom §  You can monitor EM4J through JMX § EM4J ballooning also shows as “ballooned” memory in vCenter §  Multiple JVMs in a guest can contribute to the balloon § With v1, we recommend a pragmatic maximum of around 4-8 §  If you over-commit too far, performance degrades gracefully § There will be a steady increase in GC frequency as you increase over-commit
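The GC frequency mentioned above can be watched with the standard platform MXBeans. This sketch does not use any EM4J-specific MBean, since the talk does not name them; the class `GcFrequencyProbe` and the one-second sampling window are illustrative assumptions.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Samples the JVM's GC counters to estimate how often collections happen.
class GcFrequencyProbe {
    /** Sums collection counts over all collectors; undefined (-1) counts are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Converts two counter samples into collections per minute. */
    static double collectionsPerMinute(long before, long after, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        return (after - before) * 60_000.0 / elapsedMillis;
    }

    public static void main(String[] args) throws InterruptedException {
        long before = totalCollections();
        long start = System.currentTimeMillis();
        Thread.sleep(1_000); // sampling window; a real monitor would use a longer one
        long elapsed = Math.max(1, System.currentTimeMillis() - start);
        long after = totalCollections();
        System.out.println("GCs per minute: " + collectionsPerMinute(before, after, elapsed));
    }
}
```

Tracking this rate while raising the over-commit level is one way to see the gradual degradation described above; the same beans can also be read remotely over JMX.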

50 Future topics and Demo §  Management and monitoring of
Java § vCenter plugins to give visibility into Java workloads § Best-practice and right-sizing analysis tailored for Java §  Java APIs for managing and configuring VMs § http://vijava.sourceforge.net/ § Particularly useful in automated testing §  Scaling and High Availability § vMotion, CloudFoundry, Spring Data §  Demo

51 Questions §  Follow me @bensdoings

53 Appendix A: Which Kernel to Choose for the Best
Possible Timekeeping §  Choose a kernel that the hypervisor recognizes as tickless §  On ESX 3.5 choose: • SLES 10 SP2 • RHEL 5.4 §  On ESX 4.x choose from above, or: • SLES 11 • Ubuntu 9.04 •  Make sure you have updated to kernel 2.6.28-7.18 or later • Ubuntu 9.10 §  Later updates of these kernels are expected to work • For example SLES 10 SP3 works as well as SP2

54 Appendix B: Configuring Large Pages on Linux § How many
large pages do I need? •  Use these handy formulas for the Hotspot JVM: •  HugeTlbPages = ((MaxHeapSizeMB + MaxPermGenMB) / 2) + 2 •  (Parallel GC needs the extra 2 pages for some reason) •  ShmMaxBytes = (HugeTlbPages * (2 * 1024 * 1024)) + ShmNeededByOS §  Configuring the OS •  /etc/sysctl.conf •  Set vm.nr_hugepages to the HugeTlbPages calculated above •  Set vm.hugetlb_shm_group to a gid if you’re not running as root •  Set kernel.shmmax to ShmMaxBytes calculated above •  Reboot •  cat /proc/meminfo should show you the HugePages_Total §  Configuring Java •  Run with -XX:+UseLargePages. It will fail if anything is wrong •  You should see the HugePages_Rsvd from /proc/meminfo increase
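The two sizing formulas can be evaluated with a few lines of code. This is a sketch of the arithmetic only: the class `LargePageSizing` is hypothetical, the heap and PermGen sizes are example values, and rounding an odd megabyte total up to a whole 2MB page is an assumption the slide does not state.

```java
// Evaluates the large-page sizing formulas for 2MB pages.
class LargePageSizing {
    static final long LARGE_PAGE_BYTES = 2L * 1024 * 1024;

    /** HugeTlbPages = ((MaxHeapSizeMB + MaxPermGenMB) / 2) + 2, rounding the division up. */
    static long hugeTlbPages(long maxHeapMb, long maxPermGenMb) {
        long totalMb = maxHeapMb + maxPermGenMb;
        return (totalMb + 1) / 2 + 2; // assumption: round up so an odd total still fits
    }

    /** ShmMaxBytes = (HugeTlbPages * 2MB) + shared memory the OS itself needs. */
    static long shmMaxBytes(long hugeTlbPages, long shmNeededByOsBytes) {
        return hugeTlbPages * LARGE_PAGE_BYTES + shmNeededByOsBytes;
    }

    public static void main(String[] args) {
        // Example values only: 2048MB max heap and 256MB max PermGen.
        long pages = hugeTlbPages(2048, 256);
        System.out.println("vm.nr_hugepages = " + pages);
        System.out.println("kernel.shmmax   = " + shmMaxBytes(pages, 0));
    }
}
```

For a 2048MB heap and 256MB PermGen this yields 1154 pages; the printed values are the ones to place in /etc/sysctl.conf, after adding whatever shared memory the OS itself needs.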
