Slide 1

Slide 1 text

MHDeS: Deduplicating Method Handle Graphs for Efficient Dynamic JVM Language Implementations
Shijie Xu (1), David Bremner (1), Daniel Heidinga (2)
(1) IBM Centre for Advanced Studies (CAS Atlantic), University of New Brunswick, Canada
(2) IBM Ottawa, Canada
July 18, 2016

Slide 2

Slide 2 text

Method Handle
A Method Handle (MH) is a typed, directly executable reference to an underlying method, field, constructor, or similar low-level operation, with optional transformations of arguments and return values [1].

    MethodHandle g = lookup().findVirtual(String.class, "isEmpty", methodType(boolean.class));
    MethodHandle gwt = MethodHandles.guardWithTest(g, t, f);

MethodType: argument types and return type.
Executable reference.
Method Type Transformation, e.g., GuardWithTest, CatchException, FilterReturn.

[1] https://docs.oracle.com/javase/7/docs/api/java/lang/invoke/MethodHandle.html

Slide 3

Slide 3 text

Method Handle Graph
A referenced method can itself hold references to other MHs.
An MHG = MHs + structure.

Slide 4

Slide 4 text

A Sample Code for MHG Creation

    CallSite bootstrapMethod(...) {
        CallSite cs = ...;
        MethodHandle g = lookup().findVirtual(String.class, "isEmpty", methodType(boolean.class));
        // Append google to the tail.
        MethodHandle t = lookup().findStatic(Some.class, "addGoogle", methodType(String.class, String.class));
        // Append yahoo to the tail.
        MethodHandle f = lookup().findStatic(Other.class, "addYahoo", methodType(String.class, String.class));
        MethodHandle gwt = MethodHandles.guardWithTest(g, t, f);
        MethodHandle d0 = MethodHandles.filterReturnValue(gwt, f);
        cs.setTarget(d0);
        return cs;
    }
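The graph built on this slide can be tried outside a bootstrap method. The following is a minimal sketch, not the authors' code: the class name `MhgCreationSketch`, the helper `run`, and the bodies of `addGoogle` and `addYahoo` are assumptions that stand in for the `Some` and `Other` classes, which the slide does not show.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class MhgCreationSketch {
    // Hypothetical stand-ins for Some.addGoogle and Other.addYahoo.
    static String addGoogle(String s) { return s + "google"; }
    static String addYahoo(String s) { return s + "yahoo"; }

    // Builds d0 = filterReturnValue(guardWithTest(g, t, f), f).
    static MethodHandle build() throws ReflectiveOperationException {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        MethodHandle g = lookup.findVirtual(String.class, "isEmpty",
                MethodType.methodType(boolean.class));
        MethodHandle t = lookup.findStatic(MhgCreationSketch.class, "addGoogle",
                MethodType.methodType(String.class, String.class));
        MethodHandle f = lookup.findStatic(MhgCreationSketch.class, "addYahoo",
                MethodType.methodType(String.class, String.class));
        // If g(s) is true call t(s), otherwise call f(s).
        MethodHandle gwt = MethodHandles.guardWithTest(g, t, f);
        // Pass the result of gwt through f once more.
        return MethodHandles.filterReturnValue(gwt, f);
    }

    // Invokes the graph root and wraps any Throwable for convenience.
    static String run(String input) {
        try {
            return (String) build().invoke(input);
        } catch (Throwable e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(""));   // guard is true: addGoogle, then addYahoo
        System.out.println(run("x"));  // guard is false: addYahoo twice
    }
}
```

Every call to `build()` produces a fresh but equivalent graph, which is the kind of duplication the rest of the talk targets.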

Slide 5

Slide 5 text

Method Handles and invokedynamic
JSR 292: Supporting dynamically typed languages on the JVM.
MHG bootstrap method: implemented by developers; transfers the method invocation.

Slide 6

Slide 6 text

Motivation
Equivalent MHGs waste memory and CPU resources.
They are bad for Just-In-Time compilation (inline graph).
JIT compilation is based on an invocation counter.
The JITted MHG is inlined into the CallSite directly.

Slide 7

Slide 7 text

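Motivation
[Figure: JIT threshold = 30. Without deduplication, equivalent MHGs (a, b, c and a', b', c') keep separate invocation counters: graphs invoked 10 times each are not JIT-compiled (Case 1), and graphs invoked 30 times each are compiled to native code separately (Case 2). With deduplication, the invocations of equivalent graphs are counted together (10 + 10 + 10 = 30), so the shared graph reaches the threshold and is JIT-compiled.]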
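The counter arithmetic on this slide can be illustrated with a small simulation. This is a sketch under stated assumptions: the class and method names are hypothetical, and the rule "compile when the counter reaches the threshold" is a simplification, not a description of how the J9 JIT actually decides.

```java
class JitCounterSketch {
    static final int THRESHOLD = 30; // threshold value taken from the slide

    // Without deduplication every equivalent graph has its own counter.
    // Returns how many of the graphs reach the threshold.
    static int compiledWithoutDedup(int graphs, int callsEach) {
        int compiled = 0;
        for (int i = 0; i < graphs; i++) {
            int counter = callsEach; // counts only this duplicate's calls
            if (counter >= THRESHOLD) {
                compiled++;
            }
        }
        return compiled;
    }

    // With deduplication all call sites share one graph and one counter.
    static boolean compiledWithDedup(int graphs, int callsEach) {
        int sharedCounter = graphs * callsEach;
        return sharedCounter >= THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(compiledWithoutDedup(3, 10)); // no graph reaches 30
        System.out.println(compiledWithDedup(3, 10));    // 10 + 10 + 10 = 30
    }
}
```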

Slide 8

Slide 8 text

Our work Equivalence Model. Method Handle Deduplication System (MHDeS).

Slide 9

Slide 9 text

Equivalence Model Method Handle key (MH key): A single method handle. Method Handle Graph Key: A number of MH keys + graph structure.

Slide 10

Slide 10 text

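Equivalence Model
For two graphs $G_m$ and $G_n$, $F(m, n) \to \{0, 1\}$, with $F(m, n) = 1$ if $G_m = G_n$ or $\text{MHG key}_m = \text{MHG key}_n$.

\[ F(m, n) = (m = n) \lor \bigl( f(m, n) \land (f(m) = f(n)) \bigr) \tag{1} \]

where $f(m) = \text{MH key}_m$, and

\[ f(m, n) = \begin{cases} \bigwedge_{i=1}^{|S_m|} F(S_m(i), S_n(i)), & \text{if } S_m \neq \emptyset \land S_n \neq \emptyset \land |S_m| = |S_n| \\ 1, & \text{if } S_m = S_n = \emptyset \\ 0, & \text{otherwise} \end{cases} \tag{2} \]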
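Equations (1) and (2) translate fairly directly into a recursive check. The sketch below assumes that S_m denotes the ordered children of node m (the slide does not define it) and uses a hypothetical `Node` type whose MH key is a plain string.

```java
import java.util.Arrays;
import java.util.List;

class EquivalenceSketch {
    // Hypothetical graph node: an MH key plus ordered children (assumed meaning of S).
    static final class Node {
        final String mhKey;
        final List<Node> children;

        Node(String mhKey, Node... children) {
            this.mhKey = mhKey;
            this.children = Arrays.asList(children);
        }
    }

    // F(m, n) from equation (1): same instance, or equivalent children and equal MH keys.
    static boolean equivalent(Node m, Node n) {
        if (m == n) {
            return true;
        }
        return childrenEquivalent(m, n) && m.mhKey.equals(n.mhKey);
    }

    // f(m, n) from equation (2).
    static boolean childrenEquivalent(Node m, Node n) {
        if (m.children.isEmpty() && n.children.isEmpty()) {
            return true;                       // both child lists are empty
        }
        if (m.children.size() != n.children.size()) {
            return false;                      // the "otherwise" case
        }
        for (int i = 0; i < m.children.size(); i++) {
            if (!equivalent(m.children.get(i), n.children.get(i))) {
                return false;                  // conjunction over ordered pairs
            }
        }
        return true;
    }
}
```

Two graphs built separately from the same keys in the same order compare as equivalent; swapping two children or changing one leaf key makes them differ.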

Slide 11

Slide 11 text

MHDeS: MHG Deduplication System
MHDeS is a system that deduplicates equivalent MHGs at runtime.
It avoids creating equivalent method handles.
It uses an index key to represent a method handle that is about to be created.

Slide 12

Slide 12 text

MH Index Key
Index key: {transformation, MH key, ordered children, optional data}
It represents a method handle that is about to be created.
It holds all the parameters needed to create it (i.e., method type, children, and other necessary data if present).
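One way to picture the index key is as a small immutable value holding the four listed parts. This is only a sketch: the class name, the field types, and the `sameParameters` method are assumptions, and the MH key is represented as a string for simplicity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

class IndexKeySketch {
    final String transformation;  // e.g. "guardWithTest"
    final String mhKey;           // stand-in for the MH key (e.g. a method type)
    final List<Object> children;  // ordered children
    final Object data;            // optional data, may be null

    IndexKeySketch(String transformation, String mhKey, List<?> children, Object data) {
        this.transformation = Objects.requireNonNull(transformation);
        this.mhKey = Objects.requireNonNull(mhKey);
        this.children = Collections.unmodifiableList(new ArrayList<Object>(children));
        this.data = data;
    }

    // True when both keys describe the same creation request: same transformation,
    // MH key and data, and the same child instances in the same order.
    boolean sameParameters(IndexKeySketch other) {
        if (!transformation.equals(other.transformation)) return false;
        if (!mhKey.equals(other.mhKey)) return false;
        if (!Objects.equals(data, other.data)) return false;
        if (children.size() != other.children.size()) return false;
        for (int i = 0; i < children.size(); i++) {
            if (children.get(i) != other.children.get(i)) return false;
        }
        return true;
    }
}
```

Children are compared by identity here on the assumption that they are already pooled instances; a real system would need a deeper comparison when they are not.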

Slide 13

Slide 13 text

MH pool
The pool maps: Transformation Index → MHObject Chain

Slide 14

Slide 14 text

Detector Use existing MHObjects in the pool to construct an index key.

Slide 15

Slide 15 text

Detector Use existing MHObjects in the pool to construct an index key. Compare the index key to the existing MHs in the pool recursively.

Slide 16

Slide 16 text

Detector
Use existing MHObjects in the pool to construct an index key.
Compare the index key to the existing MHs in the pool recursively.
If the comparison finds no equivalent MH, create the method handle for the index key and add it to the pool.
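The detector's find-or-create behaviour can be sketched with a map from transformation to a chain of pooled objects. All names below (`DetectorSketch`, `Pooled`, `findOrCreate`, `chainLength`) are hypothetical, and the sketch assumes every child was itself obtained from the pool, so identity comparison of children stands in for the recursive comparison.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DetectorSketch {
    // Hypothetical stand-in for an MHObject stored in the pool.
    static final class Pooled {
        final String transformation;
        final String mhKey;
        final List<Pooled> children;

        Pooled(String transformation, String mhKey, List<Pooled> children) {
            this.transformation = transformation;
            this.mhKey = mhKey;
            this.children = children;
        }
    }

    // Pool: transformation -> chain of pooled objects.
    private final Map<String, List<Pooled>> pool = new HashMap<>();

    // Returns an existing equivalent object, or creates and pools a new one.
    Pooled findOrCreate(String transformation, String mhKey, Pooled... children) {
        List<Pooled> wanted = Arrays.asList(children);
        List<Pooled> chain = pool.computeIfAbsent(transformation, k -> new ArrayList<>());
        for (Pooled candidate : chain) {
            if (candidate.mhKey.equals(mhKey) && sameInstances(candidate.children, wanted)) {
                return candidate; // equivalent MH already exists, reuse it
            }
        }
        Pooled created = new Pooled(transformation, mhKey, wanted);
        chain.add(created);
        return created;
    }

    // Children come from the pool (assumption), so identity is sufficient here.
    private static boolean sameInstances(List<Pooled> a, List<Pooled> b) {
        if (a.size() != b.size()) return false;
        for (int i = 0; i < a.size(); i++) {
            if (a.get(i) != b.get(i)) return false;
        }
        return true;
    }

    int chainLength(String transformation) {
        List<Pooled> chain = pool.get(transformation);
        return chain == null ? 0 : chain.size();
    }
}
```

Requesting the same `guardWithTest` over the same three pooled children twice returns one shared instance, so the chain for that transformation holds a single entry.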

Slide 17

Slide 17 text

Procedure
Select the MHObject chain that matches the index key's transformation and iterate over it.
For each MHObject, compare its children to those of the index key.
Fast comparison path: check whether a child MH of the index key and the corresponding child of the MH in the pool are the same instance.

Slide 18

Slide 18 text

Procedure
The comparison rules:
Both children point to two different instances in the pool → not equivalent.
One child refers to an instance in the pool and the other does not → not sure.
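The fast path and the two comparison rules can be expressed as a three-valued check. This is a sketch with hypothetical names; the pool is modelled as an identity-based set, and the case where neither child is pooled is not covered by the slide, so returning `NOT_SURE` for it is an assumption.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class ChildComparisonSketch {
    enum Result { EQUIVALENT, NOT_EQUIVALENT, NOT_SURE }

    // Compares one child of the index key with one child of a pooled MH.
    static Result compare(Object a, Object b, Set<Object> pooled) {
        if (a == b) {
            return Result.EQUIVALENT;      // fast path: same instance
        }
        if (pooled.contains(a) && pooled.contains(b)) {
            return Result.NOT_EQUIVALENT;  // two distinct pooled instances
        }
        // One (or, by assumption, neither) is pooled: a deeper comparison is needed.
        return Result.NOT_SURE;
    }

    // Builds a set that compares members by identity rather than equals().
    static Set<Object> identitySet(Object... items) {
        Set<Object> set = Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        Collections.addAll(set, items);
        return set;
    }
}
```

The second rule relies on the pool never holding two equivalent instances, which is the invariant deduplication is meant to maintain.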

Slide 19

Slide 19 text

Experiment Setup
JRuby (1.7.6), a Java implementation of the Ruby language.
IBM J9 JVM for Java 8.
Micro-Indy benchmark in the Computer Language Benchmarks Game (CLBG).

Slide 20

Slide 20 text

Measurement
Elapsed time to complete a test.
Memory usage (the occupied memory after the last GC).
Accumulated GC pause time.

\[ \text{Improved} = \frac{\text{Measure}_{orig} - \text{Measure}_{MHDeS}}{\text{Measure}_{orig}} \tag{3} \]
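As a worked instance of equation (3), the helper below computes the relative improvement and scales it to a percentage, which is how the result charts are labelled. The class and method names are hypothetical, and the sample values in `main` are made up for illustration, not measurements from the talk.

```java
class ImprovementSketch {
    // Relative improvement of MHDeS over the original run, in percent.
    // Positive means MHDeS used less (time or memory); negative means more.
    static double improvedPercent(double measureOrig, double measureMhdes) {
        if (measureOrig == 0.0) {
            throw new IllegalArgumentException("original measurement must be non-zero");
        }
        return (measureOrig - measureMhdes) / measureOrig * 100.0;
    }

    public static void main(String[] args) {
        // Illustrative values only: 200 units originally, 150 with MHDeS.
        System.out.println(improvedPercent(200.0, 150.0));
    }
}
```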

Slide 21

Slide 21 text

Elapsed Time
[Figure: CPU Time Reduction Percentage per benchmark; axis: Improved Performance (%), from −10 to 80. Median: 0.9%, Mean: 4.67%, std: 14.48.]

Slide 22

Slide 22 text

Memory usage
gencon policy
Heap: a nursery and a tenured area.
Objects are created in the nursery.
Objects are later moved to the tenured area.

Slide 23

Slide 23 text

Memory usage
[Figure: Memory usage per benchmark; axis: Reduced Memory with MHDeS (%), from −50 to 40. Median: 9.84%, Mean: 7.19%.]

Slide 24

Slide 24 text

GC Pause Time
[Figure: GC Pause Time per benchmark; axis: Reduced GC Paused Time (%), from −120 to 40. Mean: 1.65%.]
Not much improvement.

Slide 25

Slide 25 text

Earlier JIT compilation
The whole program alternates between interpreting the bytecode version and executing the JITted version.
The change in GC pause time is trivial.

\[ \text{CPU time} \approx \sum_{i=1}^{n_1} T_{int_i} + \sum_{i=1}^{n_2} T_{exe_i} \tag{4} \]

Slide 26

Slide 26 text

MHDeS Expenses

Benchmark                      Mean (ms)    Max (ms)
app_pentomino                  0.163418     10
monte_carlo_pi                 0.415842     11
so_array                       0.206452     10
gc_string                      0.375        15
count_multithreaded            0.171504     11
app_tarai                      0.241667     10
count_shared_thread            0.180247     15
so_object                      0.179426     9
observ                         0.191558     11
nsieve_bits                    0.180233     8
eval                           0.468085     9
write_large                    0.404762     7
string_concat                  0.220339     8
mergesort_hongli               0.222973     9
nbody                          0.179562     9
fiber_ring                     0.268698     9
gc_array                       0.2          9
printff                        0.337539     8
partial_sums                   0.281437     10
fasta                          0.22314      9
word_anagrams                  0.197044     9
list                           0.945606     296
gc_mb                          0.285714     8
simple_connect                 0.223214     12
binary_trees                   0.297872     10
cal                            0.181435     8
app_factorial                  0.209302     6
app_fib                        0.211009     8
so_lists_small                 0.190311     11
so_matrix                      0.378261     9
so_sieve                       0.272727     8
pi                             0.178571     9
so_lists                       0.273356     7
app_mandelbrot                 0.376        10
primes                         0.252137     10
socket_transfer_1mb_noblock    0.170673     1
spectral_norm                  0.194774     9
fractal                        0.231884     10
simple_server                  0.201681     9
socket_transfer_1mb            0.204334     1
mbari_bogus1                   0.0388703    167

Table: MHDeS Time Expense at Runtime