JVM Workshop

Jakub Kubryński

November 06, 2018

Transcript

  1. JVM DIVE FOR MERE MORTALS: WORKSHOPS

    JAKUB KUBRYNSKI
    @jkubrynski / https://kubrynski.blog
  2. $ WHOAMI

    14+ YEARS OF PROFESSIONAL EXPERIENCE
    DEVSKILLER CO-FOUNDER
    BOTTEGA TRAINER & AUDITOR
    DEVOXX.PL PROGRAM COMMITTEE MEMBER
    OPEN-SOURCE CONTRIBUTOR
  3. WHO ARE YOU?

    Bytecode?
    Hashtable vs ConcurrentHashMap?
    JIT?
    Concurrent Mark-Sweep?
    Reflection vs MethodHandle?
    Invokedynamic?
  4. LIFE CYCLE

    source -> javac -> bytecode -> classloader -> interpreter -> JIT -> optimized native code
  5. SOURCE CODE

    package com.random.company.app;

    public class StringUtilsHelper {

        public boolean isEmpty(String str) {
            return str == null || str.length() == 0;
        }
    }
  6. CLASS FILE

    ClassFile {
        u4             magic; // CAFEBABE
        u2             minor_version;
        u2             major_version;
        u2             constant_pool_count;
        cp_info        constant_pool[constant_pool_count-1];
        u2             access_flags;
        u2             this_class;
        u2             super_class;
        u2             interfaces_count;
        u2             interfaces[interfaces_count];
        u2             fields_count;
        field_info     fields[fields_count];
        u2             methods_count;
        method_info    methods[methods_count];
        u2             attributes_count;
        attribute_info attributes[attributes_count];
    }
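The first fields of the structure above can be read directly with a big-endian stream. The following is a minimal sketch, not part of the original deck: the class name `ClassFileHeader` and its helper methods are hypothetical, and it assumes `java/lang/Object.class` is reachable as a system resource (which is the case on common JDK builds).

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: holds the first four ClassFile fields.
class ClassFileHeader {
    final int magic;
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    ClassFileHeader(int magic, int minorVersion, int majorVersion, int constantPoolCount) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    // Reads u4 magic, u2 minor_version, u2 major_version, u2 constant_pool_count.
    // DataInputStream is big-endian, which matches the class file format.
    static ClassFileHeader read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int magic = data.readInt();
        int minor = data.readUnsignedShort();
        int major = data.readUnsignedShort();
        int cpCount = data.readUnsignedShort();
        return new ClassFileHeader(magic, minor, major, cpCount);
    }

    // Assumption: the JDK exposes java/lang/Object.class as a system resource.
    static ClassFileHeader ofJavaLangObject() {
        try (InputStream in = ClassLoader.getSystemResourceAsStream("java/lang/Object.class")) {
            if (in == null) {
                throw new IllegalStateException("java/lang/Object.class not found");
            }
            return read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassFileHeader h = ofJavaLangObject();
        System.out.printf("magic=%08X major=%d minor=%d constant_pool_count=%d%n",
                h.magic, h.majorVersion, h.minorVersion, h.constantPoolCount);
    }
}
```

The major version printed depends on the JDK that runs the example, so no fixed output is shown here.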
  7. BYTECODE

    list of operation codes

    $ xxd -p Test.class
    ...1b04a0000504ac2a1b0464b600021b68ac...

    1b =>  27 => iload_1
    04 =>   4 => iconst_1
    a0 => 160 => if_icmpne 7
    04 =>   4 => iconst_1
    ac => 172 => ireturn
    2a =>  42 => aload_0
    1b =>  27 => iload_1
    04 =>   4 => iconst_1
    64 => 100 => isub
    b6 => 182 => invokevirtual #2
    1b =>  27 => iload_1
    68 => 104 => imul
    ac => 172 => ireturn
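The opcode sequence above has the shape of a recursive, factorial-style instance method. The slide does not show the source of `Test.class`, so the following is only a guess at what could produce such bytecode; the class name `BytecodeSample` and the method name `fact` are hypothetical. Running `javap -c` on the compiled class is one way to compare the result with the listing.

```java
// Hypothetical source whose method body javac would typically compile to
// iload_1, iconst_1, if_icmpne, iconst_1, ireturn, aload_0, iload_1,
// iconst_1, isub, invokevirtual, iload_1, imul, ireturn.
class BytecodeSample {

    int fact(int n) {
        if (n == 1) {          // iload_1, iconst_1, if_icmpne
            return 1;          // iconst_1, ireturn
        }
        return fact(n - 1) * n; // aload_0, iload_1, iconst_1, isub, invokevirtual, iload_1, imul, ireturn
    }

    public static void main(String[] args) {
        System.out.println(new BytecodeSample().fact(5)); // prints 120
    }
}
```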
  8. BYTECODE - MORE INFO

    current usage: 239/255
    $JDK/src/hotspot/share/interpreter/bytecodes.hpp
    class sun.jvm.hotspot.interpreter.Bytecodes
    http://www.javaworld.com/article/2077233/core-java/bytecode-basics.html
  9. CLASSLOADING PHASES

    loading -> reads the class file
    linking
        verifying -> verifies bytecode correctness
        preparing -> allocates memory
        resolving -> links with classes, interfaces, fields, methods
    initializing -> runs static initializers
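The gap between loading and initializing can be observed from plain Java. This is a small sketch, not taken from the deck; `ClassLoadingDemo`, `Lazy` and `observe` are hypothetical names. It relies on `Class.forName(name, false, loader)`, which loads a class without running its static initializer.

```java
// Hypothetical demo: a class can be loaded without being initialized.
class ClassLoadingDemo {

    static boolean lazyInitialized = false;

    static class Lazy {
        static int value = 42;

        static {
            // Runs only in the "initializing" phase.
            ClassLoadingDemo.lazyInitialized = true;
        }
    }

    private static boolean[] observed;

    // Returns {initialized after loading only, initialized after first use}.
    // The result is cached because a class is initialized at most once per loader.
    static synchronized boolean[] observe() {
        if (observed == null) {
            try {
                String name = ClassLoadingDemo.class.getName() + "$Lazy";
                // initialize=false: load the class but do not run static initializers.
                Class.forName(name, false, ClassLoadingDemo.class.getClassLoader());
                boolean afterLoad = lazyInitialized;
                int value = Lazy.value; // reading a non-constant static field triggers initialization
                boolean afterUse = lazyInitialized && value == 42;
                observed = new boolean[] {afterLoad, afterUse};
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }
        return observed.clone();
    }

    public static void main(String[] args) {
        boolean[] result = observe();
        System.out.println("initialized after load: " + result[0]);
        System.out.println("initialized after use:  " + result[1]);
    }
}
```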
  10. INTERPRETER

    detects the critical hot spots in the program
    template interpreter
    java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInterpreter -version
  11. JIT

    Just-In-Time
    optimizes code
    compiles methods into native code
    -client (C1) / -server (C2)
    compiled code runs up to 20 times faster
  12. INLINING

    public String getStringFromSupplier(Supplier<String> supplier) {
        return supplier.get();
    }

    public String businessMethod(String param) {
        Supplier<String> stringSupplier = new StringSupplier("my" + param);
        return getStringFromSupplier(stringSupplier);
    }

    // turns into
    public String businessMethod(String param) {
        Supplier<String> stringSupplier = new StringSupplier("my" + param);
        return stringSupplier.get();
    }
  13. UNROLLING

    private static String[] options = {"yes", "no", "true", "false"};

    public void someMethod() {
        for (String opt : options) {
            process(opt);
        }
    }

    // turns into
    public void someMethod() {
        process("yes");
        process("no");
        process("true");
        process("false");
    }
  14. SCALAR REPLACEMENT

    public void record(int x, int y) {
        Point point = new Point(x, y);
        storePoint(point);
    }

    // inlining
    public void record(int x, int y) {
        Point point = new Point(x, y);
        events.store("Added point", point.x, point.y);
    }

    // scalar replacement
    public void record(int x, int y) {
        events.store("Added point", x, y);
    }
  15. DEAD CODE ELIMINATION

    public void myMethod() {
        for (int i = 0; i < THRESHOLD; i++) {
            new String("test");
        }
    }

    // turns into
    public void myMethod() {
    }
  16. LOCK ELISION

    public void process(List<User> users) {
        List<User> result = new ArrayList<>();
        synchronized (result) {
            fillResult(users);
        }
    }

    // turns into
    public void process(List<User> users) {
        List<User> result = new ArrayList<>();
        fillResult(users);
    }
  17. TYPE SHARPENING

    List<User> users = new ArrayList<>();

    // turns into
    ArrayList<User> users = new ArrayList<>();
  18. ON STACK REPLACEMENT

    happens when the interpreter discovers that a method is looping
    converts an interpreted stack frame into a native compiled stack frame
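A method that is invoked once but spends a long time in a single loop is the typical candidate for on-stack replacement. The sketch below is illustrative only; `OsrDemo` and `sumTo` are hypothetical names. When run with the compilation logging flags mentioned elsewhere in this deck, one would expect a compilation entry for `sumTo` marked with `%` (the OSR marker), though the exact output depends on the JVM version and settings.

```java
// Hypothetical demo: one call, one long-running loop.
class OsrDemo {

    // Called only once from main, so the invocation counter alone would not
    // make it hot; the loop back-edges are what make it a compilation candidate.
    static long sumTo(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumTo(200_000_000));
    }
}
```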
  19. TIERED COMPILATION LEVELS

    0: Interpreted code
    1: Simple C1 compiled code
    2: Limited C1 compiled code
    3: Full C1 compiled code
    4: C2 compiled code
  20. WHY SHOULD I CARE?

    JIT does most of the optimizations we could do manually, without "obfuscating" the source code
    Performance/load tests should run only against a "hot" (warmed-up) application
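The warm-up effect can be made visible with a crude timing loop. This is a rough sketch under stated assumptions, not a benchmark: `WarmupDemo`, `work` and `timeBatches` are hypothetical names, and hand-rolled `System.nanoTime()` measurements are noisy. For real measurements a harness such as JMH, linked at the end of this deck, is the better tool.

```java
// Hypothetical demo: times identical batches of work; early batches are
// usually slower because the code is still interpreted or only lightly compiled.
class WarmupDemo {

    // A small deterministic computation to keep the JIT busy.
    static int work(int seed) {
        int h = seed;
        for (int i = 0; i < 1_000; i++) {
            h = 31 * h + i;
        }
        return h;
    }

    // Runs `batches` batches of `callsPerBatch` calls and returns the elapsed
    // nanoseconds of each batch.
    static long[] timeBatches(int batches, int callsPerBatch) {
        long[] nanos = new long[batches];
        int sink = 0;
        for (int b = 0; b < batches; b++) {
            long start = System.nanoTime();
            for (int c = 0; c < callsPerBatch; c++) {
                sink += work(c);
            }
            nanos[b] = System.nanoTime() - start;
        }
        if (sink == 42) {
            // Uses the result so the computation cannot be discarded as dead code.
            System.out.println("unlikely");
        }
        return nanos;
    }

    public static void main(String[] args) {
        long[] nanos = timeBatches(10, 20_000);
        for (int b = 0; b < nanos.length; b++) {
            System.out.println("batch " + b + ": " + nanos[b] / 1_000 + " us");
        }
    }
}
```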
  21. HOW TO TRACK?

    When is your app back at full speed after a restart?

    $ jstat -compiler <PID> 1s
    // or
    JDK9+ => -Xlog:jit*=debug
    JDK8 => -XX:+PrintCompilation
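Besides the command-line options above, the accumulated compilation time is also exposed inside the process through the standard `java.lang.management` API. The sketch below is an addition to the deck, with the hypothetical class name `JitTimeProbe`; polling this value until it stops growing is one possible heuristic for "warmed up", stated here as an assumption rather than a rule.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper around the standard CompilationMXBean.
class JitTimeProbe {

    // Returns the approximate accumulated JIT compilation time in milliseconds,
    // or -1 if the JVM has no compiler or does not monitor compilation time.
    static long totalCompilationMillis() {
        CompilationMXBean bean = ManagementFactory.getCompilationMXBean();
        if (bean == null || !bean.isCompilationTimeMonitoringSupported()) {
            return -1;
        }
        return bean.getTotalCompilationTime();
    }

    public static void main(String[] args) {
        CompilationMXBean bean = ManagementFactory.getCompilationMXBean();
        System.out.println("compiler: " + (bean == null ? "none" : bean.getName()));
        System.out.println("total compilation time [ms]: " + totalCompilationMillis());
    }
}
```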
  22. -XLOG:JIT

    $JDK/src/hotspot/share/compiler/compileTask.cpp#CompileTask::print_impl

    const char compile_type   = is_osr_method         ? '%' : ' ';
    const char sync_char      = is_synchronized       ? 's' : ' ';
    const char exception_char = has_exception_handler ? '!' : ' ';
    const char blocking_char  = is_blocking           ? 'b' : ' ';
    const char native_char    = is_native             ? 'n' : ' ';

    compile_id, compile_type, sync_char, exception_char, blocking_char, native_char, comp_level

    [0,090s][debug][jit,compilation] 104 3 java.lang.module.ModuleDescriptor$Exports::isQualified (18 bytes)
    [0,096s][debug][jit,compilation] 122 4 java.lang.module.ModuleDescriptor$Exports::isQualified (18 bytes)
    [0,097s][debug][jit,compilation] 104 3 java.lang.module.ModuleDescriptor$Exports::isQualified (18 bytes) made not entrant

    [0,032s][debug][jit,compilation] 1 3 java.util.concurrent.ConcurrentHashMap::tabAt (22 bytes)
    [0,032s][debug][jit,inlining ] @ 15 jdk.internal.misc.Unsafe::getObjectAcquire (7 bytes)

    [0,059s][debug][jit,compilation] 35 n 0 java.lang.Object::hashCode (native)
    [0,065s][debug][jit,compilation] 41 ! 3 java.util.concurrent.ConcurrentHashMap::putVal (432 bytes)
  23. STACK TRACE

    "main@1" prio=5 tid=0x1 nid=NA runnable
      java.lang.Thread.State: RUNNABLE
        at io.codearte.BlockBuilder.startBlock(BlockBuilder.groovy:21)
        at io.codearte.Generator.process(Generator.java:318)
        at io.codearte.ImportantApp.do(ImportantApp.java:64)
        at sun.reflect.NativeMethodImpl.invoke(NativeMethodImpl.java:18)
        at sun.reflect.NativeMethodImpl.invoke(NativeMethodImpl.java:62)
        at java.lang.reflect.Method.invoke(Method.java:497)
  24. OBJECT LAYOUT

    com.eshop.model.Product object internals:
     OFFSET  SIZE     TYPE DESCRIPTION                                VALUE
          0    12          (object header)                            N/A
         12     4      int Product.id                                 N/A
         16     4   String Product.name                               N/A
         20     4          (loss due to the next object alignment)
    Instance size: 24 bytes (estimated, the sample instance is not available)
    Space losses: 0 bytes internal + 4 bytes external = 4 bytes total
  25. OBJECT LAYOUT

    com.eshop.model.Product object internals:
     OFFSET  SIZE     TYPE DESCRIPTION                                VALUE
          0    12          (object header)                            N/A
         12     4      int Product.id                                 N/A
         16     4      int Product.price                              N/A
         20     4   String Product.name                               N/A
    Instance size: 24 bytes (estimated, the sample instance is not available)
    Space losses: 0 bytes internal + 0 bytes external = 0 bytes total
  26. OBJECT LAYOUT

    com.eshop.model.Product object internals:
     OFFSET  SIZE     TYPE DESCRIPTION                                VALUE
          0    12          (object header)                            N/A
         12     4      int Product.id                                 N/A
         16     4      int Product.price                              N/A
         20     1  boolean Product.available                          N/A
         21     3          (alignment/padding gap)                    N/A
         24     4   String Product.name                               N/A
         28     4          (loss due to the next object alignment)
    Instance size: 32 bytes (estimated, the sample instance is not available)
    Space losses: 3 bytes internal + 4 bytes external = 7 bytes total
  27. OBJECT LAYOUT

    com.eshop.model.Product object internals:
     OFFSET  SIZE     TYPE DESCRIPTION                                VALUE
          0    16          (object header)                            N/A
         16     4      int Product.id                                 N/A
         20     4      int Product.price                              N/A
         24     1  boolean Product.available                          N/A
         25     7          (alignment/padding gap)                    N/A
         32     8   String Product.name                               N/A
    Instance size: 40 bytes (estimated, the sample instance is not available)
    Space losses: 7 bytes internal + 0 bytes external = 7 bytes total
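The instance sizes in the four layouts above follow from simple alignment arithmetic. The sketch below is a deliberately simplified model, not HotSpot's actual layout algorithm (which may reorder fields): it assumes fields are placed in the listed order, each aligned to its own size, and that the whole instance is padded to a multiple of 8 bytes. `LayoutSketch` and its methods are hypothetical names.

```java
// Hypothetical, simplified model of the layouts printed above.
class LayoutSketch {

    // Rounds offset up to the next multiple of alignment.
    static int align(int offset, int alignment) {
        return (offset + alignment - 1) / alignment * alignment;
    }

    // Places fields in the given order, each aligned to its own size,
    // then pads the instance to an 8-byte boundary.
    static int instanceSize(int headerSize, int... fieldSizes) {
        int offset = headerSize;
        for (int size : fieldSizes) {
            offset = align(offset, size) + size;
        }
        return align(offset, 8);
    }

    public static void main(String[] args) {
        System.out.println(instanceSize(12, 4, 4));       // id, name (4-byte reference): 24
        System.out.println(instanceSize(12, 4, 4, 4));    // id, price, name: 24
        System.out.println(instanceSize(12, 4, 4, 1, 4)); // id, price, available, name: 32
        System.out.println(instanceSize(16, 4, 4, 1, 8)); // 16-byte header, 8-byte reference: 40
    }
}
```

Under this model, adding `price` to the first layout costs nothing because it fills the 4 bytes that were previously lost to alignment.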
  28. GC ALGORITHMS

    Serial
    Parallel
    Concurrent Mark Sweep
    G1
    Epsilon* (No-Op) - JEP 318
    ZGC* (Low Latency) - JEP 333
    * Experimental
  29. VECTOR ALGORITHM

    mark_gc_roots()
    for (each_root_object) {
        mark_all_referenced_objects()
    }
    for (each_object_in_memory) {
        if (is_marked_as_reachable) {
            unmark_the_object()
        } else {
            remove_object_and_reclaim_memory()
        }
    }
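The pseudocode above can be turned into a toy mark-and-sweep collector over an explicit object graph. This is an illustrative sketch only and says nothing about how HotSpot collectors are implemented; `ToyHeap`, `Obj` and all method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical toy heap: objects are nodes, references are edges.
class ToyHeap {

    static class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;

        Obj(String name) {
            this.name = name;
        }
    }

    final List<Obj> objects = new ArrayList<>();
    final List<Obj> roots = new ArrayList<>();

    Obj allocate(String name) {
        Obj obj = new Obj(name);
        objects.add(obj);
        return obj;
    }

    // Mark phase: flags everything reachable from the roots.
    private void mark() {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj obj = pending.pop();
            if (!obj.marked) {
                obj.marked = true;
                pending.addAll(obj.refs);
            }
        }
    }

    // Sweep phase: unmarks survivors, removes everything else.
    private int sweep() {
        int reclaimed = 0;
        for (Iterator<Obj> it = objects.iterator(); it.hasNext();) {
            Obj obj = it.next();
            if (obj.marked) {
                obj.marked = false;
            } else {
                it.remove();
                reclaimed++;
            }
        }
        return reclaimed;
    }

    // Runs one full cycle and returns the number of reclaimed objects.
    int collect() {
        mark();
        return sweep();
    }
}
```

Because reachability is computed from the roots, unreachable cycles (two objects that only reference each other) are reclaimed as well.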
  30. GENERATIONAL HYPOTHESIS

    Infant mortality: young objects are much more likely to die
    The idea is to process both generations separately
    Tenuring threshold
  31. GENERATIONAL HYPOTHESIS

    Does not work for caches
    Do not keep caches in the same heap as the application
    Go off-heap
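The slide does not say which off-heap mechanism it has in mind. One option available in the JDK itself is a direct `ByteBuffer`, whose backing memory lives outside the garbage-collected heap. The sketch below shows the idea with a fixed-size store of `long` values; `OffHeapLongCache` is a hypothetical name, and a production cache would more likely rely on a dedicated library.

```java
import java.nio.ByteBuffer;

// Hypothetical fixed-size store of long values kept in a direct buffer,
// so the cached data is not scanned or copied by the garbage collector.
class OffHeapLongCache {

    private final ByteBuffer buffer;

    OffHeapLongCache(int capacity) {
        // allocateDirect reserves native memory outside the Java heap.
        this.buffer = ByteBuffer.allocateDirect(capacity * Long.BYTES);
    }

    void put(int slot, long value) {
        buffer.putLong(slot * Long.BYTES, value);
    }

    long get(int slot) {
        return buffer.getLong(slot * Long.BYTES);
    }

    boolean isOffHeap() {
        return buffer.isDirect();
    }
}
```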
  32. LAMBDA UNDER THE HOOD

    BigDecimal sumCreditEntries(Client client) {
        return sumEntries(client.getAccounts(), account -> account.getCreditEntries());
    }
    // generated method
    private static java.util.List lambda$sumCreditEntries$0(com.sandbox.Account);

    private Period period;
    BigDecimal sumCreditEntries(Client client) {
        return sumEntries(client.getAccounts(), account -> account.getCreditEntries(period));
    }
    // generated method
    private java.util.List lambda$sumCreditEntries$0(com.sandbox.Account);

    BigDecimal sumCreditEntries(Client client, Period period) {
        return sumEntries(client.getAccounts(), account -> account.credit(period));
    }
    // generated method
    private static java.util.List lambda$sumCreditEntries$0(java.time.Period, com.sandbox.Account);
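The generated methods listed above can also be inspected at runtime through reflection. The sketch below is an addition to the deck with hypothetical names (`LambdaShapes`, `countLambdaMethods`). It assumes the compiler names the synthetic methods with a `lambda$` prefix, which is what the signatures on the slide show but is an implementation detail rather than a language guarantee.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.function.Supplier;

// Hypothetical demo: one lambda that captures nothing, one that captures `this`.
class LambdaShapes {

    private String prefix = "my";

    Supplier<String> nonCapturing() {
        return () -> "constant";   // no captured state: expected to become a static method
    }

    Supplier<String> capturingThis() {
        return () -> prefix + "!"; // reads a field: expected to become an instance method
    }

    // Counts compiler-generated lambda$... methods that are static (or not).
    static int countLambdaMethods(boolean wantStatic) {
        int count = 0;
        for (Method m : LambdaShapes.class.getDeclaredMethods()) {
            boolean isStatic = Modifier.isStatic(m.getModifiers());
            if (m.getName().startsWith("lambda$") && isStatic == wantStatic) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (Method m : LambdaShapes.class.getDeclaredMethods()) {
            if (m.getName().startsWith("lambda$")) {
                System.out.println(Modifier.toString(m.getModifiers()) + " " + m.getName());
            }
        }
    }
}
```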
  33. METHOD REFERENCE

    SIMILAR TO LAMBDAS, BUT NO NEED TO GENERATE A METHOD BECAUSE WE'RE CALLING A METHOD
  34. BENCHMARKS

    Benchmark                    Mode  Cnt   Score    Error  Units
    CallTypes.baseline           avgt   30   4.163 ±  0.009  ns/op
    CallTypes.lambda             avgt   30   4.174 ±  0.015  ns/op
    CallTypes.methodRef          avgt   30   4.244 ±  0.049  ns/op
    CallTypesExternal.baseline   avgt   30  50.055 ±  0.275  ns/op
    CallTypesExternal.lambda     avgt   30  50.980 ±  0.650  ns/op
    CallTypesExternal.methodRef  avgt   30  50.655 ±  0.376  ns/op
  35. METHODHANDLES

    is not a replacement for reflection (reflection is more for introspection of classes)
    reflection does access control during invocation, while MethodHandle checks everything during lookup
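The two invocation styles can be compared side by side on a simple method such as `String.length()`. This is a minimal sketch with the hypothetical class name `InvocationStyles`; it illustrates the API shape only and makes no performance claim. Note that `invokeExact` requires the call site types to match the handle's type exactly, hence the `(int)` cast and the `String` argument.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

// Hypothetical demo: calling String.length() via reflection and via a MethodHandle.
class InvocationStyles {

    static int lengthViaReflection(String s) {
        try {
            Method method = String.class.getMethod("length");
            // Access is checked as part of invoke(); the result comes back boxed.
            return (Integer) method.invoke(s);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static int lengthViaMethodHandle(String s) {
        try {
            // Access is checked here, once, when the handle is looked up.
            MethodHandle handle = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
            // The handle's type is (String)int; invokeExact needs exactly that shape.
            return (int) handle.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthViaReflection("hello"));   // prints 5
        System.out.println(lengthViaMethodHandle("hello")); // prints 5
    }
}
```

In real code the `Method` or `MethodHandle` would normally be looked up once and cached; the benchmark names on the next slide that end in `WithoutLookup` suggest the same distinction.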
  36. BENCHMARKS

    1 ns = 0.000 001 ms = 0.000 000 001 s

    Benchmark                             Mode  Cnt    Score     Error  Units
    Invocations.baseline                  avgt   30   35.761 ±   0.113  ns/op
    Invocations.reflectionWithoutLookup   avgt   30   41.413 ±   0.223  ns/op
    Invocations.handleWithoutLookup       avgt   30   42.002 ±   1.002  ns/op
    Invocations.handleExactWithoutLookup  avgt   30   38.134 ±   0.153  ns/op
    Invocations.reflection                avgt   30   71.207 ±   1.558  ns/op
    Invocations.handle                    avgt   30  858.148 ±  18.135  ns/op
  37. EXCEPTIONS

    DEEP MEANS THERE ARE 4 MORE FRAMES

    Benchmark                      Mode  Cnt     Score    Error  Units
    Exceptions.standardExcept      avgt   30  1029.919 ±  5.026  ns/op
    Exceptions.standardExceptDeep  avgt   30  1121.771 ±  6.615  ns/op
    Exceptions.stacklessExcept     avgt   30    18.827 ±  0.066  ns/op
    Exceptions.stacklessExceptDeep avgt   30    19.835 ±  0.053  ns/op
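The deck does not show how the "stackless" exceptions in this benchmark were implemented. One common way, shown here as an assumption, is the protected `RuntimeException` constructor that disables the writable stack trace, so no stack walk happens when the exception is created. `StacklessException` and `StacklessDemo` are hypothetical names.

```java
// Hypothetical exception type that skips stack trace capture.
class StacklessException extends RuntimeException {

    StacklessException(String message) {
        // enableSuppression = false, writableStackTrace = false
        super(message, null, false, false);
    }
}

class StacklessDemo {

    // Returns the number of recorded stack frames for a thrown exception.
    static int recordedFrames(RuntimeException toThrow) {
        try {
            throw toThrow;
        } catch (RuntimeException e) {
            return e.getStackTrace().length;
        }
    }

    public static void main(String[] args) {
        System.out.println("standard:  " + recordedFrames(new RuntimeException("boom")));
        System.out.println("stackless: " + recordedFrames(new StacklessException("boom")));
    }
}
```

The trade-off is that such exceptions carry no location information, so they suit control-flow style signalling better than error reporting.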
  38. FURTHER READING

    Optimizing Java - Benjamin J. Evans, James Gough
    The Well-Grounded Java Developer - Benjamin J. Evans, Martijn Verburg
    Java Performance - Charlie Hunt, Binu John
    Java Performance: The Definitive Guide - Scott Oaks
  39. I WANT MORE!

    THE JAVA® VIRTUAL MACHINE SPECIFICATION
    hg clone http://hg.openjdk.java.net/jdk11/jdk11/
    http://openjdk.java.net/projects/code-tools/jmh