
Dmitry Chuyko: "Premature" compilation, is it fine?

Oracle JDK 9 introduced static (ahead-of-time) compilation. We have already discussed why it is needed and the limits of the current implementation. Now it makes sense to talk about the technical details: what information is generated during AOT compilation and how, and how AOT-compiled code interacts with Hotspot; what can be done with AOT code using external tools, and how to hook into the compilation process; and, of course, which knobs can be turned and what performance to expect with AOT. The pitfalls are already neatly laid out, but in places some nice perks are waiting as well.


Moscow JUG

May 22, 2017

Transcript

  1. Compile ahead of time. It’s fine? Hotspot & AOT

    Dmitry Chuyko, Java SE Performance Team. May 22, 2017. Copyright © 2017, Oracle and/or its affiliates. All rights reserved.
  2. Program Agenda

    1. JEP 295 in JDK 9 and beyond
    2. Generated Library
    3. External Tools
    4. Performance
    5. Future Directions
  3. Safe Harbor Statement

    The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.
  4. JEP 295 in JDK 9 and beyond
  5. AOT 9: Components

    • JEP 295: Ahead-of-Time Compilation, http://openjdk.java.net/jeps/295 (JDK 9 EA build 150)
    • JEP 243: Java-Level JVM Compiler Interface, http://openjdk.java.net/jeps/243
    • Graal Compiler, https://github.com/graalvm/graal-core
  6. AOT 9: Workflow

    Regular: Java → javac → .class → JVM → nmethod1
  7. AOT 9: Workflow

    Pre-compilation:
    Java → javac → .class → JVM → nmethod1
    .class → jaotc → .so → JVM
  8. AOT 9: Targeted Problems

    • Application Warm-up
      – Startup Time
      – Time to Performance
    • Steady state
      – Peak Performance
      – Application Latency
    • Complex case
      – Bootstrapping (meta-circular implementations)
    • Possible impact
      – Density
      – Power Consumption
  9. AOT 9: Solutions

    • Pre-compile initialization code
      – No interpreter for class loading, initializers etc.
      – Spare resources for compilation
      – May stay at AOT
    • Pre-compile critical code
      – Start with much better than interpreter performance
      – Spare resources for compilation
      – May stay at AOT
    • Collect same profiling info as tier 2
      – Reach peak performance
  10. Java on Java
  11. Java on Java: Targeted Problems

    • Simple maintenance
    • Faster development
    • Better security
    • Embeddable VM
  12. Java on Java: Write parts in Java

    • Current
      – Class library
      – Method handles
      – Graal/JVMCI
      – AOT
    • Possible
      – Replace C1, C2, interpreter
      – Runtime routines
      – Compiler routines
  13. Java on Java: Project Metropolis

    • JDK 10 based
    • Translated parts of Hotspot
    • Graal
    • AOT
  14. AOT 9: Measurements

    • JDK 9 EA build 162
    • Linux x64
    • G1
    • Compressed oops
    • Dedicated server hardware or small machine
  15. AOT 9: AOT vs. JIT, naïve

    [Bar chart: SPECjvm2008, G1, tiered AOT of java.base, Linux x64. Benchmarks: Compress, Crypto, Derby, FFT, LU, MPEG. Series: No-AOT, AOT. Axis: 0 % to 150 %.]
  16. AOT 9: AOT vs. JIT, naïve

    [Bar chart: SPECjvm2008, G1, tiered AOT of java.base, Linux x64. Benchmarks: MonteCarlo, SOR, Serial, Sparse, XML. Series: No-AOT, AOT. Axis: 0 % to 150 %.]
  17. AOT 9: Tiered AOT throughput

    Not so useless
    • It works
    • Ensure peak performance in steady state
    • There may be differences
      – Treated as bugs
      – Ignored
  18. AOT 9: AOT vs. JIT, frustrating

    [Bar chart: SPECjvm2008, G1, non-tiered AOT of java.base, Linux x64. Benchmarks: Compress, Crypto, Derby, FFT, LU, MPEG. Series: No-AOT, AOT-nt. Axis: 0 % to 150 %.]
  19. AOT 9: AOT vs. JIT, frustrating

    [Bar chart: SPECjvm2008, G1, non-tiered AOT of java.base, Linux x64. Benchmarks: MonteCarlo, SOR, Serial, Sparse, XML. Series: No-AOT, AOT-nt. Axis: 0 % to 150 %.]
  20. AOT 9: Is it Graal?

    Ones regressed with AOT may not differ
    [Bar chart: SPECjvm2008, G1, non-tiered AOT of java.base, Linux x64. Benchmarks: Compress, Crypto, Derby, Serial, XML. Series: C1-C2, AOT, C1-Graal. Axis: 0 % to 150 %.]
  21. AOT 9: Is it Graal?

    Ones may only differ with Graal as JIT
    [Bar chart: SPECjvm2008, G1, non-tiered AOT of java.base, Linux x64. Benchmarks: FFT, LU, MPEG, MonteCarlo, SOR, Sparse. Series: C1-C2, AOT, C1-Graal. Axis: 0 % to 150 %.]
  22. AOT 9: AOT throughput

    • Benchmarks regressed with AOT may not differ with Graal as JIT
    • Benchmarks may only differ with Graal as JIT
    • Same for other large benchmarks (e.g. SPECjbb)
    • Same for many JVM micro-benchmarks
    • It’s common to see NN% difference
  23. Generated Library
  24. Generated Libraries: Auto-loaded

    Sizes are original / stripped, with compressed oops.

    Module    jmod   Methods   Tiered G1     NT G1         Tiered Par
    base      19M    50673     416M / 286M   318M / 201M   395M / 264M
    logging   118K   532       3.8M / 2.6M   2.9M / 1.8M   3.6M / 2.3M
    nashorn   2.2M   11865     84M / 54M     64M / 37M     79M / 49M
    jvmci     386K   1750      12M / 8.5M    8.9M / 5.8M   12M / 7.6M
    graal     5.5M   18166     163M / 104M   127M / 73M    154M / 95M
    javac     6.3M   12446     115M / 75M    91M / 55M     109M / 69M
  25. Generated Libraries: Basic subsets

    Sizes are original / stripped, with compressed oops.

    Subset            Methods   Tiered G1
    java.base-CDS     22375     163M / 112M
    java.base-Hello   615       5.3M / 3.5M
    hello             2         99K / 76K
  26. Generated Libraries: libjava.base-coop.so

    readelf -S, size -A -d
    [Stacked bar chart of library size for the stripped and debug builds, axis 0M to 400M. Segments: Code, RW, Other, Debug. Bar labels, stripped / debug: 286 / 416, 286 / 286, 261 / 261, 166 / 166.]
  27. Generated Libraries: Shared library

    • Shareable
    • Native debug information
    • Code
    • Metadata
      – .so → VM linkage
      – VM → .so linkage
      – Runtime support
  28. Generated Libraries: Hello World

    ./objconv -dh any-aot.so.dbg | ...

    .hash                    Symbol hash table
    .dynsym                  Dynamic linker symbol table
    .dynstr                  String table
    .rela.dyn                Relocation w addends
    .text                    Program data
    .metaspace.names         Program data
    .klasses.offsets         Program data
    .methods.offsets         Program data
    .klasses.dependencies    Program data
    .stubs.offsets           Program data
    .header                  Program data
    .code.segments           Program data
    .method.constdata        Program data
    .config                  Program data
    .eh_frame                Program data
    .dynamic                 Dynamic linking info
    .metadata.got            Program data
    .method.metadata         Program data
    .hotspot.linkage.got     Program data
    .metaspace.got           bss
    .method.state            bss
    .oop.got                 bss
    .shstrtab                String table
    .symtab                  Symbol table
    .strtab                  String table
  29. Generated Libraries: Hello World

    ./objdump -d hello.so.dbg | ...

    00000000000023a0 <test.HelloWorld.<init>()V>:
    0000000000002520 <test.HelloWorld.main([Ljava/lang/String;)V>:
    0000000000002b48 <M1_375_java.io.PrintStream.write(Ljava/lang/String;)V_plt.entry>:
    0000000000002b5b <M1_375_java.io.PrintStream.write(Ljava/lang/String;)V_plt.jmp>:
    0000000000002b68 <M1_391_java.io.PrintStream.newLine()V_plt.entry>:
    0000000000002b7b <M1_391_java.io.PrintStream.newLine()V_plt.jmp>:
    ...
  30. Generated Libraries: Hello World

    ./objdump -d hello.so.dbg | ...

    0000000000002c20 <Stub<AMD64MathStub.log>>:
    ...
    0000000000005e20 <Stub<NewInstanceStub.newInstance>>:
    0000000000005f20 <Stub<NewArrayStub.newArray>>:
    0000000000006020 <Stub<ExceptionHandlerStub.exceptionHandler>>:
    ...
    0000000000007ca0 <Stub<test_deoptimize_call_int(int)int>>:
    ...
    0000000000007d80 <plt._aot_jvmci_runtime_new_instance>:
    0000000000007d88 <plt._aot_jvmci_runtime_new_array>:
    0000000000007d90 <plt._aot_jvmci_runtime_exception_handler_for_pc>:
    ...
    0000000000007e58 <plt._aot_backedge_event>:
    0000000000007e60 <plt._aot_jvmci_runtime_thread_is_interrupted>:
    0000000000007e68 <plt._aot_jvmci_runtime_test_deoptimize_call_int>:
  31. Generated Libraries: Cold HelloWorld startup

    Slow HDD. Size matters.

                       real    user   sys
    No-AOT             1.8s    0.2s   0.0s
    java.base (used)   12.5s   0.4s   0.4s
    Large unused       2.1s    0.2s   0.1s
    App                1.8s    0.2s   0.0s
  32. Generated Libraries: Warm HelloWorld startup

                real    user    sys
    No-AOT      0.12s   0.15s   0.02s
    java.base   0.15s   0.13s   0.02s
  33. Generated Libraries: Profiling strategies

    jaotc -J-Dgraal.ProfileSimpleMethods=false

                Tiered G1     Tiered no-PSM   Non-tiered G1
    java.base   416M / 286M   370M / 252M     318M / 201M
  34. Generated Libraries: Profiling strategies

    org.graalvm.compiler.hotspot.phases.profiling.FinalizeProfileNodesPhase

    @Override
    protected void run(StructuredGraph graph, PhaseContext context) {
        if (simpleMethodHeuristic(graph)) {
            removeAllProfilingNodes(graph);
            return;
        }
        assignInlineeInvokeFrequencies(graph);
        if (ProfileNode.Options.ProbabilisticProfiling.getValue()) {
            assignRandomSources(graph);
        }
    }
  35. Generated Libraries: Profiling strategies

    org.graalvm.compiler.hotspot.phases.profiling.FinalizeProfileNodesPhase

    private static boolean simpleMethodHeuristic(StructuredGraph graph) {
        if (Options.ProfileSimpleMethods.getValue()) {
            return false;
        }
        // Check if the graph is smallish..
        if (graph.getNodeCount() > Options.SimpleMethodGraphSize.getValue()) {
            return false;
        }
        // Check if method has loops
        if (graph.hasLoops()) {
            return false;
        }
        ...
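The visible part of the heuristic can be tried outside Graal with a small stand-in. This is only a sketch, not Graal code: `GraphInfo` is a hypothetical substitute for `StructuredGraph`, the limit of 256 nodes is an assumed placeholder for `SimpleMethodGraphSize`, and whatever the slide hides behind `...` is not modelled, so a graph that passes both visible checks is simply treated as simple here.

```java
final class SimpleMethodHeuristicSketch {

    /** Hypothetical stand-in for the two StructuredGraph queries used on the slide. */
    static final class GraphInfo {
        final int nodeCount;
        final boolean hasLoops;

        GraphInfo(int nodeCount, boolean hasLoops) {
            this.nodeCount = nodeCount;
            this.hasLoops = hasLoops;
        }
    }

    /**
     * Returns true when profiling nodes would be removed for this graph.
     * Only the checks visible on the slide are reproduced.
     */
    static boolean simpleMethodHeuristic(GraphInfo graph,
                                         boolean profileSimpleMethods,
                                         int simpleMethodGraphSize) {
        if (profileSimpleMethods) {
            return false; // option says: profile simple methods too
        }
        if (graph.nodeCount > simpleMethodGraphSize) {
            return false; // graph is not "smallish"
        }
        if (graph.hasLoops) {
            return false; // methods with loops keep their profiling
        }
        return true; // assumption: the elided checks are ignored in this sketch
    }

    public static void main(String[] args) {
        int assumedLimit = 256; // placeholder value, not taken from Graal
        System.out.println(simpleMethodHeuristic(new GraphInfo(40, false), false, assumedLimit));
        System.out.println(simpleMethodHeuristic(new GraphInfo(40, true), false, assumedLimit));
        System.out.println(simpleMethodHeuristic(new GraphInfo(40, false), true, assumedLimit));
    }
}
```

With `ProfileSimpleMethods` switched on the heuristic never fires, which matches the larger tiered library reported for the default setting.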
  36. Generated Libraries: Patching Graal

    org.graalvm.compiler.hotspot.phases.profiling.FinalizeProfileNodesPhase

    static ExecutorService io = Executors.newSingleThreadExecutor();

    @Override
    protected void run(StructuredGraph graph, PhaseContext context) {
        int nodeCount = graph.getNodeCount();
        // int nodeCount = graph.getNodes().filter(InvokeNode.class).count(); etc.
        io.execute(() -> {
            try {
                File hist = new File("hist.csv");
                if (!hist.exists()) hist.createNewFile();
                BufferedWriter bw = new BufferedWriter(new FileWriter(hist.getName(), true));
                bw.write(Integer.toString(nodeCount));
                bw.write("\n");
                bw.close();
            } catch (IOException e) { };
        });
        ...
  37. Generated Libraries: Patching Graal

    javac --patch-module jdk.internal.vm.compiler=. \
        org/graalvm/compiler/hotspot/phases/profiling/FinalizeProfileNodesPhase.java

    jaotc -J--patch-module -Jjdk.internal.vm.compiler=/home/tp/aot/patching \
        -J-XX:+UseCompressedOops -J-XX:+UseG1GC -J-Xmx4g \
        --info --module java.base --compile-for-tiered --output ignored.so
  38. Generated Libraries: Number of nodes in method graphs, java.base [figure]
  39. Generated Libraries: Number of nodes in method graphs, java.base [figure]
  40. Generated Libraries: Number of nodes in method graphs, java.base [figure]
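The patched phase appends one node count per line to `hist.csv`. One way to turn that file into a distribution is to bucket the counts; the sketch below is an assumption about how such data could be summarized, not the script behind the figures. The class name and the bucket width of 50 are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

final class NodeCountHistogram {

    /**
     * Buckets one-integer-per-line input into fixed-width bins.
     * The map key is the lower bound of the bin, the value is the number of methods in it.
     */
    static SortedMap<Integer, Integer> histogram(List<String> lines, int bucketWidth) {
        SortedMap<Integer, Integer> buckets = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate blank lines, e.g. a trailing newline
            }
            int nodeCount = Integer.parseInt(trimmed);
            int bucketStart = (nodeCount / bucketWidth) * bucketWidth;
            buckets.merge(bucketStart, 1, Integer::sum);
        }
        return buckets;
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args.length > 0 ? args[0] : "hist.csv");
        if (!Files.exists(file)) {
            System.out.println("No " + file + " found; run the patched jaotc first.");
            return;
        }
        // Bucket width of 50 nodes is an arbitrary choice for this sketch.
        for (Map.Entry<Integer, Integer> e : histogram(Files.readAllLines(file), 50).entrySet()) {
            System.out.println(e.getKey() + "+\t" + e.getValue());
        }
    }
}
```

Because the writer on the slide runs on a single-threaded executor, lines are appended one at a time and a line-based reader like this is enough.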
  41. External Tools
  42. Profiling: Flames

    • CPU Flame Graphs: http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html
    • Perf a fork after warm-up
      perf record -F 399 -a -g -- javac-javac
      -XX:+PreserveFramePointer
    • AOT’ed modules: java.base, jdk.compiler
    • No perf-map-agent
  43. Profiling: Flames. No-AOT, AOT without debug info, AOT with debug info [flame graphs]
  44. Profiling: Flames. No-AOT [flame graph]
  45. Profiling: Flames. AOT without debug info [flame graph]
  46. Profiling: Flames. AOT with native debug info [flame graph]
  47. Performance
  48. Time: Startup*

                  No-AOT   java.base+app
    Javac-Hello   1.8s     -20%
    Javac-Javac   17.1s    -24%
  49. Time: Startup*

                  No-AOT   java.base+app   java.base-nt+app-nt
    Javac-Hello   1.8s     -20%            -38%
    Javac-Javac   17.1s    -24%            -32%

    * Multi-threaded (T=32)
  50. Time: Startup*

                  No-AOT   java.base+app   java.base-nt+app-nt
    Javac-Hello   1.8s     -20%            -38%
    Javac-Javac   17.1s    -24%            -32%

    * Multi-threaded (T=32)

    Single-threaded (T=1):

                  No-AOT   java.base+app   java.base-nt+app-nt
    Javac-Hello   0.5s     -11%            +2%
    Javac-Javac   4.5s     +8%             +10%
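To read the startup tables as absolute times, a relative change can be applied to the No-AOT baseline. The sketch below assumes that each percentage is the change in wall-clock time relative to the No-AOT column, with negative meaning faster; the class and method names are made up for illustration.

```java
final class StartupDelta {

    /** Applies a relative change in percent to a baseline time in seconds. */
    static double apply(double baselineSeconds, double deltaPercent) {
        return baselineSeconds * (1.0 + deltaPercent / 100.0);
    }

    public static void main(String[] args) {
        // Baselines and deltas are taken from the java.base+app column.
        System.out.printf("Javac-Hello, T=32: %.2f s%n", apply(1.8, -20));
        System.out.printf("Javac-Javac, T=32: %.2f s%n", apply(17.1, -24));
        System.out.printf("Javac-Hello, T=1:  %.2f s%n", apply(0.5, -11));
        System.out.printf("Javac-Javac, T=1:  %.2f s%n", apply(4.5, 8));
    }
}
```

Under that reading, the single-threaded Javac-Javac run gets slower with AOT while the 32-thread run gets faster, which is the contrast the two tables are set up to show.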
  51. Warmup: Contended

    • No profiling → no contention
    • -J-Dgraal.ProbabilisticProfiling=true
      – Tuning
      – Multiple Graal threads
      – Switch off when T = 1
    • -J-Dgraal.ProfileSimpleMethods=true
      – Pick strategy to not profile
      – Would it be better to inline simple ones?
  52. Warmup: Contended

    ProbabilisticProfiling

    HotSpotAOTProfilingPlugin.java
      -J-Dgraal.TierAInvokeNotifyFreqLog=13
      -J-Dgraal.TierABackedgeNotifyFreqLog=16
      -J-Dgraal.TierAInvokeProfileProbabilityLog=8
      -J-Dgraal.TierABackedgeProfileProbabilityLog=12

    globals.hpp
      -XX:Tier2InvokeNotifyFreqLog=11
      -XX:Tier2BackedgeNotifyFreqLog=14

    • Profile method
    • Notify counters
    • Logarithm of denominator
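"Logarithm of denominator" means that a value k stands for 1 in 2^k. The sketch below illustrates that arithmetic only; it assumes the common mask-based way of testing "every 2^k-th event" and a random draw for "with probability 1/2^k", and it is not a copy of the Graal plugin. All names in it are hypothetical.

```java
import java.util.SplittableRandom;

final class LogFrequencySketch {

    /** A log value k corresponds to a period (denominator) of 2^k. */
    static long period(int log) {
        return 1L << log;
    }

    /** Counts how often a notification fires if it fires on every 2^notifyFreqLog-th invocation. */
    static long notifications(long invocations, int notifyFreqLog) {
        long mask = period(notifyFreqLog) - 1;
        long fired = 0;
        for (long counter = 1; counter <= invocations; counter++) {
            if ((counter & mask) == 0) {
                fired++;
            }
        }
        return fired;
    }

    /** Counts profile updates if each invocation is profiled with probability 1 / 2^probabilityLog. */
    static long sampledUpdates(long invocations, int probabilityLog, long seed) {
        SplittableRandom random = new SplittableRandom(seed);
        long mask = period(probabilityLog) - 1;
        long updates = 0;
        for (long i = 0; i < invocations; i++) {
            if ((random.nextLong() & mask) == 0) {
                updates++;
            }
        }
        return updates;
    }

    public static void main(String[] args) {
        long invocations = 1L << 20;
        System.out.println("notify, log 13 (AOT value on the slide):   " + notifications(invocations, 13));
        System.out.println("notify, log 11 (tier 2 value on the slide): " + notifications(invocations, 11));
        System.out.println("sampled profile updates, log 8:            " + sampledUpdates(invocations, 8, 42L));
    }
}
```

The point of the comparison is the ratio: a log value larger by 2 means four times fewer notifications for the same number of invocations.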
  53. Time: Startup & Post-warmup

    • C1, C1(2), C1(3): -XX:TieredStopAtLevel=k
    • C2
    • AOT-nt: java.base-nt & app-nt
    • AOT: java.base & app
      -XX:Tier3AOTInvocationThreshold=2000000000
      -XX:Tier3AOTMinInvocationThreshold=2000000000
      -XX:Tier3AOTCompileThreshold=2000000000
      -XX:Tier3AOTBackEdgeThreshold=2000000000
      -XX:CICompilerCount=2
      -XX:TieredStopAtLevel=2
  54. Time: Startup & Post-warmup

    [Bar chart: Javac-Javac, T = 1. Bars: AOT-nt, AOT, C1, C1(2), C1(3), C2. Axis: 0 ms to 6,000 ms.]
  55. Time: Startup & Post-warmup

    [Bar chart: Javac-Javac, T = 32. Bars: AOT-nt, AOT, C1, C1(2), C1(3), C2. Axis: 0 to 6 ×10^4.]
  56. Warmup: Single threaded

    [Line chart: Javac-javac, tiered. Horizontal axis: iteration, 0 to 40. Vertical axis: ms, 0 to 4,000. Series: No-AOT, C1, java.base.]
  57. Warmup: Time to iterate

    [Bar chart: Javac-javac, tiered. Axis: 0s to 100s. Bars and labels in order: Tiered 67, C1 100, java.base 70, app 83.]
  58. Warmup: Tiered

    vm/runtime/globals.hpp
      -XX:Tier3AOTInvocationThreshold=10000
      -XX:Tier3AOTMinInvocationThreshold=1000
      -XX:Tier3AOTCompileThreshold=15000
      -XX:Tier3AOTBackEdgeThreshold=120000
      -XX:Tier3InvocationThreshold=200
      -XX:Tier3MinInvocationThreshold=100
      -XX:Tier3CompileThreshold=2000
      -XX:Tier3BackEdgeThreshold=60000

    • Thresholds are different
    • Delay tier 3 on startup
    • No qualitative effect on long warmup
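How the different thresholds delay tier 3 can be seen with a simplified call predicate. The sketch assumes the usual shape of the tiered policy check (invocations above one threshold, or invocations above a minimum together with invocations plus back-edges above a compile threshold) and leaves out load scaling and the separate back-edge predicate; it is an illustration, not HotSpot source, and the class and method names are hypothetical.

```java
final class Tier3PredicateSketch {

    /** Simplified predicate: i = invocation count, b = back-edge count. */
    static boolean shouldCompile(long i, long b,
                                 long invocationThreshold,
                                 long minInvocationThreshold,
                                 long compileThreshold) {
        return i >= invocationThreshold
                || (i >= minInvocationThreshold && i + b >= compileThreshold);
    }

    /** Uses the Tier3* values listed on the slide. */
    static boolean fromInterpreter(long i, long b) {
        return shouldCompile(i, b, 200, 100, 2000);
    }

    /** Uses the Tier3AOT* values listed on the slide. */
    static boolean fromAot(long i, long b) {
        return shouldCompile(i, b, 10000, 1000, 15000);
    }

    public static void main(String[] args) {
        long[] invocationCounts = {150, 250, 1000, 10000};
        for (long i : invocationCounts) {
            System.out.println("invocations=" + i
                    + " interpreter->tier3=" + fromInterpreter(i, 0)
                    + " aot->tier3=" + fromAot(i, 0));
        }
    }
}
```

With these numbers a method without loops needs 50 times more invocations before AOT code is replaced by tier 3 code than an interpreted method does.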
  59. Warmup: Single threaded C1-Graal

    [Line chart: Javac-javac, tiered. Horizontal axis: iteration, 0 to 40. Vertical axis: ms, 0 to 8,000. Series: No-AOT, java.base, Graal.]
  60. Throughput: Measurement

    What may be interesting
    • AOT’ed code calling other code
    • AOT’ed code touching other data
    • java.base

    @State(Thread)
    public class OpsBench {
        @Benchmark
        public Result maybeFromAot() {
            return OpsClass1.doOp(<args>);
        }
    }

    @CompilerControl(DONT_INLINE)
    public class OpsClass1 {
        public static Result doOp(String s) {
            // May use OpsClass2, may be .so
        }
    }
  61. Throughput: Simple method calls

    Call type               VM    .so→VM   VM→.so   1.so→2.so   .so
    instance final          3.1   3.5      3.1      3.5         3.5
    static direct           2.7   3.1      2.7      3.1         3.1
    static indirect self    4.7   3.1      4.7      3.1         3.1
    static indirect other   4.7   3.5      4.6      3.5         3.5

    infra: 0.4. Units: ns/op, ±0.5 ns

    • Non-tiered AOT
    • It’s hard to measure directly
  62. Throughput: Simple method calls

    VM→.so, perfasm

    ....[Distribution by Source]..............
     46.57%  45.21%  c2, level 4
     25.62%  25.60%  c1, level 1
     25.62%  27.22%  lib2.so
      0.75%   0.71%  kernel
      0.69%   0.61%  libjvm.so

    ....[Hottest Methods (after inlining)]....
     37.83%  41.27%  c2, level 4  micro.generated.CallBench_invokeStaticOther_jmhTest::invokeS
     25.60%  25.17%  c1, level 1  micro.TargetClass1::staticThatTarget, version 543
     24.90%  26.19%  lib2.so      micro.TargetClass2.staticEmptyTarget()V
      8.94%   5.80%  c2, level 4  micro.generated.CallBench_invokeStaticOther_jmhTest::invokeS
      1.72%   0.79%  kernel       [unknown]
      0.08%   0.18%  libjvm.so    ElfSymbolTable::lookup
  63. Throughput: Simple method calls

    -XX:-TieredCompilation

    Call type               VM    .so→VM   VM→.so   1.so→2.so   .so
    instance final          3.1   3.1      3.1      3.5         3.5
    static direct           2.7   3.1      2.7      3.1         3.1
    static indirect self    4.2   3.1      4.2      3.1         3.1
    static indirect other   4.3   3.5      4.6      3.5         3.5

    infra: 0.4. Units: ns/op, ±0.5 ns

    • It’s still hard to measure directly
  64. Throughput: Simple method calls

    C2, perfasm

    VM
    ....[Hottest Regions].....................
     52.39%  59.81%  c2  micro.generated.CallBench_invokeStaticOther_jmhTest::invokeSta
     27.36%  24.16%  c2  micro.TargetClass1::staticThatTarget, version 133 (31 bytes)
     19.69%  15.97%  c2  micro.TargetClass2::staticEmptyTarget, version 134 (17 bytes)

    VM→.so
    ....[Hottest Regions].....................
     60.71%  52.88%  lib1.so  micro.TargetClass1.staticThatTarget()V (70 bytes)
     39.17%  47.10%  c2       micro.generated.CallBench_invokeStaticOther_jmhTest::invokeSta

    Someone is missing! objdump -d
  65. Throughput: Simple method calls

    @CompilerControl is not enough

    Working way to get no Graal inlining during AOT:
      jaotc -J-XX:CompileCommand=dontinline,*/*.*

    Broken alternative:
      jaotc -J-Dgraal.Inline=false -J-Dgraal.InlineDuringParsing=false

    Our non-local guy:
      2347: callq 23a0 <M3_39_micro.TargetClass2.staticEmptyTarget()V_plt.entry>
  66. Throughput: Simple method calls

    -XX:-TieredCompilation, no inlining in .so

    Call type               VM    .so→VM   VM→.so   1.so→2.so   .so
    instance final          3.1   3.5      3.1      3.5         3.5
    static direct           2.7   3.1      2.7      3.1         3.1
    static indirect self    4.2   6.2      4.8      6.2         6.2
    static indirect other   4.3   5.8      4.6      6.2         6.2

    infra: 0.4. Units: ns/op, ±0.5 ns

    • Finally makes sense
  67. Throughput: Read data

                                               VM    .so→VM
    Read own static integer                    5.8   5.8
    → Read length of other’s static string     6.5   10.1

    Units: ns/op, ±0.5 ns
  68. Throughput: Read data

    String length, perfnorm

                       VM     .so→VM
    Time, ns/op        6.5    8.8
    L1 dcache loads    16.4   26.5
    Branches           6.1    13.3
    Cycles             17.3   26.9
    Instructions       39.8   67.4
  69. Throughput: Read data

    String length, perfasm

    ....[Hottest Region 2]....................
    c2, level 4, micro.AccessClass1::staticThatStrlen, version 544 (52 bytes)

    ....[Hottest Region 1]....................
    lib1.so, micro.AccessClass1.staticThatStrlen()I (159 bytes)
  70. Throughput: Read data

    String length, asm

    Constants in C2
    0x00007f4d596b7aac: mov $0x8eff8e70,%r10 ; {oop(a 'java/lang/Class'{0x000000008eff8e70} = 'micro/AccessClass2')}
    ....
    0x00007f4d596b7adb: callq 0x00007f4d51c0dc00 ; ImmutableOopMap{} ;*invokevirtual length {reexecute=0 rethrow=0 return_oop=0}

    Checks & references in AOT
    2541: mov 0x20fad8(%rip),%rcx # 212020 <got.init.Lmicro/AccessClass2;>
    2548: test %rcx,%rcx
    254b: je 25e2 <micro.AccessClass1.staticThatStrlen()I+0xc2>
    2551: mov 0x20fad0(%rip),%rcx # 212028 <got.L/micro/AccessClass2;>
    ...
    25d8: callq 73a0 <Stub<resolve_klass_by_symbol(Word,Word)Word>>
    ...
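The AOT listing follows a check-then-load pattern: read a table slot, test it for zero, take a slow path through a resolve stub if it is still empty, otherwise continue with the loaded value. A loose Java analogy of that pattern is sketched below; it is not what jaotc generates, and the class, the slot layout and the resolver callback are all hypothetical.

```java
import java.util.function.IntFunction;

final class GotSlotSketch {
    private final Object[] got;                 // stands in for the got.* slots
    private final IntFunction<Object> resolver; // stands in for the resolve stub
    private int slowPathCalls;

    GotSlotSketch(int slots, IntFunction<Object> resolver) {
        this.got = new Object[slots];
        this.resolver = resolver;
    }

    /** Loads a slot, resolving it on first use; later loads only pay for the null check. */
    Object load(int slot) {
        Object value = got[slot];           // mov <slot>(%rip), %rcx
        if (value == null) {                // test %rcx, %rcx ; je <slow path>
            value = resolver.apply(slot);   // callq <resolve stub>
            got[slot] = value;
            slowPathCalls++;
        }
        return value;
    }

    int slowPathCalls() {
        return slowPathCalls;
    }

    public static void main(String[] args) {
        GotSlotSketch table = new GotSlotSketch(2, slot -> "klass#" + slot);
        System.out.println(table.load(1));
        System.out.println(table.load(1));
        System.out.println("slow path taken " + table.slowPathCalls() + " time(s)");
    }
}
```

The extra load and branch on every access are what separates this path from the C2 version, where the class is embedded in the code as a constant.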
  71. Throughput: Checks & references

    • Convert code to unshareable?
    • Mix code with known class data to use constants?
    • Simple code is good
      – But not inlineable
  72. Throughput: Checks & references

    java.base           Unique   Of classes   Avg. in method
    got.init.L<class>   779      14%          4.5
    got.L<class>        1394     24%          8.0

    • United class list, AppCDS → 21 M .jsa
    • Graal stats are close (4.9, 9.5)
    • Intersection with CDS list: [ 908 [ 496 ] 658 ]
  73. Throughput: Inlining

    Library size

                Non-tiered G1   No inlining
    java.base   318M / 201M     236M / 128M
  74. Throughput: Inlining

    [Bar chart: SPECjvm2008, G1, non-tiered AOT of java.base, Linux x64. Benchmarks: Compress, Crypto, Derby, FFT, LU, MPEG. Series: Inlining, No inlining. Axis: 0 % to 150 %.]
  75. Throughput: Inlining

    [Bar chart: SPECjvm2008, G1, non-tiered AOT of java.base, Linux x64. Benchmarks: MonteCarlo, SOR, Serial, Sparse, XML. Series: Inlining, No inlining. Axis: 0 % to 150 %.]
  76. Latency: Garbage collection

    With AOT
    • Some additional GC work
    • No significant impact on mean
    • No significant impact on max
    • Same distributions
  77. Startup: Applications

                 No-AOT   java.base   java.base-nt
    Jetty        0.5s     -15%        -22%
    Simple GUI   0.6s     -8%         -11%
    Rich GUI     1.9s     -4%         -8%
  78. Startup: Applications

    Graal bootstrap

                    No-AOT   java.base   java.base+graal+jvmci
    Javac-Hello     0.8s     -29%        -29%
    Jetty           0.5s     0%          0%
    Javac-Javac     4.6s     -6%         -5%
    Javadoc-Small   2.7s     -2%         +2%

    Rows with a single AOT value:
    Simple GUI      0.7s     -17%
    Rich GUI        2.4s     -13%
  79. Startup: Applications

    WLS base_domain, System Classloader

                                 no-AOT   java.base   no-AOT   java.base
                                 no-CDS   no-CDS      AppCDS   AppCDS
    Startup                      11.4s    -17%        -33%     -48%
    Footprint [x1], resident     478 M    +25%        -3%      +25%
    Footprint [x1], unique       466 M    -6%         -15%     -18%
    Footprint [x10], total       4652 M   -3%         -11%     -13%
  80. Future Directions
  81. Future Directions: Extensions & improvements

    • More OSes
      – Other *NIX with ELF
      – PEF (macOS)
      – PE (Windows)
    • More CPUs
      – ARM64 port
    • Cross-AOT
  82. Future Directions: Features convergence

    • Solve class data access problem
      – CDS
      – AppCDS
      – Shared strings
    • Boilerplate
      – AOT of pre-generated stuff
    • Product features
      – WLS
    • Cloud
      – Containers
  83. Conclusion: AOT

    • It works in 9
    • Performance is measurable
    • Current results are questionable but already promising
    • Known problems are to be fixed
    • There are big plans