Slide 1

Slide 1 text

The Call of C-Tooling: The Secrets Behind Native Image Building

Slide 2

Slide 2 text

@evacchi About Me • Edoardo Vacchi @evacchi • Research @ UniMi / MaTe • Research @ UniCredit R&D • Kogito / Drools / jBPM @ Red Hat • evacchi.github.io

Slide 3

Slide 3 text

@evacchi Kogito

Slide 4

Slide 4 text

Native Java

Slide 5

Slide 5 text

Java Applications
Initialization stages across Build Time and Run Time:
• static void Main: 3 Classloaders, ~500 Classes, ~160 Static Init
• Framework Initialization: 100+ Classloaders, 1000+ Classes, 1000+ Static Init
• Application Initialization: 100++ Classloaders, 1000++ Classes, 1000++ Static Init
Source: Dan Heidinga - “Starting Fast” (QCon Plus 2021)

Slide 6

Slide 6 text

A Bit of History

Slide 7

Slide 7 text

Native Java Compilers • Compiling Java into machine code is not innovative per se • Prior art: native Java compilers from the early 2000s • GNU Compiler for Java (GCJ) • Excelsior JET • ... • More recently: RoboVM (~2013)

Slide 8

Slide 8 text

@evacchi Pros • Native code, possibly faster to start-up • Smaller memory footprint • by avoiding JIT+scratch memory in address space • possibly aggressive dead code elimination • Self-contained • avoid full JDK class library bundle

Slide 9

Slide 9 text

Cons
Limitations
• Not a JDK: different runtime environment, not cross-platform
• May get out of sync with the spec
• Trade-offs with dynamicity
• Difference in run-time behavior (dynamic vs static)
• May need to compromise on peak performance (PGO?)
Moreover
• The benefits of a native compilation are not compelling enough
• Startup time is negligible
• "You boot up your application once, you keep it running for a long time"
• "Disk is cheap"
• Dynamic Linking vs Static Linking
• You can still achieve faster startup time through laziness

Slide 10

Slide 10 text

Laziness
• Defer initialization to a later stage of execution
• Benefits: shorter startup time
• Downsides: less predictable performance profile
Figure: Build Time / Run Time timeline (static void Main, Framework Initialization, Application Initialization, Delayed Inits...)
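
As an illustration of deferring initialization, here is a minimal sketch of a memoizing holder. The names `LazyInitDemo`, `Lazy`, and `EXPENSIVE` are hypothetical and not from the talk; the point is only that the costly work moves from startup to first use, which is also why the performance profile becomes less predictable.

```java
import java.util.function.Supplier;

final class LazyInitDemo {
    /** Memoizing supplier: runs the delegate on the first get() only. */
    static final class Lazy<T> implements Supplier<T> {
        private final Supplier<T> delegate;
        private T value;
        private boolean initialized;

        Lazy(Supplier<T> delegate) {
            this.delegate = delegate;
        }

        @Override
        public synchronized T get() {
            if (!initialized) {
                value = delegate.get(); // the deferred (possibly slow) work happens here
                initialized = true;
            }
            return value;
        }
    }

    static int initCount = 0;

    // Hypothetical expensive resource: nothing is computed until get() is called.
    static final Lazy<String> EXPENSIVE = new Lazy<>(() -> {
        initCount++;
        return "ready";
    });

    public static void main(String[] args) {
        System.out.println("started, initializations so far: " + initCount);
        System.out.println(EXPENSIVE.get()); // first use pays the cost
        System.out.println(EXPENSIVE.get()); // later uses reuse the cached value
        System.out.println("initializations: " + initCount);
    }
}
```

The first caller pays the initialization cost, so latency moves from boot to an arbitrary later request.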

Slide 11

Slide 11 text

Today

Slide 12

Slide 12 text

@evacchi Getting Closer to Today • Shared Managed Infrastructure • Serverless • More interest in “Stateless” Apps • Suddenly attractive: • Fast Startup • Smaller Disk Footprint • Smaller Memory Footprint • Time to revisit?

Slide 13

Slide 13 text

Run-Time vs Build-Time
• Generate code at build-time
• Pre-initialize for boot time
• e.g. Read config files, turn them into configuration commands
• e.g. Read annotations, produce code for dependency injection
• At startup, just execute that code
• Benefits: faster startup time
• Downsides
• you have to write the code that generates code
• possibly non-trivial, certainly time-consuming
Figure: Build Time / Run Time timeline (static void Main, Framework Initialization, Application Initialization, Codegen)
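
The "read config files, turn them into code" idea can be sketched as a tiny generator. This is an illustrative assumption, not how any particular framework does it: `ConfigCodegenDemo`, `generate`, and the `server.*` keys are hypothetical, and the sketch skips escaping of special characters in values.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

final class ConfigCodegenDemo {
    /**
     * Build-time step: turn key/value configuration into Java source, so that
     * startup only loads constants instead of parsing a file.
     * Note: values are emitted verbatim; a real generator would escape them.
     */
    static String generate(String className, Map<String, String> config) {
        StringBuilder src = new StringBuilder();
        src.append("final class ").append(className).append(" {\n");
        // Sort keys so the generated source is stable across builds.
        for (Map.Entry<String, String> entry : new TreeMap<>(config).entrySet()) {
            String constant = entry.getKey().toUpperCase(Locale.ROOT).replace('.', '_');
            src.append("    static final String ").append(constant)
               .append(" = \"").append(entry.getValue()).append("\";\n");
        }
        src.append("}\n");
        return src.toString();
    }

    public static void main(String[] args) {
        Map<String, String> config = Map.of("server.port", "8080", "server.host", "localhost");
        System.out.print(generate("GeneratedConfig", config));
    }
}
```

The generator itself is the extra code you have to write and maintain, which is the downside the slide names.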

Slide 15

Slide 15 text

Smalltalk VMs

Slide 16

Slide 16 text

Smalltalk Environment
• Concept of image
• At run-time you do not just write code: you manipulate the state of the machine
• contributing to the environment itself
• possibly altering it or even turning it upside-down
• When it is shut down, you do not just save the code you wrote: you persist the state of the machine to the image
• When you start it, you do not only run a program: the state is restored, and execution resumes from the last saved state
Figure: Run Time cycle (Load State, Shutdown, Save State)
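
As a loose analogy only, the save-state/load-state cycle can be sketched with plain Java serialization. This is not how a Smalltalk VM persists its image; `ImageSnapshotDemo` and `MachineState` are hypothetical names, and only a single object is saved rather than a whole environment.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class ImageSnapshotDemo {
    /** Stand-in for "the state of the machine". */
    static final class MachineState implements Serializable {
        private static final long serialVersionUID = 1L;
        int counter;
    }

    /** Shutdown: persist the live state into an "image" (here, a byte array). */
    static byte[] save(MachineState state) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(state);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    /** Startup: restore the state instead of recomputing it. */
    static MachineState load(byte[] image) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(image))) {
            return (MachineState) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MachineState state = new MachineState();
        state.counter = 41;
        byte[] image = save(state);          // "shut down"
        MachineState resumed = load(image);  // "start up"
        resumed.counter++;                   // execution resumes from the saved state
        System.out.println(resumed.counter);
    }
}
```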

Slide 17

Slide 17 text

Checkpointing

Slide 18

Slide 18 text

CRIU + Java
• CRIU: Checkpoint and Restore in Userspace
• https://www.criu.org
• Jigawatts:
• https://github.com/chflood/jigawatts
• OpenJ9 Snapshot+Restore
• https://danheidinga.github.io/Everyone_wants_fast_startup
• CRaC: Coordinated Restore at Checkpoint
• https://github.com/CRaC/docs#crac
• https://openjdk.java.net/projects/crac/
Figure: Build Time / Run Time timeline (static void Main, Framework Initialization, Application Initialization, Checkpoint)

Slide 19

Slide 19 text

GraalVM

Slide 20

Slide 20 text

@evacchi GraalVM • GraalVM is an umbrella of technologies • A just-in-time compiler • The Truffle framework to implement dynamic languages • they can be seamlessly JITted across language boundaries. • SubstrateVM: the native image builder • reuses the compilation backend for Ahead-of-Time compilation • static init • image heap

Slide 21

Slide 21 text

Native Image Restrictions
• Native binary compilation
• Restriction: “closed-world assumption”
• Limitations on reflection
• No dynamic code loading: forbidden ClassLoader#defineClass(...byte[]...)
• Allows more aggressive optimization (e.g., dead code elimination)
• Static initializers may be eager*!
• Evaluated at build time!
* originally opt-out, now opt-in. In some cases default on (e.g. Quarkus)
Figure: Build Time / Run Time timeline (static void Main, Framework Initialization, Application Initialization)

Slide 22

Slide 22 text

Image Heap Generation

Slide 23

Slide 23 text

Image-Gen Heap That is another embarrassing pun

Slide 24

Slide 24 text

• “We run parts of an application at build time and snapshot the objects allocated by this initialization code, using an iterative approach that is intertwined with points-to analysis.”
• “We use points-to analysis results to only AOT-compile the parts of an application that are reachable at run time.”
Source: Initialize Once, Start Fast: Application Initialization at Build Time (Wimmer et al., OOPSLA 2019)

Slide 25

Slide 25 text

@evacchi Static initializers

Slide 26

Slide 26 text

Static initializers

public class Example {
    static {
        System.out.println("hello");
    }
    public static void main(String... args) {
        System.out.println("world");
    }
}

Slide 27

Slide 27 text

Static initializers

$ java Example
hello
world

Slide 28

Slide 28 text

Static initializers

$ native-image --initialize-at-build-time Example
[example:23074]    classlist:   1,032.11 ms,  1.18 GB
[example:23074]        (cap):   2,301.26 ms,  1.18 GB
[example:23074]        setup:   3,609.57 ms,  1.69 GB
hello
[example:23074]     (clinit):      82.45 ms,  1.73 GB
[example:23074]   (typeflow):   3,032.00 ms,  1.73 GB
[example:23074]    (objects):   2,923.76 ms,  1.73 GB
[example:23074]   (features):     129.59 ms,  1.73 GB
[example:23074]     analysis:   6,307.81 ms,  1.73 GB
[example:23074]     universe:     277.17 ms,  1.73 GB
[example:23074]      (parse):     525.88 ms,  1.73 GB
[example:23074]     (inline):     877.57 ms,  1.78 GB
[example:23074]    (compile):   3,842.94 ms,  1.87 GB
[example:23074]      compile:   5,504.45 ms,  1.87 GB
[example:23074]        image:     463.22 ms,  1.87 GB
[example:23074]        write:     176.80 ms,  1.87 GB
[example:23074]      [total]:  17,528.27 ms,  1.87 GB

Slide 30

Slide 30 text

Static initializers

$ ./example
world

Slide 31

Slide 31 text

Static initializers

public class Example {
    static {
        System.out.println("hello");
    }
    public static void main(String... args) {
        System.out.println("world");
    }
}

Slide 32

Slide 32 text

Static initializers

public class Example {
    static {
        System.out.println("hello");
    }
    public static void main(String... args) {
        System.out.println("world");
    }
}

Annotations: A string constant

Slide 33

Slide 33 text

Static initializers

public class Example {
    static {
        System.out.println("hello");
    }
    public static void main(String... args) {
        System.out.println("world");
    }
}

Annotations: A string constant • A method invocation • Over a PrintStream

Slide 34

Slide 34 text

Static initializers

public class Example {
    static {
        System.out.println("hello");
    }
    public static void main(String... args) {
        System.out.println("world");
    }
}

Annotations: A string constant • A method invocation • A field resolution • Over a subtype of OutputStream

Slide 35

Slide 35 text

Static initializers

public class Example {
    static {
        System.out.println("hello");
    }
    public static void main(String... args) {
        System.out.println("world");
    }
}

Annotations: A string constant • A method invocation • A field resolution • A static class initializer • Over a subtype of OutputStream

Slide 36

Slide 36 text

Initialization Code
First, class initializers are executed.
• In Java, every class can have a class initializer ("static initializer")
• represented as a method named <clinit> in the class file.
• It computes the initial value of static fields.
• The developer decides which classes are initialized at image build time
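
To make the role of the class initializer concrete, here is a small sketch of how it behaves on a regular JVM, where it runs lazily on first use of the class. `ClassInitDemo`, `Table`, and `LOG` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class ClassInitDemo {
    static final List<String> LOG = new ArrayList<>();

    static final class Table {
        // The static block below is compiled into Table's <clinit> method;
        // it computes the initial value of the static field.
        static final int[] SQUARES = new int[4];

        static {
            LOG.add("Table.<clinit>");
            for (int i = 0; i < SQUARES.length; i++) {
                SQUARES[i] = i * i;
            }
        }
    }

    public static void main(String[] args) {
        LOG.add("main start");
        int nine = Table.SQUARES[3]; // first use of Table triggers its initializer
        LOG.add("read " + nine);
        System.out.println(LOG); // prints [main start, Table.<clinit>, read 9]
    }
}
```

On a JVM the initializer runs after `main` has started. When a class is initialized at image build time, that work is done by the image builder instead, and only its result remains in the image.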

Slide 37

Slide 37 text

@evacchi Heap Snapshotting • Builds an object graph i.e., the transitive closure of reachable objects • starts with root pointers e.g. static fields. • This object graph is written into the native image as the image heap
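
The transitive closure over reachable objects can be sketched with a worklist. This is a toy under stated assumptions, not the Native Image implementation: `HeapSnapshotDemo` and `Node` are hypothetical, and the walk only descends into fields of the toy `Node` type.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

final class HeapSnapshotDemo {
    static final class Node {
        final String name;
        Node next;

        Node(String name) {
            this.name = name;
        }
    }

    /** Returns every object reachable from the roots, compared by identity. */
    static Set<Object> reachableFrom(Object... roots) {
        Set<Object> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Object> worklist = new ArrayDeque<>();
        for (Object root : roots) {
            if (root != null && seen.add(root)) {
                worklist.push(root);
            }
        }
        while (!worklist.isEmpty()) {
            Object current = worklist.pop();
            if (!(current instanceof Node)) {
                continue; // toy restriction: only Node fields are followed
            }
            for (Field field : Node.class.getDeclaredFields()) {
                if (Modifier.isStatic(field.getModifiers())) {
                    continue;
                }
                try {
                    Object child = field.get(current);
                    if (child != null && seen.add(child)) {
                        worklist.push(child); // newly discovered object
                    }
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.next = b;
        b.next = a; // cycles are fine: each object is visited once
        Node orphan = new Node("orphan");
        Set<Object> graph = reachableFrom(a);
        System.out.println("contains b: " + graph.contains(b));
        System.out.println("contains orphan: " + graph.contains(orphan));
    }
}
```

Objects that no root points to, like `orphan` here, never enter the graph and so would not be written to the image heap.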

Slide 39

Slide 39 text

@evacchi Points-To Analysis • determine which classes, methods, and fields are reachable at run time. • starts with all entry points, e.g., the main method of the application, • iteratively processes all transitively reachable methods until a fixed point is reached
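
The "iterate until a fixed point" step can be sketched as a worklist over a call graph. This is a deliberate simplification: a real points-to analysis also tracks which types flow to each call site, while this sketch assumes the call graph is already known. `ReachabilityDemo` and the method names in the map are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ReachabilityDemo {
    /** Collects all methods transitively reachable from the entry points. */
    static Set<String> reachable(Map<String, List<String>> calls, String... entryPoints) {
        Set<String> reached = new LinkedHashSet<>();
        Deque<String> worklist = new ArrayDeque<>(List.of(entryPoints));
        // Fixed point: stop when no unprocessed method is left.
        while (!worklist.isEmpty()) {
            String method = worklist.poll();
            if (!reached.add(method)) {
                continue; // already processed
            }
            for (String callee : calls.getOrDefault(method, List.of())) {
                if (!reached.contains(callee)) {
                    worklist.add(callee);
                }
            }
        }
        return reached;
    }

    public static void main(String[] args) {
        Map<String, List<String>> calls = Map.of(
                "Example.main", List.of("PrintStream.println"),
                "PrintStream.println", List.of("PrintStream.write"),
                "Unused.helper", List.of("PrintStream.println"));
        // prints [Example.main, PrintStream.println, PrintStream.write]
        System.out.println(reachable(calls, "Example.main"));
    }
}
```

`Unused.helper` is never reached from the entry point, so a compiler working from this result could leave it out of the binary.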

Slide 40

Slide 40 text

@evacchi Points-To Analysis (Example) • System.out.println("hello") • java.lang.String • System.out • java.io.PrintStream • java.io.FilterOutputStream • java.io.OutputStream • System

Slide 41

Slide 41 text

Ahead-of-Time Compilation • Methods marked as reachable by the points-to analysis are compiled • and placed in the text section of the executable.

Slide 42

Slide 42 text

Image Heap at Run-Time
• Execution at run-time starts with an already pre-populated Java heap
• Relocatable: references are relative to the start of the image heap
• This applies to objects of the image heap and to objects allocated at run-time
• i.e., objects allocated at run time also use relative references
• (use of a fixed register r14 on x64 architectures).
Figure: Build Time / Run Time timeline (static void Main, Framework Initialization, Application Initialization)
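
Why relative references make the heap relocatable can be shown with plain arithmetic. This sketch only models the idea with `long` values; `RelativeRefDemo` and the base addresses are made up for illustration, and the `heapBase` parameter stands in for the value held in the fixed register.

```java
final class RelativeRefDemo {
    /** Build time: store a reference as an offset from the start of the heap. */
    static long toOffset(long heapBase, long absoluteAddress) {
        return absoluteAddress - heapBase;
    }

    /** Run time: resolve an offset against wherever the heap was mapped. */
    static long toAddress(long heapBase, long offset) {
        return heapBase + offset;
    }

    public static void main(String[] args) {
        long offset = 0x40; // stored inside the image, independent of load address

        // Hypothetical base addresses for two different runs of the same binary.
        long firstRunBase = 0x7f00_0000_0000L;
        long secondRunBase = 0x7f55_1000_0000L;

        System.out.println(Long.toHexString(toAddress(firstRunBase, offset)));
        System.out.println(Long.toHexString(toAddress(secondRunBase, offset)));
    }
}
```

The stored offset never changes between runs; only the base does, so the image heap needs no pointer fix-ups when it is mapped at a different address.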

Slide 43

Slide 43 text

Perils of Static Initialization • https://twitter.com/reibitto/status/1384795560436113415
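
One well-known pitfall of this kind, not necessarily the one in the linked tweet, is a static field that captures environment-dependent state. The sketch below uses the hypothetical names `BuildTimeInitPerilDemo` and `Session`. On a JVM the timestamp is taken when the process first uses the class. If the class were initialized at build time, for example through the `--initialize-at-build-time` flag shown earlier, the value computed on the build machine would be stored in the image heap and reused by every run.

```java
final class BuildTimeInitPerilDemo {
    static final class Session {
        // Evaluated exactly once, by Session's class initializer.
        // JVM: once per process. Build-time init: once per build.
        static final long CREATED_AT_MILLIS = System.currentTimeMillis();
    }

    /** Age of the "session" relative to the given clock reading. */
    static long ageMillis(long nowMillis) {
        return nowMillis - Session.CREATED_AT_MILLIS;
    }

    public static void main(String[] args) {
        // With build-time initialization this would report the time since the
        // image was built, not the time since this process started.
        System.out.println("age in ms: " + ageMillis(System.currentTimeMillis()));
    }
}
```

The same concern applies to random seeds, environment variables, and host names read in a static initializer.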

Slide 45

Slide 45 text

Project Leyden

Slide 46

Slide 46 text

@evacchi Project Leyden • Goals • Address Java’s slow startup time • Reduce time to peak performance • Reduce memory footprint • Introduce static images at spec level (TCK) • stand-alone • closed-world

Slide 47

Slide 47 text

@evacchi Project Leyden • Goals • Address Java’s slow startup time • Reduce time to peak performance • Reduce memory footprint • Introduce static images at spec level (TCK) • stand-alone • closed-world “a spectrum of constraints”

Slide 48

Slide 48 text

Qbicc
• Experimental sandbox project for Leyden
• Intended for compiler developers and experts
• Goal: prototype approaches to native Java
• New self-contained codebase
• Allows experimenting with different trade-offs
• GraalVM’s choices are known,
• possible to explore different trade-offs of the solution space
• Currently: Java-based compiler to LLVM IR
• Future: different backend? (e.g. C2)
https://github.com/qbicc/qbicc
https://github.com/qbicc/qbicc/discussions
https://qbicc.zulipchat.com

Slide 49

Slide 49 text

Qbicc: Architecture
• Points-to analysis (static entry points)
• Flow graph copied between phases, dropping unreachable nodes
• Approaches to static init being investigated
Figure: phases ADD, ANALYZE, LOWER, GENERATE, each with steps TRANSFORM, CORRECT, OPTIMIZE, INTEGRITY

Slide 50

Slide 50 text

Qbicc: Static Initialization Trade-Offs
GraalVM
• mmap + offset
• initially static first + opt-out, now runtime first + opt-in
Qbicc
• Build-time serialization + fast deserialization routines
• as close to “all build-time” as possible
• investigating explicit opt-in (build-time, run-time, reinit) (code hints? annotations? language changes?)

Slide 51

Slide 51 text

Further Resources

Slide 52

Slide 52 text

References
• David Lloyd (J4K 2021) qbicc: Exploring the possibilities of Java native images
• Andrew Dinn (2021) Leyden: Lessons from Graal Native Static Java, GraalVM Native and OpenJDK
• C. Wimmer et al. (OOPSLA 2019) Initialize Once, Start Fast: Application Initialization at Build Time
• Dan Heidinga (QCon Plus 2021) Starting Fast and Recent Blog Posts
• Cover Art by François Baranger
• Duke Art at OpenJDK Wiki