Slide 1

Slide 1 text

Memory API: Patterns, Use Cases, and Performance
From the Panama Foreign Functions and Memory API
José Paumard, Java Developer Advocate, Java Platform Group
Rémi Forax, Maître de conférences, Université Gustave Eiffel

Slide 2

Slide 2 text

https://twitter.com/Nope!
https://github.com/forax
https://speakerdeck.com/forax
OpenJDK, ASM, Tatoo, Pro, etc…
One of the fathers of:
- invokedynamic (Java 7)
- Lambda (Java 8), Module (Java 9)
- Constant dynamic (Java 11)
- Record, text blocks, sealed types (Java 14 / 15)
- Valhalla (Java 25+)

Slide 3

Slide 3 text

https://twitter.com/JosePaumard
https://github.com/JosePaumard
https://www.youtube.com/c/JosePaumard01
https://www.youtube.com/user/java
https://www.youtube.com/hashtag/jepcafe
https://fr.slideshare.net/jpaumard
https://www.pluralsight.com/authors/jose-paumard
https://dev.java

Slide 4

Slide 4 text

10/7/2024 Copyright © 2024, Oracle and/or its affiliates 4 https://dev.java/

Slide 5

Slide 5 text

Tune in!
- Inside Java Newscast
- JEP Café
- Road To 21 series
- Inside.java
- Inside Java Podcast
- Sip of Java
- Cracking the Java coding interview

Slide 6

Slide 6 text

https://openjdk.org/ OpenJDK is the place where it all happens

Slide 7

Slide 7 text

https://jdk.java.net/ OpenJDK is the place where it all happens

Slide 8

Slide 8 text

Panama

Slide 9

Slide 9 text

First Preview in the JDK 14

Slide 10

Slide 10 text

Final Feature in the JDK 22

Slide 11

Slide 11 text

Final Feature in the JDK 22 https://openjdk.org/projects/panama/

Slide 12

Slide 12 text

Demo time!

Slide 13

Slide 13 text

What is Panama About?
- Heal the rift between Java and C
- Fix issues in the Java NIO API
- Namely, fix and update what you can do with ByteBuffer
- ByteBuffer was released in Java 4, in 2002

Slide 14

Slide 14 text

What's New in Java 4? (2002)
- New assert keyword
- Exception chaining
- XML Parser
- Java NIO! (New Input / Output, JSR 51)

Slide 15

Slide 15 text

What's New in Java 4?
- New assert keyword
- Exception chaining
- XML Parser
- Java NIO

Slide 16

Slide 16 text

Dynamic Web Site in 2002
- IE 5.5 SP2? IE 6?
- ActiveXObject("Microsoft.XMLHTTP")
- Tomcat 3, maybe 4?

Slide 17

Slide 17 text

The World Before Java 4
[Diagram: a Servlet (doPost(), doGet(), ...) accesses array[i] in heap memory; read() and write() work on native memory, so the data has to be copied between the two (COPY!)]

Slide 18

Slide 18 text

The World with NIO
[Diagram: a Servlet (doPost(), doGet(), ...) calls buf.put() and buf.get() on a ByteBuffer (position, limit, capacity, ...); read() and write() work on the same native memory]

Slide 19

Slide 19 text

Creating a ByteBuffer

Off-heap allocation:
    var buffer = ByteBuffer.allocateDirect(1_024); // the capacity is an int

File mapping:
    var mapped = fileChannel.map(READ_WRITE, position, size); // position and size are longs
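The two creation paths on this slide can be sketched as a small runnable program. This is only an illustration: the class name, the method names, and the temporary scratch file are assumptions, not part of the talk.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ByteBufferCreation {

    // Off-heap allocation: the capacity parameter is an int.
    static int offHeapRoundTrip() {
        ByteBuffer buffer = ByteBuffer.allocateDirect(1_024);
        buffer.putInt(4, 42);      // absolute write at byte offset 4
        return buffer.getInt(4);   // absolute read at the same offset
    }

    // File mapping: position and size are longs, but the size may not exceed Integer.MAX_VALUE.
    static int mappedRoundTrip() {
        try {
            Path path = Files.createTempFile("mapped", ".bin"); // hypothetical scratch file
            path.toFile().deleteOnExit();
            try (FileChannel channel = FileChannel.open(
                    path, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                ByteBuffer mapped = channel.map(FileChannel.MapMode.READ_WRITE, 0L, 64L);
                mapped.putInt(0, 7);
                return mapped.getInt(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(offHeapRoundTrip());
        System.out.println(mappedRoundTrip());
    }
}
```

Note that neither buffer has a close() method: the native memory and the mapping are only released once the garbage collector has noticed that the buffer is unreachable.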

Slide 20

Slide 20 text

Issues with ByteBuffers

Slide 21

Slide 21 text

Demo time!

Slide 22

Slide 22 text

Issues with the ByteBuffer API
- Too high level for a Memory Access API: position, capacity, reset are not needed
- 32-bit indexing only
- Allows unaligned access, which may be very slow
- Non-deterministic deallocation! Closing a mapped file does not close the ByteBuffer

Slide 23

Slide 23 text

How Does Deallocation Work?
The GC
- selects a region containing the ByteBuffer
- then sees that the ByteBuffer is dead
- then a Cleaner code (weakref) is pushed to a Cleaner queue
Later, a cleaner thread dequeues the Cleaner code and calls free on the off-heap memory (or not…)

Slide 24

Slide 24 text

Welcome to Panama
- Panama brings a new API
- At a lower level than the ByteBuffer API
- The goal is to fix these issues
- Called the MemorySegment API

Slide 25

Slide 25 text

Introducing MemorySegment

Slide 26

Slide 26 text

MemorySegment
A MemorySegment:
- is safe (cannot be used once freed)
- gives you control over the allocation / deallocation
- brings close to C performance
- offers direct access, indexed 64-bit access, structured access
- offers opt-in unsafe access (for C interop, may crash later)
- lets ByteBuffer be retrofitted on top

Slide 27

Slide 27 text

What about sun.misc.Unsafe?
It is unsafe!
- Close to C performance
- No use-after-free protection (security)
- Can peek/poke everywhere (may crash later)
- No null check for on-heap array access (may crash)
Memory access methods are
- deprecated for removal (JEP 471)
- warnings since 2006

Slide 28

Slide 28 text

Demo time!

Slide 29

Slide 29 text

Alignment
Most CPUs require your data to be aligned in memory
These are properly aligned
[Diagram: byte, short, int, short and byte values placed at addresses 0x00FFA000, 0x00FFA004, 0x00FFA008, 0x00FFA00C, 0x00FFA010]

Slide 30

Slide 30 text

Alignment
Most CPUs require your data to be aligned in memory
These are misaligned
[Diagram: a short and an int straddling the address boundaries between 0x00FFA000 and 0x00FFA010]

Slide 31

Slide 31 text

The MemorySegment API
Off-heap allocation, direct access

    var arena = Arena.global();
    var segment = arena.allocate(1_024); // the size is a long, the memory is off-heap
    segment.set(ValueLayout.JAVA_INT, 4L, 42);
    var value = segment.get(ValueLayout.JAVA_INT, 4L);
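A self-contained version of the same direct access, as a sketch. It assumes JDK 22 or later and swaps Arena.global() for a confined arena so that the memory is released at the end of the try block; the class and method names are made up for the example.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

class DirectAccess {

    // Writes an int at byte offset 4 and reads it back.
    static int writeThenRead() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate(1_024);
            segment.set(ValueLayout.JAVA_INT, 4L, 42);
            return segment.get(ValueLayout.JAVA_INT, 4L);
        }
    }

    // An access past the end of the segment is rejected with an exception.
    static boolean outOfBoundsIsRejected() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate(1_024);
            segment.get(ValueLayout.JAVA_INT, 1_024L); // first byte after the segment
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead());
        System.out.println(outOfBoundsIsRejected());
    }
}
```

The offset passed to get() and set() is a byte offset, not an element index.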

Slide 32

Slide 32 text

The MemorySegment API
Heap allocation, indexed access

    var ints = new int[] {1, 2, 3, 4};
    var segment = MemorySegment.ofArray(ints); // on-heap
    segment.setAtIndex(ValueLayout.JAVA_INT, 2L, 65);
    var cell = segment.getAtIndex(ValueLayout.JAVA_INT, 2L);
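The sketch below (hypothetical class and method names, JDK 22 or later assumed) shows two properties of the indexed access above: a heap segment is a view over the array, so writes are visible in the array itself, and index i corresponds to byte offset i * 4 for JAVA_INT.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

class IndexedAccess {

    // Writing through the segment changes the backing array.
    static int[] writeThroughToArray() {
        int[] ints = {1, 2, 3, 4};
        MemorySegment segment = MemorySegment.ofArray(ints);
        segment.setAtIndex(ValueLayout.JAVA_INT, 2L, 65);
        return ints;
    }

    // getAtIndex(layout, i) reads the same cell as get(layout, i * layout.byteSize()).
    static boolean indexAndOffsetAgree() {
        MemorySegment segment = MemorySegment.ofArray(new int[] {1, 2, 3, 4});
        long byteOffset = 2L * ValueLayout.JAVA_INT.byteSize();
        return segment.getAtIndex(ValueLayout.JAVA_INT, 2L)
                == segment.get(ValueLayout.JAVA_INT, byteOffset);
    }

    public static void main(String[] args) {
        System.out.println(writeThroughToArray()[2]);
        System.out.println(indexAndOffsetAgree());
    }
}
```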

Slide 33

Slide 33 text

The MemorySegment API: String
Setting strings adds a trailing \0

    var segment = Arena.global().allocate(512L);
    println("segment " + segment);
    segment.setString(0, "hello", UTF_8); // adds a '\0'
    var text = segment.getString(0, UTF_8);
    println(text);
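To make the trailing '\0' visible, the following sketch pre-fills the segment with a non-zero byte before writing the string, then inspects the byte that follows the five bytes of "hello". The class name, method names, and the 0x7F filler are assumptions chosen for the example.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.nio.charset.StandardCharsets;

class StringAccess {

    // Writes a string and reads it back up to the terminator.
    static String roundTrip() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate(16L);
            segment.setString(0, "hello", StandardCharsets.UTF_8);
            return segment.getString(0, StandardCharsets.UTF_8);
        }
    }

    // Returns the byte stored right after the five bytes of "hello".
    static byte terminator() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate(16L);
            segment.fill((byte) 0x7F);                       // make every byte non-zero first
            segment.setString(0, "hello", StandardCharsets.UTF_8);
            return segment.get(ValueLayout.JAVA_BYTE, 5L);   // overwritten by the terminator
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
        System.out.println(terminator());
    }
}
```

The segment therefore needs room for the encoded bytes plus the terminator.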

Slide 34

Slide 34 text

The MemorySegment API: File Mapping
Copy from on-heap to off-heap

    var ints = new int[] {1, 2, 3, 4};
    var arraySegment = MemorySegment.ofArray(ints); // on-heap
    var offHeapSegment = Arena.global().allocate(64L); // off-heap
    offHeapSegment.copyFrom(arraySegment);

Slide 35

Slide 35 text

The MemorySegment API: File Mapping
Writing data to a mapped file: you need a ByteBuffer

    var ints = new int[] {1, 2, 3, 4};
    var arraySegment = MemorySegment.ofArray(ints); // on-heap
    var offHeapSegment = Arena.global().allocate(64L); // off-heap
    offHeapSegment.copyFrom(arraySegment);
    var byteBuffer = offHeapSegment.asByteBuffer(); // this is a view!
    byteBuffer.limit(ints.length * (int) ValueLayout.JAVA_INT.byteSize());
    try (var file = FileChannel.open(path, CREATE, WRITE)) {
        file.write(byteBuffer);
    }

Slide 36

Slide 36 text

Introducing Arena to Allocate / Deallocate

Slide 37

Slide 37 text

Demo time!

Slide 38

Slide 38 text

Introducing Arena
- An Arena can create off-heap memory segments
- It initializes memory segments with zeroes
- It is AutoCloseable (more on this in a minute)
- It deallocates the memory segments it created on close()
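These properties can be checked with a short sketch, assuming JDK 22 or later; the class and method names are invented for the illustration.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

class ArenaLifecycle {

    // A freshly allocated segment only contains zeroes.
    static boolean zeroInitialized() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate(64L);
            for (long i = 0; i < segment.byteSize(); i++) {
                if (segment.get(ValueLayout.JAVA_BYTE, i) != 0) {
                    return false;
                }
            }
            return true;
        }
    }

    // Once the arena is closed, its segments can no longer be accessed.
    static boolean rejectsUseAfterClose() {
        MemorySegment segment;
        try (Arena arena = Arena.ofConfined()) {
            segment = arena.allocate(64L);
        } // close() deallocates the segment here
        try {
            segment.get(ValueLayout.JAVA_BYTE, 0L);
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(zeroInitialized());
        System.out.println(rejectsUseAfterClose());
    }
}
```

The try-with-resources block is what makes the deallocation deterministic, in contrast with ByteBuffer.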

Slide 39

Slide 39 text

What is this Arena object?
There are four of them:

    var global = Arena.global(); // singleton
    var confined = Arena.ofConfined();
    var shared = Arena.ofShared();
    var auto = Arena.ofAuto(); // supports legacy ByteBuffer semantics

Slide 40

Slide 40 text

What is this Arena object?
There are four of them:

              Closeable   Bounded Lifetime   Shared among threads
    Confined  Yes         Yes                No
    Shared    Yes         Yes                Yes
    Global    No          No                 Yes
    Auto      No          Yes                Yes
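Some cells of this table can be observed directly. The sketch below assumes JDK 22 or later and uses hypothetical helper names: it checks that the global and auto arenas refuse close(), and that a confined segment rejects a read from another thread while a shared one accepts it.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.util.concurrent.atomic.AtomicBoolean;

class ArenaKinds {

    // "Closeable: No" means that close() throws UnsupportedOperationException.
    static boolean closeIsUnsupported(Arena arena) {
        try {
            arena.close();
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    static boolean confinedRejectsOtherThread() {
        try (Arena arena = Arena.ofConfined()) {
            return !readableFromOtherThread(arena.allocate(8L));
        }
    }

    static boolean sharedAllowsOtherThread() {
        try (Arena arena = Arena.ofShared()) {
            return readableFromOtherThread(arena.allocate(8L));
        }
    }

    // Tries to read one byte of the segment from a freshly started thread.
    private static boolean readableFromOtherThread(MemorySegment segment) {
        AtomicBoolean readable = new AtomicBoolean();
        Thread reader = new Thread(() -> {
            try {
                segment.get(ValueLayout.JAVA_BYTE, 0L);
                readable.set(true);
            } catch (WrongThreadException e) {
                readable.set(false);
            }
        });
        reader.start();
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return readable.get();
    }

    public static void main(String[] args) {
        System.out.println(closeIsUnsupported(Arena.global()));
        System.out.println(closeIsUnsupported(Arena.ofAuto()));
        System.out.println(confinedRejectsOtherThread());
        System.out.println(sharedAllowsOtherThread());
    }
}
```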

Slide 41

Slide 41 text

Benchmarks
int[] vs memorySegment.get(JAVA_INT, ...)

    Array      0.728 ± 0.009 ns/op
    OfArray    1.358 ± 0.003 ns/op
    Confined   1.255 ± 0.002 ns/op
    Auto       1.254 ± 0.002 ns/op
    Shared     1.254 ± 0.013 ns/op
    Global     1.258 ± 0.026 ns/op
    Unsafe     0.627 ± 0.001 ns/op

Slide 42

Slide 42 text

Benchmarks
Looping and summing 512 ints

    Array      128.338 ± 0.084 ns/op
    OfArray    131.927 ± 0.761 ns/op
    Confined   131.829 ± 0.077 ns/op
    Auto       131.832 ± 0.491 ns/op
    Shared     131.760 ± 0.068 ns/op
    Global     131.727 ± 0.137 ns/op
    Unsafe     128.083 ± 0.131 ns/op

Slide 43

Slide 43 text

Performance
Access time is independent of the type of arena
For random direct access
- Overhead is significant (about 2x)
- 3 checks
  1. Is it the right thread?
  2. Has the Arena been closed?
  3. Is the access in bounds?

Slide 44

Slide 44 text

Performance
Access time is independent of the type of arena
For loop + indexed access
- Fixed cost at the beginning of the loop
- The 3 checks are hoisted out of the loop
  1. Is it the right thread? Done once
  2. Has the Arena been closed? Done once
  3. Is the access in bounds? Elided
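The loop shape measured in the "looping and summing 512 ints" benchmark presumably looks like the sketch below. The benchmark source is not part of the slides, so the class name, the method names, and the way the segment is filled are assumptions; whether the JIT actually hoists the checks depends on the JVM and is not something this code can show.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

class SumLoop {

    // A counted loop with indexed access: the pattern the JIT can optimize.
    static long sum(MemorySegment segment) {
        long count = segment.byteSize() / ValueLayout.JAVA_INT.byteSize();
        long sum = 0;
        for (long i = 0; i < count; i++) {
            sum += segment.getAtIndex(ValueLayout.JAVA_INT, i);
        }
        return sum;
    }

    // Fills a segment with 0, 1, ..., 511 and sums it.
    static long sumOf512Ints() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate(
                    512 * ValueLayout.JAVA_INT.byteSize(),
                    ValueLayout.JAVA_INT.byteAlignment());
            for (int i = 0; i < 512; i++) {
                segment.setAtIndex(ValueLayout.JAVA_INT, i, i);
            }
            return sum(segment);
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOf512Ints());
    }
}
```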

Slide 45

Slide 45 text

After the Break
- Memory fragmentation
- Application integrity
- JExtract
- Memory Layout
- Structured memory access with offsets and VarHandle
- Using Records to create a simple access API

Slide 46

Slide 46 text

Coffee Break!

Slide 47

Slide 47 text

Welcome Back!
About Arenas:
- Different arena types with different semantics
- Same access time
What about allocation / deallocation?

Slide 48

Slide 48 text

Benchmarks
Allocation/Deallocation of an Arena + MemorySegment

    Array               2.522 ± 0.015 ns/op
    OfArray             6.494 ± 0.093 ns/op
    Confined           82.287 ± 1.530 ns/op
    Unsafe (malloc)    22.834 ± 0.097 ns/op
    Unsafe with init   72.338 ± 0.390 ns/op

Slide 49

Slide 49 text

How Do Arenas Allocate Segments?
With two arenas

    var arena1 = Arena.ofConfined();
    var s11 = arena1.allocate(...);

[Diagram: the Arena lives in heap memory (JVM); s11 is allocated in native memory]

Slide 50

Slide 50 text

How Do Arenas Allocate Segments?
With two arenas

    var arena1 = Arena.ofConfined();
    var s11 = arena1.allocate(...);
    var s12 = arena1.allocate(...);

[Diagram: the Arena lives in heap memory (JVM); s11 and s12 are allocated in native memory]

Slide 51

Slide 51 text

How Do Arenas Allocate Segments?
With two arenas

    var arena1 = Arena.ofConfined();
    var arena2 = Arena.ofConfined();
    var s21 = arena2.allocate(...);
    var s11 = arena1.allocate(...);
    var s22 = arena2.allocate(...);
    var s12 = arena1.allocate(...);
    var s23 = arena2.allocate(...);
    var s24 = arena2.allocate(...);

[Diagram: the segments of arena1 and arena2 are interleaved in memory]

Slide 52

Slide 52 text

How Do Arenas Allocate Segments?
Then arena1 is closed
And all its memory segments are deallocated

    arena1.close();

[Diagram: only s21, s22, s23 and s24 remain in memory]

Slide 53

Slide 53 text

How Do Arenas Allocate Segments?
Then arena3 is created

    var arena3 = Arena.ofConfined();
    var s31 = arena3.allocate(...);

[Diagram: memory holding s21, s22, s23, s24 and s31]

Slide 54

Slide 54 text

How Do Arenas Allocate Segments?
Then arena3 is created

    var arena3 = Arena.ofConfined();
    var s31 = arena3.allocate(...);
    var s32 = arena3.allocate(...);

[Diagram: memory holding s21, s22, s23, s24, s31 and s32, with two locations marked X]

Slide 55

Slide 55 text

How Do Arenas Allocate Segments?
Then arena3 is created
And you end up with fragmentation!

    var arena3 = Arena.ofConfined();
    var s31 = arena3.allocate(...);
    var s32 = arena3.allocate(...);

[Diagram: memory holding the segments of arena2 and arena3, with gaps between them]

Slide 56

Slide 56 text

Fragmentation
You end up with holes in your memory
It leads to two problems:
- Allocation is slow, you need to find a large enough space for your memory segment
- Not enough contiguous free memory may prevent the creation of a large memory segment

Slide 57

Slide 57 text

Benchmarks
Allocation/Deallocation of an Arena + MemorySegment

    Array                 2.522 ± 0.015 ns/op
    OfArray               6.494 ± 0.093 ns/op
    Confined             82.287 ± 1.530 ns/op
    Auto                434.694 ± 247.868 ns/op
    Shared             6696.144 ± 37.833 ns/op
    Unsafe (malloc)      22.834 ± 0.097 ns/op
    Unsafe with init     72.338 ± 0.390 ns/op

Slide 58

Slide 58 text

Deallocation / close()
ofAuto(): close() has the same semantics as ByteBuffer
- Slooooow!
- In the worst case scenario it calls System.gc() (even slooooooower!)

Slide 59

Slide 59 text

Deallocation / close()
ofShared(): you want to avoid having a volatile access in get() (to know if the arena has been closed)
close()
- asks all threads to go to a (GC) safepoint
- checks whether the method on top of the stack is annotated as performing an access
- checks whether the locals contain the closing arena
⇒ Linear with the number of platform threads

Slide 60

Slide 60 text

Demo time!

Slide 61

Slide 61 text

Application Integrity

Slide 62

Slide 62 text

Memory Integrity is a Big Deal!

Slide 63

Slide 63 text

Unsafe MemorySegment
Memory segments are safe by default
- even creating a memory segment from a long is safe (byteSize is 0)
MemorySegment.reinterpret(newSize)
- opt-in to unsafe
- requires --enable-native-access on the command line
- emits a warning in Java 23, will be an error in the future
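A minimal sketch of both points, assuming JDK 22 or later. The class name, the method names, and the arbitrary address 0x1000 are made up for the example. Running it without --enable-native-access is expected to print the warning mentioned above, because reinterpret() is a restricted method.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

class UnsafeReinterpret {

    // A segment created from a raw address has a size of zero...
    static long zeroLengthSize() {
        return MemorySegment.ofAddress(0x1000L).byteSize();
    }

    // ...so every access to it fails the bounds check.
    static boolean zeroLengthRejectsAccess() {
        try {
            MemorySegment.ofAddress(0x1000L).get(ValueLayout.JAVA_BYTE, 0L);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    // Opting in: reinterpret() gives the raw address a size, on your responsibility.
    static int reinterpretRoundTrip() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment original = arena.allocate(ValueLayout.JAVA_INT);
            original.set(ValueLayout.JAVA_INT, 0L, 42);
            MemorySegment raw = MemorySegment.ofAddress(original.address());    // size 0
            MemorySegment resized = raw.reinterpret(ValueLayout.JAVA_INT.byteSize());
            return resized.get(ValueLayout.JAVA_INT, 0L);   // read while the arena is still open
        }
    }

    public static void main(String[] args) {
        System.out.println(zeroLengthSize());
        System.out.println(zeroLengthRejectsAccess());
        System.out.println(reinterpretRoundTrip());
    }
}
```

The resized segment is no longer tied to the arena: reading it after the arena is closed would be a use after free, which is exactly the protection that reinterpret() gives up.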

Slide 64

Slide 64 text

Draft JEP: Integrity by Default

Slide 65

Slide 65 text

Draft JEP: Integrity by Default
- JEP 261: Module System
- JEP 260: Encapsulate Most Internal APIs
- JEP 396: Strongly Encapsulate JDK Internals by Default
- JEP 403: Strongly Encapsulate JDK Internals
- JEP 451: Prepare to Disallow the Dynamic Loading of Agents
- JEP 471: Deprecate the Memory-Access Methods in sun.misc.Unsafe for Removal
- JEP 472: Prepare to Restrict the Use of JNI

Slide 66

Slide 66 text

Demo time!

Slide 67

Slide 67 text

JExtract and MemoryLayout

Slide 68

Slide 68 text

What is JExtract?
Simplify C interoperability
- JExtract takes a .h file and creates Java classes from it
- It creates one class for the .h with the function definitions
- Then one per struct

Slide 69

Slide 69 text

What is JExtract?
- Uses LLVM internally
  - to correctly parse C declarations
  - and to extract platform/OS definitions (e.g. what is the size of an int)
- It is an external tool that needs to be downloaded separately
- JExtract 22 works fine with JDK 23+

Slide 70

Slide 70 text

What is MemoryLayout?
It is an interface that describes a piece of memory

    MemoryLayout
    - SequenceLayout
    - GroupLayout
      - StructLayout
      - UnionLayout
    - PaddingLayout
    - ValueLayout

Slide 71

Slide 71 text

A MemoryLayout can be Named
Defining a struct Point

    var pointLayout = MemoryLayout.structLayout(
        ValueLayout.JAVA_INT,
        ValueLayout.JAVA_INT
    );

Slide 72

Slide 72 text

MemoryLayout Size and Offset
Size of the struct Point, offset of x and y in Point

    var pointLayout = MemoryLayout.structLayout(
        ValueLayout.JAVA_INT,
        ValueLayout.JAVA_INT
    );
    long pointLayoutSize = pointLayout.byteSize();
    // byteOffset() takes path elements; groupElement(0) selects the first member
    long xOffset = pointLayout.byteOffset(MemoryLayout.PathElement.groupElement(0)); // by index

Slide 73

Slide 73 text

MemoryLayout Size and Offset
Size of the struct Point, offset of x and y in Point

    var pointLayout = MemoryLayout.structLayout(
        ValueLayout.JAVA_INT.withName("x"),
        ValueLayout.JAVA_INT.withName("y")
    ).withName("point");
    long pointLayoutSize = pointLayout.byteSize();
    // byteOffset() takes path elements, selected either by index or by name
    long xOffset = pointLayout.byteOffset(MemoryLayout.PathElement.groupElement(0));   // by index
    long yOffset = pointLayout.byteOffset(MemoryLayout.PathElement.groupElement("y")); // by name
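As a runnable sketch (JDK 22 or later assumed, hypothetical class and method names), the named Point layout and its size and offsets can be written as follows.

```java
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;

class PointLayout {

    static final StructLayout POINT = MemoryLayout.structLayout(
            ValueLayout.JAVA_INT.withName("x"),
            ValueLayout.JAVA_INT.withName("y")
    ).withName("point");

    // Two 4-byte ints, no padding needed.
    static long size() {
        return POINT.byteSize();
    }

    // Member selected by its position in the struct.
    static long xOffset() {
        return POINT.byteOffset(PathElement.groupElement(0));
    }

    // Member selected by its name.
    static long yOffset() {
        return POINT.byteOffset(PathElement.groupElement("y"));
    }

    public static void main(String[] args) {
        System.out.println(size());     // 8
        System.out.println(xOffset());  // 0
        System.out.println(yOffset());  // 4
    }
}
```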

Slide 74

Slide 74 text

Alignment and Padding
Sometimes memory layouts need padding

    struct {
        char kind;
        int payload;
        char extra;
    }

    static final MemoryLayout LAYOUT = MemoryLayout.structLayout(
        ValueLayout.JAVA_BYTE.withName("kind"),
        MemoryLayout.paddingLayout(3),
        ValueLayout.JAVA_INT.withName("payload"),
        ValueLayout.JAVA_BYTE.withName("extra"),
        MemoryLayout.paddingLayout(3)
    );

[Diagram: bit offsets 0 to 96, showing kind, padding, payload, extra, padding]
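The effect of the padding can be checked with a sketch like the one below (JDK 22 or later assumed; class and method names are hypothetical). It also shows what happens when the padding is left out: structLayout() rejects a member that would sit at an offset incompatible with its alignment.

```java
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.ValueLayout;

class PaddedLayout {

    static final MemoryLayout LAYOUT = MemoryLayout.structLayout(
            ValueLayout.JAVA_BYTE.withName("kind"),
            MemoryLayout.paddingLayout(3),             // moves payload to offset 4
            ValueLayout.JAVA_INT.withName("payload"),
            ValueLayout.JAVA_BYTE.withName("extra"),
            MemoryLayout.paddingLayout(3)              // rounds the size up to 12
    );

    static long size() {
        return LAYOUT.byteSize();
    }

    static long payloadOffset() {
        return LAYOUT.byteOffset(PathElement.groupElement("payload"));
    }

    // Without padding, the int would start at offset 1, which is not 4-byte aligned.
    static boolean unpaddedIsRejected() {
        try {
            MemoryLayout.structLayout(
                    ValueLayout.JAVA_BYTE.withName("kind"),
                    ValueLayout.JAVA_INT.withName("payload"));
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(size());
        System.out.println(payloadOffset());
        System.out.println(unpaddedIsRejected());
    }
}
```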

Slide 75

Slide 75 text

Demo time!

Slide 76

Slide 76 text

VarHandle

Slide 77

Slide 77 text

What is a VarHandle?
An object that gives access to fields with different semantics:
- a get / set access (plain, opaque, volatile)
- and concurrent access: compareAndSet, getAndAdd, …
It hides the offset and size computations to access the elements of your memory layout
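A sketch of VarHandle access to the fields of a Point struct, assuming the JDK 22 final API, where a handle derived from a layout takes the segment plus a base offset as coordinates. The class, field, and method names are invented for the illustration.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;

class PointHandles {

    static final StructLayout POINT = MemoryLayout.structLayout(
            ValueLayout.JAVA_INT.withName("x"),
            ValueLayout.JAVA_INT.withName("y"));

    // The handles compute the offset of each field from the layout.
    static final VarHandle X = POINT.varHandle(PathElement.groupElement("x"));
    static final VarHandle Y = POINT.varHandle(PathElement.groupElement("y"));

    // Plain set / get: the second argument is the base offset of the struct in the segment.
    static int sumOfFields() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment point = arena.allocate(POINT);
            X.set(point, 0L, 3);
            Y.set(point, 0L, 4);
            return (int) X.get(point, 0L) + (int) Y.get(point, 0L);
        }
    }

    // Concurrent access modes are available on the same handle.
    static int atomicIncrement() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment point = arena.allocate(POINT);
            X.set(point, 0L, 10);
            int previous = (int) X.getAndAdd(point, 0L, 1);   // returns the old value, 10
            return previous + (int) X.getVolatile(point, 0L); // 10 + 11
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfFields());
        System.out.println(atomicIncrement());
    }
}
```

Keeping the handles in static final fields is a common choice, since it lets the JIT treat them as constants.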

Slide 78

Slide 78 text

Demo time!

Slide 79

Slide 79 text

VarHandle Caveats
1) The compiler doesn't do any type checking: it trusts you! (and the different IDEs are not there to help you…)
2) It allows conversion at runtime
   It can be convenient, but can lead to autoboxing
   You can use withInvokeExactBehavior()
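The difference between the default behavior and withInvokeExactBehavior() can be sketched as follows, assuming the JDK 22 final API; the class and method names are hypothetical. By default a call site with the wrong static types is silently adapted, possibly with boxing; in exact mode the same call site throws WrongMethodTypeException.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;
import java.lang.invoke.WrongMethodTypeException;

class ExactHandle {

    static final StructLayout POINT = MemoryLayout.structLayout(
            ValueLayout.JAVA_INT.withName("x"),
            ValueLayout.JAVA_INT.withName("y"));

    static final VarHandle X = POINT.varHandle(PathElement.groupElement("x"));
    static final VarHandle EXACT_X = X.withInvokeExactBehavior();

    // Default behavior: the int offset is widened and the result is boxed, without any warning.
    static int lenientAccess() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment point = arena.allocate(POINT);
            X.set(point, 0, 42);             // 0 is an int, converted to long at runtime
            Object boxed = X.get(point, 0L); // the int result is boxed to an Integer
            return (Integer) boxed;
        }
    }

    // Exact behavior: a call site that asks for an Object instead of an int is rejected.
    static boolean exactRejectsBoxing() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment point = arena.allocate(POINT);
            EXACT_X.set(point, 0L, 42);
            try {
                Object boxed = EXACT_X.get(point, 0L);
                return false;
            } catch (WrongMethodTypeException e) {
                return true;
            }
        }
    }

    // Exact behavior with the exact types: (MemorySegment, long) -> int.
    static int exactAccess() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment point = arena.allocate(POINT);
            EXACT_X.set(point, 0L, 42);
            return (int) EXACT_X.get(point, 0L);
        }
    }

    public static void main(String[] args) {
        System.out.println(lenientAccess());
        System.out.println(exactRejectsBoxing());
        System.out.println(exactAccess());
    }
}
```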

Slide 80

Slide 80 text

Benchmarks
Compute the sum of 512 point.x + point.y

    OfArray with offset      214.479 ± 1.712 ns/op
    OfArray with VarHandle   141.404 ± 0.081 ns/op
    Unsafe with offset       137.518 ± 0.476 ns/op
    Arena with offset        212.881 ± 3.436 ns/op
    Arena with VarHandle     141.190 ± 0.107 ns/op

Slide 81

Slide 81 text

Using Offsets or VarHandle?
- Offset computation by the user is slow
- VarHandle offers an access pattern to the JVM, which gives you better performance
- JExtract does not create VarHandles, only offsets

Slide 82

Slide 82 text

Quest For a Simple API

Slide 83

Slide 83 text

Record Mapping API
- Automatically maps MemoryLayout to Records, the API is simpler
- If the JVM can see the record creation and its last use, then the escape analysis can remove the allocation
- Which will be guaranteed when Valhalla is there!

Slide 84

Slide 84 text

Demo time!

Slide 85

Slide 85 text

Access API
- It allows users to use strings for struct fields: simpler API
- The string must be a constant
- If not, the JIT will generate a cascade of if-else
- Performance cliff, hard to diagnose

Slide 86

Slide 86 text

Panama rocks!