Executing C, C++ and Fortran Efficiently on the Java Virtual Machine via LLVM IR Manuel Rigger Johannes Kepler University Linz, Austria Computer Laboratory Programming Research Group Seminar, University of Cambridge, 2 March 2018
Why?
• Unchecked accesses
• Manual memory management
• Undefined behavior
• Many existing safer alternatives are based on “unsafe” compilers or binary code
Buffer overflows are still a serious problem.
“A sufficiently advanced compiler is indistinguishable from an adversary.” – John Regehr (https://blog.regehr.org)
Why the Java Virtual Machine?
• Sandboxed execution
• Garbage collection
• Existing JIT compiler
• Safe implementation language
• Part of the multi-lingual GraalVM
Sulong as Part of GraalVM
[Figure: GraalVM stack. The Truffle Framework sits on top of the Graal Compiler, which runs either on the Java HotSpot VM through the JVM Compiler Interface (JVMCI, JEP 243) or on the Substrate VM.]
http://www.oracle.com/technetwork/oracle-labs/program-languages
Truffle and Graal Contributors
Oracle: Danilo Ansaloni, Stefan Anzinger, Cosmin Basca, Daniele Bonetta, Matthias Brantner, Petr Chalupa, Jürgen Christ, Laurent Daynès, Gilles Duboscq, Martin Entlicher, Bastian Hossbach, Christian Humer, Mick Jordan, Vojin Jovanovic, Peter Kessler, David Leopoldseder, Kevin Menard, Jakub Podlešák, Aleksandar Prokopec, Tom Rodriguez, Roland Schatz, Chris Seaton, Doug Simon, Štěpán Šindelář, Zbyněk Šlajchrt, Lukas Stadler, Codrut Stancu, Jan Štola, Jaroslav Tulach, Michael Van De Vanter, Adam Welc, Christian Wimmer, Christian Wirth, Paul Wögerer, Mario Wolczko, Andreas Wöß, Thomas Würthinger
JKU Linz: Prof. Hanspeter Mössenböck, Benoit Daloze, Josef Eisl, Thomas Feichtinger, Matthias Grimmer, Christian Häubl, Josef Haider, Christian Huber, Stefan Marr, Manuel Rigger, Stefan Rumzucker, Bernhard Urban, Thomas Pointhuber, Daniel Pekarek, Jacob Kreindl, Mario Kahlhofer
University of Edinburgh: Christophe Dubach, Juan José Fumero Alfonso, Ranjeet Singh, Toomas Remmelg
LaBRI: Floréal Morandat
University of California, Irvine: Prof. Michael Franz, Gulfem Savrun Yeniceri, Wei Zhang
Purdue University: Prof. Jan Vitek, Tomas Kalibera, Petr Maj, Lei Zhao
T. U. Dortmund: Prof. Peter Marwedel, Helena Kotthaus, Ingo Korb
University of California, Davis: Prof. Duncan Temple Lang, Nicholas Ulle
University of Lugano, Switzerland: Prof. Walter Binder, Sun Haiyang, Yudi Zheng
Oracle Interns: Brian Belleville, Miguel Garcia, Shams Imam, Alexey Karyakin, Stephen Kell, Andreas Kunft, Volker Lanting, Gero Leinemann, Julian Lettner, David Piorkowski, Gregor Richards, Robert Seilbeck, Rifat Shariyar
Oracle Alumni: Erik Eckstein, Michael Haupt, Christos Kotselidis, Hyunjin Lee, David Leibs, Chris Thalinger, Till Westmann
Structure of the Talk
• Execution and compilation of LLVM IR (Sulong)
• Memory safety (Safe Sulong) and performance evaluation
• Introspection to increase the robustness of libraries
• Challenges of executing C on the Java Virtual Machine
System Overview
[Figure: Frontends (Clang for C and C++, GCC for Fortran, other LLVM frontends) emit LLVM IR, which can be processed by LLVM tools. The LLVM IR interpreter is built on Truffle and runs on the JVM, where the Graal compiler compiles it.]
Manuel Rigger, et al. Bringing low-level languages to the JVM: efficient execution of LLVM IR on Truffle. In Proceedings of VMIL 2016
Executing LLVM IR with Sulong
[Figure: execution cycle. LLVM IR program → create executable interpreter nodes → interpret the program → compile often executed function → execute the compiled code → deoptimize (back to interpreting the program).]
Implementation of Operations
[Figure: abstract syntax tree. A write node for %2 has an add node as its child; the add node has a read node for %i.0 and the literal 1 as children.]

Executable AST node (literal). Nodes return their result in an execute() method:

    class LLVMI32LiteralNode extends LLVMExpressionNode {
        final int literal;

        public LLVMI32LiteralNode(int literal) {
            this.literal = literal;
        }

        @Override
        public int executeI32(VirtualFrame frame) {
            return literal;
        }
    }

Executable AST node (add). A DSL allows a declarative style of specifying and executing nodes:

    @NodeChildren({@NodeChild("leftNode"), @NodeChild("rightNode")})
    class LLVMI32AddNode extends LLVMExpressionNode {
        @Specialization
        protected int executeI32(int left, int right) {
            return left + right;
        }
    }

Executable AST node (write). Local variables are represented by an array-like VirtualFrame object:

    @NodeChild("valueNode")
    class LLVMWriteI32Node extends LLVMExpressionNode {
        final FrameSlot slot;

        public LLVMWriteI32Node(FrameSlot slot) {
            this.slot = slot;
        }

        @Specialization
        public void writeI32(VirtualFrame frame, int value) {
            frame.setInt(slot, value);
        }
    }
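To make the tree-walking idea concrete without the Truffle dependency, here is a minimal sketch in plain Java. All class names (`SketchNode`, `SketchLiteralNode`, `SketchReadNode`, `SketchAddNode`, `SketchWriteNode`, `AstSketch`) are hypothetical, and an `int[]` stands in for the `VirtualFrame`; this is not the Truffle API, only an illustration of how the `write %2 (add (read %i.0) 1)` tree evaluates.

```java
// Simplified stand-ins for Truffle concepts; none of these types are the real Truffle API.
abstract class SketchNode {
    // Every node computes its result in an execute method that receives the frame.
    abstract int execute(int[] frame);
}

final class SketchLiteralNode extends SketchNode {
    private final int literal;
    SketchLiteralNode(int literal) { this.literal = literal; }
    @Override int execute(int[] frame) { return literal; }
}

final class SketchReadNode extends SketchNode {
    private final int slot;
    SketchReadNode(int slot) { this.slot = slot; }
    @Override int execute(int[] frame) { return frame[slot]; }
}

final class SketchAddNode extends SketchNode {
    private final SketchNode left;
    private final SketchNode right;
    SketchAddNode(SketchNode left, SketchNode right) { this.left = left; this.right = right; }
    @Override int execute(int[] frame) { return left.execute(frame) + right.execute(frame); }
}

final class SketchWriteNode extends SketchNode {
    private final int slot;
    private final SketchNode value;
    SketchWriteNode(int slot, SketchNode value) { this.slot = slot; this.value = value; }
    @Override int execute(int[] frame) {
        int result = value.execute(frame);
        frame[slot] = result; // local variables live in the array-like frame
        return result;
    }
}

final class AstSketch {
    // Builds the tree for "%2 = add i32 %i.0, 1" and runs it on the given frame.
    // Slot 0 holds %i.0 and slot 1 holds %2 (an arbitrary assignment for this sketch).
    static int run(int[] frame) {
        SketchNode tree = new SketchWriteNode(1,
                new SketchAddNode(new SketchReadNode(0), new SketchLiteralNode(1)));
        return tree.execute(frame);
    }

    public static void main(String[] args) {
        int[] frame = {41, 0};
        System.out.println(run(frame)); // the value that was written to slot 1
    }
}
```

In the real system the Truffle DSL generates the child-execution boilerplate that `SketchAddNode` and `SketchWriteNode` spell out by hand here.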
Deoptimization
• Truffle nodes can implement speculative assumptions
• A failed assumption requires discarding the machine code and continuing execution in the interpreter
Node Rewriting in Truffle
[Figure: AST interpreter with uninitialized nodes → node specialization for profiling feedback → AST interpreter with specialized nodes → compilation using partial evaluation → compiled code. Node transitions legend: U = Uninitialized, I = Integer, G = Generic, D = Double, S = String.]
[Figure, continued: when a speculation fails, execution transfers back to the AST interpreter → node specialization to update profiling feedback (for example, Integer nodes become Double nodes) → recompilation using partial evaluation.]
Speculative Optimization: Value Profiling

    public class LLVMI32LoadNode extends LLVMExpressionNode {
        final int expectedValue; // observed value

        @Specialization
        protected int doI32(Address addr) {
            int val = memory.getI32(addr);
            if (val == expectedValue) {
                return expectedValue;
            } else {
                CompilerDirectives.transferToInterpreter();
                replace(new LLVMI32LoadGenericNode());
                return val;
            }
        }
    }

The compiler can assume that the loaded value is constant.
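The speculate-then-fall-back pattern can be sketched in plain Java without Truffle. The class `ValueProfileSketch`, its `deopts` counter, and the `IntSupplier` standing in for a memory load are all hypothetical; flipping the `generic` flag plays the role that `transferToInterpreter()` plus `replace(...)` play on the slide.

```java
import java.util.function.IntSupplier;

// Hypothetical sketch of value profiling; not the Truffle API.
final class ValueProfileSketch {
    private final IntSupplier memory;  // stand-in for loading an i32 from an address
    private final int expectedValue;   // value observed while profiling
    private boolean generic;           // true once the speculation has failed
    int deopts;                        // counts how often the speculation was abandoned

    ValueProfileSketch(IntSupplier memory, int expectedValue) {
        this.memory = memory;
        this.expectedValue = expectedValue;
    }

    int load() {
        int val = memory.getAsInt();
        if (generic) {
            return val; // generic path: no assumption about the loaded value
        }
        if (val == expectedValue) {
            // Returning the field instead of val lets a compiler treat the result as a constant.
            return expectedValue;
        }
        // Speculation failed: give up the assumption and continue on the generic path.
        deopts++;
        generic = true;
        return val;
    }
}
```

A usage example: constructing the node with `() -> cell[0]` over a one-element array and the expected value 7 returns 7 without a deoptimization until the array cell changes; the first differing load then switches the node to the generic path exactly once.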
Polymorphic Inline Caches for Indirect Calls

    int inc(int val) { return val + 1; }
    int dec(int val) { return val - 1; }
    int square(int val) { return val * val; }
    int (*func)(int);
    // ...
    result = func(4);

[Figure: cache states of the call site as call targets are observed.]
1. uninit call (first observed target: inc)
2. call inc → uninit call (observed targets: inc, dec)
3. call inc → call dec → uninit call (observed targets: inc, dec, square)
4. indirect call (targets: inc, dec, square)

Enables inlining of indirect calls. Can be used to optimize virtual calls in C++.
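The cache states above can be sketched as a small plain-Java class. `InlineCacheSketch` and its `MAX_ENTRIES` limit of 2 are assumptions made for illustration (the slides do not state a limit), and `IntUnaryOperator` stands in for a C function pointer of type `int (*)(int)`.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical sketch of a polymorphic inline cache; not Sulong's implementation.
final class InlineCacheSketch {
    static final int MAX_ENTRIES = 2; // assumed cache limit, chosen only for this example

    private final IntUnaryOperator[] cachedTargets = new IntUnaryOperator[MAX_ENTRIES];
    private int size;
    private boolean megamorphic; // true once the call site has become a plain indirect call

    int call(IntUnaryOperator target, int arg) {
        if (!megamorphic) {
            for (int i = 0; i < size; i++) {
                if (cachedTargets[i] == target) {
                    // Cache hit: the target is known here, so a compiler could inline it.
                    return cachedTargets[i].applyAsInt(arg);
                }
            }
            if (size < MAX_ENTRIES) {
                cachedTargets[size++] = target; // "uninit call" turns into a cached entry
            } else {
                megamorphic = true;             // too many targets: fall back to an indirect call
            }
        }
        return target.applyAsInt(arg);
    }

    int cachedEntries() { return size; }

    boolean isMegamorphic() { return megamorphic; }
}
```

With `inc`, `dec`, and `square` passed in that order, the first two calls fill the cache and the third one switches the call site to the indirect-call state, while all three still return the correct results.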
Handling of Allocations in the User Program

    int *arr = malloc(4 * sizeof(int))

Native Sulong: unmanaged allocations (sun.misc.Unsafe)
https://github.com/graalvm/sulong

    unsafe.allocateMemory(16);

Safe Sulong: managed allocations
[Figure: an Address object with offset=0 whose data field refers to an I32Array with contents {0, 0, 0, 0}.]

Rigger, et al. Sulong, and Thanks For All the Bugs: Finding Errors in C Programs by Abstracting from the Native Execution Model. In Proceedings of ASPLOS 2018
Allocations in the User Program
Unmanaged allocations
+ Interoperability with native libraries
+ Fallback for programs that make assumptions about the memory layout
- No safety guarantees
Managed allocations
+ Sandboxed execution
- Native interoperability
Type Hierarchy for Managed Objects
ManagedObject
    ManagedAddress (pointee: ManagedObject, pointerOffset: int)
    I32Array (values: int[])
    Function (functionIndex: int)
    I32 (value: int)
    Struct (values: Dictionary)
Automatic bounds, type, and null pointer checks!
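A minimal sketch of why managed objects yield these checks for free: if a pointer is a Java object that refers to a Java array, every dereference goes through the JVM's own bounds and null checks. The classes `ManagedI32Array` and `ManagedAddressSketch` below are simplified, hypothetical stand-ins for the types named above, and the alignment check is an extra assumption of this sketch.

```java
// Hypothetical, simplified stand-ins for the managed object types named on the slide.
final class ManagedI32Array {
    final int[] values;
    ManagedI32Array(int length) { this.values = new int[length]; }
}

final class ManagedAddressSketch {
    final ManagedI32Array pointee; // null models a C null pointer
    final int pointerOffset;       // offset in bytes from the start of the pointee

    ManagedAddressSketch(ManagedI32Array pointee, int pointerOffset) {
        this.pointee = pointee;
        this.pointerOffset = pointerOffset;
    }

    // Pointer arithmetic only creates a new address; nothing is checked until a dereference.
    ManagedAddressSketch plus(int bytes) {
        return new ManagedAddressSketch(pointee, pointerOffset + bytes);
    }

    // A null pointee raises NullPointerException and an out-of-bounds index raises
    // ArrayIndexOutOfBoundsException; both checks are performed by the JVM itself.
    int loadI32() { return pointee.values[index()]; }

    void storeI32(int value) { pointee.values[index()] = value; }

    private int index() {
        if (pointerOffset % Integer.BYTES != 0) {
            throw new IllegalStateException("unaligned i32 access"); // assumption of this sketch
        }
        return pointerOffset / Integer.BYTES;
    }
}
```

For an allocation of four ints, `base.plus(8).storeI32(7)` succeeds, whereas `base.plus(16).loadI32()` fails with an `ArrayIndexOutOfBoundsException` instead of silently reading adjacent memory.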
Safe Semantics
• We assign semantics to otherwise undefined behavior → Java semantics
• Invalid memory accesses are not optimized away

    int a = 1, b = INT_MAX;
    int val = a + b;        // UB in C: signed integer overflow
    printf("%d\n", val);

Rigger, et al. Lenient Execution of C on a Java Virtual Machine: or: How I Learned to Stop Worrying and Run the Code. In Proceedings of ManLang 2017
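For the signed-overflow example, "Java semantics" means two's complement wraparound, which the Java language defines for `int` addition. The short sketch below shows the Java counterpart of the C snippet; the class name `WrapAroundSketch` is hypothetical.

```java
final class WrapAroundSketch {
    // Java defines int addition to wrap around in two's complement; there is no undefined behavior.
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        int a = 1, b = Integer.MAX_VALUE;
        System.out.println(add(a, b)); // prints -2147483648
    }
}
```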
Found Errors
• 68 errors in small open-source projects
• Some of these are not found by LLVM’s AddressSanitizer and Valgrind

Out-of-bounds accesses to argv:

    int main(int argc, char** argv) {
        printf("%d %s\n", argc, argv[5]);
    }
Introspection Functions
[Figure: an allocation of sizeof(int) * 10 bytes; for a pointer into it, size_left spans the bytes before the pointer and size_right the bytes from the pointer to the end.]

    int *arr = malloc(sizeof (int) * 10);
    int *ptr = &(arr[4]);
    printf("%ld\n", size_left(ptr));  // prints 16
    printf("%ld\n", size_right(ptr)); // prints 24

We also expose other metadata such as object types.
Rigger, et al. Introspection for C and its Applications to Library Robustness. In Programming 2018
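Given a managed pointer that knows its object's size and its own byte offset, both introspection functions reduce to simple arithmetic. The sketch below reproduces the numbers from the example; `IntrospectionSketch` is a hypothetical class, and `sizeof(int) == 4` is assumed.

```java
// Hypothetical sketch: a pointer modeled as (object size in bytes, offset in bytes).
final class IntrospectionSketch {
    private final long objectSize;
    private final long offset;

    IntrospectionSketch(long objectSize, long offset) {
        this.objectSize = objectSize;
        this.offset = offset;
    }

    // Bytes between the start of the object and the pointer.
    long sizeLeft() { return offset; }

    // Bytes between the pointer and the end of the object.
    long sizeRight() { return objectSize - offset; }

    public static void main(String[] args) {
        long intSize = 4; // assumes sizeof(int) == 4
        IntrospectionSketch ptr = new IntrospectionSketch(intSize * 10, intSize * 4);
        System.out.println(ptr.sizeLeft());  // prints 16
        System.out.println(ptr.sizeRight()); // prints 24
    }
}
```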
Improve availability of the system
• Case study on real-world bugs (Dnsmasq, Libxml2, GraphicsMagick)
• Insight: most applications stay fully functional when the buffer overflow is mitigated
• Drawback: Sulong still aborts execution for missing introspection checks.
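One way a library function could mitigate an overflow with such bounds information is to truncate the copy to the space that is actually available. The following is only a sketch under that assumption: `MitigationSketch.boundedCopy` is a hypothetical helper, and the computed `spaceRight` plays the role that `size_right` would play for the destination pointer in C.

```java
// Hypothetical sketch of overflow mitigation by truncation.
final class MitigationSketch {
    // Copies as many bytes of src as fit into dst starting at dstOffset and
    // returns the number of bytes copied instead of writing out of bounds.
    static int boundedCopy(byte[] dst, int dstOffset, byte[] src) {
        if (dstOffset < 0 || dstOffset >= dst.length) {
            return 0; // the pointer does not refer to any writable byte of dst
        }
        int spaceRight = dst.length - dstOffset; // analogous to size_right(dst + dstOffset)
        int count = Math.min(spaceRight, src.length);
        System.arraycopy(src, 0, dst, dstOffset, count);
        return count;
    }
}
```

Copying five bytes into a four-byte buffer at offset 2 writes only two bytes, so the caller keeps running with truncated data rather than aborting.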
Introspection is applicable for many other bug-finding tools
• We also implemented it in
    • GCC’s Intel MPX based bounds checks instrumentation
    • LLVM’s ASan
    • SoftBound

    ssize_t _size_right(void* p){
        ssize_t upper_bounds = (ssize_t)__builtin___bnd_get_ptr_ubound(p);
        size_t size = (size_t) (upper_bounds + 1) - (size_t) p;
        return (ssize_t) size;
    }
C Projects Consist of More Than C Code
Inline assembly in the C program:

    asm("rdtsc":"=a"(tickl),"=d"(tickh));

Corresponding node in Sulong:

    public abstract static class LLVMAMD64RdtscReadNode extends LLVMExpressionNode {
        public long executeRdtsc() {
            return System.nanoTime();
        }
    }
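The `"=a"(tickl)` and `"=d"(tickh)` constraints mean the 64-bit counter comes back as two 32-bit halves in EAX and EDX. A sketch of how a 64-bit value such as the one returned by `System.nanoTime()` could be split and recombined is shown below; `RdtscSketch` and its methods are hypothetical and not part of Sulong.

```java
// Hypothetical sketch: splitting a 64-bit counter into the EDX:EAX register pair.
final class RdtscSketch {
    static int low(long counter) { return (int) counter; }           // EAX: lower 32 bits
    static int high(long counter) { return (int) (counter >>> 32); } // EDX: upper 32 bits

    // Recombines the two halves; the mask prevents sign extension of the low half.
    static long combine(int high, int low) {
        return ((long) high << 32) | (low & 0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        long counter = System.nanoTime(); // stand-in for the time-stamp counter
        System.out.println(combine(high(counter), low(counter)) == counter); // prints true
    }
}
```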
C Projects Consist of More Than C Code
We determined the usage of inline assembly to prioritize the implementation in Sulong.

    Instructions    In % of projects
    rdtsc           27.4%
    cpuid           25.4%
    mov             24.9%
                    21.8%
    lock xchg       14.2%
    …               …

Rigger, et al. An Analysis of x86-64 Inline Assembly in C Programs. In VEE 2018
C Projects Consist of More Than C Code
GCC builtin in the C program:

    __builtin_clz(num);

Corresponding node in Sulong:

    public abstract static class CountLeadingZeroesI64Node extends LLVMExpressionNode {
        public long executeI64(long val) {
            return Long.numberOfLeadingZeros(val);
        }
    }
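The Java standard library already provides the counting operation for both widths, which is what makes this builtin cheap to support. The sketch below (hypothetical class `ClzSketch`) shows the 32-bit and 64-bit variants; note that GCC leaves `__builtin_clz(0)` undefined, whereas the Java methods return the full bit width for zero.

```java
// Hypothetical sketch of count-leading-zeros for both operand widths.
final class ClzSketch {
    // 32-bit operand, as taken by __builtin_clz.
    static int clz32(int value) { return Integer.numberOfLeadingZeros(value); }

    // 64-bit operand, as taken by __builtin_clzll.
    static int clz64(long value) { return Long.numberOfLeadingZeros(value); }

    public static void main(String[] args) {
        System.out.println(clz32(1));  // prints 31
        System.out.println(clz64(1L)); // prints 63
    }
}
```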
GCC builtins
We are currently investigating the usage of GCC builtins.

    Builtins                In % of projects
    __builtin_expect        48.2%
    __builtin_clz           29.3%
    __builtin_bswap32       26.2%
    __builtin_constant_p    23.3%
    __builtin_alloca        20.3%
    …                       …
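Several of the listed builtins have direct Java counterparts. As a sketch (the class `GccBuiltinSketch` is hypothetical and not taken from Sulong): `__builtin_bswap32` corresponds to `Integer.reverseBytes`, and `__builtin_expect` is only a branch-prediction hint, so an implementation can simply return its first argument.

```java
// Hypothetical sketch of two GCC builtins expressed with plain Java.
final class GccBuiltinSketch {
    // __builtin_bswap32: reverses the byte order of a 32-bit value.
    static int bswap32(int value) { return Integer.reverseBytes(value); }

    // __builtin_expect: the expected value is only a hint, so the result is the first argument.
    static long expect(long value, long expected) { return value; }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(bswap32(0x11223344))); // prints 44332211
    }
}
```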
Running a complete libc
Emulate the Linux syscall API:

    public class LLVMAMD64SyscallGetcwdNode {
        @Specialization
        protected long doOp(LLVMAddress buf, long size) {
            String cwd = LLVMPath.getcwd();
            if (cwd.length() >= size) {
                return -LLVMAMD64Error.ERANGE;
            } else {
                LLVMString.strcpy(buf, cwd);
                return cwd.length() + 1;
            }
        }
    }
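The same contract can be sketched in plain Java with a byte array as the user buffer. `GetcwdSketch` is a hypothetical stand-in for the node above; it takes the directory as a parameter so the example is deterministic, and it uses 34 for `ERANGE`, the value Linux defines.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of emulating the getcwd syscall contract.
final class GetcwdSketch {
    static final int ERANGE = 34; // Linux errno value for "result too large"

    // Writes cwd as a NUL-terminated string into buf. Returns the number of bytes
    // written including the terminator, or -ERANGE if the buffer is too small.
    static long getcwd(String cwd, byte[] buf) {
        byte[] bytes = cwd.getBytes(StandardCharsets.UTF_8);
        if (bytes.length >= buf.length) {
            return -ERANGE; // no room for the path plus its terminator
        }
        System.arraycopy(bytes, 0, buf, 0, bytes.length);
        buf[bytes.length] = 0;
        return bytes.length + 1;
    }
}
```

For example, `getcwd("/tmp", new byte[8])` returns 5, while a four-byte buffer yields `-34` because the terminator would not fit.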