Slide 1

Slide 1 text

Safe Execution of LLVM-based Languages on the Java Virtual Machine
Manuel Rigger
Institute for System Software
Supervisor: Hanspeter Mössenböck
Programming SRC, April 11, 2018

Slide 2

Slide 2 text

Example
    long buf[50];
    buf[50] = 0x832324321;

Slide 3

Slide 3 text

Example
    long buf[50];
    buf[50] = 0x832324321;
Unsafe languages (e.g., C): undefined behavior

Slide 4

Slide 4 text

Example
    long buf[50];
    buf[50] = 0x832324321;
Unsafe languages (e.g., C): undefined behavior
Unsafe languages do not specify the semantics of erroneous code

Slide 5

Slide 5 text

Buffer Overflows
    long buf[50];
    buf[50] = 0x832324321;
Stack layout (highest address first):
    x + 58  Caller's return address
    x + 50  buf[49]
    x + 0   buf[0]

Slide 6

Slide 6 text

Buffer Overflows
    long buf[50];
    buf[50] = 0x832324321;
Stack layout before the write (highest address first):
    x + 58  Caller's return address
    x + 50  buf[49]
    x + 0   buf[0]
Stack layout after the write:
    x + 58  0x832324321
    x + 50  buf[49]
    x + 0   buf[0]

Slide 7

Slide 7 text

Buffer Overflows
    long buf[50];
    buf[50] = 0x832324321;
Stack layout before the write (highest address first):
    x + 58  Caller's return address
    x + 50  buf[49]
    x + 0   buf[0]
Stack layout after the write:
    x + 58  0x832324321
    x + 50  buf[49]
    x + 0   buf[0]
Attackers can exploit buffer overflows to divert the control flow of the program, for example to execve()

Slide 9

Slide 9 text

Safe Languages
    int[] arr = new int[50];
    arr[50] = …

Slide 10

Slide 10 text

Safe Languages
    int[] arr = new int[50];
    arr[50] = …
Java: ArrayIndexOutOfBoundsException

Slide 11

Slide 11 text

Safe Languages
    int[] arr = new int[50];
    arr[50] = …
Java: ArrayIndexOutOfBoundsException
The Java Virtual Machine (JVM) automatically checks accesses

Slide 13

Slide 13 text

Goal of my PhD
Safely and Efficiently Execute Unsafe Languages on the Java Virtual Machine

Slide 14

Slide 14 text

Contributions (Areas)
• Safe Sulong: a system to safely and efficiently execute unsafe languages on the Java Virtual Machine

Slide 15

Slide 15 text

Contributions (Areas)
• Safe Sulong: a system to safely and efficiently execute unsafe languages on the Java Virtual Machine
• Empirical Studies: empirical studies on unstandardized constructs in C code to prioritize their implementation in Safe Sulong

Slide 16

Slide 16 text

Contributions (Areas)
• Safe Sulong: a system to safely and efficiently execute unsafe languages on the Java Virtual Machine
• Empirical Studies: empirical studies on unstandardized constructs in C code to prioritize their implementation in Safe Sulong
• Introspection: an introspection interface that allows programmers to enhance the robustness of their libraries

Slide 17

Slide 17 text

Contribution 1: Safe Sulong

Slide 18

Slide 18 text

Execution of LLVM IR
[Diagram: source languages (C, C++, Fortran, ...) are compiled by LLVM frontends (Clang, GCC, other LLVM frontends) to LLVM IR, which runs on the safe execution platform]

Slide 19

Slide 19 text

Execution of LLVM IR
[Diagram: source languages (C, C++, Fortran, ...) are compiled by LLVM frontends (Clang, GCC, other LLVM frontends) to LLVM IR, which runs on the safe execution platform]
Lattner, et al. LLVM: A compilation framework for lifelong program analysis & transformation. In CGO 2004

Slide 21

Slide 21 text

Execution of LLVM IR
[Diagram: source languages (C, C++, Fortran, ...) are compiled by LLVM frontends (Clang, GCC, other LLVM frontends) to LLVM IR, which runs on the safe execution platform]
Lattner, et al. LLVM: A compilation framework for lifelong program analysis & transformation. In CGO 2004
Targeting LLVM IR allows executing several unsafe languages

Slide 23

Slide 23 text

Execution of LLVM IR
[Diagram: LLVM IR is executed by an LLVM IR interpreter implemented with Truffle, on top of Graal and the JVM]

Slide 25

Slide 25 text

Execution of LLVM IR
[Diagram: LLVM IR is executed by an LLVM IR interpreter implemented with Truffle, on top of Graal and the JVM]
Würthinger, et al. One VM to rule them all. In Onward!

Slide 27

Slide 27 text

Prevent Out-Of-Bounds Accesses
    long buf[50];
    buf[50] = 0x832324321;
[Diagram: an Address object (offset = 50, data) whose data field references an I64Array object (contents)]

Slide 28

Slide 28 text

Prevent Out-Of-Bounds Accesses
    long buf[50];
    buf[50] = 0x832324321;
[Diagram: an Address object (offset = 50, data) whose data field references an I64Array object (contents)]
contents[50] -> ArrayIndexOutOfBoundsException
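Note: the diagram models a pointer as an object reference plus an offset, so every store becomes a checked array access. The following is a minimal C sketch of that idea only. Safe Sulong itself relies on Java objects and the JVM's automatic checks; the names `i64_array`, `address`, and `address_store` are hypothetical, the offset is assumed to count elements, and a `false` return stands in for the exception.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the managed I64Array object on the slide. */
typedef struct {
    int64_t *contents;
    size_t length; /* number of elements in contents */
} i64_array;

/* Hypothetical stand-in for the Address object: a pointee plus an offset.
   Assumption: the offset counts elements, not bytes. */
typedef struct {
    i64_array *data;
    size_t offset;
} address;

/* Stores value through addr. Returns false instead of writing out of bounds;
   on the JVM the equivalent access would throw ArrayIndexOutOfBoundsException. */
static bool address_store(address addr, int64_t value) {
    if (addr.data == NULL || addr.offset >= addr.data->length) {
        return false;
    }
    addr.data->contents[addr.offset] = value;
    return true;
}
```

With a 50-element array, a store at offset 49 succeeds, while the store at offset 50 from the slide is rejected before any memory is touched.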

Slide 29

Slide 29 text

Found Errors
• 68 errors in open-source projects
• 8 errors not found by LLVM’s AddressSanitizer and Valgrind
    int main(int argc, char** argv) {
        printf("%d %s\n", argc, argv[5]);
    }
Out-of-bounds accesses to argv are not instrumented by ASan: https://github.com/google/sanitizers/issues/762
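Note: the argv example reads argv[5] without consulting argc. One way the program itself could avoid the error is sketched below; the helper name `arg_or` and its fallback parameter are hypothetical and not part of the talk.

```c
#include <stddef.h>

/* Returns argv[index] if that argument exists, otherwise the fallback.
   Hypothetical helper for illustration only. */
static const char *arg_or(int argc, char **argv, int index, const char *fallback) {
    if (index < 0 || index >= argc || argv[index] == NULL) {
        return fallback;
    }
    return argv[index];
}
```

A call such as `arg_or(argc, argv, 5, "(missing)")` could then replace the direct `argv[5]` access in the printf call.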

Slide 30

Slide 30 text

Evaluation: Peak Performance
[Chart: lower is better]

Slide 31

Slide 31 text

Evaluation: Peak Performance
[Chart: lower is better]

Slide 32

Slide 32 text

Evaluation: Peak Performance
[Chart: lower is better]
Baseline is Clang -O0; Safe Sulong is faster in all but one case

Slide 33

Slide 33 text

Evaluation: Peak Performance
[Chart: lower is better]

Slide 34

Slide 34 text

Evaluation: Peak Performance
[Chart: lower is better]
Safe Sulong is close to Clang -O3 in some cases

Slide 35

Slide 35 text

Evaluation: Peak Performance
[Chart: lower is better]

Slide 36

Slide 36 text

Evaluation: Peak Performance
[Chart: lower is better]
Safe Sulong -O0 is mostly faster than ASan -O0

Slide 37

Slide 37 text

Contribution 2: Empirical Studies

Slide 38

Slide 38 text

C Projects Consist of More Than C Code
Compiler builtins:
    if (__builtin_expect(x, 0)) foo();
Inline assembly:
    asm("rdtsc":"=a"(tickl),"=d"(tickh));
• Should they be supported in Safe Sulong?
• Which ones should be implemented?

Slide 39

Slide 39 text

Which ones and how often are they used?
GCC compiler builtins
    Builtin              In % of projects
    __builtin_expect     48.2%
    __builtin_clz        29.3%
    __builtin_bswap32    26.2%
Inline assembly
    Instruction          In % of projects
    rdtsc                27.4%
    cpuid                25.4%
    mov                  24.9%
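Note: to give a sense of what supporting the three most common builtins involves, here is a sketch of portable fallbacks with the same observable results for 32-bit inputs. The `sketch_` names are hypothetical, and this is not taken from Safe Sulong's implementation.

```c
#include <stdint.h>

/* __builtin_expect(value, expected) evaluates to value; the second argument
   is only a branch-prediction hint, so a fallback can ignore it. */
static long sketch_expect(long value, long expected) {
    (void)expected;
    return value;
}

/* Counts leading zero bits of a 32-bit value. GCC leaves __builtin_clz(0)
   undefined; this sketch chooses to return 32 for that input. */
static int sketch_clz32(uint32_t x) {
    int n = 0;
    if (x == 0) {
        return 32;
    }
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Reverses the byte order of a 32-bit value, like __builtin_bswap32. */
static uint32_t sketch_bswap32(uint32_t x) {
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}
```

Inline assembly such as rdtsc has no portable equivalent of this kind, which is one reason the usage numbers matter when deciding what to implement first.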

Slide 40

Slide 40 text

C Projects Consist of More Than C Code
[Chart: 1600 builtins to support 99% of projects]

Slide 41

Slide 41 text

C Projects Consist of More Than C Code
[Chart: 1600 builtins to support 99% of projects]
The usage data allowed prioritizing their implementation in Safe Sulong

Slide 42

Slide 42 text

Contribution 3: Introspection

Slide 43

Slide 43 text

Introspection Functions
    int *arr = malloc(sizeof (int) * 10);
    int *ptr = &(arr[4]);
    printf("%ld\n", size_right(ptr)); // prints 24
[Diagram labels: _size_right(), sizeof(int) * 10]

Slide 44

Slide 44 text

Introspection Functions
    int *arr = malloc(sizeof (int) * 10);
    int *ptr = &(arr[4]);
    printf("%ld\n", size_right(ptr)); // prints 24
[Diagram labels: _size_right(), sizeof(int) * 10]
The introspection interface also allows querying other metadata (e.g., types)
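Note: size_right(ptr) reports how many bytes remain between ptr and the end of its object, which is 10 * 4 - 4 * 4 = 24 here if int is 4 bytes. Plain C keeps no such metadata, so the sketch below takes the object's base and size as explicit arguments; `sketch_size_right` is a hypothetical helper, not the interface from the talk.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes remaining from p to the end of the object [base, base + size).
   Returns 0 when p lies outside the object. Addresses are compared as
   integers to avoid relational comparison of unrelated pointers. */
static size_t sketch_size_right(const void *base, size_t size, const void *p) {
    uintptr_t start = (uintptr_t)base;
    uintptr_t pos = (uintptr_t)p;
    if (pos < start || pos - start > size) {
        return 0;
    }
    return size - (size_t)(pos - start);
}
```

For a 10-element int array, `sketch_size_right(arr, sizeof arr, &arr[4])` yields `6 * sizeof(int)`. In the system described in the talk, the runtime would supply the base and size itself.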

Slide 45

Slide 45 text

Example: strlen()
    size_t strlen(const char *str) {
        size_t len = 0;
        while (*str != '\0') {
            len++;
            str++;
        }
        return len;
    }

Slide 46

Slide 46 text

Example: strlen()
    size_t strlen(const char *str) {
        size_t len = 0;
        while (*str != '\0') {
            len++;
            str++;
        }
        return len;
    }
Memory: P r o g r a m m i n g \0 ...

Slide 48

Slide 48 text

Example: strlen()
    size_t strlen(const char *str) {
        size_t len = 0;
        while (*str != '\0') {
            len++;
            str++;
        }
        return len;
    }
Memory: P r o g r a m m i n g \0 ...
Result: 11

Slide 49

Slide 49 text

Example: strlen()
    size_t strlen(const char *str) {
        size_t len = 0;
        while (*str != '\0') {
            len++;
            str++;
        }
        return len;
    }
Memory (no terminating \0): P r o g r a m m i n g ...

Slide 51

Slide 51 text

Example: strlen()
    size_t strlen(const char *str) {
        size_t len = 0;
        while (*str != '\0') {
            len++;
            str++;
        }
        return len;
    }
Memory (no terminating \0): P r o g r a m m i n g ...
Result: 23415

Slide 52

Slide 52 text

Example: strlen()
    size_t strlen(const char *str) {
        size_t len = 0;
        while (size_right(str) > 0 && *str != '\0') {
            len++;
            str++;
        }
        return len;
    }
Memory (no terminating \0): P r o g r a m m i n g ...

Slide 54

Slide 54 text

Example: strlen()
    size_t strlen(const char *str) {
        size_t len = 0;
        while (size_right(str) > 0 && *str != '\0') {
            len++;
            str++;
        }
        return len;
    }
Memory (no terminating \0): P r o g r a m m i n g ...
Result: 11
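Note: the guarded loop stops at the end of the buffer even when the terminator is missing. Since size_right() is only available on a runtime that tracks object bounds, the sketch below emulates it with an explicit byte count; `sketch_bounded_strlen` and its `right` parameter are hypothetical.

```c
#include <stddef.h>

/* Emulates the guarded strlen from the slide. `right` plays the role of
   size_right(str) on entry: the number of bytes from str to the end of
   its buffer. */
static size_t sketch_bounded_strlen(const char *str, size_t right) {
    size_t len = 0;
    while (right > 0 && *str != '\0') {
        len++;
        str++;
        right--; /* size_right(str) shrinks by one as str advances */
    }
    return len;
}
```

For an 11-byte buffer holding "Programming" without a terminator, the loop ends after 11 characters instead of reading past the buffer.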

Slide 55

Slide 55 text

Summary
Three contribution areas:
• Safe Sulong
• Empirical Studies
• Introspection for Library Robustness