Programming '18 SRC: Safe Execution of LLVM-based Languages on the Java Virtual Machine

Slides for the talk at the Programming '18 Student Research Competition (https://2018.programming-conference.org/track/programming-2018-src#Winners)

Manuel Rigger

April 11, 2018

Transcript

  1. Safe Execution of LLVM-based Languages on the Java Virtual Machine

    Manuel Rigger, Institute for System Software. Supervisor: Hanspeter Mössenböck. Programming SRC, April 11, 2018.
  4. Example

    long buf[50]; buf[50] = 0x832324321;
    Unsafe languages (e.g., C): undefined behavior.
    Unsafe languages do not specify the semantics of erroneous code.
  7. Buffer Overflows

    long buf[50]; buf[50] = 0x832324321;
    Stack diagram before the write: caller's return address, buf[49], ..., buf[0] (address labels: x + 0, x + 50, x + 58).
    Stack diagram after the write: 0x832324321 takes the place of the caller's return address.
    Attackers can exploit buffer overflows to divert the control flow of the program, e.g., to execve().
  11. Safe Languages

    int[] arr = new int[50]; arr[50] = …
    Java: ArrayIndexOutOfBoundsException.
    The Java Virtual Machine (JVM) automatically checks accesses.
  13. Goal of my PhD

    Safely and efficiently execute unsafe languages on the Java Virtual Machine.
  16. Contributions (Areas)

    Safe Sulong: a system to safely and efficiently execute unsafe languages on the Java Virtual Machine.
    Empirical studies: studies of unstandardized constructs in C code to prioritize their implementation in Safe Sulong.
    Introspection: an introspection interface that allows programmers to enhance the robustness of their libraries.
  17. Contribution 1: Safe Sulong

  21. Execution of LLVM IR

    Diagram: C and C++ (Clang), Fortran (GCC), and other languages (other LLVM frontends, ...) compile to LLVM IR, which runs on the safe execution platform.
    Targeting LLVM IR allows executing several unsafe languages.
    Lattner, et al. LLVM: A compilation framework for lifelong program analysis & transformation. In CGO 2004
  25. Execution of LLVM IR

    Diagram: LLVM IR is executed by an LLVM IR interpreter built on Truffle, on top of Graal and the JVM.
    Würthinger, et al. One VM to rule them all. In Onward!
  28. Prevent Out-Of-Bounds Accesses

    long buf[50]; buf[50] = 0x832324321;
    Diagram: an Address object (offset = 50) whose data field references an I64Array with a contents array.
    Accessing contents[50] throws an ArrayIndexOutOfBoundsException.
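
The Address/I64Array diagram above can be approximated in plain C to show where the check happens. This is only a sketch under stated assumptions: the struct layouts and the helper names `address_add` and `address_store` are hypothetical, and a boolean return value stands in for the exception the JVM would throw.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a managed array of 64-bit integers. */
typedef struct {
    size_t length;     /* number of elements in contents */
    int64_t *contents; /* backing storage */
} I64Array;

/* Hypothetical managed pointer: the referenced object plus an element offset. */
typedef struct {
    I64Array *data;
    size_t offset;
} Address;

/* Pointer arithmetic only moves the offset; nothing is checked yet. */
static Address address_add(Address addr, size_t elements) {
    addr.offset += elements;
    return addr;
}

/* Stores value through addr. Returns false instead of writing out of bounds;
 * on the JVM this is the point where ArrayIndexOutOfBoundsException is thrown. */
static bool address_store(Address addr, int64_t value) {
    assert(addr.data != NULL);
    if (addr.offset >= addr.data->length) {
        return false;
    }
    addr.data->contents[addr.offset] = value;
    return true;
}
```

With a 50-element array, a store at offset 49 succeeds, while the store at offset 50 from the slide is rejected before any memory is touched.
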
  29. Found Errors

    • 68 errors in open-source projects
    • 8 errors not found by LLVM’s AddressSanitizer and Valgrind
    int main(int argc, char** argv) { printf("%d %s\n", argc, argv[5]); }
    Out-of-bounds accesses to argv are not instrumented by ASan: https://github.com/google/sanitizers/issues/762
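
For comparison, the argv access shown above can be guarded by hand. The helper below is a minimal sketch, not something taken from the talk; its name `sketch_arg_or_default` and its fallback parameter are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Returns argv[index] when it exists, otherwise the caller-supplied fallback.
 * Hypothetical helper: it performs by hand the bounds check that the slide's
 * example omits. */
static const char *sketch_arg_or_default(int argc, char **argv, int index,
                                         const char *fallback) {
    assert(argv != NULL);
    if (index < 0 || index >= argc || argv[index] == NULL) {
        return fallback;
    }
    return argv[index];
}
```

A call such as `sketch_arg_or_default(argc, argv, 5, "(missing)")` would then replace the unchecked `argv[5]`.
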
  32. Evaluation: Peak Performance

    Chart (lower is better). Baseline is Clang -O0; Safe Sulong is faster in all but one case.
  34. Evaluation: Peak Performance

    Chart (lower is better). Safe Sulong is close to Clang -O3 in some cases.
  36. Evaluation: Peak Performance

    Chart (lower is better). Safe Sulong -O0 is mostly faster than ASan -O0.
  37. Contribution 2: Empirical Studies

  38. C Projects Consist of More Than C Code

    Compiler builtins: if (__builtin_expect(x, 0)) foo();
    Inline assembly: asm("rdtsc":"=a"(tickl),"=d"(tickh));
    • Should they be supported in Safe Sulong?
    • Which ones should be implemented?
  39. Which ones and how often are they used?

    Inline assembly instructions (in % of projects): rdtsc 27.4%, cpuid 25.4%, mov 24.9%
    GCC compiler builtins (in % of projects): __builtin_expect 48.2%, __builtin_clz 29.3%, __builtin_bswap32 26.2%
  41. C Projects Consist of More Than C Code

    1600 builtins to support 99% of projects.
    The studies allowed prioritizing their implementation in Safe Sulong.
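
To make concrete what supporting a builtin involves, the three most frequently used GCC builtins from the study can be expressed in portable C. This is a sketch only: the `sketch_` names are hypothetical, and nothing here is claimed to match how Safe Sulong implements these builtins.

```c
#include <assert.h>
#include <stdint.h>

/* Fallback for __builtin_expect: the branch hint does not change the value. */
static long sketch_expect(long value, long expected) {
    (void)expected;
    return value;
}

/* Fallback for __builtin_clz on 32-bit values: counts leading zero bits.
 * GCC leaves the result for an argument of 0 undefined, so 0 is rejected. */
static int sketch_clz32(uint32_t x) {
    assert(x != 0);
    int n = 0;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Fallback for __builtin_bswap32: reverses the byte order of a 32-bit value. */
static uint32_t sketch_bswap32(uint32_t x) {
    return ((x & 0x000000FFu) << 24) | ((x & 0x0000FF00u) << 8) |
           ((x & 0x00FF0000u) >> 8) | ((x & 0xFF000000u) >> 24);
}
```

Each fallback is short, but with roughly 1600 builtins in use across projects, usage frequency is a sensible guide for which ones to write first.
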
  42. Contribution 3: Introspection

  44. Introspection Functions

    int *arr = malloc(sizeof (int) * 10);
    int *ptr = &(arr[4]);
    printf ("%ld\n", size_right(ptr)); // prints 24
    Diagram labels: _size_right(), sizeof(int) * 10.
    The introspection interface also allows querying other metadata (e.g., types).
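
Plain C has no `size_right()`, so the arithmetic behind the "prints 24" comment can only be illustrated with an explicit bounds-carrying pointer. The sketch below assumes a hypothetical `BoundedPtr` struct and `sketch_size_right` helper; in Safe Sulong the runtime tracks this metadata itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pointer that carries the bounds of its allocation. */
typedef struct {
    char *base;    /* start of the allocation */
    size_t size;   /* allocation size in bytes */
    size_t offset; /* current position, in bytes from base */
} BoundedPtr;

/* Number of bytes from the current position to the end of the allocation. */
static long sketch_size_right(BoundedPtr p) {
    assert(p.base != NULL);
    return p.offset <= p.size ? (long)(p.size - p.offset) : 0;
}
```

For a 10-element int allocation and a pointer to element 4, the result is 6 * sizeof(int) bytes, which is 24 when int is 4 bytes wide, as on the slide.
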
  48. Example: strlen()

    size_t strlen(const char *str) {
      size_t len = 0;
      while (*str != '\0') { len++; str++; }
      return len;
    }
    Memory: P r o g r a m m i n g \0 ... ...
    Result: 11
  51. Example: strlen()

    size_t strlen(const char *str) {
      size_t len = 0;
      while (*str != '\0') { len++; str++; }
      return len;
    }
    Memory: P r o g r a m m i n g ... ... (no terminating \0)
    Result: 23415
  54. Example: strlen()

    size_t strlen(const char *str) {
      size_t len = 0;
      while (size_right(str) > 0 && *str != '\0') { len++; str++; }
      return len;
    }
    Memory: P r o g r a m m i n g ... ... (no terminating \0)
    Result: 11
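
The hardened loop can be tried outside Safe Sulong by passing the bounds explicitly. This is a minimal sketch: `BoundedStr`, `bounded_size_right`, and `bounded_strlen` are hypothetical names that stand in for the pointer metadata and the `size_right()` call on the slide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical string pointer that carries the bounds of its buffer. */
typedef struct {
    const char *base; /* start of the buffer */
    size_t size;      /* buffer size in bytes */
    size_t offset;    /* current position, in bytes from base */
} BoundedStr;

/* Bytes remaining between the current position and the end of the buffer. */
static size_t bounded_size_right(BoundedStr s) {
    return s.offset < s.size ? s.size - s.offset : 0;
}

/* strlen that stops at the terminator or at the end of the buffer,
 * whichever comes first. */
static size_t bounded_strlen(BoundedStr s) {
    assert(s.base != NULL);
    size_t len = 0;
    while (bounded_size_right(s) > 0 && s.base[s.offset] != '\0') {
        len++;
        s.offset++;
    }
    return len;
}
```

For an 11-byte buffer holding "Programming" without a terminator, the loop stops at the buffer end and reports 11 instead of reading adjacent memory.
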
  55. Summary

    Three contribution areas: Safe Sulong, Empirical Studies, Introspection for Library Robustness.