Programming '18 SRC: Safe Execution of LLVM-based Languages on the Java Virtual Machine

Slides for the talk at the Programming '18 Student Research Competition (https://2018.programming-conference.org/track/programming-2018-src#Winners)

Manuel Rigger

April 11, 2018

Transcript

  1. Safe Execution of LLVM-based Languages on
    the Java Virtual Machine
    Manuel Rigger
    Institute for System Software
    Supervisor: Hanspeter Mössenböck
    Programming SRC, April 11, 2018

  2. Example
    long buf[50];
    buf[50] = 0x832324321;

  3. Example
    long buf[50];
    buf[50] = 0x832324321;
    Unsafe languages (e.g., C) → undefined behavior

  4. Example
    long buf[50];
    buf[50] = 0x832324321;
    Unsafe languages (e.g., C) → undefined behavior
    Unsafe languages do not specify the semantics of erroneous code

  5. Buffer Overflows
    long buf[50];
    buf[50] = 0x832324321;
    [Stack diagram: buf[0] ... buf[49], followed by the caller's return address; address labels x + 0, x + 50, x + 58]

  6. Buffer Overflows
    long buf[50];
    buf[50] = 0x832324321;
    [Stack diagram, before the write: buf[0] ... buf[49], followed by the caller's return address; address labels x + 0, x + 50, x + 58]
    [Stack diagram, after the write: the caller's return address is overwritten with 0x832324321]

  7. Buffer Overflows
    long buf[50];
    buf[50] = 0x832324321;
    [Stack diagram, before the write: buf[0] ... buf[49], followed by the caller's return address; address labels x + 0, x + 50, x + 58]
    [Stack diagram, after the write: the caller's return address is overwritten with 0x832324321]
    Attackers can exploit buffer overflows to divert the control flow of the program (on the slide: to execve())

  9. Safe Languages
    int[] arr = new int[50];
    arr[50] = …

  10. Safe Languages
    int[] arr = new int[50];
    arr[50] = …
    Java → ArrayIndexOutOfBoundsException

  11. Safe Languages
    int[] arr = new int[50];
    arr[50] = …
    Java → ArrayIndexOutOfBoundsException
    The Java Virtual Machine (JVM) automatically checks accesses

  13. Goal of my PhD
    Safely and efficiently execute unsafe languages on the Java Virtual Machine

  14. Contributions (Areas)
    • Safe Sulong: a system to safely and efficiently execute unsafe languages on the Java Virtual Machine

  15. Contributions (Areas)
    • Safe Sulong: a system to safely and efficiently execute unsafe languages on the Java Virtual Machine
    • Empirical studies on unstandardized constructs in C code, to prioritize their implementation in Safe Sulong

  16. Contributions (Areas)
    • Safe Sulong: a system to safely and efficiently execute unsafe languages on the Java Virtual Machine
    • Empirical studies on unstandardized constructs in C code, to prioritize their implementation in Safe Sulong
    • Introspection: an introspection interface that allows programmers to enhance the robustness of their libraries

  17. Contribution 1: Safe Sulong

  18. Execution of LLVM IR
    C, C++ → Clang → LLVM IR
    Fortran → GCC → LLVM IR
    ... → other LLVM frontend → LLVM IR
    LLVM IR → safe execution platform

  19. Execution of LLVM IR
    C, C++ → Clang → LLVM IR
    Fortran → GCC → LLVM IR
    ... → other LLVM frontend → LLVM IR
    LLVM IR → safe execution platform
    Lattner et al. LLVM: A compilation framework for lifelong program analysis & transformation. In CGO 2004

  21. Execution of LLVM IR
    C, C++ → Clang → LLVM IR
    Fortran → GCC → LLVM IR
    ... → other LLVM frontend → LLVM IR
    LLVM IR → safe execution platform
    Lattner et al. LLVM: A compilation framework for lifelong program analysis & transformation. In CGO 2004
    Targeting LLVM IR allows executing several unsafe languages

  23. Execution of LLVM IR
    [Diagram: LLVM IR is executed by an LLVM IR interpreter built on Truffle, running on a JVM with the Graal compiler]

  25. Execution of LLVM IR
    [Diagram: LLVM IR is executed by an LLVM IR interpreter built on Truffle, running on a JVM with the Graal compiler]
    Würthinger et al. One VM to rule them all. In Onward!

  27. Prevent Out-Of-Bounds Accesses
    long buf[50];
    buf[50] = 0x832324321;
    [Diagram: an Address object (offset = 50, data) whose data field refers to an I64Array object holding the contents array]

  28. Prevent Out-Of-Bounds Accesses
    long buf[50];
    buf[50] = 0x832324321;
    [Diagram: an Address object (offset = 50, data) whose data field refers to an I64Array object holding the contents array]
    contents[50] → ArrayIndexOutOfBoundsException
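
A minimal C sketch of the idea on this slide, under stated assumptions. The slide shows the pointer represented as an Address object (a reference plus an offset) into an I64Array, with the JVM's automatic bounds check raising the exception. Plain C has no such check, so the sketch makes it explicit. The names `i64_array`, `address`, and `checked_store` are hypothetical and are not Safe Sulong's API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BUF_LEN 50

/* Stand-in for the I64Array object: the contents plus their length. */
typedef struct {
    int64_t contents[BUF_LEN];
    size_t  length;
} i64_array;

/* Stand-in for the Address object: a reference to the data plus an
 * element offset into it. */
typedef struct {
    i64_array *data;
    size_t     offset;
} address;

/* Stores value through the address only if the offset is in bounds.
 * Returns false (and writes nothing) otherwise; on the JVM, the
 * equivalent array access would throw instead. */
bool checked_store(address a, int64_t value) {
    if (a.data == NULL || a.offset >= a.data->length) {
        return false;
    }
    a.data->contents[a.offset] = value;
    return true;
}
```

With a 50-element array, a store through offset 49 succeeds, while a store through offset 50 is rejected instead of overwriting adjacent memory.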

  29. Found Errors
    • 68 errors in open-source projects
    • 8 errors not found by LLVM’s AddressSanitizer and Valgrind
    int main(int argc, char** argv) {
        printf("%d %s\n", argc, argv[5]);
    }
    Out-of-bounds accesses to argv are not instrumented by ASan
    https://github.com/google/sanitizers/issues/762

  30. Evaluation: Peak Performance
    [Chart: peak performance; lower is better]

  31. Evaluation: Peak Performance
    [Chart: peak performance; lower is better]

  32. Evaluation: Peak Performance
    [Chart: peak performance; lower is better]
    Baseline is Clang -O0; Safe Sulong is faster in all but one case

  33. Evaluation: Peak Performance
    [Chart: peak performance; lower is better]

  34. Evaluation: Peak Performance
    [Chart: peak performance; lower is better]
    Safe Sulong is close to Clang -O3 in some cases

  35. Evaluation: Peak Performance
    [Chart: peak performance; lower is better]

  36. Evaluation: Peak Performance
    [Chart: peak performance; lower is better]
    Safe Sulong -O0 is mostly faster than ASan -O0

  37. Contribution 2: Empirical Studies

  38. C Projects Consist of More Than C Code
    Compiler builtins:
    if (__builtin_expect(x, 0))
        foo();
    Inline assembly:
    asm("rdtsc":"=a"(tickl),"=d"(tickh));
    • Should they be supported in Safe Sulong?
    • Which ones should be implemented?

  39. Which ones and how often are they used?
    GCC compiler builtins          In % of projects
    __builtin_expect               48.2%
    __builtin_clz                  29.3%
    __builtin_bswap32              26.2%
    Inline assembly instructions   In % of projects
    rdtsc                          27.4%
    cpuid                          25.4%
    mov                            24.9%
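
For readers unfamiliar with the three most frequent builtins in the table, the following sketch shows one typical use of each. It assumes a GCC-compatible compiler (GCC or Clang); the wrapper names `is_rare_event`, `floor_log2`, and `swap_bytes` are made up for illustration and do not come from the talk.

```c
#include <limits.h>
#include <stdint.h>

/* __builtin_expect(expr, expected) returns expr and tells the compiler
 * which value is likely, so it can lay out the common path first. */
int is_rare_event(long x) {
    if (__builtin_expect(x != 0, 0)) {
        return 1; /* hinted as the unlikely branch */
    }
    return 0;
}

/* __builtin_clz counts leading zero bits of an unsigned int.
 * Its result is undefined for 0, so callers must pass v > 0. */
unsigned floor_log2(unsigned v) {
    unsigned top_bit = (unsigned)(sizeof(unsigned) * CHAR_BIT) - 1u;
    return top_bit - (unsigned)__builtin_clz(v);
}

/* __builtin_bswap32 reverses the byte order of a 32-bit value. */
uint32_t swap_bytes(uint32_t v) {
    return __builtin_bswap32(v);
}
```

An execution platform that interprets the compiled program has to provide its own implementation of each such builtin, which is why their usage frequency matters for prioritization.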

  40. C Projects Consist of More Than C Code
    1600 builtins to support 99% of projects

  41. C Projects Consist of More Than C Code
    1600 builtins to support 99% of projects
    This allowed prioritizing their implementation in Safe Sulong

  42. Contribution 3: Introspection

  43. Introspection Functions
    int *arr = malloc(sizeof(int) * 10);
    int *ptr = &(arr[4]);
    printf("%ld\n", size_right(ptr)); // prints 24
    [Diagram: the object spans sizeof(int) * 10 bytes; _size_right() covers the bytes from ptr to the end of the object]

  44. Introspection Functions
    int *arr = malloc(sizeof(int) * 10);
    int *ptr = &(arr[4]);
    printf("%ld\n", size_right(ptr)); // prints 24
    [Diagram: the object spans sizeof(int) * 10 bytes; _size_right() covers the bytes from ptr to the end of the object]
    The introspection interface also allows querying other metadata (e.g., types)
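
To make the arithmetic behind "prints 24" concrete, here is a small model in plain C. It is only a sketch: it assumes a pointer that carries its allocation's base, size, and current byte offset, which is not how the talk says the runtime stores this metadata. The names `checked_ptr`, `checked_malloc`, `checked_add`, and `model_size_right` are hypothetical, and returning 0 for an out-of-bounds pointer is a simplification chosen for this model.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical "fat pointer": allocation base, allocation size in bytes,
 * and the current byte offset into the allocation. */
typedef struct {
    char  *base;
    size_t size;
    size_t offset;
} checked_ptr;

/* Allocates size bytes and records the size next to the pointer. */
checked_ptr checked_malloc(size_t size) {
    checked_ptr p = { malloc(size), size, 0 };
    if (p.base == NULL) {
        p.size = 0; /* failed allocation: nothing is accessible */
    }
    return p;
}

/* Pointer arithmetic only changes the offset; base and size are kept. */
checked_ptr checked_add(checked_ptr p, size_t bytes) {
    p.offset += bytes;
    return p;
}

/* Bytes from the pointer to the end of its allocation
 * (0 if the pointer is already past the end in this model). */
size_t model_size_right(checked_ptr p) {
    return p.offset <= p.size ? p.size - p.offset : 0;
}
```

For a 10-element int allocation and a pointer to element 4, the model yields 6 * sizeof(int) bytes, which is 24 on platforms where int is 4 bytes wide.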

  45. Example: strlen()
    size_t strlen(const char *str) {
        size_t len = 0;
        while (*str != '\0') {
            len++;
            str++;
        }
        return len;
    }

  46. Example: strlen()
    size_t strlen(const char *str) {
        size_t len = 0;
        while (*str != '\0') {
            len++;
            str++;
        }
        return len;
    }
    [Memory: ... | P | r | o | g | r | a | m | m | i | n | g | \0 | ...]

  48. Example: strlen()
    size_t strlen(const char *str) {
        size_t len = 0;
        while (*str != '\0') {
            len++;
            str++;
        }
        return len;
    }
    [Memory: ... | P | r | o | g | r | a | m | m | i | n | g | \0 | ...]
    Result: 11

  49. Example: strlen()
    size_t strlen(const char *str) {
        size_t len = 0;
        while (*str != '\0') {
            len++;
            str++;
        }
        return len;
    }
    [Memory: ... | P | r | o | g | r | a | m | m | i | n | g | ...] (no terminating \0)

  51. Example: strlen()
    size_t strlen(const char *str) {
        size_t len = 0;
        while (*str != '\0') {
            len++;
            str++;
        }
        return len;
    }
    [Memory: ... | P | r | o | g | r | a | m | m | i | n | g | ...] (no terminating \0)
    Result: 23415 (strlen() reads past the end of the buffer)

  52. Example: strlen()
    size_t strlen(const char *str) {
        size_t len = 0;
        while (size_right(str) > 0 && *str != '\0') {
            len++;
            str++;
        }
        return len;
    }
    [Memory: ... | P | r | o | g | r | a | m | m | i | n | g | ...] (no terminating \0)

  54. Example: strlen()
    size_t strlen(const char *str) {
        size_t len = 0;
        while (size_right(str) > 0 && *str != '\0') {
            len++;
            str++;
        }
        return len;
    }
    [Memory: ... | P | r | o | g | r | a | m | m | i | n | g | ...] (no terminating \0)
    Result: 11
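
The robust strlen() above can be tried outside Safe Sulong with a small stand-in for size_right(). The sketch below assumes the caller passes the buffer size along with the string; `checked_str`, `model_size_right`, and `checked_strlen` are hypothetical names for this illustration, whereas in the talk the runtime supplies the bounds for an ordinary pointer.

```c
#include <stddef.h>

/* Hypothetical string handle: buffer start, buffer size in bytes,
 * and the current read position. */
typedef struct {
    const char *base;
    size_t      size;
    size_t      offset;
} checked_str;

/* Stand-in for size_right(): bytes left between the read position
 * and the end of the buffer. */
static size_t model_size_right(checked_str s) {
    return s.offset <= s.size ? s.size - s.offset : 0;
}

/* Same loop as on the slide: stop at the terminator or at the end of
 * the buffer, whichever comes first. */
size_t checked_strlen(checked_str s) {
    size_t len = 0;
    while (model_size_right(s) > 0 && s.base[s.offset] != '\0') {
        len++;
        s.offset++;
    }
    return len;
}
```

For an 11-byte buffer holding "Programming" without a terminator, the loop stops at the buffer end and returns 11 instead of reading adjacent memory.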

  55. Summary
    Three contribution areas:
    • Safe Sulong
    • Empirical Studies
    • Introspection for Library Robustness