Sulong: Executing Low-level Languages on Truffle

Invited talk held at ICW 2019, the Interconnecting Code Workshop, co-located with <Programming> 2019

Manuel Rigger

April 01, 2019

Transcript

  1. Sulong: Executing Low-level Languages on
    Truffle
    Manuel Rigger
    Advanced Software Technologies Lab (Zhendong Su)
    ETH Zurich
    April 1, 2019
    Interconnecting Code Workshop @ <Programming> 2019
    @RiggerManuel

  2. PhD Topic
    2
    Safe and Efficient Execution of
    Unsafe Languages on the Java
    Virtual Machine

  3. How is this Relevant for ICW?
    3
    An improved version of Sulong is
    used within GraalVM as a native
    function interface

  5. 4
    GraalVM, its
    language
    interoperability
    mechanism, and
    Sulong’s role
    I have not been working on
    language interoperability myself.

  7. Unsafe languages
    5
    Heartbleed Cloudbleed Graalbleed

  8. 6
    GraalVM, its
    language
    interoperability
    mechanism, and
    Sulong’s role
    Safe Sulong and
    how it safely
    executes LLVM-
    based languages

  9. Sulong Interacts also with Other Code
    7
    Compiler builtins
    System calls
    External Libraries
    Low-level libc/POSIX functions
    Linkage features
    Compiler extensions
    Inline assembly

  10. 8
    The importance of
    inline assembly
    and compiler
    builtins
    GraalVM, its
    language
    interoperability
    mechanism, and
    Sulong’s role
    Safe Sulong and
    how it safely
    executes LLVM-
    based languages

  11. 9
    GraalVM, its language interoperability mechanism, and
    Sulong’s role

  12. GraalVM
    10
    https://www.graalvm.org/

  13. GraalVM
    11
    (Würthinger et al. 2016)
    GraalVM supports the execution
    of various languages
    TruffleRuby Graal.js Graal.python FastR

  14. GraalVM
    12
    (Würthinger et al. 2016)
    TruffleRuby Graal.js Graal.python FastR
    Truffle
    Truffle is a language-
    implementation framework
    • Written in Java
    • Optimization primitives
    • Debugging and profiling
    • Language interoperability!

  15. GraalVM
    13
    (Würthinger et al. 2016)
    TruffleRuby Graal.js Graal.python FastR
    Graal
    Truffle
    Graal is the compiler used by Truffle

  16. GraalVM
    14
    (Würthinger et al. 2016)
    TruffleRuby Graal.js Graal.python FastR
    Graal
    Truffle
    Can execute on the JVM, be
    compiled to a standalone
    executable, …
    JVM

  17. GraalVM
    15
    (Würthinger et al. 2016)
    TruffleRuby Graal.js Graal.python FastR
    The languages are implemented as
    Abstract Syntax Tree (AST) interpreters

  18. AST Interpreters
    Parse input program: a = b + 3
    [Figure: AST for a = b + 3. The root "=" has the children "a" and "+"; "+" has the children "b" and "3".]

  19. AST Interpreters
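    Set up input
    [Figure: the AST for a = b + 3 next to a frame with the slots a and b; b holds 2.]

  20. AST Interpreters
    Execute
    [Figure: execution starts at the root "=" node; the frame holds b = 2.]

  21. AST Interpreters
    Execute
    [Figure: the leaf nodes are evaluated; "b" yields 2 and the literal yields 3.]

  22. AST Interpreters
    Execute
    [Figure: the "+" node yields 5, which is passed on to the assignment to a.]

  23. AST Interpreters
    Execute
    [Figure: the "=" node stores the result; the frame now holds a = 5 and b = 2.]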
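
The parse, set-up, and execute steps on slides 18 to 23 can be sketched as a tiny tree-walking interpreter. This is only an illustration in plain Java: the class, the `Frame` type, and the node factory methods are hypothetical names and not part of Truffle.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal AST interpreter for the statement a = b + 3 (all names are illustrative).
final class AstInterpreterSketch {

    // The frame maps variable names to their current values.
    static final class Frame {
        final Map<String, Integer> slots = new HashMap<>();
    }

    // Every node knows how to execute itself and returns its value.
    interface Node {
        int execute(Frame frame);
    }

    static Node literal(int value) {
        return frame -> value;
    }

    static Node read(String name) {
        return frame -> frame.slots.get(name);
    }

    static Node add(Node left, Node right) {
        return frame -> left.execute(frame) + right.execute(frame);
    }

    static Node assign(String name, Node value) {
        return frame -> {
            int result = value.execute(frame);
            frame.slots.put(name, result);
            return result;
        };
    }

    // Builds the tree for a = b + 3, sets up b, and executes the root node.
    static int run(int b) {
        Node root = assign("a", add(read("b"), literal(3)));
        Frame frame = new Frame();
        frame.slots.put("b", b);
        root.execute(frame);
        return frame.slots.get("a");
    }

    public static void main(String[] args) {
        System.out.println(run(2)); // prints 5
    }
}
```

Calling `run(2)` mirrors the slides: b is set to 2, the leaves yield 2 and 3, the "+" node yields 5, and the "=" node stores 5 in a.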

  24. AST Interpreters Optimization
    Truffle AST interpreters specialize for their input
    [Figure: the AST for a = b + 3 with the frame a = 5, b = 2; legend: Variable, Integer.]

  25. AST Interpreters Optimization
    Truffle AST interpreters specialize for their input
    if (input is as expected) {
      execute specialized operation
    } else {
      rewrite node
    }
    [Figure: the AST for a = b + 3 with the frame a = 5, b = 2; legend: Variable, Integer.]

  26. AST Interpreters Optimization
    Partial Evaluation
    [Figure: the AST for a = b + 3 before and after partial evaluation; legend: Variable, Integer.]

  27. AST Interpreters Optimization
    Compilation
    if (b is an Integer) {
      a = b + 3
    } else {
      deoptimize and rewrite node
    }
    [Figure: the partially evaluated AST for a = b + 3; legend: Variable, Integer.]

  28. AST Interpreters Optimization
    Deoptimize
    [Figure: with the frame a = 5, b = "icw", execution falls back from the compiled code to the AST; legend: Variable, Integer.]

  29. AST Interpreters Optimization
    Respecialize
    [Figure: the AST for a = b + 3 before and after respecialization; legend: Variable, Integer, Generic.]
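
The "execute specialized operation, else rewrite node" pattern from slides 25 to 29 can be sketched as a node that tracks its own specialization state. This is a plain-Java illustration and not the Truffle API; the `State` enum, the `AddNode` class, and the choice of string concatenation as the generic fallback are assumptions made for the example.

```java
// Sketch of a self-specializing "+" node (illustrative; this is not the Truffle API).
final class SpecializingAddSketch {

    // Specialization states of the node, loosely following the legend on the slides.
    enum State { UNINITIALIZED, INTEGER, GENERIC }

    static final class AddNode {
        State state = State.UNINITIALIZED;

        Object execute(Object left, Object right) {
            switch (state) {
                case UNINITIALIZED:
                    // First execution: pick a specialization based on the observed input.
                    state = (left instanceof Integer && right instanceof Integer)
                            ? State.INTEGER : State.GENERIC;
                    return execute(left, right);
                case INTEGER:
                    if (left instanceof Integer && right instanceof Integer) {
                        // Input is as expected: execute the specialized operation.
                        return (Integer) left + (Integer) right;
                    }
                    // Input is not as expected: rewrite to the generic version.
                    state = State.GENERIC;
                    return execute(left, right);
                default:
                    return executeGeneric(left, right);
            }
        }

        // Generic fallback (assumed semantics): integer addition if possible,
        // string concatenation otherwise.
        private static Object executeGeneric(Object left, Object right) {
            if (left instanceof Integer && right instanceof Integer) {
                return (Integer) left + (Integer) right;
            }
            return String.valueOf(left) + right;
        }
    }

    public static void main(String[] args) {
        AddNode add = new AddNode();
        System.out.println(add.execute(2, 3) + " " + add.state);
        System.out.println(add.execute("icw", 3) + " " + add.state);
    }
}
```

In a real Truffle interpreter the compiled code would deoptimize before the rewrite; the sketch only models the state change that happens in the interpreter.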

  31. GraalVM
    24
    (Grimmer et al. 2015)
    TruffleRuby Graal.js Graal.python FastR
    Language interoperability
    support for individual
    language pairs would not
    scale

  33. GraalVM
    25
    TruffleRuby Graal.js Graal.python FastR
    Idea: Implement a language-
    independent mechanism
    based on messages
    (Grimmer et al. 2015)

  35. Message-Based Foreign Access
    a = b + 3
    [Figure: the AST for a = b + 3, in which b is a foreign object (value 2) that is accessed through a READ message node.]
    Foreign objects can be accessed by sending a message to the foreign language implementation

  37. Message-Based Foreign Accesses
    [Figure: executing the READ message on the foreign object b (value 2) rewrites the AST so that b is accessed directly.]
    Subsequent reads do not need to send a message
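
The idea that only the first read has to resolve a message can be sketched as a node that caches the resolved access. The interface and class names below are hypothetical and are not the Truffle interoperability API; the sketch caches the resolved access inside the node instead of rewriting the tree.

```java
import java.util.Map;

// Sketch of message-based foreign access (illustrative names; not the Truffle interop API).
final class ForeignReadSketch {

    // What a language implementation exposes for its objects.
    interface ForeignAccess {
        boolean accepts(Object receiver);
        Object read(Object receiver, String member);
    }

    // Hypothetical access implementation of a guest language whose objects are maps.
    static final class MapAccess implements ForeignAccess {
        public boolean accepts(Object receiver) { return receiver instanceof Map; }
        public Object read(Object receiver, String member) {
            return ((Map<?, ?>) receiver).get(member);
        }
    }

    // A READ message node: the first execution resolves the message, later ones reuse the result.
    static final class ReadMessageNode {
        private final ForeignAccess[] languages;
        private ForeignAccess cached; // resolved access, null until the first read
        int resolutions;              // counts how often the message had to be resolved

        ReadMessageNode(ForeignAccess... languages) { this.languages = languages; }

        Object execute(Object receiver, String member) {
            if (cached == null || !cached.accepts(receiver)) {
                cached = resolve(receiver);
            }
            return cached.read(receiver, member);
        }

        private ForeignAccess resolve(Object receiver) {
            resolutions++;
            for (ForeignAccess access : languages) {
                if (access.accepts(receiver)) {
                    return access;
                }
            }
            throw new IllegalArgumentException("no language accepts " + receiver);
        }
    }

    public static void main(String[] args) {
        ReadMessageNode read = new ReadMessageNode(new MapAccess());
        Object foreign = Map.of("b", 2);
        int a = (Integer) read.execute(foreign, "b") + 3;
        read.execute(foreign, "b"); // second read reuses the cached access
        System.out.println(a + " " + read.resolutions);
    }
}
```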

  38. Sulong as Part of GraalVM
    28
    Java Virtual Machine
    Graal Compiler
    Truffle Framework
    https://www.graalvm.org/
    TruffleRuby Graal.js Graal.python FastR
    Native Extension

  40. Sulong as Part of GraalVM
    30
    Java Virtual Machine
    Graal Compiler
    Truffle Framework
    https://www.graalvm.org/
    TruffleRuby Graal.js Graal.python FastR
    Optimization Boundary
    Java Native Interface

  41. Sulong as Part of GraalVM
    31
    Java Virtual Machine
    Graal Compiler
    Truffle Framework
    TruffleRuby Graal.js Graal.python FastR LLVM IR Interpreter
    LLVM IR
    Clang Flang
    Optimization Boundary

  43. How to Deal with C Code Accessing VM Internals?
    32
    Native Extension
    VM
    Native Extension API
    Native extension APIs allow native code
    to access VM internals

  44. Example: Ruby C Extension
    # Ruby Code: array.rb
    s = CArray.new
    puts s.arraySum([1,2,3])
    // The C extension: array.c
    #include "ruby.h"
    VALUE c_arraySum(VALUE self, VALUE array) {
      int sum = 0;
      for (int i = 0; i < RARRAY_LEN(array); i++) {
        sum += FIX2INT(rb_ary_entry(array, i));
      }
      return INT2FIX(sum);
    }
    Slide modified from Matthias Grimmer, with permission

  45. Example: Ruby C Extension
    // The C extension: array.c
    #include "ruby.h"
    VALUE c_arraySum(VALUE self, VALUE array) {
      int sum = 0;
      for (int i = 0; i < RARRAY_LEN(array); i++) {
        sum += FIX2INT(rb_ary_entry(array, i));
      }
      return INT2FIX(sum);
    }
    // ruby.h
    typedef void *VALUE;
    typedef void *ID;
    VALUE rb_ary_entry(VALUE ary, long idx);
    Slide modified from Matthias Grimmer, with permission
    Programmers write their native extensions using the API provided by MRI

  46. Example: Ruby C Extension
    // The C extension: array.c
    #include "ruby.h"
    VALUE c_arraySum(VALUE self, VALUE array) {
      int sum = 0;
      for (int i = 0; i < RARRAY_LEN(array); i++) {
        sum += FIX2INT(rb_ary_entry(array, i));
      }
      return INT2FIX(sum);
    }
    Slide modified from Matthias Grimmer, with permission
    // ruby.c
    #include "ruby.h"
    #include "truffle.h"
    VALUE rb_ary_entry(VALUE ary, long idx) {
      return truffle_read_idx(ary, (int) idx);
    }
    int FIX2INT(VALUE value) {
      return truffle_invoke_i(RUBY_CEXT, "rb_fix2int", value);
    }
    truffle_read_idx and truffle_invoke_i are Sulong intrinsics that send messages

  48. Example: Ruby C Extension
    // The C extension: array.c
    #include "ruby.h"
    VALUE c_arraySum(VALUE self, VALUE array) {
      int sum = 0;
      for (int i = 0; i < RARRAY_LEN(array); i++) {
        sum += FIX2INT(rb_ary_entry(array, i));
      }
      return INT2FIX(sum);
    }
    Slide modified from Matthias Grimmer, with permission
    // ruby.c
    #include "ruby.h"
    #include "truffle.h"
    VALUE rb_ary_entry(VALUE ary, long idx) {
      return truffle_read_idx(ary, (int) idx);
    }
    int FIX2INT(VALUE value) {
      return truffle_invoke_i(RUBY_CEXT, "rb_fix2int", value);
    }
    # ruby.rb
    def rb_fix2int(value)
      if value.nil?
        raise TypeError
      else
        int = value.to_int
        raise RangeError if int >= 2**32
        int
      end
    end

  50. Performance
    [Figure: bar chart of peak performance relative to MRI running pure Ruby. MRI with C extensions: 11; GraalVM with C extensions: 32.]
    Slide modified from Matthias Grimmer, with permission
    Truffle can inline the function call from Ruby to C!

  51. 38
    Safe Sulong and how it safely executes LLVM-based
    Languages

  52. Problem: C/C++ are unsafe languages
    39
    Undefined
    Behavior (UB)
    “behavior, upon use of a nonportable
    or erroneous program construct or of
    erroneous data, for which this
    International Standard imposes no
    requirements”
    (C99 standard)

  53. Examples for Undefined Behavior
    Buffer overflow
    Use-after-free error
    Integer overflow

  54. Buffer Overflows: Leaking Sensitive Data
    long *arr = malloc(3 * sizeof(long));
    [Figure: the memory right behind arr holds a secret value.]

  56. Buffer Overflows: Leaking Sensitive Data
    long *arr = malloc(3 * sizeof(long));
    long dest[4];
    memcpy(dest, arr, sizeof(dest));
    [Figure: memcpy reads past the end of arr into the adjacent secret; the out-of-bounds read is marked as UB.]

  59. Buffer Overflows: Leaking Sensitive Data
    long *arr = malloc(3 * sizeof(long));
    long dest[4];
    memcpy(dest, arr, sizeof(dest));
    [Figure: arr is followed in memory by a secret value; after the copy, dest also contains the secret.]
    Heartbleed and Cloudbleed were such vulnerabilities
    Writes can allow attackers to change a program’s control flow

  61. Use-after-free Error
    44
    long *arr = malloc(3 * sizeof(long));
    free(arr);
    arr[0] = …;
    UB
    Another object can be overwritten if
    the memory has been reallocated

  63. Integer Overflow
    45
    int a = 1, b = INT_MAX;
    int val = a + b;
    UB
    Can result in inconsistent or
    surprising behavior if UB is
    “optimized away”

  65. Integer Overflow
    46
    void pause() {
    int a = 0;
    // run until overflow
    while (a < a + 1) {
    a++;
    }
    }
    What’s the compilation output of Clang/GCC?
    1. The function works as expected by the
    programmer
    2. The function body is optimized away
    3. The function results in an endless loop
    4. It depends on the optimization level

  68. Integer Overflow
    void pause() {
      int a = 0;
      // run until overflow
      while (a < a + 1) {
        a++;
      }
    }
    -O3:
    loop:
      jmp loop
    -O0:
      mov dword ptr [rsp - 4], 0
      jmp loop_header
    loop_body:
      add dword ptr [rsp - 4], 1
    loop_header:
      mov eax, dword ptr [rsp - 4]
      mov ecx, dword ptr [rsp - 4]
      add ecx, 1
      cmp eax, ecx
      jl loop_body
      ret

  71. Goal of my PhD
    49
    Tackle UB by
    safely and efficiently executing
    unsafe languages on the JVM
    Well-defined semantics even for errors
    and corner cases

  73. 51
    Existing Approaches
    Instrumentation-
    based bug-finding
    tools
    Symbolic
    execution
    Safe
    languages
    Hardware
    security
    Static
    analysis
    Attacker
    mitigation

  76. State of the Art: Instrumentation-based Tools
    Compile-time instrumentation
    • AddressSanitizer
    • SoftBound+CETS
    Run-time instrumentation
    • Memcheck
    • Dr. Memory
    [Figure: Clang/GCC compiles a C program to a.out; running ./a.out prints "Hello world!".]

  78. Conundrum: Finding Bugs vs. Performance
    [Figure: Clang/GCC compiles a C program to a.out; running ./a.out prints "Hello world!".]
    Static compilers: optimize code based on Undefined Behavior
    Bug-finding tools: find bugs assuming that violations are visible side effects
    (Wang et al. 2012, D'Silva 2015)

  79. Conundrum: Finding Bugs vs. Performance
    54
    To find all bugs, developers need to
    disable compiler optimizations

  85. Map Data Structures and Operations to Java
    long *arr = malloc(3 * sizeof(long));
    arr[4] = …
    Map to Java Code
    long[] arr = new long[3];
    arr[4] = …
    ArrayIndexOutOfBoundsException
    The semantics of an out-of-bounds access are well specified
    The JVM’s compiler optimizes the program, but without optimizing Undefined Behavior away

  87. Sulong
    Sulong is a Truffle-based LLVM IR interpreter
    [Figure: Clang compiles program.c and libc.c to LLVM IR, which the LLVM IR interpreter executes on top of Truffle, Graal, and the JVM.]
    We need to disable Clang’s optimizations

  88. Prevent Out-Of-Bounds Accesses
    long *arr = malloc(3 * sizeof(long));
    [Figure: arr is represented by an Address object with offset = 0 whose data field references an I64Array with the contents {0, 0, 0}.]
    [How do we know the type?]
    [Pointer to an integer?]
    [Array bounds check elimination]
    [Strict-aliasing rule]

  90. Prevent Out-Of-Bounds Accesses
    long *arr = malloc(3 * sizeof(long));
    arr[4] = …
    [Figure: the Address object now has offset = 4; its data field references an I64Array with the contents {0, 0, 0}.]
    contents[4] → ArrayIndexOutOfBoundsException
    [Pointer to an integer?]
    [Array bounds check elimination]
    [Strict-aliasing rule]
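
The Address and I64Array boxes on slides 88 to 90 suggest a managed pointer that pairs an object reference with an offset. The following is a rough sketch under that assumption; the method names (`mallocLongs`, `plus`, `store`, `load`) are made up for illustration and are not Safe Sulong's actual classes.

```java
// Sketch of a managed pointer, loosely following the Address/I64Array boxes on the slides.
final class ManagedPointerSketch {

    // Managed replacement for a malloc'ed block of longs.
    static final class I64Array {
        long[] contents;
        I64Array(int length) { contents = new long[length]; }
    }

    // A pointer is a reference to a managed object plus an element offset.
    static final class Address {
        final I64Array data;
        final int offset;

        Address(I64Array data, int offset) {
            this.data = data;
            this.offset = offset;
        }

        // Pointer arithmetic only changes the offset; nothing is checked yet.
        Address plus(int elements) { return new Address(data, offset + elements); }

        // Every access goes through a Java array, so the JVM checks the bounds.
        void store(long value) { data.contents[offset] = value; }
        long load() { return data.contents[offset]; }
    }

    // Stand-in for malloc(count * sizeof(long)).
    static Address mallocLongs(int count) {
        return new Address(new I64Array(count), 0);
    }

    public static void main(String[] args) {
        Address arr = mallocLongs(3);
        arr.plus(2).store(42);       // arr[2] = 42 is in bounds
        try {
            arr.plus(4).store(1);    // arr[4] = 1 is out of bounds
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("out-of-bounds store detected");
        }
    }
}
```

Because the check is part of the Java array access itself, it cannot be skipped for individual objects; whether the JIT compiler can prove it redundant is a separate question (see the "Array bounds check elimination" backup slide).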

  91. Prevent Use-after-Free Errors
    long *arr = malloc(3 * sizeof(long));
    free(arr);
    [Figure: before the call to free, the Address object (offset = 0) references an I64Array with the contents {0, 0, 0}.]
    [Pointer to an integer?]
    [Strict-aliasing rule]

  92. Prevent Use-after-Free Errors
    long *arr = malloc(3 * sizeof(long));
    free(arr);
    [Figure: after the call to free, the I64Array referenced by the Address object (offset = 0) has contents = null.]
    [Pointer to an integer?]
    [Strict-aliasing rule]

  94. Prevent Use-after-Free Errors
    long *arr = malloc(3 * sizeof(long));
    free(arr);
    arr[0] = …
    [Figure: the Address object (offset = 0) references an I64Array with contents = null.]
    contents[0] → NullPointerException
    [Pointer to an integer?]
    [Strict-aliasing rule]
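
The "contents = null" step on slides 92 to 94 can be sketched as follows. The helper names (`mallocLongs`, `free`, `store`) are hypothetical; the sketch only assumes what the slides show, namely that freeing a block drops its backing array.

```java
// Sketch of use-after-free detection (hypothetical names; "contents = null" follows the slides).
final class UseAfterFreeSketch {

    static final class I64Array {
        long[] contents;
        I64Array(int length) { contents = new long[length]; }
    }

    static I64Array mallocLongs(int count) { return new I64Array(count); }

    // free() drops the backing storage; the garbage collector reclaims it later.
    static void free(I64Array block) { block.contents = null; }

    static void store(I64Array block, int index, long value) {
        block.contents[index] = value; // NullPointerException if the block was freed
    }

    public static void main(String[] args) {
        I64Array arr = mallocLongs(3);
        store(arr, 0, 7);
        free(arr);
        try {
            store(arr, 0, 8); // use after free
        } catch (NullPointerException e) {
            System.out.println("use-after-free detected");
        }
    }
}
```

Since the freed storage is never handed out again while a dangling reference exists, a stale pointer cannot overwrite another object, which is the risk slide 61 describes for native execution.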

  96. Prevent Integer Overflows
    63
    int a = 1, b = INT_MAX;
    int val = a + b;
    Math.addExact(a, b);
    ArithmeticException
    [Pointer to an integer?]

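
The difference between Java's plain `+` and `Math.addExact` can be checked directly. The sketch below uses only the standard library; the class and helper names are illustrative.

```java
// Contrast between Java's wrapping "+" and the checked Math.addExact.
final class CheckedAddSketch {

    // Plain int addition is well defined in Java: it wraps around silently.
    static int wrappingAdd(int a, int b) { return a + b; }

    // Returns true if a + b does not fit into an int.
    static boolean overflows(int a, int b) {
        try {
            Math.addExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(wrappingAdd(1, Integer.MAX_VALUE) == Integer.MIN_VALUE); // true
        System.out.println(overflows(1, Integer.MAX_VALUE)); // true
        System.out.println(overflows(1, 2));                 // false
    }
}
```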

  97. Safe Optimizations
    64
    ArrayIndexOutOfBoundsException
    NullPointerException
    ArithmeticException
    Exceptions are visible side effects and
    cannot be optimized away

  98. Evaluation Hypotheses
    • Effectiveness: Safe Sulong detects bugs that are overlooked by other tools
    • Performance: Safe Sulong’s performance overhead is “reasonable”

  100. Effectiveness: Errors in GitHub Projects
    66
    http://ssw.jku.at/General/Staff/ManuelRigger/ASPLOS18-SafeSulong-Bugs.csv
    68 errors in (small) open-source projects

  101. Effectiveness: Errors in GitHub Projects
    • Valgrind detected half of the errors
    • 8 errors not found by LLVM’s AddressSanitizer (and Valgrind)
    • Compiler optimizations (ASan -O3) prevented the detection of 4 additional bugs
    67
    [Comparison tools]

  102. Effectiveness: Errors in GitHub Projects
    68
    int main(int argc, char** argv) {
    printf("%d %s\n", argc, argv[5]);
    }
    [Comparison tools]
    Out-of-bounds accesses to argv
    are not instrumented by ASan

  103. Effectiveness: Errors in GitHub Projects
    69
    https://github.com/google/sanitizers/issues/762

  104. Effectiveness: Errors in GitHub Projects
    • 8 errors not found by LLVM’s AddressSanitizer and Valgrind
    70
    int main(int argc, char** argv) {
    printf("%d %s\n", argc, argv[5]);
    }
    [Comparison tools]
    In Safe Sulong instrumentation
    cannot be omitted by design

  106. Peak Performance
    [Figure: peak-performance chart; lower is better.]
    Safe Sulong’s performance is mostly between Clang -O0 and Clang -O3, and mostly faster than ASan -O0

  108. Sulong as Part of GraalVM
    72
    Java Virtual Machine
    Graal Compiler
    Truffle Framework
    TruffleRuby Graal.js Graal.python FastR LLVM IR Interpreter
    LLVM IR
    Clang Flang
    Optimization Boundary
    Managed Sulong, derived
    from Safe Sulong, is
    available in GraalVM

  109. Sulong Key Collaborators
    Jacob Kreindl, Raphael Mosaner, Roland Schatz, Josef Eisl, Christian Häubl, Matthias Grimmer, Thomas Pointhuber, Daniel Pekarek, Chris Seaton, Lukas Stadler, Florian Angerer, David Gnedt, Swapnil Gaikwad
    https://github.com/graalvm/sulong/graphs/contributors

  110. 74
    The importance of inline assembly
    and compiler builtins

  111. C/C++
    Fortran

  112. What about inline assembly?
    76

  113. What about GCC builtins?
    77

  114. What about linkage features?
    78

  116. Inline Assembly
    Compiler builtins
    System calls
    External Libraries
    Low-level libc/POSIX functions
    Linkage features
    C/C++
    Fortran
    Compiler extensions
    Non-standard-compliant code

  117. Collaborators
    Stefan Marr, Stephen Kell, David Leopoldseder, Hanspeter Mössenböck, Bram Adams

  119. C Projects Consist of More Than C Code
    Compiler builtins
    if (__builtin_expect(x, 0))
      foo();
    Inline Assembly
    asm("rdtsc":"=a"(tickl),"=d"(tickh));
    ~1,000 instructions for a single complex ISA like x86-64
    [Inline assembly details]
    [Inline Assembly and GCC Builtins in Sulong]

  120. C Projects Consist of More Than C Code
    Compiler builtins
    if (__builtin_expect(x, 0))
      foo();
    Over 1,000 GCC builtins
    Inline Assembly
    asm("rdtsc":"=a"(tickl),"=d"(tickh));
    [Inline assembly details]
    [Inline Assembly and GCC Builtins in Sulong]

  122. C Projects Consist of More Than C Code
    85
    How frequently are these used?
    How are they used?
    What is the implementation effort to cover most
    programs?
    How well do comparable tools support them?
    Basis for an informed decision on
    whether and to what extent to
    implement them in Sulong!

  124. Mining of C GitHub Projects
                           GCC Builtins      Inline Assembly
    # studied projects     ~5,000            ~1,300
    Considered projects    All C projects    C Client Applications
    Identification         grep name>        grep asm
    Different setups, so the comparison should be taken with a grain of salt

  125. How widespread are GCC builtins
    and inline assembly fragments?
    87

  126. In How Many Projects are They Used?
    [Figure: bar chart, % of projects. Popular projects with inline assembly: 28%; (popular) projects with GCC builtins: 37%.]
    Both GCC builtins and inline assembly are frequently used by projects

  127. How Often are They Used Within a Project?
    [Figure: bar chart of density (occurrences per KLOC) for popular projects with inline assembly and (popular) projects with GCC builtins; bar labels: 50k and 6k.]
    They are infrequently used within a project

  128. How are inline assembly and
    GCC builtins used?
    90

  131. Inline Assembly
    91
    Inline assembly fragments can
    contain an arbitrary number of
    instructions; how many do they
    typically contain?
    uint64 sqlite3Hwtime(void){
    unsigned long val;
    __asm__ ("rdtsc" : "=A" (val));
    return val;
    }
    __asm__ __volatile__ (
    " leaq %0, %%rax\n"
    " movq %%rbp, 8(%%rax)\n" /* save regs rbp and rsp
    " movq %%rsp, (%%rax)\n"
    " movq %%rax, %%rsp\n" /* make rsp point to &ar
    " movq 16(%%rsp), %%rsi\n" /* rsi = in */
    " movq 32(%%rsp), %%rdi\n" /* rdi = out */
    " movq 24(%%rsp), %%r9\n" /* r9 = last */
    " movq 48(%%rsp), %%r10\n" /* r10 = end */
    " movq 64(%%rsp), %%rbp\n" /* rbp = lcode */
    " movq 72(%%rsp), %%r11\n" /* r11 = dcode */
    " movq 80(%%rsp), %%rdx\n" /* rdx = hold */
    " movl 88(%%rsp), %%ebx\n" /* ebx = bits */
    " movl 100(%%rsp), %%r12d\n" /* r12d = lmask */
    " movl 104(%%rsp), %%r13d\n" /* r13d = dmask */
    /* r14d = len */
    /* r15d = dist */
    " cld\n"
    " cmpq %%rdi, %%r10\n"
    " je .L_one_time\n" /* if only one decode le
    " cmpq %%rsi, %%r9\n"
    " je .L_one_time\n"
    " jmp .L_do_loop\n"
    ".L_one_time:\n"
    " movq %%r12, %%r8\n" /* r8 = lmask */
    " cmpb $32, %%bl\n"
    " ja .L_get_length_code_one_time\n"
    " lodsl\n" /* eax = *(uint *)in++ *
    " movb %%bl, %%cl\n" /* cl = bits, needs it f
    " addb $32, %%bl\n" /* bits += 32 */
    " shlq %%cl, %%rax\n"
    " orq %%rax, %%rdx\n" /* hold |= *((uint *)in)
    " jmp .L_get_length_code_one_time\n"

  132. How are Inline Assembly Fragments Used?
    [Figure: cumulative percentage of projects by number of unique inline assembly fragments per project; 36% is marked at one fragment.]
    A number of projects use only a single inline assembly fragment

  133. How are Inline Assembly Fragments Used?
    [Figure: cumulative percentage of projects by number of unique inline assembly fragments per project; 99% is marked.]
    Almost all projects use fewer than 25 inline assembly fragments

  134. How are Inline Assembly Fragments Used?
    [Figure: cumulative percentage of unique fragments by number of instructions per fragment; annotations: 100% and 438.]
    We also found fragments with several hundred instructions

  135. How are Inline Assembly Fragments Used?
    95
    Inline assembly fragments typically consist of a
    low number of instructions.

  136. How are GCC Builtins Used?
    96
    if (__builtin_expect(x, 0))
    foo ();
    Architecture-independent builtin
    c = __builtin_ia32_paddb(a, b);
    Architecture-specific builtin
    Architecture-specific builtins
    are similar to inline
    assembly. Are they used?

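
To make the two kinds of builtins concrete, here is a sketch of how an interpreter written in Java could emulate them. The method names are made up, and the `paddb` emulation assumes the 64-bit form that adds eight packed bytes with wraparound; this is not Sulong's implementation.

```java
// Sketch of how two GCC builtins could be emulated in Java (method names are made up).
final class BuiltinSketch {

    // __builtin_expect(exp, c) evaluates to exp; the second argument is only a hint.
    static long builtinExpect(long exp, long expected) {
        return exp;
    }

    // Lane-wise addition of eight packed bytes with wraparound (assumed 64-bit paddb form).
    static long paddb(long a, long b) {
        long result = 0;
        for (int lane = 0; lane < 8; lane++) {
            int shift = lane * 8;
            // Only the low byte of each shifted operand contributes to this lane.
            long sum = ((a >>> shift) + (b >>> shift)) & 0xFF;
            result |= sum << shift;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Long.toHexString(paddb(0x01FFL, 0x0101L))); // prints 200
    }
}
```

The architecture-independent builtin needs almost no work, while the architecture-specific one requires modeling an instruction's semantics, which is why it resembles inline assembly.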

  137. How are GCC Builtins Used?
    97
    [Figure: bar chart of the number of projects (y-axis, 0 to 2000): used builtins 38%, machine-independent 36%, machine-specific 8%]
    Mainly machine-independent GCC builtins
    are used.


  138. Machine-specific vs. Machine-independent
    Builtins
    98
    17
    3
    4
    Projects that use machine-specific builtins
    use them in larger numbers.


  139. How well do tools support them
    and how much effort needs to be
    invested to support them?
    99


  140. Tool Support for Inline Assembly
    100
    c2go transpile test.c
    panic: unknown node type: 'GCCAsmStmt 0x3a991f8 '
    goroutine 1 [running]:
    github_com_elliotchance_c2go_ast.Parse
    go/src/github.com/elliotchance/c2go/ast/ast.go:211
    main.convertLinesToNodes
    go/src/github.com/elliotchance/c2go/main.go:81
    main.Start
    go/src/github.com/elliotchance/c2go/main.go:219
    main.runCommand
    go/src/github.com/elliotchance/c2go/main.go:350
    main.main
    go/src/github.com/elliotchance/c2go/main.go:277
    goroutine 6 [finalizer wait]:

    Splint 3.1.2 --- 03 May 2009
    test.c: (in function rdtsc)
    test.c:5:3: Unrecognized identifier: asm
    Identifier used in code has not been declared. (Use -unrecog
    to inhibit warning)
    test.c:5:15: Parse Error. (For help on parse errors, see splint -help
    parseerrors.)
    *** Cannot continue.


  141. Tool Support
    101
    Test suite for the most
    commonly-used 100 builtins


  142. Bugs in CompCert
    102
    https://github.com/AbsInt/CompCert/issues/243
    [Details bug]


  143. 103
    Tool support is lagging behind


  144. How much effort is needed to implement GCC
    Builtins?
    104
    [Details]


  145. How much effort is needed to implement GCC
    Builtins?
    104
    32 builtins to support
    half of projects
    [Details]


  146. How much effort is needed to implement GCC
    Builtins?
    104
    1600 builtins to support 99% of projects
    32 builtins to support
    half of projects
    [Details]


  147. How much effort is needed to implement GCC
    Builtins?
    104
    1600 builtins to support 99% of projects
    32 builtins to support
    half of projects
    [Details]
    Machine-independent builtins
    are the “low-hanging fruit”
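
    One way to arrive at numbers of this kind can be sketched as follows. This is an assumed metric, not necessarily the one used in the study: builtins are numbered by popularity, a project counts as supported once every builtin it uses is implemented, and the question is how many of the most popular builtins are needed to support a given share of projects. The data in `toy_projects` and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy data: bit i is set if the project uses builtin i, where builtins
   are numbered from most popular (0) to least popular (31). */
static const uint32_t toy_projects[4] = {0x001u, 0x003u, 0x003u, 0x105u};

/* Number of projects fully supported once the k most popular builtins
   (bits 0..k-1) are implemented. */
size_t projects_supported(const uint32_t *masks, size_t n, unsigned k)
{
    assert(k <= 32);
    uint32_t implemented =
        (k == 32) ? UINT32_MAX : ((UINT32_C(1) << k) - 1);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        /* Supported if the project uses no unimplemented builtin. */
        if ((masks[i] & ~implemented) == 0)
            count++;
    }
    return count;
}

/* Smallest k so that at least `percent` percent of projects are supported. */
unsigned builtins_needed(const uint32_t *masks, size_t n, unsigned percent)
{
    for (unsigned k = 0; k < 32; k++) {
        if (projects_supported(masks, n, k) * 100 >= n * percent)
            return k;
    }
    return 32; /* implementing all 32 builtins supports every project */
}
```

    With the toy data, `builtins_needed(toy_projects, 4, 50)` yields 2 and `builtins_needed(toy_projects, 4, 99)` yields 9: a single project that uses a rarely used builtin dominates the cost of the last few percent.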


  148. Are they a legacy feature that
    has survived until today?
    105


  149. GCC Builtin Usage Over Time
    106
    [Details]
    We analyzed the commit history of
    the GCC builtin projects


  150. GCC Builtin Usage Over Time
    Trend Projects
    Increasing 38%
    Stagnant 26%
    Decreasing 14%
    Inconclusive 22%
    107
    64% of projects have been
    mainly adding builtins


  151. Research Opportunities
    • Other elements, such as compiler pragmas and function attributes,
    are not widely understood
    • Testing the correct usage of inline assembly and GCC builtins
    • Support in formal models and static analysis tools
    • Automatic approaches?
    108


  152. Inline Assembly
    Compiler builtins
    System calls
    External Libraries
    Low-level libc/POSIX functions
    Linkage features
    C/C++
    Fortran
    Compiler extensions
    Non-standard-compliant code


  153. 110
    Addressing the last 20% of the problem took
    80% of the time


  154. Pareto Principle
    111
    80% of the effects come
    from 20% of the causes


  155. Pareto Principle
    112
    It is useful to consider the “seemingly” less-important
    20% of a problem
    • Avoids oversimplifications
    • Helps design holistic solutions
    • Leads to new research questions


  156. Discussion: What About Other Overlooked
    Problems?
    113
    In which 20% of important use cases do
    current language interoperability approaches fail?
    Which 20% of important use cases cannot be expressed
    with and how does it affect users?
    Which 20% of an approach for connecting heterogeneous code
    provides bad usability and how can we improve on it?


  157. Summary
    114
