VMM'16: C, C++, and Fortran on the JVM via Sulong

Manuel Rigger
September 02, 2016

Virtual Machine Meetup 2016


Transcript

  1. C, C++, and Fortran on the JVM via Sulong

    @RiggerManuel, Johannes Kepler University. Virtual Machine Meetup, September 2, 2016.
  2. What is Sulong?

    Execute low-level languages (C, C++, Fortran, ...) on the JVM.
  3. What is Sulong?

    Execute low-level languages (C, C++, Fortran, ...) on the JVM by compiling them to LLVM IR.
  4. What is Sulong?

    Execute low-level languages (C, C++, Fortran, ...) on the JVM by compiling them to LLVM IR and interpreting this IR with an LLVM IR interpreter.
  5. What is Sulong?

    Execute low-level languages (C, C++, Fortran, ...) on the JVM by compiling them to LLVM IR, interpreting this IR with an LLVM IR interpreter, and using a dynamic (JIT) compiler to make the approach fast.
  6. Why Do We Need Sulong?

    (1) To implement the native interfaces of dynamic languages (JRuby+Truffle, FastR, Graal.JS). (2) As an alternative to JNI on the JVM.
  7. (1) Truffle as a Multi-Language Runtime

    Diagram: JVM, Truffle, language interoperability.
    [3] Grimmer et al. "High-performance cross-language interoperability in a multi-language runtime."
  8. (1) Truffle as a Multi-Language Runtime

    main.c:

        #include <stdio.h>

        struct complex {
            double r;
            double i;
        };

        int main() {
            struct complex *a = …;
            struct complex *b = …;
            add(a, b);
        }

    complex.js:

        function add(a, b) {
            var result = {r:0, i:0};
            result.r = a.r + b.r;
            result.i = a.i + b.i;
            return result;
        }
  9. (1) Truffle as a Multi-Language Runtime

    Same main.c and complex.js as on slide 8, annotated with the call add(a, b) between the two files.
  10. (1) Truffle as a Multi-Language Runtime

    Same main.c and complex.js as on slide 8, annotated with the call add(a, b) and with the struct accesses a->r, b->r, a->i, and b->i.
  11. (1) Truffle as a Multi-Language Runtime

    "Using LLVM and Sulong for Language C Extensions", Chris Seaton (Oracle Labs). LLVM Cauldron, September 8th, 2016, Hebden Bridge, UK.
  12. (2) Java Platform

    Diagram: JVM, Java Native Interface.
    [1] Rose. "Bytecodes meet combinators: invokedynamic on the JVM." [2] Würthinger et al. "One VM to rule them all."
  13. (2) Java Platform: Native Languages?

    Diagram: JVM, Java Native Interface, native side.
    [1] Rose. "Bytecodes meet combinators: invokedynamic on the JVM." [2] Würthinger et al. "One VM to rule them all."
  14. (2) Disadvantages

    • Slow
    • Transitions between Java and native code
    • Conversions/marshaling
    • Language boundaries are compilation boundaries
    • Breaks Java's safety guarantees
    Diagram: JVM, Java Native Interface, native side.
  15. (2) Java Platform with Sulong

    "Maybe, in the next decade or so we'll see C programs, or C++ programs running in managed mode on top of the JVM, I would not be surprised." John Rose (JVM Architect) @ JVMLS 2016
  16. System Overview

    Diagram: frontends (Clang for C and C++, GCC for Fortran, other LLVM frontends for ...) produce LLVM IR. The LLVM IR interpreter is built on Truffle and runs on the JVM + Graal, next to tooling.
  20. LLVM IR AST Interpreter

    LLVM IR:

        define i32 @sub(i32 %a, i32 %b) {
          %1 = sub nsw i32 %a, %b
          ret i32 %1
        }

    Truffle AST built by the LLVM IR interpreter frontend:

        Function Node
          WriteI32 Node %1
            SubI32 Node
              ReadI32 Node %a
              ReadI32 Node %b
          ReturnI32 Node
            ReadI32 Node %1
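    The tree for @sub can be sketched as a small self-interpreting AST. This is a simplified illustration, not Sulong's code: the classes only borrow the node names from the slide, they do not use the Truffle API, and Frame, FunctionNode.call, and SubExample are hypothetical helpers.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical frame: maps LLVM IR value names such as "%a" to i32 values.
final class Frame {
    private final Map<String, Integer> slots = new HashMap<>();
    int readI32(String name) { return slots.get(name); }
    void writeI32(String name, int value) { slots.put(name, value); }
}

interface I32Node { int executeI32(Frame frame); }
interface StatementNode { void execute(Frame frame); }

final class ReadI32Node implements I32Node {
    private final String slot;
    ReadI32Node(String slot) { this.slot = slot; }
    public int executeI32(Frame frame) { return frame.readI32(slot); }
}

final class SubI32Node implements I32Node {
    private final I32Node left;
    private final I32Node right;
    SubI32Node(I32Node left, I32Node right) { this.left = left; this.right = right; }
    // Java subtraction wraps on overflow; "nsw" leaves that case undefined in LLVM IR.
    public int executeI32(Frame frame) {
        return left.executeI32(frame) - right.executeI32(frame);
    }
}

final class WriteI32Node implements StatementNode {
    private final String slot;
    private final I32Node value;
    WriteI32Node(String slot, I32Node value) { this.slot = slot; this.value = value; }
    public void execute(Frame frame) { frame.writeI32(slot, value.executeI32(frame)); }
}

final class ReturnI32Node implements I32Node {
    private final I32Node value;
    ReturnI32Node(I32Node value) { this.value = value; }
    public int executeI32(Frame frame) { return value.executeI32(frame); }
}

final class FunctionNode {
    private final String[] params;
    private final StatementNode[] body;
    private final ReturnI32Node ret;
    FunctionNode(String[] params, StatementNode[] body, ReturnI32Node ret) {
        this.params = params;
        this.body = body;
        this.ret = ret;
    }
    // Binds the arguments to the parameter slots, runs the body, evaluates the return node.
    int call(int... args) {
        Frame frame = new Frame();
        for (int i = 0; i < params.length; i++) {
            frame.writeI32(params[i], args[i]);
        }
        for (StatementNode statement : body) {
            statement.execute(frame);
        }
        return ret.executeI32(frame);
    }
}

final class SubExample {
    // Builds the tree for: %1 = sub nsw i32 %a, %b ; ret i32 %1
    static FunctionNode buildSub() {
        StatementNode write = new WriteI32Node("%1",
                new SubI32Node(new ReadI32Node("%a"), new ReadI32Node("%b")));
        return new FunctionNode(new String[] {"%a", "%b"},
                new StatementNode[] {write},
                new ReturnI32Node(new ReadI32Node("%1")));
    }
}
```

    Calling SubExample.buildSub().call(7, 5) walks the tree top-down and returns 2. In a Truffle-based interpreter, a tree of this shape is also what the dynamic compiler takes as input.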
  21. Graal Compiler

    The Graal compiler translates the Truffle AST (Function Node, WriteI32 Node %1, SubI32 Node, ReadI32 Node %a, ReadI32 Node %b, ReadI32 Node %1, ReturnI32 Node) into machine code (sub %a %b, ret).
  22. Memory Management?

    • We want to use programs that use non-standard C!
    • We want to support existing machine code!
    • We want to be memory safe!
    Native/Unmanaged Sulong (https://github.com/graalvm/sulong) and Safe/Managed Sulong
  24. Support Non-Standard C

        struct {
            char a;
            int b;
        } example = {1, 2};
        long val = *((long*) &example); // 8589934593

    Layout diagram at exampleAddress, with offsets 0, 4, and 8 marked: a (holding 1) starts at offset 0, b at offset 4.
  25. Support Non-Standard C

    Same struct and layout diagram as on slide 24, with the corresponding Unsafe accesses:

        unsafe.putChar(exampleAddress, (char) 1);
        unsafe.putInt(exampleAddress + 4, 2);
        long val = unsafe.getLong(exampleAddress);

    We use the same data layout as static compilers produce!
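    The value 8589934593 can be reproduced with a small sketch. It assumes a little-endian target, a 4-byte-aligned int, and zeroed padding bytes, and it uses a java.nio.ByteBuffer as a stand-in for raw memory instead of Unsafe so that it runs without off-heap allocation. StructLayoutSketch is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class StructLayoutSketch {
    // Models the 8 bytes of `struct { char a; int b; } example = {1, 2};`
    // and reads them back as one 64-bit value, like *((long*) &example).
    static long readStructAsLong() {
        // Assumption: little-endian byte order, as on x86-64.
        ByteBuffer memory = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
        memory.put(0, (byte) 1); // example.a: one byte at offset 0
        // Offsets 1..3 are padding and stay zero in this sketch.
        memory.putInt(4, 2);     // example.b: four bytes at offset 4
        return memory.getLong(0);
    }
}
```

    The low 32 bits hold 1 and the high 32 bits hold 2, so the result is 2 * 2^32 + 1 = 8589934593, the value in the slide's comment.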
  26. We Want to Support Existing Machine Code!

    Diagram: the Sulong Truffle AST reaches native functions such as malloc through the Graal Native Function Interface.
  27. C Performance

    [Bar chart: The Computer Language Benchmark Game, Sulong vs. Clang O3. Peak runtime performance relative to Clang -O3 (based on LLVM 3.3), higher is better; axis from 0 to 1.4.]
  28. C Performance

    [Bar chart: large single compilation-unit C programs (bzip2, gzip, oggenc), Sulong vs. Clang O3. Peak runtime performance relative to Clang -O3 (based on LLVM 3.3), higher is better; axis from 0 to 1.2.]
  30. Memory Management?

    • We want to use programs that use non-standard C!
    • We want to support existing machine code!
    • We want to be memory safe!
    Native/Unmanaged Sulong (https://github.com/graalvm/sulong) and Safe/Managed Sulong
  32. Memory Errors in C

        int *arr = malloc(4 * sizeof(int));

    [4] Szekeres et al. "SoK: Eternal war in memory."
  33. Memory Errors in C

        int *arr = malloc(4 * sizeof(int));
        … = arr[5];
        arr[5] = …;

    [4] Szekeres et al. "SoK: Eternal war in memory."
  34. Memory Errors in C

        int *arr = malloc(4 * sizeof(int));
        … = arr[5];    // spatial memory safety error
        arr[5] = …;    // spatial memory safety error

    [4] Szekeres et al. "SoK: Eternal war in memory."
  35. Memory Errors in C

        int *arr = malloc(4 * sizeof(int));
        … = arr[5];    // spatial memory safety error
        arr[5] = …;    // spatial memory safety error
        free(arr);
        … = arr[0];
        arr[0] = …;

    [4] Szekeres et al. "SoK: Eternal war in memory."
  36. Memory Errors in C

        int *arr = malloc(4 * sizeof(int));
        … = arr[5];    // spatial memory safety error
        arr[5] = …;    // spatial memory safety error
        free(arr);
        … = arr[0];    // temporal memory safety error
        arr[0] = …;    // temporal memory safety error

    [4] Szekeres et al. "SoK: Eternal war in memory."
  37. Prevent Spatial Errors

        … = arr[5]
        arr[5] = …

    (arr[5] ≈ arr + sizeof(int) * 5)
    ManagedAddress: offset=20, data → I32Array: elementSize=4, contents={1, 2, 3, 4}
  38. Prevent Spatial Errors

        … = arr[5]
        arr[5] = …

    (arr[5] ≈ arr + sizeof(int) * 5)
    ManagedAddress: offset=20, data → I32Array: elementSize=4, contents={1, 2, 3, 4}
    contents[20 / 4] → ArrayIndexOutOfBoundsException
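    A minimal sketch of this bounds check, assuming a managed pointer is a (data, byte offset) pair. The class names follow the boxes on the slide, but the fields, methods, and SpatialSafetySketch are hypothetical and not Sulong's implementation. The sketch also assumes non-negative, 4-byte-aligned offsets.

```java
// Backing object for an int array allocated by the C program.
final class I32Array {
    static final int ELEMENT_SIZE = 4; // bytes per i32 element
    final int[] contents;
    I32Array(int... contents) { this.contents = contents; }
}

// A pointer is an object reference plus a byte offset, never a raw address.
final class ManagedAddress {
    final I32Array data;
    final int offset; // in bytes
    ManagedAddress(I32Array data, int offset) {
        this.data = data;
        this.offset = offset;
    }
    // Pointer arithmetic only changes the offset.
    ManagedAddress plusBytes(int bytes) {
        return new ManagedAddress(data, offset + bytes);
    }
    // The Java array access performs the bounds check.
    int loadI32() {
        return data.contents[offset / I32Array.ELEMENT_SIZE];
    }
    void storeI32(int value) {
        data.contents[offset / I32Array.ELEMENT_SIZE] = value;
    }
}

final class SpatialSafetySketch {
    // Models `… = arr[5]` on a four-element array.
    static boolean outOfBoundsReadIsCaught() {
        ManagedAddress arr = new ManagedAddress(new I32Array(1, 2, 3, 4), 0);
        ManagedAddress fifth = arr.plusBytes(5 * I32Array.ELEMENT_SIZE); // offset=20
        try {
            fifth.loadI32(); // contents[20 / 4]
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }
}
```

    In-bounds accesses behave like the C program expects: arr.plusBytes(8).loadI32() reads the third element. The out-of-bounds access is stopped by the JVM's own array check rather than by extra instrumentation.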
  39. Prevent Temporal Errors

        free(arr);
        … = arr[0]
        arr[0] = …

    ManagedAddress: offset=0, data → I32Array: elementSize=4, contents=null
  40. Prevent Temporal Errors

        free(arr);
        … = arr[0]
        arr[0] = …

    ManagedAddress: offset=0, data → I32Array: elementSize=4, contents=null
    contents[0] → NullPointerException
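    The use-after-free case can be sketched the same way, assuming free(arr) is modeled by dropping the backing Java array. FreeableI32Array and TemporalSafetySketch are hypothetical names used only for this illustration.

```java
// Backing object whose storage can be released, mirroring contents=null on the slide.
final class FreeableI32Array {
    private static final int ELEMENT_SIZE = 4; // bytes per i32 element
    private int[] contents;
    FreeableI32Array(int length) { contents = new int[length]; }

    // Model of free(arr): the object stays reachable, but its storage is gone.
    void free() { contents = null; }

    int load(int byteOffset) { return contents[byteOffset / ELEMENT_SIZE]; }
    void store(int byteOffset, int value) { contents[byteOffset / ELEMENT_SIZE] = value; }
}

final class TemporalSafetySketch {
    // Models `free(arr); … = arr[0]`.
    static boolean useAfterFreeIsCaught() {
        FreeableI32Array arr = new FreeableI32Array(4);
        arr.store(0, 42);
        arr.free();
        try {
            arr.load(0); // contents[0] with contents == null
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```

    Stale pointers keep referring to the same object, so every later access through them fails in a defined way instead of touching reused memory.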
  41. Support Existing Code

    Binary translation for closed-source libraries: x86 → LLVM IR via MC-Semantics [5] / QEMU [6]; ARM → LLVM IR via LLBT [7].
    [5] Dinaburg, Artem, et al. "McSema: Static translation of x86 instructions to LLVM." [6] Chipounov, Vitaly, et al. "Dynamically Translating x86 to LLVM using QEMU." [7] Shen, Bor-Yeh, et al. "LLBT: an LLVM-based static binary translator."
  42. (Native) Sulong Contributors

    Oracle Labs: Chris Seaton, Mick Jordan
    Johannes Kepler University: Benoit Daloze, Daniel Pekarek, David Gnedt, Jacob Kreindl, Katharina Prinz, Manuel Rigger
    University of Manchester: Colin Barrett
  43. Summary

    Why? JRuby+Truffle, FastR, Graal.JS, JVM.
    How?
    • We want to use programs that use non-standard C!
    • We want to support existing machine code!
    • We want to be memory safe!
    Native/Unmanaged Sulong (https://github.com/graalvm/sulong) and Safe/Managed Sulong
  44. References

    [1] Rose, John R. "Bytecodes meet combinators: invokedynamic on the JVM." Proceedings of the Third Workshop on Virtual Machines and Intermediate Languages. ACM, 2009.
    [2] Würthinger, Thomas, et al. "One VM to rule them all." Proceedings of the 2013 ACM international symposium on New ideas, new paradigms, and reflections on programming & software. ACM, 2013.
    [3] Grimmer, Matthias, et al. "High-performance cross-language interoperability in a multi-language runtime." Proceedings of the 11th Symposium on Dynamic Languages. ACM, 2015.
    [4] Szekeres, Laszlo, et al. "SoK: Eternal war in memory." Security and Privacy (SP), 2013 IEEE Symposium on. IEEE, 2013.
    [5] Dinaburg, Artem, and Andrew Ruef. "McSema: Static translation of x86 instructions to LLVM." ReCon 2014 Conference, Montreal, Canada.
    [6] Chipounov, Vitaly, and George Candea. "Dynamically Translating x86 to LLVM using QEMU." No. EPFL-REPORT-149975. 2010.
    [7] Shen, Bor-Yeh, et al. "LLBT: an LLVM-based static binary translator." Proceedings of CASES’12.