Slide 2

Slide 2 text

TruffleRuby: Wrapping up compatibility for C extensions
Petr Chalupa
Principal Member of Technical Staff, Oracle Labs
April 20, 2019
Copyright © 2019, Oracle and/or its affiliates. All rights reserved.

Slide 3

Slide 3 text

Safe Harbor Statement

The following is intended to provide some insight into a line of research in Oracle Labs. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. Oracle reserves the right to alter its development plans and practices at any time, and the development, release, and timing of any features or functionality described in connection with any Oracle product or service remains at the sole discretion of Oracle. Any views expressed in this presentation are my own and do not necessarily reflect the views of Oracle.

Slide 4

Slide 4 text

Program Agenda
1. Technologies
2. Execution of the C extensions
3. Old approach
4. New approach
5. Conclusion

Slide 5

Slide 5 text

Technologies

Slide 6

Slide 6 text

TruffleRuby
• Alternative implementation of Ruby
• A drop-in replacement for the CRuby implementation
  – C extensions support
  – Good startup time for development
• Just-in-time compilation
• Generally faster than any other implementation
  – If not, please file a bug
[Figure: layer diagram with labels TruffleRuby, Ruby]

Slide 7

Slide 7 text

Truffle - Language Implementation Framework
• Specializing (self-modifying) Abstract Syntax Tree interpreter
• Simpler language implementation
• Polyglot protocol
  – Languages can pass values to each other
  – No usual slow language barrier
• Instrumentation
  – Multi-language debugger
  – Profilers
[Figure: layer diagram with labels Truffle, TruffleRuby, Ruby]

Slide 8

Slide 8 text

Truffle - Language Implementation Framework

Slide 9

Slide 9 text

Graal Compiler
• Dynamic compiler written in Java
• In combination with Truffle
  – Inlining
  – Splitting, method cloning
  – Partial Evaluation
• All the optimizations are done in Truffle and Graal rather than in the language implementations
  – Shared
  – Optimized together
[Figure: layer diagram with labels Graal, Truffle, TruffleRuby, Ruby]

Slide 10

Slide 10 text

Java VM
• Executes compiled methods
  – Provided by Graal
• Garbage collector
• Runs Java
[Figure: layer diagram with labels Java VM, Graal, Truffle, TruffleRuby, Ruby]

Slide 11

Slide 11 text

GraalVM
• Distribution of
  – Java VM
  – Graal
  – Truffle
  – Languages you can run
    • Java, Kotlin, Scala, ...
    • Ruby, JS, R, Python, ...
    • C, C++, Fortran, ...
[Figure: layer diagram with labels GraalVM, Java VM, Graal, Truffle, TruffleRuby, Other languages ..., Ruby]

Slide 12

Slide 12 text

Sulong
• LLVM bitcode runtime
  – Technically an interpreter with JIT
  – Any language transformable to LLVM bitcode can be executed
  – E.g. C/C++ and Fortran
• TruffleRuby executes Ruby code
• Sulong executes C extensions
• Both are Truffle languages optimized together by Graal
[Figure: layer diagram with labels GraalVM, Java VM, Graal, Truffle, TruffleRuby, Sulong, C, Ruby]

Slide 13

Slide 13 text

Substrate VM
• Ahead-of-time compilation of Java applications
• TruffleRuby, Sulong, Truffle, and Graal are written in Java
• An executable Ruby binary with fast startup is produced
  – No slow-startup limitation for day-to-day development
[Figure: layer diagram with labels GraalVM, Graal, Truffle, TruffleRuby, Sulong, C, Java VM, Substrate VM, Ruby]

Slide 14

Slide 14 text

Startup time

Of ruby -e "p puts 'Hello world'"

Implementation        Time     Memory MB
TruffleRuby native    0.025    65
CRuby 2.6.2           0.048    14
Rubinius 3.107        0.150    78
JRuby 9.2.7.0         1.357    160
TruffleRuby JVM       1.787    456

Slide 15

Slide 15 text

Execution of the C extensions

Slide 16

Slide 16 text

Execution of the C extensions
• Sulong is a Truffle language
  – Interoperability with other languages
  – VALUEs in the C extension code can be Ruby objects
• Managed vs. unmanaged memory
  – Managed (garbage-collected Ruby) objects cannot be put into unmanaged (native) memory
  – We do tricks to store Ruby objects into native memory (e.g., arrays or structs in C)
• Optimized together
  – In Truffle all languages use the same Intermediate Representation
• Polyglot protocol

Slide 17

Slide 17 text

Polyglot protocol
• An API allowing languages to talk to foreign values without conversion
  – hasMembers, readMember, writeMember, ...
    • Example from C: a_ruby_object->member
  – isPointer, asPointer, ...
    • Example from C: a_struct->member = ruby_object
  – ...
• Part of Truffle
  – Implemented with specializing nodes
    • If C reads from a Ruby object, Ruby provides the nodes defining the Ruby read
  – JITed

Slide 18

Slide 18 text

Understanding C extension evaluation

    static void
    gzfile_reader_rewind(struct gzfile *gz)
    {
        long n;

        n = gz->z.stream.total_in;
        if (!NIL_P(gz->z.input)) {
            n += RSTRING_LEN(gz->z.input);
        }

        rb_funcall(gz->io, id_seek, 2, INT2NUM(-n), INT2FIX(1));
        gzfile_reset(gz);
    }

• NIL_P(gz->z.input): calls nil? on a Ruby object read from a nested struct.
• RSTRING_LEN(gz->z.input): gets the length, as a C long, of a String stored in a nested struct.
• rb_funcall(gz->io, ...): calls a method with arguments on a Ruby object stored in a struct.

Slide 19

Slide 19 text

Old approach

Slide 20

Slide 20 text

Storing handles instead of managed objects
• Managed objects
  – Are managed by the VM
  – Can be moved
  – Can be garbage collected
• A struct in native memory cannot hold a managed object
• Handles are stored instead
  – A handle is a number / virtualized pointer
• A table of handles and managed objects is maintained
  – A managed object cannot be released until its handle is
  – The handles, and therefore the objects, have to be released manually
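The handle table described above can be sketched in plain C as follows. This is only an illustration, not TruffleRuby's implementation: the names handle_for_managed, managed_from_handle, and release_handle, and the fixed table size, are assumptions made for this example, and an ordinary pointer stands in for a managed object because C has no managed heap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 64

/* Hypothetical handle table: slot i holds the object for handle i + 1.
   Handle 0 is reserved to mean "no handle". */
static void *handle_table[TABLE_SIZE];

/* Store the object in the first free slot and return its handle.
   Returns 0 when the table is full. */
uintptr_t handle_for_managed(void *object) {
    assert(object != NULL); /* NULL marks a free slot in this sketch */
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        if (handle_table[i] == NULL) {
            handle_table[i] = object;
            return (uintptr_t)(i + 1);
        }
    }
    return 0;
}

/* Translate a handle back to the object; NULL for an unknown handle. */
void *managed_from_handle(uintptr_t handle) {
    if (handle == 0 || handle > TABLE_SIZE) {
        return NULL;
    }
    return handle_table[handle - 1];
}

/* Until this is called, the table keeps the object reachable.
   Forgetting to call it is the leak discussed later in the deck. */
void release_handle(uintptr_t handle) {
    if (handle != 0 && handle <= TABLE_SIZE) {
        handle_table[handle - 1] = NULL;
    }
}
```

The number returned by handle_for_managed is what a native struct would store in place of the object. Every read of that field then has to go through managed_from_handle, which is why the old approach needed patches throughout the extension source.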

Slide 21

Slide 21 text

A method from zlib C extension

    static void
    zstream_passthrough_input(struct zstream *z)
    {
        if (!NIL_P(z->input)) {
            zstream_append_buffer2(z, z->input);
            z->input = Qnil;
        }
    }

Slide 22

Slide 22 text

A method from zlib C extension

    static void
    zstream_passthrough_input(struct zstream *z)
    {
        if (!NIL_P(rb_tr_managed_from_handle(z->input))) {
            zstream_append_buffer2(z, rb_tr_managed_from_handle(z->input));
            z->input = rb_tr_handle_for_managed(Qnil);
        }
    }

• rb_tr_managed_from_handle: converts the handle back to a managed Ruby object.
• rb_tr_handle_for_managed: converts the managed Ruby object to a handle.
• In the figure, red links are strong references.

Slide 23

Slide 23 text

A method from zlib C extension

    static void
    gzfile_reader_rewind(struct gzfile *gz)
    {
        long n;

        n = gz->z.stream.total_in;
        if (!NIL_P(rb_tr_managed_from_handle(gz->z.input))) {
            n += RSTRING_LEN(rb_tr_managed_from_handle(gz->z.input));
        }

        rb_funcall(rb_tr_managed_from_handle(gz->io), id_seek, 2, INT2NUM(-n), INT2FIX(1));
        gzfile_reset(gz);
    }

• About 200 handle methods added just in zlib.c
  – Not good, too many patches to maintain

Slide 24

Slide 24 text

Using managed structs to reduce handle methods
• Trying to make more things managed to reduce the number of required handle methods
• A managed struct is a Ruby object which behaves as a C struct, replacing native structs
  – A managed struct cannot be stored on the native stack and has to be initialized
  – Sometimes it has to be turned into a pointer
  – Inner structs require special handling
• Does not solve everything
  – The number of patches is reduced, but some remain
  – Calls to native libraries (e.g. libz.so) still require handles

Slide 25

Slide 25 text

A method from zlib C extension

    static void
    zstream_passthrough_input(struct zstream *z)
    {
        if (!NIL_P(rb_tr_managed_from_handle(z->input))) {
            zstream_append_buffer2(z, rb_tr_managed_from_handle(z->input));
            z->input = rb_tr_handle_for_managed(Qnil);
        }
    }

    struct zstream z;
    raise_zlib_error(err, z.stream.msg);

Slide 26

Slide 26 text

A method from zlib C extension

    static void
    zstream_passthrough_input(struct zstream *z)
    {
        if (!NIL_P(rb_tr_managed_from_handle(z->input))) {
            zstream_append_buffer2(z, rb_tr_managed_from_handle(z->input));
            z->input = rb_tr_handle_for_managed(Qnil);
        }
    }

    struct zstream *z;
    z = rb_tr_new_managed_struct(zstream);
    raise_zlib_error(err, z->stream.msg);

• The managed struct cannot be stored on the native stack, so the local variable has to be turned into a pointer.
• The arrow operator then has to be used instead of the dot operator.

Slide 27

Slide 27 text

Leaking handles
• A table handle -> managed object is maintained
  – The managed objects are not released until the handle is
• Part of the C extension patch has to be handle management
  – The C extension has to be understood and the handles freed at the right places
  – Difficult in practice, e.g. a graph of structs representing an XML document
• In the figure, red are strong references, blue are weak

Slide 29

Slide 29 text

New approach

Slide 30

Slide 30 text

Wrap all the Ruby objects before giving them to C
• A wrapper which knows how to be converted to a pointer
  – Converted lazily when needed
  – Allows tracking of all conversions of a wrapper to a pointer
• The pointer is stored into native memory instead of the managed wrapper

    public class ValueWrapper implements TruffleObject {
        private final Object object;
        private long handle;
        // ...

        @ExportMessage
        public boolean isPointer() {
            return true;
        }

        @ExportMessage
        public long asPointer() {
            // lazy: the handle is created on first use (elided on the slide)
            return handle;
        }
        // ...
    }
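The lazy conversion can be sketched in plain C as below. This is a simplified model under stated assumptions, not the actual ValueWrapper: the names value_wrapper, wrapper_init, wrapper_as_pointer, and wrapper_from_pointer are hypothetical, the handle is a simple counter, and the table is a fixed-size array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_HANDLES 64

/* Hypothetical wrapper: an object plus a lazily assigned handle. */
typedef struct value_wrapper {
    void *object;     /* the wrapped (conceptually managed) object */
    uintptr_t handle; /* 0 until a pointer is requested for the first time */
} value_wrapper;

/* Maps handle -> wrapper so a pointer read back from native memory
   can be translated to its wrapper again. */
static value_wrapper *wrappers_by_handle[MAX_HANDLES];
static size_t handles_allocated = 0;

/* Wrapping is cheap: no handle is created yet. */
void wrapper_init(value_wrapper *w, void *object) {
    w->object = object;
    w->handle = 0;
}

/* Called only when the wrapper is about to be stored in native memory.
   The first call allocates the handle, which makes this the single
   place where every wrapper-to-pointer conversion can be tracked. */
uintptr_t wrapper_as_pointer(value_wrapper *w) {
    if (w->handle == 0) {
        assert(handles_allocated < MAX_HANDLES);
        handles_allocated++;
        w->handle = (uintptr_t)handles_allocated;
        wrappers_by_handle[w->handle - 1] = w;
    }
    return w->handle;
}

/* Translate a stored pointer back to its wrapper; NULL if unknown. */
value_wrapper *wrapper_from_pointer(uintptr_t handle) {
    if (handle == 0 || handle > handles_allocated) {
        return NULL;
    }
    return wrappers_by_handle[handle - 1];
}
```

A wrapper that is only passed through C code as an argument never reaches wrapper_as_pointer in this model, so it never costs a table entry.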

Slide 31

Slide 31 text

Wrap all the Ruby objects before giving them to C
• The Ruby/C boundary has to translate back and forth
  – Changes in our implementation, not in the C extensions

    polyglot_invoke(recv, method_name, 2, arg1, arg2)

Slide 32

Slide 32 text

Wrap all the Ruby objects before giving them to C
• The Ruby/C boundary has to translate back and forth
  – Changes in our implementation, not in the C extensions

    rb_tr_wrap(polyglot_invoke(rb_tr_unwrap(recv), method_name, 2,
                               rb_tr_unwrap(arg1), rb_tr_unwrap(arg2)))

Slide 33

Slide 33 text

A method from zlib.c

    static void
    zstream_passthrough_input(struct zstream *z)
    {
        if (!NIL_P(z->input)) {
            zstream_append_buffer2(z, z->input);
            z->input = Qnil;
        }
    }

• NIL_P(z->input): the stored pointer is converted back to the wrapper and then to a Ruby object before nil? is called.
• zstream_append_buffer2(z, z->input): the pointer is simply passed in.
• z->input = Qnil: the Qnil constant already contains the wrapped nil Ruby object, which is converted to a pointer to be stored in the native struct.
• No changes needed in the C extension code
• No patches to maintain

Slide 34

Slide 34 text

Memory management
• We need one solution for everything; we cannot do something special for each C extension
  – Managing per-extension patches is not maintainable in the long term
• MRI keeps objects alive with
  – Stack marking
  – Custom mark functions for C data stored in Ruby objects

Slide 35

Slide 35 text

Memory management – Stack
• MRI keeps alive all objects on the stack
  – We keep them alive by creating a list on each entry into a method implemented in C
  – Every lazily created pointer for a wrapper is added to the list
  – The list is discarded when the C method is left
• Not all wrappers need the pointer created
  – Only those stored into native memory
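The per-call list can be modelled with a small stack of frames, as in the following C sketch. The names (call_frame, enter_c_method, preserve_in_current_call, leave_c_method) and the fixed capacities are assumptions for illustration only. In a real managed runtime the list would hold strong references that the garbage collector sees; here it only records pointers.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 16
#define MAX_PRESERVED 32

/* One list per active call into a method implemented in C. */
typedef struct call_frame {
    void *preserved[MAX_PRESERVED];
    size_t count;
} call_frame;

static call_frame frames[MAX_DEPTH];
static size_t depth = 0;

/* On entry into a C method: start a fresh, empty list. */
void enter_c_method(void) {
    assert(depth < MAX_DEPTH);
    frames[depth].count = 0;
    depth++;
}

/* Called when a wrapper gets its pointer created lazily: the wrapper
   is kept alive for the rest of the current C call. */
void preserve_in_current_call(void *wrapper) {
    assert(depth > 0);
    call_frame *frame = &frames[depth - 1];
    assert(frame->count < MAX_PRESERVED);
    frame->preserved[frame->count++] = wrapper;
}

/* On leaving the C method: the whole list is discarded at once. */
void leave_c_method(void) {
    assert(depth > 0);
    depth--;
    frames[depth].count = 0;
}

/* Number of wrappers preserved by the innermost active call. */
size_t preserved_in_current_call(void) {
    return depth == 0 ? 0 : frames[depth - 1].count;
}
```

Discarding the list on exit plays the role of MRI's stack marking in this model: an object is guaranteed to stay alive only while the C call that obtained its pointer is still running.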

Slide 44

Slide 44 text

Memory management – Mark functions
• C data can be attached to a Ruby object
  – TypedData_Make_Struct
  – There is a custom mark function called during GC
    • Which makes sure the stored objects are not garbage collected

    struct zstream {
        VALUE buf;
        VALUE input;
        // ...
    };

    static void
    zstream_mark(void *p)
    {
        struct zstream *z = p;
        rb_gc_mark(z->buf);
        rb_gc_mark(z->input);
    }

Slide 46

Slide 46 text

Memory management – Mark functions
• We keep a weak list of all mark functions
• We keep a fixed-size buffer of wrappers which needed conversion to a pointer
  – Every lazily created pointer in a wrapper is put into this buffer
• Whenever the buffer is full we run the mark functions
  – Updating the held references to the marked objects

Slide 47

Slide 47 text

Memory management – Mark functions
• Pentagons are wrapped Ruby objects
• Red arrows are strong references
• "A" has C data attached
  – A struct with a single VALUE member
• The Preservation table is a fixed-size buffer
• The Handle table maps pointers to objects

Slide 48

Slide 48 text

Memory management – Mark functions
• Assign B into A's struct member
• Blue arrows are weak references
  – Can be garbage collected
• B's handle is a long (rectangle)
  – It is created lazily when B is being stored into native memory
• C has no handle
  – It is translated to B by the Handle table when needed
• B is put into the Preservation table to prevent its garbage collection

Slide 49

Slide 49 text

Memory management – Mark functions
• When the Preservation table is full the mark functions are executed
  – B is marked by A's mark function, therefore it is put in A's list of marked objects
  – After all mark functions have run we can clear the Preservation table

    static void
    a_mark(void *p)
    {
        struct a_struct *z = p;
        rb_gc_mark(z->member);
    }
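The interplay of the Preservation table and the mark functions can be modelled in plain C as below. This is a sketch under several assumptions: owner, register_owner, sketch_gc_mark (a stand-in for rb_gc_mark), run_mark_functions, and preserve are hypothetical names, the table sizes are arbitrary, and the weak list of mark functions is modelled as an ordinary array because C has no weak references.

```c
#include <assert.h>
#include <stddef.h>

#define PRESERVATION_SIZE 4
#define MAX_MARKED 8
#define MAX_OWNERS 8

typedef void (*mark_fn)(void *data);

/* A Ruby-like object with C data attached and a custom mark function. */
typedef struct owner {
    void *data;
    mark_fn mark;
    void *marked[MAX_MARKED]; /* strong references, refreshed on each run */
    size_t marked_count;
} owner;

static void *preservation[PRESERVATION_SIZE];
static size_t preservation_count = 0;
static owner *owners[MAX_OWNERS];
static size_t owner_count = 0;
static owner *currently_marking = NULL;

void register_owner(owner *o) {
    assert(owner_count < MAX_OWNERS);
    owners[owner_count++] = o;
}

/* Stand-in for rb_gc_mark: records the object in the list of the
   owner whose mark function is currently running. */
void sketch_gc_mark(void *object) {
    assert(currently_marking != NULL);
    assert(currently_marking->marked_count < MAX_MARKED);
    currently_marking->marked[currently_marking->marked_count++] = object;
}

/* Rebuild every owner's list of marked objects, then clear the
   Preservation table: the marked lists now keep the objects alive. */
void run_mark_functions(void) {
    for (size_t i = 0; i < owner_count; i++) {
        currently_marking = owners[i];
        currently_marking->marked_count = 0;
        currently_marking->mark(currently_marking->data);
    }
    currently_marking = NULL;
    preservation_count = 0;
}

/* Called for every lazily created pointer. */
void preserve(void *object) {
    preservation[preservation_count++] = object;
    if (preservation_count == PRESERVATION_SIZE) {
        run_mark_functions();
    }
}

/* Example C data and mark function, mirroring a_mark above. */
struct a_struct { void *member; };

void a_mark(void *p) {
    struct a_struct *z = p;
    sketch_gc_mark(z->member);
}
```

In this model a newly stored object is held by the Preservation table until the next mark run, and afterwards only by the owner's list of marked objects. An object that is no longer marked by any owner drops out of every list at that point.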

Slide 50

Slide 50 text

Memory management – Mark functions
• Assign a different Ruby object C into A's struct
• C's handle is stored in the struct
• A's list of marked objects stays pointing to B until the mark functions are run again
• C is put into the Preservation table

Slide 51

Slide 51 text

Memory management – Mark functions
• Another run of the mark functions will store C instead of B into A's list of marked objects

Slide 52

Slide 52 text

Memory management – Mark functions
• B and its handle can be garbage collected
  – Assuming B is not referenced anywhere else

Slide 53

Slide 53 text

Memory management – Mark functions
• If A is not referenced anywhere, everything can be garbage collected
  – Only internal global tables remain
    • Actually thread-local tables

Slide 54

Slide 54 text

Conclusion

Slide 55

Slide 55 text

Much better compatibility
• Significant compatibility improvement
• We run without patches, out of the box
  – All the standard libraries: openssl, zlib, psych, etc, syslog, ...
  – Database adapters: sqlite3, mysql2, pg, ...
    • All these need to be re-implemented for JRuby
  – Gems: puma, nio4r, byebug, websocket_driver, racc, msgpack, nokogiri, ...
    • Probably many more that we do not know about

Slide 56

Slide 56 text

Comparison

TruffleRuby
• C extensions
  – Supported
• Extensions
  – Not required
  – Generally pure Ruby should be fast enough
    • pure JSON on TruffleRuby is faster than Cext on CRuby
• FFI
  – Supported (RC16)

JRuby
• C extensions
  – Not supported
    • Replacements required
• Extensions
  – Java extensions sometimes required
  – For performance reasons
  – Both C and Java extensions
• FFI
  – Supported

CRuby
• C extensions
  – Supported
• Extensions
  – Required
  – For performance reasons
• FFI
  – Supported
  – Not enough gems though

Slide 57

Slide 57 text

Status
• We are ready for experiments
• Open-source: https://github.com/oracle/truffleruby
• Give TruffleRuby a try and please report issues on GitHub
  – We are actively working on them
• Installation, latest release for RubyKaigi with full FFI:
  – rvm install truffleruby
  – rbenv install truffleruby-1.0.0-rc16
  – ruby-install truffleruby

Slide 58

Slide 58 text

Safe Harbor Statement

The preceding is intended to provide some insight into a line of research in Oracle Labs. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. Oracle reserves the right to alter its development plans and practices at any time, and the development, release, and timing of any features or functionality described in connection with any Oracle product or service remains at the sole discretion of Oracle. Any views expressed in this presentation are my own and do not necessarily reflect the views of Oracle.
