Ferrari Driven Development: superfast Ruby with Rubex

My talk at Ruby Kaigi 2018, Sendai.

Sameer Deshmukh

June 01, 2018

Transcript

  1. Various solutions exist (partly)
    • Ruby inline. – Doesn’t scale.
    • FFI. – Reductive and manual compilation.
    • SWIG. – Evil, unreadable wrappers.
    • Helix. – Entirely new language/paradigm.
  2. Improvements from last year
    • Rubex is a much more robust and stable language.
    • Lots of refactoring of the internal codebase.
    • Slight shift in Rubex’s goals: from speed alone to portability and readability of C extensions.
  3. What you think of C APIs of gems
    [Diagram labels: “Your Ruby library’s C ext”; “Another C ext API that you are using”]
  4. What it actually feels like
    [Diagram labels: “Your Ruby library’s C ext”; “Another C ext API that you are using”]
  5. Ruby vs. Rubex
    Ruby program:
        def add(a, b)
          return a + b
        end
    Rubex program:
        def add(int a, int b)
          return a + b
        end
  6. Rubex code → C code → CRuby runtime
    • Rubex code: language which looks like Ruby.
    • C code: ready to interface with the Ruby VM.
    • CRuby runtime: code actually runs here.
  7. class Array2Hash
       def self.convert(arr a)
         long int i = a.size, j = 0
         hsh result = {}
         while j < i do
           result[a[j]] = j
           j += 1
         end
         return result
       end
     end
  12. Benchmarks
      Warming up --------------------------------------
                   convert   368.000 i/100ms
      each_with_index.to_h   236.000 i/100ms
      Calculating -------------------------------------
                   convert      3.488k (± 9.8%) i/s -     17.296k in   5.012260s
      each_with_index.to_h      2.192k (± 8.3%) i/s -     11.092k in   5.097432s
      Comparison:
                   convert:     3487.8 i/s
      each_with_index.to_h:     2192.3 i/s - 1.59x slower
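For readers without a Rubex toolchain, the two contenders in the benchmark can be compared for behavior in plain Ruby. The sketch below is not the Rubex code and carries none of its speed benefit; the module name `Array2HashSketch` is made up for illustration. It only shows that the while-loop algorithm from the slide builds the same hash as `each_with_index.to_h`, including the case where a repeated element keeps its last index.

```ruby
# Plain-Ruby stand-in for the Rubex Array2Hash.convert from the slide.
# Array2HashSketch is a hypothetical name used only for this illustration.
module Array2HashSketch
  def self.convert(arr)
    i = arr.size
    j = 0
    result = {}
    while j < i
      result[arr[j]] = j   # a repeated element is overwritten with its last index
      j += 1
    end
    result
  end
end

input = %w[a b c a]
loop_result    = Array2HashSketch.convert(input)
builtin_result = input.each_with_index.to_h

puts loop_result.inspect
puts(loop_result == builtin_result)
```

The Rubex version differs only in the C type annotations (`arr`, `hsh`, `long int`), which is where the speedup reported in the benchmark comes from.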
  13. • GC marking of Ruby objects.
    • Memory deallocation.
    • Write an extconf.rb.
    • struct rb_data_type_t.
    • TypedData_Make_Struct().
    • TypedData_Get_Struct().
    • rb_define_instance_method().
    • rb_define_class().
    • rb_define_alloc_func().
  14. class BlanketWrapper attach blanket
        def initialize(warmth_factor, owner, len, breadth)
          data$.blanket.warmth_factor = warmth_factor
          data$.blanket.owner = owner
          data$.blanket.len = len
          data$.blanket.breadth = breadth
        end

        def warmth_factor
          return data$.blanket.warmth_factor
        end
        # ... more code for blanket interface.
      end
  23. Rubex struct wrapping
    • ~3x reduction in LoC written.
    • Friendly, elegant Ruby-like interface.
    • No compromise in speed.
    • No C code!
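To make the "Ruby-like interface" point concrete, here is a rough plain-Ruby sketch of how the wrapper looks from the calling side. It is not Rubex: a Ruby `Struct` stands in for the C struct `blanket`, and the names `Blanket` and `BlanketWrapperSketch` as well as the sample values are assumptions made for illustration. The field names come from the slide.

```ruby
# Stand-in for the C struct that the deck attaches with `attach blanket`.
# In Rubex this would be a real C struct; here it is an ordinary Ruby Struct.
Blanket = Struct.new(:warmth_factor, :owner, :len, :breadth)

# Hypothetical pure-Ruby counterpart of the BlanketWrapper class on the slide.
class BlanketWrapperSketch
  def initialize(warmth_factor, owner, len, breadth)
    @blanket = Blanket.new(warmth_factor, owner, len, breadth)
  end

  # Reader that mirrors `data$.blanket.warmth_factor` in the Rubex version.
  def warmth_factor
    @blanket.warmth_factor
  end
end

blanket = BlanketWrapperSketch.new(7.5, "owner name", 200, 150)
puts blanket.warmth_factor
```

In the Rubex version the caller-facing API is the same, but the data lives in C memory and the allocation, deallocation, and GC hooks listed earlier are handled by the compiler.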
  24. class Foo
        cfunc void bar(int a, b)
          # some C and Ruby intermix
        end

        def baz(float c, e)
          bar(1, e)
        end
      end
  27. class Klass
        cfunc void foo(int a, int b)
        def bar
        def baz(int a, b, float c)
      end

      class OtherKlass
        cfunc void foo(int a, int b)
        def bar
        def baz(int a, b, float c)
      end
  28. Advantages
    • Easily import C extension APIs through a ‘require_rubex’ compiler declaration.
    • Supply only the compiled binary and API files, like most C libraries.
    • Portable implementations across operating systems.
    • Auto-generated packaging and compiling scripts.
  29. Read a file of 5_00_000 lines, with a value on each line, into memory
    • Read lines 0 – 1_25_000, then compute the sum of the values on those lines.
    • Read lines 1_25_000 – 2_50_000, then compute the sum of the values on those lines.
    • Read lines 2_50_000 – 3_75_000, then compute the sum of the values on those lines.
    • Read lines 3_75_000 – 5_00_000, then compute the sum of the values on those lines.
    • Get all values and compute the average.
  30. Read a file of 5_00_000 lines, with a value on each line, into memory
    • CPU 1: read lines 0 – 1_25_000, then compute the sum of the values on those lines.
    • CPU 2: read lines 1_25_000 – 2_50_000, then compute the sum of the values on those lines.
    • CPU 3: read lines 2_50_000 – 3_75_000, then compute the sum of the values on those lines.
    • CPU 4: read lines 3_75_000 – 5_00_000, then compute the sum of the values on those lines.
  31. Read a file of 5_00_000 lines, with a value on each line, into memory
    • CPU 1: read lines 0 – 1_25_000, then compute the sum of the values on those lines.
    • CPU 2: read lines 1_25_000 – 2_50_000, then compute the sum of the values on those lines.
    • CPU 3: read lines 2_50_000 – 3_75_000, then compute the sum of the values on those lines.
    • CPU 4: read lines 3_75_000 – 5_00_000, then compute the sum of the values on those lines.
    [Diagram label: Global Interpreter Lock]
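The four-way split in the diagram can be sketched in plain Ruby with threads. This is an illustration only: an in-memory array stands in for the file, and the variable names are assumptions. Under CRuby, threads running Ruby code take turns holding the Global Interpreter Lock, so this version produces the correct answer but does not spread the summing across four CPUs.

```ruby
# Hypothetical stand-in for the file contents: one numeric value per line.
values = Array.new(500_000) { |n| n % 10 }

# Split the work into four equal chunks, one per thread.
chunk_size = values.size / 4
threads = values.each_slice(chunk_size).map do |chunk|
  Thread.new { chunk.sum }   # each thread computes a partial sum
end

# Thread#value joins the thread and returns the block's result.
partial_sums = threads.map(&:value)
average = partial_sums.sum.to_f / values.size

puts average
```

Releasing the lock around the C-level computation, as the `no_gil` block in the deck does, is what allows the four chunks to actually run at the same time.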
  32. # In a rubex file test.rubex
      cfunc void _some_computation
        int i, j, k = 0
        no_gil
          # ... perform some computation
        end
      end
  33. # In a calling Ruby script caller.rb
      require 'compiled_binary.so'

      def compute_without_gil
        t = []
        4.times { t << Thread.new { _some_computation } }
        # Wait for every thread to finish.
        t.each(&:join)
      end
  34. Actual implementation
    • Made a simple implementation of the aforementioned example of reading and computing values from a file.
    • Benchmarks indicate a huge difference in performance.
  35. Warming up --------------------------------------
        without GIL in C     3.000 i/100ms
        with GIL in Ruby     1.000 i/100ms
           with GIL in C     1.000 i/100ms
      Calculating -------------------------------------
        without GIL in C     36.210 (± 2.8%) i/s -    183.000 in   5.059510s
        with GIL in Ruby      0.102 (± 0.0%) i/s -      1.000 in   9.830386s
           with GIL in C     18.591 (± 0.0%) i/s -     93.000 in   5.005381s
      Comparison:
        without GIL in C:       36.2 i/s
           with GIL in C:       18.6 i/s - 1.95x slower
        with GIL in Ruby:        0.1 i/s - 355.96x slower
  36. Limitations of GIL release
    • Can only use C data structures inside the no_gil block.
    • Overhead associated with releasing and regaining the GIL.
    • Might break code that depends on the GIL.
  37. Many C functions need to be used
    • rb_raise() for raising an error.
    • rb_rescue(), rb_rescue2(), rb_protect(), rb_ensure() for rescue and ensure blocks.
    • rb_errinfo() for getting the last error raised.
    • rb_set_errinfo(Qnil) for resetting error information.
  38. Workflow becomes complex
    • Almost zero compliance with begin-ensure block workflow.
    • Create C function callbacks.
    • Manually catch and rescue exceptions.
    • Inflexibility in sending data to callbacks.
  39. int i = accept_number()
      begin
        raise(ArgumentError) if i == 3
        raise(FooBarError) if i == 5
      rescue ArgumentError
        i += 1
      rescue FooBarError
        i += 2
      ensure
        i += 10
      end
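The control flow on this slide is the same as in ordinary Ruby, so it can be traced with a plain-Ruby sketch. The method name `adjust` and the definition of `FooBarError` are assumptions added for illustration; the slide's `accept_number()` is replaced by a method argument so the three paths can be exercised directly.

```ruby
# Hypothetical error class standing in for the slide's FooBarError.
class FooBarError < StandardError; end

# Plain-Ruby version of the slide's begin/rescue/ensure flow.
def adjust(i)
  begin
    raise ArgumentError if i == 3
    raise FooBarError if i == 5
  rescue ArgumentError
    i += 1
  rescue FooBarError
    i += 2
  ensure
    i += 10   # runs on every path, whether or not an error was raised
  end
  i
end

puts adjust(3)   # ArgumentError path: 3 + 1 + 10
puts adjust(5)   # FooBarError path:   5 + 2 + 10
puts adjust(1)   # no error:           1 + 10
```

Rubex lets the typed `int i` version read the same way, instead of requiring the `rb_rescue2()` and `rb_ensure()` callbacks a hand-written C extension would need.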
  40. Differences from Ruby
    • Must specify brackets for function calls.
    • No support for blocks/closures (yet).
    • Must specify return keyword to return from functions.
    Differences from C
    • No support for ‘value of’ operator *.
    • No support for -> operator for struct pointers.
  41. Notable Rubex examples
    • Rubex repo examples/ folder.
      – Fully functional libcsv wrapper for reading CSV files written entirely in Rubex.
    • Array2Hash gem
      – https://github.com/v0dro/array2hash
  42. Detailed Docs and Tutorial
    • REFERENCE.md.
      – Complete specification of the entire language.
    • TUTORIAL.md.
      – Quick, easy-to-use explanation with code samples.
  43. Conclusion
    • Rubex is a fast and productive way of writing Ruby C extensions.
    • Provides users with the elegance of Ruby and the power of C while following the principle of least surprise.
    • Provides abstractions in C extensions at no performance cost.
  44. Rubex ideas
    • Typed memory views.
      – Get a ‘memory view’ of contiguous Ruby types.
      – Will work with NMatrix and NArray gems.
    • Direct interfacing with GPUs through native kernels.
      – Zero-abstraction interfacing with GPUs for accelerating computation.
      – Possible use in cumo.
    • Integration with GDB.
  45. Rubyplot – advanced Ruby plotting library
    • Ruby does not have a single native plotting solution that even comes close to the likes of matplotlib/bokeh/something else.
    • Rubyists don’t have a single go-to solution for their visualization needs that can scale.
    • I think this situation is ridiculous for such a mature language ecosystem.
  46. Various partial solutions exist
    • Matplotlib.rb – interfaces with Python’s matplotlib via pycall.
    • Nyaplot – Bokeh-like web visualization, but abandoned by its author.
    • Google charts/high charts/etc. – too much dependence on 3rd party web tools, some of which are paid/non-free.
    • Various gnuplot frontends.
  47. Rubyplot can change that!
    • A native plotting solution written in C++ with a Ruby wrapper.
    • Will directly interface with ImageMagick, GTK and GR to create a powerful plotting tool.
    • Unlike matplotlib, will eventually be a language-neutral C++ library to leverage contributions from other language communities.
  48. View the progress of rubyplot
    • Development started a few weeks ago.
    • Follow on discourse:
      – https://discourse.ruby-data.org/
    • Follow on GitHub:
      – https://github.com/sciruby/rubyplot
    • Contributions/opinions are welcome!
  49. Common array library
    • NMatrix and numo/narray are two major array libraries.
    • Important to bridge this divide and build on a library that is robust and well supported.
    • Potential answer is plures – a language-independent C backend to numpy.
  50. More about plures
    • Plures is supported by Quansight, by the creators of numpy (Python).
    • Common C API across languages/frameworks.
    • Need more discussion on Ruby frontend.
  51. Acknowledgements
    • Ruby Association Grant 2016.
    • Kenta Murata, Koichi Sasada and Naotoshi Seo for their support and mentorship.
    • Fukuoka Ruby Award 2016.
    • Ruby Science Foundation.