RubyKaigi 2017 - C how to supercharge your Ruby with Rubex

Sameer Deshmukh

September 18, 2017

Transcript

  1. Namaste! Hello (こんにちは)!

  2. Sameer Deshmukh @v0dro

  3. India | Pune

  5. Master’s degree student (HPC), Tokyo Institute of Technology

  8. Ruby Science Foundation www.sciruby.com @sciruby

  9. Rubex: Highly productive C extensions.

  10. Ruby is an awesome language, but it is slow.

  11. Ruby speed reliability C

  12. Ruby speed reliability C
    • Nokogiri: Nokogiri::XML() (Ruby) on top of libxml (C).
    • fast_blank: String#blank? (Ruby) on top of handwritten C.
  13. C extensions have BIG problems

  14. Difficult and irritating to write. Steep learning curve. Lots of scaffolding code.
  15. Debugging is time consuming.

  16. Manually bootstrap the extension with Ruby.

  17. Need to care about small things™*. *Matz at RDRC 2016

  18. Various solutions exist (partly)
    • Ruby inline: doesn’t scale.
    • FFI: reductive and manual compilation.
    • SWIG: evil, unreadable wrappers.
    • Helix: entirely new language/paradigm.
  19. Ideal solution: Super-fast (and nice) Ruby.

  21. Rubex: A language with the elegance of Ruby and the power of C.
  22. Ruby vs. Rubex
    Ruby program:
      def add(a, b)
        return a + b
      end
    Rubex program:
      def add(int a, int b)
        return a + b
      end
  23. Rubex code → C code → CRuby runtime
    • Rubex code: language which looks like Ruby.
    • C code: ready to interface with the Ruby VM.
    • CRuby runtime: code actually runs here.
  26. ["a", "b", "see", "d" ... ]

  27. { "a" => 0, "b" => 1, "see" => 2, "d" => 3 ... }
  28. array.each_with_index.to_h
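
As a quick illustration of the one-liner above, here is a minimal Ruby sketch using the sample array from the earlier slide. The variable names `array` and `index_of` are chosen here for illustration only.

```ruby
# Sample input taken from the slide; the names are illustrative.
array = ["a", "b", "see", "d"]

# each_with_index yields [element, index] pairs; to_h turns them into a Hash.
index_of = array.each_with_index.to_h

puts index_of["see"]   # position of "see" in the array
puts index_of["d"]     # position of "d" in the array
```

Each element becomes a key and its position becomes the value, so duplicate elements would keep only their last index.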

  29. class Array2Hash
      def self.convert(arr a)
        long int i = a.size, j = 0
        hsh result = {}
        while j < i do
          result[a[j]] = j
          j += 1
        end
        return result
      end
    end
  34. require 'array2hash.so'
    Array2Hash.convert array

  35. Benchmarks
    Warming up --------------------------------------
                 convert   368.000  i/100ms
    each_with_index.to_h   236.000  i/100ms
    Calculating -------------------------------------
                 convert      3.488k (± 9.8%) i/s -     17.296k in   5.012260s
    each_with_index.to_h      2.192k (± 8.3%) i/s -     11.092k in   5.097432s
    Comparison:
                 convert:     3487.8 i/s
    each_with_index.to_h:     2192.3 i/s - 1.59x slower
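
A comparison along these lines could be set up as in the sketch below. It is only a stand-in: it uses Ruby's standard `benchmark` library rather than whatever tool produced the numbers above, and it times a pure-Ruby while loop instead of the compiled Rubex extension. The helper name `convert_with_while`, the input size and the iteration count are all assumptions made for illustration.

```ruby
require "benchmark"

# Pure-Ruby version of the loop from the Rubex slide (hypothetical helper name).
# It is NOT the compiled extension, so timings will differ from the slide.
def convert_with_while(arr)
  result = {}
  j = 0
  n = arr.size
  while j < n
    result[arr[j]] = j
    j += 1
  end
  result
end

# Assumed input: 10,000 distinct strings.
array = Array.new(10_000) { |k| "item#{k}" }

Benchmark.bm(24) do |x|
  x.report("while loop (pure Ruby)") { 100.times { convert_with_while(array) } }
  x.report("each_with_index.to_h")   { 100.times { array.each_with_index.to_h } }
end
```

Both approaches build the same Hash, so the comparison measures only how the work is done.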
  38. I’m cold. I need a struct blanket.

  39. struct blanket {
      int warmth_factor;
      char* owner;
      float len, breadth;
    };
  40. • GC marking of Ruby objects.
    • Memory deallocation.
    • Write an extconf.rb.
    • struct rb_data_type_t.
    • TypedData_Make_Struct().
    • TypedData_Get_Struct().
    • rb_define_instance_method().
    • rb_define_class().
    • rb_define_alloc_func().
  41. struct blanket
      int warmth_factor
      char* owner
      float len, breadth
    end

  42. class BlanketWrapper attach blanket
      def initialize(warmth_factor, owner, len, breadth)
        data$.blanket.warmth_factor = warmth_factor
        data$.blanket.owner = owner
        data$.blanket.len = len
        data$.blanket.breadth = breadth
      end
      def warmth_factor
        return data$.blanket.warmth_factor
      end
      # ... more code for blanket interface.
    end
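
From the Ruby side, a class like this would presumably be used like any other Ruby object. The sketch below is a hypothetical pure-Ruby stand-in with the same constructor and reader, written only so that the calling code can be run; the real Rubex class keeps these fields in the attached C struct, and the argument values are made up.

```ruby
# Hypothetical pure-Ruby stand-in mirroring the interface on the slide.
# The Rubex version stores these values in a C struct, not instance variables.
class BlanketWrapper
  def initialize(warmth_factor, owner, len, breadth)
    @warmth_factor = warmth_factor
    @owner = owner
    @len = len
    @breadth = breadth
  end

  def warmth_factor
    @warmth_factor
  end
end

# Illustrative values only.
blanket = BlanketWrapper.new(7, "sameer", 2.0, 1.5)
puts blanket.warmth_factor
```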
  50. Rubex struct wrapping
    • ~3x reduction in LoC written.
    • Friendly, elegant Ruby-like interface.
    • No compromise in speed.
    • No C code!
  51. Exception Handling

  52. Many C functions need to be used
    • rb_raise() for raising error.
    • rb_rescue(), rb_rescue2(), rb_protect(), rb_ensure() for rescue and ensure blocks.
    • rb_errinfo() for getting the last error raised.
    • rb_set_errinfo(Qnil) for resetting error information.
  53. Workflow becomes complex
    • Almost zero compliance with begin-ensure block workflow.
    • Create C function callbacks.
    • Manually catch and rescue exceptions.
    • Inflexibility in sending data to callbacks.
  55. int i = accept_number()
    begin
      raise(ArgumentError) if i == 3
      raise(FooBarError) if i == 5
    rescue ArgumentError
      i += 1
    rescue FooBarError
      i += 2
    ensure
      i += 10
    end
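
The control flow on this slide follows ordinary Ruby begin/rescue/ensure semantics, which can be traced with a plain Ruby sketch. The `FooBarError` class definition and the `adjust` helper are assumptions added here so the snippet runs on its own; the slide's `accept_number()` is replaced by a method argument.

```ruby
# Assumed definition; the slide does not show where FooBarError comes from.
class FooBarError < StandardError; end

# Hypothetical helper that takes the number as an argument
# instead of calling accept_number().
def adjust(i)
  begin
    raise ArgumentError if i == 3
    raise FooBarError if i == 5
  rescue ArgumentError
    i += 1
  rescue FooBarError
    i += 2
  ensure
    i += 10   # runs whether or not an exception was raised
  end
  i
end

puts adjust(3)   # 3 + 1 (rescue ArgumentError) + 10 (ensure) = 14
puts adjust(5)   # 5 + 2 (rescue FooBarError) + 10 (ensure) = 17
puts adjust(1)   # no exception, only the ensure block: 11
```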
  56. Interfacing 3rd party C libraries using Rubex.

  57. Building a Rubex wrapper for libcsv, a C library for parsing CSV files.
  58. 3 steps to write libcsv wrapper
    1. Tell Rubex about the functions/types/constants and header files that you will be using.
    2. Use functions in normal Rubex code.
    3. Compile and call in your Ruby script.
  59. lib "csv.h", link: "csv"
      struct csv_parser; end
      # more types ...
      int CSV_STRICT_FINI
      # more macros ...
      int csv_init(csv_parser, unsigned char)
      size_t csv_parse(
        csv_parser *p, void *, size_t,
        void (*cb1)(void *, size_t, void *),
        void (*cb2)(int, void *),
        void *
      )
    end
  66. # Store internal state
    struct rcsv_metadata
      size_t current_col
      size_t current_row
      object last_entry
      object result
    end
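
One way to picture how a state struct like this might be threaded through parser callbacks is the pure-Ruby sketch below. It is a simplified stand-in, not libcsv and not the Rubex wrapper: the `Metadata` struct, the two callback methods and the `naive_parse` driver are hypothetical names, and the driver splits only on newlines and commas with no quoting support.

```ruby
# Hypothetical pure-Ruby counterpart of the state struct on the slide.
Metadata = Struct.new(:current_col, :current_row, :last_entry, :result)

# Invoked once per field: append it to the row being built.
def field_callback(field, meta)
  meta.last_entry << field
  meta.current_col += 1
end

# Invoked once per row: store the finished row and reset per-row state.
def row_callback(meta)
  meta.result << meta.last_entry
  meta.last_entry = []
  meta.current_col = 0
  meta.current_row += 1
end

# Naive driver (assumption): no quoting or escaping, unlike a real CSV parser.
def naive_parse(text)
  meta = Metadata.new(0, 0, [], [])
  text.each_line(chomp: true) do |line|
    line.split(",", -1).each { |field| field_callback(field, meta) }
    row_callback(meta)
  end
  meta.result
end

rows = naive_parse("a,b\n1,2\n")
puts rows.length   # number of parsed rows
```

Keeping all mutable state in one struct means the callbacks need only a single extra argument, which mirrors the trailing `void *` parameter in the csv_parse declaration.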
  67. class LibCSVWrapper
      def self.parse(file_name, opts)
        # allocate memory, initialize variables ...
        begin
          if str_len != csv_parse(&cp, string, str_len, &eof_callback, &eol_callback, &meta)
            # check and raise errors
          end
        ensure
          # free allocated data
        end
        # return computed result
      end
    end
  71. https://github.com/sciruby/rubex

  72. Notable Rubex examples
    • Rubex repo examples/ folder: fully functional libcsv wrapper for reading CSV files, written entirely in Rubex.
    • Array2Hash gem: https://github.com/v0dro/array2hash
  73. Detailed Docs and Tutorial
    • REFERENCE.md: complete specification of the entire language.
    • TUTORIAL.md: quick, easy-to-use explanation with code samples.
  74. Conclusion
    • Rubex is a fast and productive way of writing Ruby C extensions.
    • Provides users with the elegance of Ruby and the power of C while following the principle of least surprise.
    • Future work includes the ability to release the GIL, interfacing with GPUs, and Rubex APIs for gems.
  75. Acknowledgements
    • Ruby Association Grant 2016.
    • Kenta Murata and Koichi Sasada for their support and mentorship.
    • Fukuoka Ruby Award 2016.
  76. I haz SciRuby stickers. ^_^

  77. THANK YOU! Thank you very much (どうもありがとうございます)!