Bending The Curve: Putting Rust in Ruby with Helix

Godfrey Chan
September 19, 2017

Two years ago at RubyKaigi, we demonstrated our initial work on Helix, an FFI toolkit that makes it easy for anyone to write Ruby native extensions in Rust. In this talk, we will focus on the challenges and lessons we learned while developing Helix. What did it take to fuse the two languages and still be able to take advantage of their unique features and benefits? How do we distribute the extensions to our end-users? Let's find out!

Transcript

  1. Motivation
    - Ruby is slow…
    - Usually it doesn’t matter
    - Most workloads are I/O bound
    - But occasionally it does…
  2. Motivation
    - “Best of both worlds”: native extensions
    - JSON gem
    - Very fast
    - Transparent to the user
    - Date, Pathname, etc…
  3. Motivation
    - Rust
    - Like C: compiled, statically typed, very fast
    - Unlike C: enjoyable to use, guarantees safety
    - “If it compiles, it doesn’t crash”
    - Same guarantee as Ruby, but without GC
  4. Motivation
    - Zero-cost abstractions™
    - In Ruby: tension between abstractions and performance
    - Hash.keys.each vs Hash.each_key
    - In Rust: no such tradeoff
    - Compiler is magic
  5. Motivation

    pub fn main() {
        let array = [1, 2, 3];

        let mut sum = 0;
        for i in 0..3 {
            sum += array[i]
        }

        println!("{:?}", sum);
    }
  6. Motivation

    pub fn main() {
        let array = [1, 2, 3];

        let sum = array.iter().fold(0, |sum, &val| sum + val);

        println!("{:?}", sum);
    }
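As a rough sketch that is not from the slides, the two `main` functions above can be written as functions and checked against each other. The function names (`sum_indexed`, `sum_fold`, `sum_iter`) are made up for illustration. The sketch only checks that the results agree; the claim that optimized builds compile these forms to comparable machine code is the talk's "zero-cost abstractions" point and is not measured here.

```rust
// Index-based loop, as on slide 5.
fn sum_indexed(values: &[i32]) -> i32 {
    let mut sum = 0;
    for i in 0..values.len() {
        sum += values[i];
    }
    sum
}

// Iterator + fold, as on slide 6.
fn sum_fold(values: &[i32]) -> i32 {
    values.iter().fold(0, |sum, &val| sum + val)
}

// The standard library's `sum` adapter, an even higher-level spelling.
fn sum_iter(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let array = [1, 2, 3];

    // All three styles compute the same value.
    assert_eq!(sum_indexed(&array), sum_fold(&array));
    assert_eq!(sum_fold(&array), sum_iter(&array));

    println!("{:?}", sum_fold(&array));
}
```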
  7. Motivation
    - The vision
    - Keep writing the Ruby you love…
    - …without the fear of eventually hitting a wall
    - Start with Ruby
    - Move pieces to Helix when appropriate
  8. Ease of use

    VALUE greeter_hello(VALUE self, VALUE name) {
      return rb_sprintf("Hello, %"PRIsVALUE".", name);
    }

    void Init_my_extension() {
      VALUE c_Greeter = rb_define_class("Greeter", rb_cObject);
      rb_define_singleton_method(c_Greeter, "hello", greeter_hello, 1);
    }

    class Greeter
      def self.hello(name)
        "Hello, #{name}."
      end
    end
  9. Ease of use

    class Greeter
      def initialize(name)
        @name = name
      end

      def hello
        "Hello, #{@name}."
      end
    end

    struct Greeter { VALUE name; };

    void Greeter_mark(struct Greeter* data) {
      rb_gc_mark(data->name);
    }

    VALUE Greeter_alloc(VALUE self) {
      struct Greeter* greeter;
      return Data_Make_Struct(self, struct Greeter, Greeter_mark, RUBY_DEFAULT_FREE, greeter);
    }

    VALUE Greeter_initialize(VALUE self, VALUE name) {
      struct Greeter* greeter;
      Data_Get_Struct(self, struct Greeter, greeter);
      greeter->name = name;  /* store the Ruby string in the wrapped struct */
      return self;
    }

    VALUE greeter_hello(VALUE self) {
      struct Greeter* greeter;
      Data_Get_Struct(self, struct Greeter, greeter);  /* read the name back out */
      return rb_sprintf("Hello, %"PRIsVALUE".", greeter->name);
    }

    void Init_my_extension() {
      VALUE c_Greeter = rb_define_class("Greeter", rb_cObject);
      rb_define_alloc_func(c_Greeter, Greeter_alloc);
      rb_define_method(c_Greeter, "initialize", Greeter_initialize, 1);
      rb_define_method(c_Greeter, "hello", greeter_hello, 0);
    }
  10. Ease of use

    class Point
      def initialize(x, y)
        @x = x
        @y = y
      end

      attr_reader :x, :y

      def +(other)
        Point.new(@x + other.x, @y + other.y)
      end

      def -(other)
        Point.new(@x - other.x, @y - other.y)
      end

      def distance_from(other)
        delta = self - other
        Math.sqrt(delta.x ** 2 + delta.y ** 2)
      end
    end
  11. Ease of use

    class Greeter
      def self.hello(name)
        "Hello, #{name}."
      end
    end

    ruby! {
      class Greeter {
        def hello(name: String) -> String {
          format!("Hello, {}.", name)
        }
      }
    }
  12. Ease of use

    class Greeter
      def initialize(name)
        @name = name
      end

      def hello
        "Hello, #{@name}."
      end
    end

    ruby! {
      class Greeter {
        struct {
          name: String
        }

        def initialize(helix, name: String) {
          Greeter { helix, name }
        }

        def hello(&self) -> String {
          format!("Hello, {}.", self.name)
        }
      }
    }
  13. Ease of use

    ruby! {
      class Point {
        struct {
          x: f64,
          y: f64
        }

        def initialize(helix, x: f64, y: f64) {
          Point { helix, x, y }
        }

        def x(&self) -> f64 {
          self.x
        }

        def y(&self) -> f64 {
          self.y
        }

        #[ruby_name="+"]
        def add(&self, other: &Point) -> Point {
          Point::new(self.x + other.x, self.y + other.y)
        }

        #[ruby_name="-"]
        def subtract(&self, other: &Point) -> Point {
          Point::new(self.x - other.x, self.y - other.y)
        }

        def distance_from(&self, other: &Point) -> f64 {
          let Point { x, y, .. } = self.subtract(&other);
          x.hypot(y)
        }
      }
    }

    class Point
      def initialize(x, y)
        @x = x
        @y = y
      end

      attr_reader :x, :y

      def +(other)
        Point.new(@x + other.x, @y + other.y)
      end

      def -(other)
        Point.new(@x - other.x, @y - other.y)
      end

      def distance_from(other)
        delta = self - other
        Math.sqrt(delta.x ** 2 + delta.y ** 2)
      end
    end
  14. But how?
    - Write our own parser?
    - But… we need to parse both languages at the same time
    - Also… parsing Rust can be quite difficult
  15. But how?

    impl<P: ProgramTy, C: Bit, R: List> Running<St<Nil, C, R>> for Left<P>
      where P: Running<St<Nil, F, Cons<C, R>>>
    {
      type Output = <P as Running<St<Nil, F, Cons<C, R>>>>::Output;
    }

    From “Rust's Type System is Turing-Complete”
  16. But how?
    - Need something that understands Rust
    - Thankfully that already exists
    - Rust’s macro system
  17. But how?
    - Macros in C:
      - Simple text substitution
      - Macro pre-processor ➡ parser…compiler
    - Macros in Rust:
      - Pattern matching
      - Tokenizer/parser* ➡ macro expansion ➡ …compiler
  18. But how?

    struct Point {
      x: f64,
      y: f64
    }

    impl Point {
      fn x(&self) -> f64 {
        self.x
      }

      fn y(&self) -> f64 {
        self.y
      }

      // ...other methods...
    }
  20. But how?

    struct Point {
      x: f64,
      y: f64
    }

    attr_reader!(Point, x: f64);
    attr_reader!(Point, y: f64);
  21. But how?

    struct Point {
      x: f64,
      y: f64
    }

    attr_reader!(Point, x: f64);
    attr_reader!(Point, y: f64);

    macro_rules! attr_reader {
      ( $struct_name:ident, $field_name:ident : $field_type:ty ) => {
        impl $struct_name {
          fn $field_name(&self) -> $field_type {
            self.$field_name
          }
        }
      };
    }
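For readers who want to try the `attr_reader!` macro, here is a minimal sketch that assembles the slide's pieces into one compilable program. The `coords` helper, the `main` function and the sample values are additions for illustration. In a single file the `macro_rules!` definition has to appear before its first use, which is why it precedes the two invocations here.

```rust
struct Point {
    x: f64,
    y: f64,
}

// Generates `impl $struct_name { fn $field_name(&self) -> $field_type { ... } }`.
macro_rules! attr_reader {
    ($struct_name:ident, $field_name:ident : $field_type:ty) => {
        impl $struct_name {
            fn $field_name(&self) -> $field_type {
                self.$field_name
            }
        }
    };
}

// Each invocation expands into its own `impl Point { ... }` block.
attr_reader!(Point, x: f64);
attr_reader!(Point, y: f64);

// Hypothetical helper: reads both fields through the generated getters.
fn coords(p: &Point) -> (f64, f64) {
    (p.x(), p.y())
}

fn main() {
    let p = Point { x: 3.0, y: 4.0 };
    assert_eq!(coords(&p), (3.0, 4.0));
    println!("x = {}, y = {}", p.x(), p.y());
}
```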
  33. But how?
    - Rust’s macro system has some restrictions
    - Recursion limit
    - Hygiene
    - Each step must expand into valid Rust syntax
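A small sketch, not taken from the talk, of how two of these restrictions show up in ordinary `macro_rules!` code. The macro names `count_tokens!` and `add_hundred!` are invented for this example. The first macro recurses once per input token, so a long enough input would run into the compiler's macro recursion limit (which can be raised with the `#![recursion_limit = "..."]` crate attribute). The second shows hygiene: a local variable introduced inside the macro does not capture a same-named variable at the call site.

```rust
// Recursive macro: every expansion step peels off one token and must itself
// be a complete, valid expression (`1usize + <another macro call>`).
macro_rules! count_tokens {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tokens!($($tail)*) };
}

// Hygiene: the `x` declared here is distinct from any `x` passed in via `$e`.
macro_rules! add_hundred {
    ($e:expr) => {{
        let x = 100;
        $e + x
    }};
}

fn count_three() -> usize {
    count_tokens!(a b c)
}

fn hygiene_demo() -> i32 {
    let x = 1;
    // `$e` is the caller's `x` (1); the macro's own `x` is 100.
    add_hundred!(x)
}

fn main() {
    assert_eq!(count_three(), 3);
    // With C-style text substitution this would be 200, not 101.
    assert_eq!(hygiene_demo(), 101);
    println!("{} {}", count_three(), hygiene_demo());
}
```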
  34. Type safety

    // create a new String
    VALUE rb_str_new(const char*, long);

    // define a new class/module
    VALUE rb_define_class(const char*, VALUE);
    VALUE rb_define_module(const char*);
    VALUE rb_define_class_under(VALUE, const char*, VALUE);
    VALUE rb_define_module_under(VALUE, const char*);
  35. Type safety

    struct VALUE {
      inner: *mut void
    }

    size_of::<VALUE>() == size_of::<*mut void>()
  36. Type safety

    struct CheckedValue<T> {
      inner: VALUE,
      target_type: PhantomData<T>
    }

    size_of::<CheckedValue<T>>() == size_of::<VALUE>() + 0 == size_of::<*mut void>()
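The size equalities on the last two slides can be checked with a short standalone program. This is a sketch under assumptions: the slides write `*mut void` as shorthand, and here `std::os::raw::c_void` stands in for it. The `sizes` helper is added for illustration, and none of this is Helix's actual source.

```rust
use std::marker::PhantomData;
use std::mem::size_of;
use std::os::raw::c_void;

// Stand-in for Ruby's VALUE: a thin wrapper around a raw pointer.
#[allow(non_camel_case_types, dead_code)]
#[derive(Clone, Copy)]
struct VALUE {
    inner: *mut c_void,
}

// The type parameter exists only at compile time; PhantomData<T> is zero-sized.
#[allow(dead_code)]
struct CheckedValue<T> {
    inner: VALUE,
    target_type: PhantomData<T>,
}

// Hypothetical helper returning (pointer size, VALUE size, CheckedValue size).
fn sizes() -> (usize, usize, usize) {
    (
        size_of::<*mut c_void>(),
        size_of::<VALUE>(),
        size_of::<CheckedValue<String>>(),
    )
}

fn main() {
    let (pointer, value, checked) = sizes();
    assert_eq!(pointer, value);
    assert_eq!(value, checked);
    println!("pointer = {}, VALUE = {}, CheckedValue<String> = {}", pointer, value, checked);
}
```

The wrapper carries the "this has been checked to be a `T`" fact purely in the type, so it costs nothing at runtime.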
  37. Type safety

    ruby! {
      class Greeter {
        def hello(name: String) -> String {
          format!("Hello, {}.", name)
        }
      }
    }
  38. Type safety

    # From Ruby
    >> Greeter.hello("Godfrey")
    => "Hello, Godfrey."
    >> Greeter.hello(123456)
    TypeError: Expected a UTF-8 String, got 123456
  39. But how?

    extern "C" fn Greeter_hello(unchecked_name: VALUE) -> VALUE {
      let maybe_name = UncheckedValue::<String>::to_checked(unchecked_name);

      let name = match maybe_name {
        Ok(checked_name) => { checked_name.to_rust() },
        Err(message) => { unsafe { rb_raise(message) } }
      };

      let result: String = format!("Hello, {}.", name);

      result.to_ruby()
    }
  40. But how?

    extern "C" fn Greeter_hello(unchecked_name: VALUE) -> VALUE {
      let maybe_name = UncheckedValue::<String>::to_checked(unchecked_name);

      let name = match maybe_name {
        Ok(checked_name) => { checked_name.to_rust() },
        Err(message) => { unsafe { rb_raise(message) } }
      };

      let result: String = format!("Hello, {}.", name);

      result.to_ruby()
    }

    Callout: raw VALUE (but not just any void pointer!)
  41. But how?

    extern "C" fn Greeter_hello(unchecked_name: VALUE) -> VALUE {
      let maybe_name = UncheckedValue::<String>::to_checked(unchecked_name);

      let name = match maybe_name {
        Ok(checked_name) => { checked_name.to_rust() },
        Err(message) => { unsafe { rb_raise(message) } }
      };

      let result: String = format!("Hello, {}.", name);

      result.to_ruby()
    }

    Callout: Result<CheckedValue<String>, Error>
  42. But how?

    impl UncheckedValue<String> for VALUE {
      fn to_checked(self) -> CheckResult<String> {
        if unsafe { sys::RB_TYPE_P(self, sys::T_STRING) } {
          Ok(unsafe { CheckedValue::<String>::new(self) })
        } else {
          Err(::invalid(self, "a UTF-8 String"))
        }
      }
    }
  43. But how?

    extern "C" fn Greeter_hello(unchecked_name: VALUE) -> VALUE {
      let maybe_name = UncheckedValue::<String>::to_checked(unchecked_name);

      let name = match maybe_name {
        Ok(checked_name) => { checked_name.to_rust() },
        Err(message) => { unsafe { rb_raise(message) } }
      };

      let result: String = format!("Hello, {}.", name);

      result.to_ruby()
    }

    Callouts: CheckedValue<String>, Error
  44. But how?

    extern "C" fn Greeter_hello(unchecked_name: VALUE) -> VALUE {
      let maybe_name = UncheckedValue::<String>::to_checked(unchecked_name);

      let name = match maybe_name {
        Ok(checked_name) => { checked_name.to_rust() },
        Err(message) => { unsafe { rb_raise(message) } }
      };

      let result: String = format!("Hello, {}.", name);

      result.to_ruby()
    }

    Callout: String
  45. But how?

    impl ToRust<String> for CheckedValue<String> {
      fn to_rust(self) -> String {
        let size = unsafe { RSTRING_LEN(self.inner) as usize };
        let ptr = unsafe { RSTRING_PTR(self.inner) as *const u8 };
        let slice = unsafe { slice::from_raw_parts(ptr, size) };
        unsafe { std::str::from_utf8_unchecked(slice) }.to_string()
      }
    }
  46. But how?

    impl ToRust<String> for CheckedValue<String> {
      fn to_rust(self) -> String {
        let size = unsafe { RSTRING_LEN(self.inner) as usize };
        let ptr = unsafe { RSTRING_PTR(self.inner) as *const u8 };
        let slice = unsafe { slice::from_raw_parts(ptr, size) };
        unsafe { std::str::from_utf8_unchecked(slice) }.to_string()
      }
    }

    Callout: CheckedValue<String> == VALUE == *mut void
  49. But how?

    extern "C" fn some_method(arg1_raw: VALUE, arg2_raw: VALUE, …) -> VALUE {
      let arg1_result = UncheckedValue::<T_arg1>::to_checked(arg1_raw);
      let arg2_result = UncheckedValue::<T_arg2>::to_checked(arg2_raw);
      // …

      let arg1 = match arg1_result {
        Ok(checked) => { checked.to_rust() },
        // Err(m) => …
      };

      let arg2 = match arg2_result {
        Ok(checked) => { checked.to_rust() },
        // Err(m) => …
      };

      // …

      let result: T_return = { /* user code */ };

      result.to_ruby()
    }
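The check-then-convert pipeline above can be imitated without Ruby at all. What follows is a simplified model, not Helix's implementation. A plain `Value` enum stands in for the raw `VALUE` pointer, a `Result` with a `String` error stands in for `rb_raise`, and the `ToRuby` trait, `hello` and `greeter_hello` are names assumed for this sketch. The trait names `UncheckedValue` and `ToRust` follow the slides.

```rust
use std::marker::PhantomData;

// Mock of Ruby's VALUE (assumption): a tagged enum instead of a raw pointer.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Int(i64),
}

// A Value that has already been verified to hold a `T`.
struct CheckedValue<T> {
    inner: Value,
    target_type: PhantomData<T>,
}

trait UncheckedValue<T> {
    fn to_checked(self) -> Result<CheckedValue<T>, String>;
}

trait ToRust<T> {
    fn to_rust(self) -> T;
}

trait ToRuby {
    fn to_ruby(self) -> Value;
}

// Step 1: type check. This is the only place a CheckedValue<String> is built.
impl UncheckedValue<String> for Value {
    fn to_checked(self) -> Result<CheckedValue<String>, String> {
        if matches!(self, Value::Str(_)) {
            Ok(CheckedValue { inner: self, target_type: PhantomData })
        } else {
            Err(format!("Expected a UTF-8 String, got {:?}", self))
        }
    }
}

// Step 2: conversion. It cannot fail because step 1 already succeeded.
impl ToRust<String> for CheckedValue<String> {
    fn to_rust(self) -> String {
        match self.inner {
            Value::Str(s) => s,
            _ => unreachable!("CheckedValue<String> always wraps a Str"),
        }
    }
}

impl ToRuby for String {
    fn to_ruby(self) -> Value {
        Value::Str(self)
    }
}

// The "user code" that would sit inside the ruby! block.
fn hello(name: String) -> String {
    format!("Hello, {}.", name)
}

// What a generated wrapper would do: check, convert, call, convert back.
fn greeter_hello(unchecked_name: Value) -> Result<Value, String> {
    let name = UncheckedValue::<String>::to_checked(unchecked_name)?.to_rust();
    Ok(hello(name).to_ruby())
}

fn main() {
    let ok = greeter_hello(Value::Str("Godfrey".to_string()));
    assert_eq!(ok, Ok(Value::Str("Hello, Godfrey.".to_string())));

    // A non-string argument is rejected before any user code runs.
    assert!(greeter_hello(Value::Int(123456)).is_err());
}
```

Splitting the work into a fallible check and an infallible conversion means all argument errors can be reported before any user code runs, which matches the two-phase shape of the template on slide 49.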
  50. Distribution
    - Requires Rust compiler on servers
    - Potentially not a big deal
    - Not always possible, leads to lower adoption
    - Longer bundle install times
  51. Distribution
    - libv8 gem faced similar problems
    - Takes a (really) long time to compile v8
    - Uses Google's own build system (gyp)
    - Requires Python 2.7
  52. Distribution
    - Another example: skylight gem
    - Written in Rust
    - Supports Linux + Mac OS X
    - Custom VM image to build & publish binary
    - gem install skylight Just Works™
  53. But how?
    - Use CI to build natively on each platform
      - 64-bit Linux (Travis CI / Circle CI)
      - 64-bit Mac OS X (Travis CI / Circle CI)
      - 32/64-bit Windows (Appveyor)
    - Script to publish directly to RubyGems from CI
  54. But how?

    Gem::Specification.new do |s|
      s.name = 'text_transform'
      s.version = '0.1.0'
      s.authors = ['Godfrey Chan', 'Terence Lee']
      s.summary = "Transform text using the power of helix/rust"
      s.platform = Gem::Platform.local
      s.require_path = 'lib'

      s.add_dependency 'helix_runtime', '~> 0.6.4'
      s.add_development_dependency 'rspec', '~> 3.6'
    end
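  56. But how?

    require 'fileutils'
    require 'yaml'

    # GIT_TAG is assumed to be set earlier from the CI environment.
    system "gem build text_transform.gemspec"

    if GIT_TAG && GIT_TAG.match(/^v[0-9.]+/)
      # Expand "~" explicitly; File.open does not do it for us.
      gem_config_dir = File.expand_path("~/.gem")
      credentials_file = File.join(gem_config_dir, "credentials")

      FileUtils.mkdir_p gem_config_dir

      File.open(credentials_file, 'w') do |f|
        f.puts YAML.dump({ rubygems_api_key: ENV['RUBYGEMS_AUTH_TOKEN'] })
      end

      File.chmod 0600, credentials_file

      system "gem push text_transform-*.gem"
    end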
  58. But how?
    - What about 32-bit Linux?
    - Not supported by public Travis/Circle CI offering
    - Option is to use 32-bit Linux Docker image
    - Also not officially supported by Docker
  59. Roadmap
    - Good use cases today:
      - Use Rust libraries
      - Leverage Rust web browser tech
      - Mailer, Background job, Action Cable
  60. Roadmap
    - Greenfield project
    - Drop-in replacement
    - Reopen class
    - Ship to production
    - Binary distribution
    - Non-traditional use-cases
    - Performance parity with C
    - Miscellaneous features and QoL improvements
  61. How to help
    - Please try it!
    - Hack on Helix together during RubyKaigi?
    - Need help with…
      - Debugging Windows, Linux 32-bit, linker questions, etc.
      - Cross-compilation
      - Documentation
      - Funding?