Bending The Curve: Putting Rust in Ruby with Helix

Godfrey Chan
September 19, 2017

Two years ago at RubyKaigi, we demonstrated our initial work on Helix, an FFI toolkit that makes it easy for anyone to write Ruby native extensions in Rust. In this talk, we will focus on the challenges and lessons we learned while developing Helix. What did it take to fuse the two languages and still be able to take advantage of their unique features and benefits? How do we distribute the extensions to our end-users? Let's find out!

Transcript

  1. None
  2. Terence Lee @hone02

  3. None
  4. None
  5. Godfrey Chan @chancancode

  6. None
  7. None
  8. Motivation

  9. Motivation
      - Ruby is slow…
      - Usually it doesn’t matter
      - Most workloads are I/O bound
      - But occasionally it does…
  10. Motivation
      - “Best of both worlds”: native extensions
      - JSON gem
      - Very fast
      - Transparent to the user
      - Date, Pathname, etc…
  11. Motivation
      - Example: String#blank?
      - Sam Saffron’s fast_blank
      - 50 LOC in C
      - 20x speedup
  12. Motivation
      - But C…
      - Unsafe, risky
      - Maintenance burden
      - Contribution barrier
  13. Motivation

  14. Motivation
      - Rust
      - Like C: compiled, statically typed, very fast
      - Unlike C: enjoyable to use, guarantees safety
      - “If it compiles, it doesn’t crash”
      - Same guarantee as Ruby, but without GC
  15. Motivation
      - Zero-cost abstractions™
      - In Ruby: tension between abstractions and performance
      - Hash.keys.each vs Hash.each_key
      - In Rust: no such tradeoff
      - Compiler is magic
  16. Motivation

      pub fn main() {
          let array = [1, 2, 3];
          let mut sum = 0;

          for i in 0..3 {
              sum += array[i]
          }

          println!("{:?}", sum);
      }
  17. Motivation

      pub fn main() {
          let array = [1, 2, 3];
          let sum = array.iter().fold(0, |sum, &val| sum + val);

          println!("{:?}", sum);
      }
  18. Motivation
      fast_blank in Rust
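The code on this slide did not survive in the transcript. As a rough idea of what a blank check can look like in plain Rust, here is a minimal sketch. The function name `is_blank` is made up for illustration, and this is not the fast_blank port shown in the talk.

```rust
/// Returns true when the string is empty or contains only whitespace.
/// Assumption: "blank" means every character has the Unicode White_Space
/// property; this may differ from Rails' String#blank? in edge cases.
pub fn is_blank(s: &str) -> bool {
    s.chars().all(char::is_whitespace)
}

fn main() {
    assert!(is_blank(""));
    assert!(is_blank(" \t\n"));
    assert!(!is_blank("  helix  "));
    println!("all blank checks passed");
}
```

The iterator chain is the kind of abstraction the earlier slides call zero-cost: there is no intermediate collection, only a loop over the characters that stops at the first non-whitespace one.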

  19. Motivation

  20. Motivation
      fast_blank in Helix

  21. Motivation
      - The vision
      - Keep writing the Ruby you love…
      - …without the fear of eventually hitting a wall
      - Start with Ruby
      - Move pieces to Helix when appropriate
  22. Demo

  23. http://chancancode.tv/helix

  24. http://usehelix.com

  25. How it works

  26. Challenges

  27. Challenge #1 Ease of use

  28. Ease of use

      VALUE greeter_hello(VALUE self, VALUE name) {
          return rb_sprintf("Hello, %"PRIsVALUE".", name);
      }

      void Init_my_extension() {
          VALUE c_Greeter = rb_define_class("Greeter", rb_cObject);
          rb_define_singleton_method(c_Greeter, "hello", greeter_hello, 1);
      }

      class Greeter
        def self.hello(name)
          "Hello, #{name}."
        end
      end
  29. Ease of use

      class Greeter
        def initialize(name)
          @name = name
        end

        def hello
          "Hello, #{@name}."
        end
      end

      struct Greeter {
          VALUE name;
      };

      void Greeter_mark(struct Greeter* data) {
          rb_gc_mark(data->name);
      }

      VALUE Greeter_alloc(VALUE self) {
          struct Greeter* greeter;
          return Data_Make_Struct(self, struct Greeter, Greeter_mark, RUBY_DEFAULT_FREE, greeter);
      }

      VALUE Greeter_initialize(VALUE self, VALUE name) {
          struct Greeter* greeter;
          Data_Get_Struct(self, struct Greeter, greeter);
          greeter->name = name;
          return self;
      }

      VALUE greeter_hello(VALUE self) {
          struct Greeter* greeter;
          Data_Get_Struct(self, struct Greeter, greeter);
          return rb_sprintf("Hello, %"PRIsVALUE".", greeter->name);
      }

      void Init_my_extension() {
          VALUE c_Greeter = rb_define_class("Greeter", rb_cObject);
          rb_define_alloc_func(c_Greeter, Greeter_alloc);
          rb_define_method(c_Greeter, "initialize", Greeter_initialize, 1);
          rb_define_method(c_Greeter, "hello", greeter_hello, 0);
      }
  30. Ease of use

      class Point
        def initialize(x, y)
          @x = x
          @y = y
        end

        attr_reader :x, :y

        def +(other)
          Point.new(@x + other.x, @y + other.y)
        end

        def -(other)
          Point.new(@x - other.x, @y - other.y)
        end

        def distance_from(other)
          delta = self - other
          Math.sqrt(delta.x ** 2 + delta.y ** 2)
        end
      end
  31. Domain-specific language

  32. Ease of use

      class Greeter
        def self.hello(name)
          "Hello, #{name}."
        end
      end

      ruby! {
          class Greeter {
              def hello(name: String) -> String {
                  format!("Hello, {}.", name)
              }
          }
      }
  33. Ease of use

      class Greeter
        def initialize(name)
          @name = name
        end

        def hello
          "Hello, #{@name}."
        end
      end

      ruby! {
          class Greeter {
              struct {
                  name: String
              }

              def initialize(helix, name: String) {
                  Greeter { helix, name }
              }

              def hello(&self) -> String {
                  format!("Hello, {}.", self.name)
              }
          }
      }
  34. Ease of use

      ruby! {
          class Point {
              struct {
                  x: f64,
                  y: f64
              }

              def initialize(helix, x: f64, y: f64) {
                  Point { helix, x, y }
              }

              def x(&self) -> f64 {
                  self.x
              }

              def y(&self) -> f64 {
                  self.y
              }

              #[ruby_name="+"]
              def add(&self, other: &Point) -> Point {
                  Point::new(self.x + other.x, self.y + other.y)
              }

              #[ruby_name="-"]
              def subtract(&self, other: &Point) -> Point {
                  Point::new(self.x - other.x, self.y - other.y)
              }

              def distance_from(&self, other: &Point) -> f64 {
                  let Point { x, y, .. } = self.subtract(&other);
                  x.hypot(y)
              }
          }
      }

      class Point
        def initialize(x, y)
          @x = x
          @y = y
        end

        attr_reader :x, :y

        def +(other)
          Point.new(@x + other.x, @y + other.y)
        end

        def -(other)
          Point.new(@x - other.x, @y - other.y)
        end

        def distance_from(other)
          delta = self - other
          Math.sqrt(delta.x ** 2 + delta.y ** 2)
        end
      end
  35. But how?

  36. But how?
      - Write our own parser?
      - But… we need to parse both languages at the same time
      - Also… parsing Rust can be quite difficult
  37. But how?

      impl<P: ProgramTy, C: Bit, R: List> Running<St<Nil, C, R>> for Left<P>
      where
          P: Running<St<Nil, F, Cons<C, R>>>
      {
          type Output = <P as Running<St<Nil, F, Cons<C, R>>>>::Output;
      }

      From “Rust's Type System is Turing-Complete”
  38. But how?
      - Need something that understands Rust
      - Thankfully that already exists
      - Rust’s macro system
  39. But how?
      - Macros in C:
        - Simple text substitution
        - Macro pre-processor ➡ parser…compiler
      - Macros in Rust:
        - Pattern matching
        - Tokenizer/parser* ➡ macro expansion ➡ …compiler
  40. But how?

      struct Point {
          x: f64,
          y: f64
      }

      impl Point {
          fn x(&self) -> f64 {
              self.x
          }

          fn y(&self) -> f64 {
              self.y
          }

          // ...other methods...
      }
  42. But how?

      struct Point {
          x: f64,
          y: f64
      }

      attr_reader!(Point, x: f64);
      attr_reader!(Point, y: f64);
  43. But how?

      struct Point {
          x: f64,
          y: f64
      }

      attr_reader!(Point, x: f64);
      attr_reader!(Point, y: f64);

      macro_rules! attr_reader {
          ( $struct_name:ident, $field_name:ident : $field_type:ty ) => {
              impl $struct_name {
                  fn $field_name(&self) -> $field_type {
                      self.$field_name
                  }
              }
          };
      }
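The slide's pieces can be assembled into a small runnable program. This sketch only reorders the slide's code so the macro is defined before it is invoked (which `macro_rules!` requires) and adds a `main` for demonstration. The returned field is copied out, so as written it only works for `Copy` field types such as `f64`.

```rust
struct Point {
    x: f64,
    y: f64,
}

// Matches `Struct, field: Type` and expands to an impl block with one getter.
macro_rules! attr_reader {
    ( $struct_name:ident, $field_name:ident : $field_type:ty ) => {
        impl $struct_name {
            fn $field_name(&self) -> $field_type {
                self.$field_name
            }
        }
    };
}

// Each invocation expands to a separate `impl Point { ... }` block.
attr_reader!(Point, x: f64);
attr_reader!(Point, y: f64);

fn main() {
    let point = Point { x: 3.0, y: 4.0 };
    println!("x = {:?}, y = {:?}", point.x(), point.y());
}
```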
  55. TMI

  56. But how?
      - Rust’s macro system has some restrictions
      - Recursion limit
      - Hygiene
      - Each step must expand into valid Rust syntax
  57. But how?

  58. But how?

  59. But how?

  60. But how?

  61. Pushdown automata
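The slides around this title carry no transcript text, so what follows is only an illustration of the general idea, not Helix's implementation. One common way to process input step by step within `macro_rules!` is to let the macro call itself, consuming one item of input per step while pushing results onto an accumulator that is passed along. The `reverse!` macro and its internal `@acc` rule are invented for this sketch.

```rust
macro_rules! reverse {
    // Base case: the input is exhausted, so emit the accumulated items.
    (@acc [$($acc:expr),*]) => {
        [$($acc),*]
    };
    // Recursive case: take the first remaining item and push it onto the
    // front of the accumulator, then recurse on the rest of the input.
    (@acc [$($acc:expr),*] $head:expr $(, $tail:expr)*) => {
        reverse!(@acc [$head $(, $acc)*] $($tail),*)
    };
    // Entry point: start with an empty accumulator.
    ($($item:expr),* $(,)?) => {
        reverse!(@acc [] $($item),*)
    };
}

fn main() {
    let reversed = reverse!(1, 2, 3);
    assert_eq!(reversed, [3, 2, 1]);
    println!("{:?}", reversed);
}
```

Every intermediate expansion is itself a complete macro invocation, which keeps each step valid Rust syntax, and each input item costs one level of recursion, which is where the recursion limit starts to matter for long inputs.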

  62. But how?

  63. But how?

  64. But how?

  65. But how?

  66. But how?

  67. But how?

  68. But how?

  69. But how?

  70. But how?

  71. But how?

  72. </TMI>

  73. Challenge #2 Type safety

  74. Type safety
      Let’s look at Ruby’s C API

  75. Type safety

      // create a new String
      VALUE rb_str_new(const char*, long);

      // define a new class/module
      VALUE rb_define_class(const char*, VALUE);
      VALUE rb_define_module(const char*);
      VALUE rb_define_class_under(VALUE, const char*, VALUE);
      VALUE rb_define_module_under(VALUE, const char*);
  76. Type safety
      VALUEs everywhere

  77. Type safety

      typedef uintptr_t VALUE;

  78. Type safety

      typedef void* VALUE;

  79. Type safety
      VALUEs everywhere

  80. Type safety
      Safety

  81. Type safety
      - Rust:
      - Memory safety
      - Advanced type system features
  82. Type safety
      - Type inference
      - Generics
      - Lifetime
      - Zero-cost abstractions™
  83. Type safety
      Zero-cost abstractions™

  84. Type safety

  85. Type safety

      type VALUE = *mut void;

  86. Type safety

      struct VALUE(*mut void);

  87. Type safety

      struct VALUE { inner: *mut void }

  88. Type safety

      struct VALUE { inner: *mut void }

      size_of::<VALUE>() == size_of::<*mut void>()
  89. Type safety

      struct CheckedValue<T> {
          inner: VALUE,
          target_type: PhantomData<T>
      }

      size_of::<PhantomData<T>>() == 0
  90. Type safety

      struct CheckedValue<T> {
          inner: VALUE,
          target_type: PhantomData<T>
      }

      size_of::<CheckedValue<T>>() == size_of::<VALUE>() + 0 == size_of::<*mut void>()
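The size claims on the last few slides can be checked with a small standalone program. This is a sketch under assumptions: `std::ffi::c_void` stands in for the `void` type named on the slides, and the `#[repr(transparent)]` and `#[allow(dead_code)]` attributes are additions for this example rather than something the slides show.

```rust
use std::ffi::c_void;
use std::marker::PhantomData;
use std::mem::size_of;

// Newtype wrapper around a raw pointer, as on the slides.
// Assumption: c_void stands in for the slides' `void`.
#[allow(dead_code)]
#[repr(transparent)]
struct VALUE(*mut c_void);

// Carries the target type only at compile time; PhantomData has no runtime size.
#[allow(dead_code)]
struct CheckedValue<T> {
    inner: VALUE,
    target_type: PhantomData<T>,
}

fn main() {
    assert_eq!(size_of::<PhantomData<String>>(), 0);
    assert_eq!(size_of::<VALUE>(), size_of::<*mut c_void>());
    assert_eq!(size_of::<CheckedValue<String>>(), size_of::<*mut c_void>());
    println!("CheckedValue<String> is {} bytes", size_of::<CheckedValue<String>>());
}
```

The wrapper types add compile-time information about what a value has been checked against without making the value any larger than the raw pointer it wraps.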
  91. Coercion

  92. Type safety

      ruby! {
          class Greeter {
              def hello(name: String) -> String {
                  format!("Hello, {}.", name)
              }
          }
      }
  93. Type safety

      # From Ruby
      >> Greeter.hello("Godfrey")
      => "Hello, Godfrey."
      >> Greeter.hello(123456)
      TypeError: Expected a UTF-8 String, got 123456
  94. But how?

  95. But how?

      extern "C" fn Greeter_hello(unchecked_name: VALUE) -> VALUE {
          let maybe_name = UncheckedValue::<String>::to_checked(unchecked_name);

          let name = match maybe_name {
              Ok(checked_name) => { checked_name.to_rust() },
              Err(message) => { unsafe { rb_raise(message) } }
          };

          let result: String = format!("Hello, {}.", name);

          result.to_ruby()
      }
  96. But how?
      Greeter_hello as on slide 95, annotated: raw VALUE (but not just any void pointer!)
  97. But how?
      Greeter_hello as on slide 95, annotated: Result<CheckedValue<String>, Error>
  98. But how?

      impl UncheckedValue<String> for VALUE {
          fn to_checked(self) -> CheckResult<String> {
              if unsafe { sys::RB_TYPE_P(self, sys::T_STRING) } {
                  Ok(unsafe { CheckedValue::<String>::new(self) })
              } else {
                  Err(::invalid(self, "a UTF-8 String"))
              }
          }
      }
  99. But how?
      Greeter_hello as on slide 95, annotated: CheckedValue<String>, Error
  100. But how?
      Greeter_hello as on slide 95, annotated: String
  101. But how?

      impl ToRust<String> for CheckedValue<String> {
          fn to_rust(self) -> String {
              let size = unsafe { RSTRING_LEN(self.inner) as usize };
              let ptr = unsafe { RSTRING_PTR(self.inner) as *const u8 };
              let slice = unsafe { slice::from_raw_parts(ptr, size) };
              unsafe { std::str::from_utf8_unchecked(slice) }.to_string()
          }
      }
  102. But how?
      ToRust impl as on slide 101, annotated: CheckedValue<String> == VALUE == *mut void
  103. But how?
      Greeter_hello as on slide 95, annotated: String
  104. But how?
      Greeter_hello as on slide 95, annotated: String
  105. But how?

  106. But how?

      extern "C" fn some_method(arg1_raw: VALUE, arg2_raw: VALUE, …) -> VALUE {
          let arg1_result = UncheckedValue::<T_arg1>::to_checked(arg1_raw);
          let arg2_result = UncheckedValue::<T_arg2>::to_checked(arg2_raw);
          // …

          let arg1 = match arg1_result {
              Ok(checked) => { checked.to_rust() },
              // Err(m) => …
          };

          let arg2 = match arg2_result {
              Ok(checked) => { checked.to_rust() },
              // Err(m) => …
          };

          // …

          let result: T_return = { /* user code */ };

          result.to_ruby()
      }
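The check-then-convert pattern above can be tried without a Ruby VM. In this sketch the `Value` enum is a hypothetical stand-in for Ruby's `VALUE`, returning `Err` stands in for `rb_raise`, and the trait shapes are simplified from the slides. It illustrates the two-step idea and is not Helix's actual code.

```rust
use std::marker::PhantomData;

// Hypothetical stand-in for Ruby's VALUE so the sketch runs on its own.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Int(i64),
}

// A value whose dynamic type has been verified to be convertible to T.
#[allow(dead_code)]
struct CheckedValue<T> {
    inner: Value,
    target_type: PhantomData<T>,
}

// Step 1: check the dynamic type. This can fail.
trait UncheckedValue<T> {
    fn to_checked(self) -> Result<CheckedValue<T>, String>;
}

// Step 2: convert an already-checked value. This cannot fail.
trait ToRust<T> {
    fn to_rust(self) -> T;
}

impl UncheckedValue<String> for Value {
    fn to_checked(self) -> Result<CheckedValue<String>, String> {
        if matches!(self, Value::Str(_)) {
            Ok(CheckedValue { inner: self, target_type: PhantomData })
        } else {
            Err(format!("Expected a UTF-8 String, got {:?}", self))
        }
    }
}

impl ToRust<String> for CheckedValue<String> {
    fn to_rust(self) -> String {
        match self.inner {
            Value::Str(s) => s,
            Value::Int(_) => unreachable!("to_checked only accepts Value::Str"),
        }
    }
}

// Roughly what a generated wrapper does; Err stands in for rb_raise.
fn greeter_hello(unchecked_name: Value) -> Result<Value, String> {
    let name: String = UncheckedValue::<String>::to_checked(unchecked_name)?.to_rust();
    Ok(Value::Str(format!("Hello, {}.", name)))
}

fn main() {
    println!("{:?}", greeter_hello(Value::Str("Godfrey".to_string())));
    println!("{:?}", greeter_hello(Value::Int(123456)));
}
```

Splitting the work this way means all fallible checks happen up front, and the conversion step can rely on the type parameter of `CheckedValue<T>` as proof that the check already passed.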
  107. Challenge #3 Distribution

  108. Distribution
      - Requires Rust compiler on servers
      - Potentially not a big deal
      - Not always possible, leads to lower adoption
      - Longer bundle install times
  109. Distribution
      - libv8 gem faced similar problems
      - Takes a (really) long time to compile v8
      - Uses Google's own build system (gyp)
      - Requires Python 2.7
  110. Distribution
      Precompiles binaries for libv8 gem

  111. Distribution
      - Another example: skylight gem
      - Written in Rust
      - Supports Linux + Mac OS X
      - Custom VM image to build & publish binary
      - gem install skylight Just Works™
  112. But how?

  113. But how?
      - Cross-compilation?
      - Rust: good support out-of-the-box
      - Ruby: not so much
  114. But how?
      - Use CI to build natively on each platform
      - 64-bit Linux (Travis CI / Circle CI)
      - 64-bit Mac OS X (Travis CI / Circle CI)
      - 32/64-bit Windows (Appveyor)
      - Script to publish directly to RubyGems from CI
  115. But how?
 Ϳ͜Κ͹ͼҘ Gem::Specification.new do |s| s.name = 'text_transform' s.version

    = '0.1.0' s.authors = ['Godfrey Chan', 'Terence Lee'] s.summary = "Transform text using the power of helix/rust" s.platform = Gem::Platform.local s.require_path = 'lib'
 s.add_dependency 'helix_runtime', '~> 0.6.4' s.add_development_dependency 'rspec', '~> 3.6' end
  117. But how?

      system "gem build text_transform.gemspec"

      if GIT_TAG && GIT_TAG.match(/^v[0-9.]+/)
        credentials_file = "~/.gem/credentials"
        FileUtils.mkdir_p gem_config_dir
        File.open(credentials_file, 'w') do |f|
          f.puts YAML.dump({ rubygems_api_key: ENV['RUBYGEMS_AUTH_TOKEN'] })
        end
        File.chmod 0600, credentials_file
        system "gem push text_transform-*.gem"
      end
  119. But how?
      - What about 32-bit Linux?
      - Not supported by the public Travis/Circle CI offerings
      - One option is to use a 32-bit Linux Docker image
      - Also not officially supported by Docker
  120. Roadmap

  121. Roadmap
      - Good use cases today:
      - CPU-bound
      - Simple inputs
      - Avoid chatty APIs
  122. Roadmap
      - Good use cases today:
      - Use Rust libraries
      - Leverage Rust web browser tech
      - Mailer, Background job, Action Cable
  123. Roadmap
      - Greenfield project
      - Drop-in replacement
      - Reopen class
      - Ship to production
      - Binary distribution
      - Non-traditional use-cases
      - Performance parity with C
      - Miscellaneous features and QoL improvements
  124. None
  125. http://usehelix.com

  126. How to help

  127. How to help
      - Please try it!
      - Hack on Helix together during RubyKaigi?
      - Need help with…
      - Debugging Windows, Linux 32-bit, linker questions, etc.
      - Cross-compilation
      - Documentation
      - Funding?
  128. Thank you