Extending CRuby with native Graph data type

AntiTyping
November 15, 2013

Reading the CRuby (MRI) source code provides unparalleled insight into the Ruby language. During the talk we will add a new native Graph data type to CRuby. The new Graph data structure will be simple but on par with other native types such as Array or Hash. This talk will demonstrate that it is easy to experiment with CRuby and extend it in C. We will experience the speed advantage of using C to boost Ruby performance. We will implement a few of the greatest hits of graph algorithms: Breadth-First Search, Dijkstra, and Minimum Spanning Tree.

Transcript

  1. Extending CRuby with native Graph data type

    Andy Pliszka
    @AntiTyping
    AntiTyping.com
    github.com/dracco
  2. Agenda

    • Problems
    • Build Ruby
    • Setup Debugger
    • CRuby Source Code
    • Simple CRuby Extensions
    • Graphs in CRuby
  3. Arrays of Objects

    • More overhead
    • More indirection
    • Memory fragmentation
    • Inefficient use of CPU caches
  4. Ruby + C

    • Ruby productivity and ecosystem
    • C efficiency, speed, and algorithms
  5. High Level Abstractions in Ruby

    • High level modeling
    • Algorithm coordination
    • Analysis
    • Scripting
  6. Low Level Operations in C

    • Algorithm implementation
    • Manipulation of large in-memory data structures
    • Integration with libraries LAPACK, CUDA
  7. Get the source

    $ mkdir ~/rubyconf-ruby
    $ chmod go-w ~/rubyconf-ruby
    $ cd ~/rubyconf-ruby
    $ git clone git@github.com:ruby/ruby.git
    $ cd ruby
    $ git checkout v2_0_0_247   # fix for make check
  8. Configure (Mac)

    $ brew install openssl
    $ autoconf
    $ ./configure --prefix=$HOME/myruby --with-opt-dir=/usr/local/Cellar/openssl/1.0.1e optflags="-O0" debugflags="-g" --disable-install-doc
  9. Configure (Linux)

    $ sudo apt install libssl-dev
    $ autoconf
    $ ./configure --prefix=$HOME/myruby optflags="-O0" debugflags="-g3 -ggdb" --disable-install-doc
  10. Verify

    $ which irb
    /Users/apliszka/myruby/bin/irb
    $ irb
    irb(main):001:0> raise "hi"
    RuntimeError: hi
        from (irb):1
        from /Users/apliszka/myruby/bin/irb:12:in `<main>'
    $ which ruby
    /Users/apliszka/myruby/bin/ruby
  11. gem env

    $ gem env

    RubyGems Environment:
      - RUBYGEMS VERSION: 2.0.3
      - RUBY VERSION: 2.0.0 (2013-06-27 patchlevel 247) [x86_64-darwin12.5.0]
      - INSTALLATION DIRECTORY: /Users/apliszka/myruby/lib/ruby/gems/2.0.0
      - RUBY EXECUTABLE: /Users/apliszka/myruby/bin/ruby
      - EXECUTABLE DIRECTORY: /Users/apliszka/myruby/bin
      - RUBYGEMS PLATFORMS:
        - ruby
        - x86_64-darwin-12
      - GEM PATHS:
        - /Users/apliszka/myruby/lib/ruby/gems/2.0.0
      - GEM CONFIGURATION:
        - :update_sources => true
        - :verbose => true
        - :backtrace => false
        - :bulk_threshold => 1000
      - REMOTE SOURCES:
        - https://rubygems.org/
  12. Gems

    $ gem list

    *** LOCAL GEMS ***

    bigdecimal (1.2.0)
    io-console (0.4.2)
    json (1.7.7)
    minitest (4.3.2)
    psych (2.0.0)
    rake (0.9.6)
    rdoc (4.0.0)
    test-unit (2.0.0.0)

    $ ls ~/myruby/lib/ruby/gems/2.0.0/gems/
    rake-0.9.6
    rdoc-4.0.0
    test-unit-2.0.0.0
  13. Install Rails

    $ gem install rails --no-doc
    $ ls ~/myruby/lib/ruby/gems/2.0.0/gems/
    actionmailer-4.0.0
    actionpack-4.0.0
    activemodel-4.0.0
    activerecord-4.0.0
    activerecord-deprecated_finders-1.0.3
    activesupport-4.0.0
    arel-4.0.1
    atomic-1.1.14
    builder-3.1.4
    ...
  14. Rails app

    $ rails new HelloRuby
          create
          create  README.rdoc
          create  Rakefile
          ...

    $ cd HelloRuby

    $ rails s

    => Booting WEBrick
    => Rails 4.0.0 application starting in development on http://0.0.0.0:3000
    => Run `rails server -h` for more startup options
    => Ctrl-C to shutdown server
    [2013-10-29 15:54:21] INFO  WEBrick 1.3.1
    ...
  15. Level 1 Complete

    • Built our own version of Ruby
    • Installed it
    • Rails app is using our Ruby (~/myruby)
    • ~30min
  16. Two Worlds

    • C
      • Working directly with memory
      • malloc/free
    • Ruby
      • Working with heap and objects
      • GC
  17. Data types

    T_NIL      :: nil
    T_OBJECT   :: ordinary object
    T_CLASS    :: class
    T_MODULE   :: module
    T_FLOAT    :: floating point number
    T_STRING   :: string
    T_REGEXP   :: regular expression
    T_ARRAY    :: array
    T_HASH     :: associative array
    T_STRUCT   :: (Ruby) structure
    T_BIGNUM   :: multi precision integer
    T_FIXNUM   :: Fixnum (31bit or 63bit integer)
    T_COMPLEX  :: complex number
    T_RATIONAL :: rational number
    T_FILE     :: IO
    T_TRUE     :: true
    T_FALSE    :: false
    T_DATA     :: data
    T_SYMBOL   :: symbol
  18. Type conversion

    Ruby Fixnum -> C long

    long c_num = NUM2LONG(ruby_num);

    C long -> Ruby Fixnum

    VALUE ruby_num = LONG2NUM(c_num);
  19. Level 3 Complete

    • CRuby folder structure
    • Class definition
    • Method definition
    • Convert data C <-> Ruby
  20. Fixnum#fib

    C:

    long fibonacci(long n)
    {
        long u = 0;
        long v = 1;
        long i, t;
        for (i = 2; i <= n; i++) {
            t = u + v;
            u = v;
            v = t;
        }
        return v;
    }

    Ruby:

    class Fixnum
      def fib
        u = 0
        v = 1
        t = 1
        2.upto(self) do
          t = u + v
          u = v
          v = t
        end
        t
      end
    end
  21. Fixnum#fib

    C:

    long fibonacci(long n)
    {
        long u = 0;
        long v = 1;
        long i, t;
        for (i = 2; i <= n; i++) {
            t = u + v;
            u = v;
            v = t;
        }
        return v;
    }

    CRuby:

    /* Same loop, with the argument and result converted at the Ruby boundary. */
    static VALUE fix_cfib(VALUE num)
    {
        long u = 0;
        long v = 1;
        long i, t;
        for (i = 2; i <= NUM2LONG(num); i++) {
            t = u + v;
            u = v;
            v = t;
        }
        return LONG2NUM(v);
    }

    rb_define_method(rb_cFixnum, "cfib", fix_cfib, 0);
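The C loop keeps its running values in a `long`, so it helps to know how far `cfib` can go before that type overflows. The following is a pure-Ruby sketch for illustration, not code from the deck: `fib_iterative` and `largest_fib_index_within` are hypothetical helper names, and the two limits assume a 64-bit build (63-bit signed Fixnum, 64-bit C `long`).

```ruby
# Hypothetical helpers for illustration; not part of the deck.

# Iterative Fibonacci using the same u/v update as the slide.
# Assumption: fib(0) = 0 and fib(1) = 1.
def fib_iterative(n)
  return n if n < 2
  u = 0
  v = 1
  2.upto(n) { u, v = v, u + v }
  v
end

# Largest n such that fib(n) does not exceed `limit`.
def largest_fib_index_within(limit)
  n = 0
  n += 1 while fib_iterative(n + 1) <= limit
  n
end

fixnum_max = 2**62 - 1 # assumption: 63-bit signed Fixnum (64-bit build)
long_max   = 2**63 - 1 # assumption: 64-bit C long

puts fib_iterative(80)                     # the input benchmarked in the deck
puts largest_fib_index_within(fixnum_max)  # last index that stays a Fixnum
puts largest_fib_index_within(long_max)    # last index that fits a C long
```

Under these assumptions `fib(80)` fits comfortably in both types. Past the last index that fits a `long`, the pure-Ruby method keeps working because Ruby promotes to arbitrary-precision integers, while the C loop would overflow.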
  22. #fib performance

    it "1M benchmark" do
      puts "fib(80)"
      puts Benchmark.measure() { 1000000.times { 80.fib } }
    end

    it "C 1M benchmark" do
      puts "cfib(80)"
      puts Benchmark.measure() { 1000000.times { 80.cfib } }
    end

    fib(80)
     26.440000   0.050000  26.490000 ( 26.704735)
    cfib(80)
      0.870000   0.000000   0.870000 (  0.879011)

    CRuby is ~30x faster than Ruby (MacBook Pro)
  23. Fixnum#prime?

    CRuby:

    /* Trial division by every candidate below the number. */
    static VALUE fix_cprime(VALUE num)
    {
        long number = NUM2LONG(num);
        long i;
        for (i = 2; i < number; i++) {
            if (number % i == 0 && i != number)
                return Qfalse;
        }
        return Qtrue;
    }

    rb_define_method(rb_cFixnum, "cprime?", fix_cprime, 0);

    Ruby:

    class Fixnum
      def prime?
        2.upto(self) do |i|
          if self % i == 0 && i != self
            return false
          end
        end
        true
      end
    end
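Both versions above try every candidate below the number itself, which is what makes them a convenient benchmark. As a sketch of how much work an algorithmic change can remove before any C is written, the check can stop at the integer square root. `prime_by_sqrt?` is a hypothetical name, `Integer.sqrt` requires Ruby 2.5 or later, and treating integers below 2 as not prime is an assumption (the slide versions return true for them).

```ruby
# Hypothetical helper for illustration; not part of the deck.
# Assumption: integers below 2 are treated as not prime.
def prime_by_sqrt?(n)
  return false if n < 2
  # Any composite n has a divisor no larger than its integer square root.
  2.upto(Integer.sqrt(n)) do |i|
    return false if (n % i).zero?
  end
  true
end

puts prime_by_sqrt?(97) # true
puts prime_by_sqrt?(91) # false, since 91 = 7 * 13
```

For the benchmarked input 94418953 this loop visits fewer than ten thousand candidates instead of roughly 94 million.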
  24. #cprime? performance

    it "prime? benchmark" do
      puts "prime?"
      puts Benchmark.measure { 94418953.prime? } # Markov
    end

    it "cprime? benchmark" do
      puts "cprime?"
      puts Benchmark.measure { 94418953.cprime? } # Markov
    end

    prime?
     30.510000   0.060000  30.570000 ( 31.414153)
    cprime?
      1.590000   0.000000   1.590000 (  1.774056)

    CRuby is ~17x faster than Ruby (MacBook Pro)
  25. CLongArray class

    void Init_CLongArray(void)
    {
        cCLongArray = rb_define_class("CLongArray", rb_cObject);

        rb_define_alloc_func(cCLongArray, array_alloc);

        rb_define_method(cCLongArray, "initialize", array_initialize, 1);
        rb_define_method(cCLongArray, "qsort", array_quick_sort, 0);
        rb_define_method(cCLongArray, "[]", array_aref, 1);
        rb_define_method(cCLongArray, "[]=", array_aset, 2);
    }
  26. CLongArray alloc

    typedef struct {
        long *array;
        long size;
    } array_t;

    /* Wraps a freshly allocated array_t in a Ruby object of the given class. */
    static VALUE array_alloc(VALUE klass)
    {
        VALUE self;
        array_t *array;
        self = Data_Make_Struct(klass, array_t, array_mark, array_free, array);
        return self;
    }
  27. CLongArray init

    typedef struct {
        long *array;
        long size;
    } array_t;

    /* Method arguments arrive as Ruby VALUEs, so size is converted with NUM2LONG. */
    static VALUE array_initialize(VALUE self, VALUE size)
    {
        array_t *array;
        long n = NUM2LONG(size);
        Data_Get_Struct(self, array_t, array);
        array->array = malloc(n * sizeof(long));
        array->size = n;
        return self;
    }

    CLongArray.new(10)
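The `malloc` call is what gives `CLongArray` one contiguous block of machine integers rather than an array of object references. As a rough Ruby-side illustration of that layout (it is not how the extension works), integers can be packed into a single binary string with `Array#pack`. The helper names `pack_longs` and `long_at` are hypothetical, and the `q` directive is assumed to match a 64-bit C `long`.

```ruby
# Hypothetical helpers for illustration; not part of the deck.

# Packs integers into one binary string, 8 bytes per element
# ("q" is a signed 64-bit integer in native byte order).
def pack_longs(values)
  values.pack("q*")
end

# Reads the element at `index` by slicing its 8 bytes out of the string.
def long_at(packed, index)
  packed.byteslice(index * 8, 8).unpack1("q")
end

packed = pack_longs([5, 3, 99, -7])
puts packed.bytesize      # 4 elements * 8 bytes each
puts long_at(packed, 2)
puts packed.unpack("q*").inspect
```

The element lookup is plain offset arithmetic, which is the same idea as `array->array[idx]` in the C code.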
  28. CLongArray []

    static VALUE array_aref(VALUE self, VALUE index)
    {
        array_t *array;
        Data_Get_Struct(self, array_t, array);
        long idx = NUM2LONG(index);
        return LONG2NUM(array->array[idx]);
    }

    rb_define_method(cCLongArray, "[]", array_aref, 1);

    c_long_array = CLongArray.new(10)
    puts c_long_array[1]
  29. CLongArray []=

    static VALUE array_aset(VALUE self, VALUE index, VALUE val)
    {
        array_t *array;
        Data_Get_Struct(self, array_t, array);
        long idx = NUM2LONG(index);
        array->array[idx] = NUM2LONG(val);
        return val;
    }

    rb_define_method(cCLongArray, "[]=", array_aset, 2);

    c_long_array = CLongArray.new(10)
    c_long_array[1] = 99
  30. CLongArray qsort

    static VALUE array_quick_sort(VALUE self)
    {
        array_t *array;
        Data_Get_Struct(self, array_t, array);
        quick_sort(array->array, array->size);
        return self;
    }

    rb_define_method(cCLongArray, "qsort", array_quick_sort, 0);

    c_long_array.qsort
  31. Plain C QuickSort

    void quick_sort(long *a, long n)
    {
        if (n < 2)
            return;
        long p = a[n / 2];          /* pivot: middle element */
        long *l = a;
        long *r = a + n - 1;
        while (l <= r) {
            if (*l < p) {
                l++;
                continue;
            }
            if (*r > p) {
                r--;
                continue;
            }
            long t = *l;            /* swap and move both pointers inward */
            *l++ = *r;
            *r-- = t;
        }
        quick_sort(a, r - a + 1);   /* left partition */
        quick_sort(l, a + n - l);   /* right partition */
    }
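For readers who want to step through the partitioning without compiling anything, here is a sketch of the same routine ported to Ruby, with index bounds standing in for the C pointers. The method name `quick_sort!` and its `lo`/`hi` parameters are assumptions made for this port; the deck itself compares the C version against Ruby's built-in `Array#sort!`.

```ruby
# Hypothetical Ruby port of the slide's quick_sort; lo/hi replace the C pointers.
def quick_sort!(a, lo = 0, hi = a.length - 1)
  return a if hi - lo < 1            # fewer than two elements: nothing to do
  pivot = a[lo + (hi - lo + 1) / 2]  # same pivot choice as a[n / 2] in C
  l = lo
  r = hi
  while l <= r
    if a[l] < pivot
      l += 1
    elsif a[r] > pivot
      r -= 1
    else
      a[l], a[r] = a[r], a[l]        # swap, then move both indices inward
      l += 1
      r -= 1
    end
  end
  quick_sort!(a, lo, r)              # left partition
  quick_sort!(a, l, hi)              # right partition
  a
end

p quick_sort!([3, 1, 2]) # [1, 2, 3]
```

The array is sorted in place and returned, mirroring how `array_quick_sort` returns `self`.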
  32. QuickSort Performance

    it "1_000_000 numbers benchmark" do
      puts "CLongArray#qsort"
      puts Benchmark.measure() { @c_long_array.qsort }
    end

    it "C 1_000_000 numbers benchmark" do
      puts "Array#sort!"
      puts Benchmark.measure() { @ruby_array.sort! }
    end

    CLongArray#qsort
      0.270000   0.000000   0.270000 (  0.173161)
    Array#sort!
      1.910000   0.010000   1.920000 (  1.786345)

    CRuby is ~10x faster than Ruby (MacBook Air)
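  33. Graph Representation

    Adjacency-List Representation

    /* One entry in a node's linked list of neighbors. */
    typedef struct _node_t {
        long node;
        struct _node_t *next;
    } node_t;

    /* A side x side grid; graph[i] heads the neighbor list of node i. */
    typedef struct _graph_t {
        long side;
        node_t **graph;
    } graph_t;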
  34. CGraph class

    void Init_cgraph(void)
    {
        cGraph = rb_define_class("CGraph", rb_cObject);

        rb_define_alloc_func(cGraph, cgraph_alloc);

        rb_define_method(cGraph, "initialize", cgraph_initialize, 1);
        rb_define_method(cGraph, "bfs", cgraph_bfs, 1);
        rb_define_method(cGraph, "dfs", cgraph_dfs, 1);
        rb_define_method(cGraph, "prim", cgraph_prim, 1);
    }
  35. CGraph alloc

    static VALUE cgraph_alloc(VALUE klass)
    {
        VALUE self;
        graph_t *graph;
        self = Data_Make_Struct(klass, graph_t, cgraph_mark, cgraph_free, graph);

        return self;
    }
  36. CGraph init

    static VALUE cgraph_initialize(VALUE self, VALUE side)
    {
        graph_t *graph;
        Data_Get_Struct(self, graph_t, graph);
        graph->side = NUM2LONG(side);
        graph->graph = make_grid_graph(graph->side);

        return self;
    }
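The transcript does not include the body of `make_grid_graph`, nor the Ruby grid builder used in the timing comparison. The sketch below is one plausible Ruby construction, under stated assumptions: nodes are numbered row by row (`row * side + col`), each node is linked to its up, left, right, and down neighbors, and plain Ruby arrays stand in for the C linked lists.

```ruby
# Hypothetical Ruby counterpart of make_grid_graph; layout choices are assumptions.
# Returns an adjacency list: graph[node] is the array of neighboring node ids.
def make_grid_graph(side)
  Array.new(side * side) do |node|
    row, col = node.divmod(side)
    neighbors = []
    neighbors << node - side if row > 0          # up
    neighbors << node - 1    if col > 0          # left
    neighbors << node + 1    if col < side - 1   # right
    neighbors << node + side if row < side - 1   # down
    neighbors
  end
end

grid = make_grid_graph(3)
p grid[4] # center of a 3x3 grid: [1, 3, 5, 7]
```

A `side` of 2048 gives the 4,194,304 nodes used in the deck's grid creation benchmark.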
  37. CGraph BFS

    static VALUE cgraph_bfs(VALUE self, VALUE source)
    {
        graph_t *graph;
        Data_Get_Struct(self, graph_t, graph);
        long side = graph->side;
        long *path;

        path = bfs(side, graph->graph, NUM2LONG(source));

        VALUE trace = rb_ary_new();

        rb_gc_register_address(&trace);
        long i;
        for (i = 0; i < side * side; i++) {
            rb_ary_push(trace, LONG2NUM(path[i]));
        }
        rb_gc_unregister_address(&trace);

        free(path);   /* path comes from calloc in bfs, so it is released with free */
        return trace;
    }
  38. Plain C BFS

    long *bfs(long side, node_t **graph, long source)
    {
        long *path = calloc(side * side, sizeof(long));
        long path_tail = 0;
        long *q = calloc(side * side, sizeof(long));
        queue_t queue = {0, 0, q};
        long *visited = calloc(side * side, sizeof(long));

        enqueue(&queue, source);
        visited[source] = 1;
        while (any(&queue)) {
            long node = dequeue(&queue);
            path[path_tail++] = node;
            node_t *p = graph[node];
            while (p) {
                long i = p->node;
                if (visited[i] == 0) {
                    enqueue(&queue, i);
                    visited[i] = 1;
                }
                p = p->next;
            }
        }
        free(q);         /* scratch buffers are no longer needed */
        free(visited);
        return path;     /* caller owns path */
    }
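The Ruby BFS used for the timing comparison is not part of the transcript. A minimal pure-Ruby sketch of the same traversal might look like the following, assuming the graph is an array of neighbor arrays rather than the C linked lists; the method name `bfs` and the hand-written 2x2 grid are illustrative only.

```ruby
# Hypothetical Ruby counterpart of the C bfs; graph[node] is an array of neighbors.
# Returns the nodes in the order they are dequeued.
def bfs(graph, source)
  path = []
  visited = Array.new(graph.length, false)
  queue = [source]
  visited[source] = true
  until queue.empty?
    node = queue.shift
    path << node
    graph[node].each do |neighbor|
      next if visited[neighbor]
      visited[neighbor] = true   # mark on enqueue, as the C version does
      queue << neighbor
    end
  end
  path
end

# 2x2 grid, nodes numbered row by row:
#   0 - 1
#   |   |
#   2 - 3
grid = [[1, 2], [0, 3], [0, 3], [1, 2]]
p bfs(grid, 0) # [0, 1, 2, 3]
```

Marking nodes as visited when they are enqueued, not when they are dequeued, keeps each node in the queue at most once, which is why the C version can size its queue at `side * side`.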
  39. Grid Graph Creation

    C -> 535.926 milliseconds
    Ruby -> 27,901.748 milliseconds
    CRuby is ~52x faster than Ruby (MacBook Air)
    2048*2048 grid = 4,194,304 nodes
  40. Grid Graph BFS

    C -> 228.536 milliseconds
    Ruby -> 4,426.137 milliseconds
    CRuby is ~19x faster than Ruby (MacBook Pro)
    1024*1024 grid = 1,048,576 nodes
  41. Level 5 Complete

    • CGraph
    • BFS in C
    • CRuby ~19x faster than Ruby
  42. Conclusion

    • Use C to speed up your Ruby algorithms
    • Rewrites in C are usually very easy
    • Original C algorithm code can be used without modifications
    • Trivial CRuby rewrites are between 10x and 50x faster
    • Better performance possible with specialized algorithms
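The 10x to 50x range can be checked against the timings reported on the benchmark slides. The sketch below only redoes that division: the numbers are copied from the deck (wall-clock seconds for the first three rows, milliseconds for the grid rows), and `speedup` is a hypothetical helper name.

```ruby
# Ratio of the slower Ruby timing to the faster C timing.
def speedup(ruby_time, c_time)
  ruby_time / c_time
end

# Timings copied from the slides: [Ruby, C]. Units cancel out in the ratio.
timings = {
  "fib(80) x 1M"          => [26.704735, 0.879011],
  "prime?(94418953)"      => [31.414153, 1.774056],
  "sort 1_000_000"        => [1.786345, 0.173161],
  "grid creation 2048^2"  => [27_901.748, 535.926],
  "grid BFS 1024^2"       => [4_426.137, 228.536]
}

timings.each do |name, (ruby_time, c_time)|
  puts format("%-22s %5.1fx", name, speedup(ruby_time, c_time))
end
```

Rounded down, the ratios match the per-slide claims of roughly 30x, 17x, 10x, 52x, and 19x.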