
XND and rubyplot - typed arrays and visualization in Ruby

Talk given at Red Data Tools Meetup, Tokyo on XND and rubyplot.

Sameer Deshmukh

November 17, 2018

Transcript

  1. XND and rubyplot - future of typed arrays and visualization in Ruby

    Sameer Deshmukh, 2018-11-17 (Sat)
  2. Introduction

    Name: Sameer Deshmukh. From: Pune, India. Currently living in Tokyo, Japan. Master’s degree student at Tokyo Institute of Technology. Intern at Quansight Inc. Email: [email protected] Twitter: @v0dro GitHub: @v0dro
  3. Open source work

    Aiming to make Ruby a viable language for scientific computing. Contributor to the Ruby Science Foundation for 3 years. Author of libraries like daru and rubex. Contributor to rubyplot, nmatrix, narray, iruby, statsample etc.
  4. Key Ideas

    Typed arrays in Ruby: Introduction. Problems with present approach. Plures - the future.
    Advanced visualization in Ruby: Present solutions. Rubyplot - the most advanced Ruby plotting tool.
  5. Simple Array vs. Typed Array

    A normal Ruby Array stores data as Ruby objects. For example:

        a = [1,2,3,4]

    Each number is internally a Ruby VALUE object. However, a typed array stores values as a sequence of bytes.
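
The contrast above can be sketched in plain Ruby, with `Array#pack` standing in for a typed-array library. The choice of little-endian 32-bit integers (`"l<"`) is an assumption made for illustration; the slide does not name a specific type.

```ruby
# A plain Ruby Array holds references to Ruby objects (VALUEs).
values = [1, 2, 3, 4]

# Packing the same numbers as little-endian 32-bit integers yields one
# contiguous byte string, similar to how a typed array lays out its data.
buffer = values.pack("l<*")

puts buffer.bytesize               # 4 elements x 4 bytes each
puts buffer.unpack("l<*").inspect  # decode the bytes back into Integers
```

The byte string carries no per-element object overhead, but the reader must know the element type in order to decode it.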
  6. Advantages of typed arrays

    Ability to choose the data type. Choices range from a single bit to 128-bit complex numbers. Contiguous storage and constant buffer size. Optimized math operations. Possibility of SIMD optimizations. Use of accelerators (like GPUs). Interoperability with third party services.
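
As a rough illustration of the first few points, the sketch below uses `Array#pack` directives as stand-ins for data types. The `DTYPES` table and the helper names are hypothetical and do not come from NMatrix, NArray or XND. It shows that the chosen type fixes the buffer size, and that contiguous storage lets element `i` be located at byte offset `i * itemsize`.

```ruby
# Pack directives stand in for dtypes here (an illustrative mapping,
# not the naming used by NMatrix, NArray or XND).
DTYPES = {
  int8:    "c",
  int32:   "l<",
  int64:   "q<",
  float32: "e",
  float64: "E"
}.freeze

# Number of bytes one element of the given dtype occupies.
def itemsize(dtype)
  [0].pack(DTYPES.fetch(dtype)).bytesize
end

# Store all values contiguously using the given dtype.
def buffer_for(values, dtype)
  values.pack("#{DTYPES.fetch(dtype)}*")
end

# Read one element directly from its byte offset.
def element_at(buffer, dtype, index)
  size = itemsize(dtype)
  buffer.byteslice(index * size, size).unpack1(DTYPES.fetch(dtype))
end

data  = [10, 20, 30, 40]
small = buffer_for(data, :int8)
wide  = buffer_for(data, :int64)

puts small.bytesize               # 1 byte per element
puts wide.bytesize                # 8 bytes per element
puts element_at(wide, :int64, 2)  # third element, read from byte offset 16
```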
  7. Current scenario in Ruby

    Community split between numo/narray and sciruby/nmatrix. Two typed array libraries with very similar functionality.

    NMatrix matmul:

        require 'nmatrix'
        require 'nmatrix/lapacke'

        a = NMatrix.rand([100, 100], dtype: :float64)
        b = NMatrix.rand([100, 100], dtype: :float64)
        c = a.dot b

    NArray matmul:

        require 'numo/narray'
        require 'numo/linalg'

        a = Numo::DFloat.new(100,100).rand(0,1)
        b = Numo::DFloat.new(100,100).rand(0,1)
        c = Numo::Linalg.dot a, b
  8. Problems with current scenario

    Community effort split between NMatrix and NArray. Incompatible with each other. Each has features that the other needs. Not extensible. Cannot cater to modern needs of data processing and AI/ML. No dedicated team of paid maintainers.
  9. The Plures Project

    High performance, language independent libraries for typed arrays. Designed with current needs for flexibility and interoperability in mind. Consists of three major libraries: NDTYPES, XND and GUMATH. Sponsored by Quansight Inc., a company dedicated to promoting sustainable FOSS.
  10. Limitations of type specification in NMatrix/NArray

    NMatrix / NArray approach: types are a Symbol or a class, e.g. NMatrix.new([2,2], dtype: :int64) or Numo::DFloat.new(3,3). Limitations: Assumes that the entire array is of the same type. Can only store contiguous arrays. Cannot create custom aggregate types or typedefs. No support for string encodings, hashes, tuples or ragged arrays.
  11. NDTypes: specify types as Strings

    Array of 10 64-bit integers:

        a = NDT.new "10 * int64"

    Hash of strings and floats:

        a = NDT.new "{a: 10 * string, b: 20 * float32}"

    Define graph type using typedefs:

        NDT.typedef "node", "int32"
        NDT.typedef "cost", "int32"
        NDT.typedef "graph", "var * var * (node, cost)"

        t = NDT.new "var(offsets=[0,2]) * var(offsets=[0,3,10]) * (node, cost)"
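
To give a feel for what the `offsets` in the last type might describe, here is a plain-Ruby sketch of a ragged array kept as flat storage plus an offsets list. It assumes the common convention that row `i` spans `data[offsets[i]...offsets[i + 1]]`; the slides do not spell this out, and `ragged_rows` is a hypothetical helper, not part of ndtypes.

```ruby
# Flat storage for every element of every row.
data = %w[a b c d e f g h i j]

# Assumed convention: row i spans data[offsets[i]...offsets[i + 1]].
offsets = [0, 3, 10]

# Rebuild the ragged rows from the flat data and the offsets.
def ragged_rows(data, offsets)
  offsets.each_cons(2).map { |start, stop| data[start...stop] }
end

rows = ragged_rows(data, offsets)
puts rows.map(&:length).inspect  # row lengths differ, yet storage stays flat
```

Under that assumption, `offsets = [0, 3, 10]` describes two rows of lengths 3 and 7 that share one contiguous buffer.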
  12. XND: data container for typed arrays

    XND provides containers for storing data that is described by NDTypes. It provides a fast and efficient way of data access and sharing. XND allows sharing data between various libraries and services. It aims to be interoperable with NMatrix and NArray using @mrkn’s numbuffer.
  13. XND examples

    Create (4,4) int32 array:

        x = XND.new [[1,2,3,4]] * 4, type: "4 * 4 * int32"

    Get the first column of the matrix (view-only):

        x[0..-1, 0].to_a #=> [1,1,1,1]

    Store a Ruby Hash of strings:

        x = XND.new({'a' => [1,2,3,4], 'b' => ["foo", "bar", "baz"]}, type: "{a: 4 * int32, b: 3 * string}")
  14. Gumath: multi-dispatch math kernels on any device

    The gumath framework allows you to write multiple dispatch math kernels. Target any device (CPUs, GPUs, etc.). Write fast, efficient algorithms for working on any type of data.
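
The idea of multiple dispatch can be sketched in plain Ruby: a kernel is chosen from the types of all arguments rather than from a single receiver. The `KernelRegistry` class below is hypothetical and is not gumath's API; it also infers an array's type from its first element, which is a simplification.

```ruby
# Hypothetical registry mapping [function name, argument dtypes] to a kernel.
class KernelRegistry
  DTYPE_OF = { Integer => :int64, Float => :float64, String => :string }.freeze

  def initialize
    @kernels = {}
  end

  # Register a kernel for one specific combination of argument dtypes.
  def register(name, *signature, &kernel)
    @kernels[[name, signature]] = kernel
  end

  # Pick the kernel whose signature matches the dtypes of all arguments.
  def call(name, *arrays)
    signature = arrays.map { |array| DTYPE_OF.fetch(array.first.class) }
    kernel = @kernels.fetch([name, signature]) do
      raise ArgumentError, "no #{name} kernel for #{signature.inspect}"
    end
    kernel.call(*arrays)
  end
end

registry = KernelRegistry.new
registry.register(:add, :int64, :int64)   { |a, b| a.zip(b).map { |x, y| x + y } }
registry.register(:add, :string, :string) { |a, b| a.zip(b).map { |x, y| "#{x}#{y}" } }

p registry.call(:add, [1, 2], [10, 20])  # dispatches to the int64 kernel
p registry.call(:add, %w[a b], %w[c d])  # dispatches to the string kernel
```

Calling `:add` with a combination that has no registered kernel, such as two Float arrays here, raises an `ArgumentError` instead of silently falling back to another kernel.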
  15. Current stage of development

    XND, ndtypes and gumath are still in the testing phase. Ruby and Python wrappers are ready. Rubyplot will use XND for the Artist layer.
  16. Go try it out!

    Install XND:

        gem install xnd --pre

    Install ndtypes:

        gem install ndtypes --pre

    Install gumath:

        gem install gumath --pre
  17. Visualization in Ruby

    Most people use Ruby for quick-and-dirty prototypes. Lack of a robust visualization framework is a huge hindrance to Ruby’s adoption.
  18. Current solutions for visualization in Ruby

    Web plotters: nyaplot, plotrb.
    Bridges to libraries: GNUplotRB (GNUplot), numo-gnuplot (GNUplot), Gnuplot (GNUplot), JFreeChart, charty (GR).
    Bridges to other languages: matplotlib.rb (Python), galaaz (R), rsruby (R).
    Third party services: Google Charts, Timemetric, Chartkick.
  19. Need: native plotting solution

    Perform all transformations in Ruby itself. Have maximum control over visualization APIs. Provide various backends with the same API. gruff is one such library but has very basic functionality.
  20. Solution: rubyplot

    Native plotting solution built using ImageMagick. Maximum control over plot generation. Two level API - procedural and object-oriented. Support for multiple backends (GR planned). Aspires to be the most advanced Ruby plotting library.
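
One way to read "multiple backends with the same API" is that the plotting object delegates drawing to an interchangeable backend. The sketch below is purely illustrative: `TextBackend`, `SvgBackend` and `Plot` are hypothetical classes and do not reflect rubyplot's internals.

```ruby
# Hypothetical backends: each one turns the same drawing request into
# its own output. Neither class reflects rubyplot's real internals.
class TextBackend
  def draw_line(xs, ys)
    "line through #{xs.zip(ys).inspect}"
  end
end

class SvgBackend
  def draw_line(xs, ys)
    points = xs.zip(ys).map { |x, y| "#{x},#{y}" }.join(" ")
    %(<polyline points="#{points}" fill="none" stroke="black"/>)
  end
end

# The user-facing API stays the same whichever backend is plugged in.
class Plot
  def initialize(backend)
    @backend = backend
  end

  def line(xs, ys)
    @backend.draw_line(xs, ys)
  end
end

puts Plot.new(TextBackend.new).line([1, 2], [3, 4])
puts Plot.new(SvgBackend.new).line([1, 2], [3, 4])
```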
  21. Timeline of rubyplot

    Started as a GSoC 2018 project by students Arafat Khan and Pranav Garg. Was two separate libraries with ImageMagick and GR backends. Currently in the process of merging into a common API with multiple backends.
  22. Rubyplot: older workable API

    Simple Bar plot:

        require 'rubyplot'

        plot = Rubyplot::Bar.new(400)
        plot.title = 'My Graph'
        plot.data([1, 2, 3, 4, 4, 3], label: 'Apples oranges Watermelon')
        plot.data([4, 8, 7, 9, 8, 9], label: 'Oranges')
        plot.data([2, 3, 1, 5, 6, 8], label: 'Watermelon')
        plot.data([9, 9, 10, 8, 7, 9], label: 'Peaches')
        plot.labels = { 0 => '2003', 2 => '2004', 4 => '2005' }
  23. Rubyplot: new experimental API

    Listing 1: Procedural API.

        require 'rubyplot'

        a = Rubyplot::SPI.new
        a.title = 'My cool graph'
        a.line! [-10, 0, 5, 28], [1, 2, 3, 4]
        a.scatter! [2, 4, 16], [10, 20, -40]
        a.save 'plot.png'

    Listing 2: Object-oriented API.

        fig = Rubyplot::Figure.new
        axes = fig.add_subplot 0,0
        axes.scatter! do |p|
          p.data [2, 4, 16], [10, 20, -40]
          p.label = "data1"
          p.color = :plum_purple
        end
        axes.line! do |p|
          p.data [2, 4, 16], [10, 20, -40]
          p.label = "data2"
          p.color = :yellow
        end
        axes.x_title = "X data"
        axes.y_title = "Y data"
  24. Useful links

    The Plures Project: https://xnd.io/
    XND: https://github.com/plures/xnd/tree/ruby-wrapper/ruby
    ndtypes: https://github.com/plures/ndtypes/tree/ruby-wrapper/ruby
    gumath: https://github.com/plures/gumath/tree/ruby-wrapper/ruby
    rubyplot: https://github.com/sciruby/rubyplot