Progress of Ruby-Numo: Numerical Computing for Ruby

Masahiro Tanaka 田中昌宏

September 19, 2017

Transcript

  1. Progress of Ruby/Numo: Numerical Computing for Ruby
    Masahiro TANAKA (田中昌宏)
    RubyKaigi 2017 @ Hiroshima, Sep 19, 2017
  2. Masahiro Tanaka
    ▶ Research Fellow at Center for Computational Sciences, University of Tsukuba
    ▶ RubyKaigi Speaker
      ◦ RubyKaigi 2010 in Tsukuba
        • Topic: NArray
        • http://rubykaigi.org/2010/ja/events/83
      ◦ RubyKaigi 2016 in Kyoto
        • Topic: Pwrake
        • http://rubykaigi.org/2016/presentations/masa16tanaka.html
      ◦ RubyKaigi 2017 in Hiroshima
        • Topic: Numo::NArray
  3. Data Science
    ▶ Artificial Intelligence (AI)
    ▶ Machine Learning (ML)
    ▶ Deep Learning (DL)
    ▶ Big Data
  4. Popular Programming Languages for Data Science
    ▶ R
      ◦ Language for Statistical Computing
    ▶ Python
      ◦ Versatile Language + Library for Scientific Computing
      ◦ Supported by popular Deep Learning frameworks.
  5. https://www.scipy.org/

  6. https://docs.scipy.org/doc/scipy/reference/

  7. http://scikit-learn.org/stable/

  8. SciPy Stack (diagram)
    Components: Pandas, Sympy, matplotlib, Scikit-learn, Jupyter (IPython), NumPy, Cython, SciPy lib, Python
    Labels: Scientific Computing, Data Science, Machine Learning
  9. Two Approaches to Data Science with Ruby
    ▶ Call existing frameworks from Ruby
      ◦ PyCall
    ▶ Build libraries in Ruby
      ◦ Data Science libraries require Science libraries
      ◦ Science libraries require a NumPy-like library
      ◦ Is there any library like NumPy in Ruby?
  10. NArray: NumPy-like Library for Ruby

  11. History of NArray
    ▶ First design
      ◦ 1999 NArray ver. 0.3.0 (start)
      ◦ 2000 NArray ver. 0.5.0 (re-design)
      ◦ 2016 NArray ver. 0.6.1.2 (current)
    ▶ Second design
      ◦ 2007 NArray ver. 0.7 (start)
      ◦ 2011 NArray ver. 0.9 (re-design)
      ◦ 2016 Numo::NArray ver. 0.9.0.1 (gem release)
      ◦ 2017 Numo::NArray ver. 0.9.0.8 (current)
  12. Ruby Numo
    ▶ Numo::NArray
      ◦ N-dimensional Numerical Array
    ▶ Numo::Gnuplot
      ◦ wrapper to Gnuplot
    ▶ Numo::GSL
      ◦ wrapper to GNU Scientific Library
    ▶ Numo::Linalg
      ◦ Linear Algebra
      ◦ wrapper to BLAS/LAPACK
    ▶ Numo::FFTW, Numo::FFTE
      ◦ wrapper to FFT libraries
    https://github.com/ruby-numo
  13. NArray coverage
    ▶ 363 NumPy functions
    ▶ 217 covered
    ▶ 91 to-do
    ▶ 55 no plan
      ◦ NumPy-specific functions, financial functions etc.
    https://github.com/ruby-numo/narray/wiki/Numo-vs-numpy
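The coverage counts above can be turned into percentages with a few lines of plain Ruby. This is only an arithmetic sketch: the `COVERAGE` hash and the `coverage_percent` helper are hypothetical names introduced here, and the numbers are the ones quoted on the slide.

```ruby
# Coverage counts as quoted on the slide (NumPy functions vs. Numo::NArray).
# COVERAGE and coverage_percent are illustrative names, not part of Numo.
COVERAGE = { covered: 217, todo: 91, no_plan: 55 }.freeze

# Convert each count into a percentage of the total, rounded to one decimal.
def coverage_percent(counts)
  total = counts.values.sum
  counts.transform_values { |n| (100.0 * n / total).round(1) }
end

puts COVERAGE.values.sum          # the three groups add up to the 363 functions
p coverage_percent(COVERAGE)
```

Roughly 60% of the listed NumPy functions are covered, and about a quarter remain on the to-do list.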
  14. Numo::NArray Basics

  15. Creating Numo::NArray
      require "numo/narray"
      a = Numo::NArray[[1,2,3],[4,5,6]]
      => Numo::Int32#shape=[2,3]
      [[1, 2, 3],
       [4, 5, 6]]
  16. Data Type = Subclass of Numo::NArray
    ▶ Bit, Boolean
      ◦ Numo::Bit
    ▶ Signed Integer
      ◦ Numo::Int8, Numo::Int16, Numo::Int32, Numo::Int64
    ▶ Unsigned Integer
      ◦ Numo::UInt8, Numo::UInt16, Numo::UInt32, Numo::UInt64
    ▶ Floating point real number
      ◦ Numo::DFloat (Float64)
      ◦ Numo::SFloat (Float32)
    ▶ Floating point complex number
      ◦ Numo::DComplex (Complex128)
      ◦ Numo::SComplex (Complex64)
    ▶ Ruby Object
      ◦ Numo::RObject
  17. Shape = Array of sizes along dimensions
    ▶ shape = [4]: 1-dimensional array, elements a[0] to a[3]
    ▶ shape = [4,4]: 2-dimensional array, elements a[0,0] to a[3,3]
    ▶ shape = [4,4,4]: 3-dimensional array, elements a[0,0,0] to a[3,3,3]
  18. NArray Slice (Indexing)
    ▶ a[1]
      ◦ returns an element
    ▶ a[1..2]
      ◦ returns NArray
    ▶ a[(1..-1).step(2)]
      ◦ returns NArray
    ▶ a[[1,2,4]]
      ◦ returns NArray
  19. Notation and Speed of Element-wise Operation
    ▶ Ruby Array (1000 iterations):
      a = [1,2,3,4,5]
      b = [10,20,30,40,50]
      p c = a.zip(b).map{|x,y| x + y}
      #=> [11, 22, 33, 44, 55]

      require "benchmark"
      a = (1..10000).to_a
      b = (10..100000).step(10).to_a
      Benchmark.bm do |r|
        r.report do
          1000.times{ c = a.zip(b).map{|x,y| x+y} }
        end
      end
      #     user   system    total      real
      # 1.910000 0.010000 1.920000 ( 1.912284)
    ▶ Numo::NArray (100000 iterations): simple notation, about 230 times faster per iteration
      require "numo/narray"
      a = Numo::NArray[1,2,3,4,5]
      b = Numo::NArray[10,20,30,40,50]
      p c = a + b
      #=> Numo::Int32#shape=[5]
      #[11, 22, 33, 44, 55]

      require "benchmark"
      a = Numo::Int32.new(10000).seq(1)
      b = Numo::Int32.new(10000).seq(10,10)
      Benchmark.bm do |r|
        r.report do
          100000.times{ c = a + b }
        end
      end
      #     user   system    total      real
      # 0.750000 0.080000 0.830000 ( 0.830775)
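The "230 times faster" figure follows from the two benchmark results once the different iteration counts (1000 vs. 100000) are taken into account. A small sketch of that arithmetic, using the real-time values quoted on the slide; the helper names are hypothetical:

```ruby
# Wall-clock seconds spent on one iteration of a benchmark loop.
def seconds_per_iteration(real_seconds, iterations)
  real_seconds / iterations.to_f
end

# How many times faster the candidate is than the baseline.
def speedup(baseline_seconds, candidate_seconds)
  baseline_seconds / candidate_seconds
end

array_time  = seconds_per_iteration(1.912284, 1_000)    # Ruby Array with zip/map
narray_time = seconds_per_iteration(0.830775, 100_000)  # Numo::NArray with a + b

puts speedup(array_time, narray_time).round  # => 230
```

The timings are those of the speaker's machine, so the exact ratio will differ elsewhere.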
  20. NArray methods (Excerpt from Numo::DFloat)
    ▶ Arithmetic
      ◦ + - * / % ** -@ abs divmod reciprocal poly sign square
    ▶ Statistics
      ◦ clip cumprod cumsum diff kahan_sum kron max max_index mean median min min_index minmax mulsum ptp prod rms sum stddev var sort sort_index
    ▶ Random Number (Mersenne Twister)
      ◦ rand rand_norm
    ▶ Comparison
      ◦ eq ge gt le lt ne nearly_eq
    ▶ Numo::NMath module functions
      ◦ acos acosh asin asinh atan atan2 atanh cbrt cos cosh erf erfc exp exp10 exp2 expm1 frexp hypot ldexp log log10 log1p log2 sin sinc sinh sqrt tan tanh
  21. Important features of Numo::NArray (and NumPy)
    1. Slice View
    2. Broadcasting
    3. Masking
  22. 1. Create View on Slice
      a = Numo::DFloat[1..5]
      => Numo::DFloat#shape=[5]
      [1, 2, 3, 4, 5]
      b = a[2..3]
      => Numo::DFloat(view)#shape=[2]
      [3, 4]
      b[0..1] = 0
      a
      => Numo::DFloat#shape=[5]
      [1, 2, 0, 0, 5]
    ▶ b is a view
    ▶ Saves memory and copy cost.
    ▶ Slice views were introduced in Numo::NArray
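What "view" means here can be illustrated without Numo at all. The following is a plain-Ruby sketch, not Numo's implementation: `SliceView` is a hypothetical class that keeps a reference to its parent Array plus an offset, so writes through the view land in the parent's storage.

```ruby
# Plain-Ruby sketch of slice-view semantics (not how Numo::NArray is implemented).
# SliceView is a hypothetical class that shares storage with its parent Array.
class SliceView
  attr_reader :size

  def initialize(parent, range)
    @parent = parent
    @offset = range.begin
    @size = range.size
  end

  # Read element i of the view, i.e. element offset+i of the parent.
  def [](i)
    raise IndexError, "index #{i} out of view" unless (0...@size).cover?(i)
    @parent[@offset + i]
  end

  # Write through to the parent; no data is copied.
  def []=(i, value)
    raise IndexError, "index #{i} out of view" unless (0...@size).cover?(i)
    @parent[@offset + i] = value
  end

  def to_a
    @parent[@offset, @size]
  end
end

a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = SliceView.new(a, 2..3)
b[0] = 0.0
b[1] = 0.0
p a  # the parent array reflects the writes made through the view
```

A plain `a[2..3]` on a Ruby Array would instead return a copy, so assigning into it would leave `a` unchanged.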
  23. 2. Broadcasting
      x = Numo::DFloat[[1,2,3]]
      => Numo::DFloat#shape=[1,3]
      [[1, 2, 3]]
      y = Numo::DFloat[[2],[10]]
      => Numo::DFloat#shape=[2,1]
      [[2],
       [10]]
      x*y
      => Numo::DFloat#shape=[2,3]
      [[2, 4, 6],
       [10, 20, 30]]
    ▶ Apply an element repeatedly on an axis with length=1
      ◦ row 0: x[0,0]*y[0,0]  x[0,1]*y[0,0]  x[0,2]*y[0,0]
      ◦ row 1: x[0,0]*y[1,0]  x[0,1]*y[1,0]  x[0,2]*y[1,0]
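The broadcasting rule on this slide can be spelled out with nested Ruby Arrays. This is a minimal sketch for the 2-dimensional case only; `broadcast_mul` is a hypothetical helper, not a Numo method, and it assumes each axis pair is either equal in length or has length 1 on one side (no shape checking is done).

```ruby
# Plain-Ruby sketch of 2-D broadcasting for element-wise multiplication.
# broadcast_mul is a hypothetical helper, not part of Numo::NArray.
def broadcast_mul(x, y)
  rows = [x.size, y.size].max
  cols = [x[0].size, y[0].size].max
  Array.new(rows) do |i|
    Array.new(cols) do |j|
      # An axis of length 1 is reused for every index along that axis.
      xi = x.size == 1 ? 0 : i
      xj = x[0].size == 1 ? 0 : j
      yi = y.size == 1 ? 0 : i
      yj = y[0].size == 1 ? 0 : j
      x[xi][xj] * y[yi][yj]
    end
  end
end

x = [[1, 2, 3]]   # shape [1,3]
y = [[2], [10]]   # shape [2,1]
p broadcast_mul(x, y)  # => [[2, 4, 6], [10, 20, 30]]
```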
  24. 3. Masking
      a = Numo::DFloat[1,-2,3,-4,-5]
      => Numo::DFloat#shape=[5]
      [1, -2, 3, -4, -5]
      a < 0
      => Numo::Bit#shape=[5]
      [0, 1, 0, 1, 1]
      ▶ Returns Boolean with bit array
      a[a<0] = 0
      a
      => Numo::DFloat#shape=[5]
      [1, 0, 3, 0, 0]
      ▶ Replaces negative elements with zeros
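The two masking steps can be mimicked with plain Ruby Arrays to show what happens element by element. This is a sketch only: `negative_mask` and `assign_where!` are hypothetical helpers, and the mask is an ordinary Array of 0/1 rather than a packed `Numo::Bit` array.

```ruby
# Build a 0/1 mask marking the negative elements (compare `a < 0`).
def negative_mask(values)
  values.map { |v| v < 0 ? 1 : 0 }
end

# Overwrite, in place, every element whose mask entry is 1 (compare `a[mask] = 0`).
def assign_where!(values, mask, replacement)
  values.each_index { |i| values[i] = replacement if mask[i] == 1 }
  values
end

a = [1.0, -2.0, 3.0, -4.0, -5.0]
mask = negative_mask(a)
assign_where!(a, mask, 0.0)
p mask  # => [0, 1, 0, 1, 1]
p a     # => [1.0, 0.0, 3.0, 0.0, 0.0]
```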
  25. Any other Numerical Array in Ruby?
  26. NMatrix
    ▶ Main product of SciRuby.
    ▶ Supports
      ◦ Multi-dimensional Numerical Array
      ◦ Dense Matrix operation (wrapper to BLAS/LAPACK)
      ◦ Sparse Matrix
    ▶ IMO, NMatrix cannot be an alternative to NumPy.
  27. Feature Comparison
    Columns: NumPy, NMatrix, First NArray, Numo::NArray
    View on Slice ✔ ✔ ✔
    Broadcasting ✔ ✔ ✔
    Masking ✔ ✔ ✔
    coerce ✔ ✔ ✔
  28. Performance (NumPy, NArray, NMatrix) [chart]

  29. Other modules in Ruby/Numo

  30. Numo::Linalg
    ▶ Module for Linear Algebra
    ▶ Wrapper to BLAS/LAPACK
    ▶ Ruby Association Grant 2016
      ◦ Supported Kishimoto-san's work to increase coverage.
    ▶ Modules: (Similar to scipy.linalg)
      ◦ Numo::Linalg::Blas, Numo::Linalg::Lapack
        • direct wrapper to BLAS/LAPACK
      ◦ Numo::Linalg
        • Matrix product, Linear solver, Decompositions, Eigen problems, etc.
  31. Numo::Linalg coverage
    ▶ Most of BLAS
    ▶ LAPACK
      ◦ covered: Non-Symmetric and Symmetric functions
      ◦ not covered: Triangular and Banded functions
  32. Backend for Numo::Linalg
    ▶ Original BLAS/LAPACK
    ▶ ATLAS
    ▶ OpenBLAS
    ▶ Intel MKL
    ▶ Numo::Linalg uses dlopen to link the LAPACK backend
      ◦ Easy to replace the backend, unlike SciPy.
  33. Numo::GSL
    ▶ Wrapper to GSL (GNU Scientific Library)
      ◦ GSL is a collection of numerical routines (> 1000 functions in total).
      ◦ Corresponds to the SciPy library
      ◦ Operates efficiently on Numo::NArray
      ◦ Wrapper code is automatically generated from texinfo.
    ▶ Why not use Ruby/GSL?
      ◦ Its wrapper code is written by hand, which makes it hard to maintain.
  34. Numo::GSL coverage
    ▶ Covered by NArray, Linalg, FFTW
      ◦ Complex Numbers
      ◦ Vectors and Matrices
      ◦ Sorting
      ◦ BLAS Support
      ◦ Linear Algebra
      ◦ Eigensystems
      ◦ Fast Fourier Transforms
    ▶ Covered (15)
      ◦ Mathematical Functions
      ◦ Special Functions
      ◦ Physical Constants
      ◦ Random Number Generation
      ◦ Random Number Distributions
      ◦ Statistics
      ◦ Running Statistics
      ◦ Histograms
      ◦ Interpolation
      ◦ Wavelet Transforms
      ◦ Least-Squares Fitting
      ◦ Sparse Matrices
      ◦ Sparse BLAS Support
      ◦ Sparse Linear Algebra
      ◦ Polynomials
    ▶ To do (20)
      ◦ Least-Squares Fitting (multi-parameter)
      ◦ Nonlinear Least-Squares Fitting
      ◦ Basis Splines
      ◦ Chebyshev Approximations
      ◦ Series Acceleration
      ◦ Discrete Hankel Transforms
      ◦ Quasi-Random Sequences
      ◦ Permutations
      ◦ Combinations
      ◦ Multisets
      ◦ Numerical Integration
      ◦ N-tuples
      ◦ Monte Carlo Integration
      ◦ Simulated Annealing
      ◦ Ordinary Differential Equations
      ◦ Numerical Differentiation
      ◦ One dimensional Root-Finding
      ◦ One dimensional Minimization
      ◦ Multidimensional Root-Finding
      ◦ Multidimensional Minimization
  35. Numo::FFTW, Numo::FFTE
    ▶ Wrapper to FFT (Fast Fourier Transform) libraries
    ▶ FFTW
      ◦ Widely-used DFT (Discrete Fourier Transform) library.
      ◦ Provides Complex FFT methods.
      ◦ http://www.fftw.org/
    ▶ FFTE
      ◦ Developed by Prof. Takahashi (University of Tsukuba)
      ◦ http://www.ffte.jp/
      ◦ 2,3,5-radix
  36. Numo::Gnuplot
    ▶ One of many wrappers to Gnuplot
    ▶ Features:
      ◦ Simple interface similar to Gnuplot command line
      ◦ No class for data handling (use Array or NArray)
    ▶ Example:
      require "numo/gnuplot"
      x = (0..100).map{|i| i*0.1}
      y = x.map{|i| Math.sin(i)}
      Numo.gnuplot do
        set title:"X-Y data plot"
        plot x,y, w:'lines', t:'sin(x)'
      end
    ▶ Converted Gnuplot script:
      set title "X-Y data plot"
      plot '-' w lines t "sin(x)"
      0 0
      ...
      e
  37. Numo::Gnuplot example
      require "numo/narray"
      require "numo/gnuplot"
      x = Numo::DFloat.new(1,21).seq-10
      y = Numo::DFloat.new(21,1).seq-10
      f = Numo::NMath.sin(-Numo::NMath.sqrt((x+5)**2+(y-7)**2)*0.5)
      Numo.gnuplot do
        set term:"pngcairo"
        set output:"hidden.png"
        set isosamples:[25,25]
        set :xyplane, at:0
        unset :key
        set :palette, rgbformulae:[31,-11,32]
        set :style, fill_solid:0.5
        set cbrange:-1..1
        set :hidden3d, :front
        splot [f, with:"pm3d"],
              [x**2-y**2, with:"lines", lc_rgb:"black"]
      end
    https://github.com/ruby-numo/gnuplot-demo
  38. Summary of Progress
    ▶ Numo::NArray
      ◦ NumPy-like array is complete.
    ▶ Numo::Linalg
      ◦ BLAS routines are complete.
      ◦ Most non-symmetric and symmetric routines are complete.
    ▶ Numo::GSL
      ◦ 15 modules are complete.
    ▶ Numo::FFTW
      ◦ Complex DFT is complete.
    ▶ Numo::Gnuplot
      ◦ Almost complete.
  39. SciPy Stack (diagram)
    Components: Pandas, Sympy, matplotlib, Scikit-learn, Jupyter (IPython), NumPy, Cython, SciPy lib, Python
    Labels: Scientific Computing, Data Science, Machine Learning
  40. Numo Stack (diagram)
    Components: daru?, No Sympy?, Numo::Gnuplot, ML libs?, Jupyter (IRuby), Numo::NArray, Rubex?, Numo::Linalg, Numo::GSL, Numo::FFTW, Ruby
    Labels: Scientific Computing, Data Science, Machine Learning
  41. Efforts on ML with Ruby
    https://github.com/arbox/machine-learning-with-ruby
  42. Numo issues
    ▶ API design
    ▶ Complete the to-do list
    ▶ Check correctness
    ▶ Refactoring
    ▶ Improve performance
    ▶ Write tests
    ▶ Write documentation
    ▶ Collaborations
      ◦ GPU (CUDA etc.) support
      ◦ Data Science libraries
    ▶ I cannot maintain too many projects...
  43. Deep Learning with Ruby/Numo

  44. Deep Learning From Scratch
    ▶ Published by O'Reilly Japan
      ◦ Implements DL with NumPy
      ◦ https://www.oreilly.co.jp/books/9784873117584/
    ▶ Blog post by Akanuma-san
      ◦ Translation into Ruby/Numo
      ◦ http://blog.akanumahiroaki.com/entry/2017/04/15/160000
      ◦ Elapsed time:
        • Ruby: 281 sec
        • Python: 39 sec
  45. Measurement on My Laptop
      ruby train_neuralnet.rb       # 372 sec
      python3.6 train_neuralnet.py  # 32 sec
    ▶ Depends on the performance of the dot method (matrix product)
      ◦ NArray: naive implementation on a single core
      ◦ NumPy: ATLAS or OpenBLAS with 4 cores
    ▶ Use OpenBLAS with 4 cores by loading Linalg:
      ruby -r numo/linalg/use/openblas train_neuralnet.rb  # 54 sec
      ◦ NArray has room for more speed-up.
  46. Convert Deep Learning code from Python to Ruby
    ▶ Talk in RejectKaigi 2017 by Naitoh-san
      ◦ https://www.slideshare.net/naitoh1/reject-kaigi2017-naitoh
    ▶ Purpose:
      ◦ Translate Chainer code into Ruby
    ▶ py2rb.py
      ◦ Converter from Python to Ruby
      ◦ Replace NumPy with Numo::NArray
    Another way to "get on the shoulder of the giant" for machine learning with Ruby
  47. Incompatibility between NumPy and NArray
    NumPy                  | Numo::NArray              | Why
    if condition:          | if condition != 0         | zero is true in Ruby
    a[:]                   | a[0..-1] or a[true]       | Range feature
    a[b:e]                 | a[b..e-1]                 | Range feature
    b = a.view(); b += 1   | b = a.view; b.inplace + 1 | b += 1 is syntax sugar for b = b + 1
    a[i] (a.ndim >= 2)     | a[i,false]                | NArray feature
    a[[0,1],[1,0]]         | a.at([0,1],[1,0])         | NArray feature
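Three of the rows above come from Ruby itself rather than from NArray, so they can be checked with plain Ruby. This sketch uses ordinary Arrays and hypothetical helper names; it shows the language behaviour only, not any Numo API.

```ruby
# Zero is truthy in Ruby (only nil and false are falsy), so a NumPy-style
# `if condition:` needs an explicit `!= 0` comparison.
def truthy?(value)
  value ? true : false
end

# Python's a[b:e] excludes index e, while Ruby's b..e includes it;
# for e >= 1 the equivalent Ruby range is b..e-1 (or the exclusive b...e).
def python_style_slice(array, b, e)
  array[b..e - 1]
end

# `b += x` is sugar for `b = b + x`: it rebinds b to a new object and
# leaves the object b used to point at untouched.
def plus_equals_rebinds?
  original = [1, 2, 3]
  b = original
  b += [4]
  !b.equal?(original) && original == [1, 2, 3]
end

p truthy?(0)                                      # => true
p python_style_slice([10, 20, 30, 40, 50], 1, 3)  # => [20, 30]
p plus_equals_rebinds?                            # => true
```

The last point is why the table uses `b.inplace + 1` for a view: a plain `b += 1` would create a new array instead of updating the data the view refers to.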
  48. Summary
    ▶ Ruby/Numo
      ◦ N-dimensional Numerical Array (Numo::NArray)
      ◦ Numerical Algorithms (Numo::Linalg|GSL|FFTW|FFTE)
      ◦ Data visualization (Numo::Gnuplot)
      ◦ Needs further effort (coverage, performance, tests, documentation)
    ▶ Experimental cases for Deep Learning