Slide 1

Slide 1 text

Progress of Ruby/Numo: Numerical Computing for Ruby
Masahiro TANAKA - 田中昌宏
RubyKaigi 2017 @ Hiroshima, Nov 19, 2017

Slide 2

Slide 2 text

Masahiro Tanaka
▶ Research Fellow at Center for Computational Sciences, University of Tsukuba
▶ RubyKaigi Speaker
  ○ RubyKaigi 2010 in Tsukuba
    • Topic: NArray
    • http://rubykaigi.org/2010/ja/events/83
  ○ RubyKaigi 2016 in Kyoto
    • Topic: Pwrake
    • http://rubykaigi.org/2016/presentations/masa16tanaka.html
  ○ RubyKaigi 2017 in Hiroshima
    • Topic: Numo::NArray

Slide 3

Slide 3 text

Data Science
▶ Artificial Intelligence (AI)
▶ Machine Learning (ML)
▶ Deep Learning (DL)
▶ Big Data

Slide 4

Slide 4 text

Popular Programming Languages for Data Science
▶ R
  ○ Language for Statistical Computing
▶ Python
  ○ Versatile Language + Libraries for Scientific Computing
  ○ Supported by popular Deep Learning frameworks.

Slide 5

Slide 5 text

https://www.scipy.org/

Slide 6

Slide 6 text

https://docs.scipy.org/doc/scipy/reference/

Slide 7

Slide 7 text

http://scikit-learn.org/stable/

Slide 8

Slide 8 text

SciPy Stack (diagram)
Components: Pandas, Sympy, matplotlib, Scikit-learn, Jupyter (IPython), NumPy, Cython, SciPy lib, Python
Areas: Scientific Computing, Data Science, Machine Learning

Slide 9

Slide 9 text

Two approaches to Data Science with Ruby
▶ Call an existing framework from Ruby
  ○ PyCall
▶ Build libraries in Ruby
  ○ Data Science libraries require Science libraries
  ○ Science libraries require a NumPy-like library
  ○ Is there any library like NumPy in Ruby?

Slide 10

Slide 10 text

NArray
NumPy-like Library for Ruby

Slide 11

Slide 11 text

History of NArray
▶ First design
  ○ 1999 NArray ver. 0.3.0 (start)
  ○ 2000 NArray ver. 0.5.0 (re-design)
  ○ 2016 NArray ver. 0.6.1.2 (current)
▶ Second design
  ○ 2007 NArray ver. 0.7 (start)
  ○ 2011 NArray ver. 0.9 (re-design)
  ○ 2016 Numo::NArray ver. 0.9.0.1 (gem release)
  ○ 2017 Numo::NArray ver. 0.9.0.8 (current)

Slide 12

Slide 12 text

Ruby Numo
https://github.com/ruby-numo
▶ Numo::NArray
  ○ N-dimensional Numerical Array
▶ Numo::Gnuplot
  ○ wrapper to Gnuplot
▶ Numo::GSL
  ○ wrapper to GNU Scientific Library
▶ Numo::Linalg
  ○ Linear Algebra
  ○ wrapper to BLAS/LAPACK
▶ Numo::FFTW, Numo::FFTE
  ○ wrapper to FFT libraries

Slide 13

Slide 13 text

NArray coverage
https://github.com/ruby-numo/narray/wiki/Numo-vs-numpy
▶ 363 NumPy functions
▶ 217 covered
▶ 91 to-do
▶ 55 no plan
  ○ NumPy-specific functions, financial functions, etc.

Slide 14

Slide 14 text

Numo::NArray Basics

Slide 15

Slide 15 text

Creating Numo::NArray

    require "numo/narray"
    a = Numo::NArray[[1,2,3],[4,5,6]]
    => Numo::Int32#shape=[2,3]
    [[1, 2, 3],
     [4, 5, 6]]

Slide 16

Slide 16 text

Data Type = Subclass of Numo::NArray
▶ Bit, Boolean
  – Numo::Bit
▶ Signed Integer
  – Numo::Int8, Numo::Int16, Numo::Int32, Numo::Int64
▶ Unsigned Integer
  – Numo::UInt8, Numo::UInt16, Numo::UInt32, Numo::UInt64
▶ Floating point real number
  – Numo::DFloat (Float64)
  – Numo::SFloat (Float32)
▶ Floating point complex number
  – Numo::DComplex (Complex128)
  – Numo::SComplex (Complex64)
▶ Ruby Object
  – Numo::RObject

Slide 17

Slide 17 text

Shape = Array of sizes along dimensions
▶ shape = [4]: 1-dimensional array, elements a[0] ... a[3]
▶ shape = [4,4]: 2-dimensional array, elements a[0,0] ... a[3,3]
▶ shape = [4,4,4]: 3-dimensional array, elements a[0,0,0] ... a[3,3,3]
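To make the relation between a shape and its elements concrete, here is a minimal plain-Ruby sketch. It is not Numo's implementation. It assumes row-major (C-style) ordering, where the last index varies fastest, and the helper names `total_size` and `flat_index` are hypothetical.

```ruby
# Plain-Ruby sketch (not Numo's implementation): map a multi-dimensional
# index to a position in a flat buffer, assuming row-major order.

# Number of elements described by a shape, e.g. [4,4,4] -> 64.
def total_size(shape)
  shape.inject(1) { |acc, n| acc * n }
end

# Row-major offset: the last index varies fastest.
def flat_index(shape, index)
  raise ArgumentError, "rank mismatch" unless shape.size == index.size
  index.zip(shape).inject(0) do |offset, (i, n)|
    raise IndexError, "index #{i} out of range for size #{n}" unless (0...n).cover?(i)
    offset * n + i
  end
end

p total_size([4, 4, 4])             # 64
p flat_index([4, 4], [1, 2])        # 1*4 + 2 = 6
p flat_index([4, 4, 4], [0, 3, 3])  # 0*16 + 3*4 + 3 = 15
```

The shape alone determines both the element count and how each index tuple maps into storage.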

Slide 18

Slide 18 text

NArray Slice (Indexing)
▶ a[1]
  ○ returns an element
▶ a[1..2]
  ○ returns NArray
▶ a[(1..-1).step(2)]
  ○ returns NArray
▶ a[[1,2,4]]
  ○ returns NArray

Slide 19

Slide 19 text

Notation and Speed of Element-wise Operation

Ruby Array:

    a = [1,2,3,4,5]
    b = [10,20,30,40,50]
    p c = a.zip(b).map{|x,y| x + y}
    #=> [11, 22, 33, 44, 55]

    require "benchmark"
    a = (1..10000).to_a
    b = (10..100000).step(10).to_a
    Benchmark.bm do |r|
      r.report do
        1000.times{ c = a.zip(b).map{|x,y| x+y} }
      end
    end
    #     user     system      total        real
    # 1.910000   0.010000   1.920000 (  1.912284)

Numo::NArray:

    require "numo/narray"
    a = Numo::NArray[1,2,3,4,5]
    b = Numo::NArray[10,20,30,40,50]
    p c = a + b
    #=> Numo::Int32#shape=[5]
    #[11, 22, 33, 44, 55]

    require "benchmark"
    a = Numo::Int32.new(10000).seq(1)
    b = Numo::Int32.new(10000).seq(10,10)
    Benchmark.bm do |r|
      r.report do
        100000.times{ c = a + b }
      end
    end
    #     user     system      total        real
    # 0.750000   0.080000   0.830000 (  0.830775)

▶ Simple notation
▶ 230 times faster
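The two benchmarks run different numbers of iterations (1,000 versus 100,000), so the 230x figure follows from per-iteration times rather than from the raw totals. The arithmetic can be checked with a short sketch that uses only the "real" times quoted above; the helper name `speedup` is hypothetical.

```ruby
# Reported "real" times from the two benchmarks (seconds).
ARRAY_SECONDS  = 1.912284   # 1,000 iterations of a.zip(b).map
NARRAY_SECONDS = 0.830775   # 100,000 iterations of a + b

# Ratio of per-iteration times: (slow / slow_n) / (fast / fast_n).
def speedup(slow_seconds, slow_iterations, fast_seconds, fast_iterations)
  (slow_seconds / slow_iterations) / (fast_seconds / fast_iterations)
end

printf("Array:  %.1f us per iteration\n", ARRAY_SECONDS / 1_000 * 1e6)
printf("NArray: %.2f us per iteration\n", NARRAY_SECONDS / 100_000 * 1e6)
printf("speed-up: about %d times\n",
       speedup(ARRAY_SECONDS, 1_000, NARRAY_SECONDS, 100_000).round)
```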

Slide 20

Slide 20 text

NArray methods (Excerpt from Numo::DFloat)
▶ Arithmetic
  ○ + - * / % ** -@ abs divmod reciprocal poly sign square
▶ Statistics
  ○ clip cumprod cumsum diff kahan_sum kron max max_index mean median min min_index minmax mulsum ptp prod rms sum stddev var sort sort_index
▶ Random Number (Mersenne Twister)
  ○ rand rand_norm
▶ Comparison
  ○ eq ge gt le lt ne nearly_eq
▶ Numo::NMath module function
  ○ acos acosh asin asinh atan atan2 atanh cbrt cos cosh erf erfc exp exp10 exp2 expm1 frexp hypot ldexp log log10 log1p log2 sin sinc sinh sqrt tan tanh

Slide 21

Slide 21 text

Important features of Numo::NArray (and NumPy)
1. Slice View
2. Broadcasting
3. Masking

Slide 22

Slide 22 text

1. Create View on Slice

    a = Numo::DFloat[1..5]
    => Numo::DFloat#shape=[5]
    [1, 2, 3, 4, 5]
    b = a[2..3]
    => Numo::DFloat(view)#shape=[2]
    [3, 4]
    b[0..1] = 0
    a
    => Numo::DFloat#shape=[5]
    [1, 2, 0, 0, 5]

▶ b is a view of a (a = 1 2 3 4 5, b = 3 4), so writing to b changes a.
▶ Saves memory and copy cost.
▶ Slice views were introduced in Numo::NArray.
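The idea behind a slice view can be sketched in plain Ruby. This is an illustration only, not how Numo::NArray is implemented; the `SliceView` class is hypothetical. The view keeps a reference to the parent buffer plus an offset and a length, so creating it copies no elements and writes go through to the parent.

```ruby
# Plain-Ruby sketch of a slice view (hypothetical class, not Numo's code).
class SliceView
  def initialize(buffer, offset, length)
    @buffer = buffer   # shared with the parent, not copied
    @offset = offset
    @length = length
  end

  def [](i)
    check(i)
    @buffer[@offset + i]
  end

  # Writing through the view changes the parent buffer.
  def []=(i, value)
    check(i)
    @buffer[@offset + i] = value
  end

  # Materialize the viewed elements as a new Array (this does copy).
  def to_a
    @buffer[@offset, @length]
  end

  private

  def check(i)
    raise IndexError, "index #{i} out of range" unless (0...@length).cover?(i)
  end
end

a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = SliceView.new(a, 2, 2)   # plays the role of a[2..3]
p b.to_a                     # [3.0, 4.0]
b[0] = 0.0
b[1] = 0.0
p a                          # [1.0, 2.0, 0.0, 0.0, 5.0]
```

By contrast, a plain Ruby `Array#[]` with a range returns a new Array, so assigning into that result would leave the original untouched.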

Slide 23

Slide 23 text

2. Broadcasting

    x = Numo::DFloat[[1,2,3]]
    => Numo::DFloat#shape=[1,3]
    [[1, 2, 3]]
    y = Numo::DFloat[[2],[10]]
    => Numo::DFloat#shape=[2,1]
    [[2],
     [10]]
    x*y
    => Numo::DFloat#shape=[2,3]
    [[2, 4, 6],
     [10, 20, 30]]

▶ Apply an element repeatedly on an axis with length=1

           | x[0,0]         x[0,1]         x[0,2]
    y[0,0] | x[0,0]*y[0,0]  x[0,1]*y[0,0]  x[0,2]*y[0,0]
    y[1,0] | x[0,0]*y[1,0]  x[0,1]*y[1,0]  x[0,2]*y[1,0]
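The broadcasting rule can be written out for the 2-dimensional case with nested Ruby Arrays. This is a sketch, not Numo's implementation, and the helper `broadcast_mul` is hypothetical. On an axis of length 1 it always reads index 0, which is what "apply an element repeatedly" amounts to.

```ruby
# Plain-Ruby sketch of 2-D broadcasting on nested Arrays
# (hypothetical helper, not Numo's implementation).
def broadcast_mul(x, y)
  xs = [x.size, x.first.size]
  ys = [y.size, y.first.size]
  # Sizes on each axis must match, or one of them must be 1.
  out_shape = xs.zip(ys).map do |m, n|
    unless m == n || m == 1 || n == 1
      raise ArgumentError, "shapes #{xs} and #{ys} do not broadcast"
    end
    [m, n].max
  end
  Array.new(out_shape[0]) do |i|
    Array.new(out_shape[1]) do |j|
      # An axis of length 1 always reads index 0, so its element is reused.
      xv = x[xs[0] == 1 ? 0 : i][xs[1] == 1 ? 0 : j]
      yv = y[ys[0] == 1 ? 0 : i][ys[1] == 1 ? 0 : j]
      xv * yv
    end
  end
end

x = [[1, 2, 3]]        # shape [1,3]
y = [[2], [10]]        # shape [2,1]
p broadcast_mul(x, y)  # [[2, 4, 6], [10, 20, 30]]
```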

Slide 24

Slide 24 text

3. Masking

    a = Numo::DFloat[1,-2,3,-4,-5]
    => Numo::DFloat#shape=[5]
    [1, -2, 3, -4, -5]
    a < 0
    => Numo::Bit#shape=[5]
    [0, 1, 0, 1, 1]
    a[a<0] = 0
    a
    => Numo::DFloat#shape=[5]
    [1, 0, 3, 0, 0]

▶ a < 0 returns Booleans as a bit array
▶ a[a<0] = 0 replaces negative elements with zeros
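The two steps of masking (build a 0/1 mask, then assign where the mask is set) can be sketched with plain Ruby Arrays. The helpers `mask_lt` and `assign_where!` are hypothetical and only mirror the behavior shown above; Numo stores the mask as a packed Numo::Bit array instead.

```ruby
# Plain-Ruby sketch of masking (hypothetical helpers, not Numo's code).

# Step 1: element-wise comparison yields a 0/1 mask of the same length.
def mask_lt(values, threshold)
  values.map { |v| v < threshold ? 1 : 0 }
end

# Step 2: overwrite only the positions where the mask is 1 (in place).
def assign_where!(values, mask, replacement)
  raise ArgumentError, "size mismatch" unless values.size == mask.size
  values.each_index { |i| values[i] = replacement if mask[i] == 1 }
  values
end

a = [1.0, -2.0, 3.0, -4.0, -5.0]
mask = mask_lt(a, 0)
p mask                         # [0, 1, 0, 1, 1]
p assign_where!(a, mask, 0.0)  # [1.0, 0.0, 3.0, 0.0, 0.0]
```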

Slide 25

Slide 25 text

Any other Numerical Array in Ruby?

Slide 26

Slide 26 text

NMatrix
▶ Main product of SciRuby.
▶ Supports
  ○ Multi-dimensional Numerical Array
  ○ Dense Matrix operation (wrapper to BLAS/LAPACK)
  ○ Sparse Matrix
▶ IMO, NMatrix cannot be an alternative to NumPy.

Slide 27

Slide 27 text

Feature Comparison
Columns: NumPy, NMatrix, First NArray, Numo::NArray
View on Slice ✔ ✔ ✔
Broadcasting ✔ ✔ ✔
Masking ✔ ✔ ✔
coerce ✔ ✔ ✔

Slide 28

Slide 28 text

Performance (NumPy, NArray, NMatrix)
(chart; axis annotation: faster →)

Slide 29

Slide 29 text

Other modules in Ruby/Numo

Slide 30

Slide 30 text

Numo::Linalg
▶ Module for Linear Algebra
▶ Wrapper to BLAS/LAPACK
▶ Ruby Association Grant 2016
  ○ The grant supported Kishimoto-san's work to increase coverage.
▶ Modules: (Similar to scipy.linalg)
  ○ Numo::Linalg::Blas, Numo::Linalg::Lapack
    • direct wrapper to BLAS/LAPACK
  ○ Numo::Linalg
    • Matrix product, Linear solver, Decompositions, Eigen problems, etc.

Slide 31

Slide 31 text

Numo::Linalg coverage
▶ Most of BLAS
▶ LAPACK
  ○ covered: Non-Symmetric and Symmetric functions
  ○ not covered: Triangular and Banded functions

Slide 32

Slide 32 text

Backend for Numo::Linalg
▶ Original BLAS/LAPACK
▶ ATLAS
▶ OpenBLAS
▶ Intel MKL
▶ Numo::Linalg uses dlopen to link the LAPACK backend
  ○ Easy to replace the backend, unlike SciPy.

Slide 33

Slide 33 text

Numo::GSL
▶ Wrapper to GSL (GNU Scientific Library)
  ○ GSL is a collection of numerical routines (> 1000 functions in total).
  ○ Corresponds to the SciPy library
  ○ Operates efficiently on Numo::NArray
  ○ Wrapper code is automatically generated from texinfo.
▶ Why not use Ruby/GSL?
  ○ Its wrapper code is written by hand, which is hard to maintain.

Slide 34

Slide 34 text

Numo::GSL coverage
▶ Covered by NArray, Linalg, FFTW
  ○ Complex Numbers
  ○ Vectors and Matrices
  ○ Sorting
  ○ BLAS Support
  ○ Linear Algebra
  ○ Eigensystems
  ○ Fast Fourier Transforms
▶ Covered (15)
  ○ Mathematical Functions
  ○ Special Functions
  ○ Physical Constants
  ○ Random Number Generation
  ○ Random Number Distributions
  ○ Statistics
  ○ Running Statistics
  ○ Histograms
  ○ Interpolation
  ○ Wavelet Transforms
  ○ Least-Squares Fitting
  ○ Sparse Matrices
  ○ Sparse BLAS Support
  ○ Sparse Linear Algebra
  ○ Polynomials
▶ To do (20)
  ○ Least-Squares Fitting (multi-parameter)
  ○ Nonlinear Least-Squares Fitting
  ○ Basis Splines
  ○ Chebyshev Approximations
  ○ Series Acceleration
  ○ Discrete Hankel Transforms
  ○ Quasi-Random Sequences
  ○ Permutations
  ○ Combinations
  ○ Multisets
  ○ Numerical Integration
  ○ N-tuples
  ○ Monte Carlo Integration
  ○ Simulated Annealing
  ○ Ordinary Differential Equations
  ○ Numerical Differentiation
  ○ One dimensional Root-Finding
  ○ One dimensional Minimization
  ○ Multidimensional Root-Finding
  ○ Multidimensional Minimization

Slide 35

Slide 35 text

Numo::FFTW, Numo::FFTE
▶ Wrapper to FFT (Fast Fourier Transform) libraries
▶ FFTW
  ○ Widely-used DFT (Discrete Fourier Transform) library.
  ○ Provides Complex FFT methods.
  ○ http://www.fftw.org/
▶ FFTE
  ○ Developed by Prof. Takahashi (University of Tsukuba)
  ○ http://www.ffte.jp/
  ○ 2,3,5-radix

Slide 36

Slide 36 text

Numo::Gnuplot
▶ One of many wrappers to Gnuplot
▶ Features:
  ○ Simple interface similar to Gnuplot command line
  ○ No class for data handling (use Array or NArray)
▶ Example:

    require "numo/gnuplot"
    x = (0..100).map{|i| i*0.1}
    y = x.map{|i| Math.sin(i)}
    Numo.gnuplot do
      set title:"X-Y data plot"
      plot x,y, w:'lines', t:'sin(x)'
    end

▶ Converted Gnuplot script:

    set title "X-Y data plot"
    plot '-' w lines t "sin(x)"
    0 0
    ...
    e

Slide 37

Slide 37 text

Numo::Gnuplot example
https://github.com/ruby-numo/gnuplot-demo

    require "numo/narray"
    require "numo/gnuplot"
    x = Numo::DFloat.new(1,21).seq-10
    y = Numo::DFloat.new(21,1).seq-10
    f = Numo::NMath.sin(-Numo::NMath.sqrt((x+5)**2+(y-7)**2)*0.5)
    Numo.gnuplot do
      set term:"pngcairo"
      set output:"hidden.png"
      set isosamples:[25,25]
      set :xyplane, at:0
      unset :key
      set :palette, rgbformulae:[31,-11,32]
      set :style, fill_solid:0.5
      set cbrange:-1..1
      set :hidden3d, :front
      splot [f, with:"pm3d"],
            [x**2-y**2, with:"lines", lc_rgb:"black"]
    end

Slide 38

Slide 38 text

Summary of Progress
▶ Numo::NArray
  ○ NumPy-like array is complete.
▶ Numo::Linalg
  ○ BLAS routines are complete.
  ○ Most non-symmetric and symmetric routines are complete.
▶ Numo::GSL
  ○ 15 modules are complete.
▶ Numo::FFTW
  ○ Complex DFT is complete.
▶ Numo::Gnuplot
  ○ Almost complete.

Slide 39

Slide 39 text

SciPy Stack (diagram)
Components: Pandas, Sympy, matplotlib, Scikit-learn, Jupyter (IPython), NumPy, Cython, SciPy lib, Python
Areas: Scientific Computing, Data Science, Machine Learning

Slide 40

Slide 40 text

Numo Stack (diagram)
Components: daru?, No Sympy?, Numo::Gnuplot, ML libs?, Jupyter (IRuby), Numo::NArray, Rubex?, Numo::Linalg, Numo::GSL, Numo::FFTW, Ruby
Areas: Scientific Computing, Data Science, Machine Learning

Slide 41

Slide 41 text

Efforts on ML with Ruby
https://github.com/arbox/machine-learning-with-ruby

Slide 42

Slide 42 text

Numo issues
▶ API design
▶ Complete to-do list
▶ Check correctness
▶ Refactoring
▶ Improve performance
▶ Write tests
▶ Write documentation
▶ Collaborations
  – GPU (CUDA etc.) support
  – Data Science libraries
▶ I cannot maintain too many projects...

Slide 43

Slide 43 text

Deep Learning with Ruby/Numo

Slide 44

Slide 44 text

Deep Learning From Scratch
▶ Published by O'Reilly Japan
  ○ Implements DL with NumPy
  ○ https://www.oreilly.co.jp/books/9784873117584/
▶ Blog post by Akanuma-san
  ○ http://blog.akanumahiroaki.com/entry/2017/04/15/160000
  ○ Translation into Ruby/Numo
  ○ Elapsed time:
    • Ruby: 281 sec
    • Python: 39 sec

Slide 45

Slide 45 text

Measurement on My Laptop

    ruby train_neuralnet.rb         # 372 sec
    python3.6 train_neuralnet.py    # 32 sec

▶ Depends on the performance of the dot method (matrix product)
  ○ NArray: Naïve implementation with single core
  ○ NumPy: ATLAS or OpenBLAS with 4 cores
▶ Use OpenBLAS with 4 cores by loading Linalg:

    ruby -r numo/linalg/use/openblas train_neuralnet.rb    # 54 sec

  ○ NArray has room for more speed-up.
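The three timings translate into the following ratios. The sketch below only redoes the arithmetic on the numbers quoted above; the constant names and the `ratio` helper are hypothetical.

```ruby
# Elapsed times quoted above (seconds).
NAIVE_SEC    = 372.0  # ruby train_neuralnet.rb
OPENBLAS_SEC = 54.0   # ruby -r numo/linalg/use/openblas train_neuralnet.rb
NUMPY_SEC    = 32.0   # python3.6 train_neuralnet.py

# How many times longer the slower run takes, rounded to one decimal.
def ratio(slower, faster)
  (slower / faster).round(1)
end

puts "OpenBLAS backend vs. naive dot: #{ratio(NAIVE_SEC, OPENBLAS_SEC)}x faster"
puts "Remaining gap to NumPy:         #{ratio(OPENBLAS_SEC, NUMPY_SEC)}x"
puts "Original gap to NumPy:          #{ratio(NAIVE_SEC, NUMPY_SEC)}x"
```

Switching the matrix product to OpenBLAS removes most of the difference, and the remaining factor is what "room for more speed-up" refers to.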

Slide 46

Slide 46 text

Convert Deep Learning code from Python to Ruby
▶ Talk in RejectKaigi 2017 by Naitoh-san
  – https://www.slideshare.net/naitoh1/reject-kaigi2017-naitoh
▶ Purpose:
  – Translate Chainer code into Ruby
▶ py2rb.py
  – Converter from Python to Ruby
  – Replace NumPy with Numo::NArray
▶ Another way to "get on the shoulder of the giant" for machine learning with Ruby

Slide 47

Slide 47 text

Incompatibility between NumPy and NArray

    NumPy                  | Numo::NArray               | Why
    if condition:          | if condition != 0          | zero is true in Ruby
    a[:]                   | a[0..-1] or a[true]        | Range feature
    a[b:e]                 | a[b..e-1]                  | Range feature
    b = a.view(); b += 1   | b = a.view; b.inplace + 1  | b += 1 is syntax sugar for b = b + 1
    a[i] (a.ndim >= 2)     | a[i,false]                 | NArray feature
    a[[0,1],[1,0]]         | a.at([0,1],[1,0])          | NArray feature
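Three of these differences come from Ruby itself rather than from NArray, and they can be observed with plain Ruby objects. The sketch below uses ordinary Arrays only, so it illustrates the language rules and not Numo's behavior; the `truthy?` helper is hypothetical.

```ruby
# 1. Zero is truthy in Ruby (only nil and false are falsy), so a
#    Python-style `if condition:` on a number needs an explicit comparison.
def truthy?(value)
  value ? true : false
end
p truthy?(0)      # true
p(0 != 0)         # false

# 2. Ruby's b..e includes e, whereas Python's b:e excludes it.
a = [10, 20, 30, 40, 50]
p a[1..3 - 1]     # [20, 30]  (Python: a[1:3])
p a[1...3]        # [20, 30]  a three-dot range also excludes the end

# 3. `b += x` is shorthand for `b = b + x`: it rebinds b to a new object
#    instead of updating the original object in place.
orig = [1, 2, 3]
b = orig
b += [4]
p orig            # [1, 2, 3]
p b.equal?(orig)  # false
```

The third point is why the table uses `b.inplace + 1`: rebinding `b` would leave the array that the view refers to unchanged.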

Slide 48

Slide 48 text

Summary
▶ Ruby/Numo
  ○ N-dimensional Numerical Array (Numo::NArray)
  ○ Numerical Algorithms (Numo::Linalg|GSL|FFTW|FFTE)
  ○ Data visualization (Numo::Gnuplot)
  ○ Needs further effort (coverage, performance, tests, documentation).
▶ Experimental cases for Deep Learning