Progress of Ruby-Numo: Numerical Computing for Ruby

Masahiro Tanaka 田中昌宏

September 19, 2017

Transcript

  1. Progress of Ruby/Numo:
    Numerical Computing for Ruby
    Masahiro TANAKA - 田中昌宏
    RubyKaigi 2017 @ Hiroshima
    Nov 19, 2017 [email protected] 1


  2. Masahiro Tanaka
    ▶ Research Fellow at Center for Computational Sciences, University of Tsukuba
    ▶ RubyKaigi Speaker
    ○ RubyKaigi 2010 in Tsukuba
    • Topic: NArray
    • http://rubykaigi.org/2010/ja/events/83
    ○ RubyKaigi 2016 in Kyoto
    • Topic: Pwrake
    • http://rubykaigi.org/2016/presentations/masa16tanaka.html
    ○ RubyKaigi 2017 in Hiroshima
    • Topic: Numo::NArray

  3. Data Science
    ▶ Artificial Intelligence (AI)
    ▶ Machine Learning (ML)
    ▶ Deep Learning (DL)
    ▶ Big Data

  4. Popular Programming Languages for Data Science
    ▶ R
    ○ Language for Statistical Computing
    ▶ Python
    ○ Versatile Language + Library for Scientific Computing
    ○ Supported by popular Deep Learning frameworks.

  5. https://www.scipy.org/

  6. https://docs.scipy.org/doc/scipy/reference/

  7. http://scikit-learn.org/stable/

  8. SciPy Stack
    Pandas
    Sympy
    matplotlib
    Scikit-learn
    Jupyter (IPython)
    NumPy
    Cython
    SciPy lib
    Python
    Scientific Computing
    Data Science
    Machine Learning

  9. Two approaches to Data Science with Ruby
    ▶ Call an Existing Framework from Ruby
    ○ PyCall
    ▶ Build Libraries in Ruby
    ○ Data Science Libraries require Science Libraries
    ○ Science Libraries require a NumPy-like library
    ○ Is there any library like NumPy in Ruby?

  10. NArray
    NumPy-like Library for Ruby

  11. History of NArray
    ▶ First design
    ○ 1999 NArray ver. 0.3.0 (start)
    ○ 2000 NArray ver. 0.5.0 (re-design)
    ○ 2016 NArray ver. 0.6.1.2 (current)
    ▶ Second design
    ○ 2007 NArray ver. 0.7 (start)
    ○ 2011 NArray ver. 0.9 (re-design)
    ○ 2016 Numo::NArray ver. 0.9.0.1 (gem release)
    ○ 2017 Numo::NArray ver. 0.9.0.8 (current)

  12. Ruby Numo
    ▶ Numo::NArray
    ○ N-dimensional Numerical Array
    ▶ Numo::Gnuplot
    ○ wrapper to Gnuplot
    ▶ Numo::GSL
    ○ wrapper to GNU Scientific Library
    ▶ Numo::Linalg
    ○ Linear Algebra
    ○ wrapper to BLAS/LAPACK
    ▶ Numo::FFTW, Numo::FFTE
    ○ wrapper to FFT libraries
    https://github.com/ruby-numo


  13. NArray coverage
    ▶ 363 NumPy functions
    ▶ 217 covered
    ▶ 91 to-do
    ▶ 55 no plan
    ○ NumPy-specific functions, financial functions, etc.
    https://github.com/ruby-numo/narray/wiki/Numo-vs-numpy
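
The counts on this slide can be checked with a few lines of plain Ruby. This is only an arithmetic sketch: the constant names are made up for illustration, and the numbers are the ones quoted above.

```ruby
# Counts taken from the coverage slide.
COVERAGE = { covered: 217, todo: 91, no_plan: 55 }
TOTAL = COVERAGE.values.sum   # should equal the 363 NumPy functions

COVERAGE.each do |status, count|
  puts format("%-8s %3d (%.1f%%)", status, count, 100.0 * count / TOTAL)
end

# Share of the functions that are actually planned (covered + to-do).
PLANNED = COVERAGE[:covered] + COVERAGE[:todo]
COVERED_OF_PLANNED = 100.0 * COVERAGE[:covered] / PLANNED
puts format("covered among planned functions: %.1f%%", COVERED_OF_PLANNED)
```

Excluding the "no plan" group makes the remaining workload easier to judge than the raw share of all 363 functions.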


  14. Numo::NArray Basics

  15. Creating Numo::NArray
    require "numo/narray"
    a = Numo::NArray[[1,2,3],[4,5,6]]
    => Numo::Int32#shape=[2,3]
    [[1, 2, 3],
    [4, 5, 6]]

  16. Data Type = Subclass of Numo::NArray
    ▶ Bit, Boolean
    – Numo::Bit
    ▶ Signed Integer
    – Numo::Int8, Numo::Int32
    – Numo::Int16, Numo::Int64
    ▶ Unsigned Integer
    – Numo::UInt8, Numo::UInt32
    – Numo::UInt16, Numo::UInt64
    ▶ Floating point real number
    – Numo::DFloat (Float64)
    – Numo::SFloat (Float32)
    ▶ Floating point complex number
    – Numo::DComplex (Complex128)
    – Numo::SComplex (Complex64)
    ▶ Ruby Object
    – Numo::RObject

  17. Shape = Array of sizes along dimensions
    ▶ shape = [4]
    – 1-dimensional array
    a[0] a[1] a[2] a[3]
    ▶ shape = [4,4]
    – 2-dimensional array
    a[0,0] a[0,1] a[0,2] a[0,3]
    a[1,0] a[1,1] a[1,2] a[1,3]
    a[2,0] a[2,1] a[2,2] a[2,3]
    a[3,0] a[3,1] a[3,2] a[3,3]
    ▶ shape = [4,4,4]
    – 3-dimensional array
    a[0,0,0] a[0,0,1] a[0,0,2] a[0,0,3]
    a[0,1,0] a[0,1,1] a[0,1,2] a[0,1,3]
    a[0,2,0] a[0,2,1] a[0,2,2] a[0,2,3]
    a[0,3,0] a[0,3,1] a[0,3,2] a[0,3,3]
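
To make the relation between a shape and its elements concrete, here is a plain-Ruby sketch. The helpers `element_count` and `flat_index` are hypothetical (they are not part of Numo::NArray), and the offset calculation assumes row-major order, as in NumPy's default layout.

```ruby
# Hypothetical helpers (not part of Numo::NArray): relate a shape to
# the element count and to a row-major (C-order) flat offset.

# Total number of elements = product of the sizes along each dimension.
def element_count(shape)
  shape.inject(1) { |n, size| n * size }
end

# Row-major offset: the last index varies fastest.
def flat_index(shape, index)
  raise ArgumentError, "rank mismatch" unless shape.size == index.size
  shape.zip(index).inject(0) do |offset, (size, i)|
    raise IndexError, "index #{i} out of range for size #{size}" unless (0...size).cover?(i)
    offset * size + i
  end
end

COUNT_1D = element_count([4])          # 4
COUNT_2D = element_count([4, 4])       # 16
COUNT_3D = element_count([4, 4, 4])    # 64
OFFSET_2D = flat_index([4, 4], [1, 2])        # 1*4 + 2 = 6
OFFSET_3D = flat_index([4, 4, 4], [0, 3, 3])  # 0*16 + 3*4 + 3 = 15
```

Under that assumption, a[0,3,3], the last element listed in the 3-dimensional diagram, sits at offset 15, which is the end of the first 4x4 plane.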


  18. NArray Slice (Indexing)
    ▶ a[1]
    ○ returns an element
    ▶ a[1..2]
    ○ returns NArray
    ▶ a[(1..-1).step(2)]
    ○ returns NArray
    ▶ a[[1,2,4]]
    ○ returns NArray
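
The four index forms select different positions of a 5-element array. The sketch below mimics the selection with a plain Ruby Array rather than Numo::NArray, so it runs without the gem; the constant names are illustrative only.

```ruby
# Plain-Ruby illustration (Array, not Numo::NArray) of which positions
# each index form on the slide selects from a 5-element array.
A = [10, 20, 30, 40, 50]

SINGLE  = A[1]                                    # like a[1]: one element
RANGE   = A[1..2]                                 # like a[1..2]: contiguous range
STEPPED = 1.step(A.size - 1, 2).map { |i| A[i] }  # like a[(1..-1).step(2)]: indices 1, 3
PICKED  = A.values_at(1, 2, 4)                    # like a[[1,2,4]]: explicit index list

p SINGLE, RANGE, STEPPED, PICKED
```

Unlike the Array results here, which are copies, the slide states that the NArray forms other than a[1] return an NArray.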


  19. Notation and Speed of Element-wise Operation
    ▶ Ruby Array
    a = [1,2,3,4,5]
    b = [10,20,30,40,50]
    p c = a.zip(b).map{|x,y| x + y}
    #=> [11, 22, 33, 44, 55]
    require "benchmark"
    a = (1..10000).to_a
    b = (10..100000).step(10).to_a
    Benchmark.bm do |r|
      r.report do
        1000.times{ c = a.zip(b).map{|x,y| x+y} }
      end
    end
    #     user   system    total      real
    # 1.910000 0.010000 1.920000 ( 1.912284)
    ▶ Numo::NArray
    require "numo/narray"
    a = Numo::NArray[1,2,3,4,5]
    b = Numo::NArray[10,20,30,40,50]
    p c = a + b
    #=> Numo::Int32#shape=[5]
    #[11, 22, 33, 44, 55]
    require "benchmark"
    a = Numo::Int32.new(10000).seq(1)
    b = Numo::Int32.new(10000).seq(10,10)
    Benchmark.bm do |r|
      r.report do
        100000.times{ c = a + b }
      end
    end
    #     user   system    total      real
    # 0.750000 0.080000 0.830000 ( 0.830775)
    ▶ Simple notation
    ▶ 230 times faster
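
The two benchmark runs use different iteration counts (1,000 for Array, 100,000 for NArray), so the "230 times" figure only appears after normalizing to time per iteration. The following sketch redoes that arithmetic from the quoted "real" timings; the constant names are illustrative.

```ruby
# Timings copied from the "real" column of the two benchmark runs.
ARRAY_SECONDS     = 1.912284   # 1,000 iterations with Ruby Array
NARRAY_SECONDS    = 0.830775   # 100,000 iterations with Numo::NArray
ARRAY_ITERATIONS  = 1_000
NARRAY_ITERATIONS = 100_000

# Normalize to seconds per iteration before comparing.
ARRAY_PER_ITER  = ARRAY_SECONDS / ARRAY_ITERATIONS
NARRAY_PER_ITER = NARRAY_SECONDS / NARRAY_ITERATIONS
SPEEDUP = ARRAY_PER_ITER / NARRAY_PER_ITER

puts format("Array:  %.3f ms per iteration", ARRAY_PER_ITER * 1e3)
puts format("NArray: %.2f us per iteration", NARRAY_PER_ITER * 1e6)
puts format("Speed-up: about %.0fx", SPEEDUP)
```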


  20. NArray methods (Excerpt from Numo::DFloat)
    ▶ Arithmetic
    ○ + - * / % ** -@ abs divmod reciprocal poly sign square
    ▶ Statistics
    ○ clip cumprod cumsum diff kahan_sum kron max max_index mean median
    min min_index minmax mulsum ptp prod rms sum stddev var sort
    sort_index
    ▶ Random Number (Mersenne Twister)
    ○ rand rand_norm
    ▶ Comparison
    ○ eq ge gt le lt ne nearly_eq
    ▶ Numo::NMath module function
    ○ acos acosh asin asinh atan atan2 atanh cbrt cos cosh erf erfc exp
    exp10 exp2 expm1 frexp hypot ldexp log log10 log1p log2 sin sinc
    sinh sqrt tan tanh

  21. Important features of Numo::NArray (and NumPy)
    1. Slice View
    2. Broadcasting
    3. Masking

  22. 1. Create View on Slice
    a = Numo::DFloat[1..5]
    => Numo::DFloat#shape=[5]
    [1, 2, 3, 4, 5]
    b = a[2..3]
    => Numo::DFloat(view)#shape=[2]
    [3, 4]
    b[0..1] = 0
    a
    => Numo::DFloat#shape=[5]
    [1, 2, 0, 0, 5]
    ▶ b is a view of a[2..3], so assigning to b changes a.
    ▶ Saves memory and copy cost.
    ▶ Slice views were introduced in Numo::NArray.
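
The write-through behaviour of a view can be illustrated without Numo. The `SliceView` class below is hypothetical: it is a plain-Ruby sketch of the idea (a window that shares storage with its base array), not how Numo::NArray is implemented.

```ruby
# Hypothetical SliceView class: a window onto another array's storage.
# Plain-Ruby illustration only, not the Numo::NArray implementation.
class SliceView
  def initialize(base, offset, length)
    @base = base       # shared storage, not copied
    @offset = offset
    @length = length
  end

  def [](i)
    check(i)
    @base[@offset + i]
  end

  # Writes go straight through to the shared storage.
  def []=(i, value)
    check(i)
    @base[@offset + i] = value
  end

  def fill(value)
    @length.times { |i| self[i] = value }
    self
  end

  # Snapshot of the elements currently visible through the view.
  def to_a
    @base[@offset, @length]
  end

  private

  def check(i)
    raise IndexError, "index #{i} outside view" unless (0...@length).cover?(i)
  end
end

BASE = [1.0, 2.0, 3.0, 4.0, 5.0]
VIEW = SliceView.new(BASE, 2, 2)   # corresponds to b = a[2..3]
BEFORE = VIEW.to_a                 # elements seen through the view
VIEW.fill(0.0)                     # corresponds to b[0..1] = 0
p BEFORE, BASE
```

Because the view stores only an offset and a length, creating it costs no element copies, which is the memory and copy saving the slide refers to.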


  23. 2. Broadcasting
    x = Numo::DFloat[[1,2,3]]
    => Numo::DFloat#shape=[1,3]
    [[1, 2, 3]]
    y = Numo::DFloat[[2],[10]]
    => Numo::DFloat#shape=[2,1]
    [[2],
     [10]]
    x*y
    => Numo::DFloat#shape=[2,3]
    [[2, 4, 6],
     [10, 20, 30]]
    ▶ An element on an axis with length 1 is applied repeatedly along that axis.
           | x[0,0]        | x[0,1]        | x[0,2]
    y[0,0] | x[0,0]*y[0,0] | x[0,1]*y[0,0] | x[0,2]*y[0,0]
    y[1,0] | x[0,0]*y[1,0] | x[0,1]*y[1,0] | x[0,2]*y[1,0]
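
The broadcasting rule can be written out for the 2-dimensional case in plain Ruby. The helpers `broadcast_mul`, `broadcast_size`, and `pick` are hypothetical names for this sketch; Numo::NArray applies the same idea internally to any number of dimensions.

```ruby
# Hypothetical helper: element-wise product of two 2-D nested Arrays
# with broadcasting. Plain Ruby for illustration only.
def broadcast_mul(x, y)
  rows = broadcast_size(x.size, y.size)
  cols = broadcast_size(x.first.size, y.first.size)
  Array.new(rows) do |i|
    Array.new(cols) do |j|
      pick(x, i, j) * pick(y, i, j)
    end
  end
end

# Two sizes are compatible when they are equal or one of them is 1.
def broadcast_size(m, n)
  return [m, n].max if m == n || m == 1 || n == 1
  raise ArgumentError, "cannot broadcast sizes #{m} and #{n}"
end

# An axis of length 1 always yields its single element.
def pick(a, i, j)
  row = a.size == 1 ? a[0] : a[i]
  row.size == 1 ? row[0] : row[j]
end

X = [[1.0, 2.0, 3.0]]      # shape [1,3]
Y = [[2.0], [10.0]]        # shape [2,1]
PRODUCT = broadcast_mul(X, Y)
p PRODUCT
```

The result has shape [2,3]: each output size is the larger of the two input sizes on that axis.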


  24. 3. Masking
    a = Numo::DFloat[1,-2,3,-4,-5]
    => Numo::DFloat#shape=[5]
    [1, -2, 3, -4, -5]
    a < 0
    => Numo::Bit#shape=[5]
    [0, 1, 0, 1, 1]
    a[a<0] = 0
    a
    => Numo::DFloat#shape=[5]
    [1, 0, 3, 0, 0]
    ▶ a < 0 returns Boolean values as a bit array (Numo::Bit)
    ▶ a[a<0] = 0 replaces negative elements with zeros
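
Masking is a two-step operation: build one flag per element, then assign only where the flag is set. The plain-Ruby sketch below separates the two steps; `masked_assign` is a hypothetical helper, and unlike `a[a<0] = 0` it returns a new Array instead of modifying the original.

```ruby
VALUES = [1.0, -2.0, 3.0, -4.0, -5.0]

# Step 1: the comparison yields one flag per element (the mask).
MASK = VALUES.map { |v| v < 0 }

# Step 2: masked assignment replaces only the flagged positions.
# Hypothetical helper; returns a new Array rather than updating in place.
def masked_assign(values, mask, replacement)
  values.zip(mask).map { |v, flagged| flagged ? replacement : v }
end

CLEANED = masked_assign(VALUES, MASK, 0.0)
p MASK, CLEANED
```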


  25. Any other Numerical Array in Ruby?

  26. NMatrix
    ▶ Main product of SciRuby.
    ▶ Supports
    ○ Multi-dimensional Numerical Array
    ○ Dense Matrix operation (wrapper to BLAS/LAPACK)
    ○ Sparse Matrix
    ▶ IMO, NMatrix cannot be an alternative to NumPy.

  27. Feature Comparison
    (columns: NumPy, NMatrix, First NArray, Numo::NArray)
    View on Slice ✔ ✔ ✔
    Broadcasting ✔ ✔ ✔
    Masking ✔ ✔ ✔
    coerce ✔ ✔ ✔

  28. Performance (NumPy, NArray, NMatrix)
    [Chart comparing the performance of NumPy, NArray, and NMatrix; axis direction: faster →]

  29. Other modules in Ruby/Numo

  30. Numo::Linalg
    ▶ Module for Linear Algebra
    ▶ Wrapper to BLAS/LAPACK
    ▶ Ruby Association Grant 2016
    ○ Supported Kishimoto-san to increase coverage.
    ▶ Modules: (Similar to scipy.linalg)
    ○ Numo::Linalg::Blas, Numo::Linalg::Lapack
    • direct wrapper to BLAS/LAPACK
    ○ Numo::Linalg
    • Matrix product, Linear solver, Decompositions, Eigen problems, etc.

  31. Numo::Linalg coverage
    ▶ Most of BLAS
    ▶ LAPACK
    ○ covered: Non-Symmetric and Symmetric functions
    ○ not covered: Triangular and Banded functions

  32. Backend for Numo::Linalg
    ▶ Original BLAS/LAPACK
    ▶ Atlas
    ▶ OpenBLAS
    ▶ Intel MKL
    ▶ Numo::Linalg uses dlopen to link the LAPACK backend
    ○ Easy to replace the backend, unlike SciPy.

  33. Numo::GSL
    ▶ Wrapper to GSL (GNU Scientific Library)
    ○ GSL is a collection of numerical routines (> 1000 functions in total).
    ○ Corresponding to SciPy library
    ○ Operates efficiently on Numo::NArray
    ○ Wrapper code is automatically generated from texinfo.
    ▶ Why not use Ruby/GSL?
    ○ Its wrapper code is written by hand, which makes it hard to maintain.

  34. Numo::GSL coverage
    ▶ Covered by NArray, Linalg, FFTW
    ○ Complex Numbers
    ○ Vectors and Matrices
    ○ Sorting
    ○ BLAS Support
    ○ Linear Algebra
    ○ Eigensystems
    ○ Fast Fourier Transforms
    ▶ To do (20)
    ○ Least-Squares Fitting (multi-parameter)
    ○ Nonlinear Least-Squares Fitting
    ○ Basis Splines
    ○ Chebyshev Approximations
    ○ Series Acceleration
    ○ Discrete Hankel Transforms
    ○ Quasi-Random Sequences
    ○ Permutations
    ○ Combinations
    ○ Multisets
    ○ Numerical Integration
    ○ N-tuples
    ○ Monte Carlo Integration
    ○ Simulated Annealing
    ○ Ordinary Differential Equations
    ○ Numerical Differentiation
    ○ One dimensional Root-Finding
    ○ One dimensional Minimization
    ○ Multidimensional Root-Finding
    ○ Multidimensional Minimization
    ▶ Covered (15)
    ○ Mathematical Functions
    ○ Special Functions
    ○ Physical Constants
    ○ Random Number Generation
    ○ Random Number Distributions
    ○ Statistics
    ○ Running Statistics
    ○ Histograms
    ○ Interpolation
    ○ Wavelet Transforms
    ○ Least-Squares Fitting
    ○ Sparse Matrices
    ○ Sparse BLAS Support
    ○ Sparse Linear Algebra
    ○ Polynomials


  35. Numo::FFTW, Numo::FFTE
    ▶ Wrapper to FFT (Fast Fourier Transform) libraries
    ▶ FFTW
    ○ Widely-used DFT (Discrete Fourier Transform) library.
    ○ Provides Complex FFT methods.
    ○ http://www.fftw.org/
    ▶ FFTE
    ○ Developed by Prof. Takahashi (University of Tsukuba)
    ○ http://www.ffte.jp/
    ○ 2,3,5-radix

  36. Numo::Gnuplot
    ▶ One of many wrappers to Gnuplot
    ▶ Features:
    ○ Simple interface similar to Gnuplot command line
    ○ No class for data handling (use Array or NArray)
    ▶ Example:
    require "numo/gnuplot"
    x = (0..100).map{|i| i*0.1}
    y = x.map{|i| Math.sin(i)}
    Numo.gnuplot do
    set title:"X-Y data plot"
    plot x,y, w:'lines', t:'sin(x)'
    end
    ▶ Converted Gnuplot script:
    set title "X-Y data plot"
    plot '-' w lines t "sin(x)"
    0 0
    ...
    e
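
The conversion from Ruby arrays to a Gnuplot script with inline data can be sketched as follows. The helper `inline_plot_script` is hypothetical and is not the Numo::Gnuplot implementation; it only shows the structure of the script: commands first, then one "x y" pair per line, then a closing `e`. The exact number formatting of the real converter may differ.

```ruby
# Hypothetical helper (not the Numo::Gnuplot implementation): build a
# Gnuplot script that carries x-y data inline after `plot '-'`.
def inline_plot_script(x, y, title:, legend:)
  raise ArgumentError, "x and y differ in length" unless x.size == y.size
  lines = []
  lines << %(set title "#{title}")
  lines << %(plot '-' w lines t "#{legend}")
  x.zip(y) { |xi, yi| lines << "#{xi} #{yi}" }  # one "x y" pair per line
  lines << "e"                                   # ends the inline data
  lines.join("\n")
end

XS = (0..100).map { |i| i * 0.1 }
YS = XS.map { |v| Math.sin(v) }
SCRIPT = inline_plot_script(XS, YS, title: "X-Y data plot", legend: "sin(x)")
puts SCRIPT.lines.first(3)
```

Sending the data inline keeps the interface free of a dedicated data class, which matches the "use Array or NArray" design point on this slide.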


  37. Numo::Gnuplot example
    require "numo/narray"
    require "numo/gnuplot"
    x = Numo::DFloat.new(1,21).seq-10
    y = Numo::DFloat.new(21,1).seq-10
    f = Numo::NMath.sin(-Numo::NMath.sqrt((x+5)**2+(y-7)**2)*0.5)
    Numo.gnuplot do
    set term:"pngcairo"
    set output:"hidden.png"
    set isosamples:[25,25]
    set :xyplane, at:0
    unset :key
    set :palette, rgbformulae:[31,-11,32]
    set :style, fill_solid:0.5
    set cbrange:-1..1
    set :hidden3d, :front
    splot [f, with:"pm3d"],
    [x**2-y**2, with:"lines", lc_rgb:"black"]
    end
    https://github.com/ruby-numo/gnuplot-demo


  38. Summary of Progress
    ▶ Numo::NArray
    ○ NumPy-like array is complete.
    ▶ Numo::Linalg
    ○ BLAS routines are complete.
    ○ Most non-symmetric and symmetric LAPACK routines are complete.
    ▶ Numo::GSL
    ○ 15 modules are complete.
    ▶ Numo::FFTW
    ○ Complex DFT is complete.
    ▶ Numo::Gnuplot
    ○ Almost complete.

  39. SciPy Stack
    Pandas
    Sympy
    matplotlib
    Scikit-learn
    Jupyter (IPython)
    NumPy
    Cython
    SciPy lib
    Python
    Scientific Computing
    Data Science
    Machine Learning

  40. Numo Stack
    daru?
    No Sympy?
    Numo::Gnuplot
    ML libs?
    Jupyter (IRuby)
    Numo::NArray
    Rubex?
    Numo::Linalg
    Numo::GSL
    Numo::FFTW
    Ruby
    Scientific Computing
    Data Science
    Machine Learning

  41. Efforts on ML with Ruby
    https://github.com/arbox/machine-learning-with-ruby


  42. Numo issues
    ▶ API design
    ▶ Complete to-do list
    ▶ Check correctness
    ▶ Refactoring
    ▶ Improve performance
    ▶ Write tests
    ▶ Write documentation
    ▶ Collaborations
    – GPU (CUDA etc.) support
    – Data Science libraries
    ▶ I cannot maintain too many projects...

  43. Deep Learning with Ruby/Numo

  44. Deep Learning From Scratch
    ▶ Published by O'Reilly Japan
    ○ Implements DL with NumPy
    ○ https://www.oreilly.co.jp/books/9784873117584/
    ▶ Blog post by Akanuma-san
    ○ Translation into Ruby/Numo
    ○ http://blog.akanumahiroaki.com/entry/2017/04/15/160000
    ○ Elapsed time:
    • Ruby: 281 sec
    • Python: 39 sec


  45. Measurement on My Laptop
    ruby train_neuralnet.rb # 372 sec
    python3.6 train_neuralnet.py # 32 sec
    ▶ Depends on the performance of dot method (matrix product)
    ○ NArray: Naïve implementation with single core
    ○ NumPy: ATLAS or OpenBLAS with 4 cores
    ▶ Use OpenBLAS with 4 cores by loading Linalg:
    ruby -r numo/linalg/use/openblas train_neuralnet.rb # 54 sec
    ○ NArray has room for more speed-up.
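
The three timings on this slide can be turned into ratios to show how much the OpenBLAS backend helps and how large the remaining gap to NumPy is. This is a small arithmetic sketch using only the quoted numbers; the constant names are illustrative.

```ruby
# Elapsed times (seconds) quoted on the slide.
NARRAY_ONLY   = 372.0  # ruby train_neuralnet.rb
WITH_OPENBLAS = 54.0   # ruby -r numo/linalg/use/openblas train_neuralnet.rb
PYTHON        = 32.0   # python3.6 train_neuralnet.py

GAIN_FROM_OPENBLAS = NARRAY_ONLY / WITH_OPENBLAS  # speed-up from switching the dot backend
REMAINING_GAP      = WITH_OPENBLAS / PYTHON       # how far behind the Python run it still is

puts format("OpenBLAS backend: %.1fx faster than NArray alone", GAIN_FROM_OPENBLAS)
puts format("Still %.2fx slower than the Python run", REMAINING_GAP)
```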

  46. Convert Deep Learning code from Python to Ruby
    ▶ Talk at RejectKaigi 2017 by Naitoh-san
    ○ https://www.slideshare.net/naitoh1/reject-kaigi2017-naitoh
    ▶ Purpose:
    – Translate Chainer code into Ruby
    ▶ py2rb.py
    – Converter from Python to Ruby
    – Replaces NumPy with Numo::NArray
    ▶ Another way to "stand on the shoulders of giants" for machine learning with Ruby


  47. Incompatibility between NumPy and NArray
    NumPy                | Numo::NArray              | Why
    if condition:        | if condition != 0         | zero is true in Ruby
    a[:]                 | a[0..-1] or a[true]       | Range feature
    a[b:e]               | a[b..e-1]                 | Range feature
    b = a.view(); b += 1 | b = a.view; b.inplace + 1 | b += 1 is syntax sugar for b = b + 1
    a[i] (a.ndim >= 2)   | a[i,false]                | NArray feature
    a[[0,1],[1,0]]       | a.at([0,1],[1,0])         | NArray feature
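
Three of these differences come from Ruby itself rather than from NArray, so they can be demonstrated with core Ruby objects. The sketch below uses Integer and Array only; it does not call Numo, and the names are illustrative.

```ruby
# 1. Zero is truthy in Ruby, so a numeric condition needs an explicit comparison.
zero = 0
ZERO_IS_TRUTHY = zero ? true : false

# 2. Python's a[b:e] excludes e, while Ruby's b..e includes it.
ITEMS = [10, 20, 30, 40, 50]
START = 1
STOP  = 3
INCLUSIVE = ITEMS[START..STOP - 1]   # same elements as Python's a[1:3]
EXCLUSIVE = ITEMS[START...STOP]      # Ruby's exclusive range gives the same result

# 3. `b += x` means `b = b + x`: it rebinds b to a new object instead of
#    modifying the object that b shared with a.
original = [1, 2]
shared = original
shared += [3]                         # builds a new Array; `original` is untouched
REBOUND_LEAVES_ORIGINAL = (original == [1, 2])
REBOUND_VALUE = shared

p ZERO_IS_TRUTHY, INCLUSIVE, EXCLUSIVE, REBOUND_LEAVES_ORIGINAL
```

The third point is why the table uses `b.inplace + 1`: an in-place update has to be requested explicitly, because `+=` always produces a new object.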


  48. Summary
    ▶ Ruby/Numo
    ○ N-dimensional Numerical Array (Numo::NArray)
    ○ Numerical Algorithms (Numo::Linalg|GSL|FFTW|FFTE)
    ○ Data visualization (Numo::Gnuplot)
    ○ Needs further effort (coverage, performance, tests, documentation)
    ▶ Experimental cases for Deep Learning