Deep Learning Programming on Ruby

Presented by @mrkn and @hatappi at RubyKaigi 2018

Kenta Murata

May 31, 2018

Transcript

  1. RubyKaigi 2018 on 31 May 2018
    https://www.flickr.com/photos/53416677@N08/4972916707/
    Deep Learning
    Programming on Ruby
    Kenta Murata
    Yusaku Hatanaka
    RubyKaigi 2018

  2. Contents
    1. About us
    2. Introduction to this session
    3. Deep learning programming on Ruby
    3.1. mxnet.rb
    3.2. Red Chainer
    4. Overview of Ruby’s current data science support
    5. Summary of this talk

  3. About us (1)
    • Kenta Murata (@mrkn)
    • Full-time CRuby committer at Speee, Inc.
    • bigdecimal, enumerable-statistics, pycall.rb,
    mxnet.rb, etc.
    • Ruby, C/C++, Python, Julia, etc.
    • neovim, vscode (neovim client)

  4. About us (2)
    • Yusaku Hatanaka (@hatappi)
    • Speee, Inc.
    • Red Data Tools member
    • Ruby, Go, TypeScript, etc.
    • I love soybeans

  5. Session Introduction

  6. Deep Learning in Ruby

  7. There are several approaches
    ‣ mxnet.rb

    https://github.com/mrkn/mxnet.rb
    ‣ Red Chainer

    https://github.com/red-data-tools/red-chainer
    ‣ Tensorflow.rb

    https://github.com/somaticio/tensorflow.rb
    ‣ TensorStream

    https://github.com/jedld/tensor_stream

  8. Topics in this session
    ‣ mxnet.rb

    https://github.com/mrkn/mxnet.rb
    ‣ Red Chainer

    https://github.com/red-data-tools/red-chainer
    ‣ Tensorflow.rb

    https://github.com/somaticio/tensorflow.rb
    ‣ TensorStream

    https://github.com/jedld/tensor_stream

  9. mxnet.rb

  10. What is mxnet.rb?
    ‣ A Ruby binding library for Apache MXNet
    ‣ In development since Nov 2017
    ‣ You can write deep learning programs in Ruby by using
    mxnet.rb and the MXNet runtime library
    ‣ It doesn’t depend on the Python runtime
    ‣ You need only Ruby
    ‣ But `pip install mxnet` is currently the easiest way to install
    the MXNet runtime library
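    A minimal getting-started sketch (assuming both the gem and the MXNet
    runtime are installed; the NDArray calls are the same ones used on
    later slides):

    require 'mxnet'  # loads the MXNet runtime installed via `pip install mxnet`

    a = MXNet::NDArray.ones([2, 3])  # a 2x3 array filled with ones
    a.shape                          # => [2, 3]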

  11. Why I wrote mxnet.rb
    ‣ I want to write deep learning programs in Ruby
    ‣ Without depending on the Python runtime (as pycall.rb does)
    ‣ There is Tensorflow.rb, but I don’t want to use the
    TensorFlow C API
    ‣ I think Apache MXNet is the best fit for Ruby

  12. Why is MXNet the best fit for Ruby?
    ‣ It already supports multiple languages
    ‣ Many stakeholders support MXNet development
    ‣ It has good features that compete with those of
    other frameworks

  13. Multi-language support
    ‣ Not only Python and C/C++
    ‣ But also Julia, R, JavaScript, Perl, Matlab, Scala, Go
    ‣ Ruby will be supported soon (I’m working on it)

  14. Companies that support
    Apache MXNet
    https://mxnet.incubator.apache.org/community/powered_by.html

  15. Academic organizations
    that support Apache MXNet

  16. Good Features
    ‣ Multiple programming paradigms for deep learning
    ‣ Lower memory consumption than other frameworks
    ‣ Efficient multi-GPU computation
    ‣ Multi-node computation
    ‣ An Apache incubator project
    ‣ ONNX support

  17. Good Features
    ‣ Multiple programming paradigms for deep learning
    ‣ Lower memory consumption than other frameworks
    ‣ Efficient multi-GPU computation
    ‣ Multi-node computation
    ‣ An Apache incubator project
    ‣ ONNX support

  18. Multiple programming
    paradigms for deep learning
    ‣ Imperative style
    ‣ Symbolic style
    ‣ Hybrid style

  19. # Computation is executed step by step
    a = MXNet::NDArray.ones([10])
    b = MXNet::NDArray.ones([10]) * 2
    c = b * a
    d = c + 1
    Imperative style

  20. # first, generate computation graphs
    a = MXNet::Symbol.var(:a)
    b = MXNet::Symbol.var(:b)
    c = b * a  # generates a computation graph
    d = c + 1  # ditto
    # then execute the computation graph with concrete inputs
    d.eval(a: MXNet::NDArray.ones([10]),
           b: MXNet::NDArray.ones([10]) * 2)
    Symbolic style
    [diagram: computation graph — a and b feed “*” producing c; c and 1 feed “+” producing d]

  21. Imperative vs Symbolic
    • Imperative programs tend to be more flexible
    • They let us write loops directly in the syntax
    of the programming language

    e.g. while, until, loop { … }, each { … }, etc.
    (see the sketch below)
    • Symbolic programs tend to be more efficient
    • They can optimize memory usage automatically
    • They can optimize computation order
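    A minimal imperative sketch: ordinary Ruby control flow drives the
    computation, and every operation executes immediately:

    x = MXNet::NDArray.ones([10])
    5.times do
      x = x * 2  # executes right away, so x can be inspected here
    end
    MXNet::NDArray.sum(x).as_scalar  # => 320.0 (ten elements, each 32.0)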

  22. Computational graph
    optimization example
    [diagram: the graph for d = (a * b) + 1 is rewritten so the intermediate
    node c is removed and a single fused op computes op = a * b + 1]
    This optimization reduces both computation steps and memory consumption.

  23. Hybrid style
    • Mixes the imperative and symbolic styles
    • In deep learning programming:
    • Imperative style is helpful for writing parameter update
    routines
    • Gradient calculation should be performed symbolically
    • In MXNet, the Gluon API supports hybrid-style programming
    (see the sketch below)
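    A sketch of what hybrid style could look like in mxnet.rb, modeled on
    Python Gluon’s HybridBlock; every class and method name below is an
    assumption, since the Gluon API of mxnet.rb is still in progress:

    # hypothetical Gluon-like hybrid block
    class Net < MXNet::Gluon::HybridBlock
      def initialize
        super
        @dense = MXNet::Gluon::NN::Dense.new(10)
      end

      # f is the NDArray module (imperative mode) or the Symbol module
      # (symbolic mode), so one definition serves both styles
      def hybrid_forward(f, x)
        f.relu(@dense.(x))
      end
    end

    net = Net.new
    net.init
    net.(MXNet::NDArray.ones([1, 784]))  # runs imperatively
    net.hybridize  # later calls are compiled into a symbolic graph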

  24. Good Features
    ‣ Multiple programming paradigms for deep learning
    ‣ Lower memory consumption than other frameworks
    ‣ Efficient multi-GPU computation
    ‣ Multi-node computation
    ‣ An Apache incubator project
    ‣ ONNX support

  25. MXNet vs TensorFlow
    • Investigated by Julien Simon

    https://medium.com/@julsimon/keras-shoot-out-tensorflow-vs-mxnet-51ae2b30a9c0
    • Uses Keras to compare them
    • Both MXNet and TensorFlow can be used as
    backends of Keras
    • Three metrics:
    • Precision, speed, and memory consumption

  26. Good Features
    ‣ Multiple programming paradigms for deep learning
    ‣ Lower memory consumption than other frameworks
    ‣ Efficient multi-GPU computation
    ‣ Multi-node computation
    ‣ An Apache incubator project
    ‣ ONNX support

  27. Multi-node computation
    • You can use MXNet as a framework for distributed
    scientific computation
    • It uses a key-value store (KVStore) to exchange parameters
    among the workers on each machine (see the sketch below)
    • For example:
    • Distributed model training

    https://mxnet.incubator.apache.org/versions/master/faq/distributed_training.html
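    A minimal sketch of the KVStore idea, written with hypothetical
    mxnet.rb names that mirror the Python API (create / init / push /
    pull); treat every identifier here as an assumption:

    # hypothetical Ruby mirror of MXNet's KVStore API
    kv = MXNet::KVStore.create('dist_sync')  # 'local' on a single machine
    shape = [2, 3]
    kv.init(3, MXNet::NDArray.ones(shape))   # key 3 holds one parameter

    grad = MXNet::NDArray.ones(shape) * 2
    kv.push(3, grad)                         # each worker pushes gradients

    out = MXNet::NDArray.zeros(shape)
    kv.pull(3, out: out)                     # pull back the aggregated value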

  28. MXNet is an Apache
    incubator project
    • There are a lot of data science tools under the
    Apache Software Foundation:
    • Arrow
    • Hadoop
    • Kudu
    • Spark
    • etc.

  29. ONNX
    • Open Neural Network Exchange Format
    • Founded by Microsoft and Facebook
    • ONNX lets us interchange trained models between
    different frameworks
    • e.g. we can use Python and Keras for experiments,
    and Ruby and MXNet for production

  30. Frameworks that can
    interchange models via ONNX
    • MXNet
    • PyTorch
    • Chainer
    • Caffe2
    • TensorFlow
    • etc.

  31. Current project status of mxnet.rb
    2 developers
    ‣ Me (@mrkn)
    • Conference-driven development
    • Currently focusing on Gluon API
    ‣ Laurent Julliard (@ljulliar)
    • Currently focusing on the coverage of NDArray API
    Future plan
    ‣ I want to achieve 100% feature coverage

  32. We want more developers
    ‣ Your pull requests are welcome
    ‣ I’ll make feature tables and some milestones so
    that you can find a chance to contribute more easily

  33. DEMO
    (the same MLP code as the full listing on the next slide)

  34. require 'mxnet'

    module MLPScratch
      ND = MXNet::NDArray

      class MLP
        def initialize(num_inputs: 784, num_outputs: 10,
                       num_hidden_units: [256, 128, 64], ctx: nil)
          @layer_dims = [num_inputs, *num_hidden_units, num_outputs]
          @weight_scale = 0.01
          @ctx = ctx || MXNet::Context.default
          @all_parameters = init_parameters
        end

        attr_reader :ctx, :all_parameters, :layer_dims

        # random weights drawn from a scaled normal distribution
        private def rnorm(shape)
          ND.random_normal(shape: shape, scale: @weight_scale, ctx: @ctx)
        end

        private def init_parameters
          @weights = []
          @biases = []
          @layer_dims.each_cons(2) do |dims|
            @weights << rnorm(dims)
            @biases << rnorm([dims[1]])
          end
          [*@weights, *@biases].each(&:attach_grad)
        end

        private def relu(x)
          ND.maximum(x, ND.zeros_like(x))
        end

        def forward(x)
          h = x
          n = @layer_dims.length
          (n - 2).times do |i|
            h_linear = ND.dot(h, @weights[i]) + @biases[i]
            h = relu(h_linear)
          end
          ND.dot(h, @weights[-1]) + @biases[-1]  # returns y_hat_linear
        end

        private def softmax_cross_entropy(y_hat_linear, t)
          -ND.nansum(t * ND.log_softmax(y_hat_linear), axis: 0, exclude: true)
        end

        def loss(y_hat_linear, t)
          softmax_cross_entropy(y_hat_linear, t)
        end

        def predict(x)
          y_hat_linear = forward(x)
          ND.argmax(y_hat_linear, axis: 1)
        end
      end

      module_function

      def SGD(params, lr)
        params.each do |param|
          param[0..-1] = param - lr * param.grad
        end
      end

      def evaluate_accuracy(data_iter, model)
        num, den = 0.0, 0.0
        data_iter.each_with_index do |batch, i|
          data = batch.data[0].as_in_context(model.ctx)
          data = data.reshape([-1, model.layer_dims[0]])
          label = batch.label[0].as_in_context(model.ctx)
          predictions = model.predict(data)
          num += ND.sum(predictions == label)
          den += data.shape[0]
        end
        (num / den).as_scalar
      end

      def learning_loop(train_iter, test_iter, model,
                        epochs: 10, learning_rate: 0.001,
                        smoothing_constant: 0.01)
        epochs.times do |e|
          start = Time.now
          cumloss = 0.0
          num_batches = 0
          train_iter.each_with_index do |batch, i|
            data = batch.data[0].as_in_context(model.ctx)
            data = data.reshape([-1, model.layer_dims[0]])
            label = batch.label[0].as_in_context(model.ctx)
            label_one_hot = ND.one_hot(label, depth: model.layer_dims[-1])
            loss = MXNet::Autograd.record do
              y = model.forward(data)
              model.loss(y, label_one_hot)
            end
            loss.backward
            SGD(model.all_parameters, learning_rate)
            cumloss += ND.sum(loss).as_scalar  # accumulate the epoch loss
            num_batches += 1
          end
          test_acc = evaluate_accuracy(test_iter, model)
          train_acc = evaluate_accuracy(train_iter, model)
          duration = Time.now - start
          puts "Epoch #{e}. Loss: #{cumloss / (train_iter.batch_size * num_batches)}, " +
               "train-acc: #{train_acc}, test-acc: #{test_acc} (#{duration} sec)"
        end
      end
    end

  35. Summary of mxnet.rb
    ‣ MXNet is the deep learning framework best suited
    for Ruby support
    ‣ mxnet.rb is under development, but some APIs are
    already usable
    ‣ Contact me if you want to join the development

  36. Red Chainer

  37. Red Chainer
    • A deep learning framework:

    a Ruby port of Python’s Chainer
    • Uses Numo::NArray to hold and compute
    matrices (see the Numo sketch below)
    • A project under development within Red Data Tools
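    A quick taste of Numo::NArray, the array library underneath Red
    Chainer (a minimal sketch using only core Numo calls):

    require 'numo/narray'

    a = Numo::DFloat.new(2, 3).seq  # 2x3 matrix filled with 0.0..5.0
    b = a * 2.0                     # element-wise multiply, computed in C
    b.sum                           # => 30.0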

  38. Red Data Tools
    • A project providing data processing tools for Ruby
    • Launched by @ktou in February 2017
    • red-arrow, red-datasets, csv gem maintenance, etc.

  39. Red Data Tools’s Policy
    1. Collaborate across the Ruby community
    2. Acting rather than blaming
    3. Continuous, iterative progress rather than a short,
    big project
    4. The current lack of knowledge doesn't matter
    5. Ignore criticism from outsiders
    6. Fun!

  40. Features of Red Chainer
    1. Define-by-Run
    2. Provides a high-level API
    3. Models can be constructed like ordinary Ruby code
    4. OSS Project

  41. Define-by-Run
    • Define-and-Run
    • Build the computation graph first, then run data through it
    • Define-by-Run
    • The computation graph is built on the fly as data flows
    (see the sketch below)
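    A minimal Define-by-Run sketch in the style of Red Chainer; the
    Chainer::Variable API here mirrors Python’s Chainer and is an
    assumption about the Ruby port:

    require 'chainer'

    x = Chainer::Variable.new(Numo::DFloat[1.0, 2.0, 3.0])
    y = x * x + 2.0                # the graph is recorded as this line runs
    y.grad = Numo::DFloat.ones(3)  # seed gradient for a non-scalar output
    y.backward                     # backprop through the recorded graph
    x.grad                         # => [2.0, 4.0, 6.0], i.e. dy/dx = 2x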

  42. Provides a high-level API
    • 2D Convolution
    • BatchNormalization
    • Linear
    • ReLU
    • Sigmoid
    • Softmax
    • Dropout
    • etc…

  43. Models can be constructed like ordinary Ruby code
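    A sketch of what that looks like, modeled on the MNIST example in the
    red-chainer repository (the exact module paths are assumptions):

    require 'chainer'

    class MLP < Chainer::Chain
      L = Chainer::Links::Connection::Linear
      R = Chainer::Functions::Activation::Relu

      def initialize(n_units, n_out)
        super()
        init_scope do
          @l1 = L.new(nil, out_size: n_units)
          @l2 = L.new(nil, out_size: n_units)
          @l3 = L.new(nil, out_size: n_out)
        end
      end

      def call(x)
        h1 = R.relu(@l1.(x))  # plain Ruby method calls build the model
        h2 = R.relu(@l2.(h1))
        @l3.(h2)
      end
    end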

  44. OSS Project
    red-data-tools/red-chainer
    • You can read the source code at any time
    • You can jump in wherever you want to modify
    something or add an API

  45. By having Red Chainer
    [diagram: Red Chainer connects applications to deep learning]

  46. DEMO
    • Classify CIFAR-10 (a dataset of 32x32 images) with Red
    Chainer using a CNN
    • Visualize each epoch’s accuracy as a graph, along with the
    classified images, in a Rails app

  47. Future of Red Chainer
    • GPU support: sonots/cumo
    • Fast Numerical Computing and Deep Learning in Ruby with Cumo

    http://rubykaigi.org/2018/presentations/sonots.html#may31
    • Support Apache Arrow
    • Develop the ecosystem around Red Chainer
    • red-datasets: provides common datasets
    • red-arrow: the Apache Arrow Ruby binding

  48. Summary
    • Introduced Red Chainer, a deep learning framework
    written in Ruby
    • Interested in Red Data Tools or Red Chainer? Join us:
    • online
    • en: https://gitter.im/red-data-tools/en
    • ja: https://gitter.im/red-data-tools/ja
    • offline
    • We hold a meetup every month at Speee, Inc. in Tokyo
    • https://speee.connpass.com/
    • I’m at the Speee booth at RubyKaigi 2018

  49. Overview of the current
    status of Ruby’s data
    science support

  50. The current status of Ruby’s data science support
    ‣ Red Arrow
    ‣ CRuby’s updates for data science
    ‣ SciRuby GSoC
    ‣ RubyData Workshop in RubyKaigi 2018

  51. Red Arrow
    ‣ The Ruby binding of Apache Arrow
    ‣ It has become the official Ruby binding of Apache
    Arrow
    ‣ https://github.com/apache/arrow/tree/master/ruby

  52. 2 updates in CRuby for data science
    ‣ Enumerator::ArithmeticSequence was accepted
    ‣ Range#% was accepted

  53. Enumerator::ArithmeticSequence
    ‣ We will have an object that works like Python’s slice
    objects
    ‣ Integer#step and Range#step return such an
    object (see the sketch below)
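    A quick sketch of how it behaves (assuming Ruby 2.6, where the
    feature landed):

    seq = 1.step(10, 2)   # without a block, returns an arithmetic sequence
    seq.class             # => Enumerator::ArithmeticSequence
    seq.to_a              # => [1, 3, 5, 7, 9]
    (1..10).step(3).to_a  # => [1, 4, 7, 10]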

  54. Range#%
    ‣ An alias of Range#step
    ‣ A range with a step can be written as (1...10) % 2
    ‣ It may be very useful in Numo::NArray, NMatrix,
    Daru::DataFrame, Arrow::Table, etc.
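    A sketch of the syntax (again assuming Ruby 2.6); the indexing line is
    a hypothetical illustration of how an array library might use it:

    ((1...10) % 2).to_a  # => [1, 3, 5, 7, 9]

    # hypothetical: take every other row of an array-like container
    # table[(0...table.size) % 2]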

  55. SciRuby GSoC
    In GSoC 2018, SciRuby accepted 5 students, and the
    following 4 projects are now running:
    • Business Intelligence with daru
    • Advanced features in daru-view
    • NetworkX.rb: a Ruby version of NetworkX
    • A Ruby version of matplotlib
    The discussions are being held on RubyData’s Discourse:

    https://discourse.ruby-data.org/c/gsoc/gsoc2018

  56. RubyData Workshop in
    RubyKaigi 2018
    ‣ 3:50pm tomorrow in Room Shirakashi
    ‣ After afternoon break
    ‣ Contents
    • Data analysis with Ruby’s data tools
    • Data analysis with pycall and Python data tools
    • Introduction to the Red Data Tools project

  57. Talk Summary

  58. Talk summary
    ‣ The development of high-level deep learning frameworks
    in Ruby is progressing day by day
    ‣ With these frameworks you will be able to do not only deep
    learning but also GPGPU and distributed computation
    ‣ The development of tools for general data science is also
    progressing day by day
    ‣ You can join these development projects

  59. Links
    mxnet.rb
    ‣ https://github.com/mrkn/mxnet.rb
    Red Chainer
    ‣ https://github.com/red-data-tools/red-chainer
    Red Data Tools
    ‣ http://red-data-tools.github.io/
    SciRuby GSoC
    ‣ https://discourse.ruby-data.org/c/gsoc/gsoc2018
