
Deep Learning Programming on Ruby


Presented by @mrkn and @hatappi at RubyKaigi 2018

Kenta Murata

May 31, 2018



Transcript

  1. Contents
    1. About us
    2. Introduction of this session
    3. Deep learning programming on Ruby
       3.1. mxnet.rb
       3.2. Red Chainer
    4. Overview of Ruby's current data science support
    5. Summary of this talk

  2. About us (1)
    • Kenta Murata (@mrkn)
    • Full-time CRuby committer at Speee, Inc.
    • bigdecimal, enumerable-statistics, pycall.rb, mxnet.rb, etc.
    • Ruby, C/C++, Python, Julia, etc.
    • neovim, vscode (neovim client)

  3. About us (2)
    • Yusaku Hatanaka (@hatappi)
    • Speee, Inc.
    • Red Data Tools member
    • Ruby, Go, TypeScript, etc.
    • I love soybeans

  4. There are several approaches
    ‣ mxnet.rb
      https://github.com/mrkn/mxnet.rb
    ‣ Red Chainer
      https://github.com/red-data-tools/red-chainer
    ‣ Tensorflow.rb
      https://github.com/somaticio/tensorflow.rb
    ‣ TensorStream
      https://github.com/jedld/tensor_stream

  5. Topics in this session (this talk covers the first two)
    ‣ mxnet.rb
      https://github.com/mrkn/mxnet.rb
    ‣ Red Chainer
      https://github.com/red-data-tools/red-chainer
    ‣ Tensorflow.rb
      https://github.com/somaticio/tensorflow.rb
    ‣ TensorStream
      https://github.com/jedld/tensor_stream

  6. What is mxnet.rb?
    ‣ A Ruby binding library for Apache MXNet
    ‣ In development since Nov 2017
    ‣ You can write deep learning programs in Ruby using mxnet.rb and the MXNet runtime library
    ‣ It doesn't depend on the Python runtime; you need only Ruby
    ‣ But `pip install mxnet` is currently the easiest way to install the MXNet runtime library

  7. Why I write mxnet.rb
    ‣ I want to write deep learning programs in Ruby, without a dependency on Python (pycall.rb)
    ‣ There is Tensorflow.rb, but I don't want to use the TensorFlow C API
    ‣ I think Apache MXNet is the best fit for Ruby

  8. Why is MXNet the best fit for Ruby?
    ‣ It already supports multiple languages
    ‣ Many stakeholders support MXNet development
    ‣ It has some good features that can compete with other frameworks

  9. Multi-language support
    ‣ Not only Python and C/C++
    ‣ But also Julia, R, JavaScript, Perl, Matlab, Scala, and Go
    ‣ Ruby will be supported soon (I'm working on it)

  10. Good Features
    ‣ Multiple programming paradigms for deep learning
    ‣ Lower memory consumption than other frameworks
    ‣ Efficient multi-GPU computation
    ‣ Multi-node computation
    ‣ An Apache incubator project
    ‣ ONNX support

  11. Good Features (revisited; first up: multiple programming paradigms for deep learning)

  12. Imperative style
    # Computation is executed step by step
    a = MXNet::NDArray.ones([10])
    b = MXNet::NDArray.ones([10]) * 2
    c = b * a
    d = c + 1

  13. Symbolic style
    # first, build computation graphs
    a = MXNet::Symbol.var(:a)
    b = MXNet::Symbol.var(:b)
    c = b * a  # builds a computation graph
    d = c + 1  # ditto

    # execute the computation graph
    d.eval(a: MXNet::NDArray.ones([10]),
           b: MXNet::NDArray.ones([10]) * 2)

    [Diagram: the graph a, b → (*) → c, then c, 1 → (+) → d]

  14. Imperative vs Symbolic
    • Imperative programs tend to be more flexible (a small example follows)
      • They let us write loops directly in the syntax of the programming language, e.g. while, until, loop { … }, each { … }, etc.
    • Symbolic programs tend to be more efficient
      • They can optimize memory usage automatically
      • They can optimize computation order

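    To make the flexibility concrete, here is a minimal imperative sketch using only NDArray calls shown elsewhere in this deck; plain Ruby control flow drives the computation directly:

    x = MXNet::NDArray.ones([10])
    5.times do
      x = x * 2                            # each step executes immediately
    end
    puts MXNet::NDArray.sum(x).as_scalar   # => 320.0 (ten elements, each 2**5 = 32)
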
  15. Computational graph optimization example
    [Diagram: the graph a, b → (*) → c, then c, 1 → (+) → d is fused into a single op: d = a * b + 1]
    Removing the intermediate node c reduces both computation steps and memory consumption.

  16. Hybrid style
    • Mixes the imperative and symbolic styles
    • In deep learning programming:
      • The imperative style is helpful for writing parameter update routines
      • Gradient calculation should be performed symbolically
    • In MXNet, the Gluon API supports hybrid-style programming (a sketch follows)

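    A rough sketch of what hybrid style can look like. Note: mxnet.rb's Gluon API was still under development at the time of this talk, so every name below (HybridSequential, Dense, hybridize) is an assumption mirroring Python's Gluon API, not confirmed mxnet.rb API:

    # ASSUMPTION: class and method names mirror Python's Gluon API;
    # treat this as pseudocode, not confirmed mxnet.rb API.
    net = MXNet::Gluon::NN::HybridSequential.new
    net.add(MXNet::Gluon::NN::Dense.new(64, activation: :relu))
    net.add(MXNet::Gluon::NN::Dense.new(10))
    net.init
    net.hybridize  # imperative calls are traced into a symbolic graph
    y = net.call(MXNet::NDArray.ones([1, 784]))  # later calls run the optimized graph
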
  17. Good Features (revisited; next up: lower memory consumption than other frameworks)

  18. MXNet vs Tensorflow
    • Investigated by Julien Simon
      https://medium.com/@julsimon/keras-shoot-out-tensorflow-vs-mxnet-51ae2b30a9c0
    • Keras is used to compare them; both MXNet and Tensorflow can be used as Keras backends
    • Three metrics: precision, speed, and memory consumption

  19. Good Features (revisited; next up: multi-node computation)

  20. Multi-node computation
    • You can use MXNet as a framework for distributed scientific computation
    • A key-value store is used to exchange parameters among the threads on each machine (sketch below)
    • For example: distributed model training
      https://mxnet.incubator.apache.org/versions/master/faq/distributed_training.html

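    To make the key-value store idea concrete, a minimal sketch follows. ASSUMPTION: the KVStore names below mirror Python's mx.kv API; the talk does not show mxnet.rb's actual coverage of this API:

    # ASSUMPTION: KVStore naming mirrors Python's mx.kv API.
    kv = MXNet::KVStore.create('local')         # 'dist_sync' across machines
    shape = [2, 3]
    kv.init(3, MXNet::NDArray.ones(shape))      # key 3 holds a shared parameter
    kv.push(3, MXNet::NDArray.ones(shape) * 8)  # each worker pushes its update
    out = MXNet::NDArray.zeros(shape)
    kv.pull(3, out: out)                        # workers pull the merged result
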
  21. MXNet is an Apache incubator project
    • There are a lot of tools for data science under the Apache Software Foundation
      • Arrow
      • Hadoop
      • Kudu
      • Spark
      • etc.

  22. ONNX
    • Open Neural Network Exchange format
    • Founded by Microsoft and Facebook
    • ONNX lets us interchange trained models between different frameworks
    • e.g. we can use Python and Keras for experiments, and Ruby and MXNet for production

  23. Frameworks that support model interchange via ONNX
    • MXNet
    • PyTorch
    • Chainer
    • Caffe2
    • Tensorflow
    • etc.

  24. Current project status of mxnet.rb
    2 developers:
    ‣ Me (@mrkn)
      • Conference-driven development
      • Currently focusing on the Gluon API
    ‣ Laurent Julliard (@ljulliar)
      • Currently focusing on the coverage of the NDArray API
    Future plan:
    ‣ I want to achieve 100% feature coverage

  25. We want more developers
    ‣ We welcome your pull requests
    ‣ I'll make feature tables and some milestones so that you can find your chance to commit more easily

  26. DEMO: an MLP implemented from scratch with mxnet.rb (the full listing appears on the next slide)

  27. require 'mxnet'

    module MLPScratch
      ND = MXNet::NDArray

      # A multi-layer perceptron implemented directly on NDArray
      class MLP
        def initialize(num_inputs: 784, num_outputs: 10,
                       num_hidden_units: [256, 128, 64], ctx: nil)
          @layer_dims = [num_inputs, *num_hidden_units, num_outputs]
          @weight_scale = 0.01
          @ctx = ctx || MXNet::Context.default
          @all_parameters = init_parameters
        end

        attr_reader :ctx, :all_parameters, :layer_dims

        private def rnorm(shape)
          ND.random_normal(shape: shape, scale: @weight_scale, ctx: @ctx)
        end

        private def init_parameters
          @weights = []
          @biases = []
          @layer_dims.each_cons(2) do |dims|
            @weights << rnorm(dims)
            @biases << rnorm([dims[1]])
          end
          # attach gradient buffers so Autograd can record gradients
          [*@weights, *@biases].each(&:attach_grad)
        end

        private def relu(x)
          ND.maximum(x, ND.zeros_like(x))
        end

        def forward(x)
          h = x
          n = @layer_dims.length
          (n - 2).times do |i|
            h_linear = ND.dot(h, @weights[i]) + @biases[i]
            h = relu(h_linear)
          end
          y_hat_linear = ND.dot(h, @weights[-1]) + @biases[-1]
        end

        private def softmax_cross_entropy(y_hat_linear, t)
          -ND.nansum(t * ND.log_softmax(y_hat_linear), axis: 0, exclude: true)
        end

        def loss(y_hat_linear, t)
          softmax_cross_entropy(y_hat_linear, t)
        end

        def predict(x)
          y_hat_linear = forward(x)
          ND.argmax(y_hat_linear, axis: 1)
        end
      end

      module_function

      # plain stochastic gradient descent: update each parameter in place
      def SGD(params, lr)
        params.each do |param|
          param[0..-1] = param - lr * param.grad
        end
      end

      def evaluate_accuracy(data_iter, model)
        num, den = 0.0, 0.0
        data_iter.each_with_index do |batch, i|
          data = batch.data[0].as_in_context(model.ctx)
          data = data.reshape([-1, model.layer_dims[0]])
          label = batch.label[0].as_in_context(model.ctx)
          predictions = model.predict(data)
          num += ND.sum(predictions == label)
          den += data.shape[0]
        end
        (num / den).as_scalar
      end

      def learning_loop(train_iter, test_iter, model,
                        epochs: 10, learning_rate: 0.001, smoothing_constant: 0.01)
        epochs.times do |e|
          start = Time.now
          cumloss = 0.0
          num_batches = 0
          train_iter.each_with_index do |batch, i|
            data = batch.data[0].as_in_context(model.ctx)
            data = data.reshape([-1, model.layer_dims[0]])
            label = batch.label[0].as_in_context(model.ctx)
            label_one_hot = ND.one_hot(label, depth: model.layer_dims[-1])
            loss = MXNet::Autograd.record do
              y = model.forward(data)
              model.loss(y, label_one_hot)
            end
            loss.backward
            SGD(model.all_parameters, learning_rate)
            cumloss += ND.sum(loss).as_scalar
            num_batches += 1
          end
          test_acc = evaluate_accuracy(test_iter, model)
          train_acc = evaluate_accuracy(train_iter, model)
          duration = Time.now - start
          puts "Epoch #{e}. Loss: #{cumloss / (train_iter.batch_size * num_batches)}, " +
               "train-acc: #{train_acc}, test-acc: #{test_acc} (#{duration} sec)"
        end
      end
    end

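    For context, a hypothetical way to drive the module above. The slide does not show how the MNIST data iterators were constructed, so they are left as placeholders here:

    model = MLPScratch::MLP.new(ctx: MXNet::Context.default)
    train_iter = ... # an MXNet data iterator over the MNIST training set
    test_iter  = ... # ditto, over the test set
    MLPScratch.learning_loop(train_iter, test_iter, model,
                             epochs: 10, learning_rate: 0.001)
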
  28. Summary of mxnet.rb
    ‣ MXNet is a deep learning framework that is well suited to being supported in Ruby
    ‣ mxnet.rb is under development, but some APIs are already usable
    ‣ Contact me if you want to join the development

  29. Red Chainer
    • A deep learning framework: Python's Chainer ported to Ruby
    • Uses Numo::NArray for holding and computing matrices
    • One of the projects under development in Red Data Tools

  30. Red Data Tools
    • A project providing data processing tools for Ruby
    • Launched by @ktou in February 2017
    • red-arrow, red-datasets, csv gem maintenance, etc.

  31. Red Data Tools' Policy
    1. Collaborate across the Ruby community
    2. Acting rather than blaming
    3. Continuous, iterative progress rather than a short, big project
    4. The current lack of knowledge doesn't matter
    5. Ignore criticism from outsiders
    6. Fun!

  32. Features of Red Chainer
    1. Define-by-Run
    2. Provides a high-level API
    3. Networks can be written as ordinary Ruby code
    4. OSS project

  33. Define-by-Run
    • Define and Run: build a computation graph first, then run data through it
    • Define by Run: build the computation graph as the data flows (a minimal sketch follows)

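    A minimal sketch of the Define-by-Run idea. ASSUMPTION: the names below (Chainer::Variable, #backward, #grad) mirror Python Chainer's basic API, which Red Chainer ports; they are illustrative, not quoted from the talk:

    require 'chainer'  # red-chainer

    x = Chainer::Variable.new(Numo::SFloat[1, 2, 3])
    y = x * x                       # the graph is recorded while this line runs
    y.grad = Numo::SFloat[1, 1, 1]  # seed the output gradient
    y.backward                      # walk the recorded graph backwards
    p x.grad                        # => 2x, i.e. [2.0, 4.0, 6.0]
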
  34. Provides a high-level API
    • 2D Convolution
    • BatchNormalization
    • Linear
    • ReLU
    • Sigmoid
    • Softmax
    • Dropout
    • etc. (a sketch using some of these layers follows)

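    For example, a small model composed from the Linear and ReLU layers above, modeled on red-chainer's MNIST example. ASSUMPTION: the exact module paths follow the Python Chainer layout that Red Chainer ports; verify against the repository:

    class MLP < Chainer::Chain
      L = Chainer::Links
      F = Chainer::Functions

      def initialize(n_units, n_out)
        super()
        init_scope do
          # nil lets the input size be inferred from the first batch
          @l1 = L::Connection::Linear.new(nil, out_size: n_units)
          @l2 = L::Connection::Linear.new(nil, out_size: n_out)
        end
      end

      def call(x)
        h = F::Activation::Relu.relu(@l1.(x))
        @l2.(h)
      end
    end
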
  35. OSS Project: red-data-tools/red-chainer
    • You can read the source code at any time
    • You can start developing together whenever there is code you want to modify or an API you want to add

  36. DEMO
    • Classify CIFAR-10 (a dataset of 32x32 images) with Red Chainer using a CNN
    • Visualize the per-epoch accuracy with Rails, using a graph and the classified images

  37. Future of Red Chainer
    • GPU support: sonots/cumo
      • "Fast Numerical Computing and Deep Learning in Ruby with Cumo"
        http://rubykaigi.org/2018/presentations/sonots.html#may31
    • Support Apache Arrow
    • Develop the ecosystem around Red Chainer
      • red-datasets: provides common datasets
      • red-arrow: Apache Arrow Ruby bindings

  38. Summary
    • Introduced Red Chainer, a deep learning framework written in Ruby
    • Interested in Red Data Tools or Red Chainer?
      • Online
        • en: https://gitter.im/red-data-tools/en
        • ja: https://gitter.im/red-data-tools/ja
      • Offline
        • We hold a meetup every month at Speee, Inc. in Tokyo
        • https://speee.connpass.com/
    • I'm at the Speee booth at RubyKaigi 2018

  39. The current status of Ruby's data science support
    ‣ Red Arrow
    ‣ CRuby's updates for data science
    ‣ SciRuby GSoC
    ‣ RubyData Workshop at RubyKaigi 2018

  40. Red Arrow
    ‣ Ruby bindings for Apache Arrow
    ‣ It has become the official Ruby binding of Apache Arrow
    ‣ https://github.com/apache/arrow/tree/master/ruby

  41. Enumerator::ArithmeticSequence
    ‣ We will have an object that works like Python's slice object (example below)
    ‣ Integer#step and Range#step return such an object

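    A quick sketch of the behavior (this shipped in Ruby 2.6, released after this talk):

    seq = 1.step(10, 2)        # => an Enumerator::ArithmeticSequence
    seq.to_a                   # => [1, 3, 5, 7, 9]
    seq.begin                  # => 1
    seq.step                   # => 2
    (1..10).step(2).class      # => Enumerator::ArithmeticSequence
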
  42. Range#%
    ‣ An alias of Range#step
    ‣ A range with a step can be written as (1...10) % 2 (example below)
    ‣ It may be very useful in Numo::NArray, NMatrix, Daru::DataFrame, Arrow::Table, etc.

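    Concretely (as shipped in Ruby 2.6):

    ((1...10) % 2).to_a   # => [1, 3, 5, 7, 9]
    ((1..10) % 3).to_a    # => [1, 4, 7, 10]
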
  43. SciRuby GSoC
    In GSoC 2018, SciRuby accepted 5 students, and the following 4 projects are running:
    • Business intelligence with daru
    • Advanced features in daru-view
    • NetworkX.rb: a Ruby version of NetworkX
    • A Ruby version of matplotlib
    The discussions are being held on RubyData's Discourse:
    https://discourse.ruby-data.org/c/gsoc/gsoc2018

  44. RubyData Workshop at RubyKaigi 2018
    ‣ 3:50pm tomorrow in Room Shirakashi (after the afternoon break)
    ‣ Contents
      • Data analysis with Ruby's data tools
      • Data analysis with pycall and Python's data tools
      • Introduction of the Red Data Tools project

  45. Talk summary
    ‣ The development of high-level deep learning frameworks in Ruby is progressing day by day
    ‣ With these frameworks, you will be able to do not only deep learning but also GPGPU and distributed computation
    ‣ The development of tools for general data science is also progressing day by day
    ‣ You can join these development projects

  46. Links
    mxnet.rb
    ‣ https://github.com/mrkn/mxnet.rb
    Red Chainer
    ‣ https://github.com/red-data-tools/red-chainer
    Red Data Tools
    ‣ http://red-data-tools.github.io/
    SciRuby GSoC
    ‣ https://discourse.ruby-data.org/c/gsoc/gsoc2018