DNN/GPU with Ruby #rubykaigi

ainame
September 19, 2017


Transcript

  1. 2.

    ruby-dlib/ruby-dlib

    • Ruby binding for dlib (original author is mrkn-san)
      ◦ dlib is a C++-based toolkit for machine learning
      ◦ Uses a C extension
      ◦ $ gem install dlib
    • Face detector based on DNN (Deep Neural Network)
      ◦ High accuracy, better than OpenCV
      ◦ Works on GPU with the CUDA SDK
  2. 4.

        image = Dlib::Image.load('./face.jpg')
        detector = Dlib::DNNFaceDetector.new('model.dat')
        rects = detector.detect(image)
        #=> [<Dlib::Rectangle>, <Dlib::Rectangle>]

        rects.each do |rect|
          image.draw_rectangle!(rect, [255, 0, 0, 3])
        end
        image.save_jpeg('output.jpg')
  3. 9.

    Hack for “depend” file

    • The “depend” file is where we should describe the dependencies of each C file
    • The “depend” file is appended to the end of the Makefile

    So we can describe anything we like in it.
  4. 10.

    Generate Makefile by mkmf.rb

        $ ruby ext/dlib/extconf.rb

    Makefile

        SHELL = /bin/sh

        # V=0 quiet, V=1 verbose. other values don't work.
        V = 0
        Q1 = $(V:1=)
        Q = $(Q1:0=@)
        ECHO1 = $(V:1=@:)
        ECHO = $(ECHO1:0=@echo)
        NULLCMD = :

        #### Start of system configuration section. ####

        srcdir = ext/dlib
        topdir = /usr/include/ruby-2.3.0
        hdrdir = $(topdir)
        arch_hdrdir = /usr/include/x86_64-linux-gnu/ruby-2.3.0
  5. 11.

    Set compilers for C / C++

        datadir = $(datarootdir)
        datarootdir = $(prefix)/share
        libexecdir = $(prefix)/lib/ruby2.3
        sbindir = $(exec_prefix)/sbin
        bindir = $(exec_prefix)/bin
        archdir = $(rubyarchdir)

        CC = gcc
        CXX = g++
        LIBRUBY = $(LIBRUBY_SO)
        LIBRUBY_A = lib$(RUBY_SO_NAME)-static.a
        LIBRUBYARG_SHARED = -l$(RUBY_SO_NAME)
        LIBRUBYARG_STATIC = -l$(RUBY_SO_NAME)-static
        empty =
        OUTFLAG = -o $(empty)
        COUTFLAG = -o $(empty)

        RUBY_EXTCONF_H =
        cflags = $(optflags) $(debugflags) $(warnflags)
        cxxflags = $(optflags) $(debugflags) $(warnflags)
  6. 12.

    Generated Makefile

        $(TARGET_SO): $(OBJS) Makefile
        	$(ECHO) linking shared-object $(DLLIB)
        	-$(Q)$(RM) $(@)
        	$(Q) $(LDSHAREDXX) -o $@ $(OBJS) $(LIBPATH) $(DLDFLAGS) $(LOCAL_LIBS) $(LIBS)
        	$(Q) $(POSTLINK)

        ###
        .SUFFIXES: .cu .o

        DLIB_SRCDIR = $(srcdir)/../dlib-19.4
        DLIB_FUNCTIONS = \
          geometry.inc \
          rectangle.inc \
          image.inc \
          detector.inc \
          find_candidate_object_locations.inc \
          dnn_detector.inc \
          cuda.inc

        OBJS += $(DLIB_OJBS)

    mkmf appends the “depend” file to the end of the Makefile (everything after the ### line).
  7. 13.

        CUDA_NVCC = /usr/local/cuda/bin/nvcc
        CUDA_FLAGS = $(CPPFLAGS) -I /usr/local/cuda/include -arch=sm_30 \
          -D__STRICT_ANSI__ -D_MWAITXINTRIN_H_INCLUDED -D_FORCE_INLINES \
          -std=c++11 -Xcompiler -fPIC -Xcompiler -funwind-tables
        ………………
        SRCS += $(DLIB_CUDA_SRCS)
        OBJS += $(DLIB_CUDA_OBJS)

        .SUFFIXES: .cu
        .cu.o:
        	$(ECHO) compiling $@
        	$(Q) $(CUDA_NVCC) $(CUDA_FLAGS) -c -o $@ $<

    • An absolute path to nvcc is safer. Some environments don't have a correct PATH.
    • Add a new suffix rule for CUDA
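
The remark about absolute paths can be turned into a small check that an extconf.rb might run before writing the CUDA rule. The following is only a sketch under assumptions: `find_nvcc` and `cuda_suffix_rule` are hypothetical helper names that do not come from ruby-dlib, and the default candidate path is simply the one shown on the slide.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (not part of ruby-dlib): return the first executable
# nvcc, trying the absolute candidates before falling back to PATH entries.
def find_nvcc(candidates: ['/usr/local/cuda/bin/nvcc'], path_env: ENV.fetch('PATH', ''))
  from_path = path_env.split(File::PATH_SEPARATOR)
                      .reject(&:empty?)
                      .map { |dir| File.join(dir, 'nvcc') }
  (candidates + from_path).find { |file| File.file?(file) && File.executable?(file) }
end

# Hypothetical helper: render the suffix rule from the slide for a given
# nvcc path. Recipe lines must start with a tab, hence the "\t" escapes.
def cuda_suffix_rule(nvcc)
  <<~MAKE
    CUDA_NVCC = #{nvcc}
    .SUFFIXES: .cu
    .cu.o:
    \t$(ECHO) compiling $@
    \t$(Q) $(CUDA_NVCC) $(CUDA_FLAGS) -c -o $@ $<
  MAKE
end

# Demonstration with a fake nvcc so the sketch runs without CUDA installed.
Dir.mktmpdir do |dir|
  fake_nvcc = File.join(dir, 'nvcc')
  File.write(fake_nvcc, "#!/bin/sh\n")
  FileUtils.chmod(0o755, fake_nvcc)

  nvcc = find_nvcc(candidates: [fake_nvcc], path_env: '')
  puts cuda_suffix_rule(nvcc)
end
```

When no nvcc is found, `find_nvcc` returns nil, which a build script could use to skip the CUDA sources and fall back to a CPU-only build. Whether ruby-dlib itself does this is not stated in the slides.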
  8. 15.

    Empower DNN/Face Detector

    • Finally, the face detector gets the power of Ruby
    • Sidekiq is an awesome gem for job queue systems
    • Easy to scale out the face detector with Sidekiq

    Sidekiq http://sidekiq.org/about
  9. 16.

        class FaceDetectionWorker
          include Sidekiq::Worker

          MODEL_PATH = Rails.root.join('vendor', 'mmod_human_face_detector.dat').to_s

          def perform(image_id)
            image = Image.find(id: image_id)
            frames = image.download { |file| detect(file) }
            frames.each { |f| Face.create!(image_id: image.id, x: f.left, y: f.top, width: f.width, height: f.height) }
          end

          def detect(file)
            detector = Dlib::DNNFaceDetector.new(MODEL_PATH)
            detector.detect(Dlib::Image.load(file.path))
          ensure
            GC.start
          end
        end
  10. 18.

        class FaceDetectionWorker
          include Sidekiq::Worker

          MODEL_PATH = Rails.root.join('vendor', 'mmod_human_face_detector.dat').to_s

          def perform(image_id)
            image = Image.find(id: image_id)
            frames = image.download { |file| detect(file) }
            frames.each { |f| Face.create!(image_id: image.id, x: f.left, y: f.top, width: f.width, height: f.height) }
          end

          def detect(file)
            detector = Dlib::DNNFaceDetector.new(MODEL_PATH)
            detector.detect(Dlib::Image.load(file.path))
          ensure
            GC.start
          end
        end

    Load data into GPU memory
  11. 22.

    [Diagram. Labels: CPU, Main memory, Dlib::DNNFaceDetector, Dlib::Image; GPU, GPU memory, Model Tensor, Image Tensor. Step: Load]
  12. 23.

    [Diagram. Labels: CPU, Main memory, Dlib::DNNFaceDetector, Dlib::Image; GPU, GPU memory, Model Tensor, Image Tensor. Step: Detection]
  13. 24.

    [Diagram. Labels: CPU, Main memory, Dlib::Image, Dlib::DNNFaceDetector; GPU, GPU memory, Model Tensor, Image Tensor. Step: Out of scope]
  14. 27.

        class FaceDetectionJob
          include Sidekiq::Worker

          MODEL_PATH = Rails.root.join('vendor', 'mmod_human_face_detector.dat').to_s

          def perform(image_id)
            image = Image.find(id: image_id)
            frames = image.download { |file| detect(file) }
            frames.each { |f| Face.create!(image_id: image.id, x: f.left, y: f.top, width: f.width, height: f.height) }
          end

          def detect(file)
            detector = Dlib::DNNFaceDetector.new(MODEL_PATH)
            detector.detect(Dlib::Image.load(file.path))
          ensure
            GC.start
          end
        end

    • Make sure GPU memory is cleared!
    • An image object holds on to an area of GPU memory.
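
The `ensure GC.start` pattern in `detect` can be tried without a GPU or the dlib gem. This is a minimal sketch under assumptions: `StubDetector` and `detect_with_cleanup` are hypothetical stand-ins, and the sketch only shows that the `ensure` clause triggers a garbage collection run even when detection raises. It does not show that any particular object is reclaimed, since Ruby's GC gives no such guarantee for a single `GC.start` call.

```ruby
# Hypothetical stand-in for Dlib::DNNFaceDetector; it needs no GPU.
class StubDetector
  def initialize(fail_on_detect: false)
    @fail_on_detect = fail_on_detect
  end

  def detect(_image)
    raise 'detection failed' if @fail_on_detect

    [] # pretend no faces were found
  end
end

# Same shape as the worker's detect method: the ensure clause runs on both
# the success path and the failure path, so a GC run is always requested.
def detect_with_cleanup(detector, image)
  detector.detect(image)
ensure
  GC.start
end

gc_runs_before = GC.count
begin
  detect_with_cleanup(StubDetector.new(fail_on_detect: true), 'face.jpg')
rescue RuntimeError => e
  warn "detection raised: #{e.message}"
end

# GC.count is the number of GC runs so far; it grows because GC.start ran.
puts "GC ran after the failure: #{GC.count > gc_runs_before}"
```

The value of an `ensure` clause is discarded, so the method still returns the detection result on success. In the worker this matters because `frames` is built from the return value of `detect`.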
  15. 28.
  16. 31.
  17. 32.

    Summary

    • Making a binding gem is a good option for starting small
    • mkmf.rb can support compiling with CUDA
    • Empower DNN to scale out with Ruby