The evolution of CNN

yoppe
June 27, 2017


Transcript

  1. The evolution of CNN Yohei KIKUTA
 diracdiego@gmail.com https://www.linkedin.com/in/yohei-kikuta-983b29117/ 20170629 


    Talk about all things Machine Learning
  2. Deep Learning & CNN # of papers

      arXiv papers (Computer Science) including “deep learning” / “CNN” in titles or abstracts:
      year           2010 2011 2012 2013 2014 2015  2016  2017
      deep learning     1    1    0   13   74  293   653   476
      CNN               6    7   22   89  188  651 1,304 1,147
      source: https://arxiv.org/ on 20170626 2/17
  3. The evolution of CNN *This is a very limited chronology

    Neocognitron (1980): http://www.cs.princeton.edu/courses/archive/spr08/cos598B/Readings/Fukushima1980.pdf
 LeNet-5 (1998): http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf
 AlexNet (2012): https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks
 Network in Network (2013): https://arxiv.org/abs/1312.4400
 VGG (2014): https://arxiv.org/abs/1409.1556
 Inception(V3) (2015): https://arxiv.org/abs/1512.00567
 ResNet (2015): https://arxiv.org/abs/1512.03385
 SqueezeNet (2016): https://arxiv.org/abs/1602.07360
 ENet (2016): https://arxiv.org/abs/1606.02147
 Deep Complex Networks (2017): https://arxiv.org/abs/1705.09792 3/17
  4. Neocognitron source: www.cs.princeton.edu/courses/archive/spr08/cos598B/Readings/Fukushima1980.pdf • Prototype of CNN • Hierarchical structure

    • S-cells (convolution)
 feature extraction • C-cells (avg. pooling)
 robustness to positional deviation • Self-organization-style training • NOT backpropagation 4/17
  5. LeNet-5 source: yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf • Non-linearity sigmoid, tanh

    • Convolution feature extraction • Subsampling (avg. pooling) positional invariance, size reduction (figure: a 4×4 feature map average-pooled down to a 2×2 map) 5/17
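The subsampling step on the slide can be sketched in a few lines of NumPy; this is my own minimal illustration (function names are made up, not from LeNet-5), assuming a 2×2 window with stride 2:

```python
import numpy as np

def avg_pool_2x2(x):
    """Average-pool a 2D feature map with a 2x2 window and stride 2,
    in the spirit of LeNet-5's subsampling layers."""
    h, w = x.shape
    # Split the map into 2x2 blocks, then average inside each block.
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

# A toy 4x4 feature map; each 2x2 block collapses to its mean.
fmap = np.array([[ 2, 8, -2, 3],
                 [ 1, 7,  2, 1],
                 [-1, 2,  9, 2],
                 [-2, 0,  3, 3]], dtype=float)
pooled = avg_pool_2x2(fmap)  # -> [[4.5, 1.0], [-0.25, 4.25]]
```

Averaging makes the output change only slightly when a feature shifts by a pixel, which is the positional invariance the slide mentions.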
  6. AlexNet source: https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks • ReLU keep gradient alive

    • Max pooling better than average • Dropout generalization • GPU computation accelerate computation (figure: a 4×4 feature map max-pooled down to a 2×2 map) 6/17
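Both ReLU and max pooling are one-liners in NumPy; the sketch below is my own illustration (names are hypothetical), using the same 2×2-window, stride-2 layout as the average-pooling example:

```python
import numpy as np

def relu(x):
    """ReLU passes positive activations through unchanged, so gradients
    do not saturate the way they do with sigmoid or tanh."""
    return np.maximum(x, 0.0)

def max_pool_2x2(x):
    """Max-pool a 2D feature map with a 2x2 window and stride 2,
    keeping only the strongest response in each block."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.array([[ 2, 8, -2, 3],
                 [ 1, 7,  2, 1],
                 [-1, 2,  9, 2],
                 [-2, 0,  3, 3]], dtype=float)
pooled = max_pool_2x2(fmap)  # -> [[8, 3], [2, 9]]
```

Taking the maximum rather than the mean preserves the sharpest feature response in each region, which is the "better than average" point on the slide.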
  7. Network In Network source: https://arxiv.org/abs/1312.4400 • MLP after convolution efficient

    non-linear combinations
 of feature maps • Global Average Pooling one feature map for each class
 no Fully Connected layers • Small model size 29 [MB] for ImageNet 7/17
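The two NIN ideas above fit in a short NumPy sketch of my own (function names assumed, not from the paper): a 1×1 convolution is just a shared linear map across channels at every pixel, and Global Average Pooling collapses each class map to one score:

```python
import numpy as np

def conv_1x1(x, w):
    """1x1 convolution: x is (C_in, H, W), w is (C_out, C_in).
    Each output pixel is a linear mix of the input channels at that
    position -- effectively an MLP layer shared across all pixels."""
    return np.einsum('oc,chw->ohw', w, x)

def global_avg_pool(x):
    """Collapse each feature map to a single scalar; with one map per
    class, this replaces the Fully Connected classifier head."""
    return x.mean(axis=(1, 2))

x = np.random.randn(8, 4, 4)       # 8 input channels, 4x4 spatial
w = np.random.randn(10, 8)         # mix 8 channels into 10 class maps
logits = global_avg_pool(conv_1x1(x, w))  # one score per class
```

Because the classifier head has no Fully Connected weights at all, the model stays small, consistent with the 29 MB figure on the slide.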
  8. VGG source: https://arxiv.org/abs/1409.1556 • Deep Model with basic building blocks

    • convolution • max pooling • activation (ReLU) • Fully Connected • softmax • Sequence of small convolutions • 3*3 spatial convolutions • Relatively large number of parameters • many channels at the early stages • many Fully Connected layers 8/17
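The reason VGG stacks small 3×3 convolutions can be checked with simple weight-count arithmetic; this is my own illustration (biases ignored, channel count `C` is an arbitrary example):

```python
def conv_weights(k, c_in, c_out):
    """Number of weights in a k x k convolution layer (bias ignored)."""
    return k * k * c_in * c_out

# Two stacked 3x3 convolutions cover a 5x5 region of the input,
# yet use fewer weights than a single 5x5 convolution would:
C = 64
two_3x3 = 2 * conv_weights(3, C, C)   # 2 * 9 * C^2 = 18 * C^2
one_5x5 = conv_weights(5, C, C)       # 25 * C^2
```

On top of the savings, each extra 3×3 layer inserts another non-linearity, so the stacked form is also more expressive for the same receptive field.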
  9. InceptionV3 source: https://arxiv.org/abs/1512.00567 • Inception module • parallel operations and

    concatenation
 capture different features efficiently • mainly 3*3 convolution
 coming from the VGG architecture • 1*1 convolution
 reduce the number of channels • Global Average Pooling and Fully Connected • balance accuracy and model size • Good performance! 9/17
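The "parallel operations and concatenation" idea can be sketched with NumPy; this is a simplified stand-in of my own (real Inception branches also include 3×3 convolutions and pooling, here both branches are 1×1 convolutions for brevity):

```python
import numpy as np

def conv_1x1(x, w):
    """1x1 convolution: x is (C_in, H, W), w is (C_out, C_in)."""
    return np.einsum('oc,chw->ohw', w, x)

def inception_sketch(x, w_a, w_b):
    """Run two parallel branches on the same input and concatenate
    their outputs along the channel axis, as an Inception module does."""
    branch_a = conv_1x1(x, w_a)
    branch_b = conv_1x1(x, w_b)
    return np.concatenate([branch_a, branch_b], axis=0)

x = np.random.randn(32, 8, 8)                  # 32 channels, 8x8 spatial
out = inception_sketch(x,
                       np.random.randn(16, 32),  # branch A: 32 -> 16 channels
                       np.random.randn(24, 32))  # branch B: 32 -> 24 channels
# out has 16 + 24 = 40 channels; spatial size is unchanged
```

Concatenation is why the 1×1 "bottleneck" convolutions matter: without them, channel counts would grow at every module.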
  10. ResNet source: https://arxiv.org/abs/1512.03385 • Residual structure • shortcut (by-pass) connection


    keep gradient alive • 1*1 convolution
 reduce the number of channels • Very deep model • 152 layers in total • experiments with more than 1000 layers 10/17
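The residual structure itself is just `y = f(x) + x`; the sketch below is my own minimal illustration (names are hypothetical), showing why the shortcut keeps the signal alive:

```python
import numpy as np

def residual_block(x, f):
    """y = f(x) + x: the identity shortcut carries the signal (and,
    during training, the gradient) around the residual branch f, which
    is what makes very deep stacks trainable."""
    return f(x) + x

x = np.random.randn(4, 4)
# If the residual branch outputs zero, the block is an exact identity,
# so adding more blocks can never make the network strictly worse:
y = residual_block(x, lambda t: np.zeros_like(t))
```

In the real network, `f` is a small stack of convolutions (with 1×1 bottlenecks, batch norm, and ReLU); the shortcut is the part that changes the optimization behavior.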
  11. SqueezeNet source: https://arxiv.org/abs/1602.07360 • Fire module

    squeeze channels to reduce 
 computational costs (figure: a 55*55*96 input is squeezed by 1*1 convolutions to 55*55*16, then expanded by parallel 1*1 and 3*3 convolutions to 55*55*64 each and concatenated into 55*55*128) • Deep compression lighten model size
 sparse weights, weight quantization, Huffman coding • Small model with 6-bit quantization, the model size is 0.47 [MB] ! 11/17
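The saving from the fire module can be checked with weight-count arithmetic; this is my own illustration using the channel sizes from the slide's figure (biases ignored):

```python
def fire_weights(c_in, s_1x1, e_1x1, e_3x3):
    """Weights in a SqueezeNet fire module: a 1x1 squeeze layer followed
    by parallel 1x1 and 3x3 expand layers (biases ignored)."""
    squeeze = 1 * 1 * c_in * s_1x1
    expand = 1 * 1 * s_1x1 * e_1x1 + 3 * 3 * s_1x1 * e_3x3
    return squeeze + expand

# The slide's example: 96 input channels squeezed to 16,
# then expanded to 64 + 64 = 128 output channels.
fire = fire_weights(96, 16, 64, 64)   # 11,776 weights
plain = 3 * 3 * 96 * 128              # a direct 3x3 conv, 96 -> 128: 110,592
```

Because the expensive 3×3 filters only ever see the 16 squeezed channels, the module needs roughly a tenth of the weights of a plain 3×3 layer with the same input/output shape; deep compression then shrinks the stored model further.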
  12. ENet source: https://arxiv.org/abs/1606.02147 • Realtime segmentation model • downsampling at

    the early stages • asymmetric encoder-decoder structure • PReLU • small model ~ 1[MB] • Encoder can be used as a CNN • Global Max Pooling (figure: encoder-decoder pipeline with a 3 × 512 × 512 input) 12/17
  13. Deep Complex Networks source: https://arxiv.org/abs/1705.09792 • Complex structure

    • complex convolution • complex batch normalization • Advantages of complex values • biological & signal processing aspects
 can express firing rate & relative timing
 detailed description of objects • parameter efficiency
 2^(depth) more efficient than real values 13/17
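The core trick of a complex convolution is implementing one complex product with four real operations; this is my own minimal sketch of that identity (in the paper, the elementwise products below become real-valued convolutions):

```python
import numpy as np

def complex_op(w_re, w_im, x_re, x_im):
    """(w_re + i*w_im)(x_re + i*x_im) realized with four real products:
    real part = w_re*x_re - w_im*x_im, imag part = w_re*x_im + w_im*x_re.
    A complex convolution applies this same identity with real
    convolutions in place of the elementwise products."""
    return (w_re * x_re - w_im * x_im,
            w_re * x_im + w_im * x_re)

w_re, w_im = np.random.randn(3), np.random.randn(3)
x_re, x_im = np.random.randn(3), np.random.randn(3)
y_re, y_im = complex_op(w_re, w_im, x_re, x_im)
```

Keeping real and imaginary parts as separate channels is also what lets the network represent magnitude (firing rate) and phase (relative timing) jointly, as the slide notes.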
  14. Comparison by acc. vs. G-Ops. 14/17 source: https://arxiv.org/pdf/1605.07678.pdf

  15. Comparison by acc. / M-Params 15/17 source: https://medium.com/towards-data-science/neural-network-architectures-156e5bad51ba

  16. References 16/17

  17. [Review Papers] • On the Origin of Deep Learning: 


    https://arxiv.org/abs/1702.07800 • Recent Advances in Convolutional Neural Networks: 
 https://arxiv.org/abs/1512.07108 • Understanding Convolutional Neural Networks: 
 http://davidstutz.de/wordpress/wp-content/uploads/2014/07/seminar.pdf • An Analysis of Deep Neural Network Models for Practical Applications: 
 https://arxiv.org/abs/1605.07678 [Slides & Web pages] • Recent Progress on CNNs for Object Detection & Image Compression
 https://berkeley-deep-learning.github.io/cs294-131-s17/slides/sukthankar-UC-Berkeley-InvitedTalk-2017-02.pdf • CS231n: Convolutional Neural Networks for Visual Recognition: 
 http://cs231n.github.io/convolutional-networks/ [Blog posts] • Training ENet on ImageNet: 
 https://culurciello.github.io/tech/2016/06/20/training-enet.html • Neural Network Architectures:
 https://medium.com/towards-data-science/neural-network-architectures-156e5bad51ba 17/17