Basics of Image Recognition with Deep Learning: How CNNs Work

kmotohas
October 21, 2019

Tableau Data Science Study Group #4 - Image Recognition Technology and BI -
https://techplay.jp/event/750555
2019-10-21

Transcript

  1. Self-introduction — Kazuki Motohashi (本橋 和貴) @kmotohas, Skymind K.K. (スカイマインド株式会社)
     • Deep Learning Engineer (previous job: DL + ROS)
     • Background in particle physics experiments (the LHC-ATLAS experiment); Ph.D. in Science
     • Favorite book: 詳説 Deep Learning — 実務者のためのアプローチ (the Japanese edition of "Deep Learning: A Practitioner's Approach")
  2. The original "Deep Learning — A Practitioner's Approach" was published in August 2017
     • A guide built around Deeplearning4j (DL4J), a deep learning framework for JVM languages
     • Written by Adam Gibson, the developer of DL4J and founder of Skymind Inc
     • Main target audience: software, application, and system engineers
     • Covers everything from deep learning fundamentals to integration with big data platforms such as Hadoop and Spark
  3. The "Hello World" of machine learning:
     x = -2, -1, 0, 1, 2, 3, 4
     y = -3, -1, 1, 3, 5, 7, 9
     y = f(x)
  4. The "Hello World" of machine learning:
     x = -2, -1, 0, 1, 2, 3, 4
     y = -3, -1, 1, 3, 5, 7, 9
     y = f(x) = 2x + 1
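This "Hello World" can be reproduced in code with a single-neuron model. A minimal sketch in Python, assuming TensorFlow/Keras (the deck itself names no framework):

    import numpy as np
    import tensorflow as tf

    # The seven (x, y) pairs from the slide; the hidden rule is y = 2x + 1
    xs = np.array([-2, -1, 0, 1, 2, 3, 4], dtype=np.float32).reshape(-1, 1)
    ys = np.array([-3, -1, 1, 3, 5, 7, 9], dtype=np.float32).reshape(-1, 1)

    # A single Dense unit has exactly one weight a and one bias b, i.e. y = ax + b
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(1,))])
    model.compile(optimizer="sgd", loss="mean_squared_error")
    model.fit(xs, ys, epochs=500, verbose=0)

    print(model.predict(np.array([[10.0]])))  # close to 2*10 + 1 = 21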
  5. The machine learning approach:
     • Initialize the model with arbitrary parameters: y_ = ax + b
     • Compute the error (loss): L = 1/N Σ(y - y_)²
     • Update the parameters (a, b) slightly so that the error shrinks: a ← a - η ∂L/∂a
     • Compute the loss again: L = 1/N Σ(y - y_)²
  6. The machine learning approach (continued):
     • … update the parameters again (a ← a - η ∂L/∂a), compute the loss again, and so on,
       repeating until the loss stops decreasing
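This loop is gradient descent. A minimal from-scratch sketch in Python/NumPy of exactly the update rule above (the learning rate eta and the iteration count are arbitrary choices):

    import numpy as np

    x = np.array([-2, -1, 0, 1, 2, 3, 4], dtype=float)
    y = np.array([-3, -1, 1, 3, 5, 7, 9], dtype=float)

    a, b = 0.0, 0.0   # initialize the model y_ = a*x + b with arbitrary values
    eta = 0.01        # learning rate

    for step in range(2000):
        y_ = a * x + b                      # predictions of the current model
        grad_a = np.mean(2 * (y_ - y) * x)  # dL/da for L = 1/N * sum((y - y_)^2)
        grad_b = np.mean(2 * (y_ - y))      # dL/db
        a -= eta * grad_a                   # a <- a - eta * dL/da
        b -= eta * grad_b                   # b <- b - eta * dL/db

    print(a, b)  # converges toward a = 2, b = 1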
  8. Fashion MNIST Dataset
     • 70,000 images
     • 10 categories
     • 28×28 pixels
     • A dataset for experiments
     https://github.com/zalandoresearch/fashion-mnist
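Assuming TensorFlow/Keras is available, the dataset can be loaded in a couple of lines (fashion_mnist ships with tf.keras):

    import tensorflow as tf

    # Fashion-MNIST: 60,000 training + 10,000 test images (70,000 in total),
    # each 28x28 grayscale, labeled with one of 10 categories
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
    print(x_train.shape, y_train.shape)   # (60000, 28, 28) (60000,)
    x_train, x_test = x_train / 255.0, x_test / 255.0   # scale pixels to [0, 1]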
  11. (figure: Flatten — each 28×28 image is flattened into a single 784-dimensional vector before being fed to the network)
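A minimal sketch of the flatten-then-Dense classifier these slides depict, again assuming tf.keras; the hidden-layer width of 128 is an assumption for illustration, not taken from the deck:

    import tensorflow as tf

    # Flatten reshapes each 28x28 image into a 784-dimensional vector (28*28 = 784)
    # so that fully connected (Dense) layers can consume it
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation="relu"),    # hidden width: arbitrary choice
        tf.keras.layers.Dense(10, activation="softmax"),  # one output per category
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])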
  13. Drawbacks of the Dense layer
     • A Dense layer looks at correlations among all elements of the input vector
       > That is still fine for something like house-price prediction
       > There you are just combining features that are already engineered,
         e.g., Sumida-ku, Tokyo & floor area 30 m² & 1K layout & separate bath/toilet & newly built => monthly rent ¥100,000
     • Extracting "image features" first and then passing them to a Dense layer is more efficient
     → Convolutional Neural Network (CNN) — a minimal model sketch follows below
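A minimal sketch of such a CNN in tf.keras; the specific layer counts and filter sizes here are assumptions for illustration, not from the deck:

    import tensorflow as tf

    # Convolution + pooling layers learn local image features first;
    # only the much smaller feature maps are flattened and classified
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Conv2D(64, (3, 3), activation="relu"),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])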
  14. Convolution
     CURRENT_PIXEL_VALUE = 82
     NEW_PIXEL_VALUE = (-1 * 144) + (0 * 60) + (-2 * 19)
                     + (0.5 * 188) + (4.5 * 82) + (-1.5 * 32)
                     + (1.5 * 156) + (2 * 55) + (-3 * 27) = 496
     In general, for convolution layer l with K input channels and a W×H kernel,
     output channel m at position (i, j) is
     u_ijm = Σ_{k=0}^{K-1} Σ_{p=0}^{W-1} Σ_{q=0}^{H-1} z^(l-1)_{i+p, j+q, k} · h_pqkm + b_ijm
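The slide's arithmetic can be reproduced directly in NumPy; the 3x3 patch and kernel below are laid out under the assumption that the nine products on the slide read left to right, top to bottom:

    import numpy as np

    # 3x3 image patch centered on the current pixel (value 82), values from the slide
    patch = np.array([[144, 60, 19],
                      [188, 82, 32],
                      [156, 55, 27]], dtype=float)

    # 3x3 kernel weights from the slide
    kernel = np.array([[-1.0, 0.0, -2.0],
                       [ 0.5, 4.5, -1.5],
                       [ 1.5, 2.0, -3.0]])

    # One convolution step: element-wise multiply and sum
    # (deep learning frameworks compute cross-correlation: no kernel flip)
    new_pixel = np.sum(patch * kernel)
    print(new_pixel)  # 496.0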
  15. (figure: an example image — a basketball — used to illustrate convolution; source: https://www.bbkong.net/fs/alleyoop/molten_BGL7)
  16. ResNet
     • Winning model of the 2015 ImageNet competition (ILSVRC)
     • Introduced the Residual module (a shortcut mechanism)
     http://image-net.org/challenges/talks/ilsvrc2015_deep_residual_learning_kaiminghe.pdf
     (chart: "Revolution of Depth" — ImageNet classification top-5 error (%))
       ILSVRC'10 (shallow): 28.2
       ILSVRC'11 (shallow): 25.8
       ILSVRC'12 AlexNet (8 layers): 16.4
       ILSVRC'13 (8 layers): 11.7
       ILSVRC'14 VGG (19 layers): 7.3
       ILSVRC'14 GoogLeNet (22 layers): 6.7
       ILSVRC'15 ResNet (152 layers): 3.57
     Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, "Deep Residual Learning for Image Recognition", arXiv
  17.-20. (figure, animated across four slides: "Revolution of Depth" — the full ResNet-152
     architecture unrolled layer by layer, from "7x7 conv, 64, /2, pool/2" at the input,
     through repeated 1x1 / 3x3 / 1x1 bottleneck convolution blocks (64-256, 128-512,
     256-1024, 512-2048 channels), to "ave pool, fc 1000" at the output)
     Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, "Deep Residual Learning for Image Recognition", arXiv
  21. Deep Residual Learning
     • Plain net: any two stacked layers (weight layer → relu → weight layer → relu)
       try to fit a desired mapping H(x) directly; we hope the 2 weight layers fit H(x)
     • Residual net: the 2 weight layers fit the residual F(x) instead, and an identity
       shortcut adds the input back around them: H(x) = F(x) + x
     Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, "Deep Residual Learning for Image Recognition", arXiv
     http://image-net.org/challenges/talks/ilsvrc2015_deep_residual_learning_kaiminghe.pdf
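A minimal sketch of one residual block in tf.keras (batch normalization, which the paper uses, is omitted for brevity):

    import tensorflow as tf
    from tensorflow.keras import layers

    def residual_block(x, channels):
        # F(x): two stacked weight layers
        f = layers.Conv2D(channels, 3, padding="same", activation="relu")(x)
        f = layers.Conv2D(channels, 3, padding="same")(f)
        # Identity shortcut: H(x) = F(x) + x, then the final relu
        out = layers.Add()([f, x])
        return layers.Activation("relu")(out)

    # Example: one block applied to a 32x32 feature map with 64 channels
    inputs = tf.keras.Input(shape=(32, 32, 64))
    model = tf.keras.Model(inputs, residual_block(inputs, 64))
    model.summary()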
  22. CIFAR-10 experiments
     (plots: error (%) vs. iterations (×10^4) for plain nets and ResNets of 20/32/44/56
     (and 110) layers; solid lines: test error, dashed lines: training error)
     • Deep ResNets can be trained without difficulties
     • Deeper ResNets have lower training error, and also lower test error
       (whereas deeper plain nets show higher error than shallower ones)
     Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, "Deep Residual Learning for Image Recognition", arXiv
  23. Summary
     • Machine learning learns rules from data probabilistically
     • Optimizing fully connected weights directly on images is inefficient
     • CNNs automate even the image feature extraction step
     • Models such as ResNet have been proposed that do this even more efficiently
     • Honestly, what works depends on the data, so ideally you would just supply a set of
       architecture components that tend to work empirically and let the machine pick the
       right combination → AutoML