DevFest Tokyo 2017 TensorFlow

Keiji ARIYAMA
October 08, 2017

Slides for "Building a hobby image-collection server with TensorFlow, October 2017 issue", presented at DevFest Tokyo 2017.

The talk covers illustration face detection with SSD (Single Shot MultiBox Detector).

Transcript

  1. C-LIS CO., LTD.

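  2. C-LIS CO., LTD. Keiji ARIYAMA (C-LIS CO., LTD.) Photo: Koji MORIGUCHI (AUN CREATIVE FIRM)
    Android app development: "can do a little". Machine learning: "have tried it a little" / "not doing it".

  3. DevFest Tokyo 2017: Building a hobby image-collection server with TensorFlow, October 2017 issue

  4. The story so far

  5. TensorFlow announced (…)

  6. Let's hold a study group

  7. Google Developer Group

  8. https://gdgkobe.doorkeeper.jp/events/…

  9. I want to automatically collect images I like from the Internet

  10. © 根雪れい. Megane-kko (眼鏡っ娘, girls with glasses)

  11. System architecture: Downloader, Face Detection, Megane Detection, review and correction, recognition results (JSON),
    training, timeline, media, dataset, training, TensorFlow
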
  12. JSON from Region Cropper: https://github.com/keiji/region_cropper

        {
          "generator": "Region Cropper",
          "file_name": "haruki_g17.png",
          "regions": [
            {
              "probability": 1.0,
              "label": 2,
              "rect": { "left": 97.0, "top": 251.0, "right": 285.0, "bottom": 383.0 }
            },
            {
              "probability": 1.0,
              "label": 2,
              "rect": { "left": 536.0, "top": 175.0, "right": 730.0, "bottom": 321.0 }
            }
          ]
        }

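A record like the one on slide 12 can be read with Python's standard `json` module. The following is a minimal sketch; the helper name `load_regions` is hypothetical and is not part of Region Cropper.

```python
import json

# Sample record copied from slide 12 (Region Cropper output).
SAMPLE = """
{
  "generator": "Region Cropper",
  "file_name": "haruki_g17.png",
  "regions": [
    {"probability": 1.0, "label": 2,
     "rect": {"left": 97.0, "top": 251.0, "right": 285.0, "bottom": 383.0}},
    {"probability": 1.0, "label": 2,
     "rect": {"left": 536.0, "top": 175.0, "right": 730.0, "bottom": 321.0}}
  ]
}
"""


def load_regions(text):
    """Hypothetical helper: return (label, [left, top, right, bottom]) pairs."""
    record = json.loads(text)
    result = []
    for region in record["regions"]:
        rect = region["rect"]
        box = [rect["left"], rect["top"], rect["right"], rect["bottom"]]
        result.append((region["label"], box))
    return result


regions = load_regions(SAMPLE)
for label, box in regions:
    print(label, box)
```

Keeping the rectangle as a flat `[left, top, right, bottom]` list matches the order the later slides use for coordinates.
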
  13. Megane-kko classifier
    conv 3x3x3, 32 → conv 3x3x32, 64 → max_pooling → lrn → conv 3x3x64, 128 → conv 3x3x128, 128 → max_pooling → lrn → conv 3x3x128, 128 → conv 3x3x128, 256 → fc 384 → fc 256 → output 2

  14. Classification accuracy: 1, 0

  15. Techbookfest 1 & 2 (技術書典), a doujinshi event devoted to technical books
    https://techbookfest.org/tbf01/#A-28
    https://techbookfest.org/event/tbf02/circle/5705718560718848

  16. Images waiting to be sorted: … (as of 2017-10-08)

  17. Maintaining (sorting) the dataset is the most pressing issue

  18. Illustration face detection (Face Detection)

  19. System architecture: Downloader, Face Detection, Megane Detection, review and correction, recognition results (JSON),
    training, timeline, media, dataset, training, TensorFlow

  20. R-CNN: Region-based Convolutional Neural Networks

  21. Sliding Window

  22. Selective Search
    Selective Search for Object Recognition: http://www.huppelen.nl/publications/selectiveSearchDraft.pdf
    848

  23. DevFest Tokyo 2017: Building an illustration face detector with TensorFlow

  24. Creating the training data

  25. Normalizing the coordinates (image size 888x613)
    Region 1: left 536 → 0.6036, top 175 → 0.2854, right 730 → 0.8220, bottom 321 → 0.5236
    Region 2: left 97 → 0.1092, top 251 → 0.4094, right 285 → 0.3209, bottom 383 → 0.6247

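The normalization on slide 25 divides horizontal coordinates by the image width and vertical coordinates by the image height. A minimal sketch, assuming rectangles are given as `[left, top, right, bottom]` (the function name `normalize_rect` is hypothetical):

```python
def normalize_rect(rect, image_width, image_height):
    """Scale pixel coordinates [left, top, right, bottom] into the 0..1 range."""
    left, top, right, bottom = rect
    return [left / image_width, top / image_height,
            right / image_width, bottom / image_height]


# Image size and pixel rectangles from slide 25.
WIDTH, HEIGHT = 888, 613
region_1 = normalize_rect([536, 175, 730, 321], WIDTH, HEIGHT)
region_2 = normalize_rect([97, 251, 285, 383], WIDTH, HEIGHT)

print(["%.4f" % v for v in region_1])
print(["%.4f" % v for v in region_2])
```

The values on the slide appear to be truncated rather than rounded to four decimals, so the last digit printed here can differ by one.
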
  26. Resizing images and converting them to grayscale (256x256)
    Region 1: left 0.6036, top 0.2854, right 0.8220, bottom 0.5236
    Region 2: left 0.1092, top 0.4094, right 0.3209, bottom 0.6247

  27. Record structure
    [image] image data (byte string)
    [labels] …
    [regions] …

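  28. Writing records to a TFRecord file:

        import tensorflow as tf  # TensorFlow 1.x API

        # output_path, options, the field values and the _*_feature helpers are
        # defined elsewhere in the author's script. Digits were lost in the
        # transcript, so the "64" in the _int64_* helper names is inferred.
        with tf.python_io.TFRecordWriter(output_path, options) as writer:
            example = tf.train.Example(features=tf.train.Features(feature={
                'file_name': _bytes_feature(file_name.encode()),
                'image': _bytes_feature(image_data),
                'region_count': _int64_feature(region_count),
                'regions': _float_feature(flatten_regions),
                'labels': _int64_list_feature(labels),
            }))
            # Serialize the Example protocol buffer and append it to the file.
            writer.write(example.SerializeToString())
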
  29. Model
    conv 5x5x1, 128 → batch_normalization
    → conv 1x1x128, 64 → conv 3x3x64, 64 → batch_normalization → conv 1x1x64, 128 → ReLU
    → conv 1x1x128, 128 → conv 3x3x128, 128 → batch_normalization → conv 1x1x128, 128 → ReLU
    → conv 1x1x256, 256 → conv 3x3x256, 256 → batch_normalization → conv 1x1x256, 128 → ReLU
    → conv 2x2x128, 128 → fc 2048 → fc 4
    Diagram labels: input 256x256x1; Residual block 1, Residual block 2, Residual block 3; × 3 (four times); avg_pool 2x2 (three times); ReLU; Sigmoid

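  30. Input and output: 256x256x1 → model → Sigmoid

  31. Activation functions: ReLU, Sigmoid

  32. 256x256
    Region 1: left 0.6036, top 0.2854, right 0.8220, bottom 0.5236
    Region 2: left 0.1092, top 0.4094, right 0.3209, bottom 0.6247
    Range of the coordinate values (…)

  33. Activation functions: ReLU, Sigmoid

  34. Training: mini-batch …, learning rate … (AdamOptimizer)

  35. Detection results

  36. What if there are two or more people?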

  37. 

  38. Data breakdown (chart labels: 0 (18.57%), 1 (57.64%), 2 (11.93%))

    Number of faces   Files    Share
    0                  3748   18.57%
    1                 11636   57.64%
    2                  2408   11.93%
    3                   844    4.18%
    4                   596    2.95%
    5                   325    1.61%
    6                   214    1.06%
    7                   108    0.53%

  39. Input and output: 256x256x1 → model → Sigmoid

  40. Detecting face regions

  41. Detecting face regions
    dconv 3x3x1, 64; dconv 3x3x64, 64; dconv 3x3x256, 128; dconv 3x3x64, 128; dconv 3x3x128, 128
    conv 3x3x1, 64; conv 3x3x64, 64; conv 3x3x128, 256; conv 3x3x64, 128; conv 3x3x128, 128

  42. Detecting face regions

  43. Role of the convolutional layer (conv2d): … feature maps

  44. Detecting face regions (convolutional autoencoder)
    dconv 3x3x3, 64; dconv 3x3x64, 64; dconv 3x3x256, 128; dconv 3x3x64, 128; dconv 3x3x128, 128
    conv 3x3x1, 64; conv 3x3x64, 64; conv 3x3x128, 256; conv 3x3x64, 128; conv 3x3x128, 128

  45. When only convolutional layers are used
    conv 3x3x1, 64; conv 3x3x64, 64; conv 3x3x128, 256; conv 3x3x64, 128; conv 3x3x128, 128
    dconv 3x3x3, 64; dconv 3x3x64, 64; dconv 3x3x256, 128; dconv 3x3x64, 128; dconv 3x3x128, 128

  46. When only convolutional layers are used
    Input 256x256x1
    conv 3x3x64 stride 1, ReLU (twice)
    conv 3x3x128 stride 1, ReLU (twice)
    conv 3x3x256 stride 1, ReLU (twice)
    conv 1x1x1 stride 1, Sigmoid
    max_pool 2x2 stride 2 (twice)
    Output 29x29x1

  47. Grid (29x29)

  48. Where in the grid is the face?

  49. Select the cell at the center of the coordinates

  50. Setting data for each cell
    Cell at the center of region 1: confidence 1.0, center_x 0.7128, center_y 0.4045, width 0.2184, height 0.2382
    Cell at the center of region 2: confidence 1.0, center_x 0.2150, center_y 0.5169, width 0.2117, height 0.2151
    Other cells: confidence 0.0, center_x 0.0, center_y 0.0, width 0.0, height 0.0

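Slides 47 to 50 can be sketched as follows: convert each rectangle to center form, pick the grid cell that contains the center, and write the five values into it. This is a minimal illustration, not the author's code. How the cell index is derived (`int(center * grid_size)`) and the names `to_center_form` and `build_target` are assumptions.

```python
GRID_SIZE = 29  # grid resolution from the slides


def to_center_form(rect):
    """Convert [left, top, right, bottom] to (center_x, center_y, width, height)."""
    left, top, right, bottom = rect
    return ((left + right) / 2, (top + bottom) / 2, right - left, bottom - top)


def build_target(rects, grid_size=GRID_SIZE):
    """Return a grid_size x grid_size grid of [confidence, cx, cy, w, h]."""
    grid = [[[0.0] * 5 for _ in range(grid_size)] for _ in range(grid_size)]
    for rect in rects:
        cx, cy, w, h = to_center_form(rect)
        # Assumed rule: the cell that contains the box center, clamped to the grid.
        col = min(int(cx * grid_size), grid_size - 1)
        row = min(int(cy * grid_size), grid_size - 1)
        grid[row][col] = [1.0, cx, cy, w, h]
    return grid


# Normalized rectangles from the slides.
faces = [[0.6036, 0.2854, 0.8220, 0.5236], [0.1092, 0.4094, 0.3209, 0.6247]]
target = build_target(faces)

# Count the cells that were marked as containing a face.
positives = sum(1 for row in target for cell in row if cell[0] == 1.0)
print(positives)
```

All other cells keep confidence 0.0 and zeroed coordinates, as on slide 50.
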
  51. Record structure
    [image] image data (byte string)
    [box_labels] …
    [box_regions] …

  52. Model
    Input 256x256x1
    conv 3x3x64 stride 1, ReLU (twice)
    conv 3x3x128 stride 1, ReLU (twice)
    conv 3x3x256 stride 1, ReLU (twice)
    conv 1x1x5 stride 1, Sigmoid
    max_pool 2x2 stride 2 (twice)
    Output 29x29x5

  53. Output: confidence, center_x, center_y, width, height
    29 × 29 × (4 + 1) = 4,205

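  54. Training: optimization algorithm AdamOptimizer, learning rate …

  55. Limits of the 29x29 grid
    Cells other than the one at the face center also get a high confidence
    Correct coordinates cannot be obtained from the high-confidence cells

  56. Role of the convolutional layers
    Input 256x256x1
    conv 3x3x64 stride 1, ReLU (twice)
    conv 3x3x128 stride 1, ReLU (twice)
    conv 3x3x256 stride 1, ReLU (twice)
    conv 1x1x1 stride 1, Sigmoid
    max_pool 2x2 stride 2 (twice)
    Output 29x29x1

  57. … A layer scans the input from the previous layer and outputs feature maps

  58. The smaller a feature map becomes, the more global the features it represents

  59. The smaller a feature map becomes, the more global the features it represents

  60. Assign each face to cells of a suitable size

  61. Produce outputs along the way as the convolutions are repeated
    Single Shot MultiBox Detector: https://arxiv.org/abs/1512.02325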

  62. Model

  63. Base network
    Input 256x256x1
    conv 3x3x64 stride 1, ReLU (twice)
    conv 3x3x128 stride 1, ReLU (twice)
    conv 3x3x256 stride 1, ReLU (twice)
    max_pool 2x2 stride 2 (twice)
    Output 29x29x256

  64. Detection layers
    Input 29x29x256
    conv 3x3x256 stride 2 (five times), shrinking the feature map step by step
    conv 1x1x5 stride 1 with Sigmoid at each of the six scales
    Outputs: 29x29x5, 15x15x5, 8x8x5, 4x4x5, 2x2x5, 1x1x5

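The output sizes on slide 64 are consistent with each stride-2 convolution mapping a feature map of size n to ceil(n / 2). The sketch below assumes "SAME"-style padding, which the slides do not state, and the helper name `feature_map_sizes` is hypothetical.

```python
import math


def feature_map_sizes(start, count, stride=2):
    """Sizes of successive maps, assuming SAME padding: ceil(size / stride)."""
    sizes = [start]
    for _ in range(count):
        sizes.append(math.ceil(sizes[-1] / stride))
    return sizes


sizes = feature_map_sizes(29, 5)   # five stride-2 convolutions
cells = [s * s for s in sizes]     # number of cells at each scale
total_cells = sum(cells)

print(sizes)
print(cells)
print(total_cells)
```

The total number of cells across all scales equals the 1,151 listed on slide 73.
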
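  65. Training data

  66. Training data
    Cell at the center of region 1: confidence 1.0, center_x 0.7128, center_y 0.4045, width 0.2184, height 0.2382
    Cell at the center of region 2: confidence 1.0, center_x 0.2150, center_y 0.5169, width 0.2117, height 0.2151
    Other cells: confidence 0.0, center_x 0.0, center_y 0.0, width 0.0, height 0.0

  67. Criterion for assigning cells: Jaccard Overlap
    $J(A, B) = \dfrac{|A \cap B|}{|A| + |B| - |A \cap B|}$
    A face is assigned to the cells whose value (overlap) is at or above a certain level
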
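  68. Computing the Jaccard overlap:

        # Rectangles are [left, top, right, bottom]. The index values were lost in
        # the transcript; they are inferred from the argument order used below.
        INDEX_LEFT, INDEX_TOP, INDEX_RIGHT, INDEX_BOTTOM = 0, 1, 2, 3


        def _land(rect):
            """Area of a rectangle, or 0 if it is empty."""
            width = rect[INDEX_RIGHT] - rect[INDEX_LEFT]
            height = rect[INDEX_BOTTOM] - rect[INDEX_TOP]
            if width < 0 or height < 0:
                return 0
            return width * height


        def jaccard_overlap(rect1, rect2):
            """Intersection area divided by union area."""
            overlap_left = max(rect1[INDEX_LEFT], rect2[INDEX_LEFT])
            overlap_right = min(rect1[INDEX_RIGHT], rect2[INDEX_RIGHT])
            overlap_top = max(rect1[INDEX_TOP], rect2[INDEX_TOP])
            overlap_bottom = min(rect1[INDEX_BOTTOM], rect2[INDEX_BOTTOM])
            overlap = _land([overlap_left, overlap_top, overlap_right, overlap_bottom])
            union = _land(rect1) + _land(rect2) - overlap
            return overlap / union
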
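The assignment rule from slide 67 can be sketched in plain Python. This is an illustration under stated assumptions, not the author's code: the threshold of 0.5 and the three cell rectangles are hypothetical, since the slides only say "at or above a certain level".

```python
def area(rect):
    """Area of [left, top, right, bottom]; 0.0 if the rectangle is empty."""
    width = rect[2] - rect[0]
    height = rect[3] - rect[1]
    if width < 0 or height < 0:
        return 0.0
    return width * height


def jaccard_overlap(rect1, rect2):
    """Intersection area divided by union area."""
    overlap = area([max(rect1[0], rect2[0]), max(rect1[1], rect2[1]),
                    min(rect1[2], rect2[2]), min(rect1[3], rect2[3])])
    union = area(rect1) + area(rect2) - overlap
    return overlap / union


THRESHOLD = 0.5  # assumption: the slides do not give the actual value

face = [0.6036, 0.2854, 0.8220, 0.5236]  # normalized face rectangle from the slides
cells = {                                # hypothetical cell rectangles
    "cell_a": [0.5000, 0.2500, 0.7500, 0.5000],
    "cell_b": [0.0000, 0.5000, 0.2500, 0.7500],
    "cell_c": [0.6000, 0.3000, 0.8000, 0.5000],
}

# Keep every cell whose overlap with the face reaches the threshold.
assigned = [name for name, rect in cells.items()
            if jaccard_overlap(face, rect) >= THRESHOLD]
print(assigned)
```

A lower threshold assigns a face to more cells, which increases the number of positive samples at the cost of looser matches.
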
  69. Loss functions
    Confidence → MSE (Mean Squared Error)
    Coordinates (Region) → Smooth L1 Loss (Fast R-CNN, https://arxiv.org/abs/…)

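  70. Smooth L1 loss in TensorFlow:

        import tensorflow as tf


        def _smooth_l1_loss(x):
            with tf.name_scope("smooth_l1"):
                abs_x = tf.abs(x)
                # 1.0 where |x| < 1 (quadratic part), 0.0 elsewhere (linear part).
                less_mask = tf.cast(abs_x < 1.0, tf.float32)
                return less_mask * (0.5 * tf.square(x)) + (1.0 - less_mask) * (abs_x - 0.5)


        def _loc_loss(logits, groundtruth):
            # Loss on the difference between predicted and true coordinates.
            offset = logits - groundtruth
            return _smooth_l1_loss(offset)

  71. MSE (Mean Squared Error) and Smooth L1 Loss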
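To see how the two losses differ, the scalar sketch below evaluates both on a few error values. It uses plain Python rather than TensorFlow, and `squared_error` stands in for the per-element term of MSE.

```python
def squared_error(x):
    """Per-element term of the mean squared error."""
    return x * x


def smooth_l1(x):
    """0.5 * x^2 when |x| < 1, otherwise |x| - 0.5 (the Fast R-CNN definition)."""
    abs_x = abs(x)
    if abs_x < 1.0:
        return 0.5 * x * x
    return abs_x - 0.5


for error in [0.1, 0.5, 1.0, 2.0, 4.0]:
    print(error, squared_error(error), smooth_l1(error))
```

For large errors the squared error grows quadratically while Smooth L1 grows linearly, so outliers in the coordinates pull on the gradients less.
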

  72. Training failed...

  73. Share of positive samples

    Grid     Cells
    29x29      841
    15x15      225
    8x8         64
    4x4         16
    2x2          4
    1x1          1
    Total    1,151

  74. Hard Negative Mining

  75. In addition to the positive samples, select the samples with the largest error
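The selection described on slide 75 can be illustrated in plain Python for a single image. This is a simplified sketch, not the author's TensorFlow code: the function name `hard_negative_loss` is hypothetical, and the default of 3 negatives per positive is an assumption borrowed from the 3:1 ratio in the SSD paper.

```python
def hard_negative_loss(losses, positive_mask, negative_rate=3):
    """Average the positive losses plus the largest negative losses.

    losses and positive_mask are flat lists with one entry per cell.
    """
    positives = [l for l, m in zip(losses, positive_mask) if m]
    negatives = sorted((l for l, m in zip(losses, positive_mask) if not m),
                       reverse=True)
    # Keep at most negative_rate negatives per positive, hardest first.
    negative_count = min(len(positives) * negative_rate, len(negatives))
    hardest = negatives[:negative_count]

    loss = 0.0
    if positives:
        loss += sum(positives) / len(positives)
    if hardest:
        loss += sum(hardest) / len(hardest)
    return loss, hardest


# One positive cell and seven negative cells (made-up loss values).
losses = [0.9, 0.1, 0.8, 0.05, 0.6, 0.02, 0.3, 0.01]
mask = [1, 0, 0, 0, 0, 0, 0, 0]
loss, hardest = hard_negative_loss(losses, mask)
print(hardest)
print(loss)
```

Without this selection, the many easy negative cells dominate the loss and the few positive cells contribute very little.
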

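  76. Loss with hard negative mining:

        import tensorflow as tf  # TensorFlow 1.x API

        # Negatives kept per positive. The value was lost in the transcript;
        # 3 is an assumption taken from the 3:1 ratio in the SSD paper.
        NEGATIVE_COUNT_RATE = 3


        def _calc_loss_with_hard_negative_mining(losses, positive_mask, positive_count):
            negative_count = positive_count * NEGATIVE_COUNT_RATE
            batch_size = losses.get_shape()[0].value

            # Split the per-cell losses into positive and negative parts.
            positive_losses = losses * positive_mask
            negative_losses = losses - positive_losses

            # Keep only the negative_count largest negative losses.
            top_negative_losses, _ = tf.nn.top_k(
                tf.reshape(negative_losses, [batch_size, -1]),
                k=tf.cast(negative_count, tf.int32))

            loss = (tf.reduce_sum(positive_losses / positive_count)
                    + tf.reduce_sum(top_negative_losses / negative_count))
            return loss
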
  77. Training failed...

  78. Faces are detected well, but the coordinates are wrong

  79. default box (Anchor)

  80. Introducing default boxes
    Default box 1: left 0.5000, top 0.2500, right 0.7500, bottom 0.5000
    Default box 2: left 0.0000, top 0.5000, right 0.2500, bottom 0.7500

  81. Default boxes and coordinates
    Default box 1: left 0.5000, top 0.2500, right 0.7500, bottom 0.5000
    Region 1: left 0.6036, top 0.2854, right 0.8220, bottom 0.5236
    Default box 2: left 0.0000, top 0.5000, right 0.2500, bottom 0.7500
    Region 2: left 0.1092, top 0.4094, right 0.3209, bottom 0.6247

  82. Set the difference (offset) between the default box and the coordinates
    Region 1: left 0.6036, top 0.2854, right 0.8220, bottom 0.5236
    Offset 1: left 0.1036, top 0.0354, right 0.0720, bottom 0.0236
    Region 2: left 0.1092, top 0.4094, right 0.3209, bottom 0.6247
    Offset 2: left 0.1092, top -0.0906, right 0.0709, bottom -0.1253

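The offsets on slide 82 are the per-coordinate differences between each region and its default box. A minimal sketch with the slide's values (the function names `offsets` and `apply_offsets` are hypothetical):

```python
def offsets(rect, default_box):
    """Per-coordinate difference between a region and its default box."""
    return [r - d for r, d in zip(rect, default_box)]


def apply_offsets(offset, default_box):
    """Inverse operation: recover the region from predicted offsets."""
    return [o + d for o, d in zip(offset, default_box)]


# Values from the slides, in [left, top, right, bottom] order.
default_1 = [0.5000, 0.2500, 0.7500, 0.5000]
region_1 = [0.6036, 0.2854, 0.8220, 0.5236]
default_2 = [0.0000, 0.5000, 0.2500, 0.7500]
region_2 = [0.1092, 0.4094, 0.3209, 0.6247]

offset_1 = offsets(region_1, default_1)
offset_2 = offsets(region_2, default_2)
print([round(v, 4) for v in offset_1])
print([round(v, 4) for v in offset_2])
```

At inference time, `apply_offsets` turns the network's predicted offsets back into coordinates in the image.
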
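  83. An output of … becomes meaningful

  84. Changing the activation function (output layer)
    Offset 1: left 0.1036, top 0.0354, right 0.0720, bottom 0.0236
    Offset 2: left 0.1092, top -0.0906, right 0.0709, bottom -0.1253

  85. Changing the activation function (output layer): Sigmoid → Hyperbolic Tangent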
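The reason for switching the output activation can be checked numerically: offsets may be negative, but a sigmoid only produces values between 0 and 1, whereas tanh covers the range from -1 to 1. A small sketch using one of the negative offsets from slide 84:

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


target_offset = -0.0906  # a negative offset from the slides

# Sigmoid outputs stay strictly positive, so a negative target is unreachable.
sigmoid_outputs = [sigmoid(x) for x in [-10.0, -1.0, 0.0, 1.0, 10.0]]
print(min(sigmoid_outputs) > 0.0)

# tanh can produce the target: atanh gives the required pre-activation value.
pre_activation = math.atanh(target_offset)
print(math.tanh(pre_activation))

# An input of 0 maps to an output of 0 under tanh.
print(math.tanh(0.0))
```
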

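  86. Training: … steps (batch size …, multi-GPU)
    Optimization algorithm: AdamOptimizer, learning rate …, decay … step

  87. Environment (Sakura's Koukaryoku Computing, さくらの高火力コンピューティング)
    CPU: Xeon … Core × …
    Memory: … GB
    SSD: … GB
    GeForce GTX TITAN X (Pascal architecture) … GB × …
    GeForce GTX … Ti (Pascal architecture) … GB × …

  88. Change in the loss

  89. Validation

  90. Validation

  91. Future work

  92. Handling images with a large aspect ratio

  93. Introducing default boxes with different aspect ratios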
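One way to generate default boxes with different aspect ratios is the rule from the SSD paper, where a box with scale s and aspect ratio a has width s·sqrt(a) and height s/sqrt(a). Whether the author adopted this rule is not stated in the slides; the scale, the ratios and the function name below are hypothetical.

```python
import math


def default_boxes_for_cell(center_x, center_y, scale, aspect_ratios):
    """Return [left, top, right, bottom] boxes sharing one center and area."""
    boxes = []
    for ratio in aspect_ratios:
        width = scale * math.sqrt(ratio)    # wider for ratio > 1
        height = scale / math.sqrt(ratio)   # taller for ratio < 1
        boxes.append([center_x - width / 2, center_y - height / 2,
                      center_x + width / 2, center_y + height / 2])
    return boxes


# Hypothetical cell in the middle of the image with three aspect ratios.
boxes = default_boxes_for_cell(0.5, 0.5, scale=0.25, aspect_ratios=[1.0, 2.0, 0.5])
for box in boxes:
    print([round(v, 4) for v in box])
```

All boxes produced for one cell have the same area (scale squared), so only the shape changes with the aspect ratio.
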

  94. Techbookfest 3 (技術書典3)
    https://techbookfest.org/event/tbf03/circle/5686003050217472
    Date: Sunday, October 22, 2017, 11:00 to 17:00
    Venue: Akiba Square
    Organizers: Techbooster / 達人出版会
    Illustration: 根雪れい

  95. References
    Single Shot MultiBox Detector: https://arxiv.org/abs/1512.02325
    SSD: Single Shot MultiBox Detector (Japanese translation), Qiita: https://qiita.com/…
    "Building your own face detector with TensorFlow", すぎゃーんメモ: http://memo.sugyan.com/entry/…
    "Trying object region prediction (Region Proposal) with TensorFlow": http://workpiles.com/…

  96. References
    Fast R-CNN: https://arxiv.org/abs/…
    Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks: https://arxiv.org/abs/…
    You Only Look Once: Unified, Real-Time Object Detection: https://arxiv.org/abs/…
    Jaccard Index: https://en.wikipedia.org/wiki/Jaccard_index

    
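  97. C-LIS CO., LTD.
    This material is a copyrighted work of C-LIS CO., LTD. Unless otherwise noted, the illustrations in it are copyrighted works of 根雪れい.
    Reproducing all or part of this material without written permission from the copyright holder is prohibited.
    The Android Studio icon is reproduced or modified from work created and shared by Google and used according to terms described in the Creative Commons 3.0 Attribution License.
    Product names, brand names, and company names are generally trademarks or registered trademarks of their respective owners. The ©, ® and ™ marks are omitted in this material.
    The Android robot is reproduced or modified from work created and shared by Google and used according to terms described in the Creative Commons 3.0 Attribution License.
    https://speakerdeck.com/keiji/devfest-tokyo-2017-tensorflow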