Slide 1

Slide 1 text

MobileNetV2: Inverted Residuals and Linear Bottlenecks
46th Computer Vision Study Group @ Kanto, 2018-07-01
Yohei Kikuta (@yohei_kikuta)
Event URL: https://kantocv.connpass.com/event/88613/, paper: https://arxiv.org/abs/1801.04381

Slide 2

Slide 2 text

Self-introduction
name: Yohei KIKUTA
company: Cookpad Inc.
twitter: @yohei_kikuta
GitHub: yoheikikuta
resume: https://github.com/yoheikikuta/resume
blog: 原理的には可能 ("Possible in principle") https://yoheikikuta.github.io/

Slide 3

Slide 3 text

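Summary
1. A model developed from MobileNetV1
2. Proposes a building block that expands the number of channels between linear layers and places a separable convolution there
   → a structure that follows from examining the (non)linearity and expressive power of ReLU
3. In experiments, faster than NASNet with equal or better results
4. Actually ran it on mobile using ML Kit → Cookpad developer blog
Blog URL: https://techlife.cookpad.com/entry/2018/07/05/090000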

Slide 4

Slide 4 text

Food / non-food classification with MobileNetV2 using ML Kit
https://techlife.cookpad.com/entry/2018/07/05/090000
GIF file: https://i.imgur.com/DRHVejp.gifv

Slide 5

Slide 5 text

Background

Slide 6

Slide 6 text

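We want lightweight models that fit on mobile
- Machine learning looks set to keep moving onto mobile devices
- Momentum is building with the arrival of ML Kit, Create ML, and similar tools
- Important for speed, privacy, scalability, and so on
- Also one direction of architecture search within deep learning
ML Kit: https://developers.google.com/ml-kit/, Create ML: https://developer.apple.com/documentation/create_ml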

Slide 7

Slide 7 text

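Directions in modeling for mobile
On the architecture side, one trend is to rework how the spatial and channel directions are handled
→ {separable, group, shuffle} convolution are representative examples
Other techniques for making models lighter exist but are outside the scope of this paper (they can be combined with it)
- Reducing data size by lowering numerical precision
- Reducing data size by quantization or coding
- Converting to a smaller model, for example by distillation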

Slide 8

Slide 8 text

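Directions in modeling for mobile
Architecture search usually targets general-purpose designs
(although experimental validation ends up tuning them somewhat to other factors)
An architecture that relies on another factor, in this paper the activation function, is special in that sense but can be more efficient
This paper proposes a building block based on the properties of ReLU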

Slide 9

Slide 9 text

The path to MobileNetV2 and related work

Slide 10

Slide 10 text

Evolution of architectures
A selection of CNN models that introduced distinctive structures
- Network In Network: Global average pooling
- VGG: Stacking of 3×3 convolution
- ResNet: Residual connection
- Inception(V3): Inception module
- SqueezeNet: Fire module
- ENet: Early stage down sampling
- DenseNet: Dense convolution
- Xception: Separable convolution
- SENet: Squeeze and excitation block

Slide 11

Slide 11 text

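Automatic architecture search
Prepare several patterns of basic operations such as convolution
Search for the best building block by combining them
- NASNet: optimized in a reinforcement learning framework
- AmoebaNet: optimized in an evolutionary computation framework
- DARTS: optimized with gradient-based methods as a continuous relaxation problem
NASNet: https://arxiv.org/abs/1707.07012, AmoebaNet: https://arxiv.org/abs/1802.01548, DARTS: https://arxiv.org/abs/1806.09055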

Slide 12

Slide 12 text

Automatic architecture search
Experimental comparison (ImageNet classification)
Figure quoted from https://arxiv.org/abs/1806.09055

Slide 13

Slide 13 text

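Searching for lightweight architectures
Automatic search is powerful, but it only combines operations within a fixed set
→ architectures built from a small number of layers are computationally easy to explore
Building models that perform better than this requires new ideas
MobileNetV2 examines structures that focus on the characteristics of ReLU
→ it devises a new building block specialized to ReLU
One senses an attitude of "it may not be general-purpose, and that is fine!"
It appears to have been arrived at as a result of thinking about how to improve MobileNetV1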

Slide 14

Slide 14 text

Characteristics of ReLU

Slide 15

Slide 15 text

Definitions of ReLU and ReLU6
ReLU: $\mathrm{ReLU}(x) = \max(0, x)$
ReLU6: $\mathrm{ReLU6}(x) = \min(\max(0, x), 6)$
Figure created with https://www.desmos.com/calculator/865rohnewg
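The two activations can be written out directly. This is a minimal scalar sketch using the standard definitions; the function names `relu` and `relu6` are chosen here for illustration.

```python
def relu(x: float) -> float:
    """Standard ReLU: pass positive values through, clamp negatives to 0."""
    return max(0.0, x)


def relu6(x: float) -> float:
    """ReLU6: same as ReLU, but the output is additionally capped at 6."""
    return min(max(0.0, x), 6.0)
```

The cap at 6 only changes the output for inputs above 6; below that the two functions agree.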

Slide 16

Slide 16 text

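Expressive power of ReLU
Consider the case where a non-zero volume $S$ remains after the transformation $\mathrm{ReLU}(Bx)$
Points mapped into the interior of $S$ are given by the linear transformation $Bx$ itself
→ in the non-zero region of the output, the expressive power is that of a linear transformation
ReLU also collapses some regions, so information is lost in general; a lower bound on the collapsed region is given for a ReLU transformation under a stated condition (the expressions themselves do not survive in this transcript)
The proof is the Proof of Theorem 1 in Appendix A of the original paper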

Slide 17

Slide 17 text

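If the number of channels around the ReLU is large enough, information is not lost
Map a low-dimensional manifold into dim = m dimensions, apply ReLU, and map back
If m is small, information is lost; if it is large, it is not
On the other hand, if m is too large, regions with marked deformation also appear
→ there is an appropriate amount of channel expansion (6× in the paper)
Figure quoted from https://arxiv.org/abs/1801.04381. Note that 6× corresponds to dim=12 in this figure (arguably that case should have been plotted as well)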
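The effect of m can be probed numerically with a small sketch. It rests on assumptions that are not in the slides: the input is 2-D, the embedding is a random Gaussian m×2 matrix `T`, and a point counts as "recoverable" when at least two linearly independent rows of `T` stay positive after ReLU (those two surviving coordinates are enough to solve for the point). All function names are hypothetical.

```python
import math
import random


def active_rows(T, x):
    """Rows of T whose pre-activation is positive, i.e. that survive ReLU."""
    return [r for r in T if r[0] * x[0] + r[1] * x[1] > 0.0]


def recoverable(T, x, eps=1e-12):
    """True if ReLU(Tx) still determines the 2-D point x.

    Assumption of this sketch: x can be solved for as soon as two
    linearly independent rows of T remain active.
    """
    rows = active_rows(T, x)
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            det = rows[i][0] * rows[j][1] - rows[i][1] * rows[j][0]
            if abs(det) > eps:
                return True
    return False


def recoverable_fraction(m, n_points=360, seed=0):
    """Fraction of points on the unit circle that survive a random m x 2 embedding."""
    rng = random.Random(seed)
    T = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(m)]
    hits = 0
    for i in range(n_points):
        angle = 2.0 * math.pi * i / n_points
        if recoverable(T, (math.cos(angle), math.sin(angle))):
            hits += 1
    return hits / n_points


if __name__ == "__main__":
    for m in (2, 3, 5, 15, 30):
        print(m, recoverable_fraction(m))
```

Under these assumptions one would expect the fraction to approach 1 as m grows, which matches the qualitative claim on the slide. The sketch says nothing about the deformation seen for very large m.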

Slide 18

Slide 18 text

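Summary of the key points for structures that use ReLU
Looking back at the observations so far, two points matter
- The region that stays non-zero after the ReLU transformation corresponds to a linear transformation
  → inserting a linear layer to extract features looks like a good idea
- Information loss caused by ReLU can be prevented by increasing the number of channels after the transformation
  → the opposite pattern to the channel changes in an ordinary residual block
A new building block is proposed using these two points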

Slide 19

Slide 19 text

Structure of MobileNetV2

Slide 20

Slide 20 text

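Inverted residuals and linear bottlenecks
The hatched layers, which carry the residual connection, use linear activation
Width represents the number of channels, designed to be larger in the middle
The intermediate layers use separable convolution
Figure quoted from https://arxiv.org/abs/1801.04381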

Slide 21

Slide 21 text

Inverted residuals and linear bottlenecks
The skip connection is only added when s=1; there is none when s≠1
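The block described on these slides (1×1 expansion, a depthwise step in the middle, a linear 1×1 projection, and a skip connection only for s=1) can be laid out as a simple plan of layers. This is a sketch, not the authors' code: the function name and tuple layout are hypothetical, the ReLU6 placement follows the block table in the paper, and the extra `c_in == c_out` condition for the skip is an assumption borrowed from common implementations so that the addition is shape-compatible.

```python
def inverted_residual_plan(c_in, c_out, t, s):
    """Describe one bottleneck block.

    Returns (layers, use_skip), where each layer is
    (name, in_channels, out_channels, stride, activation).
    """
    hidden = t * c_in  # expanded channel count in the middle of the block
    layers = [
        ("1x1 conv (expand)", c_in, hidden, 1, "ReLU6"),
        ("3x3 depthwise conv", hidden, hidden, s, "ReLU6"),
        ("1x1 conv (project)", hidden, c_out, 1, "linear"),
    ]
    # Slides: skip connection only when s == 1.
    # Assumption: also require matching channel counts for the addition.
    use_skip = (s == 1 and c_in == c_out)
    return layers, use_skip


if __name__ == "__main__":
    layers, use_skip = inverted_residual_plan(c_in=24, c_out=24, t=6, s=1)
    for layer in layers:
        print(layer)
    print("skip connection:", use_skip)
```

Note that the last layer has no nonlinearity, which is the "linear bottleneck" part of the title.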

Slide 22

Slide 22 text

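Inverted residuals and linear bottlenecks
Computational cost of the bottleneck
Consider a k×k convolution with s=1 mapping $h \times w \times d_i \to h \times w \times d_j$
normal convolution: $h \cdot w \cdot d_i \cdot d_j \cdot k^2$
separable convolution: $h \cdot w \cdot d_i \, (k^2 + d_j)$
→ bottleneck (expansion factor $t$, $d'$ input channels, $d''$ output channels): $h \cdot w \cdot d' \cdot t d' + h \cdot w \cdot t d' \cdot k^2 + h \cdot w \cdot t d' \cdot d'' = h \cdot w \cdot d' \cdot t \, (d' + k^2 + d'')$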
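The multiply-add counts can be checked with a few lines of arithmetic. The sketch below uses the cost expressions given in the paper: $h w d_i d_j k^2$ for a normal convolution, $h w d_i (k^2 + d_j)$ for a separable one, and the sum of the expand, depthwise, and project steps for the bottleneck. Function and argument names are hypothetical.

```python
def madds_standard(h, w, d_in, d_out, k):
    """Normal k x k convolution: h * w * d_in * d_out * k^2."""
    return h * w * d_in * d_out * k * k


def madds_separable(h, w, d_in, d_out, k):
    """Depthwise k x k followed by pointwise 1x1: h * w * d_in * (k^2 + d_out)."""
    return h * w * d_in * (k * k + d_out)


def madds_bottleneck(h, w, d_in, d_out, k, t):
    """Expand (1x1) + depthwise (k x k) + project (1x1) with expansion factor t."""
    hidden = t * d_in
    expand = h * w * d_in * hidden
    depthwise = h * w * hidden * k * k
    project = h * w * hidden * d_out
    return expand + depthwise + project


if __name__ == "__main__":
    args = dict(h=56, w=56, d_in=24, d_out=24, k=3)
    print("standard  :", madds_standard(**args))
    print("separable :", madds_separable(**args))
    print("bottleneck:", madds_bottleneck(t=6, **args))
```

The three-term sum in `madds_bottleneck` equals the factored form $h w d' t (d' + k^2 + d'')$, which is easy to confirm by substituting small numbers.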

Slide 23

Slide 23 text

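Inverted residuals and linear bottlenecks
Computational cost of the bottleneck
The cost is not necessarily lower than that of an ordinary residual block
The number of input channels to the bottleneck can be made relatively small, but the intermediate layers expand the channel count, so in general the outcome is not obvious
(being able to shrink the input rests on the assumption that the information lives on a low-dimensional manifold)
→ in the end the cost in multiply-adds (MAdd) can be reduced
My impression is that it amounts to this: by tuning parameters while designing the architecture, they were also able to build one with high accuracy while keeping the computational cost down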

Slide 24

Slide 24 text

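Inverted residuals and linear bottlenecks
Memory efficiency of the bottleneck
Being able to reduce the number of input channels is an advantage from the memory point of view
The maximum memory needed at inference time is determined by the following expression (as given in the paper)
$M(G) = \max_{op \in G} \left[ \sum_{A \in op_{inp}} |A| + \sum_{B \in op_{out}} |B| + |op| \right]$
The input and output dimensions are what matter, so MobileNetV2 is at an advantage
(the intermediate layers are disposable and can be processed channel by channel)
If the computation were done in parallel, I would expect information from several channels to be needed at once, but since computation on a CPU is assumed, perhaps that is the point?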
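The peak-memory idea (take, over all operations, the largest total of input sizes, output sizes, and internal storage) can be illustrated with a toy calculation. Everything below is a sketch under assumptions: the op representation is hypothetical, sizes are counted in tensor elements, and the "fused" case assumes the block is treated as a single operation whose intermediate tensor is processed one channel at a time.

```python
def peak_memory(ops):
    """ops: list of (input_sizes, output_sizes, internal_size).

    Returns the maximum over ops of sum(inputs) + sum(outputs) + internal.
    """
    return max(sum(inp) + sum(out) + internal for inp, out, internal in ops)


def bottleneck_ops(hw, c_in, c_out, t, fused):
    """Toy description of one bottleneck block with hw spatial positions."""
    hidden = t * c_in
    if fused:
        # Assumption: one op, with only one channel's worth of scratch space.
        return [([hw * c_in], [hw * c_out], hw)]
    return [
        ([hw * c_in], [hw * hidden], 0),    # 1x1 expand
        ([hw * hidden], [hw * hidden], 0),  # depthwise
        ([hw * hidden], [hw * c_out], 0),   # 1x1 project
    ]


if __name__ == "__main__":
    for fused in (False, True):
        ops = bottleneck_ops(hw=56 * 56, c_in=24, c_out=24, t=6, fused=fused)
        print("fused" if fused else "layer-by-layer", peak_memory(ops))
```

In this toy setting the fused count depends only on the narrow input and output tensors, which is the sense in which the slide calls MobileNetV2 advantageous.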

Slide 25

Slide 25 text

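MobileNetV2 model architecture
The overall picture of the default model is shown below ($n$ is the number of repetitions)
The number of channels $c$ can be rescaled with the width multiplier
Figure quoted from https://arxiv.org/abs/1801.04381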
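Applying a width multiplier comes down to scaling each layer's channel count. The sketch below is an assumption-laden illustration: the name `scale_channels`, the symbol `alpha`, and in particular the rounding to a multiple of 8 (with a guard against shrinking by more than 10%) follow a convention seen in common implementations and are not stated in the slides.

```python
def scale_channels(c, alpha, divisor=8):
    """Scale channel count c by the width multiplier alpha.

    Assumption: round to the nearest multiple of `divisor`, never go below
    `divisor`, and bump up one step if rounding lost more than 10%.
    """
    target = c * alpha
    new_c = max(divisor, int(target + divisor / 2) // divisor * divisor)
    if new_c < 0.9 * target:
        new_c += divisor
    return new_c


if __name__ == "__main__":
    example_channels = [16, 24, 32]  # example values, chosen for illustration
    for alpha in (0.5, 1.0, 1.4):
        print(alpha, [scale_channels(c, alpha) for c in example_channels])
```

A multiplier of 1.0 leaves counts that are already multiples of 8 unchanged, so the default model is recovered.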

Slide 26

Slide 26 text

Memory required at inference time with MobileNetV2
Comparison of the maximum number of channels / memory [kb]
Considerably smaller than comparable models
Figure quoted from https://arxiv.org/abs/1801.04381

Slide 27

Slide 27 text

Performance evaluation

Slide 28

Slide 28 text

Results on ImageNet classification
Performance equal to or better than NASNet and ShuffleNet, and faster
Figure quoted from https://arxiv.org/abs/1801.04381; the 1.0 and 1.4 attached to V2 are the width multiplier

Slide 29

Slide 29 text

Results on object detection with the COCO dataset
Not bad results for a fairly light model
(though the comparison with models other than MobileNetV1 is insufficient...)
Figure quoted from https://arxiv.org/abs/1801.04381

Slide 30

Slide 30 text

Results on semantic segmentation with the PASCAL VOC 2012 dataset
Not bad results for a fairly light model
(though the comparison with models other than MobileNetV1 is insufficient...)
Figure quoted from https://arxiv.org/abs/1801.04381

Slide 31

Slide 31 text

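Experiments on the importance of linearity and on where to place the skip connection (ImageNet classification)
Adding a nonlinearity to the linear bottleneck lowers accuracy
The skip connection is best placed between bottlenecks
(the former agrees with the analysis of ReLU; the latter is simply an experimental result)
Figure quoted from https://arxiv.org/abs/1801.04381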

Slide 32

Slide 32 text

Implementation with ML Kit

Slide 33

Slide 33 text

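What is ML Kit?
An SDK for building machine learning features into mobile apps
Provided as a feature of Firebase
At present, running a custom model you prepared yourself, rather than a sample, is fairly hard...
I worked at it and got it running: "Classifying food / non-food images with a self-made custom model on Firebase ML Kit"
ML Kit: https://developers.google.com/ml-kit/, Firebase: https://firebase.google.com/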

Slide 34

Slide 34 text

Summary

Slide 35

Slide 35 text

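Summary (recap)
1. A model developed from MobileNetV1
2. Proposes a building block that expands the number of channels between linear layers and places a separable convolution there
   → a structure that follows from examining the (non)linearity and expressive power of ReLU
3. In experiments, faster than NASNet with equal or better results
4. Actually ran it on mobile using ML Kit → Cookpad developer blog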