20180701_CVPR2018_reading_YoheiKIKUTA
yoppe
July 01, 2018
Event HP:
https://kantocv.connpass.com/event/88613
Transcript
MobileNetV2: Inverted Residuals and Linear Bottlenecks
46th Computer Vision Study Group @ Kanto, 2018-07-01
Yohei KIKUTA (@yohei_kikuta)
Event URL: https://kantocv.connpass.com/event/88613/, paper: https://arxiv.org/abs/1801.04381

Self-introduction
name: Yohei KIKUTA
company: Cookpad Inc.
twitter: @yohei_kikuta
GitHub: yoheikikuta
resume: https://github.com/yoheikikuta/resume
blog: 原理的には可能 ("Possible in Principle") https://yoheikikuta.github.io/

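Summary
1. MobileNetV2 is a model developed from MobileNetV1.
2. It proposes a building block that expands the channels between linear layers and places a separable convolution there.
   → The structure follows from an analysis of the (non)linearity and expressive power of ReLU.
3. In experiments it is faster than NASNet with comparable or better results.
4. I actually ran it on mobile using ML Kit. → Cookpad developer blog
Blog URL: https://techlife.cookpad.com/entry/2018/07/05/090000
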
Food / non-food classification with MobileNetV2 using ML Kit
https://techlife.cookpad.com/entry/2018/07/05/090000
GIF file: https://i.imgur.com/DRHVejp.gifv

Background

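We want lightweight models that fit on mobile
Machine learning looks set to keep moving onto mobile devices.
Momentum is building with the arrival of ML Kit, Create ML, and similar tools.
This matters for speed, privacy, and scalability.
In deep learning, it is one direction of architecture search.
ML Kit: https://developers.google.com/ml-kit/, Create ML: https://developer.apple.com/documentation/create_ml
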
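Directions in modeling for mobile
On the architecture side, one trend is to separate the handling of the spatial direction from the channel direction.
→ {separable, group, shuffle} convolution are representative examples.
Other techniques for making models lighter exist but are outside the scope of this paper (they can be combined with it):
 - reducing data size by lowering numerical precision
 - reducing data size by quantization and coding
 - transferring to a smaller model using distillation and similar methods
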
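Directions in modeling for mobile
Architecture search usually targets general-purpose designs (although, after experimental validation, they end up somewhat tuned to other factors).
An architecture that relies on another factor, in this paper the activation function, is specialized in that sense but can be more efficient.
This paper proposes a building block based on the properties of ReLU.
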
The road to MobileNetV2 and related work

Evolution of architectures
A selection of CNN models that introduced distinctive structures:
 - Network In Network: Global average pooling
 - VGG: Stacking of 3×3 convolution
 - ResNet: Residual connection
 - Inception(V3): Inception module
 - SqueezeNet: Fire module
 - ENet: Early stage down sampling
 - DenseNet: Dense convolution
 - Xception: Separable convolution
 - SENet: Squeeze and excitation block

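Automatic architecture search
Prepare several patterns of basic operations such as convolution, then combine them to search for the optimal building block.
 - NASNet: optimized in a reinforcement learning framework
 - AmoebaNet: optimized in an evolutionary computation framework
 - DARTS: optimized with gradient-based methods via continuous relaxation
NASNet: https://arxiv.org/abs/1707.07012, AmoebaNet: https://arxiv.org/abs/1802.01548, DARTS: https://arxiv.org/abs/1806.09055
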
Automatic architecture search
Experimental comparison (ImageNet classification)
[Figure: comparison table, quoted from https://arxiv.org/abs/1806.09055]

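Searching for lightweight architectures
Automatic search is powerful, but it only combines operations from a fixed set.
→ Architectures with few layers are cheap enough computationally that the search space is easy to cover.
Building a model that outperforms these requires a new idea.
MobileNetV2 examines structures that focus on the properties of ReLU.
→ It devises a new building block specialized for ReLU.
One senses an attitude of "it may not be general-purpose, but so what!"
It appears the authors arrived at this while thinking about how to improve MobileNetV1.
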
Properties of ReLU

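Definitions of ReLU and ReLU6
ReLU: ReLU(x) = max(0, x)
ReLU6: ReLU6(x) = min(max(0, x), 6)
[Figure: plots of both functions, created with https://www.desmos.com/calculator/865rohnewg]
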
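As a small illustration of the two definitions, here is a minimal Python sketch. The function names `relu` and `relu6` are chosen for this example and are not taken from the slides.

```python
def relu(x: float) -> float:
    """ReLU(x) = max(0, x): negative inputs are clipped to zero."""
    return max(0.0, x)


def relu6(x: float) -> float:
    """ReLU6(x) = min(max(0, x), 6): ReLU with an upper cap at 6."""
    return min(max(0.0, x), 6.0)


# Evaluate both functions below zero, inside [0, 6], and above 6.
for x in (-2.0, 3.0, 8.0):
    print(x, relu(x), relu6(x))
```

The two functions agree on inputs up to 6 and differ only above it, where ReLU6 saturates.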
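Expressive power of ReLU
Consider a transformation of the form ReLU(·) whose output keeps a non-zero volume.
Points mapped into the interior of the output are given by the linear transformation itself.
→ On the non-zero region of the output, the expressive power is that of a linear transformation.
ReLU collapses some regions, so information is lost in general, but there is a bound on the collapsed region [the map, the bound, and its condition are given as formulas on the slide].
Proof: Appendix A, proof of Theorem 1 in the paper.
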
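Making the channels large enough around ReLU avoids information loss
Map a low-dimensional manifold into dim = m dimensions, apply ReLU, and map it back.
If m is small, information is lost; if m is large, it is not.
On the other hand, if m is too large, parts of the result become noticeably deformed.
→ There is an appropriate value for the channel expansion (6x in the paper).
[Figure: quoted from https://arxiv.org/abs/1801.04381]
Note that 6x corresponds to dim=12 in this figure (arguably that case should have been plotted as well).
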
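The effect of the embedding dimension m can be checked numerically. The sketch below is an illustration of my own, not the experiment from the paper: it takes points x on the unit circle in 2D, applies a random m x 2 matrix T with unit-vector rows, and counts how many points keep at least two positive pre-activations. Those points can be solved for exactly from ReLU(Tx), assuming the two active rows are not parallel, which holds for random rows. The function name, sample count, and seed are all assumptions made for this example.

```python
import math
import random


def recoverable_fraction(m: int, n_points: int = 2000, seed: int = 0) -> float:
    """Fraction of unit-circle points x that are recoverable from ReLU(Tx).

    T is a random m x 2 matrix whose rows are unit vectors. A row is "active"
    for x when its pre-activation is positive, so ReLU passes it unchanged.
    With at least two active rows, x is the solution of a linear system.
    """
    rng = random.Random(seed)
    row_angles = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(m)]
    recovered = 0
    for _ in range(n_points):
        phi = rng.uniform(0.0, 2.0 * math.pi)
        # For unit vectors, the pre-activation of row i is cos(a_i - phi).
        active = sum(1 for a in row_angles if math.cos(a - phi) > 0.0)
        if active >= 2:
            recovered += 1
    return recovered / n_points


for m in (2, 3, 5, 15, 30):
    print(m, recoverable_fraction(m))
```

This only measures whether a point can be recovered at all. It does not capture the deformation that the slide mentions for overly large m.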
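Key points for structures that use ReLU
Looking back at the observations so far, two points matter:
 - The region that stays non-zero after ReLU corresponds to a linear transformation.
   → Inserting a linear layer to extract features looks like a good idea.
 - Information loss from ReLU can be prevented by increasing the number of channels after the transformation.
   → This is the reverse of how channel counts change in an ordinary residual block.
A new building block is proposed based on these two points.
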
Structure of MobileNetV2

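Inverted residuals and linear bottlenecks
The hatched layers, which carry the residual connection, use a linear activation.
Width represents the number of channels, and the block is designed so that it grows in the middle.
The middle layers use a separable convolution.
[Figure: quoted from https://arxiv.org/abs/1801.04381]
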
Inverted residuals and linear bottlenecks
The skip connection is added only when s=1; there is none when s≠1.

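The block described on these two slides can be sketched as a forward pass in NumPy. This is a minimal illustration under several assumptions rather than the authors' implementation: batch normalization is omitted, the weights are random, the depthwise kernel is fixed at 3x3 with zero padding, and the skip connection also checks that input and output shapes match so that the addition is well defined. All function and variable names are chosen for this example.

```python
import numpy as np


def relu6(x):
    return np.minimum(np.maximum(x, 0.0), 6.0)


def pointwise_conv(x, w):
    """1x1 convolution: x is (H, W, C_in), w is (C_in, C_out)."""
    return x @ w


def depthwise_conv3x3(x, w, stride=1):
    """3x3 depthwise convolution with zero padding 1; w is (3, 3, C)."""
    h, wd, c = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out_h = (h - 1) // stride + 1
    out_w = (wd - 1) // stride + 1
    out = np.zeros((out_h, out_w, c))
    for i in range(out_h):
        for j in range(out_w):
            r, s = i * stride, j * stride
            patch = padded[r:r + 3, s:s + 3, :]
            # Each channel is filtered independently of the others.
            out[i, j, :] = np.sum(patch * w, axis=(0, 1))
    return out


def inverted_residual(x, w_expand, w_depthwise, w_project, stride=1):
    hidden = relu6(pointwise_conv(x, w_expand))            # expand channels
    hidden = relu6(depthwise_conv3x3(hidden, w_depthwise, stride))
    out = pointwise_conv(hidden, w_project)                # linear bottleneck
    if stride == 1 and out.shape == x.shape:
        out = out + x                                      # skip only for s=1
    return out


rng = np.random.default_rng(0)
c, t = 4, 6                                  # bottleneck channels, expansion
x = rng.normal(size=(8, 8, c))
w_e = rng.normal(size=(c, c * t))
w_d = rng.normal(size=(3, 3, c * t))
w_p = rng.normal(size=(c * t, c))
print(inverted_residual(x, w_e, w_d, w_p, stride=1).shape)  # (8, 8, 4)
print(inverted_residual(x, w_e, w_d, w_p, stride=2).shape)  # (4, 4, 4)
```

Note that the projection back to `c` channels has no activation, which is the "linear bottleneck" part, while the two inner layers use ReLU6.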
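Inverted residuals and linear bottlenecks: computational cost of the bottleneck
Consider a k×k convolution with s=1 [input dimensions given as a formula on the slide].
normal convolution: [formula on the slide]
separable convolution: [formula on the slide]
→ bottleneck: [formula on the slide]
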
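The slide's formulas were images and did not survive in the transcript. As a hedged reconstruction, the sketch below counts multiply-adds the way I understand the paper to do it for stride 1: a standard convolution costs h·w·d_in·d_out·k², a depthwise separable convolution costs h·w·d_in·(k² + d_out), and the bottleneck block is the sum of its 1x1 expansion, k×k depthwise layer, and 1x1 projection. The function and parameter names are my own, and `t` denotes the expansion factor.

```python
def madds_standard(h, w, d_in, d_out, k):
    """Multiply-adds of a standard k x k convolution with stride 1."""
    return h * w * d_in * d_out * k * k


def madds_separable(h, w, d_in, d_out, k):
    """k x k depthwise over d_in channels, then a 1x1 pointwise to d_out."""
    return h * w * d_in * (k * k + d_out)


def madds_bottleneck(h, w, d_in, d_out, k, t):
    """1x1 expansion by factor t, k x k depthwise, then 1x1 linear projection."""
    expand = h * w * d_in * (t * d_in)
    depthwise = h * w * (t * d_in) * k * k
    project = h * w * (t * d_in) * d_out
    return expand + depthwise + project


# Example sizes are arbitrary and only serve to compare the three counts.
h = w = 56
d_in = d_out = 24
k, t = 3, 6
print(madds_standard(h, w, d_in, d_out, k))
print(madds_separable(h, w, d_in, d_out, k))
print(madds_bottleneck(h, w, d_in, d_out, k, t))
```

The three bottleneck terms factor into h·w·d_in·t·(d_in + k² + d_out), which makes the dependence on the expansion factor explicit.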
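Inverted residuals and linear bottlenecks: computational cost of the bottleneck
The cost is not necessarily lower than that of an ordinary residual block.
The number of input channels to the bottleneck can be made relatively small, but the middle layers expand the channels, so the overall effect is non-trivial in general.
(The input can be reduced under the assumption that the information lives on a low-dimensional manifold.)
→ As a result, the number of multiply-adds (MAdd) can be reduced.
My reading is that the authors tuned parameters while building the architecture and ended up with something that achieves high accuracy at a contained computational cost.
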
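Inverted residuals and linear bottlenecks: memory efficiency of the bottleneck
Being able to reduce the number of input channels is also an advantage for memory.
The maximum memory needed at inference time is determined by [formula on the slide].
The input and output dimensions are what matter, so MobileNetV2 has the advantage.
(The middle layers are disposable and can be processed channel by channel.)
Computing in parallel would seem to require the information of several channels at once, but since the paper assumes computation on a CPU, that is probably what is meant?
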
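The memory argument can be illustrated with a simplified model. The sketch below assumes that the memory needed by one operation is the size of its input tensor plus its output tensor, and that the peak over a chain of operations is the maximum of those sums. It ignores the storage each operation needs internally, so it is a rough approximation and not the exact formula from the slide. The function name and example sizes are assumptions.

```python
def peak_memory(tensor_sizes):
    """Peak of (input size + output size) over a chain of operations.

    tensor_sizes[i] is the element count of the tensor flowing between
    consecutive operations, so each adjacent pair is one operation.
    """
    return max(a + b for a, b in zip(tensor_sizes, tensor_sizes[1:]))


h = w = 56
c, t = 24, 6                      # bottleneck channels and expansion factor
bottleneck = h * w * c            # tensor at the block boundary
expanded = h * w * c * t          # tensor inside the block

# Materializing the expanded tensor between the three layers of the block.
naive = peak_memory([bottleneck, expanded, expanded, bottleneck])

# Treating the block as a single operation whose inner tensor is computed
# in channel chunks and discarded, so only the boundary tensors remain.
fused = peak_memory([bottleneck, bottleneck])

print(naive, fused, naive // fused)
```

Under these assumptions the fused view needs less memory by a factor equal to the expansion factor, which matches the slide's point that only the input and output dimensions matter.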
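MobileNetV2 model architecture
The overall default model is shown below (blocks are repeated).
The number of channels can be changed with the width multiplier.
[Figure: architecture table, quoted from https://arxiv.org/abs/1801.04381]
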
Memory required by MobileNetV2 at inference time
Comparison of the maximum number of channels / memory [kb].
It is considerably smaller than comparable models.
[Figure: quoted from https://arxiv.org/abs/1801.04381]

Performance evaluation

ImageNet classification results
Comparable or better performance than NASNet and ShuffleNet, and faster.
[Figure: quoted from https://arxiv.org/abs/1801.04381]
The 1.0 and 1.4 next to V2 are width multipliers.

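Object detection results on the COCO dataset
Respectable results for a very light model.
(Although the comparison against anything other than MobileNetV1 is insufficient...)
[Figure: quoted from https://arxiv.org/abs/1801.04381]
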
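Semantic segmentation results on the PASCAL VOC 2012 dataset
Respectable results for a very light model.
(Although the comparison against anything other than MobileNetV1 is insufficient...)
[Figure: quoted from https://arxiv.org/abs/1801.04381]
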
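Experiments on the importance of linearity and on where to attach the skip connection (ImageNet classification)
Adding a non-linearity to the linear bottleneck lowers accuracy.
The skip connection works best when attached between bottlenecks.
(The former agrees with the analysis of ReLU; the latter is simply an experimental result.)
[Figure: quoted from https://arxiv.org/abs/1801.04381]
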
Implementation with ML Kit

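What is ML Kit?
An SDK for building machine learning features into mobile apps.
It is provided as a feature of Firebase.
At present, running a custom model you prepared yourself, rather than a sample, takes quite a bit of work...
I got it working anyway: "Using a custom model with Firebase ML Kit to classify food / non-food images"
ML Kit: https://developers.google.com/ml-kit/, Firebase: https://firebase.google.com/
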
Summary

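Summary (recap)
1. MobileNetV2 is a model developed from MobileNetV1.
2. It proposes a building block that expands the channels between linear layers and places a separable convolution there.
   → The structure follows from an analysis of the (non)linearity and expressive power of ReLU.
3. In experiments it is faster than NASNet with comparable or better results.
4. I actually ran it on mobile using ML Kit. → Cookpad developer blog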