NIPS2017reading_3Dreconstruction
望月紅葉さんと幸せな家庭を築きたい
January 27, 2018
Transcript
On 3D Reconstruction: Learning a Multi-View Stereo Machine
NIPS2017 paper reading group @ Cookpad
Unless otherwise noted, material is quoted from https://arxiv.org/pdf/1708.05375.pdf
Learning a Multi-View Stereo Machine
▸ Authors
• Abhishek Kar, Christian Häne, Jitendra Malik (UC Berkeley)
▸ Overview
• Learns dense 3D reconstruction via Multi-View Stereo (MVS) end to end with deep learning
• Addresses the question of whether MVS itself can be "learned"
Background
▸ What is Multi-View Stereo? A four-stage pipeline, and every stage looks solvable with deep learning:
1. Feature extraction ← a CNN can handle this
2. Matching ← a CNN plus an RNN can handle this
3. 3D reconstruction ← deconvolution can handle this
4. Error removal ← an encoder-decoder can handle this
3D reconstruction with deep learning
▸ 3D-R2N2 (ECCV 2016)
• Encodes multiple images and matches them with an LSTM
http://3d-r2n2.stanford.edu
3D reconstruction with deep learning
▸ 3D Shape Reconstruction by Modeling 2.5D Sketch (NIPS 2017)
• Derives a 2.5D sketch from a real image, then estimates the 3D shape from that sketch, trained end to end
https://arxiv.org/pdf/1711.03129.pdf
Outline
▸ Overview
▸ Method
▸ Experiments
▸ Summary
Overview
▸ Learnt Stereo Machines
http://bair.berkeley.edu/blog/2017/09/05/unified-3d/
Method
▸ Image Encoder
• Encoder-decoder (U-Net) design
• Produces the 2D feature maps used for matching
• Output: 2D feature maps with n dimensions
Method
▸ Unprojection
• Each 2D feature map is treated as a projection of the feature map that should exist in 3D
• The 2D features are therefore back-projected onto a 3D grid
http://bair.berkeley.edu/blog/2017/09/05/unified-3d/
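The back-projection step can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation. It assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, voxel centers already expressed in the camera frame, and a single-channel feature map. Bilinear sampling is used here as one common differentiable choice.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
FeatureMap = Sequence[Sequence[float]]


def project(point: Point, fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float]:
    """Pinhole projection of a camera-frame point (z > 0) to pixel coordinates (u, v)."""
    x, y, z = point
    return fx * x / z + cx, fy * y / z + cy


def bilinear(feat: FeatureMap, u: float, v: float) -> float:
    """Bilinearly sample feat[row][col] at column u, row v; returns 0.0 outside the map."""
    h, w = len(feat), len(feat[0])
    if not (0.0 <= u <= w - 1 and 0.0 <= v <= h - 1):
        return 0.0
    u0, v0 = int(u), int(v)
    u1, v1 = min(u0 + 1, w - 1), min(v0 + 1, h - 1)
    a, b = u - u0, v - v0
    top = (1 - a) * feat[v0][u0] + a * feat[v0][u1]
    bottom = (1 - a) * feat[v1][u0] + a * feat[v1][u1]
    return (1 - b) * top + b * bottom


def unproject(feat: FeatureMap, centers: Sequence[Point],
              fx: float, fy: float, cx: float, cy: float) -> List[float]:
    """Give every voxel center the image feature found where it projects."""
    values = []
    for p in centers:
        if p[2] <= 0:  # behind the camera: no feature (assumption: fill with 0)
            values.append(0.0)
            continue
        u, v = project(p, fx, fy, cx, cy)
        values.append(bilinear(feat, u, v))
    return values


# Toy 2x2 feature map and four voxel centers (all values are made up).
feat = [[0.0, 1.0], [2.0, 3.0]]
centers = [(0.0, 0.0, 1.0), (0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 0.0, -1.0)]
grid_features = unproject(feat, centers, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

The first two voxel centers lie on the same viewing ray, so they receive the same feature. A single view cannot tell depths along a ray apart, which is why features from several views are fused afterwards.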
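Method
▸ Recurrent Grid Fusion
• The 3D feature maps from each view are matched with a Gated Recurrent Unit (GRU)
• 3D convolutions are used inside the GRU so that it can operate on the grids
• This step plays the role of the matching computation in MVS
• During training, the input order of the images is shuffled randomly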
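The sequential fusion can be sketched as follows, under heavy simplification. Where the slide describes a GRU with 3D convolutions, this sketch applies an independent scalar GRU to each voxel (in effect a 1x1x1 kernel with one channel) so that it stays dependency-free. The weight names in `w` and the function names are hypothetical.

```python
import math
import random
from typing import Dict, List, Optional, Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(h: float, x: float, w: Dict[str, float]) -> float:
    """One GRU update for a single voxel with scalar state h and scalar input x."""
    z = sigmoid(w["wz"] * x + w["uz"] * h + w["bz"])  # update gate
    r = sigmoid(w["wr"] * x + w["ur"] * h + w["br"])  # reset gate
    candidate = math.tanh(w["wh"] * x + w["uh"] * (r * h) + w["bh"])
    return (1.0 - z) * h + z * candidate


def fuse_views(grids: Sequence[Sequence[float]], w: Dict[str, float],
               rng: Optional[random.Random] = None) -> List[float]:
    """Fuse per-view grids (flattened voxel features) one view at a time.

    If rng is given, the view order is shuffled, mirroring the random input
    order used during training.
    """
    order = list(range(len(grids)))
    if rng is not None:
        rng.shuffle(order)
    state = [0.0] * len(grids[0])
    for i in order:
        state = [gru_step(h, x, w) for h, x in zip(state, grids[i])]
    return state


# Made-up weights and two tiny "grids" of three voxels each.
weights = {"wz": 0.0, "uz": 0.0, "bz": 0.0,
           "wr": 0.0, "ur": 0.0, "br": 0.0,
           "wh": 1.0, "uh": 0.0, "bh": 0.0}
views = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
fused = fuse_views(views, weights, rng=random.Random(0))
```

Shuffling the view order discourages the recurrent unit from depending on any particular input sequence.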
Method
▸ 3D Grid Reasoning
• The 3D grid produced by the GRU turned out to be noisy
• Passing it through a 3D U-Net (encode, then decode) filters the noise out
Method
▸ Differentiable Projection
• Depth reconstruction uses an L1 loss (to preserve high-frequency information)
• Voxel reconstruction uses a per-voxel cross-entropy loss
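The two losses named above can be written out as follows. This is a generic sketch, not the paper's code. Averaging over pixels and voxels, the clamping constant `eps`, and the function names are assumptions made for illustration.

```python
import math
from typing import Sequence


def l1_loss(pred_depth: Sequence[float], true_depth: Sequence[float]) -> float:
    """Mean absolute error between predicted and ground-truth depth values."""
    return sum(abs(p - t) for p, t in zip(pred_depth, true_depth)) / len(pred_depth)


def voxel_cross_entropy(probs: Sequence[float], occupancy: Sequence[int],
                        eps: float = 1e-7) -> float:
    """Mean binary cross entropy over voxels.

    probs[i] is the predicted occupancy probability of voxel i;
    occupancy[i] is its ground-truth label (0 = empty, 1 = occupied).
    """
    total = 0.0
    for p, y in zip(probs, occupancy):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


# Toy values: two depth pixels and two voxels.
depth_loss = l1_loss([1.0, 2.0], [1.5, 1.0])
voxel_loss = voxel_cross_entropy([0.5, 0.5], [1, 0])
```

The L1 loss grows linearly with the error. Compared with a squared loss, it penalizes large errors at depth discontinuities less heavily, which fits the slide's remark about high-frequency information.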
Experiments
▸ Dataset
• Uses ShapeNet data
• A public dataset of 3D CAD models
https://shapenet.cs.stanford.edu/shrec17/
Experiments
• Input images
▸ ShapeNet 3D models rendered at 224x224x3
▸ 4 images per viewpoint
▸ Camera poses
• Outputs
▸ Depth: 224x224x3
▸ Voxel: 32x32x32
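Experiments
▸ Results
• Compared with 3D-R2N2, finer detail can be reconstructed
• Compared with 3D-R2N2, reconstruction works from fewer images, and performance improves as more images are added
• Windows, which stereo matching fails to reconstruct, can also be reconstructed
• Compared with stereo matching, reconstruction works from fewer images; according to the authors, a CNN's ability to use context surpasses conventional stereo matching
• 3D reconstruction was performed by combining multiple estimated depth maps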
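Summary
▸ The paper proposes Learnt Stereo Machines
▸ Depth maps and voxel grids can be estimated from input images taken from multiple viewpoints
▸ Open issue
• The output voxel grid is small, at 32x32x32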