Slide 1
Slide 1 text
On 3D Reconstruction: Learning a Multi-View Stereo Machine
NIPS2017 paper reading group @ Cookpad
Unless otherwise noted, material is quoted from https://arxiv.org/pdf/1708.05375.pdf
Slide 2
Slide 2 text
Learning a Multi-View Stereo Machine
▸ Authors
  • Abhishek Kar, Christian Häne, Jitendra Malik (UC Berkeley)
▸ Overview
  • Learns dense 3D reconstruction by Multi View Stereo (MVS) end-to-end with deep learning
  • Addresses the question of whether MVS itself can be "learned"
Slide 3
Slide 3 text
Background
▸ What is Multi View Stereo?
  1. Feature extraction
  2. Matching
  3. 3D reconstruction
  4. Error removal
Slide 4
Slide 4 text
Background
▸ What is Multi View Stereo?
  1. Feature extraction
  2. Matching
  3. 3D reconstruction
  4. Error removal
==> Going deep looks like it could solve all of these
Slide 5
Slide 5 text
Background
▸ What is Multi View Stereo?
  1. Feature extraction ← a CNN can handle this
  2. Matching
  3. 3D reconstruction
  4. Error removal
Slide 6
Slide 6 text
Background
▸ What is Multi View Stereo?
  1. Feature extraction
  2. Matching ← a CNN plus an RNN can handle this
  3. 3D reconstruction
  4. Error removal
Slide 7
Slide 7 text
Background
▸ What is Multi View Stereo?
  1. Feature extraction
  2. Matching
  3. 3D reconstruction ← deconvolution can handle this
  4. Error removal
Slide 8
Slide 8 text
Background
▸ What is Multi View Stereo?
  1. Feature extraction
  2. Matching
  3. 3D reconstruction
  4. Error removal ← an Encoder-Decoder can handle this
Slide 9
Slide 9 text
3D reconstruction with deep learning
▸ 3D-R2N2 (ECCV2016)
  • Encodes multiple images and matches them with an LSTM
http://3d-r2n2.stanford.edu
Slide 10
Slide 10 text
3D reconstruction with deep learning
▸ 3D Shape Reconstruction by Modeling 2.5D Sketch (NIPS2017)
  • Derives a 2.5D sketch from a real image, then estimates the 3D shape from that 2.5D sketch, trained end-to-end
https://arxiv.org/pdf/1711.03129.pdf
Slide 11
Slide 11 text
Contents
▸ Overview
▸ Method
▸ Experiments
▸ Summary
Slide 12
Slide 12 text
Overview
http://bair.berkeley.edu/blog/2017/09/05/unified-3d/
Slide 13
Slide 13 text
Overview
Learnt Stereo Machines
Slide 14
Slide 14 text
Method
▸ Image Encoder
  • Encoder-Decoder (U-Net) design
  • Produces the 2D feature maps used for matching
  • n-dimensional 2D feature maps
Slide 15
Slide 15 text
Method
▸ Unprojection
▸ A 2D feature map is treated as a projection of the feature map that should exist in 3D
▸ Back-project it onto a 3D grid
http://bair.berkeley.edu/blog/2017/09/05/unified-3d/
Slide 16
Slide 16 text
Method
▸ Unprojection
▸ A 2D feature map is treated as a projection of the feature map that should exist in 3D
▸ Back-project it onto a 3D grid
http://bair.berkeley.edu/blog/2017/09/05/unified-3d/
Slide 17
Slide 17 text
Method
▸ Unprojection
▸ A 2D feature map is treated as a projection of the feature map that should exist in 3D
▸ Back-project it onto a 3D grid
http://bair.berkeley.edu/blog/2017/09/05/unified-3d/
Slide 18
Slide 18 text
Method
▸ Unprojection
▸ A 2D feature map is treated as a projection of the feature map that should exist in 3D
▸ Back-project it onto a 3D grid
http://bair.berkeley.edu/blog/2017/09/05/unified-3d/
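The unprojection step can be illustrated with a small sketch. This is not the authors' implementation: it assumes a pinhole camera with known intrinsics `K` and pose `R`, `t`, and the names `project_point`, `bilinear_sample`, and `unproject` are hypothetical helpers introduced only for illustration. Each voxel centre is projected into the image and receives the 2D feature found there, so every voxel along one viewing ray ends up with the same feature.

```python
from math import floor

def project_point(point, K, R, t):
    """Project a 3D world point to pixel coordinates (assumed pinhole model)."""
    # Camera coordinates: X_cam = R * X_world + t
    cam = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    if cam[2] <= 0:
        return None  # point is behind the camera
    u = K[0][0] * cam[0] / cam[2] + K[0][2]
    v = K[1][1] * cam[1] / cam[2] + K[1][2]
    return u, v

def bilinear_sample(feature_map, u, v):
    """Bilinearly sample an H x W x C feature map (nested lists) at pixel (u, v)."""
    h, w = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    if not (0 <= u <= w - 1 and 0 <= v <= h - 1):
        return [0.0] * channels  # outside the image: zero feature (an assumption)
    x0, y0 = int(floor(u)), int(floor(v))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = u - x0, v - y0
    return [
        (1 - ay) * ((1 - ax) * feature_map[y0][x0][c] + ax * feature_map[y0][x1][c])
        + ay * ((1 - ax) * feature_map[y1][x0][c] + ax * feature_map[y1][x1][c])
        for c in range(channels)
    ]

def unproject(feature_map, K, R, t, grid_size, grid_min, voxel_size):
    """Fill a grid_size^3 grid with the feature seen at each voxel centre's projection."""
    channels = len(feature_map[0][0])
    grid = {}
    for ix in range(grid_size):
        for iy in range(grid_size):
            for iz in range(grid_size):
                centre = [grid_min[a] + (idx + 0.5) * voxel_size
                          for a, idx in enumerate((ix, iy, iz))]
                pix = project_point(centre, K, R, t)
                grid[(ix, iy, iz)] = (bilinear_sample(feature_map, *pix)
                                      if pix is not None else [0.0] * channels)
    return grid

# Toy example: a 4x4 single-channel map whose value equals the column index.
feature_map = [[[float(x)] for x in range(4)] for _ in range(4)]
K = [[2.0, 0.0, 1.5], [0.0, 2.0, 1.5], [0.0, 0.0, 1.0]]
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 2.0]
grid = unproject(feature_map, K, R, t, grid_size=2,
                 grid_min=[-0.5, -0.5, -0.5], voxel_size=0.5)
```

Running this once per input view yields one 3D feature grid per image, which is the input the fusion stage expects.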
Slide 19
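Slide 19 text
Method
▸ Recurrent Grid Fusion
  • Matching across the 3D feature maps is done with a Gated Recurrent Unit (GRU)
  • 3D convolutions are used to feed the grids into the GRU
  • This stage plays the role of the matching computation in MVS
  • During training, the input order of the images is randomly shuffled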
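A heavily simplified sketch of recurrent fusion follows. It is an assumption-laden illustration, not the method from the paper: the 3D convolutions are replaced by scalar weights shared across cells, each grid is a flat list with one value per cell, and `gru_step`, `fuse_views`, and the `params` keys are hypothetical names. It only shows how a GRU state accumulates evidence view by view, with the view order optionally shuffled.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def gru_step(h, x, p):
    """One standard GRU update for a single grid cell with scalar weights."""
    z = sigmoid(p["wz"] * x + p["uz"] * h + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h + p["br"])  # reset gate
    cand = math.tanh(p["wh"] * x + p["uh"] * (r * h) + p["bh"])  # candidate state
    return (1.0 - z) * h + z * cand

def fuse_views(view_grids, params, rng=None):
    """Fuse per-view grids (flat lists of equal length) into one hidden grid."""
    order = list(range(len(view_grids)))
    if rng is not None:
        rng.shuffle(order)  # random view order, as done during training
    hidden = [0.0] * len(view_grids[0])
    for i in order:
        hidden = [gru_step(h, x, params) for h, x in zip(hidden, view_grids[i])]
    return hidden

# Toy example: three views of a three-cell grid (values are made up).
params = {"wz": 1.0, "uz": 0.5, "bz": 0.0,
          "wr": 1.0, "ur": 0.5, "br": 0.0,
          "wh": 1.0, "uh": 0.5, "bh": 0.0}
views = [[0.2, -0.1, 0.9], [0.4, 0.0, 0.7], [0.1, 0.3, 0.8]]
fused = fuse_views(views, params, rng=random.Random(0))
```

Shuffling the order during training discourages the network from depending on any particular view sequence.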
Slide 20
Slide 20 text
Method
▸ 3D Grid Reasoning
  • The 3D grid produced by the GRU turned out to be noisy
  • Encoding and decoding it with a 3D U-Net filters the noise out
Slide 21
Slide 21 text
Method
▸ Differentiable Projection
  • Depth reconstruction uses an L1 loss (to preserve high-frequency information)
  • Voxel reconstruction uses a per-voxel cross entropy loss
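The two losses named above can be written out as a minimal sketch. Averaging over elements (rather than summing), the clamping constant `eps`, and the function names `l1_depth_loss` and `voxel_cross_entropy` are assumptions made here for illustration; the slide does not specify them.

```python
import math

def l1_depth_loss(pred, target):
    """Mean absolute error between predicted and ground-truth depths (flat lists)."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def voxel_cross_entropy(pred_occupancy, target_occupancy, eps=1e-7):
    """Mean binary cross entropy over voxels.

    pred_occupancy holds probabilities in [0, 1]; target_occupancy holds 0 or 1.
    """
    total = 0.0
    for p, y in zip(pred_occupancy, target_occupancy):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(pred_occupancy)

depth_loss = l1_depth_loss([1.0, 2.0, 4.0], [1.0, 3.0, 2.0])
voxel_loss = voxel_cross_entropy([0.5, 0.5], [1, 0])
```

Compared with a squared error, the L1 term penalizes large depth errors less strongly, which is one common reason it blurs sharp depth edges less.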
Slide 22
Slide 22 text
Experiments
▸ Dataset
  • Uses ShapeNet data
  • A public dataset of 3D CAD models
https://shapenet.cs.stanford.edu/shrec17/
Slide 23
Slide 23 text
Experiments
• Input images
  ▸ ShapeNet 3D models rendered at 224x224x3
  ▸ 4 images per viewpoint
  ▸ Camera poses
• Output
  ▸ Depth: 224x224x3
  ▸ Voxel: 32x32x32
Slide 24
Slide 24 text
Experiments
▸ Results
Compared with 3D-R2N2, finer detail can be reconstructed
Slide 25
Slide 25 text
Experiments
▸ Results
Compared with 3D-R2N2, reconstruction works with fewer images
Performance improves as the number of images increases
Slide 26
Slide 26 text
Experiments
▸ Results
Windows, which stereo matching fails to reconstruct, can also be reconstructed
Slide 27
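Slide 27 text
Experiments
▸ Results
Compared with stereo matching, reconstruction works with fewer images
According to the authors, a CNN's ability to use context surpasses conventional stereo matching
3D reconstruction was performed by combining multiple DepthMap estimates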
Slide 28
Slide 28 text
Summary
▸ Proposes Learnt Stereo Machines
▸ Makes it possible to estimate both a DepthMap and a Voxel grid from multi-view input images
▸ Open issue
  • The output Voxel grid is small, at 32x32x32