
Introduction to Neural 3D Reconstruction


Shunsuke Saito

April 12, 2023


Transcript

  1. Shunsuke Saito
    • Visiting Researcher, University of Pennsylvania (2014-2015)
    • PhD, University of Southern California (2015-2020)
    • Internships: FAIR, FRL, Adobe, Max Planck Institute, and others
    • Researcher, Reality Labs Research (2020-)
    • Computational Bodybuilding (SIGGRAPH 2015); PIFu/PIFuHD (ICCV 2019, CVPR 2020); SCANimate (CVPR 2021)
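  2. Framework of neural 3D reconstruction
    • Input data (encoder side): monocular image, RGB-D image, multiple images, point cloud / scan
    • Output data (decoder side): voxels, depth map (2.5D), point cloud, mesh, neural field
    • Loss function: supervised learning (reconstruction loss), self-supervised learning (inverse rendering), regularization
    • Diagram labels: encoder, decoder, input data, output data, loss function, inference, parameter update (SGD)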
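
The loop in this diagram (encode, decode, compare with a loss, update parameters by SGD) can be sketched in a few lines. This is a toy and not taken from the slides: the scalar weights `w_enc` and `w_dec` and the helper names are hypothetical, and the gradients are derived by hand for the squared reconstruction error.

```python
import random

def train(pairs, steps=2000, lr=0.05, seed=0):
    """Toy encoder -> decoder -> loss -> SGD loop with scalar weights."""
    rng = random.Random(seed)
    w_enc, w_dec = 0.5, 0.5               # encoder / decoder parameters
    for _ in range(steps):
        x, target = rng.choice(pairs)     # one random training sample
        z = w_enc * x                     # encoder: input -> latent code
        y = w_dec * z                     # decoder: latent code -> output
        err = y - target                  # reconstruction loss is err ** 2
        grad_dec = 2.0 * err * z          # d(loss)/d(w_dec)
        grad_enc = 2.0 * err * w_dec * x  # d(loss)/d(w_enc)
        w_dec -= lr * grad_dec            # SGD parameter update
        w_enc -= lr * grad_enc
    return w_enc, w_dec

def predict(params, x):
    """Inference: run encoder and decoder forward, with no update."""
    w_enc, w_dec = params
    return w_dec * (w_enc * x)

# Supervised pairs: the ground-truth output is simply 3 * input.
pairs = [(x / 10.0, 3.0 * x / 10.0) for x in range(-10, 11)]
params = train(pairs)
```

After training, `predict(params, 0.5)` should be close to 1.5. A real system replaces the scalars with networks and the hand-written gradients with automatic differentiation, but the control flow stays the same.
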
  3. Framework of neural 3D reconstruction
    • Framework diagram repeated from slide 2, with the same input data, output data, and loss function lists
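  4. Encoder-less 3D reconstruction: scene-specific 3D reconstruction
    • Input data: monocular image, RGB-D image, multiple images, point cloud / scan
    • Output data (decoder side): voxels, depth map (2.5D), point cloud, mesh, neural field
    • Loss function: self-supervised learning (inverse rendering), regularization
    • Diagram labels: decoder, input data, output data, loss function, inference, parameter update (SGD)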
  5. Output data and decoder
    • Framework diagram repeated from slide 2, with the same input data, output data, and loss function lists
  6. Output data and decoder
    • Framework diagram repeated from slide 2, with the same input data, output data, and loss function lists
  7. Voxels
    • Store the target shape on a 3D grid: occupancy, signed distance function (SDF), or TSDF
    • 3D convolutions can be used as-is
    • Memory usage is the bottleneck: $O(d^3)$
    [Choy2016; Maturana2015; Qi2016; Wu2015] Image credit: [Mescheder2019]
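
To make the $O(d^3)$ bottleneck concrete, here is a small calculation of the memory needed by one dense grid. The function name and the choice of 4 bytes per voxel (one float32 value) are assumptions for illustration; a 3D CNN additionally multiplies this by the number of feature channels and the batch size.

```python
def dense_grid_bytes(d, bytes_per_voxel=4):
    """Memory for a dense d x d x d grid (4 bytes = one float32 per voxel)."""
    return d ** 3 * bytes_per_voxel

for d in (32, 64, 128, 256, 512):
    mib = dense_grid_bytes(d) / 2 ** 20
    print(f"d={d:4d}: {mib:8.1f} MiB per grid")
```

Doubling the resolution multiplies the memory by 8, so a single 512-cubed float32 grid already takes 512 MiB.
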
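  8. Depth maps
    • A form of image-to-image translation: RGB → depth
    • State-of-the-art techniques from image research (domain transfer, GANs, etc.) are easy to apply
    • Generalizes well, but is poorly suited to detailed, category-specific reconstruction
    Figure: high-resolution depth map estimation from a single image [Miangoleh2021]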
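  9. Point cloud application example: multi-view stereo using point clouds [Chen2020]
    • Diagram labels: multi-layer local features from a CNN; CNN; coarse depth map; ground truth; residual; refinement on the point cloud; refined depth map; feature sampling on the point cloud; optimization by iteration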
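  10. Meshes
    • The most common shape representation in computer graphics → also works well with rendering engines
    • Several decoding approaches exist: fully connected (MLP), graph convolution, AtlasNet
    • Learning fine detail and handling topology changes are difficult
    Figure: 3D morphable model [Blanz1999]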
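  11. Meshes: atlases [Groueix2018; Yang2018]
    • Conventional shape representation [Fan2017]: an MLP maps the latent variable $z$ to the set of all vertices $X$: $f(z) = X$, $f: \mathbb{R}^Z \to \mathbb{R}^{n \times 3}$
    • AtlasNet: an MLP maps $z$ and an arbitrary point $P$ in texture space to the deformed 3D coordinate $p$: $f(z, P) = p$, $f: \mathbb{R}^Z \times \mathbb{R}^2 \to \mathbb{R}^3$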
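  12. Meshes: atlases [Groueix2018; Yang2018]
    • Instead of learning the distribution of vertex coordinates for the whole shape, learn the shape as a "deformation" of each plane → the same idea as texture mapping
    • Takes the continuity of the surface into account
    • The resolution is no longer fixed
    • Learning multiple atlases handles topology changes
    AtlasNet: $f(z, P) = p$, $f: \mathbb{R}^Z \times \mathbb{R}^2 \to \mathbb{R}^3$, where $P$ is an arbitrary point in texture space and $p$ is the deformed 3D coordinate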
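
The claim that the resolution is no longer fixed follows from the signature $f(z, P) = p$: the decoder is queried one 2D point at a time, so the same weights can be sampled as densely as needed. A minimal sketch, assuming a tiny randomly initialised MLP as a stand-in for a trained decoder (`make_mlp`, `mlp`, `decode_patch`, and the latent size 8 are all hypothetical, not AtlasNet's actual architecture):

```python
import math
import random

def make_mlp(sizes, seed=0):
    """Random weights for a small MLP; stands in for a trained decoder."""
    rng = random.Random(seed)
    return [([[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
             [rng.uniform(-1, 1) for _ in range(n_out)])
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def mlp(layers, x):
    """Forward pass with tanh on hidden layers and a linear output layer."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x

def decode_patch(layers, z, n):
    """Evaluate f(z, P) = p on an n x n grid of points P in the unit square."""
    points = []
    for i in range(n):
        for j in range(n):
            P = [i / (n - 1), j / (n - 1)]     # point in texture space
            points.append(mlp(layers, z + P))  # deformed 3D coordinate p
    return points

Z = 8                                  # latent size (arbitrary choice)
decoder = make_mlp([Z + 2, 16, 3])     # R^Z x R^2 -> R^3
z = [0.1] * Z                          # latent code of one shape
coarse = decode_patch(decoder, z, 4)   # 16 surface points
fine = decode_patch(decoder, z, 32)    # 1024 points from the same decoder
```

Both `coarse` and `fine` come from the same parameters; only the sampling of the texture-space square changes.
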
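  13. Neural fields (implicit surfaces)
    • Represent a 3D shape as a level set of function values: occupancy, SDF/TSDF
    • Unlike voxels, there is no resolution constraint
    • A major breakthrough for learning-based 3D reconstruction
    • Extracting an explicit shape such as a mesh requires the marching cubes algorithm
    Example: $f(x, y, z) := x^2 + y^2 + z^2 - r^2$
    Image credit: [Mescheder2019]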
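
The sphere example can be evaluated directly to see what "level set" means in practice. The sketch below uses the slide's function as-is; the bisection helper `surface_along_ray` is a hypothetical illustration of locating the zero crossing along one ray, a much simpler relative of the sign-change search that marching cubes performs on grid edges.

```python
import math

def f(x, y, z, r=1.0):
    """Implicit sphere from the slide: negative inside, zero on the surface."""
    return x * x + y * y + z * z - r * r

def occupancy(x, y, z, r=1.0):
    """Occupancy view of the same shape: 1 inside or on the surface, else 0."""
    return 1 if f(x, y, z, r) <= 0.0 else 0

def surface_along_ray(direction, r=1.0, t_max=10.0, iters=60):
    """Bisection for the zero crossing of f along a ray from the origin."""
    norm = math.sqrt(sum(c * c for c in direction))
    d = [c / norm for c in direction]
    lo, hi = 0.0, t_max                  # f < 0 at lo, f > 0 at hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(d[0] * mid, d[1] * mid, d[2] * mid, r) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

t_hit = surface_along_ray([1.0, 2.0, 2.0], r=0.5)  # distance to the surface
```

Because `f` can be queried at any continuous point, the accuracy of `t_hit` is limited by the number of bisection steps rather than by a grid spacing. Note that this `f` is an implicit function but not a true signed distance; the signed distance to the same sphere would be the point's norm minus `r`.
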
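  14. Neural fields (implicit surfaces): Neural Implicit [Chen2019; Park2019; Mescheder2019]
    • AtlasNet: an MLP maps $z$ and an arbitrary point $P$ in texture space to the deformed 3D coordinate: $f(z, P) = p$, $f: \mathbb{R}^Z \times \mathbb{R}^2 \to \mathbb{R}^3$
    • Neural Implicit: an MLP maps $z$ and an arbitrary point $P$ in 3D space to the signed distance at that query point: $f(z, P) = \mathrm{SDF}$, $f: \mathbb{R}^Z \times \mathbb{R}^3 \to \mathbb{R}$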
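  15. Decoder: summary of shape representations

                  Point cloud   Mesh     Voxels   Neural field
    Resolution    ✅/❌          ✅       ❌        ✅
    Topology      ✅             ✅/❌    ✅        ✅
    Speed         ✅             ✅       ✅/❌     ❌
    Rendering     ❌             ✅       ✅/❌     ✅

    • Quality → neural fields
    • Domains with little shape variation (e.g. faces) → meshes
    • Future trend: hybrid representations (e.g. point cloud × neural field)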
  16. Output data and decoder
    • Framework diagram repeated from slide 2, with the same input data, output data, and loss function lists
  17. Output data and decoder; loss function
    • Framework diagram repeated from slide 2, with the same input data, output data, and loss function lists
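  18. Loss function: inverse rendering
    • When no ground-truth shape is given, consider solving an inverse rendering problem from a set of images
    • Various differentiable renderers exist for each shape representation
    • Point clouds → Pulsar [Lassner2021], etc.
    • Voxels → PTN [Yan2016], etc.
    • Meshes → OpenDR [Loper2014], NMR [Kato2018], SoftRas [Liu2019a], etc.
    • Implicit functions → [Liu2019b], IDR [Yariv2020], NeRF [Mildenhall2020], etc.
    Figure: inverse rendering with meshes [Kato2018]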
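
The idea of recovering shape from images alone can be shown with a deliberately tiny inverse rendering problem. Everything in this sketch is an assumption made for illustration: the "shape" is a single disk radius, the "renderer" is a soft silhouette with a sigmoid edge of width `tau`, and the gradient is written out by hand. The renderers cited above additionally deal with cameras, visibility, and shading.

```python
import math

def render(radius, n=32, tau=0.1):
    """Soft silhouette of a centered disk on an n x n image over [-1, 1]^2."""
    image = []
    for i in range(n):
        for j in range(n):
            x = (j + 0.5) / n * 2.0 - 1.0
            y = (i + 0.5) / n * 2.0 - 1.0
            dist = math.hypot(x, y)
            # Sigmoid edge: near 1 inside the disk, near 0 outside.
            image.append(1.0 / (1.0 + math.exp(-(radius - dist) / tau)))
    return image

def fit_radius(target, init=0.3, steps=300, lr=0.1, tau=0.1):
    """Gradient descent on the mean squared image error w.r.t. the radius."""
    radius = init
    for _ in range(steps):
        pred = render(radius, tau=tau)
        grad = 0.0
        for s, t in zip(pred, target):
            # d(pixel)/d(radius) = s * (1 - s) / tau for the sigmoid edge.
            grad += 2.0 * (s - t) * s * (1.0 - s) / tau
        radius -= lr * grad / len(pred)
    return radius

target_image = render(0.6)            # the only supervision is an image
recovered = fit_radius(target_image)
```

`recovered` ends up close to 0.6 even though the optimiser never sees the true radius, only the rendered image. The soft edge matters: with a hard 0/1 silhouette the pixel values would not change smoothly with the radius and the gradient would be zero almost everywhere.
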
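  19. The early days of neural monocular reconstruction [Wu2015] [Fan2017]
    • Input data: monocular image
    • Output data: voxels, point cloud
    • Loss function: supervised learning (reconstruction loss)
    • Pipeline shown: 2D CNN (global features) → 3D CNN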
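  20. The rise of mesh representations: Pixel2Mesh [Wang2018]
    • Input data: monocular image
    • Output data: mesh
    • Loss function: supervised learning (reconstruction loss), regularization
    • Pipelines shown: 2D CNN (global features) → 3D CNN; 2D CNN (local features) → graph convolution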
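  21. Rapid progress in differentiable rendering: point clouds [Wang2019], voxels [Yan2016], meshes [Kato2018]
    • Input data: monocular image
    • Output data: voxels, mesh, point cloud
    • Loss function: self-supervised learning (inverse rendering), regularization
    • Pipelines shown: 2D CNN (global features) → 3D CNN; 2D CNN (local features) → graph convolution; 2D CNN (global features) → MLP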
  22. The neural field explosion: DeepSDF [Park2019], Occupancy Networks [Mescheder2019], IM-Net [Chen2019]
    • Input data: monocular image
    • Output data: neural field (implicit surface)
    • Loss function: supervised learning (reconstruction loss)
    • Pipelines shown: 2D CNN (global features) → 3D CNN; 2D CNN (local features) → graph convolution; 2D CNN (global features) → MLP (two rows)
  23. Better generalization with local neural fields: PIFu [Saito2019]
    • Input data: monocular image
    • Output data: neural field (implicit surface)
    • Loss function: supervised learning (reconstruction loss)
    • Pipelines shown: 2D CNN (global features) → 3D CNN; 2D CNN (local features) → graph convolution; 2D CNN (global features) → MLP (two rows); 2D CNN (local features) → MLP
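
The difference between the global-feature and local-feature pipelines is what the MLP receives for each 3D query point. A minimal sketch of the local variant, assuming an orthographic camera and a feature map given as nested lists (both are simplifications chosen here, and the function names are hypothetical rather than PIFu's code): the point is projected into the image, the feature at that pixel location is read by bilinear interpolation, and the depth is appended.

```python
def bilinear_sample(feature_map, u, v):
    """Sample a 2D feature map (rows of feature vectors) at continuous
    pixel coordinates (u = column, v = row), clamped to the image."""
    height, width = len(feature_map), len(feature_map[0])
    u = min(max(u, 0.0), width - 1.0)
    v = min(max(v, 0.0), height - 1.0)
    u0, v0 = int(u), int(v)
    u1, v1 = min(u0 + 1, width - 1), min(v0 + 1, height - 1)
    a, b = u - u0, v - v0
    channels = len(feature_map[0][0])
    return [(1 - a) * (1 - b) * feature_map[v0][u0][c]
            + a * (1 - b) * feature_map[v0][u1][c]
            + (1 - a) * b * feature_map[v1][u0][c]
            + a * b * feature_map[v1][u1][c]
            for c in range(channels)]

def pixel_aligned_input(feature_map, point):
    """Decoder input for a 3D query point: the local image feature at its
    2D projection, concatenated with its depth."""
    x, y, z = point                      # x, y in [-1, 1]; z is depth
    height, width = len(feature_map), len(feature_map[0])
    u = (x + 1.0) * 0.5 * (width - 1)    # orthographic projection (assumed)
    v = (y + 1.0) * 0.5 * (height - 1)
    return bilinear_sample(feature_map, u, v) + [z]

# A 2 x 2 feature map with 1 channel, standing in for 2D CNN output.
features = [[[0.0], [1.0]],
            [[2.0], [3.0]]]
query = pixel_aligned_input(features, (0.0, 0.0, 0.25))
```

Here `query` is `[1.5, 0.25]`: the average of the four neighbouring features plus the depth. With a global feature, every query point of the same image would receive the identical vector instead.
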
  24. Neural fields meet differentiable rendering: PixelNeRF [Yu2021]
    • Input data: monocular image
    • Output data: neural field (NeRF)
    • Loss function: self-supervised learning (inverse rendering)
    • Pipelines shown: 2D CNN (global features) → 3D CNN; 2D CNN (local features) → graph convolution; 2D CNN (global features) → MLP (two rows); 2D CNN (local features) → MLP
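
What lets a neural field be trained from images is a rendering step that is differentiable end to end. The sketch below shows the front-to-back compositing rule used for NeRF-style volume rendering along a single ray; the function name and the sample values are made up for illustration, and in a real model the densities and colors would come from the MLP.

```python
import math

def composite(densities, colors, deltas):
    """Alpha-composite samples along one ray, front to back.

    densities: volume density at each sample
    colors:    RGB triple at each sample
    deltas:    spacing between consecutive samples
    """
    transmittance = 1.0                  # fraction of light not yet absorbed
    pixel = [0.0, 0.0, 0.0]
    weights = []
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        weights.append(weight)
        for c in range(3):
            pixel[c] += weight * color[c]
        transmittance *= 1.0 - alpha
    return pixel, weights

# Empty space, then a dense red surface, then a green sample behind it.
pixel, weights = composite(
    densities=[0.0, 50.0, 50.0],
    colors=[(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    deltas=[0.5, 0.5, 0.5],
)
```

The resulting pixel is almost pure red: the empty sample contributes nothing and the green sample is hidden behind the dense one. Every operation is a smooth function of the densities and colors, so an image reconstruction loss can be backpropagated to them.
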
  25. Beyond local features: ViT-NeRF [Lin2022]
    • Input data: monocular image
    • Output data: neural field (NeRF)
    • Loss function: self-supervised learning (inverse rendering)
    • Pipelines shown: 2D CNN (global features) → 3D CNN; 2D CNN (local features) → graph convolution; 2D CNN (global features) → MLP (two rows); 2D CNN (local features) → MLP; ViT (non-local features)
  26. Summary
    • Understand the characteristics of each data representation, and design the encoder and decoder accordingly
    • Define the loss function appropriately, taking into account whether ground-truth shapes are available
    • Framework diagram repeated from slide 2, with the same input data, output data, and loss function lists
  27. References (1)
    • [Blanz1999] Blanz, Volker, and Thomas Vetter. "A morphable model for the synthesis of 3D faces." Proceedings of the 26th annual conference on Computer graphics and interactive techniques. 1999.
    • [Chen2019] Chen, Zhiqin, and Hao Zhang. "Learning implicit fields for generative shape modeling." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
    • [Choy2016] Choy, Christopher B., et al. "3d-r2n2: A unified approach for single and multi-view 3d object reconstruction." European conference on computer vision. Springer, Cham, 2016.
    • [Cosmo2020] Cosmo, Luca, et al. "Limp: Learning latent shape representations with metric preservation priors." Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part III 16. Springer International Publishing, 2020.
    • [Dai2020] Dai, Angela, Christian Diller, and Matthias Nießner. "Sg-nn: Sparse generative neural networks for self-supervised scene completion of rgb-d scans." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.
    • [Dosovitskiy2021] Alexey Dosovitskiy et al. "An image is worth 16x16 words: Transformers for image recognition at scale." arXiv preprint arXiv:2010.11929, 2020.
    • [Furukawa2009] Furukawa, Yasutaka, et al. "Manhattan-world stereo." 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2009.
    • [Fan2017] Fan, Haoqiang, Hao Su, and Leonidas J. Guibas. "A point set generation network for 3d object reconstruction from a single image." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
    • [Graham2017] Graham, Benjamin, and Laurens van der Maaten. "Submanifold sparse convolutional networks." arXiv preprint arXiv:1706.01307 (2017).
    • [Groueix2018] Groueix, Thibault, et al. "A papier-mâché approach to learning 3d surface generation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
    • [He2016] He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
    • [Jackson2017] Jackson, Aaron S., et al. "Large pose 3D face reconstruction from a single image via direct volumetric CNN regression." Proceedings of the IEEE International Conference on Computer Vision. 2017.
    • [Kato2018] Kato, Hiroharu, Yoshitaka Ushiku, and Tatsuya Harada. "Neural 3d mesh renderer." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
    • [Lassner2021] Lassner, Christoph, and Michael Zollhofer. "Pulsar: Efficient Sphere-based Neural Rendering." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.
  28. References (2)
    • [Lin2022] Kai-En Lin, Lin Yen-Chen, Wei-Sheng Lai, Tsung-Yi Lin, Yi-Chang Shih, and Ravi Ramamoorthi. "Vision transformer for nerf-based view synthesis from a single input image." arXiv preprint arXiv:2207.05736, 2022.
    • [Ling2022] Selena Ling, Nicholas Sharp, and Alec Jacobson. "Vectoradam for rotation equivariant geometry optimization." arXiv preprint arXiv:2205.13599, 2022.
    • [Liu2019a] Liu, Shichen, et al. "Soft rasterizer: A differentiable renderer for image-based 3d reasoning." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019.
    • [Liu2019b] Liu, Shichen, et al. "Learning to infer implicit surfaces without 3d supervision." NeurIPS 2019.
    • [Liu2022] Hsueh-Ti Derek Liu, Francis Williams, Alec Jacobson, Sanja Fidler, and Or Litany. "Learning smooth neural functions via lipschitz regularization." SIGGRAPH, 2022.
    • [Loper2014] Loper, Matthew M., and Michael J. Black. "OpenDR: An approximate differentiable renderer." European Conference on Computer Vision. Springer, Cham, 2014.
    • [Ma2021] Ma, Qianli, et al. "SCALE: Modeling clothed humans with a surface codec of articulated local elements." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.
    • [Maturana2015] Maturana, Daniel, and Sebastian Scherer. "Voxnet: A 3d convolutional neural network for real-time object recognition." 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2015.
    • [Mescheder2019] Mescheder, Lars, et al. "Occupancy networks: Learning 3d reconstruction in function space." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
    • [Mildenhall2020] Mildenhall, Ben, et al. "Nerf: Representing scenes as neural radiance fields for view synthesis." European conference on computer vision. Springer, Cham, 2020.
    • [Miangoleh2021] Miangoleh, S. Mahdi H., et al. "Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.
    • [Mueller2022] Thomas Mueller, Alex Evans, Christoph Schied, and Alexander Keller. "Instant neural graphics primitives with a multiresolution hash encoding." arXiv preprint arXiv:2201.05989, 2022.
    • [Newell2016] Newell, Alejandro, Kaiyu Yang, and Jia Deng. "Stacked hourglass networks for human pose estimation." European conference on computer vision. Springer, Cham, 2016.
    • [Nicolet2021] Baptiste Nicolet, Alec Jacobson, and Wenzel Jakob. "Large steps in inverse rendering of geometry." ACM Transactions on Graphics (TOG), Vol. 40, No. 6, pp. 1–13, 2021.
    • [Park2019] Park, Jeong Joon, et al. "Deepsdf: Learning continuous signed distance functions for shape representation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
    • [Peng2020] Peng, Songyou, et al. "Convolutional occupancy networks." Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part III 16. Springer International Publishing, 2020.
    • [Qi2016] Qi, Charles R., et al. "Volumetric and multi-view cnns for object classification on 3d data." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
  29. References (3)
    • [Qi2017] Qi, Charles R., et al. "Pointnet: Deep learning on point sets for 3d classification and segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
    • [Qi2017b] Qi, Charles R., et al. "Pointnet++: Deep hierarchical feature learning on point sets in a metric space." arXiv preprint arXiv:1706.02413 (2017).
    • [Ranjan2018] Ranjan, Anurag, et al. "Generating 3D faces using convolutional mesh autoencoders." Proceedings of the European Conference on Computer Vision (ECCV). 2018.
    • [Riegler2017] Riegler, Gernot, Ali Osman Ulusoy, and Andreas Geiger. "Octnet: Learning deep 3d representations at high resolutions." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
    • [Saito2018] Saito, Shunsuke, et al. "3D hair synthesis using volumetric variational autoencoders." ACM Transactions on Graphics (TOG) 37.6 (2018): 1-12.
    • [Saito2019] Saito, Shunsuke, et al. "Pifu: Pixel-aligned implicit function for high-resolution clothed human digitization." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019.
    • [Saito2020] Saito, Shunsuke, et al. "Pifuhd: Multi-level pixel-aligned implicit function for high-resolution 3d human digitization." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.
    • [Saito2021] Saito, Shunsuke, et al. "SCANimate: Weakly supervised learning of skinned clothed avatar networks." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.
    • [Simonyan2014] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
    • [Tancik2020] Tancik, Matthew, et al. "Fourier features let networks learn high frequency functions in low dimensional domains." arXiv preprint arXiv:2006.10739 (2020).
    • [Yan2016] Xinchen Yan, Jimei Yang, Ersin Yumer, Yijie Guo, and Honglak Lee. "Perspective transformer nets: Learning single-view 3d object reconstruction without 3d supervision." Advances in neural information processing systems, Vol. 29, 2016. Also available as arXiv preprint arXiv:1612.00814 (2016).
    • [Yariv2020] Yariv, Lior, et al. "Multiview neural surface reconstruction by disentangling geometry and appearance." arXiv preprint arXiv:2003.09852 (2020).
    • [Yao2018] Yao, Yao, et al. "Mvsnet: Depth inference for unstructured multi-view stereo." Proceedings of the European Conference on Computer Vision (ECCV). 2018.
    • [Yang2018] Yang, Yaoqing, et al. "Foldingnet: Point cloud auto-encoder via deep grid deformation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
    • [Yu2021] Alex Yu, Vickie Ye, Matthew Tancik, and Angjoo Kanazawa. "pixelnerf: Neural radiance fields from one or few images." In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
    • [Wang2018] Nanyang Wang, Yinda Zhang, Zhuwen Li, Yanwei Fu, Wei Liu, and Yu-Gang Jiang. "Pixel2mesh: Generating 3d mesh models from single rgb images." In Proceedings of the European conference on computer vision (ECCV), pp. 52–67, 2018.
    • [Wang2019] Wang Yifan, Felice Serena, Shihao Wu, Cengiz Öztireli, and Olga Sorkine-Hornung. "Differentiable surface splatting for point-based geometry processing." ACM Transactions on Graphics (TOG), Vol. 38, No. 6, pp. 1–14, 2019.
    • [Wu2015] Wu, Zhirong, et al. "3d shapenets: A deep representation for volumetric shapes." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.