
Introduction to Neural 3D Reconstruction

Shunsuke Saito

April 12, 2023

Transcript

  1. Shunsuke Saito
    • Visiting Researcher, University of Pennsylvania (2014-2015)
    • PhD, University of Southern California (2015-2020)
    • Internships: FAIR, FRL, Adobe, Max Planck Institute, and others
    • Researcher, Reality Labs Research (2020-)
    Works shown: Computational Body Building (SIGGRAPH 2015), PIFu/PIFuHD (ICCV 2019, CVPR 2020), SCANimate (CVPR 2021)
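  2. Framework for neural 3D reconstruction
    • Input data: monocular image, image with depth (RGB-D), multiple images, point cloud / scan
    • Output data: voxels, depth map (2.5D), point cloud, mesh, neural field
    • Loss function: supervised learning (reconstruction loss), self-supervised learning (inverse rendering), regularization
    [Diagram labels: encoder, decoder, input data, output data, loss function, inference, parameter update (SGD)]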
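The diagram on this slide (input data, encoder, decoder, output data, loss function, parameter update by SGD) can be illustrated with a deliberately tiny example. The following is only a sketch under strong assumptions: the "encoder" and "decoder" are single scalar weights, the data is a toy mapping y = 2x, and all names (`encoder`, `decoder`, `train`) are hypothetical and not taken from the talk.

```python
import random

def encoder(x, w_enc):
    """Input data -> latent code (here: a single scalar weight)."""
    return w_enc * x

def decoder(z, w_dec):
    """Latent code -> output data (again a single scalar weight)."""
    return w_dec * z

def train(samples, w_enc=0.5, w_dec=0.5, lr=0.05, epochs=200, seed=0):
    """Supervised training with a squared reconstruction loss and plain SGD."""
    rng = random.Random(seed)
    samples = list(samples)
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y_true in samples:
            z = encoder(x, w_enc)              # inference: forward pass
            y_pred = decoder(z, w_dec)
            err = 2.0 * (y_pred - y_true)      # d(loss)/d(y_pred) for (y_pred - y_true)^2
            g_dec = err * z                    # gradient w.r.t. decoder weight
            g_enc = err * w_dec * x            # gradient w.r.t. encoder weight
            w_dec -= lr * g_dec                # parameter update (SGD)
            w_enc -= lr * g_enc
    return w_enc, w_dec

# Toy supervised pairs with ground truth y = 2x (an assumption for illustration).
data = [(k / 4.0, 2.0 * k / 4.0) for k in range(-4, 5)]
w_enc, w_dec = train(data)
print(f"encoder * decoder = {w_enc * w_dec:.4f}")
```

In a real system the two scalars would be replaced by networks such as a 2D CNN and an MLP, but the loop structure (forward pass, loss, gradient, update) stays the same.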
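  3. Framework for neural 3D reconstruction
    • Output data (decoder): voxels, depth map (2.5D), point cloud, mesh, neural field
    • Loss function: supervised learning (reconstruction loss), self-supervised learning (inverse rendering), regularization
    • Input data (encoder): monocular image, image with depth (RGB-D), multiple images, point cloud / scan
    [Diagram labels: encoder, decoder, input data, output data, loss function, inference, parameter update (SGD)]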
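  4. Encoder-less 3D reconstruction: scene-specific 3D reconstruction
    • Input data: monocular image, image with depth (RGB-D), multiple images, point cloud / scan
    • Output data (decoder): voxels, depth map (2.5D), point cloud, mesh, neural field
    • Loss function: self-supervised learning (inverse rendering), regularization
    [Diagram labels: decoder, input data, output data, loss function, inference, parameter update (SGD)]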
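  5. Output data and decoder
    • Input data: monocular image, image with depth (RGB-D), multiple images, point cloud / scan
    • Output data: voxels, depth map (2.5D), point cloud, mesh, neural field
    • Loss function: supervised learning (reconstruction loss), self-supervised learning (inverse rendering), regularization
    [Diagram labels: encoder, decoder, input data, output data, loss function, inference, parameter update (SGD)]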
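  6. Output data and decoder
    • Input data (encoder): monocular image, image with depth (RGB-D), multiple images, point cloud / scan
    • Loss function: supervised learning (reconstruction loss), self-supervised learning (inverse rendering), regularization
    • Output data (decoder): voxels, depth map (2.5D), point cloud, mesh, neural field
    [Diagram labels: encoder, decoder, input data, output data, loss function, inference, parameter update (SGD)]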
  7. Voxels
    • Store the target shape in a 3D grid
      • Occupancy
      • Signed distance function (SDF)
      • TSDF
    • 3D convolutions can be used as-is
    • Memory usage is the bottleneck: $O(d^3)$
    [Choy2016; Maturana2015; Qi2016; Wu2015] Image credit [Mescheder2019]
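The cubic memory growth noted on this slide can be checked with a quick calculation. This is a minimal sketch, assuming a dense grid that stores one 4-byte float per voxel; the helper name `voxel_grid_bytes` is hypothetical.

```python
def voxel_grid_bytes(d: int, bytes_per_voxel: int = 4) -> int:
    """Memory of a dense d x d x d grid, assuming one float32 value per voxel."""
    return d ** 3 * bytes_per_voxel

# Doubling the resolution d multiplies the memory by 8.
for d in (32, 64, 128, 256, 512):
    mib = voxel_grid_bytes(d) / 2 ** 20
    print(f"d={d:4d}: {mib:8.3f} MiB")
```

The figures cover only a single scalar grid; intermediate feature volumes inside a 3D CNN would multiply them further by the channel count.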
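  8. Depth maps
    • A form of image-to-image translation: RGB → Depth
    • The latest techniques from image research are easy to apply (domain transfer, GANs, etc.)
    • Generalizes well, but is not suited to fine-grained, category-specific reconstruction
    Figure: high-resolution depth map estimation from a single image [Miangoleh2021]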
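  9. Application example of point clouds: multi-view stereo using point clouds [Chen2020]
    [Figure labels: multi-layer local features from a CNN; CNN; coarse depth map; ground truth; residual; refinement on the point cloud; feature sampling on the point cloud; iterative optimization; refined depth map]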
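  10. Meshes
    • The most common shape representation in computer graphics → works well with rendering engines
    • Several decoding methods exist
      • Fully connected (MLP)
      • Graph convolution
      • AtlasNet
    • Learning fine detail and handling topology changes are difficult
    Figure: 3D morphable model [Blanz1999]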
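  11. Meshes: atlas [Groueix2018; Yang2018]
    • Conventional shape representation [Fan2017]: $f(z) = X,\ \mathbb{R}^Z \to \mathbb{R}^{n \times 3}$ (an MLP maps the latent variable $z$ to the set of all vertices $X$)
    • AtlasNet: $f(z, P) = p,\ \mathbb{R}^Z \times \mathbb{R}^2 \to \mathbb{R}^3$ (an MLP maps $z$ and an arbitrary point $P$ in texture space to the deformed 3D coordinate $p$)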
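  12. Meshes: atlas [Groueix2018; Yang2018]
    • Instead of learning the distribution of vertex coordinates for the whole shape, learn the shape as a "deformation" of each plane, in the same way as texture mapping
    • Takes the continuity of the surface into account
    • The resolution is no longer fixed
    • Learning multiple atlases handles topology changes
    AtlasNet: $f(z, P) = p,\ \mathbb{R}^Z \times \mathbb{R}^2 \to \mathbb{R}^3$, where $P$ is an arbitrary point in texture space and $p$ is the deformed 3D coordinate (an MLP takes $z$ and $P$ as input)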
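The signature $f(z, P) = p$ and the point that the resolution is no longer fixed can be made concrete with a small sketch. The code below assumes a plain fully connected network with random, untrained weights and an 8-dimensional latent code; the names `make_mlp`, `atlas_decode` and `sample_patch` are hypothetical, and this is not the AtlasNet implementation.

```python
import math
import random

def make_mlp(sizes, seed=0):
    """Random (untrained) weights for a fully connected network."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers

def mlp_forward(layers, x):
    """Apply each linear layer, with ReLU on all but the last."""
    for idx, (weights, bias) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
        if idx < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x

def atlas_decode(layers, z, uv):
    """f(z, P) = p: latent code z in R^Z plus a 2D point P gives a 3D point p."""
    return mlp_forward(layers, list(z) + list(uv))

def sample_patch(layers, z, res):
    """Query the same decoder on a res x res grid over the unit square."""
    return [atlas_decode(layers, z, (i / (res - 1), j / (res - 1)))
            for i in range(res) for j in range(res)]

Z_DIM = 8                                   # assumed latent size
decoder = make_mlp([Z_DIM + 2, 32, 32, 3])
z = [0.1] * Z_DIM
coarse = sample_patch(decoder, z, 4)        # 16 surface points
fine = sample_patch(decoder, z, 16)         # 256 surface points, same decoder
print(len(coarse), len(fine))
```

Because the 2D point is an input rather than an output index, the sampling density is chosen at query time, which is the property the slide highlights.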
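  13. Neural fields (implicit surfaces)
    • Represent a 3D shape as a level set of function values
      • Occupancy
      • SDF/TSDF
    • Unlike voxels, there is no resolution constraint
    • A major breakthrough in learning-based 3D reconstruction
    • Extracting an explicit shape such as a mesh requires the marching cubes algorithm
    Example: $f(x, y, z) := x^2 + y^2 + z^2 - r^2$
    Image credit [Mescheder2019]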
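The sphere function on this slide can be queried at any resolution, which is what distinguishes an implicit representation from a fixed voxel grid. The sketch below evaluates it at cell centers and estimates the enclosed volume; the radius of 0.5, the sampling bounds and the helper names are assumptions for illustration. Note that this $f$ is an implicit function whose zero level set is the sphere, not a true signed distance.

```python
import math

def sphere_implicit(x, y, z, r=0.5):
    """f(x, y, z) = x^2 + y^2 + z^2 - r^2: negative inside, zero on the surface."""
    return x * x + y * y + z * z - r * r

def occupancy_volume(f, res, lo=-1.0, hi=1.0):
    """Estimate the enclosed volume by testing f < 0 at res^3 cell centers."""
    step = (hi - lo) / res
    centers = [lo + (k + 0.5) * step for k in range(res)]
    inside = sum(1 for x in centers for y in centers for z in centers
                 if f(x, y, z) < 0.0)
    return inside * step ** 3

exact = 4.0 / 3.0 * math.pi * 0.5 ** 3
for res in (8, 16, 32):
    print(res, round(occupancy_volume(sphere_implicit, res), 4))
print("exact", round(exact, 4))
```

Turning such samples into a mesh would still need a surface extraction step such as marching cubes, which is not shown here.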
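  14. Neural fields (implicit surfaces)
    • AtlasNet: $f(z, P) = p,\ \mathbb{R}^Z \times \mathbb{R}^2 \to \mathbb{R}^3$, where $P$ is an arbitrary point in texture space and $p$ is the deformed 3D coordinate
    • Neural Implicit [Chen/Park/Mescheder2019]: $f(z, P) = \mathrm{SDF},\ \mathbb{R}^Z \times \mathbb{R}^3 \to \mathbb{R}$, where $P$ is an arbitrary point in 3D space and the output is the signed distance at that query point
    [Diagram: in both cases an MLP takes $z$ and $P$ as input]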
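  15. Decoders: summary of shape representations

    Criterion  | Point cloud | Mesh  | Voxels | Neural field
    Resolution | ✅/❌       | ✅    | ❌     | ✅
    Topology   | ✅          | ✅/❌ | ✅     | ✅
    Speed      | ✅          | ✅    | ✅/❌  | ❌
    Rendering  | ❌          | ✅    | ✅/❌  | ✅

    • Quality → neural fields
    • Domains with little shape variation (e.g., faces) → meshes
    • Upcoming trend: hybrid representations (e.g., point cloud × neural field)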
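  16. Output data and decoder
    • Input data: monocular image, image with depth (RGB-D), multiple images, point cloud / scan
    • Output data: voxels, depth map (2.5D), point cloud, mesh, neural field
    • Loss function: supervised learning (reconstruction loss), self-supervised learning (inverse rendering), regularization
    [Diagram labels: encoder, decoder, input data, output data, loss function, inference, parameter update (SGD)]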
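  17. Loss function
    • Output data (decoder): voxels, depth map (2.5D), point cloud, mesh, neural field
    • Input data (encoder): monocular image, image with depth (RGB-D), multiple images, point cloud / scan
    • Loss function: supervised learning (reconstruction loss), self-supervised learning (inverse rendering), regularization
    [Diagram labels: encoder, decoder, input data, output data, output data and decoder, inference, parameter update (SGD), loss function]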
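  18. Loss function: inverse rendering
    • When no ground-truth shape is available, consider solving an inverse rendering problem from a set of images
    • Various differentiable renderers exist for each shape representation
      • Point cloud → Pulsar [Lassner2021], etc.
      • Voxels → PTN [Yan2016], etc.
      • Mesh → OpenDR [Loper2014], NMR [Kato2018], SoftRas [Liu2019a], etc.
      • Implicit function → [Liu2019b], IDR [Yariv2020], NeRF [Mildenhall2020], etc.
    Figure: inverse rendering with meshes [Kato2018]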
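The idea of recovering shape from images through a differentiable renderer can be shown on a toy problem. The sketch below is not any of the renderers cited on this slide: it assumes a single unknown (the radius of a disk), a soft sigmoid silhouette as the "renderer", an analytic gradient, and plain gradient descent. All names and constants (`RES`, `TAU`, `fit_radius`) are hypothetical.

```python
import math

RES, TAU = 32, 0.05  # assumed image size and edge softness

def pixel_distances(res=RES):
    """Distance of each pixel center from the image center, on [-1, 1]^2."""
    coords = [(k + 0.5) / res * 2.0 - 1.0 for k in range(res)]
    return [math.hypot(x, y) for y in coords for x in coords]

def render(radius, dists, tau=TAU):
    """Soft silhouette of a disk: sigmoid((radius - dist) / tau) per pixel."""
    return [1.0 / (1.0 + math.exp(-(radius - d) / tau)) for d in dists]

def loss_and_grad(radius, target, dists, tau=TAU):
    """Mean squared image error and its analytic derivative w.r.t. the radius."""
    pred = render(radius, dists, tau)
    n = len(pred)
    loss = sum((p - t) ** 2 for p, t in zip(pred, target)) / n
    grad = sum(2.0 * (p - t) * p * (1.0 - p) / tau
               for p, t in zip(pred, target)) / n
    return loss, grad

def fit_radius(target, dists, init=0.2, lr=0.1, steps=200):
    """Gradient descent on the image loss; no ground-truth shape is used."""
    radius = init
    for _ in range(steps):
        _, grad = loss_and_grad(radius, target, dists)
        radius -= lr * grad
    return radius

dists = pixel_distances()
target = render(0.6, dists)   # stands in for an observed image
fitted = fit_radius(target, dists)
print(f"recovered radius: {fitted:.3f}")
```

The soft edge is what makes the image loss differentiable with respect to the shape parameter; a hard binary silhouette would give zero gradient almost everywhere.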
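  19. The early days of neural monocular reconstruction [Wu2015] [Fan2017]
    • Input data: monocular image
    • Output data: voxels, point cloud
    • Loss function: supervised learning (reconstruction loss)
    [Diagram: encoder / decoder pair: 2D CNN (global features) / 3D CNN]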
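  20. The rise of mesh representations: Pixel2Mesh [Wang2018]
    • Input data: monocular image
    • Output data: mesh
    • Loss function: supervised learning (reconstruction loss), regularization
    [Diagram: encoder / decoder pairs so far: 2D CNN (global features) / 3D CNN; 2D CNN (local features) / Graph Conv.]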
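  21. The rapid advance of differentiable rendering
    • Input data: monocular image
    • Output data: voxels [Yan2016], mesh [Kato2018], point cloud [Wang2019]
    • Loss function: self-supervised learning (inverse rendering), regularization
    [Diagram: encoder / decoder pairs so far: 2D CNN (global features) / 3D CNN; 2D CNN (local features) / Graph Conv.; 2D CNN (global features) / MLP]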
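  22. The neural field explosion: DeepSDF [Park2019], Occupancy Networks [Mescheder2019], IM-Net [Chen2019]
    • Input data: monocular image
    • Output data: neural field (implicit surface)
    • Loss function: supervised learning (reconstruction loss)
    [Diagram: encoder / decoder pairs so far: 2D CNN (global features) / 3D CNN; 2D CNN (local features) / Graph Conv.; 2D CNN (global features) / MLP; 2D CNN (global features) / MLP]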
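  23. Better generalization through local neural fields: PIFu [Saito2019]
    • Input data: monocular image
    • Output data: neural field (implicit surface)
    • Loss function: supervised learning (reconstruction loss)
    [Diagram: encoder / decoder pairs so far: 2D CNN (global features) / 3D CNN; 2D CNN (local features) / Graph Conv.; 2D CNN (global features) / MLP; 2D CNN (global features) / MLP; 2D CNN (local features) / MLP]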
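The step from global to local features with an MLP decoder can be sketched as follows: for each 3D query point, look up the image feature at the pixel it projects to and pass that feature, together with the point's depth, to the decoder. This is only an illustration under assumptions that are not from the slides: an orthographic camera along z, points in $[-1, 1]^3$, and a feature map stored as nested lists. The names `bilinear_sample` and `pixel_aligned_input` are hypothetical, and the actual PIFu encoder and decoder are not reproduced.

```python
import math

def bilinear_sample(feat, u, v):
    """Sample an H x W x C feature map (nested lists) at continuous pixel coords (u, v)."""
    h, w = len(feat), len(feat[0])
    u = min(max(u, 0.0), w - 1.0)            # clamp to the image
    v = min(max(v, 0.0), h - 1.0)
    u0, v0 = int(math.floor(u)), int(math.floor(v))
    u1, v1 = min(u0 + 1, w - 1), min(v0 + 1, h - 1)
    a, b = u - u0, v - v0                    # interpolation weights
    return [
        (1 - a) * (1 - b) * f00 + a * (1 - b) * f01 + (1 - a) * b * f10 + a * b * f11
        for f00, f01, f10, f11 in zip(feat[v0][u0], feat[v0][u1],
                                      feat[v1][u0], feat[v1][u1])
    ]

def pixel_aligned_input(feat, point):
    """Decoder input for one query point: local feature at its projection, plus depth."""
    x, y, z = point                          # assumed to lie in [-1, 1]^3
    h, w = len(feat), len(feat[0])
    u = (x + 1.0) * 0.5 * (w - 1)            # orthographic projection to pixel coords
    v = (y + 1.0) * 0.5 * (h - 1)
    return bilinear_sample(feat, u, v) + [z]

# Toy 2 x 2 feature map with one channel per pixel.
feat = [[[0.0], [1.0]],
        [[2.0], [3.0]]]
print(pixel_aligned_input(feat, (0.0, 0.0, 0.25)))
```

An MLP would then map this per-point vector to an occupancy or distance value, so that nearby image evidence, rather than one global code, drives each prediction.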
  24. Neural fields meet differentiable rendering: PixelNeRF [Yu2021]
    • Input data: monocular image
    • Output data: neural field (NeRF)
    • Loss function: self-supervised learning (inverse rendering)
    [Diagram: encoder / decoder pairs so far: 2D CNN (global features) / 3D CNN; 2D CNN (local features) / Graph Conv.; 2D CNN (global features) / MLP; 2D CNN (global features) / MLP; 2D CNN (local features) / MLP]
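  25. Beyond local features: ViT-NeRF [Lin2022]
    • Input data: monocular image
    • Output data: neural field (NeRF)
    • Loss function: self-supervised learning (inverse rendering)
    • Encoder: ViT (non-local features)
    [Diagram: encoder / decoder pairs so far: 2D CNN (global features) / 3D CNN; 2D CNN (local features) / Graph Conv.; 2D CNN (global features) / MLP; 2D CNN (global features) / MLP; 2D CNN (local features) / MLP]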
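  26. Summary
    • Understand the characteristics of each data representation, then design the encoder and decoder accordingly
    • Define the loss function appropriately, taking into account whether ground-truth shapes are available
    • Input data: monocular image, image with depth (RGB-D), multiple images, point cloud / scan
    • Output data: voxels, depth map (2.5D), point cloud, mesh, neural field
    • Loss function: supervised learning (reconstruction loss), self-supervised learning (inverse rendering), regularization
    [Diagram labels: encoder, decoder, input data, output data, loss function, inference, parameter update (SGD)]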
  27. References (1)
    • [Blanz1999] Blanz, Volker, and Thomas Vetter. "A morphable model for the synthesis of 3D faces." Proceedings of the 26th annual conference on Computer graphics and interactive techniques. 1999.
    • [Chen2019] Chen, Zhiqin, and Hao Zhang. "Learning implicit fields for generative shape modeling." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
    • [Choy2016] Choy, Christopher B., et al. "3d-r2n2: A unified approach for single and multi-view 3d object reconstruction." European conference on computer vision. Springer, Cham, 2016.
    • [Cosmo2020] Cosmo, Luca, et al. "Limp: Learning latent shape representations with metric preservation priors." Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part III 16. Springer International Publishing, 2020.
    • [Dai2020] Dai, Angela, Christian Diller, and Matthias Nießner. "Sg-nn: Sparse generative neural networks for self-supervised scene completion of rgb-d scans." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.
    • [Dosovitskiy2021] Alexey Dosovitskiy et al. "An image is worth 16x16 words: Transformers for image recognition at scale." arXiv preprint arXiv:2010.11929, 2020.
    • [Furukawa2009] Furukawa, Yasutaka, et al. "Manhattan-world stereo." 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2009.
    • [Fan2017] Fan, Haoqiang, Hao Su, and Leonidas J. Guibas. "A point set generation network for 3d object reconstruction from a single image." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
    • [Graham2017] Graham, Benjamin, and Laurens van der Maaten. "Submanifold sparse convolutional networks." arXiv preprint arXiv:1706.01307 (2017).
    • [Groueix2018] Groueix, Thibault, et al. "A papier-mâché approach to learning 3d surface generation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
    • [He2016] He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
    • [Jackson2017] Jackson, Aaron S., et al. "Large pose 3D face reconstruction from a single image via direct volumetric CNN regression." Proceedings of the IEEE International Conference on Computer Vision. 2017.
    • [Kato2018] Kato, Hiroharu, Yoshitaka Ushiku, and Tatsuya Harada. "Neural 3d mesh renderer." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
    • [Lassner2021] Lassner, Christoph, and Michael Zollhöfer. "Pulsar: Efficient Sphere-based Neural Rendering." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.
  28. References (2)
    • [Lin2022] Kai-En Lin, Lin Yen-Chen, Wei-Sheng Lai, Tsung-Yi Lin, Yi-Chang Shih, and Ravi Ramamoorthi. "Vision transformer for nerf-based view synthesis from a single input image." arXiv preprint arXiv:2207.05736, 2022.
    • [Ling2022] Selena Ling, Nicholas Sharp, and Alec Jacobson. "Vectoradam for rotation equivariant geometry optimization." arXiv preprint arXiv:2205.13599, 2022.
    • [Liu2019a] Liu, Shichen, et al. "Soft rasterizer: A differentiable renderer for image-based 3d reasoning." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019.
    • [Liu2019b] Liu, Shichen, et al. "Learning to infer implicit surfaces without 3d supervision." NeurIPS 2019.
    • [Liu2022] Hsueh-Ti Derek Liu, Francis Williams, Alec Jacobson, Sanja Fidler, and Or Litany. "Learning smooth neural functions via lipschitz regularization." SIGGRAPH, 2022.
    • [Loper2014] Loper, Matthew M., and Michael J. Black. "OpenDR: An approximate differentiable renderer." European Conference on Computer Vision. Springer, Cham, 2014.
    • [Ma2021] Ma, Qianli, et al. "SCALE: Modeling clothed humans with a surface codec of articulated local elements." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.
    • [Maturana2015] Maturana, Daniel, and Sebastian Scherer. "Voxnet: A 3d convolutional neural network for real-time object recognition." 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2015.
    • [Mescheder2019] Mescheder, Lars, et al. "Occupancy networks: Learning 3d reconstruction in function space." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
    • [Mildenhall2020] Mildenhall, Ben, et al. "Nerf: Representing scenes as neural radiance fields for view synthesis." European conference on computer vision. Springer, Cham, 2020.
    • [Miangoleh2021] Miangoleh, S. Mahdi H., et al. "Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.
    • [Mueller2022] Thomas Mueller, Alex Evans, Christoph Schied, and Alexander Keller. "Instant neural graphics primitives with a multiresolution hash encoding." arXiv preprint arXiv:2201.05989, 2022.
    • [Newell2016] Newell, Alejandro, Kaiyu Yang, and Jia Deng. "Stacked hourglass networks for human pose estimation." European conference on computer vision. Springer, Cham, 2016.
    • [Nicolet2021] Baptiste Nicolet, Alec Jacobson, and Wenzel Jakob. "Large steps in inverse rendering of geometry." ACM Transactions on Graphics (TOG), Vol. 40, No. 6, pp. 1–13, 2021.
    • [Park2019] Park, Jeong Joon, et al. "Deepsdf: Learning continuous signed distance functions for shape representation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2019.
    • [Peng2020] Peng, Songyou, et al. "Convolutional occupancy networks." Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part III 16. Springer International Publishing, 2020.
    • [Qi2016] Qi, Charles R., et al. "Volumetric and multi-view cnns for object classification on 3d data." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
  29. References (3)
    • [Qi2017] Qi, Charles R., et al. "Pointnet: Deep learning on point sets for 3d classification and segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
    • [Qi2017b] Qi, Charles R., et al. "Pointnet++: Deep hierarchical feature learning on point sets in a metric space." arXiv preprint arXiv:1706.02413 (2017).
    • [Ranjan2018] Ranjan, Anurag, et al. "Generating 3D faces using convolutional mesh autoencoders." Proceedings of the European Conference on Computer Vision (ECCV). 2018.
    • [Riegler2017] Riegler, Gernot, Ali Osman Ulusoy, and Andreas Geiger. "Octnet: Learning deep 3d representations at high resolutions." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
    • [Saito2018] Saito, Shunsuke, et al. "3D hair synthesis using volumetric variational autoencoders." ACM Transactions on Graphics (TOG) 37.6 (2018): 1-12.
    • [Saito2019] Saito, Shunsuke, et al. "Pifu: Pixel-aligned implicit function for high-resolution clothed human digitization." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019.
    • [Saito2020] Saito, Shunsuke, et al. "Pifuhd: Multi-level pixel-aligned implicit function for high-resolution 3d human digitization." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.
    • [Saito2021] Saito, Shunsuke, et al. "SCANimate: Weakly supervised learning of skinned clothed avatar networks." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.
    • [Simonyan2014] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
    • [Tancik2020] Tancik, Matthew, et al. "Fourier features let networks learn high frequency functions in low dimensional domains." arXiv preprint arXiv:2006.10739 (2020).
    • [Yan2016] Xinchen Yan, Jimei Yang, Ersin Yumer, Yijie Guo, and Honglak Lee. "Perspective transformer nets: Learning single-view 3d object reconstruction without 3d supervision." Advances in neural information processing systems, Vol. 29, 2016. arXiv preprint arXiv:1612.00814.
    • [Yariv2020] Yariv, Lior, et al. "Multiview neural surface reconstruction by disentangling geometry and appearance." arXiv preprint arXiv:2003.09852 (2020).
    • [Yao2018] Yao, Yao, et al. "Mvsnet: Depth inference for unstructured multi-view stereo." Proceedings of the European Conference on Computer Vision (ECCV). 2018.
    • [Yang2018] Yang, Yaoqing, et al. "Foldingnet: Point cloud auto-encoder via deep grid deformation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
    • [Yu2021] Alex Yu, Vickie Ye, Matthew Tancik, and Angjoo Kanazawa. "pixelnerf: Neural radiance fields from one or few images." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
    • [Wang2018] Nanyang Wang, Yinda Zhang, Zhuwen Li, Yanwei Fu, Wei Liu, and Yu-Gang Jiang. "Pixel2mesh: Generating 3d mesh models from single rgb images." Proceedings of the European conference on computer vision (ECCV), pp. 52–67, 2018.
    • [Wang2019] Wang Yifan, Felice Serena, Shihao Wu, Cengiz Öztireli, and Olga Sorkine-Hornung. "Differentiable surface splatting for point-based geometry processing." ACM Transactions on Graphics (TOG), Vol. 38, No. 6, pp. 1–14, 2019.
    • [Wu2015] Wu, Zhirong, et al. "3d shapenets: A deep representation for volumetric shapes." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.