
Introduction to Neural 3D Reconstruction (ニューラル3次元復元入門)

Shunsuke Saito

April 12, 2023

Transcript

  1. Introduction to Neural 3D Reconstruction
    Shunsuke Saito
    188th CG / 32nd DCC / 231st CVIM Joint Research Meeting

  2. Shunsuke Saito
    • Visiting Researcher, University of Pennsylvania (2014-2015)
    • PhD, University of Southern California (2015-2020)
    • Internships: FAIR, FRL, Adobe, Max Planck Institute, and others
    • Researcher, Reality Labs Research (2020-)
    Computational Body Building (SIGGRAPH 2015)
    PIFu/PIFuHD (ICCV 2019, CVPR 2020)
    SCANimate (CVPR 2021)

  3. About this tutorial
    Goals
    • Understand the framework of neural 3D reconstruction
    • Be able to place each recent work within that framework
    • Get a grasp of the trends in each research area

  4. Why data-driven 3D reconstruction?
    • No hand-crafted priors are needed
    • Complex priors can be obtained from the data itself
    PIFuHD [Saito2020]
    Manhattan-world assumption [Furukawa2009]

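  5. The neural 3D reconstruction framework
    Diagram labels: input data, encoder, decoder, output data, loss function; inference; parameter update (SGD)
    Input data:
      • Monocular image
      • RGB-D image
      • Multiple images
      • Point cloud / scan
    Output data:
      • Voxels
      • Depth map (2.5D)
      • Point cloud
      • Mesh
      • Neural field
    Loss function:
      • Supervised learning (reconstruction loss)
      • Self-supervised learning (inverse rendering)
      • Regularization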

  6. The neural 3D reconstruction framework
    [Framework diagram repeated, with the input data / encoder stage called out]

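  7. About the input data
    • Often determined by the application
    • Example: casual 3D reconstruction → monocular image as input
    • Inputs can be classified into image data and (partial) 3D data
    • An appropriate encoder must be chosen
    • Choosing a SOTA architecture is the baseline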

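  8. Global features vs. local features
    • Global features (the whole shape is represented as a vector)
      • Shapes within the category are highly similar
      • Strong constraints are wanted (e.g., when large parts are unobserved)
      • Semantic edits are wanted
    • Local features (features that retain their spatial extent)
      • Training data is limited
      • Fine shape detail must be reconstructed
      • Local edits are wanted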

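  9. Encoder: monocular images, RGB-D images
    • For image data, the baseline is to use a state-of-the-art image encoder
      e.g., VGG [Simonyan2014], ResNet [He2016], Hourglass [Newell2016]
    • The architecture that works best can differ by task
      e.g., Hourglass → pose estimation, VGG → style transfer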

  10. Encoder: monocular images, RGB-D images
    • Trend: non-local encoders (e.g., ViT [Dosovitskiy2021])
    [Lin2022]

  11. Encoder: multi-view images
    • When the camera parameters are known, build the geometric relationships into the network
    • Example: homography [Yao2018]
    https://medium.com/@NegativeMind//2d-3d෮ݩٕज़Ͱ࢖ΘΕΔ༻ޠ·ͱΊ-27403689da1b

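  12. Encoder: point clouds, scan data
    • Inputs mainly come from sensors such as Kinect and LiDAR
    • Unlike images and meshes, point clouds have a varying number of points and no ordering
    • Architectures that account for these properties of point clouds and scans are needed
    PointNet [Qi2017a]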

  13. Encoder: point clouds, scan data
    PointNet [Qi2017a]

  14. Encoder: point clouds, scan data
    PointNet [Qi2017a]

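  15. Encoder: point clouds, scan data
    PointNet [Qi2017a]
    https://github.com/ThibaultGROUEIX/AtlasNet/blob/master/model/model_blocks.py
    x: input features (vertex coordinates, normals, etc.)
    • An MLP maps each point's x to a latent variable
    • The per-point latent variables are aggregated with max pooling
    • A further MLP is applied to the pooled latent variable to obtain the final feature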
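
The three PointNet steps on slide 15 can be sketched in a few lines. This is a minimal pure-Python illustration with random, untrained weights, not the AtlasNet code linked on the slide; the helper names (`linear`, `init_layer`, `pointnet_encode`) and the layer sizes are assumptions made for the example.

```python
import random


def linear(x, weights, bias):
    """Apply y = W x + b to one feature vector stored as a plain list."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_layer(n_in, n_out, rng):
    """Random (untrained) weights, only for illustration."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    bias = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, bias


def pointnet_encode(points, point_layer, global_layer):
    # 1) Shared MLP: every point is mapped to a latent vector with the same weights.
    latents = [relu(linear(p, *point_layer)) for p in points]
    # 2) Max pooling over the points: independent of point order and point count.
    pooled = [max(column) for column in zip(*latents)]
    # 3) A further MLP on the pooled latent gives the final feature.
    return linear(pooled, *global_layer)


rng = random.Random(0)
point_layer = init_layer(3, 8, rng)    # x = 3D coordinates only (no normals here)
global_layer = init_layer(8, 4, rng)

cloud = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(16)]
shuffled = cloud[:]
rng.shuffle(shuffled)

feature = pointnet_encode(cloud, point_layer, global_layer)
feature_shuffled = pointnet_encode(shuffled, point_layer, global_layer)
print(feature == feature_shuffled)  # the point order does not change the feature
```

Because the only interaction between points is the max over each latent channel, the same encoder accepts clouds of any size, which is the property the unordered, variable-length input calls for.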

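  16. Encoder: point clouds, scan data
    Limitations of PointNet [Qi2017a]
    • Features of the whole shape are aggregated by a single max pooling → hierarchical structure is hard to capture
    • Remedy: hierarchical max pooling (PointNet++ [Qi2017b])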

  17. Encoder: point clouds, scan data
    Sparse Convolution
    3D Convolution [Wu2015]: O(kdmn)
      Not applicable to room-sized scans
    Sparse 3D Convolution [Graham2017]
      Makes scans of large-scale scenes tractable

  18. Encoder: point clouds, scan data
    Application example: Sparse Convolution
    Completion of large-scale scans [Dai2020]

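  19. Encoder: point clouds, scan data
    PointNet + 2D Convolutions [Peng2020]
    The point cloud is processed by PointNet and mapped into feature space, then projected onto a set of 2D planes (tri-plane) and processed by a convolutional network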

  20. Encoder: point clouds, scan data
    Trend 1: tri-plane representation for 3D generative models
    EG3D [Chan2022]

  21. Encoder: point clouds, scan data
    Trend 1: tri-plane representation for 3D generative models
    EG3D [Chan2022]

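  22. Encoder: point clouds, scan data
    Trend 2: rotation-invariant / rotation-equivariant encoders
    Vector Neurons [Deng2022]
    • Standard fully connected layer: each neuron is a scalar
    • Vector Neurons: each neuron is a 3D vector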
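
The contrast between scalar neurons and 3D-vector neurons can be made concrete with a small check. The sketch below assumes the linear layer only mixes channels (a `C_out x C_in` weight matrix applied to a `C_in x 3` feature), so rotating the input and then applying the layer gives the same result as applying the layer and then rotating. The helper names and the numbers are illustrative assumptions, not taken from the slide.

```python
import math


def matmul(a, b):
    """Plain-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def vn_linear(weights, vectors):
    """weights: C_out x C_in, vectors: C_in x 3 (each channel is one 3D vector)."""
    return matmul(weights, vectors)


V = [[1.0, 0.0, 0.0],
     [0.0, 2.0, 1.0]]            # 2 input channels, each a 3D vector
W = [[0.5, -1.0],
     [2.0, 0.25],
     [1.0, 1.0]]                 # maps 2 channels to 3 channels
R = rotation_z(0.7)              # rotation applied to every channel (V @ R)

rotate_then_layer = vn_linear(W, matmul(V, R))
layer_then_rotate = matmul(vn_linear(W, V), R)

max_gap = max(abs(a - b)
              for row_a, row_b in zip(rotate_then_layer, layer_then_rotate)
              for a, b in zip(row_a, row_b))
print(max_gap)  # zero up to floating-point rounding
```

The equality holds because the weights act on the channel axis while the rotation acts on the coordinate axis, so the two operations commute.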

  23. Open challenges for encoders
    Handling high resolution and dynamic objects

  24. Encoder-less 3D reconstruction
    Scene-specific 3D reconstruction
    [Framework diagram repeated without the encoder; loss functions shown: self-supervised learning (inverse rendering) and regularization]

  25. Encoder-less 3D reconstruction
    Trend 1: fast optimization-based reconstruction through improved data structures
    Instant-NGP [Mueller2022]

  26. Encoder-less 3D reconstruction
    Trend 2: jointly learning the deformation → handling dynamic objects
    Nerfies [Park2021]
    BANMO [Yang2022]

  27. Output data and decoders
    [Framework diagram repeated]

  28. Output data and decoders
    [Framework diagram repeated, with the output data / decoder stage called out]

  29. Voxels
    • The target shape is stored on a 3D grid
      • Occupancy
      • Signed distance function (SDF)
      • TSDF
    • 3D convolutions can be used as-is
    • Memory usage is the bottleneck: $O(d^3)$
    [Choy2016; Maturana2015; Qi2016; Wu2015]
    Image credit [Mescheder2019]
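
A quick calculation shows how fast the cubic memory cost grows. The sketch assumes a dense grid with one float32 value per voxel; the function name and the chosen resolutions are illustrative.

```python
def dense_grid_bytes(resolution, channels=1, bytes_per_value=4):
    """Memory of a dense resolution^3 grid (assumption: float32, `channels` values per voxel)."""
    return resolution ** 3 * channels * bytes_per_value


sizes = {d: dense_grid_bytes(d) for d in (32, 64, 128, 256, 512)}
for d, n_bytes in sizes.items():
    print(f"{d}^3 grid: {n_bytes / 2**20:.3f} MiB")
```

Doubling the resolution multiplies the memory by eight, and this counts only a single output grid, not the intermediate feature volumes of a 3D CNN.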

  30. Voxels
    Application example: non-parametric 3D face reconstruction from a single image [Jackson2017]

  31. Voxels
    Application example: parameterizing hairstyles and estimating them from images [Saito2018]

  32. [Saito2018]

  33. [Saito2018]

  34. Voxels
    Efficient 3D shape reconstruction using octrees [Riegler2017, Tatarchenko2017]

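  35. Depth maps
    • A form of image-to-image translation: RGB → Depth
    • The latest techniques from image research are easy to apply (domain transfer, GANs, etc.)
    • Generalizes well, but is ill-suited to fine, category-specific reconstruction
    High-resolution depth map estimation from a single image [Miangoleh2021]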

  36. Depth maps
    Application example: multi-view stereo using diffusion models [Shao2022]

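  37. Point clouds [Fan2017]
    • The target shape is represented as a set of points
    • Approaches that output all points at once are mainstream
    • Flexible with respect to topology changes and able to handle large deformations
    • Meshing a point cloud (e.g., for rendering) is difficult, so it is ill-suited to high-quality shape output
    Image credit [Mescheder2019]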

  38. Point clouds [Fan2017]
    Train an encoder that regresses a latent variable from the image and a decoder that emits a point cloud from the latent variable

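  39. Point cloud applications
    Point cloud modeling with continuous normalizing flows [Yang2020]
    Diagram labels: points sampled from a normal distribution; continuous normalizing flow; target 3D shape

  40. Point cloud applications
    Point cloud modeling with continuous normalizing flows [Yang2020]
    Training (autoencoder) / Inference (sampling)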

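  41. Point cloud applications
    Multi-view stereo using point clouds [Chen2020]

  42. Point cloud applications
    Multi-view stereo using point clouds [Chen2020]
    Diagram labels: multi-level local features from a CNN; CNN; coarse depth map; ground truth; residual; refinement on the point cloud; refined depth map; feature sampling on the point cloud; iterative optimization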

  43. Point cloud applications
    NeRF using point clouds [Xu2022]
    Achieves both high accuracy and fast training

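  44. Meshes
    • The most common shape representation in CG
      → also works well with rendering engines
    • Several decoding methods exist
      • Fully Connected (MLP)
      • Graph Convolution
      • AtlasNet
    • Learning fine detail and handling topology changes are difficult
    3D morphable model [Blanz1999]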

  45. Meshes
    Graph Convolution [Ranjan2018]
    Because the shape is learned hierarchically rather than through full connections, a more expressive model can be obtained with fewer parameters

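  46. Meshes
    Atlas [Groueix2018; Yang2018]
    Conventional shape representation [Fan2017]: an MLP maps the latent variable z to the set of all vertices
      $f(z) = X, \quad f: \mathbb{R}^{Z} \to \mathbb{R}^{n \times 3}$
    AtlasNet: an MLP maps the latent variable z and an arbitrary point P in texture space to its 3D coordinates after deformation
      $f(z, P) = p, \quad f: \mathbb{R}^{Z} \times \mathbb{R}^{2} \to \mathbb{R}^{3}$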

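  47. Meshes
    Atlas [Groueix2018; Yang2018]
    AtlasNet: an MLP maps the latent variable z and an arbitrary point P in texture space to its 3D coordinates after deformation
      $f(z, P) = p, \quad f: \mathbb{R}^{Z} \times \mathbb{R}^{2} \to \mathbb{R}^{3}$
    • Instead of learning the distribution of vertex coordinates of the whole shape, learn a "deformation" of each plane
      → the same idea as texture mapping
    • Takes the continuity of the surface into account
    • The resolution is no longer fixed
    • Topology changes are handled by learning multiple atlases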
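
The "resolution is no longer fixed" point follows directly from the signature f(z, P) = p: the same decoder can be queried at as many texture-space points P as desired. The sketch below uses a tiny MLP with random, untrained weights; `make_mlp`, `mlp`, `decode_patch`, and all sizes are assumptions for illustration and not the AtlasNet implementation.

```python
import math
import random


def make_mlp(sizes, rng):
    """Random (untrained) layers as (weights, bias) pairs."""
    return [([[rng.gauss(0.0, 1.0) / math.sqrt(n_in) for _ in range(n_in)]
              for _ in range(n_out)], [0.0] * n_out)
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]


def mlp(layers, x):
    for i, (weights, bias) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
        if i < len(layers) - 1:          # tanh on hidden layers only
            x = [math.tanh(v) for v in x]
    return x


def decode_patch(layers, z, resolution):
    """Evaluate f(z, P) on a resolution x resolution grid of points P in [0, 1]^2."""
    points = []
    for i in range(resolution):
        for j in range(resolution):
            uv = [i / (resolution - 1), j / (resolution - 1)]
            points.append(mlp(layers, z + uv))   # input is the concatenation [z, P]
    return points


rng = random.Random(0)
latent_dim = 4
decoder = make_mlp([latent_dim + 2, 16, 3], rng)   # R^Z x R^2 -> R^3
z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]

coarse = decode_patch(decoder, z, 4)     # 16 surface points
fine = decode_patch(decoder, z, 16)      # 256 surface points from the same z and weights
print(len(coarse), len(fine))
```

Neighboring grid points in texture space give the connectivity of the output mesh, which is why the deformed plane can be rendered directly.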

  48. Meshes
    Atlas [Groueix2018; Yang2018]

  49. Meshes
    Atlas [Groueix2018; Yang2018]

  50. Meshes / Atlases
    Application example: clothed avatars built from a set of rigging-aware atlases [Ma2021]

  51. Meshes / Atlases
    Application example: clothed avatars built from a set of rigging-aware atlases [Ma2021]

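  52. Neural fields (implicit surfaces)
    • The 3D shape is represented as a level set of a function
      • Occupancy
      • SDF/TSDF
    • Unlike voxels, there is no resolution constraint
    • A major breakthrough in learning-based 3D reconstruction
    • Extracting an explicit shape such as a mesh requires marching cubes
    $f(x, y, z) := x^2 + y^2 + z^2 - r^2$
    Image credit [Mescheder2019]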
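
The sphere formula on slide 52 is a convenient hand-written stand-in for a learned field. The sketch below evaluates it and derives an occupancy value from its sign; note that this f is not a true signed distance (it is squared distance minus r squared), but its zero level set is still the sphere. The function names and the grid extent are assumptions for illustration.

```python
def sphere_level_set(x, y, z, r=1.0):
    """f(x, y, z) = x^2 + y^2 + z^2 - r^2: negative inside, zero on the surface, positive outside."""
    return x * x + y * y + z * z - r * r


def occupancy(x, y, z, r=1.0):
    """Occupancy derived from the sign of f (1 = inside or on the surface)."""
    return 1 if sphere_level_set(x, y, z, r) <= 0.0 else 0


def sample_occupancy_slice(resolution, r=1.0, extent=1.5):
    """Query the field on the z = 0 plane at any resolution; nothing is stored on a grid."""
    step = 2 * extent / (resolution - 1)
    return [[occupancy(-extent + i * step, -extent + j * step, 0.0, r)
             for i in range(resolution)]
            for j in range(resolution)]


for row in sample_occupancy_slice(9):
    print("".join("#" if v else "." for v in row))
```

The same function can be sampled on a 9 x 9 or a 1025 x 1025 grid without changing its representation; a mesh extractor such as marching cubes would run on samples like these.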

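  53. Neural fields (implicit surfaces)
    Neural Implicit [Chen/Park/Mescheder2019]
    AtlasNet: an MLP maps the latent variable z and an arbitrary point P in texture space to its 3D coordinates after deformation
      $f(z, P) = p, \quad f: \mathbb{R}^{Z} \times \mathbb{R}^{2} \to \mathbb{R}^{3}$
    Neural Implicit: an MLP maps the latent variable z and an arbitrary point P in 3D space to the signed distance at that query point
      $f(z, P) = \mathrm{SDF}, \quad f: \mathbb{R}^{Z} \times \mathbb{R}^{3} \to \mathbb{R}$

  54. Neural fields (implicit surfaces)
    Neural Implicit [Chen/Park/Mescheder2019]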

  55. Neural fields (implicit surfaces)
    Pixel-aligned implicit function (PIFu) [Saito2019/2020]
    Global encoding ($\mathbb{R}^{C}$ feature → MLP)
    • Fine details are lost, and diverse shape variations cannot be handled
    • It is difficult to fuse images from multiple views while keeping them consistent

  56. Neural fields (implicit surfaces)
    Pixel-aligned implicit function (PIFu) [Saito2019/2020]
    Global encoding ($\mathbb{R}^{C}$ feature → MLP)

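  57. Neural fields (implicit surfaces)
    Pixel-aligned implicit function (PIFu) [Saito2019/2020]
    Pixel-level encoding ($\mathbb{R}^{W \times H \times C}$ feature map → MLP)
    • Using local image features enables accurate reconstruction even from little data
    • Features can be fused in 3D space, so arbitrary input views can be handled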
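
Pixel-level encoding can be sketched as follows: a 3D query point is projected into the image, the feature map is sampled at that pixel, and the depth is appended before the result is passed to the MLP. The sketch assumes an orthographic projection with x and y in [-1, 1] and plain bilinear interpolation; the function names and the toy feature map are assumptions, and the MLP itself is omitted.

```python
def bilinear_sample(feature_map, u, v):
    """feature_map[row][col] is a list of C channels; (u, v) are continuous pixel coordinates."""
    height, width = len(feature_map), len(feature_map[0])
    u = min(max(u, 0.0), width - 1.0)
    v = min(max(v, 0.0), height - 1.0)
    u0, v0 = int(u), int(v)
    u1, v1 = min(u0 + 1, width - 1), min(v0 + 1, height - 1)
    du, dv = u - u0, v - v0
    channels = len(feature_map[0][0])
    return [(1 - du) * (1 - dv) * feature_map[v0][u0][c]
            + du * (1 - dv) * feature_map[v0][u1][c]
            + (1 - du) * dv * feature_map[v1][u0][c]
            + du * dv * feature_map[v1][u1][c]
            for c in range(channels)]


def pixel_aligned_input(feature_map, point):
    """Assumed orthographic camera: (x, y) in [-1, 1] index the image, z is kept as depth."""
    x, y, z = point
    height, width = len(feature_map), len(feature_map[0])
    u = (x + 1.0) / 2.0 * (width - 1)
    v = (y + 1.0) / 2.0 * (height - 1)
    return bilinear_sample(feature_map, u, v) + [z]   # this vector would be fed to the MLP


# Toy 4 x 4 feature map with C = 2 channels: (row index, column index).
features = [[[float(r), float(c)] for c in range(4)] for r in range(4)]

near = pixel_aligned_input(features, (0.0, 0.0, -0.3))
far = pixel_aligned_input(features, (0.0, 0.0, 0.3))
print(near, far)  # same image feature, different depth
```

Two points on the same camera ray share the image feature and differ only in depth, so the depth value is what lets the MLP tell them apart; with several views, the per-view vectors for the same 3D point can be combined before the final prediction.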

  58. [Saito2019]

  59. PIFuHD [Saito2020] PIFu [Saito2019]

  60. [Saito2020]

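  61. Decoders: summary of shape representations
                  Point cloud   Mesh     Voxels   Neural field
    Resolution    ✅/❌          ✅       ❌       ✅
    Topology      ✅            ✅/❌     ✅       ✅
    Speed         ✅            ✅       ✅/❌     ❌
    Rendering     ❌            ✅       ✅/❌     ✅
    • Quality → neural fields
    • Domains with little shape variation (e.g., faces) → meshes
    • Upcoming trend: hybrid representations (e.g., point cloud × neural field)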

  62. Output data and decoders
    [Framework diagram repeated]

  63. Loss functions
    [Framework diagram repeated, with the loss function stage called out]

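  64. Loss functions: supervised learning
    • When the target shape and its correspondences are given, the error between the decoder output and the ground truth can be used as the loss
    • When the shape is available but the correspondences are not
      • Example: Chamfer Distance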
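
The Chamfer distance needs no point-to-point correspondence because each point is matched to its nearest neighbor in the other set. The sketch below uses one common convention (mean of squared nearest-neighbor distances, summed over both directions); other papers use unsquared distances or sums instead of means, so the exact form here is an assumption.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance: mean squared nearest-neighbor distance in both directions."""
    a_to_b = sum(min(squared_distance(p, q) for q in points_b)
                 for p in points_a) / len(points_a)
    b_to_a = sum(min(squared_distance(q, p) for p in points_a)
                 for q in points_b) / len(points_b)
    return a_to_b + b_to_a


target = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
reordered = [target[2], target[0], target[1]]            # same shape, different point order
shifted = [(x, y, z + 0.1) for x, y, z in target]        # every point moved by 0.1 along z

print(chamfer_distance(target, reordered))  # 0.0: ordering does not matter
print(chamfer_distance(target, shifted))
```

This brute-force version costs O(|A| |B|) distance evaluations; practical implementations use batched GPU kernels or spatial data structures for the nearest-neighbor search.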

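  65. Loss functions: inverse rendering
    • When no ground-truth shape is given, consider solving an inverse rendering problem from a set of images
    • Various differentiable renderers exist for each shape representation
      • Point clouds → Pulsar [Lassner2021], etc.
      • Voxels → PTN [Yan2016], etc.
      • Meshes → OpenDR [Loper2014], NMR [Kato2018], SoftRas [Liu2019a], etc.
      • Implicit functions → [Liu2019b], IDR [Yariv2020], NeRF [Mildenhall2020], etc.
    Inverse rendering with meshes [Kato2018]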

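  66. Loss functions: regularization terms
    • Combining regularization terms makes it possible to constrain the shape
    • Especially effective in ill-posed problem settings
    Geodesic constraint (LIMP [Cosmo2020])
    Sum of the Lp norms of the implicit surface normals as a constraint term [Liu2019b]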

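  67. Loss functions: regularization terms
    Application example: learning avatars from 4D scans using a cycle-consistency constraint [Saito2021]
    Diagram labels: $\mathrm{LBS}^{-1}$, $x_s$, $x_c$

  68. Loss functions: regularization terms
    Application example: learning avatars from 4D scans using a cycle-consistency constraint [Saito2021]
    Diagram labels: $\mathrm{LBS}^{-1}$, $\mathrm{LBS}$, $x_s$, $x_c$, $x_p$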

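  69. Loss functions: regularization terms
    Application example: learning avatars from 4D scans using a cycle-consistency constraint [Saito2021]
    Diagram labels: $\mathrm{LBS}^{-1}$, $\mathrm{LBS}$, $x_s$, $x_c$, $x_p$
    The two should coincide with the same shape:
      $x_s = \mathrm{LBS}(\mathrm{LBS}^{-1}(x_s))$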
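
The cycle constraint x_s = LBS(LBS^{-1}(x_s)) can be illustrated with a toy 2D example. The sketch assumes two bones with rigid transforms, blends them linearly with skinning weights, and uses separate weights for the inverse (scan space) and forward (canonical space) passes; the round trip is exact only when the two weight sets agree. All names and numbers are illustrative assumptions, and this is not the SCANimate implementation.

```python
import math


def rigid(theta, tx, ty):
    """2D bone transform as (a, b, c, d, tx, ty): rotation [[a, b], [c, d]] plus translation."""
    c, s = math.cos(theta), math.sin(theta)
    return (c, -s, s, c, tx, ty)


def blend(transforms, weights):
    """Linear blend skinning: weighted sum of the bone transforms."""
    return tuple(sum(w * t[i] for w, t in zip(weights, transforms)) for i in range(6))


def lbs(x, transforms, weights):
    a, b, c, d, tx, ty = blend(transforms, weights)
    return (a * x[0] + b * x[1] + tx, c * x[0] + d * x[1] + ty)


def inverse_lbs(x, transforms, weights):
    a, b, c, d, tx, ty = blend(transforms, weights)
    det = a * d - b * c
    px, py = x[0] - tx, x[1] - ty
    return ((d * px - b * py) / det, (-c * px + a * py) / det)


def cycle_residual(x_s, transforms, weights_scan, weights_canonical):
    x_c = inverse_lbs(x_s, transforms, weights_scan)     # scan space -> canonical space
    x_p = lbs(x_c, transforms, weights_canonical)        # canonical space -> re-posed
    return math.dist(x_s, x_p)                           # zero when the cycle closes


bones = [rigid(0.0, 0.0, 0.0), rigid(0.6, 0.2, -0.1)]
x_s = (0.4, 1.0)

consistent = cycle_residual(x_s, bones, (0.3, 0.7), (0.3, 0.7))
inconsistent = cycle_residual(x_s, bones, (0.3, 0.7), (0.8, 0.2))
print(consistent, inconsistent)
```

Used as a regularizer, a residual of this kind penalizes skinning weights in the two spaces that disagree, without requiring ground-truth weights.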

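  70. [Saito2021]

  71. Loss functions: regularization terms
    Trend 1: regularizing the intermediate layers
    Lipschitz regularization of neural fields [Liu2022]

  72. Loss functions: regularization terms
    Trend 2: regularizing the gradients
    Laplacian regularization of gradients [Nicolet2021]

  73. Loss functions: regularization terms
    Trend 3: regularizing the optimizer
    Rotation-equivariant optimizer (VectorAdam [Ling2022])

  74. Monocular reconstruction seen through the framework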

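  75. The early days of neural monocular reconstruction
    Input data: monocular image
    Output data: voxels, point cloud
    Loss function: supervised learning (reconstruction loss)
    Encoder → decoder:
      • 2D CNN (global features) → 3D CNN
    [Wu2015], [Fan2017]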

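  76. The rise of mesh representations
    Input data: monocular image
    Output data: mesh
    Loss function: supervised learning (reconstruction loss), regularization
    Encoder → decoder:
      • 2D CNN (global features) → 3D CNN
      • 2D CNN (local features) → Graph Conv.
    Pixel2Mesh [Wang2018]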

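  77. The leap forward of differentiable rendering
    Input data: monocular image
    Output data: voxels, mesh, point cloud
    Loss function: self-supervised learning (inverse rendering), regularization
    Encoder → decoder:
      • 2D CNN (global features) → 3D CNN
      • 2D CNN (local features) → Graph Conv.
      • 2D CNN (global features) → MLP
    Point cloud [Wang2019], voxels [Yan2016], mesh [Kato2018]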

  78. The neural field explosion
    Input data: monocular image
    Output data: neural field (implicit surface)
    Loss function: supervised learning (reconstruction loss)
    Encoder → decoder:
      • 2D CNN (global features) → 3D CNN
      • 2D CNN (local features) → Graph Conv.
      • 2D CNN (global features) → MLP
      • 2D CNN (global features) → MLP
    DeepSDF [Park2019], Occupancy Networks [Mescheder2019], IM-Net [Chen2019]

  79. Better generalization through local neural fields
    Input data: monocular image
    Output data: neural field (implicit surface)
    Loss function: supervised learning (reconstruction loss)
    Encoder → decoder:
      • 2D CNN (global features) → 3D CNN
      • 2D CNN (local features) → Graph Conv.
      • 2D CNN (global features) → MLP
      • 2D CNN (global features) → MLP
      • 2D CNN (local features) → MLP
    PIFu [Saito2019]

  80. Neural fields meet differentiable rendering
    Input data: monocular image
    Output data: neural field (NeRF)
    Loss function: self-supervised learning (inverse rendering)
    Encoder → decoder:
      • 2D CNN (global features) → 3D CNN
      • 2D CNN (local features) → Graph Conv.
      • 2D CNN (global features) → MLP
      • 2D CNN (global features) → MLP
      • 2D CNN (local features) → MLP
    PixelNeRF [Yu2021]

  81. Beyond local features
    Input data: monocular image
    Output data: neural field (NeRF)
    Loss function: self-supervised learning (inverse rendering)
    Encoder → decoder:
      • 2D CNN (global features) → 3D CNN
      • 2D CNN (local features) → Graph Conv.
      • 2D CNN (global features) → MLP
      • 2D CNN (global features) → MLP
      • 2D CNN (local features) → MLP
      • ViT (non-local features)
    ViT-NeRF [Lin2022]

  82. Summary
    • Understand the properties of each data representation and design the encoder and decoder accordingly
    • Define the loss function appropriately, taking into account whether ground-truth shapes are available
    [Framework diagram repeated]

  83. References (1)
    • [Blanz1999] Blanz, Volker, and Thomas Vetter. "A morphable model for the synthesis of 3D faces." Proceedings of the 26th annual conference on Computer graphics and
    interactive techniques. 1999.


    • [Chen2019] Chen, Zhiqin, and Hao Zhang. "Learning implicit fields for generative shape modeling." Proceedings of the IEEE/CVF Conference on Computer Vision and
    Pattern Recognition. 2019.


    • [Choy2016] Choy, Christopher B., et al. "3d-r2n2: A unified approach for single and multi-view 3d object reconstruction." European conference on computer vision.
    Springer, Cham, 2016.


    • [Cosmo2020] Cosmo, Luca, et al. "Limp: Learning latent shape representations with metric preservation priors." Computer Vision–ECCV 2020: 16th European Conference,
    Glasgow, UK, August 23–28, 2020, Proceedings, Part III 16. Springer International Publishing, 2020.


    • [Dai2020] Dai, Angela, Christian Diller, and Matthias Nießner. "Sg-nn: Sparse generative neural networks for self-supervised scene completion of rgb-d scans." Proceedings
    of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020.


    • [Dosovitskiy2021] Alexey Dosovitskiy et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929, 2020.


    • [Furukawa2009] Furukawa, Yasutaka, et al. "Manhattan-world stereo." 2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2009.


    • [Fan2017] Fan, Haoqiang, Hao Su, and Leonidas J. Guibas. "A point set generation network for 3d object reconstruction from a single image." Proceedings of the IEEE
    conference on computer vision and pattern recognition. 2017.


    • [Graham2017] Graham, Benjamin, and Laurens van der Maaten. "Submanifold sparse convolutional networks." arXiv preprint arXiv:1706.01307 (2017).


    • [Groueix2018] Groueix, Thibault, et al. "A papier-mâché approach to learning 3d surface generation." Proceedings of the IEEE conference on computer vision and pattern
    recognition. 2018.


    • [He2016] He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.


    • [Jackson2017] Jackson, Aaron S., et al. "Large pose 3D face reconstruction from a single image via direct volumetric CNN regression." Proceedings of the IEEE
    International Conference on Computer Vision. 2017.


    • [Kato2018] Kato, Hiroharu, Yoshitaka Ushiku, and Tatsuya Harada. "Neural 3d mesh renderer." Proceedings of the IEEE conference on computer vision and pattern
    recognition. 2018.


    • [Lassner2021] Lassner, Christoph, and Michael Zollhofer. "Pulsar: Efficient Sphere-based Neural Rendering." Proceedings of the IEEE/CVF Conference on Computer Vision
    and Pattern Recognition. 2021.


  84. References (2)
    • [Lin2022] Kai-En Lin, Lin Yen-Chen, Wei-Sheng Lai, Tsung-Yi Lin, Yi-Chang Shih, and Ravi Ramamoorthi. Vision transformer for nerf-based view synthesis from a single input image. arXiv
    preprint arXiv:2207.05736, 2022.


    • [Ling2022] Selena Ling, Nicholas Sharp, and Alec Jacobson. Vectoradam for rotation equivariant geometry optimization. arXiv preprint arXiv:2205.13599, 2022.


    • [Liu2019a] Liu, Shichen, et al. "Soft rasterizer: A differentiable renderer for image-based 3d reasoning." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019.


    • [Liu2019b] Liu, Shichen, et al. "Learning to infer implicit surfaces without 3d supervision." NeurIPS 2019.


    • [Liu2022] Hsueh-Ti Derek Liu, Francis Williams, Alec Jacobson, Sanja Fidler, and Or Litany. Learning smooth neural functions via lipschitz regularization. SIGGRAPH, 2022.


    • [Loper2014] Loper, Matthew M., and Michael J. Black. "OpenDR: An approximate differentiable renderer." European Conference on Computer Vision. Springer, Cham, 2014.


    • [Ma2021] Ma, Qianli, et al. "SCALE: Modeling clothed humans with a surface codec of articulated local elements." Proceedings of the IEEE/CVF Conference on Computer Vision and
    Pattern Recognition. 2021.


    • [Maturana2015] Maturana, Daniel, and Sebastian Scherer. "Voxnet: A 3d convolutional neural network for real-time object recognition." 2015 IEEE/RSJ International Conference on
    Intelligent Robots and Systems (IROS). IEEE, 2015.


    • [Mescheder2019] Mescheder, Lars, et al. "Occupancy networks: Learning 3d reconstruction in function space." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern
    Recognition. 2019.


    • [Mildenhall2020] Mildenhall, Ben, et al. "Nerf: Representing scenes as neural radiance fields for view synthesis." European conference on computer vision. Springer, Cham, 2020.


    • [Miangoleh2021] Miangoleh, S. Mahdi H., et al. "Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging." Proceedings of the
    IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021.


    • [Mueller2022] Thomas Mueller, Alex Evans, Christoph Schied, and Alexander Keller. Instant neural graphics primitives with a multiresolution hash encoding. arXiv preprint
    arXiv:2201.05989, 2022.


    • [Newell2016] Newell, Alejandro, Kaiyu Yang, and Jia Deng. "Stacked hourglass networks for human pose estimation." European conference on computer vision. Springer, Cham, 2016.


    • [Nicolet2021] Baptiste Nicolet, Alec Jacobson, and Wenzel Jakob. Large steps in inverse rendering of geometry. ACM Transactions on Graphics (TOG), Vol. 40, No. 6, pp. 1–13, 2021.


    • [Park2019] Park, Jeong Joon, et al. "Deepsdf: Learning continuous signed distance functions for shape representation." Proceedings of the IEEE/CVF Conference on Computer Vision
    and Pattern Recognition. 2019.


    • [Peng2020] Peng, Songyou, et al. "Convolutional occupancy networks." Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part
    III 16. Springer International Publishing, 2020.


    • [Qi2016] Qi, Charles R., et al. "Volumetric and multi-view cnns for object classification on 3d data." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.


  85. References (3)
    • [Qi2017a] Qi, Charles R., et al. "Pointnet: Deep learning on point sets for 3d classification and segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.


    • [Qi2017b] Qi, Charles R., et al. "Pointnet++: Deep hierarchical feature learning on point sets in a metric space." arXiv preprint arXiv:1706.02413 (2017)


    • [Ranjan2018] Ranjan, Anurag, et al. "Generating 3D faces using convolutional mesh autoencoders." Proceedings of the European Conference on Computer Vision (ECCV). 2018.


    • [Riegler2017] Riegler, Gernot, Ali Osman Ulusoy, and Andreas Geiger. "Octnet: Learning deep 3d representations at high resolutions." Proceedings of the IEEE conference on computer vision and
    pattern recognition. 2017.


    • [Saito2018] Saito, Shunsuke, et al. "3D hair synthesis using volumetric variational autoencoders." ACM Transactions on Graphics (TOG) 37.6 (2018): 1-12.


    • [Saito2019] Saito, Shunsuke, et al. "Pifu: Pixel-aligned implicit function for high-resolution clothed human digitization." Proceedings of the IEEE/CVF International Conference on Computer Vision. 2019.


    • [Saito2020] Saito, Shunsuke, et al. "Pifuhd: Multi-level pixel-aligned implicit function for high-resolution 3d human digitization." Proceedings of the IEEE/CVF Conference on Computer Vision and
    Pattern Recognition. 2020.


    • [Saito2021] Saito, Shunsuke, et al. "SCANimate: Weakly supervised learning of skinned clothed avatar networks." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern
    Recognition. 2021.


    • [Simonyan2014] Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).


    • [Tancik2020] Tancik, Matthew, et al. "Fourier features let networks learn high frequency functions in low dimensional domains." arXiv preprint arXiv:2006.10739 (2020).


    • [Yan2016] Xinchen Yan, Jimei Yang, Ersin Yumer, Yijie Guo, and Honglak Lee. Perspective transformer nets: Learning single-view 3d object reconstruction without 3d supervision. Advances in neural
    information processing systems, Vol. 29, 2016.


    • [Yariv2020] Yariv, Lior, et al. "Multiview neural surface reconstruction by disentangling geometry and appearance." arXiv preprint arXiv:2003.09852 (2020).


    • [Yao2018] Yao, Yao, et al. "Mvsnet: Depth inference for unstructured multi-view stereo." Proceedings of the European Conference on Computer Vision (ECCV). 2018.


    • [Yang2018] Yang, Yaoqing, et al. "Foldingnet: Point cloud auto-encoder via deep grid deformation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.


    • [Yu2021] Alex Yu, Vickie Ye, Matthew Tancik, and Angjoo Kanazawa. pixelnerf: Neural radiance fields from one or few images. In Proceedings of the IEEE/CVF Conference on Computer Vision and
    Pattern Recognition (CVPR), 2021.


    • [Wang2018] Nanyang Wang, Yinda Zhang, Zhuwen Li, Yanwei Fu, Wei Liu, and Yu-Gang Jiang. Pixel2mesh: Generating 3d mesh models from single rgb images. In Proceedings of the European
    conference on computer vision (ECCV), pp. 52–67, 2018.


    • [Wang2019] Wang Yifan, Felice Serena, Shihao Wu, Cengiz Öztireli, and Olga Sorkine-Hornung. Differentiable surface splatting for point-based geometry processing. ACM Transactions on Graphics
    (TOG), Vol. 38, No. 6, pp. 1–14, 2019.


    • [Wu2015] Wu, Zhirong, et al. "3d shapenets: A deep representation for volumetric shapes." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
