
OpenTalks.AI - Victor Lempitsky, 3D Scene Modeling: New Approaches in 2020

OpenTalks.AI
February 04, 2021

Transcript

  1. 3D Scene Modeling with AI: What was new in 2020

    Victor Lempitsky, Samsung AI Center Moscow; Skolkovo Institute of Science and Technology (Skoltech)
  2. 3D scene modeling

    Input photographs/frames + camera parameters → [image-based modeling] → Scene representation / model → [rendering / new view synthesis] → New view of the scene
  3. Classic pipeline

    Input photographs/frames + camera parameters → [3D reconstruction pipeline] → Mesh(es) + Texture(s) → [Graphics rendering engine] → New view of the scene
    Pros: highly optimized and widely supported rendering
    Cons: modeling is tough / brittle. Modeling some geometry and photometry is difficult
  4. Neural rendering approach

    Differentiable representation → Differentiable rendering → Rendered view, compared with ground truth → Loss
    Pros:
    • Higher modeling power, better realism
    • Use of sophisticated losses (perceptual, adversarial)
    Cons:
    • Differentiable rendering can be (very) slow
    • Overfitting is often a problem, weaker inductive prior
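
The fitting loop on this slide (render the representation, compare with ground truth, push the loss gradient back into the representation) can be illustrated with a toy. This sketch is not from the talk: it assumes a hypothetical linear "renderer" (one random matrix per view) so that the gradient can be written by hand in NumPy. Real systems use an autodiff framework and an actual differentiable renderer, and all names below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_texels, n_views, n_pixels = 8, 5, 6

# ASSUMPTION: a linear stand-in for a differentiable renderer.
# Each view mixes the scene parameters ("texels") into pixels.
render_mats = rng.uniform(0.0, 1.0, size=(n_views, n_pixels, n_texels))
true_scene = rng.uniform(0.0, 1.0, size=n_texels)
ground_truth = render_mats @ true_scene          # shape (n_views, n_pixels)


def loss_and_grad(scene):
    """Mean squared error over all training views and its gradient."""
    residual = render_mats @ scene - ground_truth
    loss = np.mean(residual ** 2)
    grad = 2.0 * np.einsum("vpt,vp->t", render_mats, residual) / residual.size
    return loss, grad


scene = np.zeros(n_texels)                       # the representation being fitted
initial_loss, _ = loss_and_grad(scene)
for _ in range(2000):
    _, grad = loss_and_grad(scene)
    scene -= 0.1 * grad                          # plain gradient descent
final_loss, _ = loss_and_grad(scene)
```

With several views the toy problem is well constrained. With fewer views than unknowns many scenes would fit the data equally well, which is a small-scale analogue of the overfitting concern listed above.
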
  5. Neural Radiance Fields (NeRF) [Mildenhall et al. ECCV 2020]

  6. Neural Radiance Fields (NeRF) [Mildenhall et al. ECCV 2020]

    • Positional encoding is used to facilitate high-frequency details
    • The angle parameters are input only to the last layer of the network
    • A “coarse” network is learned alongside the main network to facilitate faster approximate integration
    • Rendering still takes dozens of seconds for a VGA image
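
As an illustration of the positional encoding mentioned above, here is a minimal NumPy sketch. It follows the sin/cos form at frequencies 2^k·π described in the NeRF paper, but the function name and the output ordering (all sines, then all cosines, per coordinate) are assumptions of this sketch rather than the authors' code.

```python
import numpy as np


def positional_encoding(x, num_freqs):
    """Encode each coordinate p as sin(2^k * pi * p), cos(2^k * pi * p), k = 0..num_freqs-1."""
    x = np.asarray(x, dtype=np.float64)
    freqs = (2.0 ** np.arange(num_freqs)) * np.pi      # shape (L,)
    angles = x[..., None] * freqs                      # shape (..., D, L)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)  # (..., D, 2L)
    return enc.reshape(*x.shape[:-1], -1)              # (..., D * 2L)


points = np.array([[0.0, 0.5, -0.25]])                 # one 3D point
encoded = positional_encoding(points, num_freqs=10)    # 3 coords * 2 * 10 = 60 features
```

Nearby input points differ strongly in the high-frequency channels, which makes it easier for a small perceptron to represent fine detail.
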
  7. Neural Radiance Fields (NeRF) [Mildenhall et al. ECCV 2020]

  8. Neural Radiance Fields (NeRF) [Mildenhall et al. ECCV 2020]

    Panels: color rendering, depth
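
Both the color and the depth images come from the same per-ray compositing weights. The sketch below shows the standard volume-rendering quadrature for a single ray, assuming densities and colors have already been queried from the network at sorted depths. Function and variable names are hypothetical.

```python
import numpy as np


def composite_ray(sigmas, colors, t_vals):
    """Accumulate color and expected depth along one ray.

    sigmas: (N,) densities, colors: (N, 3) RGB, t_vals: (N,) sorted sample depths.
    """
    # Distance to the next sample; the last interval is treated as very long.
    deltas = np.diff(t_vals, append=t_vals[-1] + 1e10)
    alphas = 1.0 - np.exp(-sigmas * deltas)            # opacity of each segment
    # Transmittance: probability the ray reaches sample i unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas
    color = (weights[:, None] * colors).sum(axis=0)
    depth = (weights * t_vals).sum()
    return color, depth, weights


t_vals = np.array([1.0, 2.0, 3.0, 4.0])
sigmas = np.array([0.0, 0.0, 1e3, 1e3])                # empty space, then a dense surface
colors = np.array([[0, 1, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)
color, depth, weights = composite_ray(sigmas, colors, t_vals)
```

In this example the first dense sample (red, at depth 3) receives all the weight, so the sample behind it contributes nothing.
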
  9. Deformable NeRF [Park et al. arXiv 2020]

  10. Deformable NeRF [Park et al. arXiv 2020]

    • The deformation field is parameterized by rotation (center + quaternion) and translation
    • Strong deformations are penalized
    • Points recovered by SfM are constrained to stay in place
    • Blurry frames are removed from the training set
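
To make the first bullet concrete, the following sketch applies a rotation about a center (given as a unit quaternion) plus a translation to a single point. It reads the parameterization literally from the slide. In the actual method these parameters would be predicted per point by a network, which is not shown here, and all names are illustrative.

```python
import numpy as np


def quat_to_matrix(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])


def deform(point, center, quat, translation):
    """Rotate `point` about `center`, then translate."""
    rotation = quat_to_matrix(np.asarray(quat, dtype=float))
    return rotation @ (point - center) + center + translation


point = np.array([1.0, 0.0, 0.0])
identity = deform(point, np.zeros(3), [1.0, 0.0, 0.0, 0.0], np.zeros(3))
# 90-degree rotation about the z axis through the origin.
half = np.sqrt(0.5)
rotated = deform(point, np.zeros(3), [half, 0.0, 0.0, half], np.zeros(3))
```

A penalty on strong deformations could then be expressed as a cost on how far the quaternion is from the identity and on the length of the translation.
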
  11. Neural Sparse Voxel Fields [Liu et al. NeurIPS 2020]

    • Geometry is approximated explicitly by an octree
    • Perceptron is sampled at ray-octree intersections
    • Training includes several refine-and-prune stages
    • Order of magnitude speedup over NeRF (still not real-time)
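
The speedup comes from evaluating the network only where a ray passes through occupied space. The sketch below uses a flat list of axis-aligned voxels instead of a real octree (an assumption made for brevity) and a standard slab test to find where the ray is inside each voxel. Names are hypothetical.

```python
import numpy as np


def ray_box_interval(origin, direction, box_min, box_max):
    """Slab test: return (t_near, t_far) where the ray is inside the box, or None."""
    with np.errstate(divide="ignore", invalid="ignore"):
        inv_dir = 1.0 / direction                      # inf for axis-parallel rays
        t0 = (box_min - origin) * inv_dir
        t1 = (box_max - origin) * inv_dir
    t_near = np.nanmax(np.minimum(t0, t1))
    t_far = np.nanmin(np.maximum(t0, t1))
    if t_far < max(t_near, 0.0):
        return None
    return max(t_near, 0.0), t_far


def sample_in_occupied_voxels(origin, direction, voxels, step):
    """Place sample depths only inside voxels that the ray crosses."""
    samples = []
    for box_min, box_max in voxels:
        hit = ray_box_interval(origin, direction, box_min, box_max)
        if hit is not None:
            t_near, t_far = hit
            samples.extend(np.arange(t_near + step / 2, t_far, step))
    return np.sort(np.array(samples))


origin = np.array([-1.0, 0.5, 0.5])
direction = np.array([1.0, 0.0, 0.0])
voxels = [
    (np.zeros(3), np.ones(3)),                                   # crossed by the ray
    (np.array([0.0, 2.0, 0.0]), np.array([1.0, 3.0, 1.0])),      # missed by the ray
]
depths = sample_in_occupied_voxels(origin, direction, voxels, step=0.25)
```

An octree would replace the linear loop over voxels with a hierarchical traversal, and pruning would remove voxels that turn out to be empty during training.
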
  12. Neural Sparse Voxel Fields [Liu et al. NeurIPS 2020]

  13. Deferred neural rendering [Thies et al. ACM ToG 2019]

    • Scene = Mesh geometry + neural texture
    • Neural rendering network is used as the last stage of the rendering pipeline
    • Realistic images are generated even for coarse geometry
  14. Deferred neural rendering [Thies et al. ACM ToG 2019]

    • Realistic images are generated even for coarse geometry
  15. Neural dressing model

    Neural texture; SMPL-X body model [Pavlakos et al. 2019] [Iskakov et al. 2020]
  16. Fullbody avatars with neural textures

  17. Stable view synthesis [Riegler & Koltun 2020]

    Delaunay-based 3D surface reconstruction
  18. Stable view synthesis vs NeRF [Riegler & Koltun 2020]

  19. Neural Point-Based Graphics [Aliev et al. ECCV 2020]

    RGB views and reconstructed point cloud. Panels: RGB, depth, point cloud
  20. Neural Point-Based Graphics [Aliev et al. ECCV 2020]

    Pipeline diagram: points (positions p1, p2, …, pN; descriptors d1, d2, …, dN) → rasterizer + z-buffer → raw images → rendering network → result
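
The "rasterizer + z-buffer" stage of this pipeline can be sketched as follows: each point is projected to a pixel, and the descriptor of the nearest point per pixel is kept, producing the raw image that a rendering network would then translate into RGB. The sketch assumes a simple pinhole camera with points already in camera coordinates and one-pixel splats. These are simplifications for illustration and not necessarily the authors' implementation.

```python
import numpy as np


def rasterize_points(points_cam, descriptors, focal, width, height):
    """Project points and keep the nearest descriptor per pixel (z-buffer)."""
    raw = np.zeros((height, width, descriptors.shape[1]))
    zbuf = np.full((height, width), np.inf)
    for (x, y, z), desc in zip(points_cam, descriptors):
        if z <= 0:                                     # behind the camera
            continue
        u = int(round(focal * x / z + width / 2))      # pinhole projection
        v = int(round(focal * y / z + height / 2))
        if 0 <= u < width and 0 <= v < height and z < zbuf[v, u]:
            zbuf[v, u] = z
            raw[v, u] = desc
    return raw, zbuf


points_cam = np.array([[0.0, 0.0, 2.0],                # far point
                       [0.0, 0.0, 1.0]])               # near point, same pixel
descriptors = np.array([[1.0, 0.0],                    # learnable in the real method
                        [0.0, 1.0]])
raw, zbuf = rasterize_points(points_cam, descriptors, focal=1.0, width=4, height=4)
```

Pixels that no point projects to stay empty, which is one reason a rendering network is needed to fill the holes.
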
  21. Neural Point-Based Graphics

  22. Mesh-based vs Point-Based

    Panels: Deferred Neural Rendering (mesh-based), NPBG (point-based), Nearest Train
  23. Relightable 3D portraits [Sevastopolsky et al. 2020]

  24. Relightable 3D portraits [Sevastopolsky et al. 2020]

    Diagram labels: point cloud + descriptors, z-buffer, neural rendering, albedo, normals, room light, mask, lighting model, relighted view
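
Once albedo and normals are available per pixel, a new lighting condition can be applied with an explicit lighting model. The sketch below assumes the simplest such model, a single directional light with Lambertian shading plus a constant ambient term. The lighting model used in the paper may differ, and all names are illustrative.

```python
import numpy as np


def relight(albedo, normals, light_dir, light_color, ambient=0.0):
    """Lambertian relighting: albedo * (ambient + max(0, n . l) * light_color).

    albedo, normals: (H, W, 3); light_dir, light_color: (3,).
    """
    l = light_dir / np.linalg.norm(light_dir)
    shading = np.clip(normals @ l, 0.0, None)[..., None]    # (H, W, 1)
    return albedo * (ambient + shading * light_color)


albedo = np.full((1, 2, 3), 0.5)
normals = np.array([[[0.0, 0.0, 1.0],                  # facing the light
                     [0.0, 0.0, -1.0]]])               # facing away
relit = relight(albedo, normals,
                light_dir=np.array([0.0, 0.0, 2.0]),
                light_color=np.ones(3))
```
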
  25. Relightable 3D portraits

    Panels: from fixed viewpoint; simultaneous relighting & view interpolation
  26. So far: training/fitting individual scenes

    Differentiable representation → Differentiable rendering → Rendered view, compared with ground truth → Loss (over multiple training views)
  27. Few-shot neural reconstruction

    Encoding/reconstructing neural net → Differentiable representation → Differentiable rendering → Rendered hold-out view, compared with the true hold-out view → Loss
    • Training is performed on a dataset of scenes (tuples of views)
    • New scenes can be reconstructed from few views (or a single view)
  28. SynSin system [Wiles et al. CVPR 2020]

    • One of several recent systems for single-view 3D modeling
    • Uses point-based geometric proxy
  29. SynSin system [Wiles et al. CVPR 2020]

    • Splatting is used to provide gradients over point locations in 2D
    • Alpha-over compositing of the K closest points makes the z-buffer differentiable
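
A hard z-buffer picks a single winner per pixel, so it passes no gradient to the points it discards. Alpha-over compositing of the K nearest points replaces that hard choice with a weighted blend. The sketch below shows the blend for one pixel, assuming per-point opacities are already given (in practice they would depend on the splat footprint). Names are hypothetical.

```python
import numpy as np


def alpha_over_k_nearest(depths, alphas, features, k):
    """Blend the k closest points at one pixel, front to back."""
    order = np.argsort(depths)[:k]                     # indices of the k nearest points
    a = alphas[order]
    f = features[order]
    # Light remaining after passing through all closer points.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - a[:-1]]))
    weights = a * trans
    return weights @ f


depths = np.array([3.0, 1.0, 2.0])
alphas = np.array([0.5, 0.5, 0.5])
features = np.array([[1.0], [2.0], [4.0]])
pixel = alpha_over_k_nearest(depths, alphas, features, k=2)
```

Here the two nearest points get weights 0.5 and 0.25, so the blended feature is 0.5·2 + 0.25·4 = 2.
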
  30. PyTorch3D [Ravi et al. 2020]

    • Differentiable rendering
    • Differentiable structure-and-motion
    • Supported representations:
      • Point clouds
      • Textured meshes
      • NeRFs
  31. nvdiffrast [Laine et al. ToG 2020]

  32. Stereo magnification [Zhou et al. SIGGRAPH 2018]

  33. Stereo magnification [Zhou et al. SIGGRAPH 2018]

  34. Immersive Lightfield Video [Broxton et al. SIGGRAPH 2020] https://augmentedperception.github.io/deepviewvideo/

  35. Immersive Lightfield Video [Broxton et al. SIGGRAPH 2020]

  36. Immersive Lightfield Video [Broxton et al. SIGGRAPH 2020]

  37. Immersive Lightfield Video [Broxton et al. SIGGRAPH 2020]

  38. Recap

    • Various neural scene representations are developing:
      • Perceptron (NeRF)
      • Mesh + neural texture
      • Point cloud + neural descriptors
      • Layered semi-transparent meshes
    • Differentiable renderers (PyTorch3D, nvdiffrast) make integration of neural networks and graphics easier
    • Scene fitting and few-shot reconstruction are both actively developing
  39. References

    Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng: NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. ECCV (1) 2020: 405-421
    Keunhong Park, Utkarsh Sinha, Jonathan T. Barron, Sofien Bouaziz, Dan B. Goldman, Steven M. Seitz, Ricardo Martin-Brualla: Deformable Neural Radiance Fields. CoRR abs/2011.12948 (2020)
    Lingjie Liu, Jiatao Gu, Kyaw Zaw Lin, Tat-Seng Chua, Christian Theobalt: Neural Sparse Voxel Fields. NeurIPS 2020
    Justus Thies, Michael Zollhöfer, Matthias Nießner: Deferred neural rendering: image synthesis using neural textures. ACM Trans. Graph. 38(4): 66:1-66:12 (2019)
    Georgios Pavlakos, Vasileios Choutas, Nima Ghorbani, Timo Bolkart, Ahmed A. A. Osman, Dimitrios Tzionas, Michael J. Black: Expressive Body Capture: 3D Hands, Face, and Body From a Single Image. CVPR 2019: 10975-10985
    Gernot Riegler, Vladlen Koltun: Stable View Synthesis. CoRR abs/2011.07233 (2020)
    Kara-Ali Aliev, Artem Sevastopolsky, Maria Kolos, Dmitry Ulyanov, Victor S. Lempitsky: Neural Point-Based Graphics. ECCV (22) 2020: 696-712
  40. References

    Artem Sevastopolsky, Savva Ignatiev, Gonzalo Ferrer, Evgeny Burnaev, Victor Lempitsky: Relightable 3D Head Portraits from a Smartphone Video. CoRR abs/2012.09963 (2020)
    Olivia Wiles, Georgia Gkioxari, Richard Szeliski, Justin Johnson: SynSin: End-to-End View Synthesis From a Single Image. CVPR 2020: 7465-7475
    Nikhila Ravi, Jeremy Reizenstein, David Novotný, Taylor Gordon, Wan-Yen Lo, Justin Johnson, Georgia Gkioxari: Accelerating 3D Deep Learning with PyTorch3D. CoRR abs/2007.08501 (2020)
    Samuli Laine, Janne Hellsten, Tero Karras, Yeongho Seol, Jaakko Lehtinen, Timo Aila: Modular primitives for high-performance differentiable rendering. ACM Trans. Graph. 39(6): 194:1-194:14 (2020)
    Tinghui Zhou, Richard Tucker, John Flynn, Graham Fyffe, Noah Snavely: Stereo magnification: learning view synthesis using multiplane images. ACM Trans. Graph. 37(4): 65:1-65:12 (2018)
    Michael Broxton, John Flynn, Ryan S. Overbeck, Daniel Erickson, Peter Hedman, Matthew DuVall, Jason Dourgarian, Jay Busch, Matt Whalen, Paul E. Debevec: Immersive light field video with a layered mesh representation. ACM Trans. Graph. 39(4): 86 (2020)