Image to Point Cloud Translation using Conditional Generative Adversarial Network for Airborne LiDAR data

June 24, 2021

T. Shinohara, H. Xiu, and M. Matsuoka
Tokyo Institute of Technology, Department of Architecture and Building Engineering, Yokohama, Japan
Keywords: Airborne LiDAR, Point Clouds, Conditional Adversarial Generative Network, Generative Model

Abstract. This study introduces a novel image-to-3D-point-cloud translation method that uses a conditional generative adversarial network to create large-scale 3D point clouds. From aerial images, it can generate point clouds like those observed via airborne LiDAR. The network is composed of an encoder that produces latent features of the input images, a generator that translates the latent features into fake point clouds, and a discriminator that classifies point clouds as real or fake. The encoder is a pre-trained ResNet; to overcome the difficulty of generating 3D point clouds of outdoor scenes, the generator is a FoldingNet that takes the ResNet features as input. After a fixed number of iterations, the generator can produce fake point clouds that correspond to the input image. Experimental results using data from the 2018 IEEE GRSS Data Fusion Contest show that the network can learn and generate certain point clouds.

How to cite. Shinohara, T., Xiu, H., and Matsuoka, M.: IMAGE TO POINT CLOUD TRANSLATION USING CONDITIONAL GENERATIVE ADVERSARIAL NETWORK FOR AIRBORNE LIDAR DATA, ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., V-2-2021, 169–174, https://doi.org/10.5194/isprs-annals-V-2-2021-169-2021, 2021.




  1. Image to Point Cloud Translation using Conditional Generative Adversarial Network for Airborne LiDAR data

    Takayuki Shinohara, Haoyi Xiu, and Masashi Matsuoka, Tokyo Institute of Technology. 7 July 2021, online, ISPRS Digital Edition.
  2. Outline

    1. Background and Objective 2. Proposed Method 3. Experimental Result 4. Conclusion
  3. Image to Point Cloud Translation

    Translation from a single aerial photo to 3D point clouds. Data source: https://ieeexplore.ieee.org/document/8328995
  4. 3D Reconstruction from Image

    Multi-view images: photogrammetric method; many images; high computational cost.
    Deep learning: statistical estimation; single image; low computational cost.
    3D reconstruction from single images using deep learning is important for low-cost 3D restoration.
  5. DL-based Reconstruction 1/2

    Height image estimation: previous papers generate only 2D images. Images cannot represent 3D structure, so point cloud reconstruction is necessary. Image from: https://ieeexplore.ieee.org/abstract/document/9190011 (Li et al. 2020)
  6. DL-based Reconstruction 2/2

    Point cloud reconstruction: only one object in the image is targeted. Aerial photos contain more complex objects than these previous targets. Fan et al. 2016: https://ieeexplore.ieee.org/document/8099747/ Lin et al. 2018: https://www.ci2cv.net/media/papers/AAAI2018_chenhuan.pdf
  7. Reconstruction for Complex Objects

    Image-to-image translation: Pix2Pix [Isola et al. 2017], a general pipeline for various tasks with cGAN.
    Autoencoder for point clouds: FoldingNet [Yang et al. 2018], a network that creates a point cloud from a 2D grid and is easy to extend so that it creates a point cloud from input images.
    We propose a Pix2Pix-like translation from aerial photos to airborne point clouds with a FoldingNet-like generator.
  8. Overview of Our Method

    Pix2Pix pipeline (translation GAN): input image x -> encoder E (ResNet) -> latent feature z -> generator G (FoldingNet) -> reconstructed fake data G(z) -> discriminator D (PointNet++) -> real/fake. Real data y ~ Real are sampled as the other input to D.
    The encoder maps the input image into a latent feature. The generator reconstructs the point cloud. The discriminator judges whether data are real or reconstructed fake data.
  9. Network: Generator

    ResNet [Shu et al. 2015]-based. Diagram labels: input image x, residual blocks, latent feature z = E(x), reconstructed point cloud.
    The encoder extracts the latent feature of the input image.
  10. Network: Generator

    FoldingNet [Yang et al. 2018]-based. Diagram labels: 2D grid, image feature z from the encoder, concatenation (three times), MLP, point cloud, reconstructed point cloud G(z).
    The generator reconstructs the point cloud from the latent vector extracted by the encoder.
  11. Network: Point Discriminator

    PointNet++ [Qi et al. 2017]-based. Diagram labels: input patch of N points (fake points or real points), sampling and grouping, 1D CNN, downsampling 8,192 -> 4,096 -> 2,048, output probability that the input is real.
    The discriminator judges whether the input is fake or real.
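  12. Optimization

    Reconstruction loss (Chamfer Distance + Earth Mover Distance), where $S_1$ and $S_2$ are the two point sets being compared:
    $L_{rec} = \sum_{x \in S_1} \min_{y \in S_2} \|x - y\|_2^2 + \sum_{x \in S_2} \min_{y \in S_1} \|x - y\|_2^2 + \min_{\phi: S_1 \to S_2} \sum_{x \in S_1} \frac{1}{2} \|x - \phi(x)\|_2^2$
    GAN (Wasserstein loss):
    Generator: $L_G = -\mathbb{E}_z[D(G(z))]$
    Discriminator: $L_D = \mathbb{E}_z[D(G(z))] - \mathbb{E}_{\acute{x} \sim \mathbb{R}}[D(\acute{x})]$
    Total loss: in the actual training process, we use all these objective functions.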
  13. Experimental Data

    GRSS Data Fusion Contest 2018: airborne LiDAR observation.
    Target area: urban area with buildings, vegetation, and roads.
    Training patches: 25 m × 25 m each; 2,045 points per patch; 1,000 patches.
    Figure panels: target area, input aerial photo, GT point cloud (25 m × 25 m).
  14. Generated Point Cloud

    The proposed GAN and VAE method generated better results than the raw GAN model.
  15. Conclusion and Future Work

    Conclusion: We propose a conditional adversarial network that translates aerial photos into point clouds of the kind observed by airborne LiDAR. Our trained generator was able to produce clear fake point clouds.
    Future work: only qualitative evaluation so far => quantitative evaluation; combination with instance semantic segmentation => label-guided point cloud generation; traditional method => change the generator to a more recent architecture.
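
The FoldingNet-based generator slide describes concatenating the encoder's image feature with a 2D grid and passing the result through MLPs. The following is a minimal NumPy sketch of that folding idea, not the authors' implementation. The latent size (64), grid size (32 × 32), hidden width (128), the use of two folding steps, and all function names are assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np

def make_grid(side):
    """Return a (side*side, 2) array of 2D grid points in [-1, 1]^2."""
    axis = np.linspace(-1.0, 1.0, side)
    u, v = np.meshgrid(axis, axis)
    return np.stack([u.ravel(), v.ravel()], axis=1)

def init_mlp(rng, sizes):
    """Random weights for illustration only; a real model would learn these."""
    return [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(x, weights):
    """Apply a small MLP with ReLU between layers (none after the last layer)."""
    for i, (w, b) in enumerate(weights):
        x = x @ w + b
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)
    return x

def fold(z, grid, fold1, fold2):
    """Two folding steps: concat(z, grid) -> 3D, then concat(z, 3D) -> 3D."""
    tiled = np.tile(z, (grid.shape[0], 1))  # repeat the image feature per grid point
    first = mlp(np.concatenate([tiled, grid], axis=1), fold1)
    return mlp(np.concatenate([tiled, first], axis=1), fold2)

rng = np.random.default_rng(0)
latent_dim = 64                    # hypothetical size, not taken from the slides
z = rng.normal(size=latent_dim)    # stands in for the encoder output E(x)
grid = make_grid(32)               # 32 * 32 = 1,024 grid points (hypothetical)
fold1 = init_mlp(rng, [latent_dim + 2, 128, 3])
fold2 = init_mlp(rng, [latent_dim + 3, 128, 3])
points = fold(z, grid, fold1, fold2)
print(points.shape)                # one 3D point per grid point
```

Each grid point yields one output point, so the grid resolution fixes the size of the generated cloud.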
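
The point discriminator slide shows a PointNet++-style hierarchy that reduces the patch from 8,192 to 4,096 to 2,048 points through sampling and grouping. PointNet++ performs its sampling step with farthest point sampling, so a small NumPy sketch of that step may help. It is an illustration only: the toy cloud is much smaller than the slide's 8,192 points, the function name is hypothetical, and the grouping and 1D CNN stages are omitted.

```python
import numpy as np

def farthest_point_sampling(points, k, start=0):
    """Pick k indices so that each new point is farthest from those already chosen."""
    chosen = np.empty(k, dtype=np.int64)
    chosen[0] = start
    # Squared distance from every point to its nearest chosen point so far.
    nearest = np.sum((points - points[start]) ** 2, axis=1)
    for i in range(1, k):
        chosen[i] = int(np.argmax(nearest))
        d = np.sum((points - points[chosen[i]]) ** 2, axis=1)
        nearest = np.minimum(nearest, d)
    return chosen

rng = np.random.default_rng(0)
cloud = rng.uniform(0.0, 25.0, size=(512, 3))  # toy patch; the slide starts from 8,192 points
idx1 = farthest_point_sampling(cloud, 256)     # first halving
level1 = cloud[idx1]
idx2 = farthest_point_sampling(level1, 128)    # second halving
level2 = level1[idx2]
print(cloud.shape, level1.shape, level2.shape)
```

Halving twice mirrors the 8,192 -> 4,096 -> 2,048 schedule on the slide at a smaller scale.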
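
The loss terms on the Optimization slide can be sketched in NumPy as follows. This is an illustration under assumptions, not the authors' training code: squared Euclidean norms are assumed for both distances, the Earth Mover Distance is computed by brute force over all bijections (feasible only for tiny, equal-sized sets), and the critic scores passed to the Wasserstein losses are made-up numbers. All function names are hypothetical.

```python
import itertools
import numpy as np

def pairwise_sq_dists(a, b):
    """Matrix of squared Euclidean distances between rows of a and rows of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff ** 2, axis=2)

def chamfer_distance(s1, s2):
    """Sum of nearest-neighbour squared distances in both directions."""
    d = pairwise_sq_dists(s1, s2)
    return d.min(axis=1).sum() + d.min(axis=0).sum()

def emd_brute_force(s1, s2):
    """Minimum over bijections phi of sum 0.5 * ||x - phi(x)||^2 (tiny sets only)."""
    assert len(s1) == len(s2), "a bijection needs equal-sized sets"
    d = pairwise_sq_dists(s1, s2)
    rows = np.arange(len(s1))
    return min(0.5 * d[rows, list(perm)].sum()
               for perm in itertools.permutations(range(len(s2))))

def generator_loss(d_fake):
    """Wasserstein generator loss: -E[D(G(z))]."""
    return -np.mean(d_fake)

def discriminator_loss(d_fake, d_real):
    """Wasserstein discriminator loss: E[D(G(z))] - E[D(real)]."""
    return np.mean(d_fake) - np.mean(d_real)

s1 = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
s2 = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 1.0]])
cd = chamfer_distance(s1, s2)
emd = emd_brute_force(s1, s2)
d_fake = np.array([0.2, 0.4])   # made-up critic scores for generated clouds
d_real = np.array([0.9, 0.7])   # made-up critic scores for real clouds
print(cd, emd, generator_loss(d_fake), discriminator_loss(d_fake, d_real))
```

For realistic patch sizes the brute-force search is intractable; an assignment solver or an approximation would be needed in its place.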