
Image to Point Cloud Translation using Conditional Generative Adversarial Network for Airborne LiDAR data

teddy
June 24, 2021


IMAGE TO POINT CLOUD TRANSLATION USING CONDITIONAL GENERATIVE ADVERSARIAL NETWORK FOR AIRBORNE LIDAR DATA
T. Shinohara, H. Xiu, and M. Matsuoka
Tokyo Institute of Technology, Department of Architecture and Building Engineering, Yokohama, Japan
Keywords: Airborne LiDAR, Point Clouds, Conditional Generative Adversarial Network, Generative Model

Abstract. This study introduces a novel image-to-3D-point-cloud translation method based on a conditional generative adversarial network that creates large-scale 3D point clouds. It can generate, from aerial images, point clouds like those observed via airborne LiDAR. The network is composed of an encoder that produces latent features of input images, a generator that translates latent features into fake point clouds, and a discriminator that classifies point clouds as fake or real. The encoder is a pre-trained ResNet; to overcome the difficulty of generating 3D point clouds in an outdoor scene, we use a FoldingNet with features from the ResNet. After a fixed number of iterations, our generator can produce fake point clouds that correspond to the input image. Experimental results show that our network can learn and generate point clouds using the data from the 2018 IEEE GRSS Data Fusion Contest.

How to cite. Shinohara, T., Xiu, H., and Matsuoka, M.: IMAGE TO POINT CLOUD TRANSLATION USING CONDITIONAL GENERATIVE ADVERSARIAL NETWORK FOR AIRBORNE LIDAR DATA, ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., V-2-2021, 169–174, https://doi.org/10.5194/isprs-annals-V-2-2021-169-2021, 2021.

Transcript

1. Image to Point Cloud Translation using Conditional Generative Adversarial Network for Airborne LiDAR data
   Takayuki Shinohara, Haoyi Xiu, and Masashi Matsuoka
   Tokyo Institute of Technology
   7 July 2021
   Online, ISPRS DIGITAL EDITION


2. Outline
   1. Background and Objectives
   2. Proposed Method
   3. Experimental Results
   4. Conclusion


3. 1. Background and Objectives


4. Image to Point Cloud Translation
   Translation: Single Aerial Photo → 3D Point Clouds
   Data source: https://ieeexplore.ieee.org/document/8328995


5. 3D Reconstruction from Image
   - Multi-View Images
     - Photogrammetric method
     - Many images → high computational cost
   - Deep Learning
     - Statistical estimation
     - Single image → low computational cost
   3D reconstruction from single images using deep learning is important for low-cost 3D restoration.


6. DL-based Reconstruction 1/2
   - Height Image Estimation (Li et al. 2020)
   Previous papers only generate 2D images; since images cannot represent 3D, point cloud reconstruction is necessary.
   Image from: https://ieeexplore.ieee.org/abstract/document/9190011


7. DL-based Reconstruction 2/2
   - Point Cloud Reconstruction (Fan et al. 2017; Lin et al. 2018)
   Only one object per image is targeted; aerial photos contain more complex objects than these previous targets.
   Sources: https://ieeexplore.ieee.org/document/8099747/ and https://www.ci2cv.net/media/papers/AAAI2018_chenhuan.pdf


8. Reconstruction for Complex Objects
   - Image-to-Image Translation
     - Pix2Pix [Isola et al. 2017]
       - A general pipeline for various tasks with a cGAN.
   - Auto-Encoder for Point Clouds
     - FoldingNet [Yang et al. 2018]
       - A network that creates a point cloud from a 2D grid.
       - Easy to extend to create a point cloud from an input image.
   We propose a Pix2Pix-like aerial-photo-to-airborne-point-cloud translation with a FoldingNet-like generator.


9. 2. Proposed Method


10. Overview of Our Method
    - Pix2Pix pipeline (GAN)
      - Encoder E (ResNet) maps the input image x into a latent feature z.
      - Generator G (FoldingNet) reconstructs the point cloud, producing fake data G(z).
      - Discriminator D (PointNet++) judges real data y ~ Real against reconstructed fake data G(z).
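A minimal PyTorch-style sketch of one training step of this pipeline. The names E, G, D, the optimizers, and the chamfer_distance helper (sketched with the loss slide below) are illustrative assumptions, not the authors' code:

```python
# Minimal sketch of one training step of the Pix2Pix-style pipeline above.
# E, G, D stand in for the ResNet encoder, FoldingNet generator, and
# PointNet++ discriminator sketched on the following slides.
def train_step(E, G, D, image, real_points, opt_g, opt_d):
    z = E(image)                          # latent feature of the input image
    fake_points = G(z)                    # translated (fake) point cloud

    # Discriminator: score reconstructed fake data against real data.
    opt_d.zero_grad()
    loss_d = D(fake_points.detach()).mean() - D(real_points).mean()
    loss_d.backward()
    opt_d.step()

    # Encoder + generator: fool D and match the observed point cloud.
    opt_g.zero_grad()
    loss_g = -D(fake_points).mean() + chamfer_distance(fake_points, real_points)
    loss_g.backward()
    opt_g.step()
    return loss_g.item(), loss_d.item()
```

A real Wasserstein-GAN setup would also constrain the discriminator (weight clipping or a gradient penalty); that detail is omitted from this sketch.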


11. Network: Encoder
    - ResNet [He et al. 2015]-based
    Input image x → residual blocks → latent feature E(x) = z.
    The encoder extracts the latent feature of the input image.
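A sketch of such an encoder, assuming a torchvision (≥ 0.13) ResNet-18 backbone truncated before global pooling; the backbone choice and output shape are assumptions, not the paper's exact configuration:

```python
import torch.nn as nn
from torchvision.models import resnet18

# Sketch: ResNet-based encoder that keeps the spatial feature map
# (everything before global average pooling) as the latent feature z.
class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = resnet18(weights="IMAGENET1K_V1")
        # Drop avgpool and fc; keep the conv stem and residual blocks.
        self.features = nn.Sequential(*list(backbone.children())[:-2])

    def forward(self, x):           # x: (B, 3, H, W) aerial image
        return self.features(x)     # z: (B, 512, H/32, W/32) feature map
```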


12. Network: Generator
    - FoldingNet [Yang et al. 2018]-based
    A fixed 2D grid is repeatedly concatenated with the image feature z from the encoder and passed through MLPs, which fold the grid into the reconstructed point cloud G(z) ∈ ℝ^(N×3).
    The generator reconstructs the point cloud from the latent feature extracted by the encoder.
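A FoldingNet-style folding sketch, assuming the encoder's feature map has been pooled into a global vector z; the grid size and layer widths are illustrative, not the paper's values:

```python
import torch
import torch.nn as nn

# Sketch of a FoldingNet-style generator: a fixed 2D grid is folded
# into 3D, conditioned on the encoder's image feature.
class FoldingGenerator(nn.Module):
    def __init__(self, feat_dim=512, grid_size=64):
        super().__init__()
        # Regular 2D grid in [-1, 1]^2, shape (grid_size**2, 2).
        lin = torch.linspace(-1.0, 1.0, grid_size)
        u, v = torch.meshgrid(lin, lin, indexing="ij")
        self.register_buffer("grid", torch.stack([u, v], dim=-1).reshape(-1, 2))
        self.fold1 = nn.Sequential(nn.Linear(feat_dim + 2, 256), nn.ReLU(),
                                   nn.Linear(256, 3))
        self.fold2 = nn.Sequential(nn.Linear(feat_dim + 3, 256), nn.ReLU(),
                                   nn.Linear(256, 3))

    def forward(self, z):                    # z: (B, feat_dim) global feature
        B, n = z.size(0), self.grid.size(0)
        grid = self.grid.unsqueeze(0).expand(B, n, 2)
        feat = z.unsqueeze(1).expand(B, n, z.size(1))
        p = self.fold1(torch.cat([feat, grid], dim=-1))  # first folding
        p = self.fold2(torch.cat([feat, p], dim=-1))     # second folding
        return p                             # (B, n, 3) point cloud
```

The two successive foldings mirror FoldingNet's idea of deforming a 2D grid onto the target surface, here conditioned on the image feature at every grid point.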


13. Network: Point Discriminator
    - PointNet++ [Qi et al. 2017]-based
    The input patch (fake or real points) is progressively downsampled by sampling-and-grouping layers with 1D CNNs (8,192 → 4,096 → 2,048 points), and the network outputs the probability that the patch is real.
    The discriminator judges fake or real.
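A heavily simplified stand-in for such a discriminator: true PointNet++ set abstraction uses farthest-point sampling and ball-query grouping, which are replaced here by random subsampling and shared 1D convolutions to keep the sketch short:

```python
import torch
import torch.nn as nn

# Simplified stand-in for a PointNet++-style discriminator (critic).
class PointDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.mlp1 = nn.Sequential(nn.Conv1d(3, 64, 1), nn.ReLU(),
                                  nn.Conv1d(64, 128, 1), nn.ReLU())
        self.mlp2 = nn.Sequential(nn.Conv1d(128, 256, 1), nn.ReLU())
        self.head = nn.Linear(256, 1)   # unbounded Wasserstein critic score

    @staticmethod
    def subsample(x, n):                # x: (B, C, N) -> (B, C, n)
        idx = torch.randperm(x.size(2), device=x.device)[:n]
        return x[:, :, idx]

    def forward(self, points):          # points: (B, N, 3), e.g. N = 8192
        x = points.transpose(1, 2)      # (B, 3, N)
        x = self.subsample(self.mlp1(x), 4096)   # 8192 -> 4096
        x = self.subsample(self.mlp2(x), 2048)   # 4096 -> 2048
        x = x.max(dim=2).values         # global max pool over points
        return self.head(x)             # score per patch
```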


14. Optimization
    - Reconstruction loss: Chamfer Distance + Earth Mover's Distance
      $L_{rec} = \sum_{x \in S_1} \min_{y \in S_2} \lVert x - y \rVert_2^2 + \sum_{y \in S_2} \min_{x \in S_1} \lVert x - y \rVert_2^2 + \min_{\phi: S_1 \to S_2} \sum_{x \in S_1} \frac{1}{2} \lVert x - \phi(x) \rVert_2^2$
    - GAN loss: Wasserstein loss
      Generator: $L_G = -\mathbb{E}_z[D(G(z))]$
      Discriminator: $L_D = \mathbb{E}_z[D(G(z))] - \mathbb{E}_{\hat{x} \sim \mathbb{P}_{real}}[D(\hat{x})]$
    - Total loss: in the actual training process, we use all of these objective functions.
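A sketch of these objectives in PyTorch. A practical Earth Mover's Distance needs an approximate assignment solver, so only the Chamfer term is implemented here:

```python
import torch

# Chamfer distance between two point sets, as in the formula above.
def chamfer_distance(p1, p2):
    # p1: (B, N, 3), p2: (B, M, 3)
    d = torch.cdist(p1, p2) ** 2            # (B, N, M) squared distances
    return (d.min(dim=2).values.sum(dim=1).mean() +
            d.min(dim=1).values.sum(dim=1).mean())

def wasserstein_g_loss(D, fake):
    return -D(fake).mean()                  # L_G = -E[D(G(z))]

def wasserstein_d_loss(D, fake, real):
    return D(fake).mean() - D(real).mean()  # L_D = E[D(G(z))] - E[D(x)]
```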


15. 3. Experimental Results


16. Experimental Data
    - GRSS Data Fusion Contest 2018 (airborne LiDAR observation)
      - Target area: urban (buildings, vegetation, roads)
      - Training patches: 25 m × 25 m, 2,045 points, 1,000 patches
    (Figure: input aerial photo and ground-truth point cloud over a 25 m × 25 m patch of the target area.)
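A hypothetical sketch of cutting such training patches from a LiDAR tile; the patch size and point count follow the slide, everything else is an assumption for illustration:

```python
import numpy as np

# Hypothetical patch extraction: crop a LiDAR tile into 25 m x 25 m
# patches and randomly sample a fixed number of points from each.
def make_patches(points, patch_size=25.0, n_points=2045):
    # points: (N, 3) array of x, y, z coordinates in meters
    x0, y0 = points[:, 0].min(), points[:, 1].min()
    patches = []
    for ix in range(int((points[:, 0].max() - x0) // patch_size)):
        for iy in range(int((points[:, 1].max() - y0) // patch_size)):
            m = ((points[:, 0] >= x0 + ix * patch_size) &
                 (points[:, 0] <  x0 + (ix + 1) * patch_size) &
                 (points[:, 1] >= y0 + iy * patch_size) &
                 (points[:, 1] <  y0 + (iy + 1) * patch_size))
            pts = points[m]
            if len(pts) >= n_points:  # keep patches with enough points
                idx = np.random.choice(len(pts), n_points, replace=False)
                patches.append(pts[idx])
    return patches
```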


17. Generated Point Cloud
    The proposed GAN and VAE method generated better results than the raw GAN model.


18. 4. Conclusion and Future Work


19. Conclusion and Future Work
    - Conclusion
      - We proposed a conditional adversarial network to translate aerial photos into point clouds as observed by airborne LiDAR.
      - Our trained generator was able to produce clearly recognizable fake point clouds.
    - Future work
      - Only qualitative evaluation => quantitative evaluation
      - Combination with instance semantic segmentation => label-guided point cloud generation
      - Traditional method => change the generator to a recent architecture
