
Image to Point Cloud Translation using Conditional Generative Adversarial Network for Airborne LiDAR data

teddy
June 24, 2021


IMAGE TO POINT CLOUD TRANSLATION USING CONDITIONAL GENERATIVE ADVERSARIAL NETWORK FOR AIRBORNE LIDAR DATA
T. Shinohara, H. Xiu, and M. Matsuoka
Tokyo Institute of Technology, Department of Architecture and Building Engineering, Yokohama, Japan
Keywords: Airborne LiDAR, Point Clouds, Conditional Generative Adversarial Network, Generative Model

Abstract. This study introduces a novel image-to-3D-point-cloud translation method based on a conditional generative adversarial network that creates large-scale 3D point clouds. It can generate, from aerial images, point clouds like those observed via airborne LiDAR. The network is composed of an encoder that produces latent features of input images, a generator that translates latent features into fake point clouds, and a discriminator that classifies point clouds as fake or real. The encoder is a pre-trained ResNet; to overcome the difficulty of generating 3D point clouds in an outdoor scene, we use a FoldingNet with features from the ResNet. After a fixed number of iterations, our generator can produce fake point clouds that correspond to the input image. Experimental results show that our network can learn and generate point clouds using the data from the 2018 IEEE GRSS Data Fusion Contest.

How to cite. Shinohara, T., Xiu, H., and Matsuoka, M.: IMAGE TO POINT CLOUD TRANSLATION USING CONDITIONAL GENERATIVE ADVERSARIAL NETWORK FOR AIRBORNE LIDAR DATA, ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci., V-2-2021, 169–174, https://doi.org/10.5194/isprs-annals-V-2-2021-169-2021, 2021.

Transcript

1. Image to Point Cloud Translation using Conditional Generative Adversarial Network for Airborne LiDAR data
   Takayuki Shinohara, Haoyi Xiu, and Masashi Matsuoka
   Tokyo Institute of Technology
   7 July 2021
   Online, ISPRS DIGITAL EDITION


2. Outline
   1. Background and Objectives
   2. Proposed Method
   3. Experimental Results
   4. Conclusion


3. 1. Background and Objectives


4. Image to Point Cloud Translation
   Translation: Single Aerial Photo → 3D Point Clouds
   Data source: https://ieeexplore.ieee.org/document/8328995


5. 3D Reconstruction from Image
   - Multi-View Images
     - Photogrammetric method
     - Many images → high computational cost
   - Deep Learning
     - Statistical estimation
     - Single image → low computational cost
   3D reconstruction from single images using deep learning is important for low-cost 3D restoration.


6. DL-based Reconstruction 1/2
   - Height Image Estimation (Li et al. 2020)
   Previous papers only generate 2D images; since images cannot represent 3D, point cloud reconstruction is necessary.
   Image from: https://ieeexplore.ieee.org/abstract/document/9190011


7. DL-based Reconstruction 2/2
   - Point Cloud Reconstruction (Fan et al. 2017; Lin et al. 2018)
   Only one object per image is targeted; aerial photos contain more complex objects than these previous targets.
   Sources: https://ieeexplore.ieee.org/document/8099747/ and https://www.ci2cv.net/media/papers/AAAI2018_chenhuan.pdf


8. Reconstruction for Complex Objects
   - Image-to-Image Translation
     - Pix2Pix [Isola et al. 2017]
       - A general pipeline for various tasks with a cGAN.
   - Auto-Encoder for Point Clouds
     - FoldingNet [Yang et al. 2018]
       - A network that creates a point cloud from a 2D grid.
       - Easy to extend to create a point cloud from an input image.
   We propose a Pix2Pix-like aerial-photo-to-airborne-point-cloud translation with a FoldingNet-like generator.


9. 2. Proposed Method


10. Overview of Our Method
    - Pix2Pix pipeline (GAN)
      - Encoder E (ResNet) maps the input image x into a latent feature z.
      - Generator G (FoldingNet) reconstructs the point cloud, producing fake data G(z).
      - Discriminator D (PointNet++) judges real data y ~ Real against reconstructed fake data G(z).
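A minimal PyTorch-style sketch of one training step of this pipeline. The names E, G, D, the optimizers, and the chamfer_distance helper (sketched with the loss slide below) are illustrative assumptions, not the authors' code:

```python
# Minimal sketch of one training step of the Pix2Pix-style pipeline above.
# E, G, D stand in for the ResNet encoder, FoldingNet generator, and
# PointNet++ discriminator sketched on the following slides.
def train_step(E, G, D, image, real_points, opt_g, opt_d):
    z = E(image)                          # latent feature of the input image
    fake_points = G(z)                    # translated (fake) point cloud

    # Discriminator: score reconstructed fake data against real data.
    opt_d.zero_grad()
    loss_d = D(fake_points.detach()).mean() - D(real_points).mean()
    loss_d.backward()
    opt_d.step()

    # Encoder + generator: fool D and match the observed point cloud.
    opt_g.zero_grad()
    loss_g = -D(fake_points).mean() + chamfer_distance(fake_points, real_points)
    loss_g.backward()
    opt_g.step()
    return loss_g.item(), loss_d.item()
```

A real Wasserstein-GAN setup would also constrain the discriminator (weight clipping or a gradient penalty); that detail is omitted from this sketch.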


11. Network: Encoder
    - ResNet [He et al. 2015]-based
    Input image x → residual blocks → latent feature E(x) = z.
    The encoder extracts the latent feature of the input image.
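A sketch of such an encoder, assuming a torchvision (≥ 0.13) ResNet-18 backbone truncated before global pooling; the backbone choice and output shape are assumptions, not the paper's exact configuration:

```python
import torch.nn as nn
from torchvision.models import resnet18

# Sketch: ResNet-based encoder that keeps the spatial feature map
# (everything before global average pooling) as the latent feature z.
class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = resnet18(weights="IMAGENET1K_V1")
        # Drop avgpool and fc; keep the conv stem and residual blocks.
        self.features = nn.Sequential(*list(backbone.children())[:-2])

    def forward(self, x):           # x: (B, 3, H, W) aerial image
        return self.features(x)     # z: (B, 512, H/32, W/32) feature map
```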


12. Network: Generator
    - FoldingNet [Yang et al. 2018]-based
    A fixed 2D grid is repeatedly concatenated with the image feature z from the encoder and passed through MLPs, which fold the grid into the reconstructed point cloud G(z) ∈ ℝ^(N×3).
    The generator reconstructs the point cloud from the latent feature extracted by the encoder.
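A FoldingNet-style folding sketch, assuming the encoder's feature map has been pooled into a global vector z; the grid size and layer widths are illustrative, not the paper's values:

```python
import torch
import torch.nn as nn

# Sketch of a FoldingNet-style generator: a fixed 2D grid is folded
# into 3D, conditioned on the encoder's image feature.
class FoldingGenerator(nn.Module):
    def __init__(self, feat_dim=512, grid_size=64):
        super().__init__()
        # Regular 2D grid in [-1, 1]^2, shape (grid_size**2, 2).
        lin = torch.linspace(-1.0, 1.0, grid_size)
        u, v = torch.meshgrid(lin, lin, indexing="ij")
        self.register_buffer("grid", torch.stack([u, v], dim=-1).reshape(-1, 2))
        self.fold1 = nn.Sequential(nn.Linear(feat_dim + 2, 256), nn.ReLU(),
                                   nn.Linear(256, 3))
        self.fold2 = nn.Sequential(nn.Linear(feat_dim + 3, 256), nn.ReLU(),
                                   nn.Linear(256, 3))

    def forward(self, z):                    # z: (B, feat_dim) global feature
        B, n = z.size(0), self.grid.size(0)
        grid = self.grid.unsqueeze(0).expand(B, n, 2)
        feat = z.unsqueeze(1).expand(B, n, z.size(1))
        p = self.fold1(torch.cat([feat, grid], dim=-1))  # first folding
        p = self.fold2(torch.cat([feat, p], dim=-1))     # second folding
        return p                             # (B, n, 3) point cloud
```

The two successive foldings mirror FoldingNet's idea of deforming a 2D grid onto the target surface, here conditioned on the image feature at every grid point.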


13. Network: Point Discriminator
    - PointNet++ [Qi et al. 2017]-based
    The input patch (fake or real points) is progressively downsampled by sampling-and-grouping layers with 1D CNNs (8,192 → 4,096 → 2,048 points), and the network outputs the probability that the patch is real.
    The discriminator judges fake or real.
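A heavily simplified stand-in for such a discriminator: true PointNet++ set abstraction uses farthest-point sampling and ball-query grouping, which are replaced here by random subsampling and shared 1D convolutions to keep the sketch short:

```python
import torch
import torch.nn as nn

# Simplified stand-in for a PointNet++-style discriminator (critic).
class PointDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.mlp1 = nn.Sequential(nn.Conv1d(3, 64, 1), nn.ReLU(),
                                  nn.Conv1d(64, 128, 1), nn.ReLU())
        self.mlp2 = nn.Sequential(nn.Conv1d(128, 256, 1), nn.ReLU())
        self.head = nn.Linear(256, 1)   # unbounded Wasserstein critic score

    @staticmethod
    def subsample(x, n):                # x: (B, C, N) -> (B, C, n)
        idx = torch.randperm(x.size(2), device=x.device)[:n]
        return x[:, :, idx]

    def forward(self, points):          # points: (B, N, 3), e.g. N = 8192
        x = points.transpose(1, 2)      # (B, 3, N)
        x = self.subsample(self.mlp1(x), 4096)   # 8192 -> 4096
        x = self.subsample(self.mlp2(x), 2048)   # 4096 -> 2048
        x = x.max(dim=2).values         # global max pool over points
        return self.head(x)             # score per patch
```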


14. Optimization
    - Reconstruction loss: Chamfer Distance + Earth Mover's Distance
      $L_{rec} = \sum_{x \in S_1} \min_{y \in S_2} \lVert x - y \rVert_2^2 + \sum_{y \in S_2} \min_{x \in S_1} \lVert x - y \rVert_2^2 + \min_{\phi: S_1 \to S_2} \sum_{x \in S_1} \frac{1}{2} \lVert x - \phi(x) \rVert_2^2$
    - GAN loss: Wasserstein loss
      Generator: $L_G = -\mathbb{E}_z[D(G(z))]$
      Discriminator: $L_D = \mathbb{E}_z[D(G(z))] - \mathbb{E}_{\hat{x} \sim \mathbb{P}_{real}}[D(\hat{x})]$
    - Total loss: in the actual training process, we use all of these objective functions.
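A sketch of these objectives in PyTorch. A practical Earth Mover's Distance needs an approximate assignment solver, so only the Chamfer term is implemented here:

```python
import torch

# Chamfer distance between two point sets, as in the formula above.
def chamfer_distance(p1, p2):
    # p1: (B, N, 3), p2: (B, M, 3)
    d = torch.cdist(p1, p2) ** 2            # (B, N, M) squared distances
    return (d.min(dim=2).values.sum(dim=1).mean() +
            d.min(dim=1).values.sum(dim=1).mean())

def wasserstein_g_loss(D, fake):
    return -D(fake).mean()                  # L_G = -E[D(G(z))]

def wasserstein_d_loss(D, fake, real):
    return D(fake).mean() - D(real).mean()  # L_D = E[D(G(z))] - E[D(x)]
```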


15. 3. Experimental Results


16. Experimental Data
    - GRSS Data Fusion Contest 2018 (airborne LiDAR observation)
      - Target area: urban (buildings, vegetation, roads)
      - Training patches: 25 m × 25 m, 2,045 points, 1,000 patches
    (Figure: input aerial photo and ground-truth point cloud over a 25 m × 25 m patch of the target area.)
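A hypothetical sketch of cutting such training patches from a LiDAR tile; the patch size and point count follow the slide, everything else is an assumption for illustration:

```python
import numpy as np

# Hypothetical patch extraction: crop a LiDAR tile into 25 m x 25 m
# patches and randomly sample a fixed number of points from each.
def make_patches(points, patch_size=25.0, n_points=2045):
    # points: (N, 3) array of x, y, z coordinates in meters
    x0, y0 = points[:, 0].min(), points[:, 1].min()
    patches = []
    for ix in range(int((points[:, 0].max() - x0) // patch_size)):
        for iy in range(int((points[:, 1].max() - y0) // patch_size)):
            m = ((points[:, 0] >= x0 + ix * patch_size) &
                 (points[:, 0] <  x0 + (ix + 1) * patch_size) &
                 (points[:, 1] >= y0 + iy * patch_size) &
                 (points[:, 1] <  y0 + (iy + 1) * patch_size))
            pts = points[m]
            if len(pts) >= n_points:  # keep patches with enough points
                idx = np.random.choice(len(pts), n_points, replace=False)
                patches.append(pts[idx])
    return patches
```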


17. Generated Point Cloud
    The proposed GAN and VAE method generated better results than the raw GAN model.


18. 4. Conclusion and Future Work


19. Conclusion and Future Work
    - Conclusion
      - We proposed a conditional adversarial network to translate aerial photos into point clouds as observed by airborne LiDAR.
      - Our trained generator was able to produce clearly recognizable fake point clouds.
    - Future work
      - Only qualitative evaluation => quantitative evaluation
      - Combination with instance semantic segmentation => label-guided point cloud generation
      - Traditional method => change the generator to a recent architecture
