
FWNetAE: Spatial Representation Learning for Full Waveform Data Using Deep Learning

teddy
December 09, 2019

Artificial Intelligence for 3D Big Spatial Data Processing (AI3D 2019), Co-located with IEEE ISM 2019, San Diego, California, USA, December 9-11, 2019


Transcript

  1. FWNetAE: Spatial Representation Learning for
    Full Waveform Data Using Deep Learning
    Takayuki Shinohara, Haoyi Xiu and Masashi Matsuoka
    Tokyo Institute of Technology
    2019.12.09
    Wyndham San Diego Bay Side
    San Diego, California, USA
    Second International Workshop on
    Artificial Intelligence for 3D Big Spatial Data Processing
    (AI3D 2019)


  2. Outline
    1. Background and Objective
    2. Related Study
    3. Proposed Method
    4. Experimental Results
    5. Conclusion and Future Study

  3. Background and Objective

  4. Airborne laser scanning (ALS)
    ■ Applications
    • Digital Terrain Models (DTMs), Digital Surface Models (DSMs)
    • Urban planning
    • Natural disaster management
    • Forestry
    • Facility monitoring
    ■ Laser scanners
    • First-and-last-pulse laser scanner (records only the first and last returns)
    • Multi-pulse laser scanner
    • Full-waveform laser scanner

  5. Full-waveform laser scanners
    ■ Advantages
    • Record the entire reflected signal (the "waveform") as discrete samples.
    • Provide not only 3D point clouds but also additional
    information about the properties of the target.
    The shape and power of the backscattered waveform are related to
    the geometry and reflection characteristics of the surface.
    Cited from “Urban land cover classification using airborne LiDAR data: A review”

  6. Automatic analysis method
    ■ Manual processing is costly and time-consuming
    • ALS offers significant advantages for large-area observation.
    • Manually extracting spatial information from point clouds and
    their waveforms is costly and time-consuming.
    => Automatic data analysis methods are necessary.
    ■ Automatic analysis methods for full-waveform data
    • Divided into two approaches
    ‣ 3D point clouds with handcrafted waveform features
    ‣ Raw waveforms
    In this study, we investigate raw waveform analysis.

  7. Raw waveform analysis
    ■ Self-organizing map (E. Maset et al., 2015)
    • The first data-driven feature extraction method.
    ■ Deep learning (Zorzi et al., 2019)
    • The first method to combine raw waveforms with a 2D grid.
    The main limitation:
    • Each waveform is learned individually.
    ‣ This makes it difficult to deal with spatially irregular data.
    ‣ Most deep learning methods are developed for regular data such as
    images, audio, and language.
    Research Question:
    Can deep neural networks learn from spatially irregular
    full-waveform LiDAR data?

  8. Addressing the question
    ■ Dealing with spatially irregular data
    • PointNet (Qi et al., 2017)
    ‣ A deep learning method for spatially irregular data
    ■ Investigating the power of deep learning methods
    • Autoencoder
    ‣ A form of representation learning.
    ‣ A data-driven feature extraction method.
    Objective:
    Develop, using deep learning,
    a new representation learning method
    for spatially irregular raw full-waveform data.

  9. Related Study

  10. Representation learning
    ■ What is representation learning?
    • Automatically discovering representations from raw data.
    ■ Autoencoder (AE)
    • An AE is an architecture that learns to reconstruct its input.
    • The input is reconstructed via a latent vector.
    • The latent vector holds the essential information in low dimensions.
    If training is successful, the bottleneck layer provides
    a low-dimensional latent vector for each input.
    [Figure: Input → Encode → Latent vector → Decode → Reconstructed input]
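
As a rough illustration of the encode/decode idea on this slide, the sketch below trains a tiny linear autoencoder in plain Python. Everything in it is an assumption made for illustration (the toy 2-D data, the parameter names a, b, c, d, the learning rate and step count); it is not the FWNetAE architecture.

```python
# Toy linear autoencoder: 2-D input -> 1-D latent -> 2-D reconstruction.
# All names and values here are illustrative assumptions, not FWNetAE itself.

# Training points lie on the line x2 = 2 * x1, so one latent number suffices.
DATA = [(t, 2.0 * t) for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]


def initial_params():
    """Small non-zero starting weights for encoder (a, b) and decoder (c, d)."""
    return {"a": 0.1, "b": 0.1, "c": 0.1, "d": 0.1}


def encode(p, point):
    """Compress a 2-D point into a single latent value z."""
    return p["a"] * point[0] + p["b"] * point[1]


def decode(p, z):
    """Expand the latent value z back into a 2-D point."""
    return (p["c"] * z, p["d"] * z)


def loss(p):
    """Mean of half the squared reconstruction error over DATA."""
    total = 0.0
    for x1, x2 in DATA:
        r1, r2 = decode(p, encode(p, (x1, x2)))
        total += 0.5 * ((r1 - x1) ** 2 + (r2 - x2) ** 2)
    return total / len(DATA)


def train(steps=2000, lr=0.05):
    """Plain gradient descent on the reconstruction loss."""
    p = initial_params()
    for _ in range(steps):
        grad = {k: 0.0 for k in p}
        for x1, x2 in DATA:
            z = encode(p, (x1, x2))
            e1 = p["c"] * z - x1  # reconstruction error, first coordinate
            e2 = p["d"] * z - x2  # reconstruction error, second coordinate
            dz = e1 * p["c"] + e2 * p["d"]  # gradient flowing back into z
            grad["c"] += e1 * z
            grad["d"] += e2 * z
            grad["a"] += dz * x1
            grad["b"] += dz * x2
        for k in p:
            p[k] -= lr * grad[k] / len(DATA)
    return p


trained = train()
print("loss before training:", loss(initial_params()))
print("loss after training: ", loss(trained))
```

The single value returned by `encode` plays the role of the latent vector: once training has driven the loss down, that one number is enough to recover both coordinates.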


  11. Full-waveform data analysis
    ■ Point cloud data with handcrafted features
    • Rule-based
    • Handcrafted features and classic machine learning
    ⇒ Full-waveform laser scanners are highly advantageous
    for point cloud classification.
    These methods rely on handcrafted features that are fed to statistical classifiers or
    simple machine-learning algorithms.
    ■ Raw waveform data
    • Unsupervised classification
    ‣ Self-organizing map
    • Supervised classification
    ‣ Deep learning-based approach

  12. Deep learning-based approach
    ■ Two-stage classification (S. Zorzi et al., 2019)
    • 1st stage: waveform analysis by a 1D CNN
    ‣ Features are extracted from each waveform individually.
    ⁃ High misclassification rate.
    • 2nd stage: spatial analysis by a 2D CNN
    ‣ The raw classification results are converted
    to grid data with height information.
    ‣ The class of each grid cell is predicted.
    Individual waveform learning is ineffective.
    Grid-based spatial learning is effective.
    ⇒ Spatial learning methods for
    raw waveform data are needed.
    [Figure labels: Waveform, Spatial]

  13. Spatial deep learning method
    ■ PointNet (Qi et al., 2017)
    • The first method for raw point cloud (set of x, y, z) data.
    • PointNet can deal with spatially irregular data.
    We extend it to the analysis of spatially irregular raw full-waveform data.
    [Figure: PointNet (Qi et al., 2017)]

  14. Proposed Method

  15. Problem definition
    ■ Reconstruction of input data using an autoencoder
    Encoder: transforms the full-waveform data into a latent vector z
    Decoder: transforms the latent vector z back into the input data
    Each (x, y) location has a waveform.
    [Figure: Input data (x, y) with input waveform → Encoder → Latent vector z → Decoder → Reconstructed data (x, y) with reconstructed waveform]
    The spatial distribution and its waveforms are reconstructed.

  16. Proposed network (FWNetAE)
    ■ Autoencoder for full-waveform data
    • Encoder: PointNet-based
    • Decoder: simple multilayer perceptron (MLP)
    [Figure: FWNetAE architecture. Labels: input full-waveform LiDAR data (2,048 × 62), T-nets, 1D Conv, features, Max Pool, latent vector, MLP, output full-waveform LiDAR data (2,048 × 62); PointNet-based encoder, decoder]

  17. PointNet-based encoder 1/3
    ■ The first block
    • Computes local features for each data point.
    • Consists of several 1D convolution layers.

  18. PointNet-based encoder 2/3
    ■ The second block
    • T-nets make the points invariant to rigid transformations.
    • This yields features that are robust to the orientation of objects.

  19. PointNet-based encoder 3/3
    ■ The third block
    • Computes a global feature (the latent vector) over all the data
    using a max-pooling layer as a symmetric function.
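
A minimal sketch of why max pooling works as a symmetric function: pooling per-point features column by column gives the same vector no matter how the points are ordered. The `point_features` function is a hand-written, hypothetical stand-in for learned per-point layers, and the three points are made up; this is not the FWNetAE encoder.

```python
def point_features(point):
    """Hypothetical fixed per-point feature map (stands in for learned layers)."""
    x, y = point
    return [x + y, x - y, x * y, max(x, y)]


def max_pool(feature_rows):
    """Column-wise maximum over all points: one global feature vector."""
    return [max(column) for column in zip(*feature_rows)]


def encode(points):
    """Apply the same feature map to every point, then pool across points."""
    return max_pool([point_features(p) for p in points])


points = [(0.0, 1.0), (2.0, -1.0), (0.5, 0.5)]
shuffled = [points[2], points[0], points[1]]

# The pooled vector does not depend on the order of the points.
print(encode(points) == encode(shuffled))
```

Because the maximum ignores ordering, an unordered and spatially irregular set of points can be summarized by one fixed-length vector.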

  20. Decoder
    ■ MLP-based architecture
    • Fully connected layers produce reconstructed data with the same
    dimensions as the input data.
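
The following sketch shows, under stated assumptions, how fully connected layers can map a latent vector back to a table with one 62-value row per point. Only the 62 features per point come from the slides; the latent size, hidden size, point count, random weights, and all function names are hypothetical and scaled down so the example runs quickly.

```python
import random


def dense(vec, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, biases)]


def relu(vec):
    return [max(0.0, v) for v in vec]


def make_layer(rng, n_in, n_out):
    """Random (untrained) weights, for shape illustration only."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def decode(latent, layers, n_points, n_features):
    """Run the MLP, then reshape the flat output into n_points rows."""
    h = latent
    for i, (weights, biases) in enumerate(layers):
        h = dense(h, weights, biases)
        if i < len(layers) - 1:  # no activation after the output layer
            h = relu(h)
    return [h[r * n_features:(r + 1) * n_features] for r in range(n_points)]


rng = random.Random(0)
LATENT_DIM, HIDDEN, N_POINTS, N_FEATURES = 8, 16, 4, 62  # assumed sizes
layers = [make_layer(rng, LATENT_DIM, HIDDEN),
          make_layer(rng, HIDDEN, N_POINTS * N_FEATURES)]
latent = [rng.uniform(-1.0, 1.0) for _ in range(LATENT_DIM)]

recon = decode(latent, layers, N_POINTS, N_FEATURES)
print(len(recon), len(recon[0]))
```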

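  21. Loss function
    ■ Loss function for reconstruction
    • FWNetAE aims to produce reconstructed data $\hat{X}$ from the latent
    vector $z$ obtained by encoding the input data $X$.
    • Spatial matching loss:
    $$L_{\mathrm{spatial}} = \frac{1}{2N} \sum_{i=1}^{N} \left( (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right) \quad (1)$$
    • Waveform reconstruction loss:
    $$L_{\mathrm{waveform}} = \frac{1}{2N} \sum_{i=1}^{N} \sum_{j=1}^{M} \left( w_{i,j} - \hat{w}_{i,j} \right)^2 \quad (2)$$
    where $N$ is the number of points, $M$ is the number of waveform samples per point,
    $(x_i, y_i)$ and $w_{i,j}$ are the input coordinates and waveform values, and hats denote reconstructed values.
    Optimization process: minimize both functions.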
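
A minimal sketch of the two reconstruction losses, read here as a summed squared error over coordinates and over waveform samples, each scaled by 1/(2N). The function names and the toy inputs are assumptions for illustration only.

```python
def spatial_loss(xy, xy_hat):
    """Squared (x, y) error summed over points, scaled by 1 / (2N)."""
    n = len(xy)
    total = sum((x - xh) ** 2 + (y - yh) ** 2
                for (x, y), (xh, yh) in zip(xy, xy_hat))
    return total / (2 * n)


def waveform_loss(w, w_hat):
    """Squared waveform error summed over points and samples, scaled by 1 / (2N)."""
    n = len(w)
    total = sum((a - b) ** 2
                for row, row_hat in zip(w, w_hat)
                for a, b in zip(row, row_hat))
    return total / (2 * n)


# Two points; only the first point's y coordinate is reconstructed wrongly.
xy = [(0.0, 0.0), (1.0, 1.0)]
xy_hat = [(0.0, 1.0), (1.0, 1.0)]

# Two waveforms of two samples each; one sample is off by 2.
w = [[1.0, 2.0], [3.0, 4.0]]
w_hat = [[1.0, 0.0], [3.0, 4.0]]

print(spatial_loss(xy, xy_hat))   # 1 / (2 * 2) = 0.25
print(waveform_loss(w, w_hat))    # 4 / (2 * 2) = 1.0
```

How the two terms are weighted against each other during optimization is not stated on the slide, so the sketch keeps them separate.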


  22. Experimental Results

  23. Dataset
    ■ Dublin City dataset
    • Published by NYU
    • Point density
    ‣ about 300 points/m²
    • Area used
    ‣ One of the flight paths
    ■ Training data
    • Sample size:
    ‣ Train, Val, Test: 300,000, 100,000, 100,000
    • Input dimension: 2,048 × 62 (x, y, waveform)
    ‣ To simplify the problem, we selected only data with 60
    returns.
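
To make the 2,048 × 62 input layout concrete, the sketch below assembles one row per point. It reads the 62 columns as 2 coordinates followed by 60 waveform values; that column order, the helper names, and the dummy values are assumptions, not something the slides specify.

```python
N_POINTS = 2048   # points per input sample (from the slide)
N_WAVE = 60       # waveform values per point (from the slide)
N_FEATURES = 2 + N_WAVE  # x, y, then the waveform: 62 columns


def make_row(x, y, waveform):
    """Pack one point into a flat 62-value row (assumed column order)."""
    if len(waveform) != N_WAVE:
        raise ValueError("expected %d waveform values" % N_WAVE)
    return [x, y] + list(waveform)


def split_row(row):
    """Unpack a row back into x, y and the waveform values."""
    return row[0], row[1], row[2:]


# Dummy sample: 2,048 points, each with a flat (all-zero) waveform.
sample = [make_row(float(i), 0.0, [0.0] * N_WAVE) for i in range(N_POINTS)]
print(len(sample), len(sample[0]))
```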
    [Figure: data used, observed from one flight path]

  24. Reconstruction results
    ■ Spatial reconstruction
    The reconstructed shapes matched the inputs.
    The mean error over all test data was 0.051 (normalized value).
    [Figure label: Failure case]

  25. Reconstruction results
    ■ Waveform reconstruction
    The reconstructed shapes matched the inputs.
    The mean error over all test data was 0.29.
    [Figure label: Failure case]

  26. Latent space visualization
    ■ Comparison of several methods
    [Figure labels: PCA, Nonspatial AE, Proposed method; Without learning, Learnable function]
    The spatial learning method is effective for feature extraction.

  27. Conclusion and Future Study

  28. Conclusion and future study
    ■ Conclusion
    • This paper presents a novel representation learning method for
    spatially distributed full-waveform data observed from an ALS, using an
    AE-based architecture called FWNetAE.
    • The results demonstrate generalization to unseen test data.
    • Moreover, FWNetAE encoded a meaningful latent vector, and the
    decoder reconstructed the spatial geometry and its waveform values
    from the encoded latent vector.
    • However, the PointNet-based encoder could not deal with varying
    input dimensions or extract features at multiple resolutions.
    ■ Future study
    • Modern hierarchical learning: PointNet++, Dynamic Graph CNN
    • Application to supervised learning

  29. Supplemental

  30. Making input data
    ■ k-NN
    • K depends on GPU memory.
    ‣ In this study, K is 2,048.
    ‣ A larger K allows a larger spatial context to be considered.
    ⁃ Context is very important.
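
A minimal sketch of building one input patch with k-nearest neighbours. It assumes a patch is made by picking a seed point at random and gathering the K points closest to it in the (x, y) plane; the function name, the synthetic point cloud, and the small K used here are hypothetical (the study uses K = 2,048).

```python
import math
import random


def knn_patch(points, seed_index, k):
    """Return the k points nearest to points[seed_index], nearest first."""
    sx, sy = points[seed_index]
    order = sorted(
        range(len(points)),
        key=lambda i: math.hypot(points[i][0] - sx, points[i][1] - sy),
    )
    return [points[i] for i in order[:k]]


rng = random.Random(0)
# Synthetic (x, y) positions standing in for a real point cloud.
cloud = [(rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0)) for _ in range(200)]

K = 16  # kept small for the example
seed_index = rng.randrange(len(cloud))
patch = knn_patch(cloud, seed_index, K)
print(len(patch))
```

Sorting every point by distance is fine for a sketch, but for a dense city-scale point cloud a spatial index such as a k-d tree would be the more practical choice.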
    [Figure labels: Random sample, Near sample]

  31. Spatial deep learning method
    ■ Euclidean data
    • Image
    • Audio signal
    • Natural language
    ■ Non-Euclidean data
    • Graph
    • Point cloud
    Full-waveform LiDAR data are one type of non-Euclidean data.

  32. T-Net
    ■ Transform
    Even when the same object is input,
    it is difficult to deal with rotation.
    ⇒ T-Net can provide rotation-invariant
    features.
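
The sketch below illustrates the rotation problem and the kind of correction a T-Net is meant to supply. In PointNet, a T-Net predicts a transformation matrix that is applied to the input before feature extraction; here the matrix is computed analytically from a known angle rather than predicted by a network, so this only shows the effect of a good alignment, not how it is learned. All names and values are illustrative assumptions.

```python
import math


def rotate(points, angle):
    """Rotate 2-D points counter-clockwise by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def apply_transform(points, matrix):
    """Multiply every point by a 2x2 matrix (the role of a predicted transform)."""
    (a, b), (c, d) = matrix
    return [(a * x + b * y, c * x + d * y) for x, y in points]


shape = [(1.0, 0.0), (0.0, 2.0), (-1.0, 0.0)]
angle = math.pi / 3
rotated = rotate(shape, angle)  # same object, different raw coordinates

# Stand-in for a T-Net output: the inverse rotation, computed analytically.
alignment = ((math.cos(angle), math.sin(angle)),
             (-math.sin(angle), math.cos(angle)))
aligned = apply_transform(rotated, alignment)

print(rotated[0])   # differs from shape[0]
print(aligned[0])   # back to shape[0], up to rounding
```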
