Slide 1

FWNetAE: Spatial Representation Learning for Full Waveform Data Using Deep Learning
Takayuki Shinohara, Haoyi Xiu and Masashi Matsuoka
Tokyo Institute of Technology
2019.12.09, Wyndham San Diego Bayside, San Diego, California, USA
Second International Workshop on Artificial Intelligence for 3D Big Spatial Data Processing (AI3D 2019)

Slide 2

Outline
1. Background and Objective
2. Related Study
3. Proposed Method
4. Experimental Results
5. Conclusion and Future Study

Slide 3

Background and Objective

Slide 4

Airborne laser scanning (ALS)
■ Applications
  • Digital Terrain Models (DTMs), Digital Surface Models (DSMs)
  • Urban planning
  • Natural disaster management
  • Forestry
  • Facility monitoring
■ Laser scanners
  • Only the first and last pulse
  • Multi-pulse laser scanner
  • Full-waveform laser scanner

Slide 5

Full waveform laser scanners
■ Advantages
  • Record the entire reflected signal (the "waveform") as a discretely sampled sequence.
  • Provide not only 3D point clouds, but also additional information regarding the target properties.
The shape and power of the backscattered waveform are related to the geometry and reflection characteristics of the surface.
[Figure cited from "Urban land cover classification using airborne LiDAR data: A review"]

Slide 6

Automatic analysis method
■ Manual processing is costly and time consuming
  • ALS presents significant advantages for large-area observation.
  • Manually extracting spatial information from point clouds and their waveforms is costly and time consuming.
  ⇒ Automatic data analysis methods are necessary.
■ Automatic analysis methods for full waveform data
  • Divided into two approaches:
    ‣ 3D point clouds with manmade waveform features
    ‣ Raw waveforms
In this study, we investigate raw waveform analysis.

Slide 7

Raw waveform analysis
■ Self-organizing map (E. Maset et al., 2015)
  • The first data-driven feature extraction method.
■ Deep learning (Zorzi et al., 2019)
  • The first method combining raw waveforms with a 2D grid.
The main limitation:
  • Each waveform is learned individually.
    ‣ Difficult to deal with spatially irregular data.
    ‣ Most deep learning methods are developed for regular data such as images, audio, and language.
Research Question: Can deep neural networks learn spatially irregular full waveform LiDAR data?

Slide 8

To address the question
■ Dealing with spatially irregular data
  • PointNet (Qi et al., 2017)
    ‣ A deep learning method for spatially irregular data.
■ Investigating the power of deep learning
  • Auto Encoder (AE)
    ‣ A representation learning method.
    ‣ A data-driven feature extraction method.
Objective: Develop a new representation learning method for spatially irregular raw full-waveform data using deep learning.

Slide 9

Related Study

Slide 10

Representation learning
■ What is representation learning?
  • Discovering representations from raw data automatically.
■ Auto Encoder (AE)
  • An architecture that learns to reconstruct its input.
  • Reconstructs the input via a latent vector.
  • The latent vector holds low-dimensional, essential information.
If training is successful, the bottleneck layer provides a low-dimensional latent vector for each input.
[Figure: Input → Encode → Latent Vector → Decode → Reconstructed Input]
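As a concrete illustration of the encode–decode idea, here is a minimal auto encoder sketch in PyTorch; the layer sizes and the 62-dimensional input are illustrative assumptions, not the presentation's implementation.

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    """Minimal auto encoder: input -> latent vector -> reconstruction."""
    def __init__(self, in_dim=62, latent_dim=8):  # sizes are assumptions
        super().__init__()
        # Encoder compresses the input down to a low-dimensional latent vector.
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 32), nn.ReLU(),
            nn.Linear(32, latent_dim),
        )
        # Decoder reconstructs the input from the latent vector.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32), nn.ReLU(),
            nn.Linear(32, in_dim),
        )

    def forward(self, x):
        z = self.encoder(x)           # bottleneck: the latent vector
        return self.decoder(z), z     # reconstruction plus latent vector

model = AutoEncoder()
x = torch.randn(4, 62)                        # a batch of 4 inputs
x_hat, z = model(x)
loss = nn.functional.mse_loss(x_hat, x)       # reconstruction loss
```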

Slide 11

Full waveform data analysis
■ Point cloud data with manmade features
  • Rule-based methods
  • Manmade features with classic machine learning
  ⇒ Full-waveform laser scanners are highly advantageous for point cloud classification, but these methods rely on manmade features fed to statistical classifiers or simple machine-learning algorithms.
■ Raw waveform data
  • Unsupervised classification
    ‣ Self-organizing map
  • Supervised classification
    ‣ Deep learning-based approach

Slide 12

Deep learning-based approach
■ Two-stage classification (S. Zorzi et al., 2019)
  • 1st stage: waveform analysis by a 1D CNN
    ‣ Individual feature extraction per waveform.
      ⁃ High misclassification rate.
  • 2nd stage: spatial analysis by a 2D CNN
    ‣ Raw classification results are converted to grid data with height information.
    ‣ Predicts the class of each grid cell.
Individual waveform learning is ineffective; grid-based spatial learning is effective.
⇒ A spatial learning method for raw waveform data is needed.

Slide 13

Spatial deep learning method
■ PointNet (Qi et al., 2017)
  • The first method for raw point cloud data (sets of x, y, z).
  • PointNet can deal with spatially irregular data.
We extended it to spatially irregular raw full waveform data analysis.
[Figure: PointNet architecture (Qi et al., 2017)]

Slide 14

Proposed Method

Slide 15

Problem definition
■ Reconstruction of input data using an Auto Encoder
  • Encoder: maps full-waveform data to a latent vector z.
  • Decoder: transforms the latent vector z back to the input data.
Each (x, y) location has a waveform; the network must reconstruct the spatial distribution and its waveform.
[Figure: input data → Encoder → latent vector z → Decoder → reconstructed data, with an input and reconstructed waveform at each (x, y)]

Slide 16

Proposed network (FWNetAE)
■ Auto Encoder for full waveform data
  • Encoder: PointNet based
  • Decoder: simple Multi Layer Perceptron (MLP)
[Figure: FWNetAE architecture — 2,048 × 62 input full waveform LiDAR data → 1D Conv features with T-Nets → Max Pool → latent vector → MLP decoder → 2,048 × 62 output full-waveform LiDAR data]

Slide 17

PointNet based encoder 1/3
■ The first block
  • Computes local features for each point in the input set.
  • Several 1D convolution layers.
[Architecture figure as on Slide 16]
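A sketch of what such shared 1D convolutions look like in PyTorch; the channel sizes are assumptions, and a kernel size of 1 means the same weights are applied to every point independently.

```python
import torch
import torch.nn as nn

# Shared per-point feature extraction: kernel-size-1 Conv1d layers apply the
# same small MLP to every point (the 62 input features become the channels).
local_feat = nn.Sequential(
    nn.Conv1d(62, 64, kernel_size=1), nn.BatchNorm1d(64), nn.ReLU(),
    nn.Conv1d(64, 128, kernel_size=1), nn.BatchNorm1d(128), nn.ReLU(),
)

x = torch.randn(4, 62, 2048)   # (batch, features, points)
f = local_feat(x)              # (4, 128, 2048): one feature vector per point
```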

Slide 18

PointNet based encoder 2/3
■ The second block
  • T-Nets make the points independent of rigid transformations.
  • We obtain features that are robust to object angle.
[Architecture figure as on Slide 16]
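A PointNet-style T-Net sketch in PyTorch; the layer sizes and the identity-bias detail follow the original PointNet paper and are assumptions here, not this presentation's code.

```python
import torch
import torch.nn as nn

class TNet(nn.Module):
    """Sketch of a T-Net: predicts a k x k transform that is applied to every
    point, making the features more robust to rigid transformations."""
    def __init__(self, k=64):
        super().__init__()
        self.k = k
        self.mlp = nn.Sequential(
            nn.Conv1d(k, 128, 1), nn.ReLU(),
            nn.Conv1d(128, 1024, 1), nn.ReLU(),
        )
        self.fc = nn.Sequential(
            nn.Linear(1024, 256), nn.ReLU(),
            nn.Linear(256, k * k),
        )

    def forward(self, x):                   # x: (batch, k, n_points)
        g = self.mlp(x).max(dim=2).values   # global descriptor (batch, 1024)
        t = self.fc(g).view(-1, self.k, self.k)
        # Bias towards the identity so training starts near "no transform".
        t = t + torch.eye(self.k, device=x.device)
        return torch.bmm(t, x)              # apply the transform to all points
```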

Slide 19

PointNet based encoder 3/3
■ The third block
  • Computes global features (the latent vector) over all the data with a max pooling layer as a symmetric function.
[Architecture figure as on Slide 16]
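The symmetry of the max pooling step can be checked directly: this small PyTorch snippet shows that permuting the points leaves the pooled global feature unchanged (the feature sizes are illustrative).

```python
import torch

# f: per-point features of shape (batch, channels, n_points).
# Max pooling over the point axis is a symmetric function: permuting the
# points does not change the result, so the latent vector is order-invariant.
f = torch.randn(4, 1024, 2048)
latent = f.max(dim=2).values          # (4, 1024): global feature / latent vector

perm = torch.randperm(2048)
assert torch.allclose(latent, f[:, :, perm].max(dim=2).values)
```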

Slide 20

Decoder
■ MLP based architecture
  • Fully connected layers produce reconstructed data with the same dimensions as the input data.
[Architecture figure as on Slide 16]
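A minimal MLP decoder sketch in PyTorch, assuming a 1,024-dimensional latent vector and the 2,048 × 62 output shape from the architecture figure; the hidden sizes are assumptions.

```python
import torch
import torch.nn as nn

# Fully connected layers expand the latent vector back to
# 2,048 points x 62 values, matching the input dimensions.
latent_dim, n_points, n_feats = 1024, 2048, 62
decoder = nn.Sequential(
    nn.Linear(latent_dim, 2048), nn.ReLU(),
    nn.Linear(2048, n_points * n_feats),
)

z = torch.randn(4, latent_dim)
recon = decoder(z).view(4, n_points, n_feats)  # same shape as the input set
```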

Slide 21

Loss function
■ Loss function for reconstruction
  • FWNetAE aims at reconstructing the target data $\hat{x}$, given the latent vector produced by encoding the input data $x$.
  • Spatial matching loss:
    $L_{spatial} = \frac{1}{2N} \sum_{i=1}^{N} \left( (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right)$
  • Waveform reconstruction loss:
    $L_{waveform} = \frac{1}{2N} \sum_{i=1}^{N} \sum_{j=1}^{T} (w_{i,j} - \hat{w}_{i,j})^2$
  • The optimization process minimizes these functions.
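One way the two losses might be computed in PyTorch, assuming the column layout [x, y, w_1..w_60] used throughout the slides; this is a sketch of the equations above, not the authors' code.

```python
import torch

def fwnetae_loss(pred, target):
    """Combined loss sketch: pred/target have shape (batch, n_points, 62)
    with columns [x, y, w_1..w_60] (layout assumed)."""
    n = pred.shape[1]
    # Spatial matching loss over the (x, y) coordinates.
    l_spatial = ((pred[..., :2] - target[..., :2]) ** 2).sum(-1).sum(-1) / (2 * n)
    # Waveform reconstruction loss over the 60 waveform samples.
    l_wave = ((pred[..., 2:] - target[..., 2:]) ** 2).sum(-1).sum(-1) / (2 * n)
    return (l_spatial + l_wave).mean()

pred, target = torch.randn(4, 2048, 62), torch.randn(4, 2048, 62)
loss = fwnetae_loss(pred, target)   # minimized during training
```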

Slide 22

Experimental Results

Slide 23

Dataset
■ Dublin City Dataset
  • Published by NYU.
  • Point density: about 300 points/m².
  • Used area: one of the flight paths.
■ Training data
  • Sample sizes (train / val / test): 300,000 / 100,000 / 100,000.
  • Input dimension: 2,048 × 62 (x, y, and waveform samples).
    ‣ We selected data containing 60 returns to simplify the problem.
[Figure: used data observed from the flight path]

Slide 24

Reconstruction results
■ Spatial reconstruction
  • A matching shape was observed.
  • Mean error over all test data was 0.051 (normalized value).
[Figure: spatial reconstruction examples, including a failure case]

Slide 25

Reconstruction results
■ Waveform reconstruction
  • A matching shape was observed.
  • Mean error over all test data was 0.29.
[Figure: waveform reconstruction examples, including a failure case]

Slide 26

Latent space visualization
■ Comparison of methods
  • PCA (no learning)
  • Nonspatial AE (learnable function)
  • Proposed method (learnable, spatial)
⇒ Spatial learning is effective for feature extraction.
[Figure: latent space visualizations for PCA, a nonspatial AE, and the proposed method]

Slide 27

Conclusion and Future Study

Slide 28

Conclusion and future study
■ Conclusion
  • This paper presented FWNetAE, a novel representation learning method for spatially distributed full-waveform data observed by an ALS, using an AE-based architecture.
  • The results demonstrate that the model generalizes to unseen test data.
  • Moreover, FWNetAE encoded a meaningful latent vector, and the decoder reconstructed both the spatial geometry and the waveform values from the encoded latent vector.
  • However, the PointNet-based encoder cannot deal with varying input dimensions or extract features at multiple resolutions.
■ Future study
  • Modern hierarchical learning: PointNet++, Dynamic Graph CNN
  • Application to supervised learning

Slide 29

Supplemental

Slide 30

Making input data
■ k-nearest neighbors (k-NN)
  • k depends on GPU memory.
    ‣ In this study, k is 2,048.
    ‣ If k is large, we can consider a larger context.
      ⁃ Context is very important.
A sketch of this neighborhood construction follows below.
[Figure: random sampling vs. nearest-neighbor sampling]
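A minimal sketch of building one input set with a k-d tree; the array layout and the use of SciPy are assumptions for illustration, not the presentation's code.

```python
import numpy as np
from scipy.spatial import cKDTree

# Take a seed point and gather its k = 2,048 nearest neighbours in (x, y),
# then stack the coordinates with the 60 waveform samples into a (2048, 62)
# array matching the input dimension on Slide 23.
xy = np.random.rand(100_000, 2)           # (x, y) of all returns (dummy data)
waveforms = np.random.rand(100_000, 60)   # 60 waveform samples per return

tree = cKDTree(xy)
_, idx = tree.query(xy[0], k=2048)        # indices of the 2,048 nearest points
sample = np.hstack([xy[idx], waveforms[idx]])   # shape (2048, 62)
```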

Slide 31

Spatial deep learning methods
■ Euclidean data
  • Images
  • Audio signals
  • Natural language
■ Non-Euclidean data
  • Graphs
  • Point clouds
Full waveform LiDAR data is one of the non-Euclidean data types.

Slide 32

T-Net
■ Transform
  • If the same object is input at different orientations, it is difficult to deal with the rotation.
  ⇒ T-Net can provide rotation-invariant features.