Slide 1

Multimodal data fusion by low-rank tensor approximations
Applications in remote sensing
S3 Seminar, February 11th, 2022
Univ. Lille, CNRS, Centrale Lille, UMR 9189 CRIStAL

Slide 2

General introduction

Slide 3

General introduction: Data fusion

Slide 6

What is multimodality? Vocabulary: a modality acquires a signal, and the signal contains the phenomenon of interest. Several datasets → multimodality.
Examples: fMRI and EEG both acquire brain activity; hyperspectral and multispectral sensors both image an airborne scene. Pictures courtesy of (Datcu et al., 2005).

Slide 8

Multimodal data fusion
Definition (Lahat et al., 2015): data fusion is the analysis of several datasets such that [they] can interact with and inform each other.
→ (Kanatsoulis et al., 2018), (Biessmann et al., 2011), (Rivet et al., 2014), (Betoule et al., 2014), ...
Main issues:
• Sizes, resolutions, noise contaminations;
• Various fusion strategies, links between modalities;
• Uncertainties in shared information.
→ In this work:
• True fusion: symmetric roles for the modalities;
• A specific problem: hyperspectral super-resolution.

Slide 9

General introduction: The super-resolution problem

Slide 10

Principle of spectral imaging
[Figure: reflectance spectra of water, soil, and vegetation.]
Applications: spectral unmixing, image classification, target detection, ...

Slide 11

Tradeoff in resolutions
• Hyperspectral image (HSI): hundreds of spectral bands; low spatial resolution.
• Multispectral image (MSI): few spectral bands; high spatial resolution.

Slide 12

Hyperspectral super-resolution

Slide 14

Remote sensing problems
• Pansharpening: (Vivone et al., 2014); (Loncan et al., 2015). Recover a super-resolution image from a hyperspectral image and a panchromatic image;
• Unmixing: (Parente et al., 2010); (Bioucas-Dias et al., 2012); (Qian et al., 2016).

Slide 15

General introduction: Observation model

Slide 16

From matrix models ...
→ Component substitution (Laben et al., 2000);
→ Multi-resolution analysis (Aiazzi et al., 2006);
→ Unmixing (Yokoya et al., 2011);
→ Bayesian (Wei et al., 2015).
I-fold and J-fold diversities (Sidiropoulos et al., 2000);
Good model fitting and interpretability;
Non-unique → additional diversities needed (Donoho et al., 2004), (Comon, 1994);
Higher-dimensional observations.

Slide 17

... Towards tensor models
Tensor: array of p dimensions (p ≥ 3).
I-fold, J-fold and K-fold diversities;
Structure-preserving for high-dimensional data;
Some low-rank decompositions are unique.
→ Suited for the super-resolution problem.

Slide 18

Tensor algebra (1)
Mode product; tensor unfoldings.
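Since the slide only names these two operations, here is a minimal NumPy sketch of their usual definitions (the sizes are illustrative, not from the talk):

```python
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: arrange the mode-`mode` fibers of T as columns."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def fold(M, mode, shape):
    """Inverse of `unfold` for a tensor of the given shape."""
    full_shape = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(M.reshape(full_shape), 0, mode)

def mode_product(T, P, mode):
    """Mode-n product T •_n P: multiply every mode-n fiber of T by P."""
    new_shape = list(T.shape)
    new_shape[mode] = P.shape[0]
    return fold(P @ unfold(T, mode), mode, new_shape)

Y = np.random.rand(8, 9, 10)          # an I x J x K tensor
P1 = np.random.rand(4, 8)             # an operator acting on mode 1
print(mode_product(Y, P1, 0).shape)   # (4, 9, 10)
```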

Slide 19

Tensor algebra (2)
• Canonical polyadic decomposition: $\mathcal{Y} = \sum_{n=1}^{N} a_n \otimes b_n \otimes c_n$;
• Tucker decomposition: $\mathcal{Y} = \mathcal{G} \bullet_1 U_1 \bullet_2 U_2 \bullet_3 U_3$;
• Block-term decomposition (LL1): $\mathcal{Y} = \sum_{r=1}^{R} (A_r B_r^T) \otimes c_r$, with $A_r \in \mathbb{R}^{I \times L}$ and $B_r \in \mathbb{R}^{J \times L}$.
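A sketch of how each of the three decompositions generates a tensor, written with einsum; the ranks and sizes below are illustrative assumptions:

```python
import numpy as np
I, J, K, N, R, L = 8, 9, 10, 5, 3, 2

# Canonical polyadic: sum of N rank-one terms a_n ⊗ b_n ⊗ c_n.
A, B, C = (np.random.rand(d, N) for d in (I, J, K))
Y_cpd = np.einsum('in,jn,kn->ijk', A, B, C)

# Tucker: a small core G contracted with a factor matrix on each mode.
G = np.random.rand(3, 3, 3)
U1, U2, U3 = np.random.rand(I, 3), np.random.rand(J, 3), np.random.rand(K, 3)
Y_tuck = np.einsum('abc,ia,jb,kc->ijk', G, U1, U2, U3)

# LL1 block term: R terms (A_r B_r^T) ⊗ c_r, each A_r B_r^T of rank L.
Ar, Br = np.random.rand(R, I, L), np.random.rand(R, J, L)
c = np.random.rand(K, R)
Y_ll1 = np.einsum('ril,rjl,kr->ijk', Ar, Br, c)
```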

Slide 21

Tensor observation model
$\mathcal{Y}_H = \mathcal{Y} \bullet_1 P_1 \bullet_2 P_2 + \mathcal{E}_H, \qquad \mathcal{Y}_M = \mathcal{Y} \bullet_3 P_3 + \mathcal{E}_M.$
• P1, P2: Gaussian blurring + downsampling (Wald et al., 1997);
• P3: spectral response functions.
Basic optimization problem:
$\min_{\text{low-rank } \mathcal{Y}} \ \| \mathcal{Y}_H - \mathcal{Y} \bullet_1 P_1 \bullet_2 P_2 \|_F^2 + \lambda \| \mathcal{Y}_M - \mathcal{Y} \bullet_3 P_3 \|_F^2.$
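A sketch of this observation model in NumPy: YH degrades Y along the two spatial modes, YM along the spectral mode. The random operators below are placeholders standing in for the blur/downsampling and spectral responses:

```python
import numpy as np

def mode_product(T, P, mode):
    out = np.tensordot(P, T, axes=(1, mode))   # contract P with the chosen mode
    return np.moveaxis(out, 0, mode)           # move the new axis back in place

I, J, K, IH, JH, KM = 100, 100, 100, 50, 50, 10
Y = np.random.rand(I, J, K)                            # unknown SRI
P1, P2 = np.random.rand(IH, I), np.random.rand(JH, J)  # spatial degradations
P3 = np.random.rand(KM, K)                             # spectral responses

YH = mode_product(mode_product(Y, P1, 0), P2, 1)       # hyperspectral image
YM = mode_product(Y, P3, 2)                            # multispectral image
print(YH.shape, YM.shape)   # (50, 50, 100) (100, 100, 10)
```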

Slide 22

Ill-posedness
Aim: recover $IJK$ entries from $IJK_M + I_H J_H K$ observations.
Number of unknown parameters:
• Matrix: $(IJ + K - R)R$;
• LL1-BTD: $((I + J - L)L + (K - 1))R$;
• CPD: $(I + J + K - 2)N$;
• Tucker: $IR_1 + JR_2 + KR_3 + \prod_{i=1}^{3} R_i - \sum_{i=1}^{3} R_i^2$.
Setting: $I = J = K = 100$, $I_H = J_H = 50$, $K_M = 10$; $N = LR$, $R_1 = R_2 = LR$, $R_3 = R$.
[Plots: number of unknown parameters versus rank for each model.]
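A small sketch evaluating these parameter counts in the slide's setting; the choice L = 4, R = 10 is an illustrative assumption, and the product/sum reading of the Tucker count follows the usual degrees-of-freedom formula:

```python
I = J = K = 100

def n_matrix(R):     return (I*J + K - R) * R
def n_ll1(L, R):     return ((I + J - L)*L + (K - 1)) * R
def n_cpd(N):        return (I + J + K - 2) * N
def n_tucker(R1, R2, R3):
    return I*R1 + J*R2 + K*R3 + R1*R2*R3 - (R1**2 + R2**2 + R3**2)

L, R = 4, 10   # illustrative ranks; the slide sets N = L*R, R1 = R2 = L*R, R3 = R
print(n_matrix(R), n_ll1(L, R), n_cpd(L*R), n_tucker(L*R, L*R, R))
```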

Slide 23

Variations of the model

Slide 25

Summary

Slide 26

LL1-BTD for joint fusion and unmixing of the unknown SRI

Slide 27

LL1-BTD for joint fusion and unmixing of the unknown SRI: Problem statement

Slide 28

Inter-image variability
Principle: few satellites carry both sensors → different acquisition times → seasonal, atmospheric, and illumination variations → inter-image variability (Hilker et al., 2009; Emelyanova et al., 2013).

Slide 29

A more flexible model (Borsoi, Prévost et al., 2021)
[Figure: SRI + variability tensor = modified SRI; the HSI and MSI are acquired from them.]
$\mathcal{Y}_H = \mathcal{Y} \bullet_1 P_1 \bullet_2 P_2 + \mathcal{E}_H, \qquad \mathcal{Y}_M = \widetilde{\mathcal{Y}} \bullet_3 P_3 + \mathcal{E}_M = (\mathcal{Y} + \Psi) \bullet_3 P_3 + \mathcal{E}_M.$
Very ambiguous; Ψ: general (spatial and spectral) variability.
→ Low-rank decompositions with small ranks.

Slide 30

Linear mixing model and LL1
LL1-BTD as a linear mixing model (Shivappa, 2010):
$\mathcal{Y} = \sum_{r=1}^{R} (A_r B_r^T) \otimes c_r = \sum_{r=1}^{R} S_r \otimes c_r \ \Rightarrow\ Y^{(3)} = S C^T,$
• $c_r$: spectral signatures;
• $S_r$: abundance maps with low rank L: $S_r = A_r B_r^T$;
• Non-negative factors.
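A sketch checking this matricization identity numerically, with the mode-3 unfolding taken as the IJ × K matrix whose rows stack the pixels column-major (sizes illustrative):

```python
import numpy as np
I, J, K, R, L = 8, 9, 10, 3, 2

Ar, Br = np.random.rand(R, I, L), np.random.rand(R, J, L)
C = np.random.rand(K, R)                        # spectral signatures c_r

Y = np.einsum('ril,rjl,kr->ijk', Ar, Br, C)     # LL1 tensor
S = np.stack([(Ar[r] @ Br[r].T).ravel(order='F') for r in range(R)], axis=1)
Y3 = Y.reshape(I*J, K, order='F')               # mode-3 unfolding Y^(3)
print(np.allclose(Y3, S @ C.T))                 # True
```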

Slide 31

Spectral variability
Simple model: $\widetilde{C} = C + \psi$,
• ψ: spectral variability factor;
Explicit in the 3rd dimension; equivalent to a multiplicative model (Borsoi et al., 2018); less general variability, but less restrictive recovery conditions.
$\mathcal{Y} = \sum_{r=1}^{R} (A_r B_r^T) \otimes c_r, \quad \Psi = \sum_{r=1}^{R} (A_r B_r^T) \otimes \psi_r, \quad \widetilde{\mathcal{Y}} = \sum_{r=1}^{R} (A_r B_r^T) \otimes (c_r + \psi_r).$

Slide 32

LL1-BTD for joint fusion and unmixing of the unknown SRI: Proposed approach

Slide 33

State-of-the-art
• Tensor approach;
• Fusion + unmixing;
• Exact recovery without additional constraints.

Slide 34

A procedure for joint fusion and unmixing
Goal: uniquely recover $\mathcal{Y}$, $\Psi \bullet_3 P_3$ and the non-negative LL1 factors.
[Diagram: the unknown SRI, with its low-rank and variability models, undergoes spatial and spectral degradations that yield the observations; hyperspectral super-resolution and unmixing are performed jointly through the LL1-BTD.]

Slide 35

Exact recovery
Hypotheses: noiseless case, $\mathcal{Y}_M = \mathcal{Y} \bullet_3 P_3$, $\mathcal{Y}_H = \mathcal{Y} \bullet_1 P_1 \bullet_2 P_2$.
Generic theorem: the SRI $\mathcal{Y}$ and the degraded SRI $\mathcal{Y} \bullet_3 P_3$ are uniquely recovered by
$\mathcal{Y} = \sum_{r=1}^{R} (A_r B_r^T) \otimes c_r, \qquad \mathcal{Y} \bullet_3 P_3 = \sum_{r=1}^{R} (A_r B_r^T) \otimes (P_3 c_r),$
if $I_H J_H \geq LR$, $IJ \geq L^2 R$ and $\min(\lfloor I/L \rfloor, R) + \min(\lfloor J/L \rfloor, R) + \min(K_M, R) \geq 2R + 2$.
Only Ψ •3 P3; C, P3C, S: unique up to permutation and scaling;
→ Holds for the unmixing part of the problem.
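A small checker for these conditions; the floor-division reading of min(I/L, R) is an assumption inferred from the garbled slide text, and the dimensions below are illustrative:

```python
def ll1_recovery_ok(I, J, K_M, I_H, J_H, L, R):
    cond1 = I_H * J_H >= L * R          # hyperspectral spatial size
    cond2 = I * J >= L**2 * R           # multispectral spatial size
    cond3 = min(I // L, R) + min(J // L, R) + min(K_M, R) >= 2*R + 2
    return cond1 and cond2 and cond3

print(ll1_recovery_ok(I=80, J=100, K_M=10, I_H=20, J_H=25, L=4, R=3))  # True
```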

Slide 36

An approach for fusion only
Main idea:
$\min_{A,B,C,C_M} \ \mathcal{J} = \Big\| \mathcal{Y}_H - \sum_{r=1}^{R} \big(P_1 A_r (P_2 B_r)^T\big) \otimes c_r \Big\|_F^2 + \lambda \Big\| \mathcal{Y}_M - \sum_{r=1}^{R} (A_r B_r^T) \otimes c_{M,r} \Big\|_F^2.$
BTD-Var
Input: YH, YM, B, C, CM, P1, P2, P3; R, L;
Output: Y, Ψ •3 P3;
While the stopping criterion is not met, do
1. Normalize the columns of C, CM to unit norm;
2. A, B, C, CM ← minimize J (alternating procedure);
3. S ← [..., vec{A_r B_r^T}, ...];
End
4. Y(3) ← S C^T, Ψ •3 P3 ← YM − Y •3 P3.
→ Fusion; unmixing.
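One tractable piece of step 2, as a sketch: with the abundance maps fixed, C and CM have closed-form least-squares updates against the mode-3 unfoldings. The function and sizes below are my own illustration, not the talk's implementation:

```python
import numpy as np

def update_spectra(YH3, YM3, SH, S):
    """YH3: (IH*JH, K); YM3: (I*J, KM); SH, S: the matching mixing matrices,
    with columns vec(P1 A_r (P2 B_r)^T) and vec(A_r B_r^T)."""
    C = np.linalg.lstsq(SH, YH3, rcond=None)[0].T    # K x R
    CM = np.linalg.lstsq(S, YM3, rcond=None)[0].T    # KM x R
    C /= np.linalg.norm(C, axis=0)                   # step 1: unit-norm columns
    CM /= np.linalg.norm(CM, axis=0)
    return C, CM

R, K, KM = 2, 12, 4
SH, S = np.random.rand(9, R), np.random.rand(36, R)
YH3, YM3 = SH @ np.random.rand(R, K), S @ np.random.rand(R, KM)
C, CM = update_spectra(YH3, YM3, SH, S)
```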

Slide 37

A constrained algorithm
Constrained optimization problem:
$\min_{A,B,\{S_r\}_{r=1}^{R},C,C_M} \ \mathcal{J} \quad \text{s.t.} \quad S_r = A_r B_r^T \geq 0 \ (r = 1,\dots,R), \ C \geq 0, \ C_M \geq 0.$
CNN-BTD-Var
Input: YH, YM, B, C, CM, P1, P2, P3; R, L;
Output: Y, Ψ •3 P3, {S_r}, C, CM;
While the stopping criterion is not met, do
1. Normalize the columns of C, CM to unit norm;
2. A, B, C, CM ← minimize J (ADMM procedure + non-negativity);
3. S ← ADMM procedure: low rank + non-negativity;
End
4. Y(3) ← S C^T, Ψ •3 P3 ← YM − Y •3 P3.
→ Fusion + unmixing of Y.
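To illustrate the ADMM + non-negativity steps, here is a generic ADMM for non-negative least squares (min ||Ax − b||² s.t. x ≥ 0); a sketch of the splitting idea, not the talk's exact solver:

```python
import numpy as np

def admm_nnls(A, b, rho=1.0, n_iter=200):
    n = A.shape[1]
    x = z = u = np.zeros(n)
    M = np.linalg.inv(A.T @ A + rho*np.eye(n))   # factor cached once
    Atb = A.T @ b
    for _ in range(n_iter):
        x = M @ (Atb + rho*(z - u))              # quadratic sub-problem
        z = np.maximum(x + u, 0)                 # projection onto x >= 0
        u = u + x - z                            # dual ascent step
    return z

A, x0 = np.random.rand(30, 5), np.random.rand(5)
print(np.round(admm_nnls(A, A @ x0), 3), np.round(x0, 3))  # should match
```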

Slide 38

LL1-BTD for joint fusion and unmixing of the unknown SRI: Simulations

Slide 39

Fusion setup
• Real SRI $\mathcal{Y}$ (AVIRIS) and MSI $\mathcal{Y}_M$ (Sentinel-2A); $\mathcal{Y} \rightarrow \mathcal{Y}_H$ (decimation factor d = 4, filter with unit variance);
• $\mathcal{E}_H$ and $\mathcal{E}_M$: white Gaussian noise, 30 dB SNR; → normalized spectral bands;
• Comparison to matrix and tensor methods, plus FuVar (matrix + variability) and CB-STAR (tensor + localized changes) (Borsoi, Prévost et al., 2021);
• Dataset: Lockwood, acquired on 2018-08-20 (SRI) and 2018-10-19 (MSI); $\mathcal{Y} \in \mathbb{R}^{80 \times 100 \times 173}$.
[Images: SRI and MSI.]
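The next slide reports R-SNR, CC, SAM and ERGAS. A sketch of these standard metrics under their usual definitions; details such as band-wise averaging may differ from the talk's implementation:

```python
import numpy as np

def r_snr(Y, Yhat):
    """Reconstruction SNR in dB."""
    return 10*np.log10(np.sum(Y**2) / np.sum((Y - Yhat)**2))

def cc(Y, Yhat):
    """Mean over bands of the spatial correlation coefficient."""
    y, yh = Y.reshape(-1, Y.shape[-1]), Yhat.reshape(-1, Y.shape[-1])
    yc, yhc = y - y.mean(0), yh - yh.mean(0)
    return np.mean(np.sum(yc*yhc, 0) /
                   (np.linalg.norm(yc, axis=0)*np.linalg.norm(yhc, axis=0)))

def sam(Y, Yhat):
    """Mean spectral angle in degrees, spectra along the last axis."""
    y, yh = Y.reshape(-1, Y.shape[-1]), Yhat.reshape(-1, Y.shape[-1])
    cos = np.sum(y*yh, 1) / (np.linalg.norm(y, axis=1)*np.linalg.norm(yh, axis=1))
    return np.degrees(np.arccos(np.clip(cos, -1, 1))).mean()

def ergas(Y, Yhat, d):
    """Relative global error; d is the spatial downsampling factor."""
    rmse2 = np.mean((Y - Yhat)**2, axis=(0, 1))
    mean2 = np.mean(Y, axis=(0, 1))**2
    return 100/d * np.sqrt(np.mean(rmse2 / mean2))

Y = np.random.rand(8, 8, 5); Yh = Y + 0.01*np.random.randn(8, 8, 5)
print(r_snr(Y, Yh), cc(Y, Yh), sam(Y, Yh), ergas(Y, Yh, d=4))
```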

Slide 40

Fusion performance

Algorithm     | R-SNR   | CC       | SAM     | ERGAS   | Time
------------- | ------- | -------- | ------- | ------- | -------
CNMF          | 18.7829 | 0.89063  | 2.9768  | 6.7014  | 4.353
HySure        | 14.125  | 0.8633   | 4.4044  | 11.6    | 6.9823
FuVar         | 12.2703 | 0.7297   | 3.7313  | 6.7951  | 724.91
STEREO        | 6.552   | 0.80196  | 27.3623 | 25.1749 | 1.8835
SCOTT         | 2.2276  | 0.79276  | 28.5771 | 45.9608 | 0.2228
BTD-Var       | 20.1273 | 0.918432 | 2.92921 | 6.35566 | 5.46272
CNN-BTD-Var   | 19.4882 | 0.906525 | 3.0299  | 6.29101 | 4.11573
CB-STAR       | 19.0751 | 0.89445  | 3.3707  | 7.2926  | 68.0282

[Images: reconstructed scenes (Reference, BTD-Var, CNN-BTD-Var, SCOTT, CNMF) and (Reference, BTD-Var, CNN-BTD-Var, CT-STAR, CB-STAR).]

Slide 41

Unmixing setup
• Unmixing of the SRI $\mathcal{Y}$;
• Matrix approach: CNMF (Yokoya et al., 2012);
• Two-step procedures: CB-STAR + MU-Acc (Gillis et al., 2012), BMDR-ADMM (Nus et al., 2018);
• Dataset: Lake Tahoe, acquired on 2014-10-04 (SRI) and 2017-10-24 (MSI); $\mathcal{Y} \in \mathbb{R}^{80 \times 100 \times 173}$.
[Images: SRI and MSI.]

Slide 42

Unmixing performance
[Figure: estimated abundance maps for water, soil, and vegetation; reference vs. CNN-BTD-Var, CNMF, MU-Acc, BMDR-ADMM.]

Slide 43

Retrieving the variability factor
[Figure: reference and estimated spectral signatures and variability factors for water, soil, and vegetation.]
• Water → 5th band → blue wavelengths;
• Soil → 10th band → orange to infrared wavelengths;
• Vegetation → 7th band → green wavelengths.

Slide 45

Partial conclusion
• Flexible tensor model: inter-image variability;
• A joint solution for fusion and unmixing of the super-resolution image;
• Noiseless recovery guarantees: link with statistical identifiability.
• Are the algorithms efficient?
→ Performance analysis for BTD-Var.
→ A new constrained bound accounting for uncertainties.

Slide 46

Performance bounds for coupled tensor models

Slide 47

Deriving bounds for coupled tensor models
1. Choose the parameters: low-rank factors? Reconstructed tensor?
2. Identify the constraints;
3. Apply the formula according to the scenario: uncoupled, partially coupled, fully coupled;
4. Evaluate the performance of the estimator.
Contributions:
→ Standard CCRB for the coupled CP model: performance of STEREO and Blind-STEREO;
→ Randomly constrained CRB: application to coupled LL1 models with random variability.

Slide 48

General coupled model
$\mathcal{Y}_1 \sim f_{\mathcal{Y}_1;\omega}$ and $\mathcal{Y}_2 \sim f_{\mathcal{Y}_2;\omega}$, with constraints $g(\omega) = 0$.
Assumption: the model is statistically identifiable.
Constrained Cramér-Rao bound (CCRB) (Stoica et al., 1998):
$\mathrm{CCRB}(\omega) = U (U^T F U)^{-1} U^T,$ with $U$ a basis of $\ker(G)$.
Non-informative when $g(\omega)$ depends on a random parameter → random parameter $\theta_r$.
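A sketch of this formula with a toy Fisher matrix F and constraint Jacobian G; U comes from scipy's null_space:

```python
import numpy as np
from scipy.linalg import null_space

def ccrb(F, G):
    """CCRB(ω) = U (U^T F U)^{-1} U^T, U an orthonormal basis of ker(G)."""
    U = null_space(G)
    return U @ np.linalg.inv(U.T @ F @ U) @ U.T

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
F = A @ A.T + 6*np.eye(6)            # a toy positive-definite Fisher matrix
G = rng.standard_normal((2, 6))      # toy Jacobian of the constraints g
print(ccrb(F, G).shape)              # (6, 6)
```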

Slide 49

Randomly constrained CRB (RCCRB)
First step:
• $\hat{\omega}(y)$: locally unbiased conditionally on $\theta_r$;
Conditional CCRB:
$\mathrm{CCRB}_{\theta_r}(\omega) = U_{\theta_r}(\omega) \big( U_{\theta_r}^T(\omega)\, \mathrm{CRB}_{\theta_r}^{-1}(\omega)\, U_{\theta_r}(\omega) \big)^{-1} U_{\theta_r}^T(\omega).$
RCCRB:
$\mathrm{RCCRB}(\omega) = \mathbb{E}_{\theta_r;\omega}\big[ \mathrm{CCRB}_{\theta_r}(\omega) \big], \qquad \mathbb{E}_{y|\omega}\big[ (\hat{\omega} - \omega)(\hat{\omega} - \omega)^T \big] \succeq \mathrm{RCCRB}(\omega).$
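The expectation over θr suggests a Monte-Carlo evaluation; a self-contained sketch with toy, assumed dependences of the Fisher matrix and constraints on θr:

```python
import numpy as np
from scipy.linalg import null_space

def ccrb(F, G):
    U = null_space(G)
    return U @ np.linalg.inv(U.T @ F @ U) @ U.T

def rccrb(F_of, G_of, theta_samples):
    """Average the conditional CCRBs over draws of the random parameter."""
    return np.mean([ccrb(F_of(t), G_of(t)) for t in theta_samples], axis=0)

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))
F0, G0 = A @ A.T + 6*np.eye(6), rng.standard_normal((2, 6))
F_of = lambda t: F0 + t*np.eye(6)    # toy θ_r-dependent Fisher matrix
G_of = lambda t: G0                  # toy constraint Jacobian
print(rccrb(F_of, G_of, rng.standard_normal(100)**2).shape)   # (6, 6)
```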

Slide 51

LL1 estimation
LL1-ALS, inspired from (De Lathauwer):
$\min_{A_2,B_2,C_1} \ \Big\| \mathcal{Y}_1 - \sum_{r=1}^{R} \big(P_1 (A_2)_r (P_2 (B_2)_r)^T\big) \otimes (c_1)_r \Big\|_F^2 + \lambda \Big\| \mathcal{Y}_2 - \sum_{r=1}^{R} \big((A_2)_r (B_2)_r^T\big) \otimes \big(P_3 (c_1)_r\big) \Big\|_F^2.$
→ Fully-coupled model; ignores the variability.
BTD-Var, designed in Chapter 3:
$\min_{A_2,B_2,C_1,C_2} \ \Big\| \mathcal{Y}_1 - \sum_{r=1}^{R} \big(P_1 (A_2)_r (P_2 (B_2)_r)^T\big) \otimes (c_1)_r \Big\|_F^2 + \lambda \Big\| \mathcal{Y}_2 - \sum_{r=1}^{R} \big((A_2)_r (B_2)_r^T\big) \otimes (c_2)_r \Big\|_F^2.$
→ Blind in the third dimension; random variability hidden in $C_2$.

Slide 52

Performance of BTD-Var
[Plots: MSE trace versus SNR for image reconstruction, compared with the bounds.]
• Gain w.r.t. the model without variability;
• Robust to a lack of knowledge of the phenomenon;
• Asymptotically efficient?

Slide 53

Conclusion

Slide 54

Conclusion
• Better performance than the state of the art;
• Does not reach the RCCRB;
• Still room for a clairvoyant algorithm!
https://cprevost4.github.io
Thank you for your attention.