Slide 1

Robust low-rank covariance matrix estimation with missing values and application to classification problems

Alexandre Hippert-Ferrer, Laboratoire des Signaux et Systèmes, CentraleSupélec, Université Paris-Saclay
Séminaire S3, September 2021

Joint work with: Mohammed Nabil El Korso (LEME, Université Paris Nanterre), Arnaud Breloy (LEME, Université Paris Nanterre), Guillaume Ginolhac (LISTIC, Université Savoie Mont Blanc)

Slide 2

Introduction

[Figure: (a) Missing completely at random; (b) Missing not at random]

Slide 3

Introduction

Parameter of interest: $\theta = \{\mu, \Sigma, \ldots\}$. [Figure: (a) Missing completely at random; (b) Missing not at random, with missing entries marked "?"]

Slide 4

Table of Contents

1. Problem formulation and data model
2. EM algorithms for incomplete data under the SG distribution
   • A brief introduction to the EM algorithm
   • The unstructured case
   • The structured case
   • Numerical simulations
3. Application to (un)supervised learning
   • Classification of crop fields
   • Image clustering
   • Classification of EEG signals

Slide 5

Problem formulation

Let us consider:
• Complete data $\{y_i\}_{i=1}^n \in \mathbb{R}^p$
• $y_i = \{y_i^o, y_i^m\}$ (observed and missing parts)

How to estimate $\theta$ from incomplete samples?
1. Use observed data only;
2. Impute the data, then estimate $\theta$;
3. Use the Expectation-Maximization (EM) algorithm → a handy iterative procedure to find $\hat{\theta}_{ML}$.

Slide 6

Problem formulation

Let us consider:
• Complete data $\{y_i\}_{i=1}^n \in \mathbb{R}^p$
• $y_i = \{y_i^o, y_i^m\}$
• Unknown parameter of interest $\theta \in \Omega$
• A probabilistic model of the data $p(y|\theta)$
• Maximum likelihood (ML): $\hat{\theta}_{ML} = \arg\max_{\theta \in \Omega} p(y|\theta)$

How to estimate $\theta$ from incomplete samples?
1. Use observed data only;
2. Impute the data, then estimate $\theta$;
3. Use the Expectation-Maximization (EM) algorithm → a handy iterative procedure to find $\hat{\theta}_{ML}$.


Slide 10

Motivations

• Work on ML estimation of $\Sigma$ with incomplete data exists, e.g.:
  • Gaussian: $n > p$ (C. Liu 1999); $p > n$ (Lounici et al. 2014; Städler et al. 2014); $\mathrm{rank}(\Sigma) = r < p$ (Sportisse et al. 2020; Aubry et al. 2021)
  • non-Gaussian: $t$-distribution (C. Liu and Rubin 1995; J. Liu et al. 2019); GEM (Frahm et al. 2010)

Slide 11

Motivations

• Work on ML estimation of $\Sigma$ with incomplete data exists, e.g.:
  • Gaussian: $n > p$ (C. Liu 1999); $p > n$ (Lounici et al. 2014; Städler et al. 2014); $\mathrm{rank}(\Sigma) = r < p$ (Sportisse et al. 2020; Aubry et al. 2021)
  • non-Gaussian: $t$-distribution (C. Liu and Rubin 1995; J. Liu et al. 2019); GEM (Frahm et al. 2010)
• Missing data patterns: monotone, general, random. [Figure: illustration of the three patterns]
• Most of the work assumes Gaussian data and a full-rank $\Sigma$.

Slide 12

Aim of this work

1. Build generic algorithms that take advantage of both robust estimation and low-rank structure;
2. Handle any pattern (≠ mechanism) of missing values: monotone, general, random;
3. Apply these procedures to covariance-based imputation/classification/clustering problems.

Slide 13

The scaled-Gaussian (SG) distribution

• Model: $y_i \mid \tau_i \sim \mathcal{N}(0, \tau_i \Sigma)$, with $\Sigma \in S_{++}^p$ and texture parameter $\tau_i > 0$.

Slide 14

The scaled-Gaussian (SG) distribution

• Model: $y_i \mid \tau_i \sim \mathcal{N}(0, \tau_i \Sigma)$, with $\Sigma \in S_{++}^p$ and texture parameter $\tau_i > 0$.
• Complete likelihood function: $L_c(\{y_i\} \mid \Sigma, \{\tau_i\}) \propto \prod_{i=1}^n |\tau_i \Sigma|^{-1} \exp\left(-y_i^\top (\tau_i \Sigma)^{-1} y_i\right)$

Slide 15

The scaled-Gaussian (SG) distribution

• Model: $y_i \mid \tau_i \sim \mathcal{N}(0, \tau_i \Sigma)$, with $\Sigma \in S_{++}^p$ and texture parameter $\tau_i > 0$.
• Complete likelihood function: $L_c(\{y_i\} \mid \Sigma, \{\tau_i\}) \propto \prod_{i=1}^n |\tau_i \Sigma|^{-1} \exp\left(-y_i^\top (\tau_i \Sigma)^{-1} y_i\right)$
• Constraints on $\Sigma$ (low-rank structure): $\Sigma = \sigma^2 I_p + H$, with $H \succeq 0$, $\mathrm{rank}(H) = r$ and $\sigma > 0$.

Slide 16

Data incompleteness model

• Data transformation: $\tilde{y}_i = P_i y_i = [y_i^o; y_i^m]$, $i \in [1, n]$, where $\{P_i\}_{i=1}^n \subset \mathbb{R}^{p \times p}$ is a set of $n$ permutation matrices.
• Example with $n = p = 3$: each $\tilde{y}_i = P_i y_i$ reorders the entries of $y_i$ so that observed values come first; the missing entry (here $y_{32}$) is sent to the last position of its permuted vector. [Figure: the vectors $y_1, y_2, y_3$ and their permuted versions $\tilde{y}_1, \tilde{y}_2, \tilde{y}_3$]
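To make the transformation concrete, here is a minimal sketch (assumed helper names, not code from the talk) that builds $P_i$ from a missingness mask:

```python
# Build the permutation matrix P_i stacking observed entries first, missing last.
import numpy as np

def permutation_matrix(missing_mask):
    """missing_mask: boolean array of length p, True where the entry is missing."""
    p = len(missing_mask)
    order = np.concatenate([np.flatnonzero(~missing_mask),   # observed indices first
                            np.flatnonzero(missing_mask)])   # missing indices last
    P = np.zeros((p, p))
    P[np.arange(p), order] = 1.0    # row k picks entry order[k] of y_i
    return P

# Example with p = 3 and the second entry missing:
mask = np.array([False, True, False])
P = permutation_matrix(mask)
y = np.array([1.0, np.nan, 3.0])
print(P @ np.nan_to_num(y))         # observed part (1, 3) first, missing slot last
```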

Slide 17

Data incompleteness model

• Data transformation: $\tilde{y}_i = P_i y_i = [y_i^o; y_i^m]$, $i \in [1, n]$, where $\{P_i\}_{i=1}^n \subset \mathbb{R}^{p \times p}$ is a set of $n$ permutation matrices.
• Example with $n = p = 3$: each $\tilde{y}_i = P_i y_i$ reorders the entries of $y_i$ so that observed values come first; the missing entry (here $y_{32}$) is sent to the last position of its permuted vector. [Figure: the vectors $y_1, y_2, y_3$ and their permuted versions $\tilde{y}_1, \tilde{y}_2, \tilde{y}_3$]
• Covariance matrix: $\tilde{\Sigma}_i = P_i \Sigma P_i^\top = \begin{bmatrix} \Sigma_{i,oo} & \Sigma_{i,om} \\ \Sigma_{i,mo} & \Sigma_{i,mm} \end{bmatrix}$, where $\Sigma_{i,oo}$, $\Sigma_{i,mo}$ and $\Sigma_{i,mm}$ are the covariance blocks of $y_i^o$, of $(y_i^m, y_i^o)$ and of $y_i^m$, respectively.
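This partition is what makes conditioning on the observed block tractable. A minimal sketch of the conditional moments of the missing part given the observed part (standard Gaussian conditioning formulas; the SG texture scaling enters later in the E-step):

```python
# Gaussian conditioning on the permuted covariance (observed block first):
#   E[y_m | y_o]   = Sigma_mo Sigma_oo^{-1} y_o
#   Cov[y_m | y_o] = Sigma_mm - Sigma_mo Sigma_oo^{-1} Sigma_om
import numpy as np

def conditional_moments(Sigma_tilde, y_o):
    n_obs = len(y_o)
    S_oo = Sigma_tilde[:n_obs, :n_obs]
    S_mo = Sigma_tilde[n_obs:, :n_obs]
    S_mm = Sigma_tilde[n_obs:, n_obs:]
    W = S_mo @ np.linalg.inv(S_oo)      # regression coefficients of y_m on y_o
    return W @ y_o, S_mm - W @ S_mo.T   # conditional mean and covariance
```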

Slide 18

Table of Contents

1. Problem formulation and data model
2. EM algorithms for incomplete data under the SG distribution
   • A brief introduction to the EM algorithm
   • The unstructured case
   • The structured case
   • Numerical simulations
3. Application to (un)supervised learning
   • Classification of crop fields
   • Image clustering
   • Classification of EEG signals

Slide 19

Introduction to the EM algorithm

• Iterative scheme for ML estimation in incomplete-data problems;
• Provides a formal approach to the intuitive ad hoc idea of filling in missing values;
• The EM alternates between making guesses about the complete data $y$ (E-step) and finding the $\theta$ that maximizes $L_c(\theta|y)$ (M-step);
• Under some general conditions on the complete data, $L(\hat{\theta}_{EM})$ converges to $L(\hat{\theta}_{ML})$.

Slide 20

The EM algorithm

• Initialization: at step $t = 0$, make an initial estimate $\theta^{(0)}$ (using prior knowledge or a sub-optimal existing algorithm).
• E-step (guess the complete data from the current $\theta^{(t)}$): $Q(\theta|\theta^{(t)}) = \int L_c(\theta|y)\, f(y^m|y^o, \theta^{(t)})\, dy^m = \mathbb{E}_{y^m|y^o, \theta^{(t)}}\left[L_c(\theta|y)\right]$
• M-step (update $\theta$ from the "guessed" complete data): choose $\theta^{(t+1)}$ such that $Q(\theta^{(t+1)}|\theta^{(t)}) \geq Q(\theta|\theta^{(t)})$.

Then repeat the E- and M-steps until a stopping criterion, such as the distance $\|\theta^{(t+1)} - \theta^{(t)}\|$, falls below a pre-defined threshold.
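In code, the scheme reduces to a short loop. A generic sketch, under the simplifying assumption that $\theta$ is a flat array (`e_step` and `m_step` are hypothetical placeholders for a concrete model, not functions from the talk):

```python
import numpy as np

def em(y_obs, theta0, e_step, m_step, tol=1e-8, max_iter=500):
    """Generic EM loop: alternate E- and M-steps until theta stabilizes."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        stats = e_step(y_obs, theta)      # expected complete-data statistics
        theta_new = m_step(stats)         # maximize Q(. | theta)
        if np.linalg.norm(theta_new - theta) < tol:   # stopping criterion
            break
        theta = theta_new
    return theta_new
```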

Slide 21

Ingredients for our EM

1. Transformed data: $\tilde{y}_i = P_i y_i = [y_i^o; y_i^m]$
2. Transformed CM: $\tilde{\Sigma}_i = P_i \Sigma P_i^\top$
3. The (new) complete log-likelihood $L_c$: $L_c(\theta|Y) \propto -n \log|\Sigma| - p \sum_{i=1}^n \log \tau_i - \sum_{i=1}^n \tilde{y}_i^\top (\tau_i \tilde{\Sigma}_i)^{-1} \tilde{y}_i$

Slide 22

Ingredients for our EM

1. Transformed data: $\tilde{y}_i = P_i y_i = [y_i^o; y_i^m]$
2. Transformed CM: $\tilde{\Sigma}_i = P_i \Sigma P_i^\top$
3. The (new) complete log-likelihood $L_c$: $L_c(\theta|Y) \propto -n \log|\Sigma| - p \sum_{i=1}^n \log \tau_i - \sum_{i=1}^n \tilde{y}_i^\top (\tau_i \tilde{\Sigma}_i)^{-1} \tilde{y}_i$

EM 1: no structure on $\Sigma$ (full rank).
EM 2: low-rank structure on $\Sigma$ ($r < p$).

Slide 23

EM 1 – No structure on Σ

Parameters: $\theta = \{\Sigma, \tau_1, \ldots, \tau_n\}$.

E-step: compute the expectation of $L_c$ (after a few manipulations):
$Q_i(\theta|\theta^{(t)}) = \mathbb{E}_{y_i^m|y_i^o, \theta^{(t)}}\left[L_c(\theta|y_i^o, y_i^m)\right] = \tau_i^{-1}\, \mathrm{tr}\left(B_i^{(t)} \tilde{\Sigma}_i^{-1}\right)$
with $B_i^{(t)} = \mathrm{diag}\left(y_i^o y_i^{o\top},\; \mathbb{E}_{y_i^m|y_i^o, \theta^{(t)}}[y_i^m y_i^{m\top}]\right)$ (block-diagonal).

Slide 24

EM 1 – No structure on Σ

Parameters: $\theta = \{\Sigma, \tau_1, \ldots, \tau_n\}$.

E-step: compute the expectation of $L_c$ (after a few manipulations):
$Q_i(\theta|\theta^{(t)}) = \mathbb{E}_{y_i^m|y_i^o, \theta^{(t)}}\left[L_c(\theta|y_i^o, y_i^m)\right] = \tau_i^{-1}\, \mathrm{tr}\left(B_i^{(t)} \tilde{\Sigma}_i^{-1}\right)$
with $B_i^{(t)} = \mathrm{diag}\left(y_i^o y_i^{o\top},\; \mathbb{E}_{y_i^m|y_i^o, \theta^{(t)}}[y_i^m y_i^{m\top}]\right)$ (block-diagonal).

M-step: obtain $\theta^{(t+1)}$ as the solution of the maximization problem:
maximize $Q_i(\theta|\theta^{(t)})$ subject to $\Sigma \succ 0$ and $\tau_i > 0$ for all $i$.
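A minimal sketch of the E-step statistic $B_i^{(t)}$ for one sample, combining the conditional moments above; scaling the conditional covariance by the current texture $\tau_i^{(t)}$ is an assumption of this sketch, following the model $y_i \mid \tau_i \sim \mathcal{N}(0, \tau_i \Sigma)$:

```python
import numpy as np
from scipy.linalg import block_diag

def e_step_B(Sigma_tilde, y_o, tau):
    """B_i = diag(y_o y_o^T, E[y_m y_m^T | y_o]) in the permuted coordinates."""
    n_obs = len(y_o)
    S_oo = Sigma_tilde[:n_obs, :n_obs]
    S_mo = Sigma_tilde[n_obs:, :n_obs]
    S_mm = Sigma_tilde[n_obs:, n_obs:]
    W = S_mo @ np.linalg.inv(S_oo)
    y_m_hat = W @ y_o                        # conditional mean of the missing part
    cond_cov = tau * (S_mm - W @ S_mo.T)     # conditional covariance, texture-scaled
    second_moment = cond_cov + np.outer(y_m_hat, y_m_hat)
    return block_diag(np.outer(y_o, y_o), second_moment)
```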

Slide 25

M-step: update parameters with closed-form expressions

Proposition: closed-form expressions for $\{\tau_i\}_{i=1}^n$ and $\Sigma$ exist:
$\tau_i = \frac{\mathrm{tr}(B_i^{(t)} \tilde{\Sigma}_i^{-1})}{p}$ ;  $\Sigma = \frac{p}{n} \sum_{i=1}^n \frac{C_i^{(t)}}{\mathrm{tr}(C_i^{(t)} \Sigma^{-1})} \triangleq \mathcal{H}(\Sigma)$   (1)
with $C_i^{(t)} = P_i^\top B_i^{(t)} P_i$, and where $\Sigma_{m+1} = \mathcal{H}(\Sigma_m)$ defines the fixed-point algorithm.

Slide 26

M-step: update parameters with closed-form expressions

Proposition: closed-form expressions for $\{\tau_i\}_{i=1}^n$ and $\Sigma$ exist:
$\tau_i = \frac{\mathrm{tr}(B_i^{(t)} \tilde{\Sigma}_i^{-1})}{p}$ ;  $\Sigma = \frac{p}{n} \sum_{i=1}^n \frac{C_i^{(t)}}{\mathrm{tr}(C_i^{(t)} \Sigma^{-1})} \triangleq \mathcal{H}(\Sigma)$   (1)
with $C_i^{(t)} = P_i^\top B_i^{(t)} P_i$, and where $\Sigma_{m+1} = \mathcal{H}(\Sigma_m)$ defines the fixed-point algorithm.

To obtain (1), one needs to:
1. differentiate $L_c$ w.r.t. $\tau_i$ and solve $\partial L_c / \partial \tau_i = 0$;
2. differentiate $L_c$ w.r.t. $\Sigma$ and solve $\partial L_c / \partial \Sigma = 0$.
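A minimal sketch of the fixed-point map $\mathcal{H}$ of Eq. (1); the trace normalization is an assumption of the sketch, added to fix the usual scale ambiguity of Tyler-type estimators:

```python
import numpy as np

def fixed_point_sigma(C_list, Sigma0, max_iter=100, tol=1e-10):
    """Iterate Sigma <- (p/n) * sum_i C_i / tr(C_i Sigma^{-1}) until convergence."""
    p, n = Sigma0.shape[0], len(C_list)
    Sigma = Sigma0.copy()
    for _ in range(max_iter):
        inv_S = np.linalg.inv(Sigma)
        Sigma_new = (p / n) * sum(C / np.trace(C @ inv_S) for C in C_list)
        Sigma_new *= p / np.trace(Sigma_new)     # normalize so that tr(Sigma) = p
        if np.linalg.norm(Sigma_new - Sigma, 'fro') < tol:
            break
        Sigma = Sigma_new
    return Sigma_new
```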

Slide 27

EM-Tyl: estimation of $\theta$ under the SG distribution with missing values.

Require: $\{y_i\}_{i=1}^n \sim \mathcal{N}(0, \tau_i \Sigma)$, $\{P_i\}_{i=1}^n$
Ensure: $\Sigma$, $\{\tau_i\}_{i=1}^n$
1: Initialization: $\Sigma^{(0)} = \Sigma_{\text{Tyl-obs}}$; $\tau^{(0)} = \mathbf{1}_n$
2: repeat  (EM loop, $t$ varies)
3:   Compute $B_i^{(t)} = \mathrm{diag}(y_i^o y_i^{o\top}, \mathbb{E}_{y_i^m|y_i^o,\theta^{(t)}}[y_i^m y_i^{m\top}])$
4:   Compute $C_i^{(t)} = P_i^\top B_i^{(t)} P_i$
5:   repeat  (fixed point, $m$ varies; optional loop)
6:     $\Sigma_{m+1}^{(t)} = \mathcal{H}(\Sigma_m^{(t)})$
7:   until $\|\Sigma_{m+1}^{(t)} - \Sigma_m^{(t)}\|_F^2$ converges
8:   Compute $\tau_i^{(t)}$, $i = 1, \ldots, n$
9:   $t \leftarrow t + 1$
10: until $\|\theta^{(t+1)} - \theta^{(t)}\|_F^2$ converges

Slide 28

EM 2 – Low-rank structure on Σ

Parameters: $\theta = \{H, \sigma^2, \tau_1, \ldots, \tau_n\}$.

E-step: compute the expectation of $L_c$ (same as EM 1):
$Q_i(\theta|\theta^{(t)}) = \mathbb{E}_{y_i^m|y_i^o, \theta^{(t)}}\left[L_c(\theta|y_i^o, y_i^m)\right] = \tau_i^{-1}\, \mathrm{tr}\left(B_i^{(t)} \tilde{\Sigma}_i^{-1}\right)$
with $B_i^{(t)} = \mathrm{diag}\left(y_i^o y_i^{o\top},\; \mathbb{E}_{y_i^m|y_i^o, \theta^{(t)}}[y_i^m y_i^{m\top}]\right)$.

Slide 29

EM 2 – Low-rank structure on Σ

Parameters: $\theta = \{H, \sigma^2, \tau_1, \ldots, \tau_n\}$.

E-step: compute the expectation of $L_c$ (same as EM 1):
$Q_i(\theta|\theta^{(t)}) = \mathbb{E}_{y_i^m|y_i^o, \theta^{(t)}}\left[L_c(\theta|y_i^o, y_i^m)\right] = \tau_i^{-1}\, \mathrm{tr}\left(B_i^{(t)} \tilde{\Sigma}_i^{-1}\right)$
with $B_i^{(t)} = \mathrm{diag}\left(y_i^o y_i^{o\top},\; \mathbb{E}_{y_i^m|y_i^o, \theta^{(t)}}[y_i^m y_i^{m\top}]\right)$.

M-step: obtain $\theta^{(t+1)}$ as the solution of the maximization problem:
maximize $Q_i(\theta|\theta^{(t)})$ subject to $\Sigma = \sigma^2 I_p + H$, $\mathrm{rank}(H) = r$, $\sigma > 0$, and $\tau_i > 0$ for all $i$.

Slide 30

M-step of EM 2

• The form of the solution is given in (Kang et al. 2014; Sun et al. 2016).
• Once $\Sigma$ is updated, eigendecompose it to obtain eigenvectors $(u_1, \ldots, u_p)$ and eigenvalues $(\lambda_1, \ldots, \lambda_p)$.
• Then reconstruct $\Sigma$ using this set of operations:
  $\sigma^2 = \frac{1}{p - r} \sum_{i=r+1}^p \lambda_i$ ;  $\hat{\lambda}_i = \lambda_i - \sigma^2$, $i = 1, \ldots, r$ ;  $\Sigma = \sigma^2 I_p + \sum_{i=1}^r \hat{\lambda}_i u_i u_i^\top = \sigma^2 I_p + H$.
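A minimal sketch of this reconstruction step (assumed implementation, not the authors' code; eigenvalues are sorted in decreasing order before applying the formulas):

```python
import numpy as np

def low_rank_project(Sigma, r):
    """Reconstruct Sigma = sigma^2 I + H with rank-r H from its EVD."""
    lam, U = np.linalg.eigh(Sigma)            # ascending eigenvalues
    lam, U = lam[::-1], U[:, ::-1]            # sort in decreasing order
    sigma2 = lam[r:].mean()                   # noise level from the p-r smallest
    lam_hat = lam[:r] - sigma2                # shrink the r leading eigenvalues
    H = (U[:, :r] * lam_hat) @ U[:, :r].T     # rank-r component
    return sigma2 * np.eye(Sigma.shape[0]) + H
```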

Slide 31

EM-Tyl-r: low-rank estimation of $\theta$ under the SG distribution with missing values.

Require: $\{y_i\}_{i=1}^n \sim \mathcal{N}(0, \tau_i \Sigma)$, $\{P_i\}_{i=1}^n$, rank $r < p$
Ensure: $\Sigma$, $\{\tau_i\}_{i=1}^n$
1: Initialize $\Sigma^{(0)}$, $\tau^{(0)}$.
2: repeat  (EM loop, $t$ varies)
3:   Compute $B_i^{(t)}$ and $C_i^{(t)}$ as in the previous algorithm
4:   repeat  (fixed point, $m$ varies; optional loop)
5:     $\Sigma_{m+1}^{(t)} = \mathcal{H}(\Sigma_m^{(t)})$
6:     Eigendecompose: $\Sigma_{m+1}^{(t)} = \sum_{i=1}^p \lambda_i u_i u_i^\top$
7:     Compute $\sigma^2$, $\hat{\lambda}_i$, $H$ and reconstruct $\Sigma_{m+1}^{(t)}$
8:   until $\|\Sigma_{m+1}^{(t)} - \Sigma_m^{(t)}\|_F^2$ converges
9:   Compute $\tau_i^{(t)}$, $i = 1, \ldots, n$
10:  $t \leftarrow t + 1$
11: until $\|\theta^{(t+1)} - \theta^{(t)}\|_F^2$ converges

Slide 32

Experiments setup

• CM model: $R_{ij} = \rho^{|i-j|}$, $R \overset{\text{EVD}}{=} U \Lambda U^\top$, $\Sigma = \sigma^2 I_p + U U^\top$, with $0 < \rho < 1$, $\sigma > 0$, $p = 15$.

Slide 33

Experiments setup

• CM model: $R_{ij} = \rho^{|i-j|}$, $R \overset{\text{EVD}}{=} U \Lambda U^\top$, $\Sigma = \sigma^2 I_p + U U^\top$, with $0 < \rho < 1$, $\sigma > 0$, $p = 15$.
• Geodesic distance: $\delta^2_{S_{++}^p}(\Sigma, \hat{\Sigma}) = \|\log(\Sigma^{-1/2} \hat{\Sigma} \Sigma^{-1/2})\|_2^2$

Slide 34

Experiments setup

• CM model: $R_{ij} = \rho^{|i-j|}$, $R \overset{\text{EVD}}{=} U \Lambda U^\top$, $\Sigma = \sigma^2 I_p + U U^\top$, with $0 < \rho < 1$, $\sigma > 0$, $p = 15$.
• Geodesic distance: $\delta^2_{S_{++}^p}(\Sigma, \hat{\Sigma}) = \|\log(\Sigma^{-1/2} \hat{\Sigma} \Sigma^{-1/2})\|_2^2$

Comparison with:
• Tyler's M-estimator (Tyler 1987) on observed, full and imputed data: $\Sigma_{\text{Tyler}} = \frac{p}{n} \sum_{i=1}^n \frac{y_i y_i^\top}{y_i^\top \Sigma_{\text{Tyler}}^{-1} y_i}$

Slide 35

Experiments setup

• CM model: $R_{ij} = \rho^{|i-j|}$, $R \overset{\text{EVD}}{=} U \Lambda U^\top$, $\Sigma = \sigma^2 I_p + U U^\top$, with $0 < \rho < 1$, $\sigma > 0$, $p = 15$.
• Geodesic distance: $\delta^2_{S_{++}^p}(\Sigma, \hat{\Sigma}) = \|\log(\Sigma^{-1/2} \hat{\Sigma} \Sigma^{-1/2})\|_2^2$

Comparison with:
• Tyler's M-estimator (Tyler 1987) on observed, full and imputed data: $\Sigma_{\text{Tyler}} = \frac{p}{n} \sum_{i=1}^n \frac{y_i y_i^\top}{y_i^\top \Sigma_{\text{Tyler}}^{-1} y_i}$
• Sample covariance matrix on observed and full data: $\Sigma_{\text{SCM}} = \frac{1}{n} \sum_{i=1}^n y_i y_i^\top$
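A minimal sketch of the simulation ingredients: the Toeplitz-based CM model, the geodesic distance used as error metric, and the two complete-data baselines. The value of $\rho$, the Frobenius norm, and the truncation of $U$ to 5 columns (mirroring the $r = 5$ panels of the results) are assumed implementation details:

```python
import numpy as np
from scipy.linalg import toeplitz, logm, fractional_matrix_power

p, rho, sigma = 15, 0.9, 1.0
R = toeplitz(rho ** np.arange(p))                 # R_ij = rho^{|i-j|}
_, U = np.linalg.eigh(R)
Sigma_true = sigma**2 * np.eye(p) + U[:, -5:] @ U[:, -5:].T   # rank-5 part (assumed)

def geodesic_dist2(A, B):
    """Squared geodesic distance on the SPD manifold."""
    A_isqrt = fractional_matrix_power(A, -0.5)
    return np.linalg.norm(logm(A_isqrt @ B @ A_isqrt), 'fro') ** 2

def tyler(Y, n_iter=100):
    """Tyler's M-estimator on complete data Y (n x p)."""
    n, p = Y.shape
    S = np.eye(p)
    for _ in range(n_iter):
        q = np.einsum('ij,jk,ik->i', Y, np.linalg.inv(S), Y)  # y_i^T S^{-1} y_i
        S = (p / n) * (Y / q[:, None]).T @ Y
        S *= p / np.trace(S)                      # fix the scale ambiguity
    return S

scm = lambda Y: Y.T @ Y / len(Y)                  # sample covariance matrix
```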

Slide 36

Results

[Figure: estimation error $\delta^2_{S_{++}^p}(\Sigma, \hat{\Sigma})$ (dB) versus $n$, for monotone and general missing-data patterns ($r = p$ and $r = 5$ panels); methods: EM-Tyl, EM-SCM, Tyl-clair, Tyl-obs, SCM-clair, SCM-obs, RMI, Mean-Tyl.]

Slide 37

Results

[Figure: estimation error $\delta^2_{S_{++}^p}(\Sigma, \hat{\Sigma})$ (dB) versus $n$, monotone and general patterns ($r = p$ and $r = 5$ panels); methods: EM-Tyl, EM-SCM, Tyl-clair, Tyl-obs, SCM-clair, SCM-obs, RMI, Mean-Tyl, plus the low-rank variants EM-Tyl-r and EM-SCM-r.]

Slide 38

Table of Contents

1. Problem formulation and data model
2. EM algorithms for incomplete data under the SG distribution
   • A brief introduction to the EM algorithm
   • The unstructured case
   • The structured case
   • Numerical simulations
3. Application to (un)supervised learning
   • Classification of crop fields
   • Image clustering
   • Classification of EEG signals

Slide 39

Application 1: covariance-based classification of crop fields

Breizhcrops dataset (Rußwurm et al. 2020): $\{\{y_k^t\}_{k=1}^K\}_{t=1}^T \in \mathbb{R}^p$, time series of reflectances at field parcels $k \in [1, K]$ and timestamps $t \in [1, T]$ over $p$ spectral bands.

Slide 40

Application 1: covariance-based classification of crop fields

Breizhcrops dataset (Rußwurm et al. 2020): $\{\{y_k^t\}_{k=1}^K\}_{t=1}^T \in \mathbb{R}^p$, time series of reflectances at field parcels $k \in [1, K]$ and timestamps $t \in [1, T]$ over $p$ spectral bands.

Problem: classify incomplete $y_k^t$ using a minimum distance to Riemannian mean (MDRM) classifier (Barachant et al. 2012).

Slide 41

Application 1: covariance-based classification of crop fields

Breizhcrops dataset (Rußwurm et al. 2020): $\{\{y_k^t\}_{k=1}^K\}_{t=1}^T \in \mathbb{R}^p$, time series of reflectances at field parcels $k \in [1, K]$ and timestamps $t \in [1, T]$ over $p$ spectral bands.

Problem: classify incomplete $y_k^t$ using a minimum distance to Riemannian mean (MDRM) classifier (Barachant et al. 2012).

• Each parcel is encoded by a $p \times p$ SPD matrix, giving $\{\Sigma_1, \ldots, \Sigma_K\}$.
• Standard train/test protocol.
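A minimal MDRM sketch (assumed implementation, not the talk's code): each class is summarized by the Riemannian (Karcher) mean of its training covariances, and a test covariance is assigned to the class with the nearest mean in geodesic distance:

```python
import numpy as np
from scipy.linalg import logm, expm, fractional_matrix_power

def geodesic_dist2(A, B):
    A_isqrt = fractional_matrix_power(A, -0.5)
    return np.linalg.norm(logm(A_isqrt @ B @ A_isqrt), 'fro') ** 2

def riemannian_mean(covs, n_iter=20):
    """Karcher mean on the SPD manifold via a standard fixed-point iteration."""
    M = np.mean(covs, axis=0)                      # Euclidean mean as starting point
    for _ in range(n_iter):
        M_sqrt = fractional_matrix_power(M, 0.5)
        M_isqrt = fractional_matrix_power(M, -0.5)
        T = np.mean([logm(M_isqrt @ C @ M_isqrt) for C in covs], axis=0)
        M = M_sqrt @ expm(T) @ M_sqrt              # step along the mean tangent vector
    return M

def mdrm_predict(test_cov, class_means):
    """Index of the class whose Riemannian mean is closest to test_cov."""
    return int(np.argmin([geodesic_dist2(M, test_cov) for M in class_means]))
```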

Slide 42

Classification results

[Figure: overall accuracy (%) versus the number of missing bands (1 band ≈ 7% of the data), for EM-SCM, EM-Tyl and RSI.]

• EM-Tyl-based classification handles incompleteness better than the EM-SCM one;
• EM-SCM ≈ EM-Tyl for higher missing data ratios.

Slide 43

Application 2: covariance-based image segmentation

Indian Pines dataset (Baumgardner et al. 2015): hyperspectral image with $p = 200$ bands and 16 classes partitioning the image.

[Figure: (a) Ground truth; (b) Simulated sensor failure]

Slide 44

Clustering set-up

• The mean is subtracted from each image.
• $\{\Sigma_1, \ldots, \Sigma_M\}$ are estimated in a $w \times w$ sliding window in each image.
• Clustering task: K-means++ algorithm (Vassilvitskii et al. 2006), as sketched below:
  • assigns each $\Sigma_i$ to the cluster whose center is the closest according to a geodesic distance;
  • updates each class center using a Riemannian gradient descent (Collas et al. 2021).
• Low-rank model with $r = 5$ (95% of the total cumulative variance).
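A minimal sketch of the K-means++ seeding and the assignment step on SPD matrices (assumed implementation; `dist2` stands for the squared geodesic distance of the previous slides, passed in as a function, and the Riemannian center update is omitted here):

```python
import numpy as np

def kmeanspp_init(covs, k, dist2, rng=np.random.default_rng(0)):
    """K-means++ seeding: new centers drawn with D^2 weighting."""
    centers = [covs[rng.integers(len(covs))]]        # first center chosen uniformly
    for _ in range(k - 1):
        d2 = np.array([min(dist2(C, M) for M in centers) for C in covs])
        centers.append(covs[rng.choice(len(covs), p=d2 / d2.sum())])
    return centers

def assign(covs, centers, dist2):
    """Label each covariance with the index of the nearest cluster center."""
    return np.array([np.argmin([dist2(C, M) for M in centers]) for C in covs])
```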

Slide 45

Clustering results

[Figure: overall accuracy (%) versus the number of incomplete bands, for EM-SCM, EM-SCM-r, EM-Tyl, EM-Tyl-r, RSI and RSI-r.]

• EM-Tyl-r gives a better OA for low missing data ratios;
• EM-SCM-r gives a better OA for high missing data ratios;
• Low number of runs due to a high runtime.

Slide 46

Application 3: classification of EEG signals [ongoing]

Joint work with: Florent Bouchard (L2S, Université Paris-Saclay), Frédéric Pascal (L2S, Université Paris-Saclay), Ammar Mian (LISTIC, Université Savoie Mont Blanc).

Slide 47

Application 3: classification of EEG signals [ongoing]

Joint work with: Florent Bouchard (L2S, Université Paris-Saclay), Frédéric Pascal (L2S, Université Paris-Saclay), Ammar Mian (LISTIC, Université Savoie Mont Blanc).

Dataset: $K$ EEG trials over $T$ timestamps and $p$ electrodes.
• Binary classification task: each trial belongs to the target (T) or non-target ($\bar{\text{T}}$) class;
• MDRM classifier with covariances $\{\Sigma_1, \ldots, \Sigma_K\}$;
• The data is assumed to be Gaussian for now;
• Some electrodes are set as missing (sensor failure).

Slide 48

Preliminary results

• Results on 10 subjects;
• Taking incompleteness into account is better than using only the observed values;
• Similar results to a KNN imputer.

Slide 49

A word to conclude

To summarize:
• EM-based procedure to perform robust low-rank estimation of the covariance matrix;
• Handles missing data with a general pattern;
• Compared to the Gaussian assumption / unstructured model: improvements in terms of CM estimation, supervised classification and unsupervised clustering.

Slide 50

A word to conclude

To summarize:
• EM-based procedure to perform robust low-rank estimation of the covariance matrix;
• Handles missing data with a general pattern;
• Compared to the Gaussian assumption / unstructured model: improvements in terms of CM estimation, supervised classification and unsupervised clustering.

Some perspectives include:
• Extension to other classes of SG distributions;
• Considering the joint distribution of the data and the missing-data mechanism: the E-step will change drastically;
  • this has been done for Gaussian and MNAR data (Sportisse et al. 2020);
• Classification with temporal gaps rather than spectral gaps;
• EEG signals: take the SG distribution into account and think about a variable selection strategy.

Slide 51

Thanks!

Slide 52

References I

Augusto Aubry et al. "Structured Covariance Matrix Estimation with Missing-Data for Radar Applications via Expectation-Maximization". arXiv preprint arXiv:2105.03738 (2021).
Alexandre Barachant et al. "Multiclass Brain–Computer Interface Classification by Riemannian Geometry". IEEE Transactions on Biomedical Engineering 59.4 (2012), pp. 920–928. doi: 10.1109/TBME.2011.2172210.
Marion F. Baumgardner, Larry L. Biehl, and David A. Landgrebe. "220 Band AVIRIS Hyperspectral Image Data Set: June 12, 1992 Indian Pine Test Site 3". Purdue University Research Repository 10 (2015).
Gabriel Frahm and Uwe Jaekel. "A generalization of Tyler's M-estimators to the case of incomplete data". Computational Statistics & Data Analysis 54.2 (2010), pp. 374–393.
Bosung Kang, Vishal Monga, and Muralidhar Rangaswamy. "Rank-constrained maximum likelihood estimation of structured covariance matrices". IEEE Transactions on Aerospace and Electronic Systems 50.1 (2014), pp. 501–515. doi: 10.1109/TAES.2013.120389.

Slide 53

References II

Chuanhai Liu. "Efficient ML estimation of the multivariate normal distribution from incomplete data". Journal of Multivariate Analysis 69 (1999), pp. 206–217. doi: 10.1006/jmva.1998.1793.
Karim Lounici et al. "High-dimensional covariance matrix estimation with missing observations". Bernoulli 20.3 (2014), pp. 1029–1058.
Junyan Liu and Daniel P. Palomar. "Regularized robust estimation of mean and covariance matrix for incomplete data". Signal Processing 165 (2019), pp. 278–291. doi: 10.1016/j.sigpro.2019.07.009.
Chuanhai Liu and Donald B. Rubin. "ML estimation of the t distribution using EM and its extensions, ECM and ECME". Statistica Sinica (1995), pp. 19–39.
Marc Rußwurm et al. "BreizhCrops: A Time Series Dataset for Crop Type Mapping". International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (2020).

Slide 54

References III

Aude Sportisse, Claire Boyer, and Julie Josse. "Imputation and low-rank estimation with Missing Not At Random data". arXiv: 1812.11409 [stat.ML] (2020).
Y. Sun, P. Babu, and D. P. Palomar. "Robust Estimation of Structured Covariance Matrix for Heavy-Tailed Elliptical Distributions". IEEE Transactions on Signal Processing 64.14 (2016), pp. 3576–3590. doi: 10.1109/TSP.2016.2546222.
Nicolas Städler, Daniel J. Stekhoven, and Peter Bühlmann. "Pattern alternating maximization algorithm for missing data in high-dimensional problems". Journal of Machine Learning Research 15.1 (2014), pp. 1903–1928.
David E. Tyler. "A Distribution-Free M-Estimator of Multivariate Scatter". The Annals of Statistics 15.1 (1987), pp. 234–251.
Sergei Vassilvitskii and David Arthur. "k-means++: The advantages of careful seeding". Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms (2006), pp. 1027–1035.

Slide 55

Convergence of the EM

[Figure: $\mathrm{NMSE}_{EM,\Sigma}$ and $\mathrm{NMSE}_{EM,\tau_i}$, i.e. $\|\theta^{(t+1)} - \theta^{(t)}\|$, versus the number of iterations.]