
Alexandre Hippert-Ferrer

S³ Seminar
September 24, 2021

Alexandre Hippert-Ferrer

(L2S, CentraleSupelec, Université Paris Saclay)

https://s3-seminar.github.io/seminars/alexandre-hippert

Title — Robust low-rank covariance matrix estimation with missing values and application to classification problems

Abstract — Missing values are inherent to real-world data sets. Statistical learning problems often require the estimation of parameters such as the mean or the covariance matrix (CM). If the data is incomplete, new estimation methodologies need to be designed depending on the data distribution and the missingness pattern (i.e. the pattern describing which values are missing with respect to the observed data). This talk considers robust CM estimation when the data is incomplete. In this perspective, classical statistical estimation methodologies are usually built upon the Gaussian assumption, whereas existing robust estimation ones assume unstructured signal models. The former can be inaccurate in real-world data sets in which heterogeneity causes heavy-tailed distributions, while the latter does not profit from the usual low-rank structure of the signal. Taking advantage of both worlds, a CM estimation procedure is designed on a robust (compound Gaussian) low-rank model by leveraging the observed-data likelihood function within an expectation-maximization (EM) algorithm. After a validation on simulated data sets with various missingness patterns, the interest of the proposed procedure is shown for CM-based classification and clustering problems with incomplete data. Investigated examples generally show higher classification accuracies with a classifier based on robust estimation compared to the one based on the Gaussian assumption and the one based on imputed data.


Transcript

  1. Robust low-rank covariance matrix estimation with missing values and application to classification problems
     Alexandre Hippert-Ferrer, Laboratoire des Signaux et Systèmes – CentraleSupélec – Université Paris-Saclay
     Séminaire S³ – September 2021
     Joint work with: Mohammed Nabil El Korso (LEME, Université Paris Nanterre), Arnaud Breloy (LEME, Université Paris Nanterre), Guillaume Ginolhac (LISTIC, Université Savoie Mont Blanc)
  2–3. Introduction
     Parameter of interest: $\theta = \{\mu, \Sigma, \dots\}$, to be estimated in the presence of missing values.
     [Figures: (a) Missing completely at random; (b) Missing not at random]
  4. Table of Contents
     1. Problem formulation and data model
     2. EM algorithms for incomplete data under the SG distribution: a brief introduction to the EM algorithm; the unstructured case; the structured case; numerical simulations
     3. Application to (non-)supervised learning: classification of crop fields; image clustering; classification of EEG signals
  5–9. Problem formulation
     Let us consider:
     • Complete data $\{y_i\}_{i=1}^n \in \mathbb{R}^p$, with $y_i = \{y_i^o, y_i^m\}$ (observed and missing parts);
     • An unknown parameter of interest $\theta \in \Omega$;
     • A probabilistic model of the data $p(y \mid \theta)$;
     • Maximum likelihood (ML): $\hat{\theta}_{\mathrm{ML}} = \arg\max_{\theta \in \Omega} p(y \mid \theta)$.
     How to estimate θ from incomplete samples?
     1. Use observed data only;
     2. Impute the data, then estimate θ;
     3. Use the Expectation-Maximization (EM) algorithm → a handy iterative procedure to find $\hat{\theta}_{\mathrm{ML}}$.
  10–11. Motivations
     • Work on ML estimation of Σ with incomplete data exists, e.g.:
       – Gaussian: $n > p$ (C. Liu 1999); $p > n$ (Lounici et al. 2014; Städler et al. 2014); $\mathrm{rank}(\Sigma) = r < p$ (Sportisse et al. 2020; Aubry et al. 2021);
       – non-Gaussian: t-distribution (C. Liu and Rubin 1995; J. Liu et al. 2019); GEM (Frahm et al. 2010).
     • Missing data patterns: monotone, general, random. Most of the work addresses the Gaussian / full-rank Σ case.
  12. Aim of this work
     1. Build generic algorithms that take advantage of both robust estimation and low-rank structure;
     2. Handle any pattern (≠ mechanism) of missing values: monotone, general, random;
     3. Apply these procedures to covariance-based imputation/classification/clustering problems.
  13–15. The scaled-Gaussian (SG) distribution
     • Model: $y_i \mid \tau_i \sim \mathcal{N}(0, \tau_i \Sigma)$, with $\Sigma \in \mathcal{S}_{++}^p$ and texture parameter $\tau_i > 0$.
     • Complete likelihood function:
       $L_c(y_i \mid \Sigma, \tau_i) \propto \prod_{i=1}^{n} |\tau_i \Sigma|^{-1} \exp\left(- y_i^{\top} (\tau_i \Sigma)^{-1} y_i\right)$
     • Constraints on Σ (low-rank structure):
       $\Sigma = \sigma^2 I_p + H$, with $H \succeq 0$, $\mathrm{rank}(H) = r$, $\sigma > 0$.
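     A minimal Python sketch of how such scaled-Gaussian samples can be generated; the texture law (inverse-gamma here) is our illustrative choice, not specified by the talk:

        import numpy as np

        def sample_scaled_gaussian(n, Sigma, rng=None):
            """Draw n samples y_i | tau_i ~ N(0, tau_i * Sigma).

            The textures tau_i are drawn from an inverse-gamma law purely
            for illustration (any positive law yields a compound Gaussian).
            """
            rng = np.random.default_rng(rng)
            p = Sigma.shape[0]
            L = np.linalg.cholesky(Sigma)
            tau = 1.0 / rng.gamma(shape=2.0, scale=1.0, size=n)   # heavy-tailed textures
            g = rng.standard_normal((n, p)) @ L.T                 # g_i ~ N(0, Sigma)
            return np.sqrt(tau)[:, None] * g, tau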
  16–17. Data incompleteness model
     • Data transformation: $\tilde{y}_i = P_i y_i = \begin{pmatrix} y_i^o \\ y_i^m \end{pmatrix}$, $i \in [1, n]$, where $\{P_i\}_{i=1}^n \subset \mathbb{R}^{p \times p}$ is a set of $n$ permutation matrices.
     • Example with $n = p = 3$: each $\tilde{y}_i = P_i y_i$ reorders the entries of $y_i$ so that observed values come first and missing ones (e.g. $y_{32}$) last. [Figure: the three vectors before and after permutation]
     • Covariance matrix: $\tilde{\Sigma}_i = \begin{pmatrix} \Sigma_{i,oo} & \Sigma_{i,om} \\ \Sigma_{i,mo} & \Sigma_{i,mm} \end{pmatrix} = P_i \Sigma P_i^{\top}$, where $\Sigma_{i,mm}$, $\Sigma_{i,mo}$ and $\Sigma_{i,oo}$ are the covariance blocks of $y_i^m$, of $(y_i^m, y_i^o)$ and of $y_i^o$.
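     A small sketch of this bookkeeping in Python (the function name and mask convention are ours):

        import numpy as np

        def permutation_from_mask(obs_mask):
            """Permutation matrix P_i putting observed coordinates first.

            obs_mask: boolean vector of length p, True where y_i is observed.
            Returns P such that P @ y = [y_obs; y_miss].
            """
            order = np.concatenate([np.flatnonzero(obs_mask),
                                    np.flatnonzero(~obs_mask)])
            return np.eye(obs_mask.size)[order]   # reordered rows of the identity

        # Blocks of the permuted covariance P Sigma P^T, with k observed entries:
        # S = P @ Sigma @ P.T; S_oo = S[:k, :k]; S_om = S[:k, k:]; S_mm = S[k:, k:]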
  18. Table of Contents [section transition to Part 2: EM algorithms for incomplete data under the SG distribution]
  19. Introduction to the EM algorithm
     • Iterative scheme for ML estimation in incomplete-data problems;
     • Provides a formal approach to the intuitive ad hoc idea of filling in missing values;
     • The EM alternates between making guesses about the complete data $y$ (E-step) and finding the θ that maximizes $L_c(\theta \mid y)$ (M-step);
     • Under some general conditions on the complete data, $L(\hat{\theta}_{\mathrm{EM}})$ converges to $L(\hat{\theta}_{\mathrm{ML}})$.
  20. The EM algorithm
     • Initialization: at step $t = 0$, make an initial estimate $\theta^{(0)}$ (using prior knowledge or a sub-optimal existing algorithm).
     • E-step (guess the complete data from the current $\theta^{(t)}$):
       $Q(\theta \mid \theta^{(t)}) = \int L_c(\theta \mid y)\, f(y^m \mid y^o, \theta^{(t)})\, \mathrm{d}y^m = \mathbb{E}_{y_i^m \mid y_i^o, \theta^{(t)}}\left[L_c(\theta \mid y)\right]$
     • M-step (update θ from the "guessed" complete data): choose $\theta^{(t+1)}$ such that
       $Q(\theta^{(t+1)} \mid \theta^{(t)}) \geq Q(\theta \mid \theta^{(t)})$.
     Then repeat E- and M-steps until a stopping criterion, such as the distance $\|\theta^{(t+1)} - \theta^{(t)}\|$, falls below a pre-defined threshold. The generic loop is sketched below.
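     As a plain illustration of this alternation (the function names and the array-valued θ are our simplification):

        import numpy as np

        def em(theta0, e_step, m_step, tol=1e-8, max_iter=500):
            """Generic EM skeleton.

            e_step(theta) -> sufficient statistics of the completed data
            m_step(stats) -> parameter maximizing Q(. | theta)
            """
            theta = theta0
            for _ in range(max_iter):
                stats = e_step(theta)        # E-step: condition on observed data
                theta_new = m_step(stats)    # M-step: maximize the expected log-likelihood
                if np.linalg.norm(theta_new - theta) < tol:   # stopping criterion
                    break
                theta = theta_new
            return theta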
  21–22. Ingredients for our EM
     1. Transformed data: $\tilde{y}_i = P_i y_i = (y_i^o; y_i^m)$;
     2. Transformed CM: $\tilde{\Sigma}_i = P_i \Sigma P_i^{\top}$;
     3. The (new) complete log-likelihood $L_c$:
        $L_c(\theta \mid Y) \propto -n \log|\Sigma| - p \sum_{i=1}^n \log \tau_i - \sum_{i=1}^n \tilde{y}_i^{\top} (\tau_i \tilde{\Sigma}_i)^{-1} \tilde{y}_i$
     EM 1: no structure on Σ (full rank). EM 2: low-rank structure on Σ ($r < p$).
  23–24. EM 1 – No structure on Σ
     Parameters: $\theta = \{\Sigma, \tau_1, \dots, \tau_n\}$.
     E-step: compute the expectation of $L_c$ (a few manipulations are needed):
       $Q_i(\theta \mid \theta^{(t)}) = \mathbb{E}_{y_i^m \mid y_i^o, \theta^{(t)}}\left[L_c(\theta \mid y_i^o, y_i^m)\right] = \tau_i^{-1} \mathrm{tr}\left(B_i^{(t)} \tilde{\Sigma}_i^{-1}\right)$
     with $B_i^{(t)} = \begin{pmatrix} y_i^o y_i^{o\top} & 0 \\ 0 & \mathbb{E}_{y_i^m \mid y_i^o, \theta^{(t)}}[y_i^m y_i^{m\top}] \end{pmatrix}$.
     M-step: obtain $\theta^{(t+1)}$ as the solution of the maximization problem
       $\max_\theta\; Q_i(\theta \mid \theta^{(t)})$ subject to $\Sigma \succ 0$, $\tau_i > 0\ \forall i$.
  25–26. M-step: update parameters with closed-form expressions
     Proposition: closed-form expressions of $\{\tau_i\}_{i=1}^n$ and Σ exist:
       $\tau_i = \frac{\mathrm{tr}(B_i^{(t)} \tilde{\Sigma}_i^{-1})}{p}\;; \qquad \Sigma = \frac{p}{n} \sum_{i=1}^n \frac{C_i^{(t)}}{\mathrm{tr}(C_i^{(t)} \Sigma^{-1})} \triangleq H(\Sigma) \qquad (1)$
     with $C_i^{(t)} = P_i^{\top} B_i^{(t)} P_i$, and where $\Sigma_{m+1} = H(\Sigma_m)$ is the fixed-point algorithm (sketched below).
     To obtain (1), one needs to:
     1. Differentiate $L_c$ w.r.t. $\tau_i$ and solve $\partial L_c / \partial \tau_i = 0$;
     2. Differentiate $L_c$ w.r.t. Σ and solve $\partial L_c / \partial \Sigma = 0$.
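     A sketch of the fixed-point iteration $\Sigma_{m+1} = H(\Sigma_m)$ of Eq. (1), under our reading of the slide (normalization choices, e.g. a trace constraint as in Tyler's estimator, may differ):

        import numpy as np

        def fixed_point_H(C, Sigma0, tol=1e-8, max_iter=100):
            """Iterate Sigma <- (p/n) * sum_i C_i / tr(C_i Sigma^{-1}).

            C: array of shape (n, p, p) holding the C_i^{(t)} = P_i^T B_i^{(t)} P_i.
            """
            n, p, _ = C.shape
            Sigma = Sigma0
            for _ in range(max_iter):
                w = np.trace(C @ np.linalg.inv(Sigma), axis1=1, axis2=2)  # tr(C_i Sigma^{-1})
                Sigma_new = (p / n) * np.sum(C / w[:, None, None], axis=0)
                if np.linalg.norm(Sigma_new - Sigma, 'fro') < tol:
                    return Sigma_new
                Sigma = Sigma_new
            return Sigma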
  27. EM-Tyl: estimation of θ under the CG distribution with missing values.
     Require: $\{y_i\}_{i=1}^n \sim \mathcal{N}(0, \tau_i \Sigma)$, $\{P_i\}_{i=1}^n$
     Ensure: Σ, $\{\tau_i\}_{i=1}^n$
     1: Initialization: $\Sigma^{(0)} = \Sigma_{\text{Tyl-obs}}$; $\tau^{(0)} = \mathbf{1}_N$
     2: repeat (EM loop, $t$ varies)
     3:   Compute $B_i^{(t)} = \mathrm{blkdiag}\left(y_i^o y_i^{o\top},\; \mathbb{E}_{y_i^m \mid y_i^o, \theta^{(t)}}[y_i^m y_i^{m\top}]\right)$
     4:   Compute $C_i^{(t)} = P_i^{\top} B_i^{(t)} P_i$
     5:   repeat (fixed point, $m$ varies — optional loop)
     6:     $\Sigma_{m+1}^{(t)} = H(\Sigma_m^{(t)})$
     7:   until $\|\Sigma_{m+1}^{(t)} - \Sigma_m^{(t)}\|_F^2$ converges
     8:   Compute $\tau_i^{(t)}$, $i = 1, \dots, n$
     9:   $t \leftarrow t + 1$
     10: until $\|\theta^{(t+1)} - \theta^{(t)}\|_F^2$ converges
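     The conditional second moment in line 3 relies on the standard Gaussian conditioning formulas; a sketch for one sample (our own assembly of the block-diagonal $B_i$):

        import numpy as np

        def e_step_B(y, obs, Sigma, tau):
            """Block B_i^{(t)} for one sample (line 3 of EM-Tyl).

            y: length-p vector (missing entries may hold NaN); obs: boolean mask.
            Under y | tau ~ N(0, tau * Sigma):
              E[y_m | y_o]   = S_mo S_oo^{-1} y_o
              Cov[y_m | y_o] = tau * (S_mm - S_mo S_oo^{-1} S_om)
            """
            o, m = obs, ~obs
            S_oo, S_om = Sigma[np.ix_(o, o)], Sigma[np.ix_(o, m)]
            S_mo, S_mm = Sigma[np.ix_(m, o)], Sigma[np.ix_(m, m)]
            A = S_mo @ np.linalg.inv(S_oo)
            mu_m = A @ y[o]                                   # conditional mean
            Emm = np.outer(mu_m, mu_m) + tau * (S_mm - A @ S_om)
            k = int(o.sum())
            B = np.zeros((y.size, y.size))
            B[:k, :k] = np.outer(y[o], y[o])                  # observed block
            B[k:, k:] = Emm                                   # conditional second moment
            return B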
  28–29. EM 2 – Low-rank structure on Σ
     Parameters: $\theta = \{H, \sigma^2, \tau_1, \dots, \tau_n\}$.
     E-step: compute the expectation of $L_c$ (same as EM 1):
       $Q_i(\theta \mid \theta^{(t)}) = \mathbb{E}_{y_i^m \mid y_i^o, \theta^{(t)}}\left[L_c(\theta \mid y_i^o, y_i^m)\right] = \tau_i^{-1} \mathrm{tr}\left(B_i^{(t)} \tilde{\Sigma}_i^{-1}\right)$
     with $B_i^{(t)}$ as before.
     M-step: obtain $\theta^{(t+1)}$ as the solution of
       $\max_\theta\; Q_i(\theta \mid \theta^{(t)})$ subject to $\Sigma = \sigma^2 I_p + H$, $\mathrm{rank}(H) = r$, $\sigma > 0$, $\tau_i > 0\ \forall i$.
  30. M-step of EM 2
     • Form of the solution given in (Kang et al. 2014; Sun et al. 2016).
     • Once Σ is updated, eigendecompose it to obtain eigenvectors $(u_1, \dots, u_p)$ and eigenvalues $(\lambda_1, \dots, \lambda_p)$.
     • Then reconstruct Σ using this set of operations:
       $\sigma^2 = \frac{1}{p - r} \sum_{i=r+1}^{p} \lambda_i$; $\quad \hat{\lambda}_i = \lambda_i - \sigma^2,\; i = 1, \dots, r$; $\quad \Sigma = \sigma^2 I_p + \sum_{i=1}^{r} \hat{\lambda}_i u_i u_i^{\top} = \sigma^2 I_p + H$.
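     These operations translate directly into a few lines of Python (a sketch; eigenvalues are sorted in decreasing order first):

        import numpy as np

        def lowrank_reconstruct(Sigma, r):
            """Rank-r reconstruction Sigma = sigma^2 I_p + H of the EM 2 M-step."""
            p = Sigma.shape[0]
            lam, U = np.linalg.eigh(Sigma)       # ascending order
            lam, U = lam[::-1], U[:, ::-1]       # re-sort in decreasing order
            sigma2 = lam[r:].mean()              # average of the p - r trailing eigenvalues
            lam_hat = lam[:r] - sigma2           # shifted leading eigenvalues
            H = (U[:, :r] * lam_hat) @ U[:, :r].T
            return sigma2 * np.eye(p) + H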
  31. EM-Tyl-r: low-rank estimation of θ under the CG distribution with missing values.
     Require: $\{y_i\}_{i=1}^n \sim \mathcal{N}(0, \tau_i \Sigma)$, $\{P_i\}_{i=1}^n$, rank $r < p$
     Ensure: Σ, $\{\tau_i\}_{i=1}^n$
     1: Initialize $\Sigma^{(0)}$, $\tau^{(0)}$.
     2: repeat (EM loop, $t$ varies)
     3:   Compute $B_i^{(t)}$ and $C_i^{(t)}$ as in the previous algorithm
     4:   repeat (fixed point, $m$ varies — optional loop)
     5:     $\Sigma_{m+1}^{(t)} = H(\Sigma_m^{(t)})$
     6:     Eigendecompose $\Sigma_{m+1}^{(t)} \overset{\mathrm{EVD}}{=} \sum_{i=1}^{p} \lambda_i u_i u_i^{\top}$
     7:     Compute $\sigma^2$, $\hat{\lambda}_i$, $H$ and reconstruct $\Sigma_{m+1}^{(t)}$
     8:   until $\|\Sigma_{m+1}^{(t)} - \Sigma_m^{(t)}\|_F^2$ converges
     9:   Compute $\tau_i^{(t)}$, $i = 1, \dots, n$
     10:  $t \leftarrow t + 1$
     11: until $\|\theta^{(t+1)} - \theta^{(t)}\|_F^2$ converges
  32–35. Experiments setup
     • CM model: $R_{ij} = \rho^{|i-j|}$, $R \overset{\mathrm{EVD}}{=} U \Lambda U^{\top}$, $\Sigma = \sigma^2 I_p + U U^{\top}$, with $0 < \rho < 1$, $\sigma > 0$, $p = 15$.
     • Geodesic distance: $\delta^2_{\mathcal{S}_{++}^p}(\Sigma, \hat{\Sigma}) = \|\log(\Sigma^{-1/2} \hat{\Sigma} \Sigma^{-1/2})\|_2^2$
     Comparison with:
     • Tyler's M-estimator (Tyler 1987) on observed, full and imputed data:
       $\Sigma_{\mathrm{Tyler}} = \frac{p}{n} \sum_{i=1}^n \frac{y_i y_i^{\top}}{y_i^{\top} \Sigma_{\mathrm{Tyler}}^{-1} y_i}$
     • Sample covariance matrix on observed and full data:
       $\Sigma_{\mathrm{SCM}} = \frac{1}{n} \sum_{i=1}^n y_i y_i^{\top}$
     (Both baselines and the error metric are sketched below.)
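     A sketch of the two baselines and the error metric (we use the Frobenius norm of the matrix logarithm, and scipy.linalg for sqrtm/logm):

        import numpy as np
        from scipy.linalg import sqrtm, logm

        def geodesic_dist2(Sigma, Sigma_hat):
            """Squared affine-invariant (geodesic) distance on S_{++}^p."""
            S_isqrt = np.linalg.inv(sqrtm(Sigma))
            return np.linalg.norm(logm(S_isqrt @ Sigma_hat @ S_isqrt), 'fro') ** 2

        def scm(Y):
            """Sample covariance matrix of the zero-mean rows of Y (n, p)."""
            return Y.T @ Y / Y.shape[0]

        def tyler(Y, tol=1e-8, max_iter=100):
            """Tyler's M-estimator on complete data (fixed point, trace-normalized)."""
            n, p = Y.shape
            Sigma = np.eye(p)
            for _ in range(max_iter):
                q = np.einsum('ij,jk,ik->i', Y, np.linalg.inv(Sigma), Y)  # y_i^T S^{-1} y_i
                Sigma_new = (p / n) * (Y / q[:, None]).T @ Y
                Sigma_new *= p / np.trace(Sigma_new)       # fix the scale ambiguity
                if np.linalg.norm(Sigma_new - Sigma, 'fro') < tol:
                    return Sigma_new
                Sigma = Sigma_new
            return Sigma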
  36. Results – [Figure: estimation error $\delta^2_{\mathcal{S}_{++}^p}(\Sigma, \hat{\Sigma})$ (dB) versus $n$, for monotone and general missingness patterns in the full-rank case ($r = p$). Methods: EM-Tyl, EM-SCM, Tyl-clair, Tyl-obs, SCM-clair, SCM-obs, RMI, Mean-Tyl.]
  37. Results – [Figure: same comparison with the low-rank variants EM-Tyl-r and EM-SCM-r added ($r = 5$).]
  38. Table of Contents [section transition to Part 3: Application to (non-)supervised learning]
  39–41. Application 1: covariance-based classification of crop fields
     Breizhcrops dataset (Rußwurm et al. 2020) – $\{\{y_k^t\}_{k=1}^K\}_{t=1}^T \in \mathbb{R}^p$: time series of reflectances at field parcels $k \in [1, K]$ and timestamps $t \in [1, T]$ over $p$ spectral bands.
     Problem – classify incomplete $y_k^t$ using a minimum distance to Riemannian mean (MDRM) classifier (Barachant et al. 2012); a sketch follows below.
     • Each parcel is encoded by a $p \times p$ SPD matrix, giving $\{\Sigma_1, \dots, \Sigma_K\}$;
     • Train–test setting.
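     MDRM reduces to computing one Riemannian mean per class and assigning each test covariance to the nearest mean. A sketch, reusing geodesic_dist2 from the experiments section; the Karcher-mean iteration below is a standard choice, not necessarily the authors' exact implementation:

        import numpy as np
        from scipy.linalg import sqrtm, logm, expm

        def riemannian_mean(covs, n_iter=20):
            """Karcher mean of SPD matrices (covs: array of shape (m, p, p))."""
            G = covs.mean(axis=0)
            for _ in range(n_iter):
                G_sqrt = sqrtm(G)
                G_isqrt = np.linalg.inv(G_sqrt)
                T = np.mean([logm(G_isqrt @ C @ G_isqrt) for C in covs], axis=0)
                G = G_sqrt @ expm(T) @ G_sqrt     # step back onto the manifold
            return G

        def mdrm_predict(Sigma, class_means, dist2):
            """Assign Sigma to the class whose Riemannian mean is closest."""
            return int(np.argmin([dist2(G, Sigma) for G in class_means]))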
  42. Classification results – [Figure: overall accuracy (%) versus the number of missing bands, from none to 6 (1 band ≈ 7% of the data), for EM-SCM, EM-Tyl and RSI.]
     • EM-Tyl-based classification handles incompleteness better than the EM-SCM one;
     • EM-SCM ∼ EM-Tyl for higher missing-data ratios.
  43. Application 2: covariance-based image segmentation
     Indian Pines dataset (Baumgardner et al. 2015) – hyperspectral image with $p = 200$ bands and 16 classes partitioning the image. [Figures: (a) Ground truth; (b) Simulated sensor failure]
  44. Clustering set-up
     • The mean is subtracted from each image;
     • $\{\Sigma_1, \dots, \Sigma_M\}$ are estimated in a $w \times w$ sliding window in each image;
     • Clustering task: K-means++ algorithm (Vassilvitskii et al. 2006), sketched below:
       – assigns each $\Sigma_i$ to the cluster whose center is the closest according to a geodesic distance;
       – updates each class center using a Riemannian gradient descent (Collas et al. 2021);
     • Low-rank model with $r = 5$ (95% of the total cumulative variance).
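     A sketch of the clustering loop on SPD matrices, reusing riemannian_mean and geodesic_dist2 above (centers initialized uniformly at random here rather than with the k-means++ seeding; assumes no cluster becomes empty):

        import numpy as np

        def riemannian_kmeans(covs, k, dist2, mean_fn, n_iter=10, rng=None):
            """K-means on SPD matrices: geodesic assignment + Riemannian mean update.

            covs: array (M, p, p); dist2: squared geodesic distance;
            mean_fn: Riemannian mean of a stack of SPD matrices.
            """
            rng = np.random.default_rng(rng)
            centers = covs[rng.choice(len(covs), size=k, replace=False)]
            for _ in range(n_iter):
                labels = np.array([np.argmin([dist2(G, C) for G in centers])
                                   for C in covs])               # geodesic assignment
                centers = [mean_fn(covs[labels == j]) for j in range(k)]  # center update
            return labels, centers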
  45. Clustering results – [Figure: overall accuracy (%) versus the number of incomplete bands, from none to 5, for EM-SCM, EM-SCM-r, EM-Tyl, EM-Tyl-r, RSI and RSI-r.]
     • EM-Tyl-r gives better OA for low missing-data ratios;
     • EM-SCM-r gives better OA for high missing-data ratios;
     • Low number of runs due to a high runtime.
  46–47. Application 3: classification of EEG signals [ongoing]
     Joint work with: Florent Bouchard (L2S, Université Paris-Saclay), Frédéric Pascal (L2S, Université Paris-Saclay), Ammar Mian (LISTIC, Université Savoie Mont Blanc).
     Dataset – $K$ EEG trials over $T$ timestamps and $p$ electrodes.
     • Binary classification task: each trial belongs to the target (T) or non-target ($\bar{\mathrm{T}}$) class;
     • MDRM classifier with covariances $\{\Sigma_1, \dots, \Sigma_K\}$;
     • The data is assumed to be Gaussian for now;
     • Some electrodes are set as missing (sensor failure).
  48. Preliminary results
     • Results on 10 subjects;
     • Taking incompleteness into account is better than using only the observed values;
     • Results similar to those of a KNN imputer.
  49–50. A word to conclude
     To summarize:
     • EM-based procedures to perform robust low-rank estimation of the covariance matrix;
     • They handle missing data with a general pattern;
     • Compared to the Gaussian assumption / unstructured model: improvements in terms of CM estimation, supervised classification and unsupervised clustering.
     Some perspectives include:
     • Extension to other classes of SG distributions;
     • Considering the joint distribution of the data and the missing-data mechanism (the E-step will change drastically); this has been done for Gaussian and MNAR data (Sportisse et al. 2020);
     • Classification with temporal gaps rather than spectral gaps;
     • EEG signals: take the SG distribution into account and devise a variable selection strategy.
  51. References I
     - Augusto Aubry et al. "Structured Covariance Matrix Estimation with Missing-Data for Radar Applications via Expectation-Maximization". arXiv preprint arXiv:2105.03738 (2021).
     - Alexandre Barachant et al. "Multiclass Brain–Computer Interface Classification by Riemannian Geometry". IEEE Transactions on Biomedical Engineering 59.4 (2012), pp. 920–928. doi: 10.1109/TBME.2011.2172210.
     - Marion F. Baumgardner, Larry L. Biehl, and David A. Landgrebe. "220 Band AVIRIS Hyperspectral Image Data Set: June 12, 1992 Indian Pine Test Site 3". Purdue University Research Repository 10 (2015).
     - Gabriel Frahm and Uwe Jaekel. "A generalization of Tyler's M-estimators to the case of incomplete data". Computational Statistics & Data Analysis 54.2 (2010), pp. 374–393.
     - Bosung Kang, Vishal Monga, and Muralidhar Rangaswamy. "Rank-constrained maximum likelihood estimation of structured covariance matrices". IEEE Transactions on Aerospace and Electronic Systems 50.1 (2014), pp. 501–515. doi: 10.1109/TAES.2013.120389.
  52. References II
     - Chuanhai Liu. "Efficient ML estimation of the multivariate normal distribution from incomplete data". Journal of Multivariate Analysis 69 (1999), pp. 206–217. doi: 10.1006/jmva.1998.1793.
     - Karim Lounici et al. "High-dimensional covariance matrix estimation with missing observations". Bernoulli 20.3 (2014), pp. 1029–1058.
     - Junyan Liu and Daniel P. Palomar. "Regularized robust estimation of mean and covariance matrix for incomplete data". Signal Processing 165 (2019), pp. 278–291. doi: 10.1016/j.sigpro.2019.07.009.
     - Chuanhai Liu and Donald B. Rubin. "ML estimation of the t distribution using EM and its extensions, ECM and ECME". Statistica Sinica (1995), pp. 19–39.
     - Marc Rußwurm et al. "BreizhCrops: A Time Series Dataset for Crop Type Mapping". International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (2020).
  53. References III
     - Aude Sportisse, Claire Boyer, and Julie Josse. "Imputation and low-rank estimation with Missing Not At Random data". arXiv: 1812.11409 [stat.ML] (2020).
     - Ying Sun, Prabhu Babu, and Daniel P. Palomar. "Robust Estimation of Structured Covariance Matrix for Heavy-Tailed Elliptical Distributions". IEEE Transactions on Signal Processing 64.14 (2016), pp. 3576–3590. doi: 10.1109/TSP.2016.2546222.
     - Nicolas Städler, Daniel J. Stekhoven, and Peter Bühlmann. "Pattern alternating maximization algorithm for missing data in high-dimensional problems". Journal of Machine Learning Research 15.1 (2014), pp. 1903–1928.
     - David E. Tyler. "A Distribution-Free M-Estimator of Multivariate Scatter". The Annals of Statistics 15.1 (1987), pp. 234–251.
     - Sergei Vassilvitskii and David Arthur. "k-means++: The advantages of careful seeding". Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms (2006), pp. 1027–1035.
  54. Convergence of the EM – [Figure: $\mathrm{NMSE}_{\mathrm{EM},\Sigma}$ and $\mathrm{NMSE}_{\mathrm{EM},\tau_i}$, i.e. $\|\theta^{(t+1)} - \theta^{(t)}\|$, versus the number of iterations.]