
S³ Seminar
February 03, 2015

Esa Ollila

(Aalto University, Finland)

https://s3-seminar.github.io/seminars/esa-ollila

Title — Robust approaches to multichannel sparse recovery

Abstract — We consider the multichannel sparse recovery problem, where the objective is to recover jointly sparse unknown signal vectors from multiple measurement vectors that are different linear combinations of the same known elementary vectors (atoms). The model is thus an extension of the single measurement vector setting used in compressed sensing (CS). Many popular greedy or convex algorithms proposed for the multichannel sparse recovery problem perform poorly under non-Gaussian heavy-tailed noise conditions or in the face of outliers (gross errors), i.e., they are not robust. In this talk, we consider different types of mixed robust norms on the data fidelity (residual matrix) term together with the conventional ℓ0-norm constraint on the signal matrix to promote row-sparsity. We devise algorithms based on normalized iterative hard thresholding (Blumensath and Davies, 2010), which is a simple, computationally efficient and scalable approach for solving the simultaneous sparse approximation problem. A performance assessment conducted on simulated data highlights the effectiveness of the proposed approaches in coping with different noise environments (i.i.d., row-i.i.d., etc.) and outliers. The usefulness of the methods is illustrated in an image denoising problem and in a source localization application with sensor arrays. Finally (if time permits), a (non-robust) Bayesian perspective on the multichannel recovery problem is discussed as well.

Biography — Esa Ollila received the M.Sc. degree in mathematics from the University of Oulu in 1998, the Ph.D. degree in statistics with honors from the University of Jyväskylä in 2002, and the D.Sc. (Tech) degree with honors in signal processing from Aalto University in 2010. From 2004 to 2007 he was a post-doctoral fellow of the Academy of Finland. He has also been a Senior Researcher at Aalto University and a Senior Lecturer at the University of Oulu. Since August 2010 he has been an Academy Research Fellow of the Academy of Finland at the Department of Signal Processing and Acoustics, Aalto University, Finland. He is also an adjunct professor (statistics) at the University of Oulu. During the fall term of 2001 he was a Visiting Researcher with the Department of Statistics, Pennsylvania State University, State College, PA, and he spent the academic year 2010-2011 as a Visiting Research Associate with the Department of Electrical Engineering, Princeton University, Princeton, NJ. His research interests focus on the theory and methods of statistical signal processing, blind source separation, complex-valued signal processing, array and radar signal processing, and robust and non-parametric statistical methods.


Transcript

  1. Robust multichannel sparse recovery. Esa Ollila, Department of Signal
    Processing and Acoustics, Aalto University, Finland. SUPELEC, Feb 4th, 2015
  2. 1 Introduction 2 Nonparametric sparse recovery 3 Simulation studies 4

    Source localization with Sensor Arrays
  3. Single Measurement Vector Model (SMV)

    $\underset{M \times 1}{\mathbf{y}} = \underset{M \times N}{\boldsymbol{\Phi}}\,\underset{N \times 1}{\mathbf{x}} + \underset{M \times 1}{\mathbf{e}}$

    $\boldsymbol{\Phi} = \begin{pmatrix} \boldsymbol{\phi}_1 & \cdots & \boldsymbol{\phi}_N \end{pmatrix} = \begin{pmatrix} \boldsymbol{\phi}_{(1)} & \cdots & \boldsymbol{\phi}_{(M)} \end{pmatrix}^{\mathsf{H}}$, the known M × N measurement (design, system) matrix
    $\mathbf{x} = (x_1, \ldots, x_N)^\top$, unobserved signal vector
    $\mathbf{e} = (e_1, \ldots, e_M)^\top$, random noise vector.

    Objective: recover the K-sparse or compressible signal x from y, knowing Φ and K.
    NOTE: typically M < N (ill-posed model) and K ≪ N.
  4. Single Measurement Vector Model (SMV)

    $\underset{M \times 1}{\mathbf{y}} = \underset{M \times N}{\boldsymbol{\Phi}}\,\underset{N \times 1}{\mathbf{x}} + \underset{M \times 1}{\mathbf{e}}$

    If we can determine Γ = supp(x), then the problem reduces to a conventional
    regression problem (more 'responses' than 'predictors').
    [Figure: the reduced system y = Φ_Γ x_Γ + e]
  5. Multiple Measurement Vectors Model (MMV)

    $\mathbf{y}_i = \boldsymbol{\Phi}\mathbf{x}_i + \mathbf{e}_i, \quad i = 1, \ldots, Q$

    In matrix form: $\underset{M \times Q}{\mathbf{Y}} = \underset{M \times N}{\boldsymbol{\Phi}}\,\underset{N \times Q}{\mathbf{X}} + \underset{M \times Q}{\mathbf{E}}$

    $\mathbf{Y} = (\mathbf{y}_1 \ \cdots \ \mathbf{y}_Q)$, observed measurement matrix
    $\mathbf{X} = (\mathbf{x}_1 \ \cdots \ \mathbf{x}_Q)$, unobserved signal matrix
    $\mathbf{E} = (\mathbf{e}_1 \ \cdots \ \mathbf{e}_Q)$, noise matrix

    Objective: recover the K-sparse (or compressible) signal matrix X knowing Y, Φ, K.
    An N × Q matrix X is called K-sparse if $\|\mathbf{X}\|_0 = |\mathrm{supp}(\mathbf{X})| \le K$, where
    $\Gamma = \mathrm{supp}(\mathbf{X}) = \bigcup_{i=1}^{Q} \mathrm{supp}(\mathbf{x}_i) = \{ j \in \{1, \ldots, N\} : x_{jk} \neq 0 \text{ for some } k \}$
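A minimal NumPy sketch of the MMV data model and of the row-support definition above; the sizes M, N, Q, K and the noise level are arbitrary illustrative choices, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, Q, K = 32, 128, 16, 4

# Complex Gaussian measurement matrix with unit-norm columns
Phi = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
Phi /= np.linalg.norm(Phi, axis=0)

# K-rowsparse signal matrix: the same K nonzero rows shared by all Q columns
Gamma = rng.choice(N, size=K, replace=False)
X = np.zeros((N, Q), dtype=complex)
X[Gamma] = rng.standard_normal((K, Q)) + 1j * rng.standard_normal((K, Q))

E = 0.1 * (rng.standard_normal((M, Q)) + 1j * rng.standard_normal((M, Q)))
Y = Phi @ X + E                       # the MMV model Y = Phi X + E

# Row support: indices j with x_{jk} != 0 for some k
supp = np.flatnonzero(np.any(X != 0, axis=1))
assert set(supp) == set(Gamma)
```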
  6. Multiple Measurement Vectors Model (MMV)

    Key idea: the signals are all predominantly supported on a common support set
    ⇒ joint estimation can lead both to computational advantages and increased
    reconstruction accuracy. See [Tropp, 2006, Chen and Huo, 2006, Eldar and
    Rauhut, 2010, Duarte and Eldar, 2011, Blanchard et al., 2014].

    Applications:
    EEG/MEG [Gorodnitsky et al., 1995]
    equalization of sparse communication channels [Cotter and Rao, 2002]
    blind source separation [Gribonval and Zibulevsky, 2010]
    source localization using sensor arrays [Malioutov et al., 2005]
    ...
  7. Matrix norms

    The mixed ℓ_{p,q} norm of X for p, q ∈ [1, ∞):
    $\|\mathbf{X}\|_{p,q} = \Big( \sum_i \Big( \sum_j |x_{ij}|^p \Big)^{q/p} \Big)^{1/q} = \Big( \sum_i \|\mathbf{x}_{(i)}\|_p^q \Big)^{1/q}.$

    If p = q, we obtain the matrix p-norm $\|\mathbf{X}\|_{p,p} = \|\mathbf{X}\|_p$. The ℓ_2 (Frobenius) norm $\|\cdot\|_2$ is denoted shortly as $\|\cdot\|$.

    The mixed ℓ_∞ norms of X ∈ C^{N×Q}:
    $\|\mathbf{X}\|_{p,\infty} = \max_{i \in \{1,\ldots,N\}} \|\mathbf{x}_{(i)}\|_p$
    $\|\mathbf{X}\|_{\infty,q} = \big( \sum_i (\max_{j \in \{1,\ldots,Q\}} |x_{ij}|)^q \big)^{1/q}$

    The row-ℓ_0 quasi-norm of X:
    $\|\mathbf{X}\|_0 = |\mathrm{supp}(\mathbf{X})|$ = number of nonzero rows of X
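The mixed norms above are computed row-wise; a small sketch (the helper name mixed_norm is ours, not from the talk):

```python
import numpy as np

def mixed_norm(X, p, q):
    """Mixed l_{p,q} norm: l_p norm of each row, then l_q norm across rows."""
    row_norms = np.linalg.norm(X, ord=p, axis=1)   # ||x_(i)||_p for each row i
    return np.linalg.norm(row_norms, ord=q)

X = np.array([[3.0, 4.0], [0.0, 0.0], [1.0, 0.0]])
print(mixed_norm(X, 2, 1))                         # ||(5, 0, 1)||_1 = 6.0
print(np.count_nonzero(np.any(X != 0, axis=1)))    # row-l0 quasi-norm = 2
```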
  8. Optimization problem

    $\min_{\mathbf{X}} \|\mathbf{Y} - \boldsymbol{\Phi}\mathbf{X}\|^2 \quad \text{subject to} \quad \|\mathbf{X}\|_0 \le K$

    is combinatorial (NP-hard). More feasible approaches:
    Optimization (convex relaxation)
    Greedy pursuit: Orthogonal Matching Pursuit (OMP), Normalized Iterative Hard
    Thresholding (NIHT), Normalized Hard Thresholding Pursuit (NHTP), Compressive
    Sampling MP (CoSaMP)
    Guaranteed to perform very well under suitable conditions (restricted isometry
    property on Φ, non-impulsive noise e).
    NIHT is simple and scalable, convenient for problems where the support is of
    primary interest.
  10. Convex relaxation

    IDEA: replace $\|\mathbf{X}\|_0$ by an ℓ_{p,1} norm.
    [Tropp, 2006] uses p = ∞:
    $\min_{\mathbf{X}} \|\mathbf{Y} - \boldsymbol{\Phi}\mathbf{X}\|^2 + \lambda \|\mathbf{X}\|_{\infty,1}$, where $\|\mathbf{X}\|_{\infty,1} = \sum_i \|\mathbf{x}_{(i)}\|_\infty$.
    [Malioutov et al., 2005] uses p = 2:
    $\min_{\mathbf{X}} \|\mathbf{Y} - \boldsymbol{\Phi}\mathbf{X}\|^2 + \lambda \|\mathbf{X}\|_{2,1}$, where $\|\mathbf{X}\|_{2,1} = \sum_i \|\mathbf{x}_{(i)}\|$.
    Good recovery depends on the choice of the regularization parameter λ > 0;
    not robust due to the data fidelity term used (LS loss);
    not as scalable as greedy methods.
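The talk does not give an algorithm for these penalized problems, but for context, here is a standard proximal-gradient (ISTA) sketch for the ℓ_{2,1}-penalized formulation of [Malioutov et al., 2005]; the proximal operator of the ℓ_{2,1} norm is row-wise soft thresholding. This is a generic textbook method, not the method of the talk.

```python
import numpy as np

def row_soft_threshold(X, tau):
    """Prox of tau * ||.||_{2,1}: shrink each row's l2 norm by tau."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return scale * X

def ista_l21(Y, Phi, lam, n_iter=500):
    """min_X ||Y - Phi X||^2 + lam * ||X||_{2,1} by proximal gradient."""
    L = np.linalg.norm(Phi, 2) ** 2            # squared spectral norm of Phi
    X = np.zeros((Phi.shape[1], Y.shape[1]), dtype=Y.dtype)
    for _ in range(n_iter):
        G = Phi.conj().T @ (Y - Phi @ X)       # negative gradient of LS term
        X = row_soft_threshold(X + G / L, lam / (2 * L))
    return X
```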
  11. 1 Introduction 2 Nonparametric sparse recovery 3 Simulation studies 4

    Source localization with Sensor Arrays
  12. Nonparametric approach based on mixed norms

    $\min_{\mathbf{X}} \underbrace{\|\mathbf{Y} - \boldsymbol{\Phi}\mathbf{X}\|_{p,q}^q}_{\text{data fidelity}} \quad \text{subject to} \quad \underbrace{\|\mathbf{X}\|_0 \le K}_{\text{sparsity constraint}} \qquad (P_{p,q})$

    ⇒ mixed ℓ_1 norms provide robustness, i.e., they give larger weight to small
    residuals and less weight to large residuals. We consider the cases
    (p, q) = (1, 1): $\|\mathbf{Y} - \boldsymbol{\Phi}\mathbf{X}\|_1 = \sum_i \sum_j |y_{ij} - \boldsymbol{\phi}_{(i)}^{\mathsf{H}} \mathbf{x}_j|$, a robust norm for errors i.i.d. in space and time;
    (p, q) = (2, 1): $\|\mathbf{Y} - \boldsymbol{\Phi}\mathbf{X}\|_{2,1} = \sum_i \|\mathbf{y}_{(i)} - \mathbf{X}^{\mathsf{H}} \boldsymbol{\phi}_{(i)}\|$, a robust norm for errors i.i.d. in "space" only,
    and devise a greedy simultaneous NIHT (SNIHT) algorithm.
  13. Normalized iterative hard thresholding (NIHT)

    Conventional ℓ_{2,2} approach:
    $\mathbf{X}^{n+1} = \underbrace{H_K}_{\text{hard thresholding}} \Big( \mathbf{X}^n + \underbrace{\mu^{n+1}}_{\text{stepsize}} \underbrace{\boldsymbol{\Phi}^{\mathsf{H}}(\mathbf{Y} - \boldsymbol{\Phi}\mathbf{X}^n)}_{\text{neg. gradient } \nabla_{\mathbf{X}^*} \|\cdot\|^2} \Big)$
  14. Normalized iterative hard thresholding (NIHT)

    In our mixed norm ℓ_{p,q} approach:
    $\mathbf{X}^{n+1} = \underbrace{H_K}_{\text{hard thresholding}} \Big( \mathbf{X}^n + \underbrace{\mu^{n+1}}_{\text{stepsize}} \underbrace{\boldsymbol{\Phi}^{\mathsf{H}} \psi_{p,q}(\mathbf{Y} - \boldsymbol{\Phi}\mathbf{X}^n)}_{\text{neg. gradient } \nabla_{\mathbf{X}^*} \|\cdot\|_{p,q}^q} \Big)$

    $\psi_{2,2}(\mathbf{X}) = \mathbf{X}$  ⇒ the conventional ℓ_{2,2} approach
    $\psi_{1,1}(\mathbf{X}) = [\mathrm{sign}(x_{ij})]$  (elementwise operation of sign(·))
    $\psi_{2,1}(\mathbf{X}) = [\mathrm{sign}(\mathbf{x}_{(i)})]$  (rowwise operation of sign(·))

    Spatial sign: $\mathrm{sign}(\mathbf{x}) = \begin{cases} \mathbf{x}/\|\mathbf{x}\|, & \mathbf{x} \neq \mathbf{0} \\ \mathbf{0}, & \mathbf{x} = \mathbf{0} \end{cases}$
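A NumPy sketch of the three ψ_{p,q} score functions above; the small epsilon guarding against division by zero is an implementation detail, not from the slides.

```python
import numpy as np

def psi(R, p, q):
    """psi_{p,q} applied to the residual matrix R (see slide 14)."""
    if (p, q) == (2, 2):
        return R                                      # LS loss: identity score
    if (p, q) == (1, 1):
        absR = np.abs(R)                              # elementwise complex sign
        return np.where(absR > 0, R / np.maximum(absR, 1e-32), 0)
    if (p, q) == (2, 1):
        norms = np.linalg.norm(R, axis=1, keepdims=True)  # rowwise spatial sign
        return np.where(norms > 0, R / np.maximum(norms, 1e-32), 0)
    raise ValueError("unsupported (p, q)")
```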
  17. SNIHT(p, q) algorithm

    input     : Y, Φ, sparsity K, mixed norm indices (p, q)
    output    : (X^(n+1), Γ^(n+1)) as estimates of X and Γ = supp(X)
    initialize: X^0 = 0, µ^0 = 0, n = 0, Γ^0 = ∅
    1  Γ^0 = supp(H_K(Φ^H ψ_{p,q}(Y)))
    while halting criterion false do
    2    R^n_ψ = ψ_{p,q}(Y − Φ X^n)
    3    G^n = Φ^H R^n_ψ
    4    µ^(n+1) = CompStepsize(Φ, G^n, Γ^n, µ^n, p, q)
    5    X^(n+1) = H_K(X^n + µ^(n+1) G^n)
    6    Γ^(n+1) = supp(X^(n+1))
    7    n = n + 1
    end
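A compact NumPy sketch of the SNIHT(p, q) loop above. It reuses psi() from the previous snippet; for brevity, CompStepsize is replaced by the closed-form ℓ_{2,2} step (slide 18) used as an approximation for every (p, q), whereas the talk's robust step-size rule follows on the next slides. The fixed iteration count stands in for the halting criterion.

```python
import numpy as np

def hard_threshold_rows(X, K):
    """H_K: keep the K rows of X with largest l2 norm, zero the rest."""
    out = np.zeros_like(X)
    keep = np.argsort(np.linalg.norm(X, axis=1))[-K:]
    out[keep] = X[keep]
    return out

def sniht(Y, Phi, K, p=2, q=2, n_iter=100):
    """Sketch of SNIHT(p, q); the support is tracked as a boolean row mask."""
    N, Q = Phi.shape[1], Y.shape[1]
    X = np.zeros((N, Q), dtype=complex)
    G0 = hard_threshold_rows(Phi.conj().T @ psi(Y, p, q), K)
    Gamma = np.any(G0 != 0, axis=1)                  # step 1: initial support
    for _ in range(n_iter):
        R = psi(Y - Phi @ X, p, q)                   # step 2: psi of residual
        G = Phi.conj().T @ R                         # step 3: update direction
        GS = np.where(Gamma[:, None], G, 0)          # G restricted to Gamma
        mu = (np.linalg.norm(GS) ** 2 /
              max(np.linalg.norm(Phi @ GS) ** 2, 1e-32))  # step 4 (2,2 form)
        X = hard_threshold_rows(X + mu * G, K)       # step 5: hard threshold
        Gamma = np.any(X != 0, axis=1)               # step 6: new support
    return X, np.flatnonzero(Gamma)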
  18. Computation of the step size µ^(n+1)

    For convergence, it is important that the stepsize is adaptively controlled
    at each iteration. Given that the found support Γ^n is correct, one can choose
    µ^(n+1) as the minimizer of the objective function along the update direction
    $\mathbf{X}^n + \mu \mathbf{G}^n|_{\Gamma^n}$:

    $L(\mu) = \|\mathbf{Y} - \boldsymbol{\Phi}(\mathbf{X}^n + \mu \mathbf{G}^n|_{\Gamma^n})\|_{p,q}^q = \|\mathbf{R}^n - \mu \mathbf{B}\|_{p,q}^q$

    where $\mathbf{R}^n = \mathbf{Y} - \boldsymbol{\Phi}\mathbf{X}^n$ and $\mathbf{B} = \boldsymbol{\Phi}\mathbf{G}^n_{(\Gamma^n)}$.

    For p = q: a simple linear regression problem based on the ℓ_p norm, with
    response $\mathbf{r} = \mathrm{vec}(\mathbf{R}^n)$ and predictor $\mathbf{b} = \mathrm{vec}(\mathbf{B})$.
    ⇒ $\mu^{n+1} = \|\mathbf{G}^n_{(\Gamma^n)}\|^2 / \|\boldsymbol{\Phi}\mathbf{G}^n_{(\Gamma^n)}\|^2$ for p = q = 2.
    For p = q = 1 and (p, q) = (2, 1), the minimizer of L(µ) cannot be found in
    closed form.
  19. Computation of the step size µ^(n+1)

    When p = q = 1, the solution µ satisfies a fixed-point (FP) equation µ = H(µ), where

    $H(\mu) = \Big( \sum_{i,j} |\tilde r_{ij}|^{-1} |b_{ij}|^2 \Big)^{-1} \sum_{i,j} |\tilde r_{ij}|^{-1} \mathrm{Re}(b_{ij}^* r_{ij})$,

    $\tilde{\mathbf{R}} = \mathbf{R}^n - \mu \mathbf{B}$ (which depends on the unknown µ) and $\mathbf{B} = \boldsymbol{\Phi}\mathbf{G}^n_{(\Gamma^n)}$.

    We select µ^(n+1) as the 1-step FP iterate µ^(n+1) = H(µ^n), where µ^n is the
    stepsize at the previous, nth, iteration. Often this approximation was within
    ~10⁻³ of the minimizer of L(µ). The same approach is used in the case
    (p, q) = (2, 1).
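A sketch of the one-step FP iterate for (p, q) = (1, 1): H is exactly one weighted least-squares step with weights |r̃_ij|⁻¹ evaluated at the previous step size. The function name and the epsilon guard are ours.

```python
import numpy as np

def fp_stepsize_l1(R, B, mu_prev, eps=1e-12):
    """One fixed-point iterate mu = H(mu_prev) of the slide's equation."""
    Rtilde = R - mu_prev * B                       # residual at previous mu
    w = 1.0 / np.maximum(np.abs(Rtilde), eps)      # weights |rtilde_ij|^{-1}
    num = np.sum(w * np.real(np.conj(B) * R))      # sum w_ij Re(b*_ij r_ij)
    den = np.sum(w * np.abs(B) ** 2)               # sum w_ij |b_ij|^2
    return num / max(den, eps)
```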
  20. 1 Introduction 2 Nonparametric sparse recovery 3 Simulation studies 4

    Source localization with Sensor Arrays
  21. Simulation study

    $\underset{M \times Q}{\mathbf{Y}} = \underset{M \times N}{\boldsymbol{\Phi}}\,\underset{N \times Q}{\mathbf{X}} + \underset{M \times Q}{\mathbf{E}}$

    Set-up:
    Measurement matrix Φ: φ_{ij} ~ CN(0, 1), with unit-norm columns.
    Γ = supp(X): K elements drawn randomly from {1, ..., N}.
    Signal matrix X: |x_{ij}| = 10 and Arg(x_{ij}) ~ Unif(0, 2π) for all i ∈ Γ.
    For ℓ = 1, ..., L Monte Carlo trials (L = 1000), compute X̂^[ℓ] and Γ̂^[ℓ].

    Performance measures:
    mean squared error  $\mathrm{MSE}(\hat{\mathbf{X}}) = \frac{1}{LQ} \sum_{\ell=1}^{L} \|\hat{\mathbf{X}}^{[\ell]} - \mathbf{X}^{[\ell]}\|^2$
    probability of exact recovery  $\mathrm{PER} = \frac{1}{L} \sum_{\ell=1}^{L} I\big(\hat{\Gamma}^{[\ell]} = \Gamma^{[\ell]}\big)$
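The two performance measures are straightforward to accumulate over Monte Carlo trials; a sketch (function name ours), taking lists of the per-trial estimates and ground truths:

```python
import numpy as np

def mse_per(X_hats, Xs, Gamma_hats, Gammas):
    """MSE = (1/(L*Q)) sum_l ||Xhat_l - X_l||^2; PER = exact-support rate."""
    L = len(Xs)
    Q = Xs[0].shape[1]
    mse = sum(np.linalg.norm(Xh - X) ** 2
              for Xh, X in zip(X_hats, Xs)) / (L * Q)
    per = sum(set(Gh) == set(G)
              for Gh, G in zip(Gamma_hats, Gammas)) / L
    return mse, per
```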
  22. IID Gaussian noise, e_{ij} ~ CN(0, σ²)

    [Figure: MSE (dB) vs SNR (dB) for Q = 16; curves: SNIHT(2,2), SNIHT(2,1), SNIHT(1,1), HUB-SNIHT]
    [Figure: PER vs Q (number of vectors) at SNR = 5 dB; same four methods]

    SNIHT(2,2) is best, but SNIHT(2,1) and HUB-SNIHT (threshold c = 1.517) lose
    only 0.09 dB in MSE (Q = 16).
  23. IID t-distributed noise, e_{ij} ~ Ct_ν(0, σ²), σ² = Med_F(|e_i|²)

    [Figure: MSE (dB) vs ν (degrees of freedom) at SNR(σ) = 30 dB; curves: SNIHT(2,2), SNIHT(2,1), SNIHT(1,1), HUB-SNIHT]
    [Figure: PER vs Q (number of vectors) at SNR(σ) = 10 dB, ν = 3; same four methods]

    Best performer: SNIHT(1,1).
  24. Row IID compound Gaussian noise with inverse Gaussian texture, e_{(i)} ~ CIG_λ(0, σ² I_Q)

    [Figure: MSE (dB) vs SNR (dB); curves: SNIHT(2,2), SNIHT(2,1), SNIHT(1,1), HUB-SNIHT]

    Best performer: SNIHT(2,1).
  25. 1 Introduction 2 Nonparametric sparse recovery 3 Simulation studies 4

    Source localization with Sensor Arrays
  26. Sensor array and narrowband, farfield source model

    M sensors, K sources (M > K). Model for the sensor measurements at time instant t:

    $\mathbf{y}(t) = \mathbf{A}(\boldsymbol{\theta})\mathbf{x}(t) + \mathbf{e}(t)$

    - steering matrix $\mathbf{A}(\boldsymbol{\theta}) = \big(\mathbf{a}(\theta_1) \ \cdots \ \mathbf{a}(\theta_K)\big)$
    - unknown (distinct) directions of arrival (DOAs) θ_1, ..., θ_K
    - known array manifold a(θ) (e.g., ULA)
    - unknown signal waveforms $\mathbf{x}(t) = (x_1(t), \ldots, x_K(t))^\top$

    Objective: estimate the DOAs of the sources given the snapshots y(t_1), ..., y(t_Q)
    and the number of sources K impinging on the sensor array.
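For concreteness, a sketch of the ULA array manifold with half-wavelength spacing. The phase sign convention varies across texts; this is one common choice, not necessarily the one used in the talk.

```python
import numpy as np

def ula_steering(theta_deg, M):
    """Steering vector of an M-sensor ULA with d = lambda/2, DOA in degrees."""
    theta = np.deg2rad(theta_deg)
    m = np.arange(M)
    return np.exp(-1j * np.pi * m * np.sin(theta))

# Steering matrix A(theta) for K = 2 sources at 0 and 8 degrees
A = np.column_stack([ula_steering(th, 20) for th in (0.0, 8.0)])
print(A.shape)   # (20, 2)
```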
  27. Multichannel sparse recovery approach to source localization

    Construct an M × N overcomplete steering matrix $\mathbf{A}(\tilde{\boldsymbol{\theta}})$ for a grid of all source
    locations $\tilde\theta_1, \ldots, \tilde\theta_N$ of interest. Suppose that $\tilde{\boldsymbol{\theta}}$ contains the true DOAs to
    some accuracy. The array model in matrix form then rewrites as

    $\mathbf{Y} = \big(\mathbf{y}(t_1) \ \cdots \ \mathbf{y}(t_Q)\big) = \mathbf{A}(\tilde{\boldsymbol{\theta}})\mathbf{X} + \mathbf{E}$,

    where X ∈ C^{N×Q} is K-sparse, with the K non-zero rows containing the source
    waveforms at time instants t_1, ..., t_Q.
    ⇒ a K-sparse MMV model: finding the DOAs is equivalent to identifying Γ = supp(X).
    In the multichannel sparse recovery formulation $\mathbf{A}(\tilde{\boldsymbol{\theta}})$ is completely known,
    and we can use the SNIHT methods to identify the support.
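A sketch of this pipeline, reusing ula_steering() and sniht() from the earlier snippets. The scene (two sources at 0° and 8°, M = 20, a 2° grid) follows the simulation set-up on the next slide; the noise scaling here is only roughly 0 dB SNR and is our choice.

```python
import numpy as np

rng = np.random.default_rng(1)
M, K, Q = 20, 2, 40
theta_grid = np.arange(-90, 90, 2)                 # 2-degree candidate grid
A_grid = np.column_stack([ula_steering(th, M) for th in theta_grid])  # M x N

# Simulated snapshots for two unit-power sources at 0 and 8 degrees
S = (rng.standard_normal((K, Q)) + 1j * rng.standard_normal((K, Q))) / np.sqrt(2)
A_src = np.column_stack([ula_steering(th, M) for th in (0.0, 8.0)])
E = (rng.standard_normal((M, Q)) + 1j * rng.standard_normal((M, Q))) / np.sqrt(2)
Y = A_src @ S + E

# Recover the row support and read the DOA estimates off the grid
X_hat, Gamma_hat = sniht(Y, A_grid, K=K, p=1, q=1)
print(theta_grid[Gamma_hat])                       # ideally [0, 8]
```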
  31. Simulation set-up

    ULA of M = 20 sensors with half-wavelength interelement spacing.
    K = 2 independent Gaussian sources at θ_1 = 0° and θ_2 = 8°.
    Temporally and spatially white Gaussian error terms. Low SNR = 0 dB.
    Uniform grid θ̃ on [−90°, 90°] with 2° spacing.
    We compute PER rates and the relative frequencies of the found DOA estimates
    on the grid over 1000 Monte Carlo runs.
  32. Number of snapshots Q = 4; DOAs at θ_1 = 0° and θ_2 = 8°

    [Figure: relative frequency of DOA estimates vs θ (degrees, 2° sampling grid) for SNIHT(2,2), SNIHT(1,1), SNIHT(2,1)]

    PER rates:
    1. SNIHT(1,1) = 81%
    2. SNIHT(2,1) = 73.1%
    3. HUB-SNIHT = 42.6%
    4. SNIHT = 37.6%
    Best performer: SNIHT(1,1).
  33. Number of snapshots Q = 40; DOAs at θ_1 = 0° and θ_2 = 8°

    [Figure: relative frequency of DOA estimates vs θ (degrees, 2° sampling grid) for SNIHT(2,2), SNIHT(1,1), SNIHT(2,1)]

    PER rates:
    1. SNIHT(1,1) = 100%
    2. SNIHT(2,1) = 73.1%
    3. HUB-SNIHT = 42.6%
    4. SNIHT = 37.6%
    Best performer: SNIHT(1,1).
  37. Conclusions

    Robust and nonparametric approaches for simultaneous sparse recovery.
    Different methods, SNIHT(1,1), SNIHT(2,1), HUB-SNIHT, can be selected for the
    problem at hand (i.i.d./correlated noise, high breakdown, etc.).
    Fast and scalable greedy SNIHT algorithm: the increase in robustness does not
    imply an increase in computation time!
    Applications in source localization and image denoising, etc.
  38. This talk is based on:

    E. Ollila (2015a). Nonparametric Simultaneous Sparse Recovery: an Application
    to Source Localization. PDF (arXiv): http://arxiv.org/pdf/1502.02441v1.
    EUSIPCO'15, submitted.
    E. Ollila (2015b). Robust Simultaneous Sparse Approximation. In: A Festschrift
    in Honor of Hannu Oja, Springer, Sept. 2015.
  39. References

    Blanchard, J. D., Cermak, M., Hanle, D., and Jin, Y. (2014). Greedy algorithms for joint sparse recovery. IEEE Trans. Signal Processing, 62(7).
    Chen, J. and Huo, X. (2006). Theoretical results on sparse representations of multiple-measurement vectors. IEEE Trans. Signal Processing, 54(12):4634–4643.
    Cotter, S. and Rao, B. (2002). Sparse channel estimation via matching pursuit with application to equalization. IEEE Trans. Comm., 50(3):374–377.
    Duarte, M. F. and Eldar, Y. C. (2011). Structured compressed sensing: From theory to applications. IEEE Trans. Signal Processing, 59(9):4053–4085.
    Eldar, Y. C. and Rauhut, H. (2010). Average case analysis of multichannel sparse recovery using convex relaxation. IEEE Trans. Inform. Theory, 56(1):505–519.
    Gorodnitsky, I., George, J., and Rao, B. (1995). Neuromagnetic source imaging with FOCUSS: a recursive weighted minimum norm algorithm. J. Electroencephalogr. Clin. Neurophysiol., 95(4):231–251.
  40. Gribonval, R. and Zibulevsky, M. (2010). Sparse component analysis. In: Handbook of Blind Source Separation, pages 367–420. Academic Press, Oxford, UK.
    Malioutov, D., Çetin, M., and Willsky, A. S. (2005). A sparse signal reconstruction perspective for source localization with sensor arrays. IEEE Trans. Signal Processing, 53(8):3010–3022.
    Tropp, J. A. (2006). Algorithms for simultaneous sparse approximation. Part II: Convex relaxation. Signal Processing, 86:589–602.