S³ Seminar
March 06, 2015
29

(University of Campinas, BR and ENS Cachan, FR)

Title — L0 optimization for DOA and channel sparse estimation

Abstract — Adilson Chinatto received a degree in Electrical Engineering in 1997 and Masters in 2011, both from the University of Campinas (Unicamp), Brazil. He worked as hardware, software and firmware development engineer for optical transmission equipment in the companies AsGa and CPqD in Brazil. He is a co-founder of Espectro Ltd., a Brazilian design house for hardware and software, focused in signal processing. Nowadays he is coordinator of a High Performance GPS Receiver Project at Espectro Ltd. funded by the Brazilian National Counsel of Technological and Scientific Development (CNPq). He has experience in electrical engineering with emphasis on telecommunication systems, digital signal processing and smart antennas, working mainly with development and implementation of programmable logic devices (FPGA). He is currently finishing his Ph.D. at Unicamp, working with sparse and compressive sensing signal processing.

March 06, 2015

## Transcript

1. ### l 0 -Optimization for DoA and Channel Sparse Estimation Adilson

Chinatto SUPELEC Mars, 6th 2015
2. ### Summary → l p -norm and other definitions → Problem

formulation → l 2 -l 1 solvers → examples on channel estimation → l 2 -l 0 solvers → examples on channel estimation → examples on DoA estimation → Conclusions and future work
3. ### Some Definitions Problem to be solved: l p =∥x∥ p

:=(∑ i ∣x i ∣p )1/ p , p>0 y=A x+n l 0 =∥x∥ 0 :=# of non-null elements in x x=[0 1 0 -2 0 5 0 0 -3]T ∥x∥ 0 =4 or Y=A X+N vector problem matrix problem p-norm of x is defined as: Special case: p = 0 E. g.:
4. ### l 2 Solution of Overdetermined Problem Assumptions: 1. y and

A are known 2. A[MxN] is overdetermined (M ≥ N) In such case, one possible solution is: min ̂ x {f (̂ x)=∥y A ̂ x∥ 2 2} y=A x+n ̂ x=A† y least squares solution closed form
5. ### l 2 Solutions of Underdetermined Problem Assumptions: 1. y and

A are known 2. A[MxN] is underdetermined (M < N) In such case, least squares solutions fails: min ̂ x {f (̂ x)=∥y A ̂ x∥ 2 2} y=A x+n less equations than unknows 3. Vector x is SPARSE How to use the sparsity of x in order to achieve a good solution? there are more null (or near-null) elements than non-null elements is x
6. ### l 2 –l 0 Solutions of Underdetermined Problem The best

solution considering x sparse is given by: 2. If maximum noise n is known: 3. If nor sparsity nor noise are known: 1. If maximum sparsity k is known: These problems are known to be NP-Hard! Assumptions: min ̂ x f ( ̂ x)=∥y A ̂ x∥ 2 2 subject to min g( ̂ x)=min∥̂ x∥ 0 min ̂ x ∥y A ̂ x∥ 2 2 s. t. ∥̂ x∥ 0 ⩽k min ̂ x ∥̂ x∥ 0 s. t. ∥y A ̂ x∥ 2 2⩽n min ̂ x {1 2 ∥y A ̂ x∥ 2 2+λ∥̂ x∥ 0 }
7. ### l 2 –l 1 Solutions of Underdetermined Problem Relaxation: in

some cases, l 0 -norm can be relaxed l 0 → → → → l 1 1. Noiseless case: Mixture matrix A must obey some restrictions: RIP Assumptions: min ̂ x ∥̂ x∥ 1 s. t. A ̂ x= y Compressive Sensing Framework Allows solution by Convex Optimization techniques 2. Noisy case: min ̂ x {1 2 ∥y A ̂ x∥ 2 2+λ∥̂ x∥ 1 } Restricted Isometry Property
8. ### l 2 –l 1 Algorithms 1. Basis Pursuit (BP) →

Noiseless case: convex function min ̂ x ∥̂ x∥ 1 s. t. A ̂ x= y g( ̂ x)=A ̂ x y=0 f ( ̂ x)=∥̂ x∥ 1 affine function → → → → Algorithm based on Linear Programming → → → → Drawbacks: → good recovery only for high spase signals (K « N) → too restritive conditions on A for good recovery → computationally intense
9. ### l 2 –l 1 Algorithms 2. Basis Pursuit De-Noising (BPDN)

→ Noisy case: → → → → Algorithm based on Linear Programming → → → → Drawbacks: → good recovery only for high spase signals (K « N) → too restritive conditions on A for good recovery → computationally intense convex function convex function min ̂ x {1 2 ∥y A ̂ x∥ 2 2+λ∥̂ x∥ 1 }
10. ### l 2 –l 1 Iterative Algorithms 3. Orthogonal Matching Pursuit

→ greedy algorithm min ̂ x {1 2 ∥y A ̂ x∥ 2 2+λ∥̂ x∥ 1 } init :r(0) ← y ; S=∅ ; k=0 until criterium loop: find i(k)=arg max∣〈r(k) , A〉∣→a i(k ) S ←S∪a i(k) ̂ x(k+1) ←S† y r(k+1) ← y A ̂ x(k+1) k ← k+1 residual iteration column of A
11. ### l 2 –l 1 Iterative Algorithms min ̂ x {1

2 ∥y A ̂ x∥ 2 2+λ∥̂ x∥ 1 } gradient iteration penalization init : ̂ x(0)=0; k=0; λ(0) ; β ∈(0,1) until criterium loop: ∇ ̂ x (k) ← AH A ̂ x(k) AH y b(k) ← ̂ x(k) λ(k) ∇ ̂ x (k) ̂ x(k+1) ← prox λ(k ) {b(k)} λ(k+1) ← β λ(k ) k ← k+1 −λ +λ 4. Proximal-l 1 → Iterative Soft-Thresolding
12. ### l 2 –l 1 Minimization: Some Examples → Channel Estimation

→ Pilot assisted signal (512 symbols) Scenario: – 3GPP ETU channel model – BW: 50 MHz – f S : 33.3 MHz – SNR: 10 dB delay [ns] Amplitude [dB] 0 –1 50 –1 120 –1 200 0 230 0 500 0 1600 –3 2300 –5 5000 –7 y=A h+n 600 taps (each tap: 10 ns) 200 samples (each sample: 30 ns) A : [200x600] h : [600x1] y : [200x1]
13. ### l 2 –l 1 Minimization: Some Examples 50 100 150

200 250 300 350 400 450 500 550 600 -1 0 1 Original Channel Amplitude 50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 Least Squares Solution Amplitude 1. Least-Squares Solution → pure l 2 minimization Scenario: – 3GPP ETU channel model – BW: 50 MHz – f S : 33.3 MHz – SNR: 10 dB delay [ns] Amplitude [dB] 0 –1 50 –1 120 –1 200 0 230 0 500 0 1600 –3 2300 –5 5000 –7
14. ### 50 100 150 200 250 300 350 400 450 500

550 600 -1 0 1 Amplitude Original Channel 0 100 200 300 400 500 600 -1 0 1 Amplitude BPDN l 2 –l 1 Minimization: Some Examples 2. BPDN Solution → l 2 –l 1 minimization via linear programming Scenario: – 3GPP ETU channel model – BW: 50 MHz – f S : 33.3 MHz – SNR: 10 dB delay [ns] Amplitude [dB] 0 –1 50 –1 120 –1 200 0 230 0 500 0 1600 –3 2300 –5 5000 –7
15. ### 50 100 150 200 250 300 350 400 450 500

550 600 -1 0 1 Original Channel Amplitude 50 100 150 200 250 300 350 400 450 500 550 600 -0.2 -0.1 0 0.1 0.2 Amplitude Proximal L-1 Solution l 2 –l 1 Minimization: Some Examples 3. Proximal-l 1 Solution → l 2 –l 1 minimization via iterative algorithm Scenario: – 3GPP ETU channel model – BW: 50 MHz – f S : 33.3 MHz – SNR: 10 dB delay [ns] Amplitude [dB] 0 –1 50 –1 120 –1 200 0 230 0 500 0 1600 –3 2300 –5 5000 –7
16. ### 50 100 150 200 250 300 350 400 450 500

550 600 -1 0 1 Original Channel Amplitude 50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 Amplitude OMP Solution l 2 –l 1 Minimization: Some Examples 4. OMP Solution → l 2 –l 1 minimization via greedy algorithm Scenario: – 3GPP ETU channel model – BW: 50 MHz – f S : 33.3 MHz – SNR: 10 dB delay [ns] Amplitude [dB] 0 –1 50 –1 120 –1 200 0 230 0 500 0 1600 –3 2300 –5 5000 –7
17. ### l 2 –l 1 Minimization: Some Examples → Some (Partial)

Conclusions: → l 2 solution fails as the problem is underdetermined → l 2 –l 1 relaxation works with good accuracy in this case → But if I consider the 3GPP recomendation BW of 20 MHz? Scenario: – 3GPP ETU channel model – BW: 50 MHz – f S : 33.3 MHz – SNR: 10 dB delay [ns] Amplitude [dB] 0 –1 50 –1 120 –1 200 0 230 0 500 0 1600 –3 2300 –5 5000 –7 Error=∥h ̂ h∥ 2 2 BW 50 MHz Least Squares 114.94 BPDN 0.4918 Proximal-l 1 4.6436 OMP 0.0609
18. ### l 2 –l 1 Minimization: Some Examples 5. BPDN Solution

→ l 2 –l 1 minimization via linear programming 50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 Amplitude Original Channel 0 100 200 300 400 500 600 -1 -0.5 0 0.5 1 BPDN Solution Amplitude -1.5 -1 -0.5 0 0.5 1 1.5 -1.5 -1 -0.5 0 0.5 1 1.5 Original Channel BPDN Solution – BW: 20 MHz
19. ### l 2 –l 1 Minimization: Some Examples – BW: 20

MHz 6. Proximal-l 1 Solution → l 2 –l 1 minimization via iterative algorithm 50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 Amplitude Original Channel 0 100 200 300 400 500 600 -0.4 -0.2 0 0.2 0.4 Amplitude Proximal L 1 Solution -1.5 -1 -0.5 0 0.5 1 1.5 -1.5 -1 -0.5 0 0.5 1 1.5 Original Channel Proximal L 1 Solution
20. ### l 2 –l 1 Minimization: Some Examples 7. OMP Solution

→ l 2 –l 1 minimization via greedy algorithm – BW: 20 MHz 50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 Amplitude Original Channel 50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 Amplitude OMP Solution -1.5 -1 -0.5 0 0.5 1 1.5 -1.5 -1 -0.5 0 0.5 1 1.5 Original Channel OMP Solution
21. ### l 2 –l 1 Minimization: Some Examples → Some Conclusions:

→ l 2 –l 1 relaxation fails in this case → RIP is not obeyed in this case → Matrix A has columns highly correlated Scenario: – 3GPP ETU channel model – BW: 20 MHz – f S : 33.3 MHz – SNR: 10 dB delay [ns] Amplitude [dB] 0 –1 50 –1 120 –1 200 0 230 0 500 0 1600 –3 2300 –5 5000 –7 BW 50 MHz 20 MHz BPDN 0.4918 2.9050 Proximal-l 1 4.6436 5.9466 OMP 0.0609 2.7532 Error=∥h ̂ h∥ 2 2
22. ### l 2 –l 0 Solutions: Dealing With l 0 -norm

min ̂ x {1 2 ∥y A ̂ x∥ 2 2+λ∥̂ x∥ 0 } gradient H √(2λ) (u)= {0 if ∣u∣<√(2λ) u if ∣u∣⩾√(2λ) } init : ̂ x(0)=0; k=0; λ until criterium loop: ∇ ̂ x (k) ← AH A ̂ x(k) AH y b(k) ← ̂ x(k) ∇ ̂ x (k) ̂ x(k+1) ← H √(2λ) {b(k)} k ← k+1 → Iterative Hard Thresholding (IHT)(1): variant 1 → Idea: try to solve applying a hard-threshold in order to eliminate noise and bad answers √(2λ) +√(2λ) (1)Blumensath & Davies (2004)
23. ### l 2 –l 0 Solutions: Dealing With l 0 -norm

min ̂ x {1 2 ∥y A ̂ x∥ 2 2 s. t. ∥̂ x∥ 0 ⩽K } gradient H K (u)→ {keeps the K greatest values of u sets to 0 all other terms } init : ̂ x(0)=0; k=0 until criterium loop: ∇ ̂ x (k) ← AH A ̂ x(k) AH y b(k) ← ̂ x(k) ∇ ̂ x (k) ̂ x(k+1) ← H K {b(k)} k ← k+1 → Iterative Hard Thresholding (IHT)(1): variant 2 → Idea: solve (1)Blumensath & Davies (2004)
24. ### 50 100 150 200 250 300 350 400 450 500

550 600 -1 0 1 Amplitude Original Channel 50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 l 2 –l 0 Solutions: Dealing With l 0 -norm → IHT Solution → l 2 –l 0 minimization via iterative algorithm – BW: 20 MHz -1.5 -1 -0.5 0 0.5 1 1.5 -1.5 -1 -0.5 0 0.5 1 1.5 Original Channel IHT Solution
25. ### l 2 –l 0 Solutions: Dealing With l 0 -norm

→ IHT provides very good sparse channel estimation → IHT is simple and has guaranteed convergence for BUT: → IHT convergence time can be extremelly long → Hard-thresholding discontinuity can result in instability problems Scenario: – 3GPP ETU channel model – BW: 20 MHz – f S : 33.3 MHz – SNR: 10 dB BW 20 MHz BPDN 2.9050 Proximal-l 1 5.9466 OMP 2.7532 IHT 0.0059 Error=∥h ̂ h∥ 2 2 ∥A∥ 2 <1
26. ### l 2 –l 0 Solutions: Improving IHT Performance ̂ x(k+1)=H

√(2λ) (̂ x(k) ∇ ̂ x (k)) H √(2λ)γ∥a i ∥ 2 (ui )= {sign(u i )min (∣u i ∣, (∣u i ∣ √(2λ)γ∥a i ∥ 2 ) + (1 ∥a i ∥ 2 2 γ) ) if ∥a i ∥ 2 2 γ<1 H√(2λ γ) (ui ) if ∥ai ∥ 2 2 γ⩾1 } → Continuous Exact l 0 (CEL0(2)): 1. Original IHT: 2. Inclusion of an additional step size γ γ γ γ: 4. Thresholding modification: ̂ x(k+1)=H √(2λ)γ (̂ x(k) γ ∇ ̂ x (k)) ∥a i ∥ 2 :=l 2 norm of the i-th column of A threshold varies element-by-element original threshold (2) Soubies, Blanc-Férraud & Aubert (2015)
27. ### l 2 –l 0 Solutions: Improving IHT Performance Scenario: –

3GPP ETU channel model – BW: 50 MHz – f S : 33.3 MHz – SNR: 0 to 20 dB – 100 Monte-Carlo Simulations delay [ns] Amplitude [dB] 0 –1 50 –1 120 –1 200 0 230 0 500 0 1600 –3 2300 –5 5000 –7 MSE= 1 100 ∑ i=1 100 ∥h ̂ hi ∥ 2 2 0 2 4 6 8 10 12 14 16 18 20 10-4 10-3 10-2 10-1 100 101 CEL0 vs IHT SNR MSE CEL0 IHT CRB → Error:
28. ### l 2 –l 0 Solutions: Improving IHT Performance Scenario: –

3GPP ETU channel model – BW: 50 MHz – f S : 33.3 MHz – SNR: 0 to 20 dB – 100 Monte-Carlo Simulations delay [ns] Amplitude [dB] 0 –1 50 –1 120 –1 200 0 230 0 500 0 1600 –3 2300 –5 5000 –7 0 2 4 6 8 10 12 14 16 18 20 102 103 104 105 CEL0 vs IHT SNR # of Iterations CEL0 IHT → Convergence Time (number of iterations):
29. ### l 2 –l 0 Minimization: DoA Estimation → Traditional Approach:

→ K incident signals → N antennas → L snapshots Y=AS+N A: [NxK] S: [KxL] Y: [NxL] unknown matrix problem Classical Solutions: – MUSIC – ESPRIT – RootMUSIC – ... very “expensive” solutions
30. ### l 2 –l 0 Minimization: DoA Estimation → “Compressed” Approach:

→ K incident signals → N antennas → L snapshots Y=A G S G +N SPARSE! KNOWN!
31. ### -10 -5 0 5 10 15 20 10-2 10-1 100

101 102 RMSE for 64 Antennas SNR RMSE ( o ) IHT MUSIC l 2 –l 0 Minimization: DoA Estimation → Some Simulations: Scenario: – K = 2 incident signals – θ1 = 0° – θ2 = 5° – N = 64 antennas (ULA) – L = 10 snapshots – ∆θ = 0.15° – G = 601 taps (–45° to +45°) – SNR: –10 to 20 dB – 100 Monte-Carlo runs – IHT and MUSIC algorithms RMSE= 1 200 √(∑ i=1 200 [(θ 1 ̂ θ 1i )2+(θ 2 ̂ θ 2i )2] )
32. ### -10 -5 0 5 10 15 20 10-2 10-1 100

101 102 RMSE for 24 Antennas SNR RMSE ( o ) IHT MUSIC l 2 –l 0 Minimization: DoA Estimation → Some Simulations: Scenario: – K = 2 incident signals – θ1 = 0° – θ2 = 5° – N = 24 antennas (ULA) – L = 10 snapshots – ∆θ = 0.15° – G = 601 taps (–45° to +45°) – SNR: –10 to 20 dB – 100 Monte-Carlo runs – IHT and MUSIC algorithms RMSE= 1 200 √(∑ i=1 200 [(θ 1 ̂ θ 1i )2+(θ 2 ̂ θ 2i )2] )
33. ### l 0 -Optimization for DoA and Sparse Channel Estimation →

Conclusions: → l 0 -optimization enables enhancement in channel estimation → precision and resolution → l 0 -optimization by iterative algorithms provide simple, efficient and elegant solutions to sparse problems → DoA estimation can be transformed in a sparse problem whose solution could be reached via l 0 -optimization → CEL0 provides enhancement in convergence rate and stability of IHT → Future: → verification of l 0 -optimization limits in channel estimation → application of IHT modified by CEL0 in DoA estimation → application of l 0 -optimization in other problems → e. g. non-pilot aided channel estimation, ...