Slide 1

Slide 1 text

l 0 -Optimization for DoA and Channel Sparse Estimation Adilson Chinatto SUPELEC Mars, 6th 2015

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

Summary → l p -norm and other definitions → Problem formulation → l 2 -l 1 solvers → examples on channel estimation → l 2 -l 0 solvers → examples on channel estimation → examples on DoA estimation → Conclusions and future work

Slide 4

Slide 4 text

Some Definitions Problem to be solved: l p =∥x∥ p :=(∑ i ∣x i ∣p )1/ p , p>0 y=A x+n l 0 =∥x∥ 0 :=# of non-null elements in x x=[0 1 0 -2 0 5 0 0 -3]T ∥x∥ 0 =4 or Y=A X+N vector problem matrix problem p-norm of x is defined as: Special case: p = 0 E. g.:

Slide 5

Slide 5 text

l 2 Solution of Overdetermined Problem Assumptions: 1. y and A are known 2. A[MxN] is overdetermined (M ≥ N) In such case, one possible solution is: min ̂ x {f (̂ x)=∥y A ̂ x∥ 2 2} y=A x+n ̂ x=A† y least squares solution closed form

Slide 6

Slide 6 text

l 2 Solutions of Underdetermined Problem Assumptions: 1. y and A are known 2. A[MxN] is underdetermined (M < N) In such case, least squares solutions fails: min ̂ x {f (̂ x)=∥y A ̂ x∥ 2 2} y=A x+n less equations than unknows 3. Vector x is SPARSE How to use the sparsity of x in order to achieve a good solution? there are more null (or near-null) elements than non-null elements is x

Slide 7

Slide 7 text

l 2 –l 0 Solutions of Underdetermined Problem The best solution considering x sparse is given by: 2. If maximum noise n is known: 3. If nor sparsity nor noise are known: 1. If maximum sparsity k is known: These problems are known to be NP-Hard! Assumptions: min ̂ x f ( ̂ x)=∥y A ̂ x∥ 2 2 subject to min g( ̂ x)=min∥̂ x∥ 0 min ̂ x ∥y A ̂ x∥ 2 2 s. t. ∥̂ x∥ 0 ⩽k min ̂ x ∥̂ x∥ 0 s. t. ∥y A ̂ x∥ 2 2⩽n min ̂ x {1 2 ∥y A ̂ x∥ 2 2+λ∥̂ x∥ 0 }

Slide 8

Slide 8 text

l 2 –l 1 Solutions of Underdetermined Problem Relaxation: in some cases, l 0 -norm can be relaxed l 0 → → → → l 1 1. Noiseless case: Mixture matrix A must obey some restrictions: RIP Assumptions: min ̂ x ∥̂ x∥ 1 s. t. A ̂ x= y Compressive Sensing Framework Allows solution by Convex Optimization techniques 2. Noisy case: min ̂ x {1 2 ∥y A ̂ x∥ 2 2+λ∥̂ x∥ 1 } Restricted Isometry Property

Slide 9

Slide 9 text

l 2 –l 1 Algorithms 1. Basis Pursuit (BP) → Noiseless case: convex function min ̂ x ∥̂ x∥ 1 s. t. A ̂ x= y g( ̂ x)=A ̂ x y=0 f ( ̂ x)=∥̂ x∥ 1 affine function → → → → Algorithm based on Linear Programming → → → → Drawbacks: → good recovery only for high spase signals (K « N) → too restritive conditions on A for good recovery → computationally intense

Slide 10

Slide 10 text

l 2 –l 1 Algorithms 2. Basis Pursuit De-Noising (BPDN) → Noisy case: → → → → Algorithm based on Linear Programming → → → → Drawbacks: → good recovery only for high spase signals (K « N) → too restritive conditions on A for good recovery → computationally intense convex function convex function min ̂ x {1 2 ∥y A ̂ x∥ 2 2+λ∥̂ x∥ 1 }

Slide 11

Slide 11 text

l 2 –l 1 Iterative Algorithms 3. Orthogonal Matching Pursuit → greedy algorithm min ̂ x {1 2 ∥y A ̂ x∥ 2 2+λ∥̂ x∥ 1 } init :r(0) ← y ; S=∅ ; k=0 until criterium loop: find i(k)=arg max∣〈r(k) , A〉∣→a i(k ) S ←S∪a i(k) ̂ x(k+1) ←S† y r(k+1) ← y A ̂ x(k+1) k ← k+1 residual iteration column of A

Slide 12

Slide 12 text

l 2 –l 1 Iterative Algorithms min ̂ x {1 2 ∥y A ̂ x∥ 2 2+λ∥̂ x∥ 1 } gradient iteration penalization init : ̂ x(0)=0; k=0; λ(0) ; β ∈(0,1) until criterium loop: ∇ ̂ x (k) ← AH A ̂ x(k) AH y b(k) ← ̂ x(k) λ(k) ∇ ̂ x (k) ̂ x(k+1) ← prox λ(k ) {b(k)} λ(k+1) ← β λ(k ) k ← k+1 −λ +λ 4. Proximal-l 1 → Iterative Soft-Thresolding

Slide 13

Slide 13 text

l 2 –l 1 Minimization: Some Examples → Channel Estimation → Pilot assisted signal (512 symbols) Scenario: – 3GPP ETU channel model – BW: 50 MHz – f S : 33.3 MHz – SNR: 10 dB delay [ns] Amplitude [dB] 0 –1 50 –1 120 –1 200 0 230 0 500 0 1600 –3 2300 –5 5000 –7 y=A h+n 600 taps (each tap: 10 ns) 200 samples (each sample: 30 ns) A : [200x600] h : [600x1] y : [200x1]

Slide 14

Slide 14 text

l 2 –l 1 Minimization: Some Examples 50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 Original Channel Amplitude 50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 Least Squares Solution Amplitude 1. Least-Squares Solution → pure l 2 minimization Scenario: – 3GPP ETU channel model – BW: 50 MHz – f S : 33.3 MHz – SNR: 10 dB delay [ns] Amplitude [dB] 0 –1 50 –1 120 –1 200 0 230 0 500 0 1600 –3 2300 –5 5000 –7

Slide 15

Slide 15 text

50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 Amplitude Original Channel 0 100 200 300 400 500 600 -1 0 1 Amplitude BPDN l 2 –l 1 Minimization: Some Examples 2. BPDN Solution → l 2 –l 1 minimization via linear programming Scenario: – 3GPP ETU channel model – BW: 50 MHz – f S : 33.3 MHz – SNR: 10 dB delay [ns] Amplitude [dB] 0 –1 50 –1 120 –1 200 0 230 0 500 0 1600 –3 2300 –5 5000 –7

Slide 16

Slide 16 text

50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 Original Channel Amplitude 50 100 150 200 250 300 350 400 450 500 550 600 -0.2 -0.1 0 0.1 0.2 Amplitude Proximal L-1 Solution l 2 –l 1 Minimization: Some Examples 3. Proximal-l 1 Solution → l 2 –l 1 minimization via iterative algorithm Scenario: – 3GPP ETU channel model – BW: 50 MHz – f S : 33.3 MHz – SNR: 10 dB delay [ns] Amplitude [dB] 0 –1 50 –1 120 –1 200 0 230 0 500 0 1600 –3 2300 –5 5000 –7

Slide 17

Slide 17 text

50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 Original Channel Amplitude 50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 Amplitude OMP Solution l 2 –l 1 Minimization: Some Examples 4. OMP Solution → l 2 –l 1 minimization via greedy algorithm Scenario: – 3GPP ETU channel model – BW: 50 MHz – f S : 33.3 MHz – SNR: 10 dB delay [ns] Amplitude [dB] 0 –1 50 –1 120 –1 200 0 230 0 500 0 1600 –3 2300 –5 5000 –7

Slide 18

Slide 18 text

l 2 –l 1 Minimization: Some Examples → Some (Partial) Conclusions: → l 2 solution fails as the problem is underdetermined → l 2 –l 1 relaxation works with good accuracy in this case → But if I consider the 3GPP recomendation BW of 20 MHz? Scenario: – 3GPP ETU channel model – BW: 50 MHz – f S : 33.3 MHz – SNR: 10 dB delay [ns] Amplitude [dB] 0 –1 50 –1 120 –1 200 0 230 0 500 0 1600 –3 2300 –5 5000 –7 Error=∥h ̂ h∥ 2 2 BW 50 MHz Least Squares 114.94 BPDN 0.4918 Proximal-l 1 4.6436 OMP 0.0609

Slide 19

Slide 19 text

l 2 –l 1 Minimization: Some Examples 5. BPDN Solution → l 2 –l 1 minimization via linear programming 50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 Amplitude Original Channel 0 100 200 300 400 500 600 -1 -0.5 0 0.5 1 BPDN Solution Amplitude -1.5 -1 -0.5 0 0.5 1 1.5 -1.5 -1 -0.5 0 0.5 1 1.5 Original Channel BPDN Solution – BW: 20 MHz

Slide 20

Slide 20 text

l 2 –l 1 Minimization: Some Examples – BW: 20 MHz 6. Proximal-l 1 Solution → l 2 –l 1 minimization via iterative algorithm 50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 Amplitude Original Channel 0 100 200 300 400 500 600 -0.4 -0.2 0 0.2 0.4 Amplitude Proximal L 1 Solution -1.5 -1 -0.5 0 0.5 1 1.5 -1.5 -1 -0.5 0 0.5 1 1.5 Original Channel Proximal L 1 Solution

Slide 21

Slide 21 text

l 2 –l 1 Minimization: Some Examples 7. OMP Solution → l 2 –l 1 minimization via greedy algorithm – BW: 20 MHz 50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 Amplitude Original Channel 50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 Amplitude OMP Solution -1.5 -1 -0.5 0 0.5 1 1.5 -1.5 -1 -0.5 0 0.5 1 1.5 Original Channel OMP Solution

Slide 22

Slide 22 text

l 2 –l 1 Minimization: Some Examples → Some Conclusions: → l 2 –l 1 relaxation fails in this case → RIP is not obeyed in this case → Matrix A has columns highly correlated Scenario: – 3GPP ETU channel model – BW: 20 MHz – f S : 33.3 MHz – SNR: 10 dB delay [ns] Amplitude [dB] 0 –1 50 –1 120 –1 200 0 230 0 500 0 1600 –3 2300 –5 5000 –7 BW 50 MHz 20 MHz BPDN 0.4918 2.9050 Proximal-l 1 4.6436 5.9466 OMP 0.0609 2.7532 Error=∥h ̂ h∥ 2 2

Slide 23

Slide 23 text

l 2 –l 0 Solutions: Dealing With l 0 -norm min ̂ x {1 2 ∥y A ̂ x∥ 2 2+λ∥̂ x∥ 0 } gradient H √(2λ) (u)= {0 if ∣u∣<√(2λ) u if ∣u∣⩾√(2λ) } init : ̂ x(0)=0; k=0; λ until criterium loop: ∇ ̂ x (k) ← AH A ̂ x(k) AH y b(k) ← ̂ x(k) ∇ ̂ x (k) ̂ x(k+1) ← H √(2λ) {b(k)} k ← k+1 → Iterative Hard Thresholding (IHT)(1): variant 1 → Idea: try to solve applying a hard-threshold in order to eliminate noise and bad answers √(2λ) +√(2λ) (1)Blumensath & Davies (2004)

Slide 24

Slide 24 text

l 2 –l 0 Solutions: Dealing With l 0 -norm min ̂ x {1 2 ∥y A ̂ x∥ 2 2 s. t. ∥̂ x∥ 0 ⩽K } gradient H K (u)→ {keeps the K greatest values of u sets to 0 all other terms } init : ̂ x(0)=0; k=0 until criterium loop: ∇ ̂ x (k) ← AH A ̂ x(k) AH y b(k) ← ̂ x(k) ∇ ̂ x (k) ̂ x(k+1) ← H K {b(k)} k ← k+1 → Iterative Hard Thresholding (IHT)(1): variant 2 → Idea: solve (1)Blumensath & Davies (2004)

Slide 25

Slide 25 text

50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 Amplitude Original Channel 50 100 150 200 250 300 350 400 450 500 550 600 -1 0 1 l 2 –l 0 Solutions: Dealing With l 0 -norm → IHT Solution → l 2 –l 0 minimization via iterative algorithm – BW: 20 MHz -1.5 -1 -0.5 0 0.5 1 1.5 -1.5 -1 -0.5 0 0.5 1 1.5 Original Channel IHT Solution

Slide 26

Slide 26 text

l 2 –l 0 Solutions: Dealing With l 0 -norm → IHT provides very good sparse channel estimation → IHT is simple and has guaranteed convergence for BUT: → IHT convergence time can be extremelly long → Hard-thresholding discontinuity can result in instability problems Scenario: – 3GPP ETU channel model – BW: 20 MHz – f S : 33.3 MHz – SNR: 10 dB BW 20 MHz BPDN 2.9050 Proximal-l 1 5.9466 OMP 2.7532 IHT 0.0059 Error=∥h ̂ h∥ 2 2 ∥A∥ 2 <1

Slide 27

Slide 27 text

l 2 –l 0 Solutions: Improving IHT Performance ̂ x(k+1)=H √(2λ) (̂ x(k) ∇ ̂ x (k)) H √(2λ)γ∥a i ∥ 2 (ui )= {sign(u i )min (∣u i ∣, (∣u i ∣ √(2λ)γ∥a i ∥ 2 ) + (1 ∥a i ∥ 2 2 γ) ) if ∥a i ∥ 2 2 γ<1 H√(2λ γ) (ui ) if ∥ai ∥ 2 2 γ⩾1 } → Continuous Exact l 0 (CEL0(2)): 1. Original IHT: 2. Inclusion of an additional step size γ γ γ γ: 4. Thresholding modification: ̂ x(k+1)=H √(2λ)γ (̂ x(k) γ ∇ ̂ x (k)) ∥a i ∥ 2 :=l 2 norm of the i-th column of A threshold varies element-by-element original threshold (2) Soubies, Blanc-Férraud & Aubert (2015)

Slide 28

Slide 28 text

l 2 –l 0 Solutions: Improving IHT Performance Scenario: – 3GPP ETU channel model – BW: 50 MHz – f S : 33.3 MHz – SNR: 0 to 20 dB – 100 Monte-Carlo Simulations delay [ns] Amplitude [dB] 0 –1 50 –1 120 –1 200 0 230 0 500 0 1600 –3 2300 –5 5000 –7 MSE= 1 100 ∑ i=1 100 ∥h ̂ hi ∥ 2 2 0 2 4 6 8 10 12 14 16 18 20 10-4 10-3 10-2 10-1 100 101 CEL0 vs IHT SNR MSE CEL0 IHT CRB → Error:

Slide 29

Slide 29 text

l 2 –l 0 Solutions: Improving IHT Performance Scenario: – 3GPP ETU channel model – BW: 50 MHz – f S : 33.3 MHz – SNR: 0 to 20 dB – 100 Monte-Carlo Simulations delay [ns] Amplitude [dB] 0 –1 50 –1 120 –1 200 0 230 0 500 0 1600 –3 2300 –5 5000 –7 0 2 4 6 8 10 12 14 16 18 20 102 103 104 105 CEL0 vs IHT SNR # of Iterations CEL0 IHT → Convergence Time (number of iterations):

Slide 30

Slide 30 text

l 2 –l 0 Minimization: DoA Estimation → Traditional Approach: → K incident signals → N antennas → L snapshots Y=AS+N A: [NxK] S: [KxL] Y: [NxL] unknown matrix problem Classical Solutions: – MUSIC – ESPRIT – RootMUSIC – ... very “expensive” solutions

Slide 31

Slide 31 text

l 2 –l 0 Minimization: DoA Estimation → “Compressed” Approach: → K incident signals → N antennas → L snapshots Y=A G S G +N SPARSE! KNOWN!

Slide 32

Slide 32 text

-10 -5 0 5 10 15 20 10-2 10-1 100 101 102 RMSE for 64 Antennas SNR RMSE ( o ) IHT MUSIC l 2 –l 0 Minimization: DoA Estimation → Some Simulations: Scenario: – K = 2 incident signals – θ1 = 0° – θ2 = 5° – N = 64 antennas (ULA) – L = 10 snapshots – ∆θ = 0.15° – G = 601 taps (–45° to +45°) – SNR: –10 to 20 dB – 100 Monte-Carlo runs – IHT and MUSIC algorithms RMSE= 1 200 √(∑ i=1 200 [(θ 1 ̂ θ 1i )2+(θ 2 ̂ θ 2i )2] )

Slide 33

Slide 33 text

-10 -5 0 5 10 15 20 10-2 10-1 100 101 102 RMSE for 24 Antennas SNR RMSE ( o ) IHT MUSIC l 2 –l 0 Minimization: DoA Estimation → Some Simulations: Scenario: – K = 2 incident signals – θ1 = 0° – θ2 = 5° – N = 24 antennas (ULA) – L = 10 snapshots – ∆θ = 0.15° – G = 601 taps (–45° to +45°) – SNR: –10 to 20 dB – 100 Monte-Carlo runs – IHT and MUSIC algorithms RMSE= 1 200 √(∑ i=1 200 [(θ 1 ̂ θ 1i )2+(θ 2 ̂ θ 2i )2] )

Slide 34

Slide 34 text

l 0 -Optimization for DoA and Sparse Channel Estimation → Conclusions: → l 0 -optimization enables enhancement in channel estimation → precision and resolution → l 0 -optimization by iterative algorithms provide simple, efficient and elegant solutions to sparse problems → DoA estimation can be transformed in a sparse problem whose solution could be reached via l 0 -optimization → CEL0 provides enhancement in convergence rate and stability of IHT → Future: → verification of l 0 -optimization limits in channel estimation → application of IHT modified by CEL0 in DoA estimation → application of l 0 -optimization in other problems → e. g. non-pilot aided channel estimation, ...

Slide 35

Slide 35 text

No content