Gabriel Peyré
January 01, 2012

# Signal Processing Course: Sparse L1 Recovery

## Transcript

2. ### Inverse problem

Operator $K : \mathbb{R}^{N_0} \to \mathbb{R}^P$ with $P \ll N_0$.
Measurements: $y = K f_0 + w$.
Goal (regularization): recover $f_0$ from $y$, since $K^{-1}$ is not well defined.
3. ### Inverse problem: sparse synthesis model

Operator $K : \mathbb{R}^{N_0} \to \mathbb{R}^P$, $P \ll N_0$; measurements $y = K f_0 + w \in \mathbb{R}^P$.
Model: $f_0 = \Psi x_0$ is sparse in a dictionary $\Psi \in \mathbb{R}^{N_0 \times N}$, $N \geqslant N_0$,
with coefficients $x_0 \in \mathbb{R}^N$ and image $f_0 = \Psi x_0 \in \mathbb{R}^{N_0}$.
Combined operator: $\Phi = K \Psi \in \mathbb{R}^{P \times N}$, so that the observations read $y = \Phi x_0 + w$.
4. ### Inverse problem: sparse recovery

Same setting: $\Phi = K\Psi \in \mathbb{R}^{P \times N}$, observations $y = \Phi x_0 + w$.
Sparse recovery: $f^\star = \Psi x^\star$ where $x^\star$ solves
$$\min_{x \in \mathbb{R}^N} \frac{1}{2}\|y - \Phi x\|^2 + \lambda \|x\|_1$$
(data fidelity + sparsity regularization).
5. ### Variations and stability

Data: $f_0 = \Psi x_0$. Observations: $y = \Phi x_0 + w$.
Recovery: $x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N} \frac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$.  $(P_\lambda(y))$
6. ### Variations and stability (no noise)

Data: $f_0 = \Psi x_0$. Observations: $y = \Phi x_0 + w$.
Recovery: $x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N} \frac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$.  $(P_\lambda(y))$
As $\lambda \to 0^+$ (no noise): $x^\star \in \mathrm{argmin}_{\Phi x = y} \|x\|_1$.  $(P_0(y))$
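The penalized problem $(P_\lambda(y))$ is easy to solve numerically by proximal gradient descent (ISTA). The following is a minimal numpy sketch, not the course's own code; the function name and parameters are mine:

```python
import numpy as np

def ista(Phi, y, lam, n_iter=5000):
    """Minimize 1/2 ||Phi x - y||^2 + lam * ||x||_1 by proximal gradient (ISTA)."""
    L = np.linalg.norm(Phi, 2) ** 2          # Lipschitz constant of the smooth part
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        z = x - Phi.T @ (Phi @ x - y) / L    # gradient step on the quadratic term
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft thresholding
    return x
```

On noiseless data and for small $\lambda$, the result approximates a solution of $(P_0(y))$, and the first-order condition $\|\Phi^*(\Phi x^\star - y)\|_\infty \leqslant \lambda$ can be used as a convergence check.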
7. ### Variations and stability: questions

– Behavior of $x^\star$ with respect to $y$ and $\lambda$.
– Criterion to ensure $\|x^\star - x_0\| = O(\|w\|)$.
– Criterion to ensure $x^\star = x_0$ when $w = 0$ and $\lambda \to 0^+$.
8. ### Numerical illustration

Setting: $y = \Phi x_0 + w$, $\|x_0\|_0 = s$, $\Phi \in \mathbb{R}^{50 \times 200}$ Gaussian.
Observations: the mapping $\lambda \mapsto x^\star(\lambda)$ looks polygonal, and if $x_0$ is sparse
and $\lambda$ is well chosen, $\mathrm{sign}(x^\star) = \mathrm{sign}(x_0)$.
[Figure: recovered coefficients for $s = 3, 6, 13, 25$.]
9. ### Overview

• Polytope Noiseless Recovery
• Local Behavior of Sparse Regularization
• Robustness to Small Noise
• Robustness to Bounded Noise
• Compressed Sensing RIP Theory
10. ### Polytopes approach

$\ell^1$ ball: $B_r = \{x \ \backslash \ \|x\|_1 \leqslant r\}$ with $r = \|x_0\|_1$; here $\Phi = (\varphi_i)_i$, $\varphi_i \in \mathbb{R}^2$.
The image $\Phi(B_r)$ is a polytope containing $y = \Phi x_0$.
Claim: $x_0$ is a solution of $\min_{\Phi x = y} \|x\|_1$ $\iff$ $\Phi x_0 \in \partial\, \Phi(B_r)$.
[Figure: the ball $B_r$, its image $\Phi(B_r)$, and the mapping $y \mapsto x^\star(y)$.]
11. ### Polytopes approach $(P_0(y))$

With $B_r = \{x \ \backslash \ \|x\|_1 \leqslant r\}$ and $r = \|x_0\|_1$:
$x_0$ is a solution of $P_0(\Phi x_0)$ $\iff$ $\Phi x_0 \in \partial\, \Phi(B_r)$.
12. ### Proof ($\Leftarrow$, by contraposition)

Suppose $x_0$ is not a solution; show $\Phi x_0 \in \mathrm{int}(\Phi(B_r))$.
There exists $z$ such that $\Phi x_0 = \Phi z$ and $\|z\|_1 = (1-\varepsilon)\|x_0\|_1$.
For any $h \in \mathrm{Im}(\Phi)$ with $\|h\|_1 < \varepsilon\|x_0\|_1 / \|\Phi^+\|_{1,1}$, write
$\Phi x_0 + h = \Phi(z + \Phi^+ h)$, where
$$\|z + \Phi^+ h\|_1 \leqslant \|z\|_1 + \|\Phi^+ h\|_1 \leqslant (1-\varepsilon)\|x_0\|_1 + \|\Phi^+\|_{1,1}\|h\|_1 < \|x_0\|_1,$$
so $\Phi x_0 + h \in \Phi(B_r)$.
13. ### Proof ($\Rightarrow$)

Conversely, suppose $\Phi x_0 \in \mathrm{int}(\Phi(B_r))$.
Then there exist $z$ and $\varepsilon > 0$ such that $\Phi x_0 = \Phi((1-\varepsilon)z)$ with $\|z\|_1 \leqslant \|x_0\|_1$.
Since $\|(1-\varepsilon)z\|_1 < \|x_0\|_1$, $x_0$ is not a solution.
Together with the previous step: $x_0$ is a solution of $P_0(\Phi x_0)$ $\iff$ $\Phi x_0 \in \partial\, \Phi(B_r)$.
14. ### Basis-pursuit mapping in 2-D

For a sign pattern $s$, quadrant $K_s = \{x \in \mathbb{R}^3 \ \backslash \ x_i s_i \geqslant 0\}$, e.g. $K_{(0,1,1)}$,
and 2-D cones $C_s = \Phi K_s$, e.g. $C_{(0,1,1)}$, with $\Phi = (\varphi_i)_i$, $\varphi_i \in \mathbb{R}^2$.
The mapping $y \mapsto x^\star(y)$ is determined by which cone $C_s$ contains $y$.
15. ### Basis-pursuit mapping in 3-D

Here $\Phi = (\varphi_i)_i$ with $\varphi_i \in \mathbb{R}^3$, $N$ atoms.
Empty spherical caps property: the cones $C_s$ induce a Delaunay paving of the sphere
by spherical triangles $(\varphi_i, \varphi_j, \varphi_k)$, which determines the mapping $y \mapsto x^\star(y)$.
16. ### Polytope noiseless recovery

Counting faces of random polytopes [Donoho]: sharp constants, but no noise robustness.
– All $x_0$ such that $\|x_0\|_0 \leqslant C_{\mathrm{all}}(P/N)\,P$ are identifiable.
– Most $x_0$ such that $\|x_0\|_0 \leqslant C_{\mathrm{most}}(P/N)\,P$ are identifiable.
$C_{\mathrm{all}}(1/4) \approx 0.065$, $C_{\mathrm{most}}(1/4) \approx 0.25$.
[Figure: phase-transition curves.]
17. ### Overview

• Polytope Noiseless Recovery
• Local Behavior of Sparse Regularization
• Robustness to Small Noise
• Robustness to Bounded Noise
• Compressed Sensing RIP Theory
18. ### First order CNS condition

For $x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N} E(x) = \frac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$,
support of the solution: $I = \{i \in \{0, \ldots, N-1\} \ \backslash \ x_i^\star \neq 0\}$.
First order condition ($0 \in \partial E(x^\star)$):
$x^\star$ is a solution of $P_\lambda(y)$ $\iff$ $\Phi^*(\Phi x^\star - y) + \lambda s = 0$
where $s_I = \mathrm{sign}(x_I^\star)$ and $\|s_{I^c}\|_\infty \leqslant 1$.
19. ### First order CNS condition

With $I$, $E$ and $s$ as above, the condition determines $s_{I^c} = -\frac{1}{\lambda}\Phi_{I^c}^*(\Phi x^\star - y)$.
Theorem: $x^\star$ is a solution of $P_\lambda(y)$ $\iff$ $\Phi^*(\Phi x^\star - y) + \lambda s = 0$
with $s_I = \mathrm{sign}(x_I^\star)$, $\|s_{I^c}\|_\infty \leqslant 1$.
Note: equivalently, $\|\Phi_{I^c}^*(\Phi x^\star - y)\|_\infty \leqslant \lambda$.
20. ### Local parameterization

If $\Phi_I$ has full rank, the implicit equation $\Phi_I^*(\Phi x^\star - y) + \lambda s_I = 0$ gives
$$x_I^\star = \Phi_I^+ y - \lambda(\Phi_I^*\Phi_I)^{-1} s_I, \qquad \Phi_I^+ = (\Phi_I^*\Phi_I)^{-1}\Phi_I^*.$$
21. ### Local parameterization

Given $y$: compute $x^\star$, then $(s, I)$. Define
$$\hat{x}_{\bar\lambda}(\bar y)_I = \Phi_I^+ \bar y - \bar\lambda(\Phi_I^*\Phi_I)^{-1} s_I, \qquad \hat{x}_{\bar\lambda}(\bar y)_{I^c} = 0.$$
By construction, $\hat{x}_\lambda(y) = x^\star$.
22. ### Local parameterization

Theorem: for $(y, \lambda) \notin \mathcal{H}$, let $x^\star$ be a solution of $P_\lambda(y)$ such that
$\Phi_I$ is full rank, $I = \mathrm{supp}(x^\star)$; then for $(\bar\lambda, \bar y)$ close to $(\lambda, y)$,
$\hat{x}_{\bar\lambda}(\bar y)$ is a solution of $P_{\bar\lambda}(\bar y)$.
Remark: the theorem holds outside a union of hyperplanes $\mathcal{H}$.
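The closed-form parameterization is easy to check numerically: on the support $I$ it satisfies the implicit equation $\Phi_I^*(\Phi \hat{x} - y) + \lambda s_I = 0$ identically, and at $\lambda = 0$ on noiseless data it returns $x_0$. A small illustrative numpy sketch (the function name is mine, not the course's):

```python
import numpy as np

def xhat(Phi, y, I, s_I, lam):
    """Local parameterization: x_I = Phi_I^+ y - lam * (Phi_I^T Phi_I)^{-1} s_I."""
    PhiI = Phi[:, I]
    G = PhiI.T @ PhiI                       # Gram matrix, assumed invertible (full rank)
    x = np.zeros(Phi.shape[1])
    x[I] = np.linalg.solve(G, PhiI.T @ y - lam * s_I)
    return x

rng = np.random.default_rng(1)
Phi = rng.standard_normal((30, 60)) / np.sqrt(30)
I = np.array([2, 11, 40])
x0 = np.zeros(60); x0[I] = [1.0, -2.0, 0.5]
y = Phi @ x0                                # noiseless observations
s_I = np.sign(x0[I])

x = xhat(Phi, y, I, s_I, lam=0.1)
res = Phi[:, I].T @ (Phi @ x - y) + 0.1 * s_I   # implicit equation residual on I
```

The residual vanishes for any $\lambda$ by construction; whether $\hat{x}$ is actually the solution of $P_\lambda(y)$ additionally requires the sign and correlation conditions of the theorem.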
23. ### Full rank condition

If $\ker(\Phi_I) \neq \{0\}$, then $x^\star$ is not unique.
Lemma: there exists a solution $x^\star$ such that $\ker(\Phi_I) = \{0\}$.
24. ### Full rank condition

Lemma: there exists a solution $x^\star$ such that $\ker(\Phi_I) = \{0\}$.
Proof: if $\ker(\Phi_I) \neq \{0\}$, let $\eta_I \in \ker(\Phi_I)$, $\eta \neq 0$,
and define $x_t = x^\star + t\eta$ for $t \in \mathbb{R}$.
25. ### Full rank condition

Proof (cont.): with $x_t = x^\star + t\eta$ as above,
let $t_0$ be the smallest $|t|$ such that $\mathrm{sign}(x_t) \neq \mathrm{sign}(x^\star)$.
26. ### Full rank condition

Proof (cont.): for all $|t| < t_0$, $\Phi x_t = \Phi x^\star$ and $\mathrm{sign}(x_t) = \mathrm{sign}(x^\star)$,
so $x_t$ is a solution.
27. ### Full rank condition

Proof (end): by continuity, $x_{t_0}$ is also a solution, and $|\mathrm{supp}(x_{t_0})| < |\mathrm{supp}(x^\star)|$.
Iterating this support reduction yields a solution whose support satisfies $\ker(\Phi_I) = \{0\}$.
28. ### Proof (local parameterization)

With $I = \mathrm{supp}(s)$ and $\hat{x}_{\bar\lambda}(\bar y)_I = \Phi_I^+ \bar y - \bar\lambda(\Phi_I^*\Phi_I)^{-1}s_I$,
to show: for all $j \notin I$,
$$d_j^s(\bar y, \bar\lambda) = |\langle \varphi_j,\ \bar y - \Phi_I \hat{x}_{\bar\lambda}(\bar y)\rangle| \leqslant \bar\lambda.$$
29. ### Proof (cont.)

To show: for all $j \notin I$, $d_j^s(\bar y, \bar\lambda) = |\langle \varphi_j,\ \bar y - \Phi_I \hat{x}_{\bar\lambda}(\bar y)\rangle| \leqslant \bar\lambda$.
Case 1: $d_j^s(y, \lambda) < \lambda$ → ok, by continuity.
30. ### Proof (cont.)

Case 2: $d_j^s(y, \lambda) = \lambda$ and $\varphi_j \in \mathrm{Im}(\Phi_I)$:
then $d_j^s(\bar y, \bar\lambda) = \bar\lambda$ → ok.
31. ### Proof (cont.)

Case 3: $d_j^s(y, \lambda) = \lambda$ and $\varphi_j \notin \mathrm{Im}(\Phi_I)$ → exclude this case.
32. ### Proof (cont.)

Case 3 is excluded by removing the hyperplanes
$$\mathcal{H} = \bigcup \{H_{s,j} \ \backslash \ \varphi_j \notin \mathrm{Im}(\Phi_I)\},
\qquad H_{s,j} = \{(y, \lambda) \ \backslash \ d_j^s(y, \lambda) = \lambda\}.$$
33. ### Proof (cont.)

Example of excluded hyperplane: $H_{\emptyset,j}$, on which $x^\star = 0$.
34. ### Proof (end)

[Figure: the excluded hyperplanes $H_{\emptyset,j}$ and $H_{I,j}$ in the $(y, \lambda)$ domain.]
35. ### Local affine maps

Under the uniqueness assumption, the local parameterization
$\hat{x}_{\bar\lambda}(\bar y)_I = \Phi_I^+ \bar y - \bar\lambda(\Phi_I^*\Phi_I)^{-1}s_I$ shows that
$\lambda \mapsto x^\star(\lambda)$ and $y \mapsto x^\star(y)$ are piecewise affine functions.
Breaking points of the path correspond to changes of the support of $x^\star$;
for $\lambda$ large enough, $\|x^\star\| = 0$, and as $\lambda \to 0^+$ one reaches $x_0$ (BP solution).
[Figure: piecewise affine path $\lambda \mapsto (x_1^\star, x_2^\star)$.]
36. ### Projector

Proposition: if $x_1^\star$ and $x_2^\star$ both minimize $E(x) = \frac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$,
then $\Phi x_1^\star = \Phi x_2^\star$.
Corollary: $\mu_\lambda(y) = \Phi x^\star$ is uniquely defined.
37. ### Projector

Proof: $x_3 = (x_1^\star + x_2^\star)/2$ is a solution, and if $\Phi x_1^\star \neq \Phi x_2^\star$, then
$2\|x_3\|_1 \leqslant \|x_1^\star\|_1 + \|x_2^\star\|_1$ while, by strict convexity of $\|\cdot\|^2$,
$2\|\Phi x_3 - y\|^2 < \|\Phi x_1^\star - y\|^2 + \|\Phi x_2^\star - y\|^2$,
so $E(x_3) < E(x_1^\star) = E(x_2^\star)$ $\implies$ contradiction.
38. ### Projector

For $(\bar y, \bar\lambda)$ close to $(y, \lambda) \notin \mathcal{H}$:
$$\mu_{\bar\lambda}(\bar y) = P_I(\bar y) - \bar\lambda\, d_I, \qquad d_I = \Phi_I^{+,*} s_I,$$
where $P_I$ is the orthogonal projector on $\{\Phi x \ \backslash \ \mathrm{supp}(x) = I\}$.
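The affine formula for $\mu_{\bar\lambda}(\bar y)$ can be verified directly: $\Phi_I \hat{x}_I = P_I y - \lambda d_I$ with $P_I = \Phi_I\Phi_I^+$ and $d_I = \Phi_I(\Phi_I^*\Phi_I)^{-1}s_I$. An illustrative numpy check (variable names are mine):

```python
import numpy as np

rng = np.random.default_rng(2)
Phi = rng.standard_normal((25, 50)) / np.sqrt(25)
I = [4, 19, 33]
PhiI = Phi[:, I]
G = PhiI.T @ PhiI                                  # Gram matrix of Phi_I
s_I = np.array([1.0, -1.0, 1.0])
y = rng.standard_normal(25)
lam = 0.3

x_I = np.linalg.solve(G, PhiI.T @ y - lam * s_I)   # local solution on the support
P_I = PhiI @ np.linalg.solve(G, PhiI.T)            # orthogonal projector on Im(Phi_I)
d_I = PhiI @ np.linalg.solve(G, s_I)               # dual vector Phi_I^{+,*} s_I
mu = PhiI @ x_I                                    # mu_lam(y) = Phi x^*
```

The identity holds for any $y$ and $\lambda$, which is exactly the statement that $\mu_\lambda$ is locally affine in $(y, \lambda)$.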
39. ### Overview

• Polytope Noiseless Recovery
• Local Behavior of Sparse Regularization
• Robustness to Small Noise
• Robustness to Bounded Noise
• Compressed Sensing RIP Theory
40. ### Uniqueness sufficient condition

Setting: $E(x) = \frac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$.
41. ### Uniqueness sufficient condition

Theorem: if $\Phi_I$ has full rank and $\|\Phi_{I^c}^*(\Phi x^\star - y)\|_\infty < \lambda$,
then $x^\star$ is the unique minimizer of $E$.
42. ### Uniqueness sufficient condition

Proof: let $\tilde{x}^\star$ be a minimizer. Then $\Phi \tilde{x}^\star = \Phi x^\star$, so
$$\|\Phi_{I^c}^*(\Phi \tilde{x}^\star - y)\|_\infty = \|\Phi_{I^c}^*(\Phi x^\star - y)\|_\infty < \lambda
\implies \mathrm{supp}(\tilde{x}^\star) \subset I
\implies \tilde{x}_I^\star - x_I^\star \in \ker(\Phi_I) = \{0\}
\implies \tilde{x}^\star = x^\star.$$
43. ### Robustness to small noise

For $s \in \{-1, 0, +1\}^N$, let $I = \mathrm{supp}(s)$ ($\Phi_I$ is assumed to have full rank).
Identifiability criterion [Fuchs]:
$$F(s) = \|\Psi_I s_I\|_\infty \quad \text{where} \quad \Psi_I = \Phi_{I^c}^*\, \Phi_I^{+,*},$$
and $\Phi_I^+ = (\Phi_I^*\Phi_I)^{-1}\Phi_I^*$ satisfies $\Phi_I^+\Phi_I = \mathrm{Id}_I$.
44. ### Robustness to small noise

Theorem: suppose $F(\mathrm{sign}(x_0)) < 1$, and set $T = \min_{i \in I}|x_{0,i}|$.
If $\|w\|/T$ is small enough and $\lambda \sim \|w\|$, then
$$x^\star = x_0 + \Phi_I^+ w - \lambda(\Phi_I^*\Phi_I)^{-1}\mathrm{sign}(x_{0,I})$$
is the unique solution of $P_\lambda(y)$; in particular $\|x^\star - x_0\| = O(\|w\|)$.
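The Fuchs criterion is a few lines of numpy. A sketch (the function name is mine); for two orthogonal atoms plus one diagonal atom in $\mathbb{R}^2$ the value can be worked out by hand, and for an orthogonal $\Phi$ it vanishes:

```python
import numpy as np

def fuchs_F(Phi, s):
    """F(s) = max_{j not in I} |<phi_j, d_I>|, d_I = Phi_I (Phi_I^T Phi_I)^{-1} s_I."""
    I = np.flatnonzero(s)
    Ic = np.flatnonzero(s == 0)
    PhiI = Phi[:, I]
    d = PhiI @ np.linalg.solve(PhiI.T @ PhiI, s[I])   # dual certificate Phi_I^{+,*} s_I
    return np.max(np.abs(Phi[:, Ic].T @ d)) if Ic.size else 0.0
```

With $\Phi = [e_1, e_2, (e_1+e_2)/\sqrt{2}]$ and $s = (1, 1, 0)$, the certificate is $d_I = (1,1)$ and $F(s) = \sqrt{2} > 1$: the sign pattern is not identifiable, matching the geometric picture of an atom inside the cap.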
45. ### Geometric interpretation

$$F(s) = \|\Psi_I s_I\|_\infty = \max_{j \notin I} |\langle d_I, \varphi_j\rangle|,$$
where $d_I = \Phi_I^{+,*} s_I = \Phi_I(\Phi_I^*\Phi_I)^{-1}s_I$ is defined by:
$\forall i \in I, \ \langle d_I, \varphi_i\rangle = s_i$.
46. ### Geometric interpretation

Condition $F(s) < 1$: $|\langle d_I, \varphi_j\rangle| < 1$ for all $j \notin I$,
i.e. no vector $\varphi_j$ lies inside the cap $C_s$ associated to $d_I$.
47. ### Geometric interpretation

[Figure: atoms $\varphi_i, \varphi_j, \varphi_k$, the certificate $d_I$ and the cap $C_s$;
$F(s) < 1$ means $|\langle d_I, \varphi_j\rangle| < 1$ for every $j \notin I$.]
48. ### Sketch of proof

Local candidate: $x = \hat{x}(\mathrm{sign}(x_0))$, where (implicit equation)
$$\hat{x}(s)_I = \Phi_I^+ y - \lambda(\Phi_I^*\Phi_I)^{-1}s_I, \qquad I = \mathrm{supp}(s).$$
To prove: $\hat{x} = \hat{x}(\mathrm{sign}(x_0))$ is the unique solution of $P_\lambda(y)$.
49. ### Sketch of proof

Sign consistency: require $\mathrm{sign}(\hat{x}) = \mathrm{sign}(x_0)$.  $(C_1)$
Since $y = \Phi x_0 + w$, one has $\hat{x} = x_0 + \Phi_I^+ w - \lambda(\Phi_I^*\Phi_I)^{-1}s_I$, and
$$\|\Phi_I^+\|_{\infty,2}\,\|w\| + \lambda\,\|(\Phi_I^*\Phi_I)^{-1}\|_{\infty,\infty} < T \implies (C_1).$$
50. ### Sketch of proof

First order conditions: require $\|\Phi_{I^c}^*(\Phi \hat{x} - y)\|_\infty < \lambda$.  $(C_2)$
One has
$$\|\Phi_{I^c}^*(\Phi_I\Phi_I^+ - \mathrm{Id})\|_{2,\infty}\,\|w\| - \lambda(1 - F(s)) < 0 \implies (C_2).$$
51. ### Sketch of proof (cont.)

Both conditions together:
$$\|\Phi_I^+\|_{\infty,2}\,\|w\| + \lambda\,\|(\Phi_I^*\Phi_I)^{-1}\|_{\infty,\infty} < T
\quad\text{and}\quad
\|\Phi_{I^c}^*(\Phi_I\Phi_I^+ - \mathrm{Id})\|_{2,\infty}\,\|w\| - \lambda(1 - F(s)) < 0$$
$\implies \hat{x}$ is the solution.
52. ### Sketch of proof (cont.)

The two constraints delimit a cone in the $(\|w\|, \lambda)$ plane:
for $\|w\|/T < \beta_{\max}$, one can choose $\lambda \propto \|w\|/T$ such that
$\hat{x}$ is the solution of $P_\lambda(y)$.
[Figure: feasible region for $\lambda$ as a function of $\|w\|$.]
53. ### Sketch of proof (cont.)

With such a choice, $\lambda = O(\|w\|)$ and
$$\|\hat{x} - x_0\| \leqslant \|\Phi_I^+ w\| + \lambda\,\|(\Phi_I^*\Phi_I)^{-1}s_I\|
\implies \|\hat{x} - x_0\| = O(\|w\|).$$
54. ### Overview

• Polytope Noiseless Recovery
• Local Behavior of Sparse Regularization
• Robustness to Small Noise
• Robustness to Bounded Noise
• Compressed Sensing RIP Theory
55. ### Robustness to bounded noise

Exact Recovery Criterion (ERC) [Tropp]: for a support $I \subset \{0, \ldots, N-1\}$ with $\Phi_I$ full rank,
$$\mathrm{ERC}(I) = \|\Psi_I\|_{\infty,\infty} = \|\Phi_I^+\Phi_{I^c}\|_{1,1} = \max_{j \in I^c}\|\Phi_I^+\varphi_j\|_1,$$
where $\Psi_I = \Phi_{I^c}^*\Phi_I^{+,*}$ (use $\|(a_j)_j\|_{1,1} = \max_j \|a_j\|_1$).
Relation with the $F$ criterion: $\mathrm{ERC}(I) = \max_{s,\ \mathrm{supp}(s) \subset I} F(s)$.
56. ### Robustness to bounded noise

Theorem [Tropp]: if $\mathrm{ERC}(\mathrm{supp}(x_0)) < 1$ and $\lambda \sim \|w\|$, then
$x^\star$ is unique, satisfies $\mathrm{supp}(x^\star) \subset \mathrm{supp}(x_0)$, and
$\|x_0 - x^\star\| = O(\|w\|)$.
57. ### Sketch of proof

Restricted recovery: $\hat{x} \in \mathrm{argmin}_{\mathrm{supp}(x) \subset I} \frac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$.
To prove: $\hat{x}$ is the unique solution of $P_\lambda(y)$.
58. ### Sketch of proof

Restricted recovery as above; implicit equation: $\hat{x}_I = \Phi_I^+ y - \lambda(\Phi_I^*\Phi_I)^{-1}s_I$.
Important: here $s = \mathrm{sign}(\hat{x})$ is not necessarily equal to $\mathrm{sign}(x_0)$.
59. ### Sketch of proof

First order conditions: require $\|\Phi_{I^c}^*(\Phi \hat{x} - y)\|_\infty < \lambda$.  $(C_2)$
As before,
$$\|\Phi_{I^c}^*(\Phi_I\Phi_I^+ - \mathrm{Id})\|_{2,\infty}\,\|w\| - \lambda(1 - F(s)) < 0 \implies (C_2).$$
60. ### Sketch of proof

Since $s$ is arbitrary: $\mathrm{ERC}(I) < 1 \implies F(s) < 1$.
Hence choosing $\lambda \sim \|w\|$ implies $(C_2)$, and $\hat{x}$ is the unique solution of $P_\lambda(y)$.
61. ### Weak ERC

For $A = (a_i)_i$, $B = (b_j)_j$ with $a_i, b_j \in \mathbb{R}^P$, define the correlations
$$\mu(A, B) = \max_j \sum_{i} |\langle a_i, b_j\rangle|, \qquad \mu(A) = \max_j \sum_{i \neq j} |\langle a_i, a_j\rangle|.$$
Weak Exact Recovery Criterion [Gribonval, Dossal]: denoting $\Phi = (\varphi_i)_{i=0}^{N-1}$, $\varphi_i \in \mathbb{R}^P$,
$$\text{w-ERC}(I) = \frac{\mu(\Phi_I, \Phi_{I^c})}{1 - \mu(\Phi_I)} \ \text{if}\ \mu(\Phi_I) < 1, \qquad +\infty \ \text{otherwise}.$$
Theorem: for $I = \mathrm{supp}(s)$, $\quad F(s) \leqslant \mathrm{ERC}(I) \leqslant \text{w-ERC}(I)$.
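The chain $F(s) \leqslant \mathrm{ERC}(I) \leqslant \text{w-ERC}(I)$ is easy to test numerically. A numpy sketch under the slide's conventions (unit-norm columns; function names are mine, and $F$ is re-defined here so the snippet is self-contained):

```python
import numpy as np

def fuchs_F(Phi, s):
    """F(s) = max over j outside I of |<phi_j, Phi_I^{+,*} s_I>|."""
    I = np.flatnonzero(s)
    Ic = np.setdiff1d(np.arange(Phi.shape[1]), I)
    d = Phi[:, I] @ np.linalg.solve(Phi[:, I].T @ Phi[:, I], s[I])
    return np.max(np.abs(Phi[:, Ic].T @ d))

def erc(Phi, I):
    """ERC(I) = max_{j not in I} ||Phi_I^+ phi_j||_1."""
    Ic = np.setdiff1d(np.arange(Phi.shape[1]), I)
    C = np.linalg.pinv(Phi[:, I]) @ Phi[:, Ic]       # coefficients Phi_I^+ phi_j
    return np.max(np.sum(np.abs(C), axis=0))

def w_erc(Phi, I):
    """w-ERC(I) = mu(Phi_I, Phi_Ic) / (1 - mu(Phi_I)), or +inf if mu(Phi_I) >= 1."""
    Ic = np.setdiff1d(np.arange(Phi.shape[1]), I)
    A = np.abs(Phi[:, I].T @ Phi[:, I])
    np.fill_diagonal(A, 0.0)                          # off-diagonal correlations in Phi_I
    mu_I = np.max(np.sum(A, axis=1))
    mu_cross = np.max(np.sum(np.abs(Phi[:, I].T @ Phi[:, Ic]), axis=0))
    return mu_cross / (1.0 - mu_I) if mu_I < 1 else np.inf
```

w-ERC only needs inner products between atoms, which is why it is cheap to evaluate even for large supports, at the price of being the loosest of the three criteria.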
62. ### Proof

$$\mathrm{ERC}(I) = \max_{j \notin I} \|\Phi_I^+\varphi_j\|_1
\leqslant \|(\Phi_I^*\Phi_I)^{-1}\|_{1,1}\ \max_{j \notin I} \|\Phi_I^*\varphi_j\|_1,$$
and $\max_{j \notin I} \|\Phi_I^*\varphi_j\|_1 = \max_{j \notin I} \sum_{i \in I} |\langle \varphi_i, \varphi_j\rangle| = \mu(\Phi_I, \Phi_{I^c})$.
63. ### Proof (cont.)

One has $\Phi_I^*\Phi_I = \mathrm{Id} - H$; if $\|H\|_{1,1} < 1$,
$$(\Phi_I^*\Phi_I)^{-1} = (\mathrm{Id} - H)^{-1} = \sum_{k \geqslant 0} H^k,
\qquad \|(\Phi_I^*\Phi_I)^{-1}\|_{1,1} \leqslant \sum_{k \geqslant 0} \|H\|_{1,1}^k = \frac{1}{1 - \|H\|_{1,1}},$$
with $\|H\|_{1,1} = \max_{i \in I} \sum_{j \neq i} |\langle \varphi_i, \varphi_j\rangle| = \mu(\Phi_I)$,
which gives $\mathrm{ERC}(I) \leqslant \text{w-ERC}(I)$.
64. ### Example: random matrix

$P = 200$, $N = 1000$: empirical probability, as a function of $\|x_0\|_0$, that
$F < 1$, $\mathrm{ERC} < 1$, $\text{w-ERC} < 1$, and that $x^\star = x_0$.
[Figure: the four curves.]
65. ### Example: deconvolution

$\Phi x = \sum_i x_i\, \varphi(\cdot - \Delta i)$: increasing the spacing $\Delta$ reduces the correlations
driving $F(s) \leqslant \mathrm{ERC}(I) \leqslant \text{w-ERC}(I)$, but reduces the resolution.
[Figure: sparse spikes $x_0$ and the filtered observation $\Phi x_0$.]
66. ### Coherence bounds

Mutual coherence: $\mu(\Phi) = \max_{i \neq j} |\langle \varphi_i, \varphi_j\rangle|$.
Theorem: $F(s) \leqslant \mathrm{ERC}(I) \leqslant \text{w-ERC}(I) \leqslant \dfrac{|I|\,\mu(\Phi)}{1 - (|I| - 1)\,\mu(\Phi)}$.
67. ### Coherence bounds

Theorem: if $\|x_0\|_0 < \frac{1}{2}\left(1 + \frac{1}{\mu(\Phi)}\right)$ and $\lambda \sim \|w\|$,
one has $\mathrm{supp}(x^\star) \subset I$ and $\|x_0 - x^\star\| = O(\|w\|)$.
68. ### Coherence bounds

One has $\mu(\Phi) \geqslant \sqrt{\dfrac{N - P}{P(N - 1)}}$.
For Gaussian matrices: $\mu(\Phi) \sim \sqrt{\log(PN)/P}$;
optimistic setting: $\|x_0\|_0 = O(\sqrt{P})$.
For convolution matrices: useless criterion.
69. ### Coherence: examples

Incoherent pair of orthobases, Diracs/Fourier:
$$\Psi_1 = \{k \mapsto \delta[k - m]\}_m, \qquad
\Psi_2 = \{k \mapsto N^{-1/2} e^{\frac{2i\pi}{N}mk}\}_m, \qquad
\Phi = [\Psi_1, \Psi_2] \in \mathbb{R}^{N \times 2N}.$$
70. ### Coherence: examples

With $\Phi = [\Psi_1, \Psi_2]$, the recovery splits as
$$\min_{x \in \mathbb{R}^{2N}} \frac{1}{2}\|y - \Phi x\|^2 + \lambda\|x\|_1
= \min_{x_1, x_2 \in \mathbb{R}^N} \frac{1}{2}\|y - \Psi_1 x_1 - \Psi_2 x_2\|^2 + \lambda\|x_1\|_1 + \lambda\|x_2\|_1,$$
i.e. $y$ is decomposed as a sum of spikes and sinusoids.
71. ### Coherence: examples

For Diracs/Fourier, $\mu(\Phi) = \dfrac{1}{\sqrt{N}}$:
$\ell^1$ recovery separates up to $\sim \dfrac{\sqrt{N}}{2}$ Diracs + sines.
72. ### Overview

• Polytope Noiseless Recovery
• Local Behavior of Sparse Regularization
• Robustness to Small Noise
• Robustness to Bounded Noise
• Compressed Sensing RIP Theory
73. ### CS with RIP

Restricted Isometry Constants: $\delta_k$ is the smallest constant such that
$$\forall\, \|x\|_0 \leqslant k, \quad (1 - \delta_k)\|x\|^2 \leqslant \|\Phi x\|^2 \leqslant (1 + \delta_k)\|x\|^2.$$
$\ell^1$ recovery: $x^\star \in \mathrm{argmin}_{\|\Phi x - y\| \leqslant \delta} \|x\|_1$,
where $y = \Phi x_0 + w$ with $\|w\| \leqslant \delta$.
74. ### CS with RIP

Theorem [Candès 2009]: if $\delta_{2k} \leqslant \sqrt{2} - 1$, then
$$\|x_0 - x^\star\| \leqslant \frac{C_0}{\sqrt{k}}\,\|x_0 - x_k\|_1 + C_1\,\delta,$$
where $x_k$ is the best $k$-term approximation of $x_0$.
75. ### Elements of proof

Partition $\{0, \ldots, N-1\} = T_0 \cup T_1 \cup \ldots \cup T_m$, where $T_0$ indexes the $k$
largest entries of $x_0$ (so $x_k = x_{T_0}$) and $T_1, T_2, \ldots$ index the largest entries
of $h_{T_0^c}$, with $h = x^\star - x_0$.
Optimality conditions give $\|h_{T_0^c}\|_1 \leqslant \|h_{T_0}\|_1 + 2\|x_{0, T_0^c}\|_1$.
Explicit constants:
$$\rho = \frac{\sqrt{2}\,\delta_{2k}}{1 - \delta_{2k}}, \qquad
\alpha = \frac{2\sqrt{1 + \delta_{2k}}}{1 - \delta_{2k}}, \qquad
C_0 = \frac{2(1 + \rho)}{1 - \rho}, \qquad C_1 = \frac{2\alpha}{1 - \rho}.$$
Reference: E. J. Candès, CRAS, 2006.
76. ### Singular values distributions

Eigenvalues of $\Phi_I^*\Phi_I$ with $|I| = k$ are essentially in $[a, b]$, with
$a = (1 - \sqrt{\beta})^2$ and $b = (1 + \sqrt{\beta})^2$, where $\beta = k/P$.
When $k = \beta P \to +\infty$, the eigenvalue distribution tends to [Marcenko-Pastur]
$$f(\lambda) = \frac{1}{2\pi\beta\lambda}\,\sqrt{(b - \lambda)_+\,(\lambda - a)_+}.$$
Large deviation inequality [Ledoux].
[Figure: empirical eigenvalue histograms vs. $f(\lambda)$ for $P = 200$, $k = 10, 30, 50$.]
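The concentration of the spectrum of $\Phi_I^*\Phi_I$ inside $[a, b]$ can be observed directly. A sketch with Gaussian entries of variance $1/P$ (so columns have unit norm in expectation); the parameters are illustrative:

```python
import numpy as np

P, k = 400, 40
beta = k / P
a = (1 - np.sqrt(beta)) ** 2        # Marcenko-Pastur lower edge
b = (1 + np.sqrt(beta)) ** 2        # Marcenko-Pastur upper edge

rng = np.random.default_rng(0)
PhiI = rng.standard_normal((P, k)) / np.sqrt(P)   # k columns of a Gaussian matrix
eigs = np.linalg.eigvalsh(PhiI.T @ PhiI)          # spectrum of the Gram matrix
```

For these sizes the extreme eigenvalues sit very close to the edges $a \approx 0.47$ and $b \approx 1.73$, which is exactly what makes random Gaussian submatrices near-isometries on sparse vectors.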
77. ### RIP for Gaussian matrices

Mutual coherence: $\mu(\Phi) = \max_{i \neq j} |\langle \varphi_i, \varphi_j\rangle|$.
Link with coherence: $\delta_k \leqslant (k - 1)\,\mu(\Phi)$, and $\delta_2 = \mu(\Phi)$.
78. ### RIP for Gaussian matrices

Link with coherence: $\delta_k \leqslant (k - 1)\,\mu(\Phi)$, $\delta_2 = \mu(\Phi)$.
For Gaussian matrices: $\mu(\Phi) \sim \sqrt{\log(PN)/P}$.
79. ### RIP for Gaussian matrices

Stronger result for Gaussian matrices.
Theorem: if $k \leqslant C\,P / \log(N/P)$, then $\delta_{2k} \leqslant \sqrt{2} - 1$ with high probability.
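The identity $\delta_2 = \mu(\Phi)$ follows because for $|I| = 2$ the Gram matrix is $\begin{pmatrix}1 & c\\ c & 1\end{pmatrix}$ with eigenvalues $1 \pm |c|$, so the worst deviation from 1 over all pairs is the coherence. A brute-force numpy check (unit-norm columns assumed; the function is mine):

```python
import numpy as np

def delta_2(Phi):
    """RIC of order 2, by enumerating all pairs of (unit-norm) columns."""
    N = Phi.shape[1]
    d = 0.0
    for i in range(N):
        for j in range(i + 1, N):
            G = Phi[:, [i, j]].T @ Phi[:, [i, j]]
            lam = np.linalg.eigvalsh(G)       # eigenvalues 1 -+ |<phi_i, phi_j>|
            d = max(d, 1 - lam[0], lam[-1] - 1)
    return d

rng = np.random.default_rng(4)
Phi = rng.standard_normal((10, 15))
Phi /= np.linalg.norm(Phi, axis=0)            # normalize the columns
C = np.abs(Phi.T @ Phi) - np.eye(15)          # off-diagonal correlation magnitudes
mu = C.max()                                  # mutual coherence
```

The enumeration is $O(N^2)$ and only practical for order 2; this is why higher-order constants are estimated rather than computed.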
80. ### Numerics with RIP

Stability constants of $A$: the smallest $\delta_1(A)$, $\delta_2(A)$ such that
$$(1 - \delta_1(A))\,\|x\|^2 \leqslant \|Ax\|^2 \leqslant (1 + \delta_2(A))\,\|x\|^2,$$
i.e. given by the smallest / largest eigenvalues of $A^*A$.
81. ### Numerics with RIP

Upper/lower RIC: $\delta_k^i = \max_{|I| = k} \delta_i(\Phi_I)$; the RIP constant is
$\delta_k = \max(\delta_k^1, \delta_k^2)$.
Monte-Carlo estimation over random supports: $\hat\delta_k \leqslant \delta_k$.
[Figure: estimates $\hat\delta_k^1, \hat\delta_k^2$ as functions of $k$.]
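The Monte-Carlo lower estimate $\hat\delta_k \leqslant \delta_k$ samples random supports and records the extreme eigenvalues of the restricted Gram matrices. A sketch (parameters and names are mine):

```python
import numpy as np

def ric_estimate(Phi, k, n_trials=200, rng=None):
    """Monte-Carlo lower bounds (d1, d2) on the lower/upper RICs of order k."""
    if rng is None:
        rng = np.random.default_rng(0)
    N = Phi.shape[1]
    d1 = d2 = 0.0
    for _ in range(n_trials):
        I = rng.choice(N, size=k, replace=False)          # random support of size k
        eigs = np.linalg.eigvalsh(Phi[:, I].T @ Phi[:, I])
        d1 = max(d1, 1 - eigs[0])                         # lower deviation 1 - lambda_min
        d2 = max(d2, eigs[-1] - 1)                        # upper deviation lambda_max - 1
    return d1, d2

rng = np.random.default_rng(5)
Phi = rng.standard_normal((50, 200))
Phi /= np.linalg.norm(Phi, axis=0)                        # unit-norm columns
d1_1, d2_1 = ric_estimate(Phi, 1)
d1_5, d2_5 = ric_estimate(Phi, 5)
```

Since only a random subset of the $\binom{N}{k}$ supports is visited, the estimate can only underestimate the true constants, which is why it provides a lower bound $\hat\delta_k \leqslant \delta_k$.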
82. ### Conclusion

Local behavior: $\lambda \mapsto x^\star(\lambda)$ is polygonal; $y \mapsto x^\star(y)$ is piecewise affine.
[Figure: solution paths for $s = 3, 6, 13, 25$.]
83. ### Conclusion

Local behavior: $\lambda \mapsto x^\star(\lambda)$ polygonal; $y \mapsto x^\star(y)$ piecewise affine.
Noiseless recovery $\iff$ geometry of polytopes.
84. ### Conclusion

Local behavior: $\lambda \mapsto x^\star(\lambda)$ polygonal; $y \mapsto x^\star(y)$ piecewise affine.
Noiseless recovery $\iff$ geometry of polytopes.
Small noise → sign stability. Bounded noise → support inclusion.
RIP-based → no support stability, $\ell^1$ bounds.