Signal Processing Course: Sparse L1 Recovery


Gabriel Peyré

January 01, 2012

Transcript

  1. Sparse ℓ¹ Recovery. Gabriel Peyré, www.numerical-tours.com

  2. Inverse problem: K : ℝ^(N0) → ℝ^P, P ≪ N0.
     Measurements: y = K f0 + w. Example: regularization of K.
  3. Model: f0 = Ψ x0 is sparse in a dictionary Ψ ∈ ℝ^(N0×N), N ≥ N0.
     Coefficients x0 ∈ ℝ^N, image f0 = Ψ x0 ∈ ℝ^(N0), observations
     y = K f0 + w ∈ ℝ^P, so that y = Φ x0 + w with Φ = K Ψ ∈ ℝ^(P×N).
  4. Sparse recovery: f* = Ψ x* where x* solves
     min_{x ∈ ℝ^N} (1/2)‖Φx − y‖² + λ‖x‖₁   (fidelity + regularization).
  5. Variations and stability. Data: f0 = Ψ x0. Observations: y = Φ x0 + w.
     Recovery: x*_λ ∈ argmin_{x ∈ ℝ^N} (1/2)‖Φx − y‖² + λ‖x‖₁   (P_λ(y)).
  6. As λ → 0⁺ (no noise), this becomes
     x* ∈ argmin_{Φx = y} ‖x‖₁   (P_0(y)).
  7. Questions:
     – behavior of x*_λ with respect to y and λ;
     – criterion to ensure ‖x*_λ − x0‖ = O(‖w‖);
     – criterion to ensure x* = x0 when w = 0 and λ → 0⁺.
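The variational problem (P_λ(y)) can be solved numerically by iterative soft-thresholding. Below is a minimal NumPy sketch; the course's own codes live at www.numerical-tours.com, and the `ista` function, the iteration count and all parameter values here are illustrative assumptions, not the course implementation:

```python
import numpy as np

def ista(Phi, y, lam, n_iter=3000):
    """Iterative soft-thresholding for min_x 1/2 ||Phi x - y||^2 + lam ||x||_1."""
    L = np.linalg.norm(Phi, 2) ** 2               # Lipschitz constant of the gradient
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        g = x - Phi.T @ (Phi @ x - y) / L         # gradient step on the fidelity term
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft-thresholding
    return x

# Synthetic instance matching the slides: y = Phi x0 + w, x0 s-sparse
rng = np.random.default_rng(0)
P, N, s = 50, 200, 3
Phi = rng.standard_normal((P, N)) / np.sqrt(P)
x0 = np.zeros(N)
x0[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
y = Phi @ x0 + 0.01 * rng.standard_normal(P)
lam = 0.05
x_star = ista(Phi, y, lam)
```

At convergence the iterate satisfies the first-order condition of the problem, ‖Φᵀ(Φx* − y)‖_∞ ≤ λ, up to numerical accuracy.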
  8. Numerical illustration: y = Φ x0 + w, ‖x0‖₀ = s, Φ ∈ ℝ^(50×200)
     Gaussian. (Figure: recovered coefficients for s = 3, 6, 13, 25.)
     The mapping λ ↦ x*_λ looks polygonal; if x0 is sparse and λ is well
     chosen, sign(x*_λ) = sign(x0).
  9. Overview • Polytope Noiseless Recovery • Local Behavior of Sparse

    Regularization • Robustness to Small Noise • Robustness to Bounded Noise • Compressed Sensing RIP Theory
  10. Polytope approach: Φ = (φᵢ)ᵢ with φᵢ ∈ ℝ², ℓ¹ ball
      B_α = {x : ‖x‖₁ ≤ α} with α = ‖x0‖₁, and min_{Φx = y} ‖x‖₁.
      Claim: x0 is a solution ⟺ Φ x0 ∈ ∂(Φ B_α).
  11. Same picture for problem (P_0(y)): x0 is a solution of P_0(Φ x0)
      ⟺ Φ x0 lies on the boundary of the projected polytope Φ B_α.
  12. Proof (⟸): suppose x0 is not a solution; show Φ x0 ∈ int(Φ B_α).
      There exists z such that Φ x0 = Φ z and ‖z‖₁ = (1 − δ)‖x0‖₁.
      For any h ∈ Im(Φ) with ‖h‖₁ small enough, write
      Φ x0 + h = Φ(z + Φ⁺h), and
      ‖z + Φ⁺h‖₁ ≤ ‖z‖₁ + ‖Φ⁺h‖₁ ≤ (1 − δ)‖x0‖₁ + ‖Φ⁺‖_{1,1}‖h‖₁ < ‖x0‖₁,
      hence Φ x0 + h ∈ Φ B_α for all such h.
  13. Proof (⟹): suppose Φ x0 ∈ int(Φ B_α). Then there exist δ > 0 and
      z ∈ B_α with Φ x0 = (1 − δ)Φ z; since ‖(1 − δ)z‖₁ < ‖x0‖₁, x0 is
      not a solution of P_0(Φ x0).
  14. Basis-pursuit mapping in 2-D: for a sign pattern s (e.g. s = (0,1,1)),
      K_s = {(αᵢ sᵢ)ᵢ ∈ ℝ³ : αᵢ ≥ 0} is a 2-D quadrant and C_s = Φ K_s a
      2-D cone; these cones determine the mapping y ↦ x*(y).
      (Figure: Φ = (φᵢ)ᵢ, φᵢ ∈ ℝ².)
  15. Basis-pursuit mapping in 3-D: Φ = (φᵢ)ᵢ, φᵢ ∈ ℝ³. Empty spherical
      cap property: the cones C_s induce a Delaunay paving of the sphere
      by spherical triangles. (Figure: mapping y ↦ x*(y).)
  16. Polytope noiseless recovery, counting faces of random polytopes
      [Donoho]: sharp constants, no noise robustness. All x0 such that
      ‖x0‖₀ ≤ C_all(P/N)·P are identifiable; most x0 such that
      ‖x0‖₀ ≤ C_most(P/N)·P are identifiable; C_all(1/4) ≈ 0.065,
      C_most(1/4) ≈ 0.25. (Figure: phase transition curves.)
  17. Overview • Polytope Noiseless Recovery • Local Behavior of Sparse

    Regularization • Robustness to Small Noise • Robustness to Bounded Noise • Compressed Sensing RIP Theory
  18. First-order necessary and sufficient condition: for
      x* ∈ argmin_{x ∈ ℝ^N} E(x) = (1/2)‖Φx − y‖² + λ‖x‖₁, let
      I = {i ∈ {0, …, N−1} : x*ᵢ ≠ 0} be the support of the solution.
      Then x* is a solution of P_λ(y) ⟺ Φ*(Φx* − y) + λ s = 0,
      where s_I = sign(x*_I) and ‖s_{I^c}‖_∞ ≤ 1.
  19. Note: the condition gives s_{I^c} = (1/λ) Φ*_{I^c}(y − Φx*).
      Theorem: x* is a solution of P_λ(y) ⟺ ‖Φ*_{I^c}(Φx* − y)‖_∞ ≤ λ
      (together with Φ*_I(Φx* − y) + λ sign(x*_I) = 0).
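This first-order condition is easy to verify numerically. A minimal NumPy sketch follows; the helper name `is_lasso_solution` and the tolerances are assumptions for illustration. For Φ = Id the minimizer is exactly the soft-thresholding of y, which the check confirms:

```python
import numpy as np

def is_lasso_solution(Phi, y, x, lam, tol=1e-8):
    """Check the first-order condition of P_lam(y):
    Phi^T (y - Phi x) = lam * s with s_I = sign(x_I), ||s_{I^c}||_inf <= 1."""
    c = Phi.T @ (y - Phi @ x)            # correlations of the residual
    I = np.abs(x) > tol                  # support of x
    on_support = np.allclose(c[I], lam * np.sign(x[I]), atol=1e-6)
    off_support = np.all(np.abs(c[~I]) <= lam + 1e-6)
    return on_support and off_support

# Orthogonal case Phi = Id: the minimizer is soft-thresholding of y
N, lam = 8, 0.3
rng = np.random.default_rng(1)
y = rng.standard_normal(N)
x_st = np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)
ok = is_lasso_solution(np.eye(N), y, x_st, lam)
```

For Φ = Id, the residual on the support is exactly λ·sign(xᵢ) and at most λ off the support, so `ok` is True.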
  20. Local parameterization: if Φ_I has full rank, the implicit equation
      Φ*(Φx* − y) + λ s = 0 gives
      x*_I = Φ_I⁺ y − λ (Φ_I* Φ_I)⁻¹ s_I,   Φ_I⁺ = (Φ_I* Φ_I)⁻¹ Φ_I*.
  21. Given y, compute x*, then (s, I) = (sign(x*), supp(x*)). Define
      x̂_λ'(y')_I = Φ_I⁺ y' − λ' (Φ_I* Φ_I)⁻¹ s_I and x̂_λ'(y')_{I^c} = 0.
      By construction, x̂_λ(y) = x*.
  22. Theorem: for (y, λ) ∉ H, let x* be a solution of P_λ(y) such that
      Φ_I is full rank, with I = supp(x*). Then for (λ', y') close to
      (λ, y), x̂_λ'(y') is a solution of P_λ'(y').
      Remark: the theorem holds outside a union H of hyperplanes.
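The closed-form expression x̂_I = Φ_I⁺ y − λ(Φ_I*Φ_I)⁻¹ s_I can be written directly in NumPy. A minimal sketch (the function name `xhat` is an assumption); for an orthogonal Φ it reproduces soft-thresholding:

```python
import numpy as np

def xhat(Phi, y, lam, s):
    """Candidate solution for a sign pattern s (slide 21):
    xhat_I = (Phi_I^T Phi_I)^{-1} (Phi_I^T y - lam s_I), zero outside I."""
    I = np.flatnonzero(s)
    PhiI = Phi[:, I]
    x = np.zeros(Phi.shape[1])
    x[I] = np.linalg.solve(PhiI.T @ PhiI, PhiI.T @ y - lam * s[I])
    return x

# Orthogonal check: for Phi = Id, xhat reproduces soft-thresholding of y
rng = np.random.default_rng(2)
N, lam = 6, 0.25
y = rng.standard_normal(N)
x_st = np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)
x_param = xhat(np.eye(N), y, lam, np.sign(x_st))
```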
  23. Full rank condition. Lemma: there exists a solution x* such that
      ker(Φ_I) = {0}. Note: if ker(Φ_I) ≠ {0}, x* is not unique.
  24. Proof: if ker(Φ_I) ≠ {0}, pick η_I ∈ ker(Φ_I), η ≠ 0 (η supported
      on I), and define x_t = x* + t η for t ∈ ℝ.
  25. Let t0 be the smallest |t| such that sign(x_t) ≠ sign(x*).
  26. For all |t| < t0, Φ x_t = Φ x* and sign(x_t) = sign(x*), so ‖x_t‖₁
      is affine in t; since x* is a minimizer, x_t is also a solution.
  27. By continuity, x_{t0} is a solution, and |supp(x_{t0})| < |supp(x*)|.
      Repeating the argument yields a solution whose Φ_I has trivial
      kernel. (Figure: the path t ↦ x_t.)
  28. Proof (of the local parameterization theorem): with I = supp(s) and
      x̂_λ'(y')_I = Φ_I⁺ y' − λ' (Φ_I* Φ_I)⁻¹ s_I, it remains to show that
      for all j ∉ I,  d_j^s(y', λ') = |⟨φ_j, y' − Φ_I x̂_λ'(y')_I⟩| ≤ λ'.
  29. Case 1: d_j^s(y, λ) < λ. The bound holds for (y', λ') close to
      (y, λ) by continuity.
  30. Case 2: d_j^s(y, λ) = λ and φ_j ∈ Im(Φ_I). Then d_j^s(y', λ') = λ'
      identically, so the bound holds.
  31. Case 3: d_j^s(y, λ) = λ and φ_j ∉ Im(Φ_I): exclude this case.
  32. Exclude the hyperplanes H = ⋃ {H_{s,j} : φ_j ∉ Im(Φ_I)}, where
      H_{s,j} = {(y, λ) : d_j^s(y, λ) = λ}.
  33. (Figure: the hyperplane H_{∅,j}, on which x* = 0.)
  34. (Figure: the hyperplanes H_{I,j}.)
  35. Local affine maps: under the uniqueness assumption, λ ↦ x*_λ and
      y ↦ x* are piecewise affine functions, by the local
      parameterization x̂_λ'(y')_I = Φ_I⁺ y' − λ' (Φ_I* Φ_I)⁻¹ s_I.
      Breaking points: change of support of x*_λ. At λ = 0, x*_λ = x0
      (the BP solution); for λ large enough, x*_λ = 0.
      (Figure: path λ ↦ (x1, x2).)
  36. Projector. Proposition: if x1 and x2 both minimize
      E(x) = (1/2)‖Φx − y‖² + λ‖x‖₁, then Φ x1 = Φ x2.
      Corollary: μ(y) = Φ x1 = Φ x2 is uniquely defined.
  37. Proof: x3 = (x1 + x2)/2 is a solution, and if Φ x1 ≠ Φ x2 then
      2‖x3‖₁ ≤ ‖x1‖₁ + ‖x2‖₁ and, by strict convexity of ‖·‖²,
      2‖Φ x3 − y‖² < ‖Φ x1 − y‖² + ‖Φ x2 − y‖², hence
      E(x3) < E(x1) = E(x2): contradiction.
  38. For (y', λ) close to (y, λ) ∉ H:
      μ(y') = P_I(y') − λ d_I with d_I = Φ_I^{+,*} s_I,
      where P_I is the orthogonal projector on {Φx : supp(x) = I}.
  39. Overview • Polytope Noiseless Recovery • Local Behavior of Sparse

    Regularization • Robustness to Small Noise • Robustness to Bounded Noise • Compressed Sensing RIP Theory
  40. Uniqueness sufficient condition, for
      E(x) = (1/2)‖Φx − y‖² + λ‖x‖₁.
  41. Theorem: if Φ_I has full rank and ‖Φ*_{I^c}(Φ x* − y)‖_∞ < λ,
      then x* is the unique minimizer of E.
  42. Proof: let x̃* be a minimizer. Then Φ x̃* = Φ x*, hence
      ‖Φ*_{I^c}(Φ x̃* − y)‖_∞ = ‖Φ*_{I^c}(Φ x* − y)‖_∞ < λ
      ⟹ supp(x̃*) ⊂ I ⟹ x̃*_I − x*_I ∈ ker(Φ_I) = {0} ⟹ x̃* = x*.
  43. Robustness to small noise. Identifiability criterion [Fuchs]: for
      s ∈ {−1, 0, +1}^N, let I = supp(s) (Φ_I is assumed to have full
      rank) and F(s) = ‖Ψ_I s_I‖_∞, where Ψ_I = Φ*_{I^c} Φ_I^{+,*} and
      Φ_I⁺ = (Φ_I* Φ_I)⁻¹ Φ_I* satisfies Φ_I⁺ Φ_I = Id_I.
  44. Theorem: if F(sign(x0)) < 1, set T = min_{i ∈ I} |x0,i|. If ‖w‖/T
      is small enough and λ ∼ ‖w‖, then
      x* = x0 + Φ_I⁺ w − λ (Φ_I* Φ_I)⁻¹ sign(x0,I)
      is the unique solution of P_λ(y); in particular, for ‖w‖ small
      enough, ‖x* − x0‖ = O(‖w‖).
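The Fuchs criterion F(sign(x0)) is directly computable from Φ and a sign pattern. A minimal NumPy sketch (the function name and the test case are illustrative assumptions):

```python
import numpy as np

def fuchs_criterion(Phi, s):
    """F(s) = max_{j not in I} |<phi_j, d_I>|, I = supp(s),
    with d_I = Phi_I (Phi_I^T Phi_I)^{-1} s_I, so that <d_I, phi_i> = s_i on I."""
    I = np.flatnonzero(s)
    Ic = np.setdiff1d(np.arange(Phi.shape[1]), I)
    PhiI = Phi[:, I]
    dI = PhiI @ np.linalg.solve(PhiI.T @ PhiI, s[I])
    return np.max(np.abs(Phi[:, Ic].T @ dI))

# Orthogonal columns: d_I is orthogonal to every unused atom, so F(s) = 0
s = np.zeros(5)
s[[0, 2]] = [1.0, -1.0]
F0 = fuchs_criterion(np.eye(5), s)
```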
  45. Geometric interpretation:
      F(s) = ‖Ψ_I s_I‖_∞ = max_{j ∉ I} |⟨d_I, φ_j⟩|,
      where d_I is defined by ⟨d_I, φᵢ⟩ = sᵢ for i ∈ I; explicitly
      d_I = Φ_I (Φ_I* Φ_I)⁻¹ s_I = Φ_I^{+,*} s_I.
  46. Condition F(s) < 1: no atom φ_j, j ∉ I, lies inside the spherical
      cap C_s around d_I, i.e. |⟨d_I, φ_j⟩| < 1 for all j ∉ I.
      (Figure: the cap C_s.)
  47. (Figure: atoms φᵢ, φ_j, φ_k relative to the cap C_s.)
  48. Sketch of proof. Local candidate: x̂ = x̂(sign(x0)), where
      x̂(s)_I = Φ_I⁺ y − λ (Φ_I* Φ_I)⁻¹ s_I (implicit equation),
      I = supp(s). To prove: x̂ is the unique solution of P_λ(y).
  49. Sign consistency (C1): sign(x̂) = sign(x0). Since y = Φ x0 + w,
      x̂ = x0 + Φ_I⁺ w − λ (Φ_I* Φ_I)⁻¹ s_I, and
      ‖Φ_I⁺‖_{∞,2} ‖w‖ + λ ‖(Φ_I* Φ_I)⁻¹‖_{∞,∞} < T ⟹ (C1).
  50. First-order conditions (C2): ‖Φ*_{I^c}(Φ x̂ − y)‖_∞ < λ, and
      ‖Φ*_{I^c}(Φ_I Φ_I⁺ − Id)‖_{2,∞} ‖w‖ − λ (1 − F(s)) < 0 ⟹ (C2).
  51. (C1) and (C2) ⟹ x̂ is the solution.
  52. The two constraints delimit a region in the (‖w‖, λ) plane: for
      ‖w‖/T < ε_max, one can choose λ ∝ ‖w‖/T such that x̂ is the
      solution of P_λ(y). (Figure: admissible region between the two
      constraints.)
  53. Finally ‖x̂ − x0‖ ≤ ‖Φ_I⁺ w‖ + λ ‖(Φ_I* Φ_I)⁻¹ s_I‖ = O(‖w‖),
      hence ‖x̂ − x0‖ = O(‖w‖).
  54. Overview • Polytope Noiseless Recovery • Local Behavior of Sparse

    Regularization • Robustness to Small Noise • Robustness to Bounded Noise • Compressed Sensing RIP Theory
  55. Robustness to bounded noise. Exact Recovery Criterion [Tropp]: for
      a support I ⊂ {0, …, N−1} with Φ_I full rank,
      ERC(I) = ‖Ψ_I‖_{∞,∞} where Ψ_I = Φ*_{I^c} Φ_I^{+,*}; equivalently
      ERC(I) = ‖Φ_I⁺ Φ_{I^c}‖_{1,1} = max_{j ∈ I^c} ‖Φ_I⁺ φ_j‖₁
      (using ‖(a_j)_j‖_{1,1} = max_j ‖a_j‖₁).
      Relation with the F criterion: ERC(I) = max_{s : supp(s) ⊂ I} F(s).
  56. Theorem: if ERC(supp(x0)) < 1 and λ ∼ ‖w‖, then x* is unique,
      satisfies supp(x*) ⊂ supp(x0), and ‖x0 − x*‖ = O(‖w‖).
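In the pseudo-inverse form, ERC(I) = max_{j ∈ I^c} ‖Φ_I⁺ φ_j‖₁ is one line of NumPy. A sketch (function name and test case are assumptions):

```python
import numpy as np

def erc(Phi, I):
    """Tropp's criterion ERC(I) = max_{j in I^c} ||Phi_I^+ phi_j||_1."""
    I = np.asarray(I)
    Ic = np.setdiff1d(np.arange(Phi.shape[1]), I)
    pinvI = np.linalg.pinv(Phi[:, I])
    return np.max(np.sum(np.abs(pinvI @ Phi[:, Ic]), axis=0))

# For an orthobasis, no atom outside I correlates with Phi_I: ERC(I) = 0
E0 = erc(np.eye(6), [0, 3])
```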
  57. Sketch of proof. Restricted recovery:
      x̂ ∈ argmin_{supp(x) ⊂ I} (1/2)‖Φx − y‖² + λ‖x‖₁.
      To prove: x̂ is the unique solution of P_λ(y).
  58. Implicit equation: x̂_I = Φ_I⁺ y − λ (Φ_I* Φ_I)⁻¹ s_I.
      Important: s = sign(x̂) is not equal to sign(x0) in general.
  59. First-order conditions (C2): ‖Φ*_{I^c}(Φ x̂ − y)‖_∞ < λ, and
      ‖Φ*_{I^c}(Φ_I Φ_I⁺ − Id)‖_{2,∞} ‖w‖ − λ (1 − F(s)) < 0 ⟹ (C2).
  60. Since s is arbitrary, ERC(I) < 1 ⟹ F(s) < 1. Hence choosing
      λ ∝ ‖w‖ implies (C2).
  61. Weak Exact Recovery Criterion [Gribonval, Dossal]: write
      Φ = (φᵢ)_{i=0}^{N−1}, φᵢ ∈ ℝ^P. For A = (aᵢ)ᵢ, B = (bᵢ)ᵢ with
      aᵢ, bᵢ ∈ ℝ^P, let μ(A, B) = max_j Σᵢ |⟨aᵢ, b_j⟩| and
      μ(A) = max_i Σ_{j ≠ i} |⟨aᵢ, a_j⟩|. Define
      w-ERC(I) = μ(Φ_I, Φ_{I^c}) / (1 − μ(Φ_I)) if μ(Φ_I) < 1,
      +∞ otherwise.
      Theorem: for I = supp(s),  F(s) ≤ ERC(I) ≤ w-ERC(I).
  62. Proof: ERC(I) = max_{j ∉ I} ‖Φ_I⁺ φ_j‖₁
      ≤ ‖(Φ_I* Φ_I)⁻¹‖_{1,1} · max_{j ∉ I} ‖Φ_I* φ_j‖₁, and
      max_{j ∉ I} ‖Φ_I* φ_j‖₁ = max_{j ∉ I} Σ_{i ∈ I} |⟨φᵢ, φ_j⟩|
      = μ(Φ_I, Φ_{I^c}).
  63. One has Φ_I* Φ_I = Id − H; if ‖H‖_{1,1} < 1,
      (Φ_I* Φ_I)⁻¹ = (Id − H)⁻¹ = Σ_{k ≥ 0} H^k, so
      ‖(Φ_I* Φ_I)⁻¹‖_{1,1} ≤ Σ_{k ≥ 0} ‖H‖_{1,1}^k = 1/(1 − ‖H‖_{1,1}),
      with ‖H‖_{1,1} = max_{i ∈ I} Σ_{j ≠ i} |⟨φᵢ, φ_j⟩| = μ(Φ_I).
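The chain F(s) ≤ ERC(I) ≤ w-ERC(I) can be checked numerically on a random dictionary. A NumPy sketch (the helper `w_erc` and the instance sizes are illustrative assumptions):

```python
import numpy as np

def w_erc(Phi, I):
    """Weak ERC of [Gribonval, Dossal]: mu(Phi_I, Phi_Ic) / (1 - mu(Phi_I)),
    with mu(A, B) = max_j sum_i |<a_i, b_j>|, mu(A) = max_i sum_{j!=i} |<a_i, a_j>|."""
    Ic = np.setdiff1d(np.arange(Phi.shape[1]), I)
    C = np.abs(Phi[:, I].T @ Phi[:, I])
    mu_in = np.max(np.sum(C, axis=0) - np.diag(C))
    mu_cross = np.max(np.sum(np.abs(Phi[:, I].T @ Phi[:, Ic]), axis=0))
    return mu_cross / (1 - mu_in) if mu_in < 1 else np.inf

# Random unit-norm dictionary; compare the three criteria on one support
rng = np.random.default_rng(3)
P, N, k = 100, 200, 3
Phi = rng.standard_normal((P, N))
Phi /= np.linalg.norm(Phi, axis=0)
I = np.arange(k)
s_I = np.ones(k)
PhiI = Phi[:, I]
dI = PhiI @ np.linalg.solve(PhiI.T @ PhiI, s_I)
F = np.max(np.abs(Phi[:, k:].T @ dI))                                  # Fuchs F(s)
ERC = np.max(np.sum(np.abs(np.linalg.pinv(PhiI) @ Phi[:, k:]), axis=0))  # Tropp
W = w_erc(Phi, I)
```

By the theorem, the three values come out in increasing order.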
  64. Example: random matrix, P = 200, N = 1000. (Figure: empirical
      probability that F < 1, ERC < 1, w-ERC < 1 and that x* = x0, as a
      function of the sparsity ‖x0‖₀.)
  65. Example: deconvolution, Φx = Σᵢ xᵢ φ(· − iΔ). Increasing the
      spacing Δ reduces correlation (F(s), ERC(I) and w-ERC(I) decrease)
      but reduces resolution. (Figure: x0 and Φ x0.)
  66. Coherence bounds. Mutual coherence: μ(Φ) = max_{i ≠ j} |⟨φᵢ, φ_j⟩|.
      Theorem: F(s) ≤ ERC(I) ≤ w-ERC(I) ≤ |I| μ(Φ) / (1 − (|I| − 1) μ(Φ)).
  67. Theorem: if ‖x0‖₀ < (1/2)(1 + 1/μ(Φ)) and λ ∼ ‖w‖, then
      supp(x*) ⊂ I and ‖x0 − x*‖ = O(‖w‖).
  68. One has μ(Φ) ≥ √((N − P)/(P(N − 1))). For Gaussian matrices,
      μ(Φ) ∼ √(log(PN)/P); optimistic setting: ‖x0‖₀ ≤ O(√P). For
      convolution matrices: useless criterion.
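The Gaussian scaling μ(Φ) ∼ √(log(PN)/P) is easy to observe empirically, here with the slides' sizes P = 200, N = 1000 (the rest of the script is an illustrative assumption):

```python
import numpy as np

def coherence(Phi):
    """Mutual coherence mu(Phi) = max_{i != j} |<phi_i, phi_j>| (unit-norm columns)."""
    G = np.abs(Phi.T @ Phi)
    np.fill_diagonal(G, 0.0)
    return G.max()

rng = np.random.default_rng(4)
P, N = 200, 1000
Phi = rng.standard_normal((P, N))
Phi /= np.linalg.norm(Phi, axis=0)   # unit-norm columns
mu = coherence(Phi)                  # of order sqrt(log(P*N)/P), roughly 0.25-0.4 here
```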
  69. Coherence, examples. Incoherent pair of orthobases (Diracs/Fourier):
      Ψ1 = {k ↦ δ[k − m]}_m, Ψ2 = {k ↦ N^(−1/2) e^(2iπmk/N)}_m,
      Φ = [Ψ1, Ψ2] ∈ ℝ^(N×2N).
  70. Then
      min_{x ∈ ℝ^(2N)} (1/2)‖y − Φx‖² + λ‖x‖₁
      = min_{x1, x2 ∈ ℝ^N} (1/2)‖y − Ψ1 x1 − Ψ2 x2‖² + λ‖x1‖₁ + λ‖x2‖₁:
      the decomposition y ≈ Ψ1 x1 + Ψ2 x2 (spikes + sinusoids).
  71. μ(Φ) = 1/√N ⟹ ℓ¹ separates up to ∼ √N/2 Diracs + sines.
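For the Diracs/Fourier dictionary, μ(Φ) = 1/√N can be verified directly. A NumPy sketch (N = 16 is an arbitrary choice):

```python
import numpy as np

N = 16
F = np.fft.fft(np.eye(N)) / np.sqrt(N)          # orthonormal Fourier basis
Phi = np.hstack([np.eye(N, dtype=complex), F])  # Diracs + Fourier, N x 2N
G = np.abs(Phi.conj().T @ Phi)                  # Gram matrix magnitudes
np.fill_diagonal(G, 0.0)
mu = G.max()   # = 1/sqrt(N) = 0.25: only the cross Dirac/Fourier terms contribute
```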
  72. Overview • Polytope Noiseless Recovery • Local Behavior of Sparse

    Regularization • Robustness to Small Noise • Robustness to Bounded Noise • Compressed Sensing RIP Theory
  73. CS with RIP. Restricted Isometry Constants: δ_k is such that, for
      all x with ‖x‖₀ ≤ k,
      (1 − δ_k)‖x‖² ≤ ‖Φx‖² ≤ (1 + δ_k)‖x‖².
      ℓ¹ recovery: x* ∈ argmin_{‖Φx − y‖ ≤ ε} ‖x‖₁, where y = Φ x0 + w,
      ‖w‖ ≤ ε.
  74. Theorem [Candès 2009]: if δ_{2k} ≤ √2 − 1, then
      ‖x0 − x*‖ ≤ (C0/√k) ‖x0 − x_k‖₁ + C1 ε,
      where x_k is the best k-term approximation of x0.
  75. Elements of proof: set h = x* − x0 and partition
      {0, …, N−1} = T0 ∪ T1 ∪ … ∪ Tm, with T0 the k largest entries of
      x0 (so x_k = x_{T0}) and T1, T2, … the successive k largest
      entries of h_{T0^c}. The optimality conditions give
      ‖h_{T0^c}‖₁ ≤ ‖h_{T0}‖₁ + 2‖x_{T0^c}‖₁. Explicit constants:
      C0 = 2(1 + ρ)/(1 − ρ), C1 = 2α/(1 − ρ), with
      ρ = √2 δ_{2k}/(1 − δ_{2k}), α = 2√(1 + δ_{2k})/(1 − δ_{2k}).
      Reference: E. J. Candès, CRAS, 2008.
  76. Singular value distributions: the eigenvalues of Φ_I* Φ_I with
      |I| = k are essentially in [a, b], a = (1 − √β)², b = (1 + √β)²,
      β = k/P. When k = βP and P → +∞, the eigenvalue distribution
      tends to f(λ) = (1/(2πβλ)) √((λ − a)₊ (b − λ)₊) [Marcenko-Pastur];
      large deviation inequality [Ledoux]. (Figure: empirical spectra
      for P = 200 and k = 10, 30, 50, against f.)
  77. RIP for Gaussian matrices. Link with coherence:
      δ_k ≤ (k − 1) μ(Φ), δ_2 = μ(Φ), where
      μ(Φ) = max_{i ≠ j} |⟨φᵢ, φ_j⟩|.
  78. For Gaussian matrices, μ(Φ) ∼ √(log(PN)/P).
  79. Stronger result. Theorem: if k ≤ C P / log(N/P), then
      δ_{2k} ≤ √2 − 1 with high probability.
  80. Numerics with RIP. Stability constants of a matrix A: the smallest
      δ1(A), δ2(A) such that
      (1 − δ1(A))‖α‖² ≤ ‖Aα‖² ≤ (1 + δ2(A))‖α‖²,
      given by the smallest/largest eigenvalues of A*A.
  81. Upper/lower Restricted Isometry Constants:
      δ_k^i = max_{|I| = k} δ_i(Φ_I) for i = 1, 2, combined into the
      constant δ_k. Monte-Carlo estimation: sampling random supports
      |I| = k gives lower bounds δ̂_k^i ≤ δ_k^i.
      (Figure: estimated δ̂_k^1, δ̂_k^2 as functions of k.)
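The Monte-Carlo estimation described above can be sketched in NumPy. The function name, trial count and matrix sizes are illustrative assumptions; the estimate is only a lower bound on δ_k, since finitely many supports are sampled:

```python
import numpy as np

def ric_monte_carlo(Phi, k, n_trials=200, seed=0):
    """Monte-Carlo lower bound on the RIC delta_k: extreme eigenvalues of
    Phi_I^T Phi_I over randomly drawn supports |I| = k."""
    rng = np.random.default_rng(seed)
    N = Phi.shape[1]
    d = 0.0
    for _ in range(n_trials):
        I = rng.choice(N, size=k, replace=False)
        ev = np.linalg.eigvalsh(Phi[:, I].T @ Phi[:, I])
        d = max(d, ev[-1] - 1.0, 1.0 - ev[0])   # deviation of the spectrum from 1
    return d

rng = np.random.default_rng(5)
P, N = 200, 400
Phi = rng.standard_normal((P, N)) / np.sqrt(P)   # E[Phi_I^T Phi_I] = Id
d10 = ric_monte_carlo(Phi, 10)
d50 = ric_monte_carlo(Phi, 50)
```

Consistently with the Marcenko-Pastur picture, the estimate grows with k/P.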
  82. Conclusion. Local behavior: λ ↦ x*_λ is polygonal, y ↦ x* is
      piecewise affine. (Figure: paths for s = 3, 6, 13, 25.)
  83. Noiseless recovery ⟺ geometry of polytopes.
  84. Small noise: sign stability. Bounded noise: support inclusion.
      RIP-based: no support stability, ℓ¹ error bounds.