Example: Regularization
Measurements: $y = K f_0 + w$, with $K : \mathbb{R}^{N_0} \to \mathbb{R}^P$, $P \leq N_0$.
Model: $f_0 = \Psi x_0$ is sparse in a dictionary $\Psi \in \mathbb{R}^{N_0 \times N}$, $N \geq N_0$, where $x_0 \in \mathbb{R}^N$ are the coefficients and $f_0 \in \mathbb{R}^{N_0}$ is the image.
Observations: $y = K f_0 + w = \Phi x_0 + w \in \mathbb{R}^P$, with $\Phi = K \Psi \in \mathbb{R}^{P \times N}$.
Sparse recovery: $f^\star = \Psi x^\star$ where $x^\star$ solves
$$\min_{x \in \mathbb{R}^N} \frac{1}{2}\|y - \Phi x\|^2 + \lambda \|x\|_1 .$$
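The slides do not say how the minimization is carried out. As a minimal sketch, the block below uses iterative soft thresholding (ISTA), a standard solver swapped in here for illustration only. The names `soft_threshold`, `ista`, `energy`, the random Gaussian `Phi`, and the values of `lam` and the noise level are all assumptions, not taken from the slides.

```python
import numpy as np

def soft_threshold(x, t):
    """Entrywise soft thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def energy(Phi, y, lam, x):
    """E(x) = 0.5 * ||Phi x - y||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((Phi @ x - y) ** 2) + lam * np.sum(np.abs(x))

def ista(Phi, y, lam, n_iter=5000):
    """Iterative soft thresholding for min_x E(x), started from x = 0."""
    # Step size 1/L, with L the squared spectral norm of Phi
    # (Lipschitz constant of the gradient of the quadratic term).
    step = 1.0 / np.linalg.norm(Phi, 2) ** 2
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ x - y)
        x = soft_threshold(x - step * grad, step * lam)
    return x

# Hypothetical test problem: random Gaussian Phi, 3-sparse x0, small noise.
rng = np.random.default_rng(0)
P, N = 20, 40
Phi = rng.standard_normal((P, N)) / np.sqrt(P)
x0 = np.zeros(N)
x0[[3, 17, 30]] = [1.5, -2.0, 1.0]
y = Phi @ x0 + 0.01 * rng.standard_normal(P)
x_star = ista(Phi, y, lam=0.05)
```

With the step size $1/L$ the energy does not increase from one iterate to the next, so `x_star` has lower energy than the zero starting point; how close it is to a true minimizer depends on `n_iter`.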
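Variations and Stability
Data: $f_0 = \Psi x_0$. Observations: $y = \Phi x_0 + w$.
$(P_\lambda(y))$: $x^\star \in \operatorname{argmin}_{x \in \mathbb{R}^N} \frac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$
$(P_0(y))$: $x^\star \in \operatorname{argmin}_{\Phi x = y} \|x\|_1$, obtained as $\lambda \to 0^+$ (no noise)
– Behavior of $x^\star$ with respect to $y$ and $\lambda$.
– Criterion to ensure $x^\star = x_0$ when $w = 0$ and $\lambda = 0^+$.
– Criterion to ensure $\|x^\star - x_0\| = O(\|w\|)$.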
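First Order CNS (Necessary and Sufficient) Condition
$x^\star \in \operatorname{argmin}_{x \in \mathbb{R}^N} E(x) = \frac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$
Support of the solution: $I = \{ i \in \{0, \ldots, N-1\} : x^\star_i \neq 0 \}$
First order condition: $x^\star$ is a solution of $P_\lambda(y)$ if and only if
$$\Phi^*(\Phi x^\star - y) + \lambda s = 0 \quad \text{where} \quad s_I = \operatorname{sign}(x^\star_I), \quad \|s_{I^c}\|_\infty \leq 1 .$$
Note: $s_{I^c} = -\frac{1}{\lambda}\Phi_{I^c}^*(\Phi x^\star - y)$.
Theorem: Given that $\Phi_I^*(\Phi x^\star - y) = -\lambda \operatorname{sign}(x^\star_I)$ holds on the support, $x^\star$ is a solution of $P_\lambda(y)$ if and only if $\|\Phi_{I^c}^*(\Phi x^\star - y)\|_\infty \leq \lambda$.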
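The first order condition can be tested numerically for a candidate vector. The sketch below is an illustration under stated assumptions: the helper name `check_first_order`, the tolerance `tol`, and the toy problem with `Phi` equal to the identity (where the solution is the soft thresholding of `y`) are all hypothetical choices.

```python
import numpy as np

def check_first_order(Phi, y, lam, x, tol=1e-8):
    """Check the two parts of the optimality condition of P_lam(y) at x.

    Returns (on_support_ok, off_support_ok):
      on_support_ok : Phi_I^*(Phi x - y) = -lam * sign(x_I) on I = supp(x)
      off_support_ok: ||Phi_{I^c}^*(Phi x - y)||_inf <= lam
    """
    I = np.flatnonzero(x)
    corr = Phi.T @ (Phi @ x - y)  # Phi^*(Phi x - y)
    on_ok = bool(np.allclose(corr[I], -lam * np.sign(x[I]), atol=tol))
    off_ok = bool(np.all(np.abs(np.delete(corr, I)) <= lam + tol))
    return on_ok, off_ok

# Toy problem (assumption): Phi = Id, so the minimizer is soft_threshold(y, lam).
Phi = np.eye(3)
y = np.array([3.0, 0.5, -2.0])
lam = 1.0
x_good = np.array([2.0, 0.0, -1.0])  # soft thresholding of y at level 1
x_bad = y.copy()                     # least-squares solution, not a minimizer
result_good = check_first_order(Phi, y, lam, x_good)
result_bad = check_first_order(Phi, y, lam, x_bad)
```

Here `x_good` passes both checks, while `x_bad` fails the on-support equation because its residual is zero rather than $-\lambda\,\mathrm{sign}(x_I)$.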
Local Parameterization
Restricted to $I$, the first order condition reads $\Phi_I^*(\Phi_I x^\star_I - y) + \lambda s_I = 0$, which gives the implicit equation
$$x^\star_I = \Phi_I^+ y - \lambda (\Phi_I^*\Phi_I)^{-1} s_I, \qquad \Phi_I^+ = (\Phi_I^*\Phi_I)^{-1}\Phi_I^* .$$
Given $y$, compute $x^\star$, then compute $(s, I)$.
Define $\hat x_{\bar\lambda}(\bar y)_I = \Phi_I^+ \bar y - \bar\lambda (\Phi_I^*\Phi_I)^{-1} s_I$ and $\hat x_{\bar\lambda}(\bar y)_{I^c} = 0$. By construction, $\hat x_\lambda(y) = x^\star$.
Theorem: For $(y, \lambda) \notin \mathcal{H}$, let $x^\star$ be a solution of $P_\lambda(y)$ such that $\Phi_I$ is full rank, $I = \operatorname{supp}(x^\star)$. Then for $(\bar\lambda, \bar y)$ close to $(\lambda, y)$, $\hat x_{\bar\lambda}(\bar y)$ is a solution of $P_{\bar\lambda}(\bar y)$.
Remark: the theorem holds outside a union of hyperplanes.
[Figure: regions labeled by the value of $\|x^\star\|_0$ (0, 1, 2).]
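The closed form $\hat x_{\bar\lambda}(\bar y)_I = \Phi_I^+ \bar y - \bar\lambda(\Phi_I^*\Phi_I)^{-1}s_I$ is easy to evaluate once $(s, I)$ is known. A minimal sketch follows; the function name `x_hat` and the identity-matrix example are assumptions made for illustration, and $\Phi_I$ is assumed to have full column rank.

```python
import numpy as np

def x_hat(Phi, y_bar, lam_bar, s):
    """Evaluate the local parameterization for a fixed sign vector s.

    On I = supp(s): Phi_I^+ y_bar - lam_bar * (Phi_I^* Phi_I)^{-1} s_I.
    Outside I the vector is zero. Requires Phi_I to have full column rank.
    """
    I = np.flatnonzero(s)
    Phi_I = Phi[:, I]
    gram = Phi_I.T @ Phi_I
    x = np.zeros(Phi.shape[1])
    # (Phi_I^* Phi_I)^{-1} (Phi_I^* y_bar - lam_bar * s_I), via a linear solve.
    x[I] = np.linalg.solve(gram, Phi_I.T @ y_bar - lam_bar * s[I])
    return x

# Toy problem (assumption): Phi = Id, where P_lam(y) is solved by soft thresholding.
Phi = np.eye(3)
s = np.array([1.0, 0.0, -1.0])  # sign pattern of the solution at (y, lam)
x_at_y = x_hat(Phi, np.array([3.0, 0.5, -2.0]), 1.0, s)
x_nearby = x_hat(Phi, np.array([3.1, 0.4, -2.2]), 0.9, s)
```

For this toy problem the nearby point gives `[2.2, 0, -1.3]`, which is also the soft thresholding of the perturbed data at level 0.9, as the theorem predicts when the sign pattern does not change.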
Full Rank Condition
Lemma: There exists a solution $x^\star$ such that $\ker(\Phi_I) = \{0\}$, where $I = \operatorname{supp}(x^\star)$.
Proof: If $\ker(\Phi_I) \neq \{0\}$, let $\eta \neq 0$ be supported on $I$ with $\eta_I \in \ker(\Phi_I)$.
Define $\forall t \in \mathbb{R}$, $x_t = x^\star + t\eta$. In this case $x^\star$ is not unique.
Let $t_0$ be the smallest $|t|$ such that $\operatorname{sign}(x_t) \neq \operatorname{sign}(x^\star)$.
$\forall |t| < t_0$, $x_t$ is a solution: $\Phi x_t = \Phi x^\star$ and $x_t$ has the same sign as $x^\star$.
By continuity, $x_{t_0}$ is a solution, and $|\operatorname{supp}(x_{t_0})| < |\operatorname{supp}(x^\star)|$.
Repeating the argument until the kernel is trivial proves the lemma.
[Figure: a coordinate of $x_t$ plotted against $t$, reaching $0$ at $t_0$.]
Proof
$\hat x_{\bar\lambda}(\bar y)_I = \Phi_I^+ \bar y - \bar\lambda(\Phi_I^*\Phi_I)^{-1}s_I$, with $I = \operatorname{supp}(s)$.
To show: $\forall j \notin I$, $d^s_j(\bar y, \bar\lambda) = |\langle \varphi_j, \bar y - \Phi_I \hat x_{\bar\lambda}(\bar y)_I \rangle| \leq \bar\lambda$.
Case 1: $d^s_j(y, \lambda) < \lambda$. By continuity, the strict inequality persists for $(\bar y, \bar\lambda)$ close to $(y, \lambda)$: ok.
Case 2: $d^s_j(y, \lambda) = \lambda$ and $\varphi_j \in \operatorname{Im}(\Phi_I)$. The residual is $\bar y - \Phi_I \hat x_{\bar\lambda}(\bar y)_I = (\mathrm{Id} - \Phi_I\Phi_I^+)\bar y + \bar\lambda\,\Phi_I^{+,*}s_I$, and its first term is orthogonal to $\operatorname{Im}(\Phi_I)$, so $d^s_j(\bar y, \bar\lambda) = \bar\lambda\,|\langle \varphi_j, \Phi_I^{+,*}s_I\rangle|$. Equality at $(y, \lambda)$ forces $|\langle \varphi_j, \Phi_I^{+,*}s_I\rangle| = 1$, hence $d^s_j(\bar y, \bar\lambda) = \bar\lambda$: ok.
Case 3: $d^s_j(y, \lambda) = \lambda$ and $\varphi_j \notin \operatorname{Im}(\Phi_I)$: exclude this case.
Exclude hyperplanes: $H_{s,j} = \{ (y, \lambda) : d^s_j(y, \lambda) = \lambda \}$, $\mathcal{H} = \bigcup \{ H_{s,j} : \varphi_j \notin \operatorname{Im}(\Phi_I) \}$.
Projector
$E_\lambda(x) = \frac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$
Proposition: If $x_1$ and $x_2$ minimize $E_\lambda$, then $\Phi x_1 = \Phi x_2$.
Corollary: $\mu(y) = \Phi x_1 = \Phi x_2$ is uniquely defined.
Proof: $x_3 = (x_1 + x_2)/2$ is a solution, and if $\Phi x_1 \neq \Phi x_2$,
$2\|x_3\|_1 \leq \|x_1\|_1 + \|x_2\|_1$
$2\|\Phi x_3 - y\|^2 < \|\Phi x_1 - y\|^2 + \|\Phi x_2 - y\|^2$
$E_\lambda(x_3) < E_\lambda(x_1) = E_\lambda(x_2) \Longrightarrow$ contradiction.
For $(\bar y, \bar\lambda)$ close to $(y, \lambda) \notin \mathcal{H}$: $\mu(\bar y) = P_I(\bar y) - \bar\lambda\, d_I$, where $P_I = \Phi_I\Phi_I^+$ is the orthogonal projector on $\{ \Phi x : \operatorname{supp}(x) = I \}$ and $d_I = \Phi_I^{+,*} s_I$.
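Uniqueness Sufficient Condition
$E_\lambda(x) = \frac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$
Theorem: If $\Phi_I$ has full rank and $\|\Phi_{I^c}^*(\Phi x^\star - y)\|_\infty < \lambda$, then $x^\star$ is the unique minimizer of $E_\lambda$.
Proof: Let $\tilde x^\star$ be a minimizer. Then $\Phi x^\star = \Phi \tilde x^\star$
$\Longrightarrow \|\Phi_{I^c}^*(\Phi \tilde x^\star - y)\|_\infty < \lambda$
$\Longrightarrow \operatorname{supp}(\tilde x^\star) \subset I$
$\Longrightarrow \tilde x^\star_I - x^\star_I \in \ker(\Phi_I) = \{0\}$
$\Longrightarrow \tilde x^\star = x^\star$.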
Robustness to Small Noise
Identifiability criterion [Fuchs]: For $s \in \{-1, 0, +1\}^N$, let $I = \operatorname{supp}(s)$ and
$$F(s) = \|\Phi_{I^c}^*\Phi_I^{+,*}s_I\|_\infty$$
($\Phi_I$ is assumed to have full rank). Here $\Phi_I^+ = (\Phi_I^*\Phi_I)^{-1}\Phi_I^*$ satisfies $\Phi_I^+\Phi_I = \mathrm{Id}_I$.
Theorem: Let $I = \operatorname{supp}(x_0)$ and $T = \min_{i \in I}|x_{0,i}|$. If $F(\operatorname{sign}(x_0)) < 1$, $\|w\|/T$ is small enough and $\lambda \sim \|w\|$, then the vector supported on $I$ with entries
$$x_{0,I} + \Phi_I^+ w - \lambda(\Phi_I^*\Phi_I)^{-1}\operatorname{sign}(x_{0,I})$$
is the unique solution of $P_\lambda(y)$.
$\Longrightarrow$ If $\|w\|$ is small enough, $\|x^\star - x_0\| = O(\|w\|)$.
Geometric Interpretation
$F(s) = \|\Phi_{I^c}^* d_I\|_\infty = \max_{j \notin I} |\langle d_I, \varphi_j \rangle|$
where $d_I = \Phi_I^{+,*}s_I = \Phi_I(\Phi_I^*\Phi_I)^{-1}s_I$ is defined by: $\forall i \in I$, $\langle d_I, \varphi_i \rangle = s_i$.
Condition $F(s) < 1$: no vector $\varphi_j$ inside the cap $C_s$.
[Figure: atoms $\varphi_i$, $\varphi_j$, $\varphi_k$, the vector $d_I$, the cap $C_s$, and the region $|\langle d_I, \varphi \rangle| < 1$.]
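The criterion $F(s) = \max_{j \notin I}|\langle d_I, \varphi_j\rangle|$ with $d_I = \Phi_I(\Phi_I^*\Phi_I)^{-1}s_I$ can be computed directly. The sketch below is illustrative: the function names `dual_vector` and `fuchs_criterion` and the three-atom dictionary in $\mathbb{R}^2$ are assumptions. It assumes $\Phi_I$ has full column rank and $I^c$ is non-empty.

```python
import numpy as np

def dual_vector(Phi, s):
    """d_I = Phi_I (Phi_I^* Phi_I)^{-1} s_I for I = supp(s)."""
    I = np.flatnonzero(s)
    Phi_I = Phi[:, I]
    return Phi_I @ np.linalg.solve(Phi_I.T @ Phi_I, s[I])

def fuchs_criterion(Phi, s):
    """F(s) = max over j outside supp(s) of |<d_I, phi_j>|."""
    I = np.flatnonzero(s)
    Ic = np.setdiff1d(np.arange(Phi.shape[1]), I)
    return float(np.max(np.abs(Phi[:, Ic].T @ dual_vector(Phi, s))))

# Hypothetical dictionary in R^2: phi_0 = e_1, phi_1 = e_2, phi_2 = (1, 1)/sqrt(2).
r = 1.0 / np.sqrt(2.0)
Phi = np.array([[1.0, 0.0, r],
                [0.0, 1.0, r]])
F_one = fuchs_criterion(Phi, np.array([1.0, 0.0, 0.0]))  # I = {0}
F_two = fuchs_criterion(Phi, np.array([1.0, 1.0, 0.0]))  # I = {0, 1}
```

In this example `F_one` is $1/\sqrt{2} < 1$, while `F_two` is $\sqrt{2} > 1$: the atom $\varphi_2$ lies inside the cap defined by $d_I = e_1 + e_2$, so the criterion fails for that sign pattern.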
Sketch of Proof
To prove: $\hat x = \hat x(\operatorname{sign}(x_0))$ is the unique solution of $P_\lambda(y)$, where $\hat x(s)$ is given by the implicit equation
$$\hat x(s)_I = \Phi_I^+ y - \lambda(\Phi_I^*\Phi_I)^{-1}s_I, \qquad I = \operatorname{supp}(s).$$
Conditions to check:
$(C_1)$ $\operatorname{sign}(\hat x) = \operatorname{sign}(x_0)$
$(C_2)$ $\|\Phi_{I^c}^*(\Phi \hat x - y)\|_\infty < \lambda$
$y = \Phi x_0 + w \Longrightarrow \hat x_I = x_{0,I} + \Phi_I^+ w - \lambda(\Phi_I^*\Phi_I)^{-1}s_I$
$\|\Phi_I^+\|_{\infty,2}\|w\| + \lambda\|(\Phi_I^*\Phi_I)^{-1}\|_{\infty,\infty} < T \Longrightarrow (C_1)$
$\|\Phi_{I^c}^*(\Phi_I\Phi_I^+ - \mathrm{Id})\|_{2,\infty}\|w\| - \lambda(1 - F(s)) < 0 \Longrightarrow (C_2)$
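The step from the implicit equation to the sufficient condition for $(C_2)$ can be written out as follows. This is a worked derivation using only the definitions $\hat x_I = x_{0,I} + \Phi_I^+ w - \lambda(\Phi_I^*\Phi_I)^{-1}s_I$, $\Phi_I^{+,*} = \Phi_I(\Phi_I^*\Phi_I)^{-1}$ and $F(s) = \|\Phi_{I^c}^*\Phi_I^{+,*}s_I\|_\infty$; the norm subscripts follow the convention used in the inequality for $(C_2)$.

```latex
\begin{align*}
\Phi \hat x - y
  &= \Phi_I\bigl(x_{0,I} + \Phi_I^+ w - \lambda(\Phi_I^*\Phi_I)^{-1}s_I\bigr)
     - \Phi_I x_{0,I} - w \\
  &= (\Phi_I\Phi_I^+ - \mathrm{Id})\,w - \lambda\,\Phi_I^{+,*}s_I, \\
\|\Phi_{I^c}^*(\Phi \hat x - y)\|_\infty
  &\leq \|\Phi_{I^c}^*(\Phi_I\Phi_I^+ - \mathrm{Id})\|_{2,\infty}\,\|w\|
     + \lambda\,F(s).
\end{align*}
```

The right-hand side is smaller than $\lambda$ exactly when $\|\Phi_{I^c}^*(\Phi_I\Phi_I^+ - \mathrm{Id})\|_{2,\infty}\|w\| - \lambda(1 - F(s)) < 0$, which is the stated inequality.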
Sketch of Proof (cont)
$\|\Phi_I^+\|_{\infty,2}\|w\| + \lambda\|(\Phi_I^*\Phi_I)^{-1}\|_{\infty,\infty} < T$
$\|\Phi_{I^c}^*(\Phi_I\Phi_I^+ - \mathrm{Id})\|_{2,\infty}\|w\| - \lambda(1 - F(s)) < 0$
[Figure: admissible values of $\lambda$ as a function of $\|w\|$, relative to $T$.]
For $\|w\|/T < \rho_{\max}$, one can choose $\lambda/T \sim \|w\|/T$ such that $\hat x$ is the solution of $P_\lambda(y)$.
$\|\hat x - x_0\| \leq \|\Phi_I^+ w\| + \lambda\|(\Phi_I^*\Phi_I)^{-1}\|_{\infty,2} = O(\|w\|)$
$\Longrightarrow \|\hat x - x_0\| = O(\|w\|)$
Robustness to Bounded Noise
Exact Recovery Criterion (ERC): For a support $I \subset \{0, \ldots, N-1\}$ with $\Phi_I$ full rank,
$$\mathrm{ERC}(I) = \|\Psi_I\|_{\infty,\infty} \quad \text{where} \quad \Psi_I = \Phi_{I^c}^*\Phi_I^{+,*}$$
$$= \|\Phi_I^+\Phi_{I^c}\|_{1,1} = \max_{j \in I^c}\|\Phi_I^+\varphi_j\|_1$$
(use $\|(a_j)_j\|_{1,1} = \max_j \|a_j\|_1$). Here $\Psi_I$ denotes this operator, not the dictionary $\Psi$.
Relation with F criterion: $\mathrm{ERC}(I) = \max_{s,\, \operatorname{supp}(s) \subset I} F(s)$
Theorem: If $\mathrm{ERC}(\operatorname{supp}(x_0)) < 1$ and $\lambda \sim \|w\|$, then $x^\star$ is unique, satisfies $\operatorname{supp}(x^\star) \subset \operatorname{supp}(x_0)$, and $\|x_0 - x^\star\| = O(\|w\|)$.
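The formula $\mathrm{ERC}(I) = \max_{j \in I^c}\|\Phi_I^+\varphi_j\|_1$ translates into a few lines of NumPy. This is a sketch under assumptions: the function name `erc` and the small dictionary are hypothetical, $\Phi_I$ is assumed to have full column rank, and $I^c$ is assumed non-empty.

```python
import numpy as np

def erc(Phi, I):
    """ERC(I) = max over j outside I of ||Phi_I^+ phi_j||_1."""
    I = np.asarray(I)
    Ic = np.setdiff1d(np.arange(Phi.shape[1]), I)
    # Each column of coeffs is Phi_I^+ phi_j for one j outside I.
    coeffs = np.linalg.pinv(Phi[:, I]) @ Phi[:, Ic]
    return float(np.max(np.sum(np.abs(coeffs), axis=0)))

# Hypothetical dictionary in R^2: phi_0 = e_1, phi_1 = e_2, phi_2 = (1, 1)/sqrt(2).
r = 1.0 / np.sqrt(2.0)
Phi = np.array([[1.0, 0.0, r],
                [0.0, 1.0, r]])
erc_0 = erc(Phi, [0])      # support {0}
erc_01 = erc(Phi, [0, 1])  # support {0, 1}
erc_02 = erc(Phi, [0, 2])  # support {0, 2}
```

Only the one-element support satisfies $\mathrm{ERC}(I) < 1$ in this example, which is consistent with the dictionary being highly coherent.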
Sketch of Proof
Restricted problem, with $I = \operatorname{supp}(x_0)$: $\hat x \in \operatorname{argmin}_{\operatorname{supp}(x) \subset I} \frac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$
To prove: $\hat x$ is the unique solution of $P_\lambda(y)$.
Implicit equation: $\hat x_I = \Phi_I^+ y - \lambda(\Phi_I^*\Phi_I)^{-1}s_I$
Important: $s = \operatorname{sign}(\hat x)$ is in general not equal to $\operatorname{sign}(x_0)$.
First order conditions: $(C_2)$ $\|\Phi_{I^c}^*(\Phi \hat x - y)\|_\infty < \lambda$
$\|\Phi_{I^c}^*(\Phi_I\Phi_I^+ - \mathrm{Id})\|_{2,\infty}\|w\| - \lambda(1 - F(s)) < 0 \Longrightarrow (C_2)$
Since $s$ is arbitrary: $\mathrm{ERC}(I) < 1 \Longrightarrow F(s) < 1$.
Hence, choosing $\lambda \sim \|w\|$ implies $(C_2)$.
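Weak ERC
For $A = (a_i)_i$, $B = (b_i)_i$, where $a_i, b_i \in \mathbb{R}^P$:
$$\tilde\delta(A, B) = \max_j \sum_i |\langle a_i, b_j \rangle|, \qquad \tilde\delta(A) = \max_j \sum_{i \neq j} |\langle a_i, a_j \rangle| .$$
Weak Exact Recovery Criterion [Gribonval, Dossal]: Denoting $\Phi = (\varphi_i)_{i=0}^{N-1}$ where $\varphi_i \in \mathbb{R}^P$,
$$\text{w-ERC}(I) = \begin{cases} \dfrac{\tilde\delta(\Phi_I, \Phi_{I^c})}{1 - \tilde\delta(\Phi_I)} & \text{if } \tilde\delta(\Phi_I) < 1, \\ +\infty & \text{otherwise.} \end{cases}$$
Theorem: $F(s) \leq \mathrm{ERC}(I) \leq \text{w-ERC}(I)$ (for $I = \operatorname{supp}(s)$).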
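Proof
Theorem: $F(s) \leq \mathrm{ERC}(I) \leq \text{w-ERC}(I)$ (for $I = \operatorname{supp}(s)$).
$\mathrm{ERC}(I) = \max_{j \notin I}\|\Phi_I^+\varphi_j\|_1 \leq \|(\Phi_I^*\Phi_I)^{-1}\|_{1,1} \max_{j \notin I}\|\Phi_I^*\varphi_j\|_1$
$\max_{j \notin I}\|\Phi_I^*\varphi_j\|_1 = \max_{j \notin I}\sum_{i \in I}|\langle \varphi_i, \varphi_j \rangle| = \tilde\delta(\Phi_I, \Phi_{I^c})$
One has $\Phi_I^*\Phi_I = \mathrm{Id} - H$ (for unit-norm atoms, $\|\varphi_i\| = 1$). If $\|H\|_{1,1} < 1$,
$(\Phi_I^*\Phi_I)^{-1} = (\mathrm{Id} - H)^{-1} = \sum_{k \geq 0} H^k$
$\|(\Phi_I^*\Phi_I)^{-1}\|_{1,1} \leq \sum_{k \geq 0}\|H\|_{1,1}^k = \dfrac{1}{1 - \|H\|_{1,1}}$
$\|H\|_{1,1} = \max_{i \in I}\sum_{j \in I,\, j \neq i}|\langle \varphi_i, \varphi_j \rangle| = \tilde\delta(\Phi_I)$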
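The weak criterion only needs absolute inner products between atoms, so it can be read off the Gram matrix. The sketch below is illustrative: the function name `weak_erc` and both small dictionaries are assumptions, and the atoms are assumed to have unit norm as in the proof.

```python
import numpy as np

def weak_erc(Phi, I):
    """w-ERC(I) = delta(Phi_I, Phi_Ic) / (1 - delta(Phi_I)), or +inf.

    delta(Phi_I, Phi_Ic): max over j outside I of sum_{i in I} |<phi_i, phi_j>|
    delta(Phi_I)        : max over j in I of sum_{i in I, i != j} |<phi_i, phi_j>|
    """
    I = np.asarray(I)
    Ic = np.setdiff1d(np.arange(Phi.shape[1]), I)
    G = np.abs(Phi.T @ Phi)  # absolute Gram matrix
    cross = np.max(np.sum(G[np.ix_(I, Ic)], axis=0))
    G_II = G[np.ix_(I, I)]
    inner = np.max(np.sum(G_II, axis=0) - np.diag(G_II))
    return float(cross / (1.0 - inner)) if inner < 1.0 else float("inf")

# Hypothetical dictionary in R^2: phi_0 = e_1, phi_1 = e_2, phi_2 = (1, 1)/sqrt(2).
r = 1.0 / np.sqrt(2.0)
Phi = np.array([[1.0, 0.0, r],
                [0.0, 1.0, r]])
werc_01 = weak_erc(Phi, [0, 1])
werc_02 = weak_erc(Phi, [0, 2])

# Degenerate case (assumption): two identical atoms inside I give delta(Phi_I) = 1.
Phi_dup = np.array([[1.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
werc_dup = weak_erc(Phi_dup, [0, 1])
```

For this small dictionary the weak criterion happens to coincide with $\mathrm{ERC}(I)$ on both supports; in general it is only an upper bound.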
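Mutual coherence: $\mu(\Phi) = \max_{i \neq j}|\langle \varphi_i, \varphi_j \rangle|$
Theorem: $F(s) \leq \mathrm{ERC}(I) \leq \text{w-ERC}(I) \leq \dfrac{|I|\,\mu(\Phi)}{1 - (|I| - 1)\,\mu(\Phi)}$
Theorem: If $\|x_0\|_0 < \dfrac{1}{2}\left(1 + \dfrac{1}{\mu(\Phi)}\right)$ and $\lambda \sim \|w\|$, one has $\operatorname{supp}(x^\star) \subset I$ with $I = \operatorname{supp}(x_0)$, and $\|x_0 - x^\star\| = O(\|w\|)$.
One has: $\mu(\Phi) \geq \sqrt{\dfrac{N - P}{P(N - 1)}}$
Optimistic setting: $\|x_0\|_0 \leq O(\sqrt{P})$
For Gaussian matrices: $\mu(\Phi) \sim \sqrt{\log(PN)/P}$
For convolution matrices: useless criterion.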
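The coherence, the sparsity level it certifies, and the lower bound $\sqrt{(N-P)/(P(N-1))}$ can be compared on a small example. In this sketch the function names and the choice of three unit vectors at 120 degree spacing in $\mathbb{R}^2$ are assumptions made for illustration; columns are normalized inside `mutual_coherence`.

```python
import numpy as np

def mutual_coherence(Phi):
    """mu(Phi) = max over i != j of |<phi_i, phi_j>|, after column normalization."""
    Phi = Phi / np.linalg.norm(Phi, axis=0)
    G = np.abs(Phi.T @ Phi)
    np.fill_diagonal(G, 0.0)  # ignore the diagonal <phi_i, phi_i> = 1
    return float(G.max())

def coherence_sparsity_bound(mu):
    """Sparsity threshold (1 + 1/mu) / 2 from the coherence-based theorem."""
    return 0.5 * (1.0 + 1.0 / mu)

def coherence_lower_bound(P, N):
    """Lower bound sqrt((N - P) / (P (N - 1))) on the coherence of a P x N matrix."""
    return float(np.sqrt((N - P) / (P * (N - 1))))

# Hypothetical example: three unit vectors in R^2 separated by 120 degrees.
angles = np.pi / 2 + 2 * np.pi * np.arange(3) / 3
Phi = np.vstack([np.cos(angles), np.sin(angles)])
mu = mutual_coherence(Phi)
k_max = coherence_sparsity_bound(mu)
lower = coherence_lower_bound(2, 3)
```

This frame attains the lower bound ($\mu = 1/2$), and the theorem then only covers $\|x_0\|_0 < 1.5$, that is 1-sparse vectors, which illustrates how pessimistic coherence-based guarantees are.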
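CS with RIP
$\ell^1$ recovery: $x^\star \in \operatorname{argmin}_{\|\Phi x - y\| \leq \varepsilon} \|x\|_1$ where $y = \Phi x_0 + w$, $\|w\| \leq \varepsilon$,
or, in penalized form, $x^\star \in \operatorname{argmin}_x \frac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$.
Restricted Isometry Constants: $\forall \|x\|_0 \leq k$, $(1 - \delta_k)\|x\|^2 \leq \|\Phi x\|^2 \leq (1 + \delta_k)\|x\|^2$
Theorem [Candes 2009]: If $\delta_{2k} \leq \sqrt{2} - 1$, then
$$\|x_0 - x^\star\| \leq \frac{C_0}{\sqrt{k}}\|x_0 - x_k\|_1 + C_1\varepsilon$$
where $x_k$ is the best $k$-term approximation of $x_0$.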
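Elements of Proof
Reference: E. J. Candès, CRAS, 2006
$\|x_0 - x^\star\| \leq \dfrac{C_0}{\sqrt{k}}\|x_0 - x_k\|_1 + C_1\varepsilon$
Explicit constants (see the reference for the exact statement): $C_0 = \dfrac{2}{1 - \rho}$, $C_1 = \dfrac{\alpha}{1 - \rho}$, $\alpha = \dfrac{2\sqrt{1 + \delta_{2k}}}{1 - \delta_{2k}}$, $\rho = \dfrac{\sqrt{2}\,\delta_{2k}}{1 - \delta_{2k}}$
$h = x^\star - x_0$
$\{0, \ldots, N-1\} = T_0 \cup T_1 \cup \ldots \cup T_m$, where $T_0$ indexes the $k$ largest elements of $x_0$ (so $x_k = x_{T_0}$) and $T_1$ the $k$ largest elements of $h_{T_0^c}$.
Optimality conditions: … $\|_1$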
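RIP for Gaussian Matrices
$\mu(\Phi) = \max_{i \neq j}|\langle \varphi_i, \varphi_j \rangle|$, and $\delta_2 = \mu(\Phi)$.
$\mu(\Phi) \sim \sqrt{\log(PN)/P}$
Stronger result:
Theorem: If $k \leq \dfrac{C}{\log(N/P)}\,P$, then $\delta_{2k} \leq \sqrt{2} - 1$ with high probability.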
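Numerics with RIP
Stability constant of $A$: $(1 - \delta_1(A))\|\alpha\|^2 \leq \|A\alpha\|^2 \leq (1 + \delta_2(A))\|\alpha\|^2$, given by the smallest / largest eigenvalues of $A^*A$.
Upper/lower RIC: $\delta_k^i = \max_{|I| = k}\delta_i(\Phi_I)$, $\delta_k = \max(\delta_k^1, \delta_k^2)$
Monte-Carlo estimation: $\hat\delta_k \leq \delta_k$
[Figure: estimated constant $\hat\delta_k^2$ as a function of $k$.]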
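A Monte-Carlo estimate of the lower and upper restricted isometry constants can be obtained by sampling random supports of size $k$ and taking the extreme eigenvalues of $\Phi_I^*\Phi_I$. The sketch below is one possible implementation, not the procedure used for the slides: the function name `estimate_ric`, the uniform sampling of supports, and the values of `n_trials` and `seed` are assumptions. Since only some supports are visited, the returned values are lower bounds on the true constants.

```python
import numpy as np

def estimate_ric(Phi, k, n_trials=200, seed=0):
    """Monte-Carlo lower estimates of the lower and upper RIC of order k.

    For each random support I with |I| = k:
      lower constant candidate: 1 - smallest eigenvalue of Phi_I^* Phi_I
      upper constant candidate: largest eigenvalue of Phi_I^* Phi_I - 1
    The maximum over the sampled supports is returned for each.
    """
    rng = np.random.default_rng(seed)
    N = Phi.shape[1]
    d_low, d_up = -np.inf, -np.inf
    for _ in range(n_trials):
        I = rng.choice(N, size=k, replace=False)
        eig = np.linalg.eigvalsh(Phi[:, I].T @ Phi[:, I])  # ascending order
        d_low = max(d_low, 1.0 - eig[0])
        d_up = max(d_up, eig[-1] - 1.0)
    return float(d_low), float(d_up)

# Sanity checks on two hypothetical matrices.
ric_identity = estimate_ric(np.eye(4), k=2)  # orthonormal atoms
Phi_dup = np.array([[1.0, 1.0],
                    [0.0, 0.0]])             # two identical atoms
ric_dup = estimate_ric(Phi_dup, k=2, n_trials=5)
```

Orthonormal atoms give constants equal to zero, while two identical atoms give lower and upper constants equal to one, the worst case for $k = 2$.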