Slide 1

Slide 1 text

Sparse Recovery. Gabriel Peyré, www.numerical-tours.com

Slide 2

Slide 2 text

Inverse problem: $K : \mathbb{R}^{N_0} \to \mathbb{R}^P$, $P \leq N_0$; measurements $y = K f_0 + w$. [Figure: an example image $f_0$, the operator $K$, and the measurements $K f_0$.] Example: regularization of the ill-posed inversion $K^{-1}$.

Slide 3

Slide 3 text

Inverse problem: observations $y = K f_0 + w \in \mathbb{R}^P$, where $K : \mathbb{R}^{N_0} \to \mathbb{R}^P$, $P \leq N_0$. Model: $f_0 = \Psi x_0$ is sparse in a dictionary $\Psi \in \mathbb{R}^{N_0 \times N}$, $N \geq N_0$: coefficients $x_0 \in \mathbb{R}^N$, image $f_0 = \Psi x_0 \in \mathbb{R}^{N_0}$; write $\Phi = K \Psi \in \mathbb{R}^{P \times N}$. [Figure: coefficients $x_0$, image $f_0 = \Psi x_0$, operator $K$, measurements $y = K f_0 + w$.]

Slide 4

Slide 4 text

Inverse problem: observations $y = K f_0 + w \in \mathbb{R}^P$, where $K : \mathbb{R}^{N_0} \to \mathbb{R}^P$, $P \leq N_0$. Model: $f_0 = \Psi x_0$ sparse in a dictionary $\Psi \in \mathbb{R}^{N_0 \times N}$, $N \geq N_0$, with $\Phi = K \Psi \in \mathbb{R}^{P \times N}$. Sparse recovery: $f^\star = \Psi x^\star$ where $x^\star$ solves $\min_{x \in \mathbb{R}^N} \tfrac{1}{2}\|y - \Phi x\|^2 + \lambda\|x\|_1$ (data fidelity plus $\ell^1$ regularization).

Slide 5

Slide 5 text

Variations and Stability. Data: $f_0 = \Psi x_0$. Observations: $y = \Phi x_0 + w$. Recovery: $x^\star \in \operatorname{argmin}_{x \in \mathbb{R}^N} \tfrac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$. $(P_\lambda(y))$

Slide 6

Slide 6 text

Variations and Stability. Data: $f_0 = \Psi x_0$. Observations: $y = \Phi x_0 + w$. Recovery: $x^\star \in \operatorname{argmin}_{x \in \mathbb{R}^N} \tfrac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$. $(P_\lambda(y))$ When $\lambda \to 0^+$ (no noise): $x^\star \in \operatorname{argmin}_{\Phi x = y} \|x\|_1$. $(P_0(y))$

Slide 7

Slide 7 text

Variations and Stability. Data: $f_0 = \Psi x_0$. Observations: $y = \Phi x_0 + w$. Recovery: $x^\star \in \operatorname{argmin}_{x \in \mathbb{R}^N} \tfrac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$ $(P_\lambda(y))$; when $\lambda \to 0^+$ (no noise), $x^\star \in \operatorname{argmin}_{\Phi x = y} \|x\|_1$ $(P_0(y))$. Questions: behavior of $x^\star$ with respect to $y$ and $\lambda$; criterion to ensure $\|x^\star - x_0\| = O(\|w\|)$; criterion to ensure $x^\star = x_0$ when $w = 0$ and $\lambda \to 0^+$.

Slide 8

Slide 8 text

Numerical Illustration. $y = \Phi x_0 + w$, $\|x_0\|_0 = s$, $\Phi \in \mathbb{R}^{50 \times 200}$ Gaussian. [Figure: original and recovered coefficients for $s = 3, 6, 13, 25$.] The mapping $\lambda \mapsto x^\star_\lambda$ looks polygonal. If $x_0$ is sparse and $\lambda$ is well chosen, $\operatorname{sign}(x^\star_\lambda) = \operatorname{sign}(x_0)$.
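A minimal sketch of this experiment in Python/NumPy (my own reconstruction, not the original code: ISTA iterations stand in for an exact Lasso solver, and all parameter values are illustrative):

```python
import numpy as np

def ista(Phi, y, lam, n_iter=5000):
    """Minimize 0.5*||Phi x - y||^2 + lam*||x||_1 by proximal gradient (ISTA)."""
    tau = 1.0 / np.linalg.norm(Phi, 2) ** 2          # step size 1/||Phi||^2
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        g = x - tau * Phi.T @ (Phi @ x - y)          # gradient step on fidelity
        x = np.sign(g) * np.maximum(np.abs(g) - tau * lam, 0)  # soft threshold
    return x

rng = np.random.default_rng(0)
P, N, s = 50, 200, 6
Phi = rng.standard_normal((P, N)) / np.sqrt(P)       # Gaussian matrix
x0 = np.zeros(N)
supp = rng.choice(N, s, replace=False)
x0[supp] = rng.choice([-1.0, 1.0], s) * (1 + rng.random(s))
y = Phi @ x0 + 0.02 * rng.standard_normal(P)

x_star = ista(Phi, y, lam=0.05)
# For small s and a well-chosen lam, the sign pattern is typically recovered:
print(np.array_equal(np.sign(x_star), np.sign(x0)))
```

Sweeping `lam` over a grid and plotting the nonzero entries of `x_star` reproduces the polygonal-looking paths of the figure.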

Slide 9

Slide 9 text

Overview
• Polytope Noiseless Recovery
• Local Behavior of Sparse Regularization
• Robustness to Small Noise
• Robustness to Bounded Noise
• Compressed Sensing RIP Theory

Slide 10

Slide 10 text

Polytopes Approach. $\Phi = (\varphi_i)_i \in \mathbb{R}^{2 \times 3}$. $\ell^1$ ball: $B_\alpha = \{x : \|x\|_1 \leq \alpha\}$ with $\alpha = \|x_0\|_1$. Noiseless recovery: $\min_{\Phi x = y} \|x\|_1$. [Figure: the ball $B_\alpha \subset \mathbb{R}^3$, its projection $\Phi(B_\alpha) \subset \mathbb{R}^2$, the atoms $\varphi_1, \varphi_2, \varphi_3$, and the mapping $y \mapsto x^\star(y)$.] Claim: $x_0$ is a solution of $P_0(\Phi x_0)$ ⟺ $\Phi x_0 \in \partial\,\Phi(B_\alpha)$.

Slide 11

Slide 11 text

Polytopes Approach. $\Phi = (\varphi_i)_i \in \mathbb{R}^{2 \times 3}$. $\ell^1$ ball: $B_\alpha = \{x : \|x\|_1 \leq \alpha\}$ with $\alpha = \|x_0\|_1$. Noiseless recovery: $\min_{\Phi x = y} \|x\|_1$. $(P_0(y))$ [Figure: the ball $B_\alpha \subset \mathbb{R}^3$, its projection $\Phi(B_\alpha) \subset \mathbb{R}^2$, and the mapping $y \mapsto x^\star(y)$.] Claim: $x_0$ is a solution of $P_0(\Phi x_0)$ ⟺ $\Phi x_0 \in \partial\,\Phi(B_\alpha)$.

Slide 12

Slide 12 text

Proof of: $x_0$ solution of $P_0(\Phi x_0)$ ⟺ $\Phi x_0 \in \partial\,\Phi(B_\alpha)$. (⟸) Suppose $x_0$ is not a solution; show $\Phi(x_0) \in \operatorname{int}(\Phi B_\alpha)$. There exists $z$ such that $\Phi x_0 = \Phi z$ and $\|z\|_1 = (1-\delta)\|x_0\|_1$, $\delta > 0$. For any $h \in \operatorname{Im}(\Phi)$ such that $\|h\|_1 < \delta\|x_0\|_1/\|\Phi^+\|_{1,1}$: $\Phi(x_0) + h = \Phi(z + \Phi^+ h)$ and $\|z + \Phi^+ h\|_1 \leq \|z\|_1 + \|\Phi^+ h\|_1 \leq (1-\delta)\|x_0\|_1 + \|\Phi^+\|_{1,1}\|h\|_1 < \|x_0\|_1$, so $\Phi(x_0) + h \in \Phi(B_\alpha)$.

Slide 13

Slide 13 text

Proof of: $x_0$ solution of $P_0(\Phi x_0)$ ⟺ $\Phi x_0 \in \partial\,\Phi(B_\alpha)$. (⟸) Suppose $x_0$ is not a solution; show $\Phi(x_0) \in \operatorname{int}(\Phi B_\alpha)$. There exists $z$ such that $\Phi x_0 = \Phi z$ and $\|z\|_1 = (1-\delta)\|x_0\|_1$, $\delta > 0$. For any $h \in \operatorname{Im}(\Phi)$ with $\|h\|_1 < \delta\|x_0\|_1/\|\Phi^+\|_{1,1}$: $\Phi(x_0) + h = \Phi(z + \Phi^+ h)$ with $\|z + \Phi^+ h\|_1 \leq (1-\delta)\|x_0\|_1 + \|\Phi^+\|_{1,1}\|h\|_1 < \|x_0\|_1$, so $\Phi(x_0) + h \in \Phi(B_\alpha)$. (⟹) Suppose $\Phi(x_0) \in \operatorname{int}(\Phi B_\alpha)$. Then there exist $\varepsilon > 0$ and $z$ with $\Phi x_0 = (1-\varepsilon)\Phi z$ and $\|z\|_1 \leq \|x_0\|_1$; hence $\Phi((1-\varepsilon)z) = \Phi x_0$ and $\|(1-\varepsilon)z\|_1 < \|x_0\|_1$, so $x_0$ is not a solution. [Figure: $\Phi x_0$ in the interior of $\Phi(B_\alpha)$.]

Slide 14

Slide 14 text

Basis-Pursuit Mapping in 2-D. $\Phi = (\varphi_i)_i \in \mathbb{R}^{2 \times 3}$. Quadrants $K_s = \{(\lambda_i s_i)_i \in \mathbb{R}^3 : \lambda_i \geq 0\}$ and their images, the 2-D cones $C_s = \Phi K_s$. [Figure: the mapping $y \mapsto x^\star(y)$; the quadrant $K_{(0,1,1)}$ and the cone $C_{(0,1,1)}$.]

Slide 15

Slide 15 text

Basis-Pursuit Mapping in 3-D. $\Phi = (\varphi_i)_i \in \mathbb{R}^{3 \times N}$. Empty spherical caps property: the cones $C_s$ induce a Delaunay paving of the sphere with spherical triangles. [Figure: the mapping $y \mapsto x^\star(y)$; atoms $\varphi_i, \varphi_j, \varphi_k$ on the sphere and the cone $C_s$.]

Slide 16

Slide 16 text

Polytope Noiseless Recovery. Counting faces of random polytopes [Donoho]: All $x_0$ such that $\|x_0\|_0 \leq C_{\text{all}}(P/N)\,P$ are identifiable, with $C_{\text{all}}(1/4) \approx 0.065$. Most $x_0$ such that $\|x_0\|_0 \leq C_{\text{most}}(P/N)\,P$ are identifiable, with $C_{\text{most}}(1/4) \approx 0.25$. Sharp constants, but no noise robustness. [Figure: empirical phase transitions, curves labeled "All", "Most" and "RIP".]

Slide 17

Slide 17 text

Overview
• Polytope Noiseless Recovery
• Local Behavior of Sparse Regularization
• Robustness to Small Noise
• Robustness to Bounded Noise
• Compressed Sensing RIP Theory

Slide 18

Slide 18 text

First Order Condition (necessary and sufficient). $x^\star \in \operatorname{argmin}_{x \in \mathbb{R}^N} E(x) = \tfrac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$. Support of the solution: $I = \{i \in \{0,\ldots,N-1\} : x^\star_i \neq 0\}$. First order condition: $x^\star$ solution of $P_\lambda(y)$ ⟺ $0 \in \partial E(x^\star)$ ⟺ $\Phi^*(\Phi x^\star - y) + \lambda s = 0$, where $s_I = \operatorname{sign}(x^\star_I)$ and $\|s_{I^c}\|_\infty \leq 1$.

Slide 19

Slide 19 text

First Order Condition (necessary and sufficient). $x^\star \in \operatorname{argmin}_{x \in \mathbb{R}^N} E(x) = \tfrac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$, with support $I = \{i \in \{0,\ldots,N-1\} : x^\star_i \neq 0\}$. First order condition: $x^\star$ solution of $P_\lambda(y)$ ⟺ $0 \in \partial E(x^\star)$ ⟺ $\Phi^*(\Phi x^\star - y) + \lambda s = 0$, where $s_I = \operatorname{sign}(x^\star_I)$, $\|s_{I^c}\|_\infty \leq 1$. Note: $s_{I^c} = \tfrac{1}{\lambda}\Phi_{I^c}^*(y - \Phi x^\star)$. Theorem: $x^\star$ is a solution of $P_\lambda(y)$ ⟺ $\|\Phi_{I^c}^*(\Phi x^\star - y)\|_\infty \leq \lambda$.

Slide 20

Slide 20 text

Local Parameterization. If $\Phi_I$ has full rank, the first order condition $\Phi^*(\Phi x^\star - y) + \lambda s = 0$ gives the implicit equation $x^\star_I = \Phi_I^+ y - \lambda(\Phi_I^*\Phi_I)^{-1}s_I$, where $\Phi_I^+ = (\Phi_I^*\Phi_I)^{-1}\Phi_I^*$.

Slide 21

Slide 21 text

Local Parameterization. If $\Phi_I$ has full rank: implicit equation $x^\star_I = \Phi_I^+ y - \lambda(\Phi_I^*\Phi_I)^{-1}s_I$, with $\Phi_I^+ = (\Phi_I^*\Phi_I)^{-1}\Phi_I^*$. Given $y$, compute $x^\star$, then compute $(s, I)$. Define $\hat x_{\bar\lambda}(\bar y)_I = \Phi_I^+ \bar y - \bar\lambda(\Phi_I^*\Phi_I)^{-1}s_I$ and $\hat x_{\bar\lambda}(\bar y)_{I^c} = 0$. By construction, $\hat x_\lambda(y) = x^\star$.

Slide 22

Slide 22 text

Local Parameterization. If $\Phi_I$ has full rank: implicit equation $x^\star_I = \Phi_I^+ y - \lambda(\Phi_I^*\Phi_I)^{-1}s_I$, with $\Phi_I^+ = (\Phi_I^*\Phi_I)^{-1}\Phi_I^*$. Given $y$, compute $x^\star$, then $(s, I)$; define $\hat x_{\bar\lambda}(\bar y)_I = \Phi_I^+ \bar y - \bar\lambda(\Phi_I^*\Phi_I)^{-1}s_I$ and $\hat x_{\bar\lambda}(\bar y)_{I^c} = 0$, so that $\hat x_\lambda(y) = x^\star$. Theorem: for $(y, \lambda) \notin H$, let $x^\star$ be a solution of $P_\lambda(y)$ such that $\Phi_I$ is full rank, $I = \operatorname{supp}(x^\star)$; then for $(\bar\lambda, \bar y)$ close to $(\lambda, y)$, $\hat x_{\bar\lambda}(\bar y)$ is a solution of $P_{\bar\lambda}(\bar y)$. Remark: the theorem holds outside $H$, a union of hyperplanes. [Figure: the $(y, \lambda)$ domain partitioned into regions on which $(s, I)$ is constant, labeled by $\|x^\star_\lambda\|_0 = 0, 1, 2$.]
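A small self-contained check of this parameterization (a sketch under assumptions: noiseless $y = \Phi x_0$, a hand-picked support, and my own helper name `x_hat`); it builds $\hat x_{\bar\lambda}(\bar y)$ from the implicit equation and verifies the first-order conditions at nearby $(\bar y, \bar\lambda)$:

```python
import numpy as np

rng = np.random.default_rng(1)
P, N = 50, 200
Phi = rng.standard_normal((P, N)) / np.sqrt(P)
I = np.array([3, 40, 99])                      # support of x0
s_I = np.array([1.0, -1.0, 1.0])               # sign pattern on I
x0 = np.zeros(N); x0[I] = 2.0 * s_I
y, lam = Phi @ x0, 0.1

PhiI = Phi[:, I]
G_inv = np.linalg.inv(PhiI.T @ PhiI)           # (Phi_I^* Phi_I)^{-1}

def x_hat(y_bar, lam_bar):
    """Candidate solution from the implicit equation, zero outside I."""
    x = np.zeros(N)
    x[I] = G_inv @ (PhiI.T @ y_bar) - lam_bar * (G_inv @ s_I)
    return x

# hat-x solves P_lam(y_bar) iff residual correlations outside I stay <= lam_bar
# and the signs on I agree with s_I (true here for generic small perturbations).
for y_bar, lam_bar in [(y, lam), (y + 0.01 * rng.standard_normal(P), 1.05 * lam)]:
    x = x_hat(y_bar, lam_bar)
    corr = np.abs(Phi.T @ (y_bar - Phi @ x))
    print(np.delete(corr, I).max() <= lam_bar,
          np.array_equal(np.sign(x[I]), s_I))
```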

Slide 23

Slide 23 text

Full Rank Condition. If $\ker(\Phi_I) \neq \{0\}$, $x^\star$ is not unique. Lemma: There exists a solution $x^\star$ such that $\ker(\Phi_I) = \{0\}$.

Slide 24

Slide 24 text

Full Rank Condition. If $\ker(\Phi_I) \neq \{0\}$, $x^\star$ is not unique. Lemma: There exists a solution $x^\star$ such that $\ker(\Phi_I) = \{0\}$. Proof: If $\ker(\Phi_I) \neq \{0\}$, pick $\eta_I \in \ker(\Phi_I)$, $\eta_I \neq 0$ (and $\eta_{I^c} = 0$), and define, for $t \in \mathbb{R}$, $x_t = x^\star + t\eta$.

Slide 25

Slide 25 text

Full Rank Condition. If $\ker(\Phi_I) \neq \{0\}$, $x^\star$ is not unique. Lemma: There exists a solution $x^\star$ such that $\ker(\Phi_I) = \{0\}$. Proof: If $\ker(\Phi_I) \neq \{0\}$, pick $\eta_I \in \ker(\Phi_I)$, $\eta_I \neq 0$, and define $x_t = x^\star + t\eta$ for $t \in \mathbb{R}$. Let $t_0$ be the smallest $|t|$ such that $\operatorname{sign}(x_{t_0}) \neq \operatorname{sign}(x^\star)$. [Figure: the segment $t \mapsto x_t$, with $x_{t_0}$ hitting a coordinate hyperplane.]

Slide 26

Slide 26 text

Full Rank Condition. If $\ker(\Phi_I) \neq \{0\}$, $x^\star$ is not unique. Lemma: There exists a solution $x^\star$ such that $\ker(\Phi_I) = \{0\}$. Proof: If $\ker(\Phi_I) \neq \{0\}$, pick $\eta_I \in \ker(\Phi_I)$, $\eta_I \neq 0$, and define $x_t = x^\star + t\eta$. Let $t_0$ be the smallest $|t|$ such that $\operatorname{sign}(x_{t_0}) \neq \operatorname{sign}(x^\star)$. For all $|t| < t_0$, $\Phi x_t = \Phi x^\star$ and $x_t$ has the same sign as $x^\star$, so $E(x_t)$ is affine in $t$; minimality of $x^\star$ forces it to be constant, hence $x_t$ is a solution.

Slide 27

Slide 27 text

Full Rank Condition. If $\ker(\Phi_I) \neq \{0\}$, $x^\star$ is not unique. Lemma: There exists a solution $x^\star$ such that $\ker(\Phi_I) = \{0\}$. Proof: If $\ker(\Phi_I) \neq \{0\}$, pick $\eta_I \in \ker(\Phi_I)$, $\eta_I \neq 0$, and define $x_t = x^\star + t\eta$. Let $t_0$ be the smallest $|t|$ such that $\operatorname{sign}(x_{t_0}) \neq \operatorname{sign}(x^\star)$. For all $|t| < t_0$, $\Phi x_t = \Phi x^\star$ with the same sign, so $x_t$ is a solution. By continuity, $x_{t_0}$ is a solution, and $|\operatorname{supp}(x_{t_0})| < |\operatorname{supp}(x^\star)|$; iterating this support reduction yields a solution whose $\Phi_I$ has trivial kernel.

Slide 28

Slide 28 text

Proof (of the local parameterization theorem). Let $I = \operatorname{supp}(s)$ and $\hat x_{\bar\lambda}(\bar y)_I = \Phi_I^+ \bar y - \bar\lambda(\Phi_I^*\Phi_I)^{-1}s_I$. To show: for all $j \notin I$, $d^s_j(\bar y, \bar\lambda) = |\langle \varphi_j, \bar y - \Phi_I\,\hat x_{\bar\lambda}(\bar y)_I\rangle| \leq \bar\lambda$.

Slide 29

Slide 29 text

Proof. Let $I = \operatorname{supp}(s)$ and $\hat x_{\bar\lambda}(\bar y)_I = \Phi_I^+ \bar y - \bar\lambda(\Phi_I^*\Phi_I)^{-1}s_I$. To show: for all $j \notin I$, $d^s_j(\bar y, \bar\lambda) = |\langle \varphi_j, \bar y - \Phi_I\,\hat x_{\bar\lambda}(\bar y)_I\rangle| \leq \bar\lambda$. Case 1: $d^s_j(y, \lambda) < \lambda$: OK by continuity.

Slide 30

Slide 30 text

Proof. Let $I = \operatorname{supp}(s)$ and $\hat x_{\bar\lambda}(\bar y)_I = \Phi_I^+ \bar y - \bar\lambda(\Phi_I^*\Phi_I)^{-1}s_I$. To show: for all $j \notin I$, $d^s_j(\bar y, \bar\lambda) = |\langle \varphi_j, \bar y - \Phi_I\,\hat x_{\bar\lambda}(\bar y)_I\rangle| \leq \bar\lambda$. Case 1: $d^s_j(y, \lambda) < \lambda$: OK by continuity. Case 2: $d^s_j(y, \lambda) = \lambda$ and $\varphi_j \in \operatorname{Im}(\Phi_I)$: then $d^s_j(\bar y, \bar\lambda) = \bar\lambda$, OK.

Slide 31

Slide 31 text

Proof. Let $I = \operatorname{supp}(s)$ and $\hat x_{\bar\lambda}(\bar y)_I = \Phi_I^+ \bar y - \bar\lambda(\Phi_I^*\Phi_I)^{-1}s_I$. To show: for all $j \notin I$, $d^s_j(\bar y, \bar\lambda) = |\langle \varphi_j, \bar y - \Phi_I\,\hat x_{\bar\lambda}(\bar y)_I\rangle| \leq \bar\lambda$. Case 1: $d^s_j(y, \lambda) < \lambda$: OK by continuity. Case 2: $d^s_j(y, \lambda) = \lambda$ and $\varphi_j \in \operatorname{Im}(\Phi_I)$: then $d^s_j(\bar y, \bar\lambda) = \bar\lambda$, OK. Case 3: $d^s_j(y, \lambda) = \lambda$ and $\varphi_j \notin \operatorname{Im}(\Phi_I)$: exclude this case.

Slide 32

Slide 32 text

Proof. Let $I = \operatorname{supp}(s)$ and $\hat x_{\bar\lambda}(\bar y)_I = \Phi_I^+ \bar y - \bar\lambda(\Phi_I^*\Phi_I)^{-1}s_I$. To show: for all $j \notin I$, $d^s_j(\bar y, \bar\lambda) = |\langle \varphi_j, \bar y - \Phi_I\,\hat x_{\bar\lambda}(\bar y)_I\rangle| \leq \bar\lambda$. Case 1: $d^s_j(y, \lambda) < \lambda$: OK by continuity. Case 2: $d^s_j(y, \lambda) = \lambda$ and $\varphi_j \in \operatorname{Im}(\Phi_I)$: then $d^s_j(\bar y, \bar\lambda) = \bar\lambda$, OK. Case 3: $d^s_j(y, \lambda) = \lambda$ and $\varphi_j \notin \operatorname{Im}(\Phi_I)$: exclude this case. Exclude hyperplanes: $H_{s,j} = \{(y, \lambda) : d^s_j(y, \lambda) = \lambda\}$ and $H = \bigcup\,\{H_{s,j} : \varphi_j \notin \operatorname{Im}(\Phi_I)\}$.

Slide 33

Slide 33 text

Proof (continued). Cases as before; exclude the hyperplanes $H_{s,j} = \{(y, \lambda) : d^s_j(y, \lambda) = \lambda\}$ with $\varphi_j \notin \operatorname{Im}(\Phi_I)$, $H = \bigcup\,H_{s,j}$. [Figure: the hyperplane $H_{\emptyset,j}$ bounding the region where $x^\star = 0$.]

Slide 34

Slide 34 text

Proof (continued). Cases as before; exclude the hyperplanes $H_{s,j} = \{(y, \lambda) : d^s_j(y, \lambda) = \lambda\}$ with $\varphi_j \notin \operatorname{Im}(\Phi_I)$, $H = \bigcup\,H_{s,j}$. [Figure: the hyperplanes $H_{\emptyset,j}$ (bounding the region where $x^\star = 0$) and $H_{I,j}$ in the $(y, \lambda)$ domain.]

Slide 35

Slide 35 text

Local Affine Maps. Local parameterization: $\hat x_{\bar\lambda}(\bar y)_I = \Phi_I^+ \bar y - \bar\lambda(\Phi_I^*\Phi_I)^{-1}s_I$. Under a uniqueness assumption, $y \mapsto x^\star$ and $\lambda \mapsto x^\star_\lambda$ are piecewise affine functions. [Figure: the path $\lambda \mapsto x^\star_\lambda$ in the $(x_1, x_2)$ plane, from $x^\star_\lambda = 0$ for large $\lambda$ down to $x_0$ (the BP solution) as $\lambda \to 0^+$; breaking points correspond to changes of the support of $x^\star_\lambda$.]

Slide 36

Slide 36 text

Projector. Proposition: If $x^\star_1$ and $x^\star_2$ both minimize $E(x) = \tfrac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$, then $\Phi x^\star_1 = \Phi x^\star_2$. Corollary: $\mu(y) = \Phi x^\star_1 = \Phi x^\star_2$ is uniquely defined.

Slide 37

Slide 37 text

Projector. Proposition: If $x^\star_1$ and $x^\star_2$ both minimize $E(x) = \tfrac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$, then $\Phi x^\star_1 = \Phi x^\star_2$. Corollary: $\mu(y) = \Phi x^\star_1 = \Phi x^\star_2$ is uniquely defined. Proof: $x_3 = (x^\star_1 + x^\star_2)/2$ satisfies $2\|x_3\|_1 \leq \|x^\star_1\|_1 + \|x^\star_2\|_1$, and if $\Phi x^\star_1 \neq \Phi x^\star_2$, strict convexity gives $2\|\Phi x_3 - y\|^2 < \|\Phi x^\star_1 - y\|^2 + \|\Phi x^\star_2 - y\|^2$, hence $E(x_3) < E(x^\star_1) = E(x^\star_2)$: contradiction.

Slide 38

Slide 38 text

Projector. Proposition: If $x^\star_1$ and $x^\star_2$ both minimize $E$, then $\Phi x^\star_1 = \Phi x^\star_2$, so $\mu(y) = \Phi x^\star_1 = \Phi x^\star_2$ is uniquely defined (proof as above, by strict convexity of the fidelity). For $(\bar y, \bar\lambda)$ close to $(y, \lambda) \notin H$: $\mu(\bar y) = P_I(\bar y) - \bar\lambda\,d_I$, where $P_I = \Phi_I\Phi_I^+$ is the orthogonal projector onto $\Phi\,\{x : \operatorname{supp}(x) = I\}$ and $d_I = \Phi_I^{+,*}s_I$.

Slide 39

Slide 39 text

Overview
• Polytope Noiseless Recovery
• Local Behavior of Sparse Regularization
• Robustness to Small Noise
• Robustness to Bounded Noise
• Compressed Sensing RIP Theory

Slide 40

Slide 40 text

Uniqueness Sufficient Condition. $E(x) = \tfrac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$.

Slide 41

Slide 41 text

Uniqueness Sufficient Condition. $E(x) = \tfrac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$. Theorem: If $\Phi_I$ has full rank and $\|\Phi_{I^c}^*(\Phi x^\star - y)\|_\infty < \lambda$, then $x^\star$ is the unique minimizer of $E$.

Slide 42

Slide 42 text

Uniqueness Sufficient Condition. $E(x) = \tfrac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$. Theorem: If $\Phi_I$ has full rank and $\|\Phi_{I^c}^*(\Phi x^\star - y)\|_\infty < \lambda$, then $x^\star$ is the unique minimizer of $E$. Proof: Let $\tilde x^\star$ be a minimizer. Then $\Phi x^\star = \Phi\tilde x^\star$, so $\|\Phi_{I^c}^*(\Phi\tilde x^\star - y)\|_\infty = \|\Phi_{I^c}^*(\Phi x^\star - y)\|_\infty < \lambda$ ⟹ $\operatorname{supp}(\tilde x^\star) \subset I$ ⟹ $\tilde x^\star_I - x^\star_I \in \ker(\Phi_I) = \{0\}$ ⟹ $\tilde x^\star = x^\star$.

Slide 43

Slide 43 text

Robustness to Small Noise. For $s \in \{-1, 0, +1\}^N$, let $I = \operatorname{supp}(s)$ ($\Phi_I$ is assumed to have full rank); $\Phi_I^+ = (\Phi_I^*\Phi_I)^{-1}\Phi_I^*$ satisfies $\Phi_I^+\Phi_I = \operatorname{Id}_I$. Identifiability criterion [Fuchs]: $F(s) = \|\Psi_I\,s_I\|_\infty$, where $\Psi_I = \Phi_{I^c}^*\,\Phi_I^{+,*}$.

Slide 44

Slide 44 text

Robustness to Small Noise. For $s \in \{-1, 0, +1\}^N$, let $I = \operatorname{supp}(s)$ ($\Phi_I$ is assumed to have full rank). Identifiability criterion [Fuchs]: $F(s) = \|\Psi_I\,s_I\|_\infty$, where $\Psi_I = \Phi_{I^c}^*\,\Phi_I^{+,*}$. Theorem: If $F(\operatorname{sign}(x_0)) < 1$, then for $\|w\|/T$ small enough (where $T = \min_{i \in I}|x_{0,i}|$) and $\lambda \sim \|w\|$, $x^\star = x_0 + \Phi_I^+ w - \lambda(\Phi_I^*\Phi_I)^{-1}\operatorname{sign}(x_{0,I})$ is the unique solution of $P_\lambda(y)$; in particular, if $\|w\|$ is small enough, $\|x^\star - x_0\| = O(\|w\|)$.
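The Fuchs criterion is directly computable; a minimal sketch (the helper name `fuchs` is mine, and whether $F < 1$ holds for a given draw is random):

```python
import numpy as np

def fuchs(Phi, s):
    """F(s) = || Phi_{I^c}^* Phi_I^{+,*} s_I ||_inf with I = supp(s)."""
    I = np.flatnonzero(s)
    Ic = np.setdiff1d(np.arange(Phi.shape[1]), I)
    d_I = np.linalg.pinv(Phi[:, I]).T @ s[I]    # d_I = Phi_I^{+,*} s_I
    return np.abs(Phi[:, Ic].T @ d_I).max()

rng = np.random.default_rng(2)
P, N = 50, 200
Phi = rng.standard_normal((P, N)) / np.sqrt(P)
for card in (3, 6, 13, 25):
    s = np.zeros(N)
    supp = rng.choice(N, card, replace=False)
    s[supp] = rng.choice([-1.0, 1.0], card)
    print(card, fuchs(Phi, s))   # F(s) < 1: identifiable, robust to small noise
```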

Slide 45

Slide 45 text

Geometric Interpretation. $F(s) = \|\Psi_I\,s_I\|_\infty = \max_{j \notin I}\,|\langle d_I, \varphi_j\rangle|$, where $d_I = \Phi_I^{+,*}s_I = \Phi_I(\Phi_I^*\Phi_I)^{-1}s_I$ is defined by: $\forall i \in I$, $\langle d_I, \varphi_i\rangle = s_i$, with $d_I \in \operatorname{Im}(\Phi_I)$. [Figure: the vector $d_I$ and atoms $\varphi_i$, $\varphi_j$.]

Slide 46

Slide 46 text

Geometric Interpretation. $F(s) = \|\Psi_I\,s_I\|_\infty = \max_{j \notin I}\,|\langle d_I, \varphi_j\rangle|$, with $d_I = \Phi_I^{+,*}s_I = \Phi_I(\Phi_I^*\Phi_I)^{-1}s_I$ and $\langle d_I, \varphi_i\rangle = s_i$ for $i \in I$. Condition $F(s) < 1$: no atom $\varphi_j$, $j \notin I$, inside the cap $C_s$, i.e. $|\langle d_I, \varphi_j\rangle| < 1$. [Figure: the cap $C_s$ around $d_I$ and atoms $\varphi_i$, $\varphi_j$.]

Slide 47

Slide 47 text

Geometric Interpretation. $F(s) = \|\Psi_I\,s_I\|_\infty = \max_{j \notin I}\,|\langle d_I, \varphi_j\rangle|$, with $d_I = \Phi_I^{+,*}s_I = \Phi_I(\Phi_I^*\Phi_I)^{-1}s_I$ and $\langle d_I, \varphi_i\rangle = s_i$ for $i \in I$. Condition $F(s) < 1$: no atom $\varphi_j$, $j \notin I$, inside the cap $C_s$, i.e. $|\langle d_I, \varphi_j\rangle| < 1$. [Figure: two configurations of atoms $\varphi_i, \varphi_j, \varphi_k$ relative to the cap $C_s$ around $d_I$.]

Slide 48

Slide 48 text

Sketch of Proof. Local candidate: $x^\star_\lambda = \hat x_\lambda(\operatorname{sign}(x^\star_\lambda))$, where (implicit equation) $\hat x_\lambda(s)_I = \Phi_I^+ y - \lambda(\Phi_I^*\Phi_I)^{-1}s_I$, $I = \operatorname{supp}(s)$. To prove: $\hat x_\lambda = \hat x_\lambda(\operatorname{sign}(x_0))$ is the unique solution of $P_\lambda(y)$.

Slide 49

Slide 49 text

Sketch of Proof. Local candidate: $x^\star_\lambda = \hat x_\lambda(\operatorname{sign}(x^\star_\lambda))$, where $\hat x_\lambda(s)_I = \Phi_I^+ y - \lambda(\Phi_I^*\Phi_I)^{-1}s_I$, $I = \operatorname{supp}(s)$. To prove: $\hat x_\lambda = \hat x_\lambda(\operatorname{sign}(x_0))$ is the unique solution of $P_\lambda(y)$. Sign consistency $(C_1)$: $\operatorname{sign}(\hat x_\lambda) = \operatorname{sign}(x_0)$. Since $y = \Phi x_0 + w$, $\hat x_\lambda = x_0 + \Phi_I^+ w - \lambda(\Phi_I^*\Phi_I)^{-1}s_I$, and $\|\Phi_I^+\|_{\infty,2}\,\|w\| + \lambda\,\|(\Phi_I^*\Phi_I)^{-1}\|_{\infty,\infty} < T$ ⟹ $(C_1)$.

Slide 50

Slide 50 text

Sketch of Proof. Local candidate: $x^\star_\lambda = \hat x_\lambda(\operatorname{sign}(x^\star_\lambda))$, where $\hat x_\lambda(s)_I = \Phi_I^+ y - \lambda(\Phi_I^*\Phi_I)^{-1}s_I$, $I = \operatorname{supp}(s)$. To prove: $\hat x_\lambda = \hat x_\lambda(\operatorname{sign}(x_0))$ is the unique solution of $P_\lambda(y)$. Sign consistency $(C_1)$: $\operatorname{sign}(\hat x_\lambda) = \operatorname{sign}(x_0)$; since $\hat x_\lambda = x_0 + \Phi_I^+ w - \lambda(\Phi_I^*\Phi_I)^{-1}s_I$, $\|\Phi_I^+\|_{\infty,2}\,\|w\| + \lambda\,\|(\Phi_I^*\Phi_I)^{-1}\|_{\infty,\infty} < T$ ⟹ $(C_1)$. First order conditions $(C_2)$: $\|\Phi_{I^c}^*(\Phi\hat x_\lambda - y)\|_\infty < \lambda$; $\|\Phi_{I^c}^*(\Phi_I\Phi_I^+ - \operatorname{Id})\|_{2,\infty}\,\|w\| - \lambda\,(1 - F(s)) < 0$ ⟹ $(C_2)$.

Slide 51

Slide 51 text

Sketch of Proof (continued). $\|\Phi_I^+\|_{\infty,2}\,\|w\| + \lambda\,\|(\Phi_I^*\Phi_I)^{-1}\|_{\infty,\infty} < T$ and $\|\Phi_{I^c}^*(\Phi_I\Phi_I^+ - \operatorname{Id})\|_{2,\infty}\,\|w\| - \lambda\,(1 - F(s)) < 0$ ⟹ $\hat x_\lambda$ is the solution.

Slide 52

Slide 52 text

Sketch of Proof (continued). $\|\Phi_I^+\|_{\infty,2}\,\|w\| + \lambda\,\|(\Phi_I^*\Phi_I)^{-1}\|_{\infty,\infty} < T$ and $\|\Phi_{I^c}^*(\Phi_I\Phi_I^+ - \operatorname{Id})\|_{2,\infty}\,\|w\| - \lambda\,(1 - F(s)) < 0$ ⟹ $\hat x_\lambda$ is the solution. For $\|w\|/T < \varepsilon_{\max}$, one can choose $\lambda \propto \|w\|/T$ such that both conditions hold, so that $\hat x_\lambda$ is the solution of $P_\lambda(y)$. [Figure: the admissible region for $(\|w\|, \lambda)$, delimited by the two linear constraints.]

Slide 53

Slide 53 text

Sketch of Proof (continued). The two conditions above ⟹ $\hat x_\lambda$ is the solution; for $\|w\|/T < \varepsilon_{\max}$, one can choose $\lambda \propto \|w\|/T$ such that both hold. Then $\|\hat x_\lambda - x_0\| \leq \|\Phi_I^+ w\| + \lambda\,\|(\Phi_I^*\Phi_I)^{-1}\|_{\infty,2} = O(\|w\|)$ since $\lambda \sim \|w\|$, hence $\|\hat x_\lambda - x_0\| = O(\|w\|)$.

Slide 54

Slide 54 text

Overview
• Polytope Noiseless Recovery
• Local Behavior of Sparse Regularization
• Robustness to Small Noise
• Robustness to Bounded Noise
• Compressed Sensing RIP Theory

Slide 55

Slide 55 text

Robustness to Bounded Noise. Exact Recovery Criterion (ERC) [Tropp]: for a support $I \subset \{0, \ldots, N-1\}$ with $\Phi_I$ full rank, $\operatorname{ERC}(I) = \|\Psi_I\|_{\infty,\infty} = \|\Phi_I^+\Phi_{I^c}\|_{1,1} = \max_{j \in I^c}\|\Phi_I^+\varphi_j\|_1$ (using $\|(a_j)_j\|_{1,1} = \max_j\|a_j\|_1$), where $\Psi_I = \Phi_{I^c}^*\Phi_I^{+,*}$. Relation with the $F$ criterion: $\operatorname{ERC}(I) = \max_{s\,:\,\operatorname{supp}(s) \subset I} F(s)$.

Slide 56

Slide 56 text

Robustness to Bounded Noise. Exact Recovery Criterion (ERC) [Tropp]: for a support $I \subset \{0, \ldots, N-1\}$ with $\Phi_I$ full rank, $\operatorname{ERC}(I) = \|\Psi_I\|_{\infty,\infty} = \|\Phi_I^+\Phi_{I^c}\|_{1,1} = \max_{j \in I^c}\|\Phi_I^+\varphi_j\|_1$, where $\Psi_I = \Phi_{I^c}^*\Phi_I^{+,*}$; one has $\operatorname{ERC}(I) = \max_{s\,:\,\operatorname{supp}(s) \subset I} F(s)$. Theorem: If $\operatorname{ERC}(\operatorname{supp}(x_0)) < 1$ and $\lambda \sim \|w\|$, then the solution $x^\star$ is unique, satisfies $\operatorname{supp}(x^\star) \subset \operatorname{supp}(x_0)$, and $\|x_0 - x^\star\| = O(\|w\|)$.
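A sketch of the ERC computation via its $\max_{j \in I^c}\|\Phi_I^+\varphi_j\|_1$ form (the function name `erc` is mine):

```python
import numpy as np

def erc(Phi, I):
    """ERC(I) = max over j outside I of || Phi_I^+ phi_j ||_1."""
    Ic = np.setdiff1d(np.arange(Phi.shape[1]), I)
    PhiI_pinv = np.linalg.pinv(Phi[:, I])        # Phi_I^+
    return np.abs(PhiI_pinv @ Phi[:, Ic]).sum(axis=0).max()

rng = np.random.default_rng(3)
P, N = 50, 200
Phi = rng.standard_normal((P, N)) / np.sqrt(P)
I = rng.choice(N, 6, replace=False)
print(erc(Phi, I))   # ERC(I) < 1: support inclusion and O(||w||) error
```

Note that, unlike $F(s)$, this depends only on the support $I$, not on a sign pattern.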

Slide 57

Slide 57 text

Sketch of Proof. Restricted recovery: $\hat x \in \operatorname{argmin}_{\operatorname{supp}(x) \subset I}\,\tfrac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$. To prove: $\hat x$ is the unique solution of $P_\lambda(y)$.

Slide 58

Slide 58 text

Sketch of Proof. Restricted recovery: $\hat x \in \operatorname{argmin}_{\operatorname{supp}(x) \subset I}\,\tfrac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$. To prove: $\hat x$ is the unique solution of $P_\lambda(y)$. Implicit equation: $\hat x_I = \Phi_I^+ y - \lambda(\Phi_I^*\Phi_I)^{-1}s_I$. Important: $s = \operatorname{sign}(\hat x)$ is not necessarily equal to $\operatorname{sign}(x_0)$.

Slide 59

Slide 59 text

Sketch of Proof. Restricted recovery: $\hat x \in \operatorname{argmin}_{\operatorname{supp}(x) \subset I}\,\tfrac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$; implicit equation $\hat x_I = \Phi_I^+ y - \lambda(\Phi_I^*\Phi_I)^{-1}s_I$, where $s = \operatorname{sign}(\hat x)$ is not necessarily equal to $\operatorname{sign}(x_0)$. To prove: $\hat x$ is the unique solution of $P_\lambda(y)$. First order conditions $(C_2)$: $\|\Phi_{I^c}^*(\Phi\hat x - y)\|_\infty < \lambda$; $\|\Phi_{I^c}^*(\Phi_I\Phi_I^+ - \operatorname{Id})\|_{2,\infty}\,\|w\| - \lambda\,(1 - F(s)) < 0$ ⟹ $(C_2)$.

Slide 60

Slide 60 text

Sketch of Proof. Restricted recovery: $\hat x \in \operatorname{argmin}_{\operatorname{supp}(x) \subset I}\,\tfrac{1}{2}\|\Phi x - y\|^2 + \lambda\|x\|_1$; implicit equation $\hat x_I = \Phi_I^+ y - \lambda(\Phi_I^*\Phi_I)^{-1}s_I$, where $s = \operatorname{sign}(\hat x)$ is not necessarily equal to $\operatorname{sign}(x_0)$. First order conditions $(C_2)$: $\|\Phi_{I^c}^*(\Phi\hat x - y)\|_\infty < \lambda$; $\|\Phi_{I^c}^*(\Phi_I\Phi_I^+ - \operatorname{Id})\|_{2,\infty}\,\|w\| - \lambda\,(1 - F(s)) < 0$ ⟹ $(C_2)$. Since $s$ is arbitrary, $\operatorname{ERC}(I) < 1$ ⟹ $F(s) < 1$; hence choosing $\lambda \sim \|w\|$ implies $(C_2)$, and $\hat x$ is the unique solution of $P_\lambda(y)$.

Slide 61

Slide 61 text

Weak ERC. Denote $\Phi = (\varphi_i)_{i=0}^{N-1}$ with $\varphi_i \in \mathbb{R}^P$, and for $A = (a_i)_i$, $B = (b_i)_i$ with columns $a_i, b_i \in \mathbb{R}^P$: $\rho(A, B) = \max_j \sum_{i \in I} |\langle a_i, b_j\rangle|$ and $\rho(A) = \max_j \sum_{i \neq j} |\langle a_i, a_j\rangle|$. Weak Exact Recovery Criterion [Gribonval, Dossal]: $\text{w-ERC}(I) = \dfrac{\rho(\Phi_I, \Phi_{I^c})}{1 - \rho(\Phi_I)}$ if $\rho(\Phi_I) < 1$, and $+\infty$ otherwise. Theorem: for $I = \operatorname{supp}(s)$, $F(s) \leq \operatorname{ERC}(I) \leq \text{w-ERC}(I)$.
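The point of w-ERC is that it needs only inner products between atoms, no pseudo-inverse; a sketch assuming unit-norm atoms (the name `w_erc` is mine):

```python
import numpy as np

def w_erc(Phi, I):
    """w-ERC(I) = rho(Phi_I, Phi_Ic) / (1 - rho(Phi_I)), +inf if rho >= 1."""
    Ic = np.setdiff1d(np.arange(Phi.shape[1]), I)
    G_II = np.abs(Phi[:, I].T @ Phi[:, I])       # |<phi_i, phi_j>|, i,j in I
    G_IIc = np.abs(Phi[:, I].T @ Phi[:, Ic])     # |<phi_i, phi_j>|, j outside I
    rho_I = (G_II.sum(axis=0) - np.diag(G_II)).max()
    rho_IIc = G_IIc.sum(axis=0).max()
    return rho_IIc / (1.0 - rho_I) if rho_I < 1 else np.inf

rng = np.random.default_rng(4)
P, N = 50, 200
Phi = rng.standard_normal((P, N))
Phi /= np.linalg.norm(Phi, axis=0)               # unit-norm atoms
I = rng.choice(N, 6, replace=False)
print(w_erc(Phi, I))    # upper-bounds ERC(I), hence F(s) for supp(s) in I
```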

Slide 62

Slide 62 text

Proof (for $I = \operatorname{supp}(s)$) of $F(s) \leq \operatorname{ERC}(I) \leq \text{w-ERC}(I)$. One has $\operatorname{ERC}(I) = \max_{j \notin I}\|\Phi_I^+\varphi_j\|_1 \leq \|(\Phi_I^*\Phi_I)^{-1}\|_{1,1}\,\max_{j \notin I}\|\Phi_I^*\varphi_j\|_1$, with $\max_{j \notin I}\|\Phi_I^*\varphi_j\|_1 = \max_{j \notin I}\sum_{i \in I}|\langle\varphi_i, \varphi_j\rangle| = \rho(\Phi_I, \Phi_{I^c})$.

Slide 63

Slide 63 text

Proof (for $I = \operatorname{supp}(s)$) of $F(s) \leq \operatorname{ERC}(I) \leq \text{w-ERC}(I)$. $\operatorname{ERC}(I) = \max_{j \notin I}\|\Phi_I^+\varphi_j\|_1 \leq \|(\Phi_I^*\Phi_I)^{-1}\|_{1,1}\,\max_{j \notin I}\|\Phi_I^*\varphi_j\|_1$, with $\max_{j \notin I}\|\Phi_I^*\varphi_j\|_1 = \rho(\Phi_I, \Phi_{I^c})$. One has $\Phi_I^*\Phi_I = \operatorname{Id} - H$; if $\|H\|_{1,1} < 1$, then $(\Phi_I^*\Phi_I)^{-1} = (\operatorname{Id} - H)^{-1} = \sum_{k \geq 0} H^k$, so $\|(\Phi_I^*\Phi_I)^{-1}\|_{1,1} \leq \sum_{k \geq 0}\|H\|_{1,1}^k = \dfrac{1}{1 - \|H\|_{1,1}}$, with $\|H\|_{1,1} = \max_{i \in I}\sum_{j \neq i}|\langle\varphi_i, \varphi_j\rangle| = \rho(\Phi_I)$.

Slide 64

Slide 64 text

Example: Random Matrix. $P = 200$, $N = 1000$. [Figure: as the sparsity $\|x_0\|_0$ ranges over $[0, 50]$, the empirical probabilities that $F < 1$, that $\operatorname{ERC} < 1$, that $\text{w-ERC} < 1$, and that $x^\star = x_0$.]

Slide 65

Slide 65 text

Example: Deconvolution. $\Phi x = \sum_i x_i\,\varphi(\cdot - i\Delta)$. Increasing $\Delta$ reduces the correlation between atoms (improving $F(s) \leq \operatorname{ERC}(I) \leq \text{w-ERC}(I)$) but reduces the resolution. [Figure: two spike trains $x_0$ and their filtered versions $\Phi x_0$ for two spacings.]

Slide 66

Slide 66 text

Coherence Bounds. Mutual coherence: $\mu(\Phi) = \max_{i \neq j}|\langle\varphi_i, \varphi_j\rangle|$. Theorem: $F(s) \leq \operatorname{ERC}(I) \leq \text{w-ERC}(I) \leq \dfrac{|I|\,\mu(\Phi)}{1 - (|I| - 1)\,\mu(\Phi)}$.

Slide 67

Slide 67 text

Coherence Bounds. Mutual coherence: $\mu(\Phi) = \max_{i \neq j}|\langle\varphi_i, \varphi_j\rangle|$. Theorem: $F(s) \leq \operatorname{ERC}(I) \leq \text{w-ERC}(I) \leq \dfrac{|I|\,\mu(\Phi)}{1 - (|I| - 1)\,\mu(\Phi)}$. Theorem: If $\|x_0\|_0 < \tfrac{1}{2}\left(1 + \tfrac{1}{\mu(\Phi)}\right)$ and $\lambda \sim \|w\|$, one has $\operatorname{supp}(x^\star) \subset I$ and $\|x_0 - x^\star\| = O(\|w\|)$.

Slide 68

Slide 68 text

Coherence Bounds. Mutual coherence: $\mu(\Phi) = \max_{i \neq j}|\langle\varphi_i, \varphi_j\rangle|$, with $F(s) \leq \operatorname{ERC}(I) \leq \text{w-ERC}(I) \leq \tfrac{|I|\,\mu(\Phi)}{1 - (|I| - 1)\,\mu(\Phi)}$. Theorem: If $\|x_0\|_0 < \tfrac{1}{2}\left(1 + \tfrac{1}{\mu(\Phi)}\right)$ and $\lambda \sim \|w\|$, one has $\operatorname{supp}(x^\star) \subset I$ and $\|x_0 - x^\star\| = O(\|w\|)$. One always has $\mu(\Phi) \geq \sqrt{\tfrac{N - P}{P(N - 1)}}$. For Gaussian matrices: $\mu(\Phi) \sim \sqrt{\log(PN)/P}$, so even the optimistic setting only allows $\|x_0\|_0 = O(\sqrt P)$. For convolution matrices: useless criterion.
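A quick check of the Gaussian coherence scaling and the resulting sparsity bound (constants are dropped, so the comparison is indicative only):

```python
import numpy as np

def coherence(Phi):
    """mu(Phi) = max_{i != j} |<phi_i, phi_j>| over normalized atoms."""
    U = Phi / np.linalg.norm(Phi, axis=0)
    G = np.abs(U.T @ U)
    np.fill_diagonal(G, 0.0)
    return G.max()

rng = np.random.default_rng(5)
P, N = 200, 1000
Phi = rng.standard_normal((P, N))
mu = coherence(Phi)
print(mu, np.sqrt(np.log(P * N) / P))     # mu ~ sqrt(log(PN)/P)
print(0.5 * (1 + 1 / mu))                 # coherence-based sparsity bound
```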

Slide 69

Slide 69 text

Coherence: Examples. Incoherent pair of orthobases, Diracs/Fourier: $\Psi_1 = \{k \mapsto \delta[k - m]\}_m$, $\Psi_2 = \{k \mapsto N^{-1/2}e^{\frac{2i\pi}{N}mk}\}_m$, $\Phi = [\Psi_1, \Psi_2] \in \mathbb{C}^{N \times 2N}$.

Slide 70

Slide 70 text

Coherence: Examples. Incoherent pair of orthobases, Diracs/Fourier: $\Psi_1 = \{k \mapsto \delta[k - m]\}_m$, $\Psi_2 = \{k \mapsto N^{-1/2}e^{\frac{2i\pi}{N}mk}\}_m$, $\Phi = [\Psi_1, \Psi_2] \in \mathbb{C}^{N \times 2N}$. Then $\min_{x \in \mathbb{R}^{2N}}\tfrac{1}{2}\|y - \Phi x\|^2 + \lambda\|x\|_1$ is $\min_{x_1, x_2 \in \mathbb{R}^N}\tfrac{1}{2}\|y - \Psi_1 x_1 - \Psi_2 x_2\|^2 + \lambda\|x_1\|_1 + \lambda\|x_2\|_1$. [Figure: a signal $y$ decomposed as a sum of spikes plus sinusoids.]

Slide 71

Slide 71 text

Coherence: Examples. Incoherent pair of orthobases, Diracs/Fourier: $\Psi_1 = \{k \mapsto \delta[k - m]\}_m$, $\Psi_2 = \{k \mapsto N^{-1/2}e^{\frac{2i\pi}{N}mk}\}_m$, $\Phi = [\Psi_1, \Psi_2] \in \mathbb{C}^{N \times 2N}$, and $\min_{x \in \mathbb{R}^{2N}}\tfrac{1}{2}\|y - \Phi x\|^2 + \lambda\|x\|_1$ splits as $\min_{x_1, x_2}\tfrac{1}{2}\|y - \Psi_1 x_1 - \Psi_2 x_2\|^2 + \lambda\|x_1\|_1 + \lambda\|x_2\|_1$. Here $\mu(\Phi) = \tfrac{1}{\sqrt N}$ ⟹ $\ell^1$ minimization separates up to $\sim \tfrac{\sqrt N}{2}$ Diracs plus sines.
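The value $\mu(\Phi) = 1/\sqrt N$ can be verified directly on the Dirac/Fourier dictionary (a minimal sketch; the complex atoms call for the conjugate transpose in the Gram matrix):

```python
import numpy as np

N = 64
Psi1 = np.eye(N)                                  # Dirac basis
Psi2 = np.fft.fft(np.eye(N)) / np.sqrt(N)         # orthonormal Fourier basis
Phi = np.hstack([Psi1, Psi2])                     # N x 2N union dictionary
G = np.abs(Phi.conj().T @ Phi)                    # modulus of the Gram matrix
np.fill_diagonal(G, 0.0)
print(G.max(), 1 / np.sqrt(N))                    # both equal 1/sqrt(N)
```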

Slide 72

Slide 72 text

Overview
• Polytope Noiseless Recovery
• Local Behavior of Sparse Regularization
• Robustness to Small Noise
• Robustness to Bounded Noise
• Compressed Sensing RIP Theory

Slide 73

Slide 73 text

CS with RIP. Restricted Isometry Constants: $\forall\,\|x\|_0 \leq k$, $(1 - \delta_k)\|x\|^2 \leq \|\Phi x\|^2 \leq (1 + \delta_k)\|x\|^2$. $\ell^1$ recovery: $x^\star \in \operatorname{argmin}_{\|\Phi x - y\| \leq \varepsilon}\|x\|_1$, where $y = \Phi x_0 + w$ and $\|w\| \leq \varepsilon$.

Slide 74

Slide 74 text

CS with RIP. Restricted Isometry Constants: $\forall\,\|x\|_0 \leq k$, $(1 - \delta_k)\|x\|^2 \leq \|\Phi x\|^2 \leq (1 + \delta_k)\|x\|^2$. $\ell^1$ recovery: $x^\star \in \operatorname{argmin}_{\|\Phi x - y\| \leq \varepsilon}\|x\|_1$, where $y = \Phi x_0 + w$ and $\|w\| \leq \varepsilon$. Theorem [Candès 2009]: If $\delta_{2k} \leq \sqrt 2 - 1$, then $\|x_0 - x^\star\| \leq \tfrac{C_0}{\sqrt k}\|x_0 - x_k\|_1 + C_1\,\varepsilon$, where $x_k$ is the best $k$-term approximation of $x_0$.

Slide 75

Slide 75 text

Elements of Proof. Reference: E. J. Candès, CRAS, 2006. Goal: $\|x_0 - x^\star\| \leq \tfrac{C_0}{\sqrt k}\|x_0 - x_k\|_1 + C_1\,\varepsilon$. Partition $\{0, \ldots, N-1\} = T_0 \cup T_1 \cup \ldots \cup T_m$: $T_0$ indexes the $k$ largest elements of $x_0$ (so $x_k = x_{T_0}$), and $T_1, T_2, \ldots$ index successive groups of $k$ largest elements of $h_{T_0^c}$, where $h = x^\star - x_0$. Optimality conditions give $\|h_{T_0^c}\|_1 \leq \|h_{T_0}\|_1 + 2\|x_{0,T_0^c}\|_1$. Explicit constants: $C_0 = \tfrac{2(1+\rho)}{1-\rho}$ and $C_1 = \tfrac{2\alpha}{1-\rho}$, where $\alpha = \tfrac{2\sqrt{1+\delta_{2k}}}{1-\delta_{2k}}$ and $\rho = \tfrac{\sqrt2\,\delta_{2k}}{1-\delta_{2k}}$.

Slide 76

Slide 76 text

Singular Values Distributions. Eigenvalues of $\Phi_I^*\Phi_I$ with $|I| = k$ are essentially in $[a, b]$, where $a = (1 - \sqrt\beta)^2$, $b = (1 + \sqrt\beta)^2$ and $\beta = k/P$. When $k = \beta P \to +\infty$, the eigenvalue distribution tends to $f(\lambda) = \tfrac{1}{2\pi\beta\lambda}\sqrt{(b - \lambda)_+(\lambda - a)_+}$ [Marchenko-Pastur]. Large deviation inequality [Ledoux]. [Figure: empirical eigenvalue histograms against $f(\lambda)$ for $P = 200$ and $k = 10, 30, 50$.]
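The $[a, b]$ edge prediction is easy to test with a short Monte-Carlo sketch (I draw Gaussian sub-dictionaries directly; the $1/\sqrt P$ normalization makes $\Phi_I^*\Phi_I$ close to the identity):

```python
import numpy as np

rng = np.random.default_rng(6)
P, k, trials = 200, 30, 200
beta = k / P
a, b = (1 - np.sqrt(beta)) ** 2, (1 + np.sqrt(beta)) ** 2

eigs = []
for _ in range(trials):
    PhiI = rng.standard_normal((P, k)) / np.sqrt(P)   # random sub-dictionary
    eigs.extend(np.linalg.eigvalsh(PhiI.T @ PhiI))    # spectrum of Phi_I^* Phi_I
eigs = np.array(eigs)
print(eigs.min(), a)    # empirical minimum vs Marchenko-Pastur edge a
print(eigs.max(), b)    # empirical maximum vs Marchenko-Pastur edge b
```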

Slide 77

Slide 77 text

RIP for Gaussian Matrices. Link with coherence: $\delta_k \leq (k - 1)\,\mu(\Phi)$ and $\delta_2 = \mu(\Phi)$, where $\mu(\Phi) = \max_{i \neq j}|\langle\varphi_i, \varphi_j\rangle|$.

Slide 78

Slide 78 text

RIP for Gaussian Matrices. Link with coherence: $\delta_k \leq (k - 1)\,\mu(\Phi)$ and $\delta_2 = \mu(\Phi)$, where $\mu(\Phi) = \max_{i \neq j}|\langle\varphi_i, \varphi_j\rangle|$. For Gaussian matrices: $\mu(\Phi) \sim \sqrt{\log(PN)/P}$.

Slide 79

Slide 79 text

RIP for Gaussian Matrices. Link with coherence: $\delta_k \leq (k - 1)\,\mu(\Phi)$ and $\delta_2 = \mu(\Phi)$; for Gaussian matrices, $\mu(\Phi) \sim \sqrt{\log(PN)/P}$. Stronger result. Theorem: If $k \leq C\,P/\log(N/P)$, then $\delta_{2k} \leq \sqrt 2 - 1$ with high probability.

Slide 80

Slide 80 text

Numerics with RIP. Stability constants of a matrix $A$: the smallest $\delta_1(A)$ and $\delta_2(A)$ such that $(1 - \delta_1(A))\|\alpha\|^2 \leq \|A\alpha\|^2 \leq (1 + \delta_2(A))\|\alpha\|^2$, obtained from the smallest/largest eigenvalues of $A^*A$.

Slide 81

Slide 81 text

Numerics with RIP. Stability constants of $A$: the smallest $\delta_1(A)$ and $\delta_2(A)$ such that $(1 - \delta_1(A))\|\alpha\|^2 \leq \|A\alpha\|^2 \leq (1 + \delta_2(A))\|\alpha\|^2$, from the extreme eigenvalues of $A^*A$. Upper/lower restricted isometry constants (RIC): $\delta^i_k = \max_{|I| = k}\delta_i(\Phi_I)$, $\delta_k = \min(\delta^1_k, \delta^2_k)$. Monte-Carlo estimation over random supports gives estimates $\hat\delta^i_k \leq \delta^i_k$. [Figure: Monte-Carlo estimates $\hat\delta^1_k, \hat\delta^2_k$ as functions of $k$.]
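A possible Monte-Carlo estimator along these lines (a sketch; it samples supports at random, so it only lower-bounds the true maxima over all $|I| = k$):

```python
import numpy as np

def ric_estimate(Phi, k, trials=2000, seed=0):
    """Monte-Carlo estimates of the lower/upper restricted isometry constants."""
    rng = np.random.default_rng(seed)
    N = Phi.shape[1]
    d1 = d2 = 0.0
    for _ in range(trials):
        I = rng.choice(N, k, replace=False)
        ev = np.linalg.eigvalsh(Phi[:, I].T @ Phi[:, I])  # ascending order
        d1 = max(d1, 1.0 - ev[0])      # hat-delta_k^1: 1 - smallest eigenvalue
        d2 = max(d2, ev[-1] - 1.0)     # hat-delta_k^2: largest eigenvalue - 1
    return d1, d2

rng = np.random.default_rng(7)
P, N = 200, 400
Phi = rng.standard_normal((P, N)) / np.sqrt(P)
print(ric_estimate(Phi, k=10))
```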

Slide 82

Slide 82 text

Conclusion. Local behavior: $\lambda \mapsto x^\star_\lambda$ is polygonal; $y \mapsto x^\star$ is piecewise affine. [Figure: recovered coefficients for $s = 3, 6, 13, 25$.]

Slide 83

Slide 83 text

Conclusion. Local behavior: $\lambda \mapsto x^\star_\lambda$ is polygonal; $y \mapsto x^\star$ is piecewise affine. Noiseless recovery ⟺ geometry of polytopes. [Figure: $x_0$ on a face of the projected $\ell^1$ ball $\Phi(B_\alpha)$.]

Slide 84

Slide 84 text

Conclusion. Local behavior: $\lambda \mapsto x^\star_\lambda$ is polygonal; $y \mapsto x^\star$ is piecewise affine. Noiseless recovery ⟺ geometry of polytopes. Small noise: sign stability. Bounded noise: support inclusion. RIP-based: no support stability, but $\ell^1$ bounds.