# Robust Sparse Analysis Recovery

GT Image (MSc defense), Paris-Dauphine, Paris, September 2011

#### Samuel Vaiter

September 12, 2011

## Transcript

4–6. ### Inverse Problems

Ill-posed inverse problems, under a linear hypothesis. One model:
$$y = \Phi x_0 + w$$
with $y$ the observations, $\Phi$ the operator, $x_0$ the unknown signal and $w$ the noise. Several problems: inpainting, super-resolution. Regularization:
$$x^\star \in \underset{x \in \mathbb{R}^N}{\operatorname{argmin}}\ \tfrac12\|y - \Phi x\|_2^2 + \lambda J(x), \qquad\text{noiseless } (\lambda \to 0):\quad x^\star \in \underset{\Phi x = y}{\operatorname{argmin}}\ J(x).$$

7–9. ### Image Priors

Sobolev: $J(x) = \tfrac12\int \|\nabla x\|^2$. Total variation: $J(x) = \int \|\nabla x\|$. Wavelet sparsity (the ideal prior): $J(x) = \left|\{i : \langle x, \psi_i\rangle \neq 0\}\right|$.
10. ### Overview

- Analysis vs. Synthesis Regularization
- Local Parameterization of Analysis Regularization
- Identifiability and Stability
- Numerical Evaluation
- Perspectives

11–15. ### Dictionary

Redundant dictionary of $\mathbb{R}^N$: $\{d_i\}_{i=0}^{P-1}$, $P > N$. Examples: the identity $\mathrm{Id}$; a shift-invariant wavelet frame; the finite difference operator
$$\Omega_{\mathrm{DIF}} = \begin{pmatrix} -1 & & & \\ +1 & -1 & & \\ & +1 & \ddots & \\ & & \ddots & -1 \\ & & & +1 \end{pmatrix};$$
and the fused lasso dictionary $[\,\Omega_{\mathrm{DIF}},\ \varepsilon\,\mathrm{Id}\,]$.
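As a concrete illustration, here is a small sketch of these dictionaries in code. It assumes numpy; the sizes and the boundary convention of the difference operator are illustrative choices, not the talk's.

```python
import numpy as np

def finite_difference(N):
    """(N-1) x N finite difference operator: row i computes x[i+1] - x[i]."""
    Omega = np.zeros((N - 1, N))
    idx = np.arange(N - 1)
    Omega[idx, idx] = -1.0
    Omega[idx, idx + 1] = +1.0
    return Omega

N, eps = 8, 0.1
Omega_DIF = finite_difference(N)                    # analysis operator for TV
D_fused = np.vstack([Omega_DIF, eps * np.eye(N)])   # fused lasso: Omega_DIF stacked with eps*Id
print(Omega_DIF.shape, D_fused.shape)               # (7, 8) (15, 8)
```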

17–18. ### Analysis versus Synthesis

Two points of view. “Generate” $x$ (synthesis): $x = D\alpha$, with $D \in \mathbb{R}^{N\times P}$, so $\alpha$ is non-unique if $P > N$. “Analyze” $x$ (analysis): $D^* x = \alpha$. Which one?
19–23. ### A Bird's Eye View of Sparsity

“Ideal” sparsity prior: $J_0(\alpha) = |\{i : \alpha_i \neq 0\}|$, but $\ell^0$ minimization is NP-hard. $\ell^q$ priors: $J_q(\alpha) = \sum_i |\alpha_i|^q$, convex (norms) for $q \geq 1$. The $\ell^1$ norm is the convexification of the $\ell^0$ prior. [Figure: unit balls of $J_q$ in the plane $(d_0, d_1)$ for $q = 0,\ 0.5,\ 1,\ 1.5,\ 2$.]
24–29. ### Sparse Regularizations

Synthesis:
$$\alpha^\star \in \underset{\alpha\in\mathbb{R}^P}{\operatorname{argmin}}\ \tfrac12\|y - \Theta\alpha\|_2^2 + \lambda\|\alpha\|_1, \qquad \Theta = \Phi D,\quad x^\star = D\alpha^\star:$$
a sparse approximation of $x^\star$ in $D$ (the coefficients $\alpha$ are sparse, $x = D\alpha$). Analysis:
$$x^\star \in \underset{x\in\mathbb{R}^N}{\operatorname{argmin}}\ \tfrac12\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1:$$
the correlations $\alpha = D^* x^\star$ of $x^\star$ with the dictionary are sparse.
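A minimal numerical sketch of the distinction, assuming numpy and hypothetical random $\Phi$ and $D$: for the same signal $x = D\alpha$, the two penalties $\lambda\|\alpha\|_1$ and $\lambda\|D^*x\|_1$ generally differ.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P, Q, lam = 10, 15, 6, 0.5
Phi = rng.standard_normal((Q, N))    # measurement operator Phi
D = rng.standard_normal((N, P))      # redundant dictionary, P > N
y = rng.standard_normal(Q)

def synthesis_objective(alpha):
    # the coefficients are penalized; the signal is x = D @ alpha
    return 0.5 * np.sum((y - Phi @ (D @ alpha)) ** 2) + lam * np.abs(alpha).sum()

def analysis_objective(x):
    # the correlations D* x of the signal itself are penalized
    return 0.5 * np.sum((y - Phi @ x) ** 2) + lam * np.abs(D.T @ x).sum()

alpha = rng.standard_normal(P)
print(synthesis_objective(alpha), analysis_objective(D @ alpha))  # differ in general
```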
30–33. ### Support and Signal Model

$$x^\star \in \underset{x\in\mathbb{R}^N}{\operatorname{argmin}}\ \tfrac12\|y-\Phi x\|_2^2 + \lambda\|D^* x\|_1 \qquad \mathcal{P}(y,\lambda)$$
Support: $I = \operatorname{supp}(D^* x^\star)$, $J = I^c$. Definition: $\mathcal{G}_J = \operatorname{Ker} D_J^*$, so that $x^\star \in \mathcal{G}_J$. Signal model, a “union of subspaces”:
$$\Theta = \bigcup_{k\in\{1\dots P\}} \Theta_k \quad\text{where}\quad \Theta_k = \{\mathcal{G}_J : \dim\mathcal{G}_J = k\}.$$
Hypothesis: $\operatorname{Ker}\Phi \cap \operatorname{Ker} D^* = \{0\}$.
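A sketch of these objects in code, assuming numpy; the function name and the SVD-based null-space computation are mine.

```python
import numpy as np

def signal_subspace(x, D, tol=1e-10):
    """Support I, cosupport J = I^c, and an orthonormal basis U of G_J = Ker D_J*."""
    corr = D.T @ x                             # analysis coefficients D* x
    I = np.flatnonzero(np.abs(corr) > tol)     # support of D* x
    J = np.flatnonzero(np.abs(corr) <= tol)    # cosupport (assumed nonempty here)
    _, sv, Vt = np.linalg.svd(D[:, J].T)       # rows D_J* of the analysis operator
    rank = int(np.sum(sv > tol))
    U = Vt[rank:].T   # right singular vectors with zero singular value span Ker D_J*
    return I, J, U    # dim G_J = U.shape[1]
```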

35–37. ### Examples of Signal Model

(Figures credit: G. Peyré.) Identity: $\Theta_k$ = $k$-sparse signals. Finite difference operator: $\Theta_k$ = piecewise constant signals with $k-1$ steps. Shift-invariant wavelet frame. Fused lasso: $\Theta_k$ = sums of $k$ interval characteristic functions. [Figures: sample signals with amplitudes $a_1, a_2, \dots$]
38. ### Remember!

Synthesis: $\alpha^\star = \operatorname{argmin}_{\alpha\in\mathbb{R}^P}\ \tfrac12\|y-\Theta\alpha\|_2^2 + \lambda\|\alpha\|_1$. Analysis: $x^\star = \operatorname{argmin}_{x\in\mathbb{R}^N}\ \tfrac12\|y-\Phi x\|_2^2 + \lambda\|D^* x\|_1$, problem $\mathcal{P}(y,\lambda)$; as $\lambda\to 0$: $x^\star = \operatorname{argmin}_{\Phi x = y}\ \|D^* x\|_1$, problem $\mathcal{P}(y,0)$.
39–41. ### Toward a Better Understanding

Local behavior? Properties of $x^\star$, solution of $\mathcal{P}(y,\lambda)$, as a function of $y$. Noiseless identifiability? Is $x_0$ the unique solution of $\mathcal{P}(\Phi x_0, 0)$? Noise robustness? What can we say about $\|x^\star - x_0\|$ for noisy observations?
42–43. ### From Synthesis to Analysis Results

Previous works in synthesis [Fuchs, Tropp, Dossal] address these questions. Analysis poses a similar problem but with many more difficulties. What is the geometry of the problem?

44–51. ### From Synthesis to Analysis Results (synthesis geometry)

[Figures: atoms $d_1, d_2$; the affine constraint set $y = \Theta\alpha$; the $\ell^1$ ball $\|\alpha\|_1 = 1$ with faces $G_1$, $G_2$; the constraint set touches the ball at $\alpha^\star$, the sparsest solution.]

52–58. ### From Synthesis to Analysis Results (analysis geometry)

[Figures: atoms $d_1, d_2, d_3$; the level set $\|D^* x\|_1 = 1$ with its subspaces $\mathcal{G}_1, \mathcal{G}_2, \mathcal{G}_3$; the constraint set $y = \Phi x$; the solution $x^\star$ lies on a low-dimensional $\mathcal{G}_J$.]
59. ### Overview

- Analysis vs. Synthesis Regularization
- Local Parameterization of Analysis Regularization
- Identifiability and Stability
- Numerical Evaluation
- Perspectives
60–66. ### Analysis is Piecewise Affine

Main idea: $\mathcal{G}_J$ is stable, i.e. the solutions of $\mathcal{P}(y,\lambda)$ and $\mathcal{P}(y+\varepsilon,\lambda)$ live in the same $\mathcal{G}_J$. [Figure: observations $y = \Phi x$ and $y+\varepsilon = \Phi x$ selecting the same subspace $\mathcal{G}_J$, versus a jump to $\mathcal{G}_{J'}$.] Affine function:
$$\bar y \mapsto x(\bar y) = A\Phi^*\bar y - \lambda A D_I s.$$

Theorem 1. Except for $y$ in a transition space $\mathcal{H}$ (definition in a few minutes), if $\bar y$ is close enough to $y$, then $x(\bar y)$ is a solution of $\mathcal{P}(\bar y,\lambda)$.
67–71. ### Sketch of the Proof

Problem, the (analysis) Lasso:
$$x^\star \in \underset{x\in\mathbb{R}^N}{\operatorname{argmin}}\ \tfrac12\|y-\Phi x\|_2^2+\lambda\|D^* x\|_1 \qquad \mathcal{P}(y,\lambda)$$
Support: $I = \operatorname{supp}(D^* x^\star)$, $J = I^c$. Subspace of analysis: $\mathcal{G}_J = \operatorname{Ker} D_J^*$. Hypothesis: $\operatorname{Ker}\Phi\cap\mathcal{G}_J=\{0\}$. We fix the observations $y$; then $I$, $J$ and $s = \operatorname{sign}(D_I^* x^\star)$ are fixed by $x^\star$.
72–74. ### First-Order Conditions

The problem is non-differentiable: $x^\star$ is a minimum of $\mathcal{P}(y,\lambda)$ if, and only if, $0 \in \partial f(x^\star)$. First-order conditions of the Lasso:
$$x^\star \text{ solution of } \mathcal{P}(y,\lambda) \iff \exists\,\sigma\in\Sigma_y(x^\star),\ \|\sigma\|_\infty \leq 1,$$
$$\Sigma_y(x) = \Big\{\sigma\in\mathbb{R}^{|J|} :\ \underbrace{\Phi^*(\Phi x - y)}_{\text{gradient}} + \underbrace{\lambda D_I s + \lambda D_J\sigma}_{\text{subdifferential}} = 0\Big\}.$$
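These conditions can be checked numerically. Below is a sketch (numpy assumed, names mine): it solves for one particular $\sigma$ in least squares, so a success certifies optimality, while a failure is not conclusive when $D_J$ has a nontrivial kernel.

```python
import numpy as np

def check_first_order(x, y, Phi, D, lam, tol=1e-8):
    """Test 0 in the subdifferential at x via one sigma in Sigma_y(x)."""
    corr = D.T @ x
    I = np.abs(corr) > tol
    s = np.sign(corr[I])
    grad = Phi.T @ (Phi @ x - y) + lam * (D[:, I] @ s)   # gradient + support part
    DJ = D[:, ~I]
    sigma, *_ = np.linalg.lstsq(lam * DJ, -grad, rcond=None)
    solves = np.linalg.norm(lam * (DJ @ sigma) + grad) <= tol  # sigma in Sigma_y(x)?
    return solves and np.max(np.abs(sigma)) <= 1 + tol
```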
75–81. ### A Solution of the Lasso

$$x(y) \in \underset{x\in\mathcal{G}_J}{\operatorname{argmin}}\ \tfrac12\|y-\Phi x\|_2^2+\lambda\|D^* x\|_1$$
How can we write a solution explicitly? The first-order conditions give $\Phi^*\Phi\, x(y) = \Phi^* y - \lambda D_I s - \lambda D_J\sigma$, but $\Phi^*\Phi$ is non-invertible. Introduce $A\Phi^*$, the inverse of $\Phi$ on $\mathcal{G}_J$ (a map $\mathbb{R}^Q\to\mathcal{G}_J\subset\mathbb{R}^N$):
$$A\Phi^* : \begin{cases} (A\Phi^*)|_{\Phi(\mathcal{G}_J)} = (\Phi|_{\mathcal{G}_J})^{-1} \\ (A\Phi^*)|_{\Phi(\mathcal{G}_J)^\perp} = 0 \end{cases}$$
Then $x(y) = A\Phi^* y - \lambda A D_I s - \lambda A D_J\sigma$, and $A D_J\sigma = 0$ (because $x(y)\in\mathcal{G}_J$), so
$$x(y) = A\Phi^* y - \lambda A D_I s.$$
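Following the formula $B = U(U^*\Phi^*\Phi U)^{-1}U^*\Phi^*$ from the appendix slides, here is a sketch (numpy assumed) of the implicit solution; it requires $\operatorname{Ker}\Phi\cap\mathcal{G}_J=\{0\}$ so that the inner matrix is invertible.

```python
import numpy as np

def implicit_solution(y, Phi, D, I, U, s, lam):
    """x(y) = A Phi* y - lam * A D_I s, with A = U (U* Phi* Phi U)^{-1} U*."""
    M = U.T @ Phi.T @ Phi @ U          # invertible when Ker Phi and G_J only share 0
    A = U @ np.linalg.inv(M) @ U.T     # inverse of Phi*Phi restricted to G_J
    return A @ (Phi.T @ y - lam * (D[:, I] @ s))
```

Here $U$ is an orthonormal basis of $\mathcal{G}_J$, e.g. as returned by the `signal_subspace` sketch above.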
82–85. ### Transition Space

$$\mathcal{H} = \left\{ y\in\mathbb{R}^Q :\ \exists\, x\in\mathbb{R}^N,\ \min_{\sigma\in\Sigma_y(x)}\|\sigma\|_\infty = 1 \right\}$$
On $\mathcal{H}$ the first-order conditions saturate, hence a “jump” from $\mathcal{G}_J$ to $\mathcal{G}_{J'}$. [Figure: as $y = \Phi x$ crosses $\mathcal{H}$, the active subspace switches from $\mathcal{G}_J$ to $\mathcal{G}_{J'}$.] Open question: what is the smallest union of subspaces containing $\mathcal{H}$?
86–88. ### End of the Proof

- Consider $x(y)$ as a mapping of the observations, $\bar y \mapsto x(\bar y) = A\Phi^*\bar y - \lambda A D_I s$.
- Fix $\bar y$ close enough to $y$ to have $\operatorname{sign}(D^* x(y)) = \operatorname{sign}(D^* x(\bar y))$ (sign stability).
- Check that $x(\bar y)$ is indeed a solution of $\mathcal{P}(\bar y,\lambda)$, using the first-order conditions.

89–94. ### Remember!

$$x(\bar y) = A\Phi^*\bar y - \lambda A D_I s, \qquad A\Phi^*\ \text{the inverse of } \Phi \text{ on } \mathcal{G}_J.$$
The map $y \mapsto x^\star(y)$ is continuous and locally affine (the property given by sign stability). Useful for: a robustness study; SURE denoising risk estimation; inverse problems on $x$.
95. ### Overview

- Analysis vs. Synthesis Regularization
- Local Parameterization of Analysis Regularization
- Identifiability and Stability
- Numerical Evaluation
- Perspectives
96–98. ### Identifiability

Identifiability: $x_0$ is the unique solution of $\mathcal{P}(\Phi x_0, 0)$:
$$\{x_0\} \overset{?}{=} \underset{\Phi x = \Phi x_0}{\operatorname{argmin}}\ \|D^* x\|_1$$
Strategy: $\mathcal{P}(y,\lambda)$ is almost $\mathcal{P}(y,0)$ for small values of $\lambda$. Assumption: $\mathcal{G}_J$ must be stable for small values of $\lambda$: a restrictive condition, but one that gives a stability result for small noise.
99–102. ### Noiseless and Sign Criterion

$$\Omega = D_J^+\,(\Phi^*\Phi A - \mathrm{Id})\,D_I, \qquad F(s) = \min_{w\in\operatorname{Ker} D_J}\ \|\Omega s - w\|_\infty$$
An algebraic criterion on the sign vector (convex, hence computable).

Theorem 2. Let $x_0\in\mathbb{R}^N$ be a fixed vector, and $J = I^c$ where $I = I(D^* x_0)$. Suppose that $\operatorname{Ker}\Phi\cap\mathcal{G}_J=\{0\}$. If $F(\operatorname{sign}(D_I^* x_0)) < 1$, then $x_0$ is identifiable.

This specializes to Fuchs' results for synthesis ($D = \mathrm{Id}$).
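$F$ is convex and can be evaluated exactly. A sketch assuming scipy: parameterize $w = Zv$ with $Z$ a basis of $\operatorname{Ker}D_J$ and solve the equivalent linear program $\min t$ subject to $|(\Omega s - Zv)_i| \leq t$.

```python
import numpy as np
from scipy.linalg import null_space
from scipy.optimize import linprog

def F_criterion(Omega, DJ, s):
    """F(s) = min over w in Ker(DJ) of ||Omega @ s - w||_inf, as an LP."""
    Z = null_space(DJ)                   # basis of Ker D_J (columns)
    r = Omega @ s
    m, k = len(r), Z.shape[1]
    c = np.zeros(k + 1); c[-1] = 1.0     # variables (v, t), minimize t
    ones = np.ones((m, 1))
    A_ub = np.block([[-Z, -ones], [Z, -ones]])   # r - Z v <= t and Z v - r <= t
    b_ub = np.concatenate([-r, r])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (k + 1))
    return res.fun
```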
103–106. ### Nam et al. Results

The only other work on analysis recovery [Nam 2011], the “cosparse” model:
$$G(s) = \|\Lambda s\|_\infty, \qquad \Lambda = (M D_J)^+ M D_I,$$
where $M^*$ is an orthonormal basis of $\operatorname{Ker}\Phi$.

Theorem. Let $x_0\in\mathbb{R}^N$ be a fixed vector, and $J=I^c$ where $I = I(D^* x_0)$. Suppose that $\operatorname{Ker}\Phi\cap\mathcal{G}_J=\{0\}$. If $G(\operatorname{sign}(D_I^* x_0)) < 1$, then $x_0$ is identifiable.

A more intrinsic criterion, but no noise robustness, even for small noise.
107–111. ### Sketch of the Proof

Idea: study $\mathcal{P}(y,\lambda)$ for $\lambda \approx 0$. Here $x_\lambda(\Phi x_0) = A\Phi^*\Phi x_0 - \lambda A D_I s$; take $\lambda$ small enough to have $\operatorname{sign}(D^* x_\lambda(\Phi x_0)) = \operatorname{sign}(D^* x_0)$. Then
$$\lim_{\lambda\to0} x_\lambda(\Phi x_0) = A\Phi^*\Phi x_0 = x_0.$$
Since $x_\lambda(\Phi x_0)$ solves $\mathcal{P}(\lambda)$ and $x_\lambda(\Phi x_0)\to x_0(\Phi x_0)$ as $\lambda\to0$, the limit $x_0(\Phi x_0)$ solves $\mathcal{P}(0)$. Finally, $F(\operatorname{sign}(D^* x_\lambda(\Phi x_0))) < 1$ implies that $x_\lambda(\Phi x_0)$ is the unique solution.
112–115. ### Small Noise Recovery

Suppose we observe $y = \Phi x_0 + w$. Does $\operatorname{argmin}_{\Phi x = y}\|D^* x\|_1$ recover $x_0 + A\Phi^* w$? Generalization of Theorem 2: yes, if $\|w\|$ is small enough; the condition is $\operatorname{sign}(D^* x(y)) = \operatorname{sign}(D^* x_0)$. Hence $F(\operatorname{sign}(D^* x_0)) < 1$ gives both identifiability and small-noise robustness. Question: and for an arbitrary noise?
116–119. ### Noisy and Support Criterion

Setting: $y = \Phi x_0 + w$, with $w$ a bounded noise. From identifiability of a vector to identifiability of a support:
$$\operatorname{ARC}(I) = \max_{x\in\mathcal{G}_J}\ F(\operatorname{sign}(D_I^* x))$$

Theorem 3. Suppose $\operatorname{ARC}(I) < 1$ and $\lambda > \dfrac{K\,\|w\|}{1-\operatorname{ARC}(I)}$. Then $x(y)$ is the unique solution of $\mathcal{P}(y,\lambda)$, and $\|x(y) - x_0\| = O(\|w\|)$.
120. ### Remember!

Noiseless, a sign criterion for vector identifiability: $F(s) = \min_{w\in\operatorname{Ker} D_J}\|\Omega s - w\|_\infty$. Noisy, a support criterion for support identifiability: $\operatorname{ARC}(I) = \max_{x\in\mathcal{G}_J} F(\operatorname{sign}(D_I^* x))$.

121–122. ### From Theory to Numerics

We give a sufficient condition for identifiability. How far are we from a necessary condition?
123. ### Overview

- Analysis vs. Synthesis Regularization
- Local Parameterization of Analysis Regularization
- Identifiability and Stability
- Numerical Evaluation
- Perspectives
124–126. ### Proximal Operator

Let $f$ be a l.s.c. convex function from a convex subset $C$ of a Hilbert space $H$ to $\mathbb{R}$. The proximal operator is
$$\operatorname{prox}_f(x) = \underset{u\in\mathbb{R}^N}{\operatorname{argmin}}\ f(u) + \tfrac12\|u-x\|_2^2.$$
Fundamental examples: $\operatorname{prox}_{\iota_C} = P_C$ (the projection onto $C$) and $\operatorname{prox}_{\lambda\|\cdot\|_1} = S_\lambda$ (soft thresholding).
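The two examples in code (numpy assumed):

```python
import numpy as np

def prox_l1(x, lam):
    """Soft thresholding S_lam: argmin_u lam*||u||_1 + 0.5*||u - x||^2."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def prox_indicator_box(x, lo, hi):
    """Prox of the indicator of C = [lo, hi]^N is the projection P_C."""
    return np.clip(x, lo, hi)
```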
127–129. ### How to Solve These Regularizations?

Primal-dual schemes:
$$\min_{x\in\mathbb{R}^N} L(K(x)) \quad\text{where}\quad L(g,u) = \tfrac12\|y-g\|^2 + \lambda\|u\|_1, \qquad K(x) = (\Phi x, D^* x).$$
Alternating-direction iterations [Chambolle, Pock]:
$$u_n = \operatorname{prox}_{\sigma L^*}(u_{n-1} + \sigma K(z_{n-1})), \quad x_n = \operatorname{prox}_{\tau G}(x_{n-1} - \tau K^*(u_n)), \quad z_n = x_n + \theta(x_n - x_{n-1}).$$
For $\mathcal{P}(y,0)$, replace $\tfrac12\|y-g\|^2$ by the indicator $\iota_{\{y\}}$.
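A runnable sketch of this scheme (numpy assumed), specialized to the analysis problem: the prox of $\sigma L^*$ splits into a closed form for the quadratic part and a clip onto $\{\|\cdot\|_\infty\leq\lambda\}$ for the $\ell^1$ part; $G = 0$, $\theta = 1$, and the step sizes with $\sigma\tau\|K\|^2 < 1$ are my choice.

```python
import numpy as np

def analysis_lasso_pd(y, Phi, D, lam, n_iter=2000):
    """Primal-dual iterations for min_x 0.5*||y - Phi x||^2 + lam*||D* x||_1."""
    N = Phi.shape[1]
    Dt = D.T
    normK2 = np.linalg.norm(Phi, 2) ** 2 + np.linalg.norm(Dt, 2) ** 2  # ||K||^2 bound
    sigma = tau = 0.9 / np.sqrt(normK2)
    x = np.zeros(N); z = x.copy()
    g = np.zeros(len(y)); u = np.zeros(Dt.shape[0])
    for _ in range(n_iter):
        # dual step u_n = prox_{sigma L*}(u_{n-1} + sigma K z_{n-1}), componentwise:
        g = (g + sigma * (Phi @ z) - sigma * y) / (1.0 + sigma)  # quadratic part of L
        u = np.clip(u + sigma * (Dt @ z), -lam, lam)             # conjugate of lam*||.||_1
        # primal step: prox_{tau G} = Id since G = 0
        x_prev = x
        x = x - tau * (Phi.T @ g + D @ u)                        # K*(g, u)
        z = 2.0 * x - x_prev                                     # theta = 1
    return x
```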
130–131. ### Computing the Criteria

Unconstrained formulation: $F(s) = \min_{w\in\mathbb{R}^{|J|}}\ \|\Omega s - w\|_\infty + \iota_{\operatorname{Ker} D_J}(w)$; both terms have computable proximal operators (projections), so a primal-dual scheme applies.
$$\operatorname{ARC}(I) = \max_{x\in\mathcal{G}_J} F(\operatorname{sign}(D_I^* x)) \ \leq\ \operatorname{wARC}(I) = \max_{s\in\{-1,1\}^{|I|}} F(s) \ \leq\ \operatorname{oARC}(I) = \|\Omega\|_{\infty\to\infty}.$$
ARC is difficult to compute (non-convex); wARC is still non-convex; oARC is easy.
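Sketches of the two surrogates (numpy assumed): oARC has the closed form of the $\infty\to\infty$ operator norm, the largest $\ell^1$ norm of a row of $\Omega$; wARC enumerates sign vectors (exponential in $|I|$), reusing the `F_criterion` sketch given after Theorem 2.

```python
import numpy as np
from itertools import product

def oARC(Omega):
    """||Omega||_{inf->inf}: the maximal l1 norm among the rows of Omega."""
    return np.abs(Omega).sum(axis=1).max()

def wARC(Omega, DJ):
    """Brute-force max of F over sign vectors; only viable for small |I|."""
    signs = product([-1.0, 1.0], repeat=Omega.shape[1])
    return max(F_criterion(Omega, DJ, np.array(s)) for s in signs)
```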
132–135. ### More on Signal Models

Signal model, a “union of subspaces”: $\Theta = \bigcup_{k\in\{1\dots P\}}\Theta_k$ where $\Theta_k = \{\mathcal{G}_J : \dim\mathcal{G}_J = k\}$. The sparsity $\|D^* x_0\|_0$ is not a good parameter: for a redundant Gaussian i.i.d. matrix $D$ of size $N\times P$,
$$\|D^* x_0\|_0 < P - N \ \Rightarrow\ x_0 = 0\,!$$
The good parameter is $\operatorname{DOF}(x) = \dim\mathcal{G}_J$.
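A sketch of this measure (numpy assumed): $\operatorname{DOF}(x) = \dim\mathcal{G}_J = N - \operatorname{rank}(D_J^*)$.

```python
import numpy as np

def dof(x, D, tol=1e-10):
    """DOF(x) = dim G_J, with J the cosupport of the analysis coefficients."""
    corr = D.T @ x
    J = np.flatnonzero(np.abs(corr) <= tol)
    return D.shape[0] - np.linalg.matrix_rank(D[:, J].T, tol=tol)
```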
136. ### Random Settings

1\) Synthesis results, compressed sensing: $Q \ll N$ (for $D = \mathrm{Id}$, $\|x\|_0 = \operatorname{DOF}(x)$). [Figure, credit C. Dossal: recovery rate against $\|x\|_0$; curves for identifiability, $F(\operatorname{sign}(D^* x)) < 1$ and $\operatorname{ARC}(I(D^* x)) < 1$.]

138–139. ### Random Settings (continued)

2\) Analysis results: $D$ and $\Phi$ Gaussian i.i.d. random matrices. Many dependencies between the columns lead to strong instability. [Figure: the unit ball of $\|D^*\cdot\|_1$, close to an $\ell^2$ ball!]

141–143. ### Limits: TV Instability

$D^* = \nabla$, $\Phi = \mathrm{Id}$; $\Theta_k$: piecewise constant signals with $k-1$ steps. “Box” signal: $F(s) = 1 - \varepsilon$. “Staircase” signal: $F(s) = 1$, so no noise stability, even for a small noise. [Figures: box and staircase signals with their sign patterns $\pm 1$.]
144–151. ### Fused Lasso

$$\underset{x\in\mathbb{R}^N}{\operatorname{argmin}}\ \tfrac12\|y-\Phi x\|_2^2 \quad\text{subject to}\quad \begin{cases}\|\nabla x\|_1\leq s_1\\ \|x\|_1\leq s_2\end{cases} \qquad\text{dictionary } [\,\Omega_{\mathrm{DIF}},\ \varepsilon\,\mathrm{Id}\,].$$
Signal model, sums of characteristic functions, $\Theta_2$: $x_0 = 1_{[a,b]} + 1_{[c,d]}$. [Figure: overlapping versus non-overlapping intervals.]

- Overlap, $[a,b]\cap[c,d]\neq\emptyset$: $F(\operatorname{sign}(D^* x_0)) \geq 1$, no noise robustness.
- No overlap, $[a,b]\cap[c,d]=\emptyset$: two situations.
  - $|c-b| \leq \xi(\varepsilon)$: $F(\operatorname{sign}(D^* x_0)) \geq 1$, no noise robustness.
  - $|c-b| > \xi(\varepsilon)$: $F(\operatorname{sign}(D^* x_0)) = \operatorname{ARC}(I) < 1$, strong noise robustness.

Haar wavelets: similar results.

154–155. ### Take-Away Messages

- Analysis regularization is robust.
- Geometry (union of subspaces) is the key concept for recovery.
- Sparsity is not uniquely defined.
156. ### Overview

- Analysis vs. Synthesis Regularization
- Local Parameterization of Analysis Regularization
- Identifiability and Stability
- Numerical Evaluation
- Perspectives
157–161. ### What's Next?

- Treat the noise as a random variable in the deterministic theorems: support identifiability with Gaussian or Poisson noise.
- Is there a better criterion to ensure noisy recovery?
- Total variation identifiability: a continuous model (work initiated by Chambolle on TV).
- A larger class of priors $J$: block sparsity $\|\cdot\|_{p,q}$.
- Real-world recovery results: almost-equal support recovery.
162. ### Thanks

Joint work with Gabriel Peyré (CEREMADE, Dauphine), Charles Dossal (IMB, Bordeaux I) and Jalal Fadili (GREYC, ENSICAEN). Any questions?
163–168. ### An Affine Implicit Mapping

$$x(\bar y) = A\Phi^*\bar y - \lambda A D_I s, \qquad s = \operatorname{sign}(D_I^*\, x(y)).$$
Let $B = A\Phi^*$ be the inverse of $\Phi$ on $\mathcal{G}_J$, and $I_J = \Phi(\mathcal{G}_J) \cong \mathcal{G}_J$; then $B : \mathbb{R}^Q\to\mathbb{R}^N$ with
$$B : \begin{cases} B|_{I_J} = (\Phi|_{\mathcal{G}_J})^{-1} \\ B|_{I_J^\perp} = 0 \end{cases} \qquad B = U(U^*\Phi^*\Phi U)^{-1}U^*\Phi^*,$$
where $U$ is an orthonormal basis of $\mathcal{G}_J$. Efficient computation: $Bx = \operatorname{argmin}_{D_J^* z = 0}\ \|\Phi z - x\|_2^2$, obtained by solving the linear system
$$C\begin{pmatrix}\tilde y\\ \mu\end{pmatrix} = \begin{pmatrix}\Phi^* x\\ 0\end{pmatrix} \quad\text{where}\quad C = \begin{pmatrix}\Phi^*\Phi & D_J\\ D_J^* & 0\end{pmatrix},\quad \tilde y = Bx.$$