Robust Sparse Analysis Recovery

Samuel Vaiter
September 12, 2011

GT Image (MSc defense), Paris-Dauphine, Paris, September 2011

Transcript

  1. Robust Sparse Analysis Recovery Samuel Vaiter

  2. Inverse Problems

  3. Inverse Problems Several problems: inpainting, super-resolution

  4. Inverse Problems: ill-posed. Linear hypothesis, one model: y = Φx0 + w (observations = operator applied to the unknown signal, plus noise). Several problems: inpainting, super-resolution.

  5. Regularization: x* ∈ argmin_{x ∈ R^N} 1/2 ||y − Φx||_2^2 + λ J(x)

  6. Noiseless (λ → 0): x* ∈ argmin_{Φx = y} J(x)

  7. Image Priors Sobolev: J(x) = 1/2 ∫ ||∇x||^2

  8. Total variation: J(x) = ∫ ||∇x||

  9. Wavelet sparsity (ideal prior): J(x) = |{i : ⟨x, ψ_i⟩ ≠ 0}|

  10. Overview • Analysis vs. Synthesis Regularization • Local Parameterization of

    Analysis Regularization • Identifiability and Stability • Numerical Evaluation • Perspectives
  11. Dictionary Redundant dictionary of R^N: {d_i}_{i=0}^{P−1}, P ≥ N

  12. Example: identity Id

  13. Example: shift-invariant wavelet frame

  14. Example: finite difference operator Ω_DIF, the bidiagonal matrix with −1 on the diagonal and +1 on the superdiagonal:
      ( −1 +1          )
      (    −1 +1       )
      (       ⋱  ⋱     )
      (          −1 +1 )

  15. Example: fused lasso, D = [Ω_DIF, ε Id]

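The two operators above are easy to materialize. A minimal numpy sketch (the function names and the value of ε are illustrative choices, not from the slides; both functions return the analysis operator, i.e. D* in the notation of the talk):

    import numpy as np

    def finite_difference_operator(n):
        # (n-1) x n forward differences: -1 on the diagonal, +1 to its right
        D = np.zeros((n - 1, n))
        np.fill_diagonal(D, -1.0)
        np.fill_diagonal(D[:, 1:], 1.0)
        return D

    def fused_lasso_operator(n, eps=0.5):
        # stack the finite differences on top of a scaled identity
        return np.vstack([finite_difference_operator(n), eps * np.eye(n)])

    x = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 0.0])
    print(finite_difference_operator(6) @ x)  # nonzero only at the two jumps
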
  16. Analysis versus Synthesis Two points of view

  17. "Generate" x. Synthesis: x = Dα, mapping α ∈ R^P to x ∈ R^N; the representation is non-unique if P > N

  18. "Analyze" x. Analysis: α = D*x. Generate OR analyze?

  19. A Bird's Eye View of Sparsity "Ideal" sparsity prior: J0(α) = |{i : α_i ≠ 0}|

  20. ℓ0 minimization is NP-hard

  21. ℓq prior: Jq(α) = Σ_i |α_i|^q, convex (norms) for q ≥ 1

  22. [Figure: ℓq balls in the (d0, d1) plane for q = 0, 0.5, 1, 1.5, 2]

  23. ℓ1 norm: convexification of the ℓ0 prior

  24. Sparse Regularizations Synthesis: α* ∈ argmin_{α ∈ R^Q} 1/2 ||y − ΦDα||_2^2 + λ||α||_1, with x = Dα

  25. Analysis: x* ∈ argmin_{x ∈ R^N} 1/2 ||y − Φx||_2^2 + λ||D*x||_1

  26.–27. [Diagram: synthesis generates x = Dα from sparse coefficients α ≠ 0; analysis computes correlations α = D*x]

  28. Synthesis: sparse approximation of x* in D

  29. Analysis: the correlations of x* with D are sparse

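To make the distinction concrete, here is a small sketch evaluating both objectives on random data (all sizes, Phi, D and lam are placeholder choices):

    import numpy as np

    rng = np.random.default_rng(0)
    N, P, Q = 8, 12, 6
    Phi = rng.standard_normal((Q, N))   # measurement operator
    D = rng.standard_normal((N, P))     # redundant dictionary, P > N
    y = rng.standard_normal(Q)
    lam = 0.1

    def synthesis_objective(alpha):
        return 0.5 * np.sum((y - Phi @ D @ alpha) ** 2) + lam * np.abs(alpha).sum()

    def analysis_objective(x):
        return 0.5 * np.sum((y - Phi @ x) ** 2) + lam * np.abs(D.T @ x).sum()

    alpha = np.zeros(P); alpha[3] = 1.0
    # same candidate signal x = D alpha, but the two penalties differ:
    print(synthesis_objective(alpha), analysis_objective(D @ alpha))
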
  30. Support and Signal Model P(y, λ): x* ∈ argmin_{x ∈ R^N} 1/2 ||y − Φx||_2^2 + λ||D*x||_1; support I = supp(D*x*), J = I^c

  31. Definition: G_J = Ker D*_J

  32. x* ∈ G_J. Signal model ("union of subspaces"): Θ = ∪_{k ∈ {1…P}} Θ_k, where Θ_k = {G_J : dim G_J = k}

  33. Hypothesis: Ker Φ ∩ G_J = {0}

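A numpy sketch of these objects: the support I, the cosupport J, and an orthonormal basis of G_J = Ker D*_J via an SVD (the function names and the tolerance are assumptions):

    import numpy as np

    def cosupport(D, x, tol=1e-10):
        corr = D.T @ x
        I = np.flatnonzero(np.abs(corr) > tol)   # support of D* x
        J = np.flatnonzero(np.abs(corr) <= tol)  # cosupport
        return I, J

    def GJ_basis(D, J, tol=1e-10):
        # orthonormal basis of G_J = {x : <d_j, x> = 0 for all j in J}
        DJt = D[:, J].T
        _, s, Vt = np.linalg.svd(DJt, full_matrices=True)
        rank = int(np.sum(s > tol))
        return Vt[rank:].T                       # columns span Ker D*_J
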
  34. Examples of Signal Model Identity: Θ_k = k-sparse signals

  35. Shift-invariant wavelet frame [figure, credit: G. Peyré]

  36. Finite difference operator: Θ_k = piecewise constant signals with k − 1 steps [figure: levels a1, a2, a3]

  37. Fused lasso: Θ_k = sums of k interval characteristic functions [figure: values a1 … a8]

  38. Remember! Synthesis: α* = argmin_{α ∈ R^Q} 1/2 ||y − ΦDα||_2^2 + λ||α||_1. Analysis P(y, λ): x* = argmin_{x ∈ R^N} 1/2 ||y − Φx||_2^2 + λ||D*x||_1. Noiseless (λ → 0), P(y, 0): x* = argmin_{Φx = y} ||D*x||_1

  39. Toward a Better Understanding Local behavior? Properties of x*, solution of P(y, λ), as a function of y

  40. Noiseless identifiability? Is x0 the unique solution of P(Φx0, 0)?

  41. Noise robustness? What can we say about ||x* − x0|| for noisy observations?

  42. From Synthesis to Analysis Results Previous works in synthesis: [Fuchs, Tropp, Dossal] address these questions

  43. Similar problem, but many more difficulties in analysis. What is the geometry of the problem?

  44.–51. From Synthesis to Analysis Results [Figure build, synthesis case: atoms d1, d2; subspaces G1, G2; the ℓ1 ball ||α||_1 = 1 grows until it touches the constraint set y = Φα; the contact point α* is the sparsest solution]

  52.–58. From Synthesis to Analysis Results [Figure build, analysis case: atoms d1, d2, d3; subspaces G1, G2, G3; the ball ||D*x||_1 = 1 grows until it touches the constraint set y = Φx, giving the solution x*]

  59. Overview • Analysis vs. Synthesis Regularization • Local Parameterization of

    Analysis Regularization • Identifiability and Stability • Numerical Evaluation • Perspectives
  60. Analysis is Piecewise Affine Main idea: G_J is stable, i.e. the solutions of P(y, λ) and P(y + ε, λ) live in the same G_J

  61.–63. [Figure: subspaces G_J, G_J′ and the constraint sets y = Φx and y + ε = Φx]

  64. Affine function: ȳ ↦ x(ȳ) = AΦ*ȳ − λ A D_I s

  65. Theorem 1. Except for y ∈ H, if ȳ is close enough to y, then x(ȳ) is a solution of P(ȳ, λ).

  66. (H: definition in a few minutes)

  67. Sketch of the Proof Problem: Lasso P(y, λ): x* ∈ argmin_{x ∈ R^N} 1/2 ||y − Φx||_2^2 + λ||D*x||_1

  68. Support: I = supp(D*x*), J = I^c

  69. Subspace of analysis: G_J = Ker D*_J

  70. Hypothesis: Ker Φ ∩ G_J = {0}

  71. I, J, s = sign(D*_I x*) are fixed by x*; we fix the observations y

  72. First-Order Conditions P(y, λ): x* ∈ argmin_{x ∈ R^N} 1/2 ||y − Φx||_2^2 + λ||D*x||_1

  73. Non-differentiable problem: x* is a minimizer of P(y, λ) if, and only if, 0 ∈ ∂f(x*)

  74. First-order conditions of the Lasso: x* is a solution of P(y, λ) ⟺ ∃σ ∈ Σ_y(x*) with ||σ||_∞ ≤ 1, where Σ_y(x) = {σ ∈ R^{|J|} : Φ*(Φx − y) + λ D_I s + λ D_J σ = 0} (gradient plus subdifferential)

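This optimality condition can be tested numerically. A sketch (Phi, D, y, lam and the candidate x are assumed given; lstsq returns only one σ solving the linear system, so ||σ||_∞ ≤ 1 for that σ is a sufficient certificate, not the minimal one):

    import numpy as np

    def certificate(Phi, D, y, x, lam, tol=1e-8):
        corr = D.T @ x
        I = np.flatnonzero(np.abs(corr) > tol)
        J = np.flatnonzero(np.abs(corr) <= tol)
        s = np.sign(corr[I])
        rhs = -(Phi.T @ (Phi @ x - y) + lam * (D[:, I] @ s)) / lam
        sigma, *_ = np.linalg.lstsq(D[:, J], rhs, rcond=None)
        return sigma  # optimality holds if some valid sigma has ||sigma||_inf <= 1
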
  75. A Solution of the Lasso x(y) ∈ argmin_{x ∈ G_J} 1/2 ||y − Φx||_2^2 + λ||D*x||_1

  76. How to make a solution explicit?

  77. Φ*Φ x(y) = Φ*y − λ D_I s − λ D_J σ

  78. Φ*Φ is non-invertible…

  79. …so define AΦ*, the inverse of Φ on G_J: (AΦ*)|_{Φ(G_J)} = (Φ|_{G_J})^{−1} and (AΦ*)|_{Φ(G_J)^⊥} = 0 [diagram: AΦ*: R^Q → R^N]

  80. x(y) = AΦ*y − λ A D_I s − λ A D_J σ

  81. …and λ A D_J σ = 0 (since x(y) ∈ G_J), hence x(y) = AΦ*y − λ A D_I s

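A sketch of this closed form, with A = U (U* Φ*Φ U)^{-1} U* built from an orthonormal basis U of G_J, as in the appendix slides (the function name is an assumption):

    import numpy as np

    def analysis_solution(Phi, D, U, y, lam, s, I):
        # U: columns form an orthonormal basis of G_J
        M = U.T @ Phi.T @ Phi @ U         # invertible when Ker Phi ∩ G_J = {0}
        A = U @ np.linalg.solve(M, U.T)   # A = U (U* Phi* Phi U)^{-1} U*
        return A @ (Phi.T @ y) - lam * (A @ (D[:, I] @ s))
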
  82. Transition Space H = {y ∈ R^Q : ∃x ∈ R^N, min_{σ ∈ Σ_y(x)} ||σ||_∞ = 1}

  83. H: saturation of the first-order conditions → "jump" from G_J to G_J′

  84. [Figure: the constraint set Φx = y crossing H between G_J and G_J′]

  85. Open question: what is the smallest union of subspaces containing H?

  86. End of the Proof Consider x(y) as a mapping of the observations: ȳ ↦ x(ȳ) = AΦ*ȳ − λ A D_I s

  87. Fix ȳ close enough to y so that sign(D*x(y)) = sign(D*x(ȳ)) (sign stability)

  88. Check that x(ȳ) is indeed a solution of P(ȳ, λ), using the first-order conditions

  89. Remember! x(ȳ) = AΦ*ȳ − λ A D_I s

  90. AΦ*: inverse of Φ on G_J

  91.–93. y ↦ x*(y) is: continuous, and locally affine (property given by sign stability)

  94. Useful for: robustness study, SURE denoising risk estimation, inverse problems on x

  95. Overview • Analysis vs. Synthesis Regularization • Local Parameterization of

    Analysis Regularization • Identifiability and Stability • Numerical Evaluation • Perspectives
  96. Identifiability Identifiability: x0 is the unique solution of P(Φx0, 0). Does {x0} = argmin_{Φx = Φx0} ||D*x||_1 hold?

  97. Strategy: P(y, λ) is almost P(y, 0) for small values of λ

  98. Assumption: G_J must be stable for small values of λ → a restrictive condition, but it gives a stability result for small noise

  99. Noiseless and Sign Criterion Ω = D_J^+ (Φ*ΦA − Id) D_I, F(s) = min_{w ∈ Ker D_J} ||Ωs − w||_∞: an algebraic criterion on the sign vector

  100. (convex → computable)

  101. Theorem 2. Let x0 ∈ R^N be a fixed vector, and J = I^c where I = I(D*x0). Suppose that Ker Φ ∩ G_J = {0}. If F(sign(D*_I x0)) < 1, then x0 is identifiable.

  102. Specializes to Fuchs' result for synthesis (D = Id)

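Since F is convex, it can be evaluated exactly as a small linear program. A sketch with scipy (Z is any basis of Ker D_J, e.g. computed by SVD as earlier; the epigraph formulation below is one standard way to minimize an ℓ∞ norm):

    import numpy as np
    from scipy.optimize import linprog

    def F_criterion(Omega, Z, s):
        r = Omega @ s                     # lives in R^{|J|}
        m, k = Z.shape
        # variables: v in R^k (w = Z v) and t = ||Omega s - Z v||_inf
        c = np.zeros(k + 1); c[-1] = 1.0
        A_ub = np.block([[-Z, -np.ones((m, 1))],
                         [ Z, -np.ones((m, 1))]])
        b_ub = np.concatenate([-r, r])
        bounds = [(None, None)] * k + [(0, None)]
        return linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds).fun
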
  103. Nam et al. Results The only other work on analysis recovery [Nam 2011]: the "cosparse" model

  104. G(s) = ||(M D_J)^+ M D_I s||_∞, where M* is an orthonormal basis of Ker Φ

  105. Theorem. Let x0 ∈ R^N be a fixed vector, and J = I^c where I = I(D*x0). Suppose that Ker Φ ∩ G_J = {0}. If G(sign(D*_I x0)) < 1, then x0 is identifiable.

  106. A more intrinsic criterion, but no noise robustness, even for small noise

  107. Sketch of the Proof Idea: study P(y, λ) for λ ≈ 0, with x_λ(Φx0) = AΦ*Φx0 − λ A D_I s

  108. Take λ small enough to have sign(D*x_λ(Φx0)) = sign(D*x0)

  109. lim_{λ→0} x_λ(Φx0) = AΦ*Φx0 = x0

  110. x_λ(Φx0) is a solution of P(λ), and x_λ(Φx0) → x0(Φx0) as λ → 0, where x0(Φx0) is a solution of P(0)

  111. F(sign(D*x_λ(Φx0))) < 1 ⇒ x_λ(Φx0) is the unique solution

  112. Small Noise Recovery Suppose we observe y = Φx0 + w. Does argmin_{Φx = y} ||D*x||_1 recover x0 + AΦ*w?

  113. Generalization of Theorem 2: yes, if ||w|| is small enough. Condition: sign(D*x_λ(y)) = sign(D*x0)

  114. F(sign(D*x0)) < 1 gives: identifiability, and small-noise robustness

  115. Question: and for an arbitrary noise?

  116. Noisy and Support Criterion Setting: y = Φx0 + w, with w a bounded noise

  117. From identifiability of the vector to identifiability of the support

  118. ARC(I) = max_{x ∈ G_J} F(sign(D*_I x))

  119. Theorem 3. Suppose that ARC(I) < 1 and λ > K ||w|| / (1 − ARC(I)). Then x_λ(y) is the unique solution of P(y, λ), and ||x_λ(y) − x0|| = O(λ).

  120. Remember! Noiseless / sign: F(s) = min_{w ∈ Ker D_J} ||Ωs − w||_∞ → vector identifiability. Noisy / support: ARC(I) = max_{x ∈ G_J} F(sign(D*_I x)) → support identifiability

  121. From Theory to Numerics We give a sufficient condition for identifiability.

  122. How far are we from a necessary condition?

  123. Overview • Analysis vs. Synthesis Regularization • Local Parameterization of

    Analysis Regularization • Identifiability and Stability • Numerical Evaluation • Perspectives
  124. Proximal Operator Let f be an l.s.c. convex function from a convex subset C of a Hilbert space H to R.

  125. Proximal operator: prox_f(x) = argmin_{u ∈ R^N} f(u) + 1/2 ||u − x||_2^2

  126. Fundamental examples: prox_{ι_C} = P_C (projection onto C), prox_{λ||·||_1} = S_λ (soft thresholding)

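Both examples are one-liners in numpy (a box [lo, hi]^N is an illustrative choice for the convex set C):

    import numpy as np

    def prox_l1(x, lam):
        # soft thresholding: prox of lam * ||.||_1
        return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

    def proj_box(x, lo, hi):
        # prox of the indicator of C = [lo, hi]^N is the projection onto C
        return np.clip(x, lo, hi)
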
  127. How to Solve These Regularizations? Primal-dual schemes: min_{x ∈ R^N} L(K(x)) where L(g, u) = 1/2 ||y − g||^2 + λ||u||_1 and K(x) = (Φx, D*x)

  128. Alternating Direction Method of Multipliers [Chambolle, Pock]: u_n = prox_{σL*}(u_{n−1} + σ K(z_{n−1})); x_n = prox_{τG}(x_{n−1} − τ K*(u_n)); z_n = x_n + θ(x_n − x_{n−1})

  129. For P(y, 0), replace 1/2 ||y − g||^2 by the indicator ι_{{y}}

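A sketch of these iterations for the analysis Lasso, with θ = 1 and the two dual proximal maps written out explicitly; the step sizes σ, τ are assumed to satisfy στ||K||^2 ≤ 1 (this illustrates the scheme quoted above, it is not the authors' code):

    import numpy as np

    def analysis_lasso_cp(Phi, D, y, lam, sigma, tau, n_iter=500):
        x = np.zeros(Phi.shape[1]); z = x.copy()
        g = np.zeros(Phi.shape[0])        # dual variable for the data term
        u = np.zeros(D.shape[1])          # dual variable for ||D* x||_1
        for _ in range(n_iter):
            # dual step: prox of sigma * L*, separable in (g, u)
            g = (g + sigma * (Phi @ z) - sigma * y) / (1.0 + sigma)
            u = np.clip(u + sigma * (D.T @ z), -lam, lam)
            # primal step (G = 0) and over-relaxation
            x_new = x - tau * (Phi.T @ g + D @ u)
            z = 2.0 * x_new - x
            x = x_new
        return x
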
  130. Computing the Criteria Unconstrained formulation: F(s) = min_w ||Ωs − w||_∞ + ι_{Ker D_J}(w); the prox steps reduce to projections (onto the ℓ1 ball and onto Ker D_J), solved with the primal-dual (PD) scheme above

  131. ARC(I) = max_{x ∈ G_J} F(sign(D*_I x)) ≤ wARC(I) = max_{s ∈ {−1,+1}^{|I|}} F(s) ≤ oARC(I) = ||Ω||_{∞→∞}. ARC is difficult to compute (non-convex), wARC is also non-convex, oARC is easy

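The outer bound is immediate, since ||Ω||_{∞→∞} is the largest ℓ1 norm of a row of Ω; wARC can be brute-forced for tiny supports by reusing F_criterion from the sketch above (both helpers are illustrative):

    import numpy as np
    from itertools import product

    def oARC(Omega):
        return np.abs(Omega).sum(axis=1).max()   # ||Omega||_{inf -> inf}

    def wARC(Omega, Z):
        # exhaustive search over sign vectors, tractable only for tiny |I|
        k = Omega.shape[1]
        return max(F_criterion(Omega, Z, np.array(sv))
                   for sv in product([-1.0, 1.0], repeat=k))
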
  132. More on Signal Models Signal model ("union of subspaces"): Θ = ∪_{k ∈ {1…P}} Θ_k, where Θ_k = {G_J : dim G_J = k}

  133. The sparsity ||D*x0||_0 is not a good parameter…

  134. …since for D a redundant Gaussian i.i.d. matrix of size N × P: ||D*x0||_0 < P − N ⇒ x0 = 0!

  135. The good one: DOF(x) = dim G_J

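A sketch of the two counts side by side, using dim G_J = N − rank(D_J) (the tolerance is an assumption):

    import numpy as np

    def naive_sparsity(D, x, tol=1e-10):
        return int(np.sum(np.abs(D.T @ x) > tol))    # ||D* x||_0

    def dof(D, x, tol=1e-10):
        J = np.flatnonzero(np.abs(D.T @ x) <= tol)   # cosupport
        return D.shape[0] - np.linalg.matrix_rank(D[:, J].T)
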
  136. Random Settings 1) Synthesis results. Compressed sensing: Q ≪ N [Figure, credit: C. Dossal — recovery rate as a function of ||x||_0 = DOF(x): empirical identifiability, F(sign(D*x)) < 1, ARC(I(D*x)) < 1]

  137. Random Settings 2) Analysis results: D, Φ Gaussian i.i.d. random matrices

  138. Strong instability! Many dependencies between the columns

  139. …and the ball {x : ||D*x||_1 ≤ 1} is close to an ℓ2 ball!

  140. Limits: TV Instability D* = ∇, Φ = Id

  141. Θ_k: piecewise constant signals with k − 1 steps

  142. "Box" signal: F(s) = 1 − ε [figure: box signal with levels +1/−1]

  143. "Staircase" signal: F(s) = 1 → no noise stability, even for a small one [figure: staircase signal]

  144. Fused Lasso argmin_{x ∈ R^N} 1/2 ||y − Φx||_2^2 subject to ||∇x||_1 ≤ s1 and ||x||_1 ≤ s2; dictionary D = [Ω_DIF, ε Id]

  145. Signal model, sums of characteristic functions. Θ2: x0 = 1_{[a,b]} + 1_{[c,d]}

  146. [Figure: the two intervals, with and without overlap]

  147. [a, b] ∩ [c, d] ≠ ∅ ⇒ F(sign(D*x0)) > 1: no noise robustness

  148. [a, b] ∩ [c, d] = ∅ ⇒ two situations:

  149. |c − b| ≤ ξ(ε): F(sign(D*x0)) > 1, no noise robustness

  150. |c − b| > ξ(ε): F(sign(D*x0)) = ARC(I) < 1, strong noise robustness

  151. Haar wavelets: similar results

  152. Take-Away Messages

  153. Analysis regularization is robust

  154. Geometry (union of subspaces): the key concept for recovery

  155. Sparsity is not uniquely defined

  156. Overview • Analysis vs. Synthesis Regularization • Local Parameterization of

    Analysis Regularization • Identifiability and Stability • Numerical Evaluation • Perspectives
  157. What's Next? Deterministic theorem → treat the noise as a random variable: support identifiability with Gaussian, Poisson noise

  158. Total Variation identifiability: is there a better criterion to ensure noisy recovery?

  159. Continuous model: work initiated by Chambolle on TV

  160. Larger class of priors J: block sparsity ||·||_{p,q}

  161. Real-world recovery results: almost-equal support recovery

  162. Joint work with: Gabriel Peyré (CEREMADE, Dauphine), Charles Dossal (IMB, Bordeaux I), Jalal Fadili (GREYC, ENSICAEN). Any questions? Thanks!

  163. An Affine Implicit Mapping x(ȳ) = AΦ*ȳ − λ A D_I s

  164. with s = sign(D*_I x(y))

  165. B = AΦ*: inverse of Φ on G_J [diagram: B: R^Q → R^N, with G_J ≅ I_J = Φ(G_J)]

  166. B|_{I_J} = (Φ|_{G_J})^{−1} and B|_{I_J^⊥} = 0; explicitly, B = U (U*Φ*ΦU)^{−1} U*Φ*, where U is an orthonormal basis (BON) of G_J

  167. Efficient computation: ȳ = Bx = argmin_{D*_J z = 0} ||Φz − x||_2^2

  168. Equivalently, solve C (ȳ, μ)ᵀ = (Φ*x, 0)ᵀ, where C = ( Φ*Φ  D_J ; D*_J  0 )

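A sketch of this computation via the saddle-point system (a dense numpy solve stands in for whatever structured solver one would actually use; invertibility requires Ker Φ ∩ G_J = {0} and D_J of full column rank):

    import numpy as np

    def B_apply(Phi, D, J, x):
        # least squares over G_J: argmin_{D*_J z = 0} ||Phi z - x||_2^2
        N = Phi.shape[1]
        DJ = D[:, J]
        k = DJ.shape[1]
        C = np.block([[Phi.T @ Phi, DJ],
                      [DJ.T, np.zeros((k, k))]])
        rhs = np.concatenate([Phi.T @ x, np.zeros(k)])
        return np.linalg.solve(C, rhs)[:N]
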