Low Complexity Regularization of Inverse Problems

Talk at Curves and Surfaces 2014

Gabriel Peyré

June 12, 2014

Transcript

  1. Gabriel Peyré, www.numerical-tours.com. Low Complexity Regularization of Inverse Problems. Joint work with Samuel Vaiter and Jalal Fadili.
  2. Inverse Problems: recovering x0 ∈ R^N from the noisy observations y = Φx0 + w ∈ R^P.
  3. Examples: inpainting, super-resolution, . . .
  4. Inverse Problems in Medical Imaging: Φx = (p_{θ_k})_{1 ≤ k ≤ K}.
  5. Magnetic resonance imaging (MRI): Φx = (f̂(ω))_{ω ∈ Ω}.
  6. Other examples: MEG, EEG, . . .
  7. Compressed Sensing [Rice Univ.]: the scene x̃0.
  8. N micro-mirrors, P measures y[i] = ⟨x0, φ_i⟩.
  9. Reconstructions for P/N = 1, P/N = 0.16, P/N = 0.02.
  10. Inverse Problem Regularization. Observations: y = Φx0 + w ∈ R^P. Estimator: x(y) depends only on the observations y and on a parameter λ.
  11. Example: variational methods. x_λ(y) ∈ argmin_{x ∈ R^N} (1/2)||y − Φx||² + λ J(x) (data fidelity + regularity).
  12. Choice of λ: tradeoff between the noise level ||w|| and the regularity J(x0) of x0.
  13. No noise: λ → 0⁺, minimize x(y) ∈ argmin_{Φx = y} J(x).
  14. Performance analysis: criteria on (x0, ||w||, λ) ensuring ||x_λ(y) − x0|| = O(||w||) (L2 stability) and model stability.
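As a concrete instance of this variational estimator, here is a minimal sketch that solves it for J = ||·||_1 with plain proximal gradient descent (ISTA); the solver choice, helper names and toy sizes are ours, not the talk's:

```python
import numpy as np

def soft_threshold(u, t):
    # Proximal operator of t * ||.||_1 (entrywise soft thresholding).
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

def ista(Phi, y, lam, n_iter=500):
    """Proximal gradient (ISTA) for  min_x 0.5*||y - Phi x||^2 + lam*||x||_1."""
    N = Phi.shape[1]
    x = np.zeros(N)
    L = np.linalg.norm(Phi, 2) ** 2            # Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ x - y)           # gradient of the data-fidelity term
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Tiny usage example with arbitrary dimensions: recover a sparse vector.
rng = np.random.default_rng(0)
P, N, s = 40, 100, 5
Phi = rng.standard_normal((P, N))
x0 = np.zeros(N); x0[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
w = 0.01 * rng.standard_normal(P)
x_lam = ista(Phi, Phi @ x0 + w, lam=0.05)
```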
  15. Overview: • Low-complexity Convex Regularization • Performance Guarantees: L2 Error • Performance Guarantees: Model Consistency.
  16. Union of Models for Data Processing. Union of models: M ⊂ R^N, subspaces or manifolds. Synthesis sparsity: coefficients x, image Φx.
  17. Structured sparsity.
  18. Analysis sparsity: image x, gradient D*x.
  19. Low-rank: multi-spectral imaging, x_{i,·} = Σ_{j=1}^r A_{i,j} S_{j,·}.
  20. Subdifferentials and Linear Models. Regularizer: J : R^N → R convex. Subdifferential: ∂J(x) = {η ; ∀ y, J(y) ≥ J(x) + ⟨η, y − x⟩}. Example: J(x) = |x|.
  21. Example: J(x) = ||x||_1 = Σ_i |x_i|. With I = supp(x) = {i ; x_i ≠ 0}: ∂||x||_1 = {η ; η_I = sign(x_I), ∀ j ∉ I, |η_j| ≤ 1}.
  22. Linear model: T_x = VectHull(∂J(x))^⊥. For J = ||·||_1: T_x = {η ; supp(η) ⊆ I}.
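These two formulas are easy to check numerically in the ℓ1 case; a small sketch, with helper names of our own:

```python
import numpy as np

def in_l1_subdifferential(eta, x, tol=1e-8):
    """Check eta ∈ ∂||x||_1: eta agrees with sign(x) on the support I,
    and |eta_j| <= 1 off the support."""
    I = np.abs(x) > tol
    on_support = np.allclose(eta[I], np.sign(x[I]), atol=tol)
    off_support = np.all(np.abs(eta[~I]) <= 1 + tol)
    return bool(on_support and off_support)

def project_T(z, x, tol=1e-8):
    """Orthogonal projection onto the model space T_x = {z ; supp(z) ⊆ supp(x)}."""
    z_proj = np.array(z, dtype=float)
    z_proj[np.abs(x) <= tol] = 0.0
    return z_proj
```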
  23. Partly Smooth Functions [Lewis 2003]. J : R^N → R is partly smooth at x for a manifold M_x if: (i) J is C² along M_x around x; (illustration: J(x) = max(0, ||x|| − 1), with x, T_x, M_x).
  24. (ii) VectHull(∂J(x))^⊥ = T_x = Tangent_x(M_x), equivalently ∀ h ∈ T_x^⊥, t ↦ J(x + th) is non-smooth at t = 0;
  25. (iii) ∂J is continuous on M_x around x.
  26. Important: in general M_x ≠ T_x.
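A quick numerical illustration of (i) and (ii) on the slide's example J(x) = max(0, ||x|| − 1): along a tangent direction the restriction t ↦ J(x + th) is smooth, along a normal direction it is kinked at t = 0 (the point x and the directions below are our choice):

```python
import numpy as np

def J(x):
    # The slide's example: J(x) = max(0, ||x|| - 1); M_x is the unit sphere when ||x|| = 1.
    return max(0.0, np.linalg.norm(x) - 1.0)

x = np.array([1.0, 0.0])            # a point on the unit sphere
h_tangent = np.array([0.0, 1.0])    # h ∈ T_x (tangent to the sphere at x)
h_normal  = np.array([1.0, 0.0])    # h ∈ T_x^⊥

ts = np.linspace(-0.1, 0.1, 5)
print([J(x + t * h_tangent) for t in ts])  # smooth in t: sqrt(1 + t^2) - 1
print([J(x + t * h_normal) for t in ts])   # kinked at t = 0: max(0, t)
```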
  27. Examples of Partly-Smooth Regularizers. ℓ1 sparsity: J(x) = ||x||_1, M_x = T_x = {z ; supp(z) ⊆ supp(x)}.
  28. Structured sparsity: J(x) = Σ_b ||x_b||, same form of M_x (illustrated with J(x) = |x_1| + ||x_{2,3}||).
  29. Nuclear norm: J(x) = ||x||_*, M_x = {z ; rank(z) = rank(x)}.
  30. Anti-sparsity: J(x) = ||x||_∞, I = {i ; |x_i| = ||x||_∞}, M_x = T_x = {z ; z_I ∝ x_I}.
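For reference, these four regularizers are one-liners numerically; the grouping and the test data below are arbitrary:

```python
import numpy as np

def l1(x):                return np.sum(np.abs(x))                              # sparsity
def group_l1(x, groups):  return sum(np.linalg.norm(x[g]) for g in groups)      # structured sparsity
def nuclear(X):           return np.sum(np.linalg.svd(X, compute_uv=False))     # low rank
def linf(x):              return np.max(np.abs(x))                              # anti-sparsity

x = np.array([1.5, 0.0, -2.0, 0.0])
print(l1(x), group_l1(x, [[0, 1], [2, 3]]), linf(x))
print(nuclear(np.outer(x, x)))   # rank-1 matrix x x^T: nuclear norm = ||x||_2^2
```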
  31. Overview: • Low-complexity Convex Regularization • Performance Guarantees: L2 Error • Performance Guarantees: Model Consistency.
  32. Dual Certificates. Noiseless recovery: min_{Φx = Φx0} J(x). (P0)
  33. Dual certificates: D(x0) = Im(Φ*) ∩ ∂J(x0). Proposition: x0 is a solution of (P0) ⟺ ∃ η ∈ D(x0).
  34. Example: J(x) = ||x||_1: D(x0) = {η = Φ*q ; η_I = sign(x_{0,I}), ||η||_∞ ≤ 1}, where I = supp(x0).
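A minimal sketch of solving (P0) for J = ||·||_1 via the standard linear-programming split (function name and toy sizes are ours; SciPy's linprog is one possible solver):

```python
import numpy as np
from scipy.optimize import linprog

def l1_noiseless_recovery(Phi, y):
    """Solve (P0): min ||x||_1 s.t. Phi x = y, with the split x = x+ - x-."""
    P, N = Phi.shape
    c = np.ones(2 * N)                 # minimize sum(x+) + sum(x-)
    A_eq = np.hstack([Phi, -Phi])      # Phi (x+ - x-) = y
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    z = res.x
    return z[:N] - z[N:]

# Toy check of the proposition: with enough Gaussian measurements,
# (P0) recovers x0 exactly (equivalently, a certificate eta ∈ D(x0) exists).
rng = np.random.default_rng(0)
P, N, s = 40, 100, 5
Phi = rng.standard_normal((P, N))
x0 = np.zeros(N); x0[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
x_rec = l1_noiseless_recovery(Phi, Phi @ x0)
print(np.linalg.norm(x_rec - x0))      # typically ≈ 0 for these sizes
```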
  35. Dual Certificates and L2 Stability. Non-degenerate dual certificate: D̄(x0) = Im(Φ*) ∩ ri(∂J(x0)), where ri(E) = relative interior of E = interior for the topology of aff(E).
  36. Theorem [Fadili et al. 2013]: if ∃ η ∈ D̄(x0), then for λ ∼ ||w|| one has ||x⋆ − x0|| = O(||w||).
  37. Related results: [Grasmair, Haltmeier, Scherzer 2010]: J = ||·||_1; [Grasmair 2012]: J(x⋆ − x0) = O(||w||).
  38. → The constants depend on N . . .
  39. Compressed Sensing Setting. Random matrix: Φ ∈ R^{P×N}, Φ_{i,j} ∼ N(0,1) i.i.d.
  40. Sparse vectors: J = ||·||_1, s = ||x0||_0. Theorem [Rudelson, Vershynin 2006], [Chandrasekaran et al. 2011]: if P ≥ 2 s log(N/s), then ∃ η ∈ D̄(x0) with high probability on Φ.
  41. Low-rank matrices: J = ||·||_*, x0 ∈ R^{N1×N2}, r = rank(x0). Theorem [Chandrasekaran et al. 2011]: if P ≥ 3 r (N1 + N2 − r), then ∃ η ∈ D̄(x0) with high probability on Φ.
  42. → Similar results for ||·||_{1,2} and ||·||_∞.
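To get a feel for these sample-complexity bounds, here is the arithmetic for arbitrary example sizes (the numbers are illustrative only):

```python
import numpy as np

N, s = 10_000, 100                       # ambient dimension, sparsity
P_l1 = 2 * s * np.log(N / s)             # sparse vectors, J = ||.||_1
print(round(P_l1))                       # ≈ 921 measurements

N1, N2, r = 100, 100, 5                  # matrix size, rank
P_nuc = 3 * r * (N1 + N2 - r)            # low-rank matrices, J = ||.||_*
print(P_nuc)                             # 2925 measurements
```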
  43. Phase transitions, from [Amelunxen et al. 2013] ("The Geometry of Phase Transitions in Convex Optimization"): empirical probability of exact recovery as a function of P/N, against s/N for J = ||·||_1 (recovery of sparse vectors x0 ∈ R^100 from random linear measurements) and against r/√N for J = ||·||_*.
  44. Overview: • Low-complexity Convex Regularization • Performance Guarantees: L2 Error • Performance Guarantees: Model Consistency.
  45. Minimal Norm Certificate. Minimal-norm certificate: η0 = argmin_{η = Φ*q ∈ ∂J(x0)} ||q||.
  46. ∂J(x0) ⊂ A(x0) = AffHull(∂J(x0)). (Illustration for J = ||·||_1: ∂J(x0), A(x0), T = T_{x0}.)
  47. Linearized pre-certificate: η_F = argmin_{η = Φ*q ∈ A(x0)} ||q||.
  48. → η_F is computed by solving a linear system. → One does not always have η_F ∈ D(x0)!
  49. Theorem: if ker(Φ) ∩ T = {0}, then η_F ∈ D̄(x0) ⟹ η_F = η0, and η0 ∈ D̄(x0) ⟹ η_F = η0.
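For J = ||·||_1 this linear system has a closed form (the classical Fuchs pre-certificate); a sketch with our own helper name, assuming Φ restricted to the support has full column rank:

```python
import numpy as np

def linearized_precertificate(Phi, x0):
    """eta_F = Phi^T q with q of minimal norm such that (Phi^T q)_I = sign(x0_I),
    i.e. q = Phi_I (Phi_I^T Phi_I)^{-1} sign(x0_I); assumes ker(Phi) ∩ T = {0}."""
    I = np.flatnonzero(x0)
    Phi_I = Phi[:, I]
    q = Phi_I @ np.linalg.solve(Phi_I.T @ Phi_I, np.sign(x0[I]))
    return Phi.T @ q

# eta_F is non-degenerate (eta_F ∈ D̄(x0)) when all off-support entries satisfy |eta_F[j]| < 1:
#   Ic = np.setdiff1d(np.arange(Phi.shape[1]), np.flatnonzero(x0))
#   nondegenerate = np.max(np.abs(eta_F[Ic])) < 1
```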
  50. Model Stability. Theorem [Vaiter et al. 2014]: if η_F ∈ D̄(x0), there exists C such that if max(λ, ||w||/λ) ≤ C, the unique solution x⋆ of P_λ(y) for y = Φx0 + w satisfies x⋆ ∈ M_{x0} and ||x⋆ − x0|| = O(||w||, λ).
  51. Previous works: [Fuchs 2004]: J = ||·||_1; [Bach 2008]: J = ||·||_{1,2} and J = ||·||_*; [Vaiter et al. 2011]: J = ||D* ·||_1.
  52. Compressed Sensing Setting. Random matrix: Φ ∈ R^{P×N}, Φ_{i,j} ∼ N(0,1) i.i.d. Sparse vectors: J = ||·||_1, s = ||x0||_0. Theorem [Wainwright 2009], [Dossal et al. 2011]: if P ≥ 2 s log(N), then η0 ∈ D̄(x0) with high probability on Φ.
  53. Phase transitions: L2 stability at P ∼ 2 s log(N/s) vs. model stability at P ∼ 2 s log(N).
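The gap between the two regimes is just the log factor; a quick numerical comparison with arbitrary example sizes:

```python
import numpy as np

N, s = 10_000, 100
print(round(2 * s * np.log(N / s)))   # ≈  921 measurements: L2-stability regime
print(round(2 * s * np.log(N)))       # ≈ 1842 measurements: model-stability (support recovery) regime
```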
  54. → Similar results for ||·||_{1,2}, ||·||_*, ||·||_∞.
  55. → Not using RIP techniques (non-uniform result on x0).
  56. 1-D Sparse Spikes Deconvolution. Φx = Σ_i x_i φ(· − Δ i), J(x) = ||x||_1. Increasing Δ: reduces correlation, but reduces resolution.
  57. Support recovery: with I = {j ; x0(j) ≠ 0}, ||η_{F,I^c}||_∞ < 1 ⟺ η0 = η_F ∈ D̄(x0) ⟺ stable support recovery.
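A rough way to reproduce this kind of experiment numerically, assuming a Gaussian point-spread function on a periodic grid (our choice of filter, sizes and spike pattern, not necessarily the talk's):

```python
import numpy as np

def eta_F_linf_off_support(N, spacing, sigma):
    """Build a circular Gaussian convolution operator Phi, place two unit spikes
    `spacing` samples apart, and return ||eta_F||_inf off the support;
    values < 1 indicate that the support is stably recovered by l1 deconvolution."""
    t = np.arange(N)
    d = np.minimum(t, N - t)                                     # circular distances
    h = np.exp(-d**2 / (2 * sigma**2))                           # Gaussian filter
    Phi = np.stack([np.roll(h, i) for i in range(N)], axis=1)    # columns = shifted filters
    x0 = np.zeros(N); x0[N // 2] = 1.0; x0[N // 2 + spacing] = 1.0
    I = np.flatnonzero(x0)
    Phi_I = Phi[:, I]
    q = Phi_I @ np.linalg.solve(Phi_I.T @ Phi_I, np.sign(x0[I])) # Fuchs pre-certificate
    eta_F = Phi.T @ q
    Ic = np.setdiff1d(np.arange(N), I)
    return np.max(np.abs(eta_F[Ic]))

for spacing in (2, 4, 8, 16):
    print(spacing, eta_F_linf_off_support(N=128, spacing=spacing, sigma=3.0))
```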
  58. Conclusion. Partial smoothness: encodes models using singularities.
  59. Performance measures: L2 error vs. model consistency → different CS guarantees.
  60. Specific certificates: η0, η_F, . . .
  61. Open problems: CS performance with arbitrary gauges; infinite-dimensional regularizations (BV, . . . ); convergence discrete → continuous.