Inverse Problem Regularization

Observations: $y = \Phi x_0 + w \in \mathbb{R}^P$.
Estimator: $x(y)$ depends only on the observations $y$ and a parameter $\lambda$.
Example: variational methods
$$x_\lambda(y) \in \operatorname*{argmin}_{x \in \mathbb{R}^N} \ \tfrac{1}{2}\|y - \Phi x\|^2 + \lambda J(x) \qquad \text{(data fidelity + regularity)}$$
No noise: $\lambda \to 0^+$, minimize $x^\star(y) \in \operatorname*{argmin}_{\Phi x = y} J(x)$.
Choice of $\lambda$: tradeoff with the noise level $\|w\|$.
Performance analysis: criteria on $(x_0, \|w\|, \lambda)$ ensuring $\|x_\lambda(y) - x_0\| = O(\|w\|)$ ($L^2$ stability) and model stability.
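As a concrete illustration of the variational estimator (this sketch is an addition, not part of the slides), the following code solves the problem above for $J = \|\cdot\|_1$ with a basic proximal-gradient (ISTA) loop; the dimensions, the random Gaussian $\Phi$ and the value of $\lambda$ are arbitrary illustrative choices.

```python
import numpy as np

def ista_l1(Phi, y, lam, n_iter=2000):
    """Minimize 0.5*||y - Phi x||^2 + lam*||x||_1 by proximal gradient (ISTA)."""
    L = np.linalg.norm(Phi, 2) ** 2                 # Lipschitz constant of the smooth part
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ x - y)                # gradient of the data-fidelity term
        z = x - grad / L                            # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-thresholding = prox of lam*||.||_1
    return x

# Toy inverse problem y = Phi x0 + w (all sizes and lam are illustrative choices).
rng = np.random.default_rng(0)
N, P, s = 200, 80, 5
Phi = rng.standard_normal((P, N))
x0 = np.zeros(N); x0[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
w = 0.01 * rng.standard_normal(P)
y = Phi @ x0 + w
x_hat = ista_l1(Phi, y, lam=0.05)
print(np.linalg.norm(x_hat - x0))                   # error of the order of the noise level ||w||
```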
Union of Models for Data Processing

Synthesis sparsity: $x$ has few nonzero coefficients.
Structured sparsity: the nonzero coefficients of $x$ come in blocks.
Analysis sparsity: $D^* x$ is sparse (e.g. $D^* x$ = gradient of an image $x$).
Low-rank: $x_{i,\cdot} = \sum_{j=1}^{r} A_{i,j} S_{j,\cdot}$ with few factors $S_{j,\cdot}$.
Union of models: $\mathcal{M} \subset \mathbb{R}^N$, a union of subspaces or manifolds.
Subdifferentials and Linear Models

Regularizer: $J : \mathbb{R}^N \to \mathbb{R}$ convex.
Sub-differential: $\partial J(x) = \{\eta \;;\; \forall y,\ J(y) \geq J(x) + \langle \eta,\, y - x\rangle\}$.
Example: $J(x) = \|x\|_1 = \sum_i |x_i|$. With $I = \operatorname{supp}(x) = \{i \;;\; x_i \neq 0\}$,
$$\partial \|x\|_1 = \{\eta \;;\; \eta_I = \operatorname{sign}(x_I),\ \forall j \notin I,\ |\eta_j| \leq 1\}.$$
Linear model: $T_x = \operatorname{VectHull}(\partial J(x))^\perp$; for $J = \|\cdot\|_1$, $T_x = \{z \;;\; \operatorname{supp}(z) \subseteq I\}$.
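To make the $\ell^1$ sub-differential concrete, here is a small added sketch (not from the slides) that tests whether a vector $\eta$ belongs to $\partial\|x\|_1$ by checking the sign condition on the support and the bound $|\eta_j| \leq 1$ off the support.

```python
import numpy as np

def in_l1_subdifferential(eta, x, tol=1e-8):
    """Check eta in ∂||x||_1: eta_i = sign(x_i) on the support, |eta_j| <= 1 elsewhere."""
    I = np.abs(x) > tol                              # support of x
    on_support = np.allclose(eta[I], np.sign(x[I]), atol=tol)
    off_support = np.all(np.abs(eta[~I]) <= 1 + tol)
    return on_support and off_support

x = np.array([1.5, 0.0, -2.0, 0.0])
print(in_l1_subdifferential(np.array([1.0, 0.3, -1.0, -0.9]), x))  # True
print(in_l1_subdifferential(np.array([1.0, 1.2, -1.0,  0.0]), x))  # False: |eta_1| > 1 off the support
```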
Partly Smooth Functions

Definition [Lewis 2003]: $J : \mathbb{R}^N \to \mathbb{R}$ is partly smooth at $x$ for a manifold $\mathcal{M}_x$ if
(i) $J$ is $C^2$ along $\mathcal{M}_x$ around $x$;
(ii) $\operatorname{VectHull}(\partial J(x))^\perp = T_x = \operatorname{Tangent}_x(\mathcal{M}_x)$.
(ii) $\iff$ $\forall h \in T_x^\perp$, $t \mapsto J(x + t h)$ is non-smooth at $t = 0$.
Important: in general $\mathcal{M}_x \neq T_x$.
[Figure: $J(x) = \max(0, \|x\| - 1)$, showing the point $x$, the tangent model $T_x$ and the manifold $\mathcal{M}_x$.]
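A quick numerical check of this behaviour for $J(x) = \max(0, \|x\| - 1)$ (an added illustration, not in the original deck): at a point $x$ on the unit sphere $\mathcal{M}_x$, $J$ stays smooth (identically zero) when moving along the sphere, while the one-sided slopes of $t \mapsto J(x + t h)$ at $t = 0$ differ for a normal direction $h \in T_x^\perp$.

```python
import numpy as np

J = lambda x: max(0.0, np.linalg.norm(x) - 1.0)

x = np.array([1.0, 0.0])                      # a point on the manifold M_x = unit sphere
tangent = np.array([0.0, 1.0])                # direction tangent to the sphere at x
normal = np.array([1.0, 0.0])                 # direction normal to the sphere at x

t = 1e-3
# Moving along the sphere (tangent step, then renormalize): J stays ~0, i.e. smooth along M_x.
on_sphere = (x + t * tangent) / np.linalg.norm(x + t * tangent)
print(J(on_sphere))                           # ~0

# Crossing the sphere: the one-sided slopes at t = 0 differ, i.e. non-smooth in normal directions.
right_slope = (J(x + t * normal) - J(x)) / t      # ~1
left_slope = (J(x - t * normal) - J(x)) / (-t)    # ~0
print(right_slope, left_slope)
```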
Examples of Partly-smooth Regularizers

$\ell^1$ sparsity: $J(x) = \|x\|_1$, $\mathcal{M}_x = T_x = \{z \;;\; \operatorname{supp}(z) \subseteq \operatorname{supp}(x)\}$.
Structured sparsity: $J(x) = \sum_b \|x_b\|$, same form of $\mathcal{M}_x$ (with block supports).
Nuclear norm: $J(x) = \|x\|_*$, $\mathcal{M}_x = \{z \;;\; \operatorname{rank}(z) = \operatorname{rank}(x)\}$.
Anti-sparsity: $J(x) = \|x\|_\infty$, $I = \{i \;;\; |x_i| = \|x\|_\infty\}$, $\mathcal{M}_x = T_x = \{z \;;\; z_I \propto x_I\}$.
[Figures: the manifolds $\mathcal{M}_x$ for $J(x) = \|x\|_1$, $J(x) = |x_1| + \|x_{2,3}\|$, $J(x) = \|x\|_*$ and $J(x) = \|x\|_\infty$.]
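Each of these manifolds is parametrized by a simple quantity computed from $x$; the sketch below (added here, with arbitrary example data) extracts it in the four cases: the support for $\ell^1$, the active blocks for structured sparsity, the rank for the nuclear norm, and the saturation set for anti-sparsity.

```python
import numpy as np

x = np.array([0.0, 1.3, 0.0, -0.7, 0.7])

# l1 sparsity: M_x is parametrized by the support of x.
support = np.flatnonzero(np.abs(x) > 1e-12)

# Structured sparsity: M_x is parametrized by the set of active blocks.
blocks = [[0, 1], [2], [3, 4]]
active_blocks = [b for b in blocks if np.linalg.norm(x[b]) > 1e-12]

# Nuclear norm: M_x is parametrized by the rank.
X = np.outer([1.0, 2.0, 3.0], [1.0, -1.0])    # a rank-1 matrix
rank = np.linalg.matrix_rank(X)

# Anti-sparsity: M_x is parametrized by the saturation set I = {i ; |x_i| = ||x||_inf}.
I_sat = np.flatnonzero(np.isclose(np.abs(x), np.abs(x).max()))

print(support, active_blocks, rank, I_sat)
```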
Dual Certificates and L2 Stability

Non-degenerate dual certificate: $\bar{\mathcal{D}}(x_0) = \operatorname{Im}(\Phi^*) \cap \operatorname{ri}(\partial J(x_0))$, where $\operatorname{ri}(E)$ = relative interior of $E$.
Theorem [Fadili et al. 2013]: If $\exists\, \eta \in \bar{\mathcal{D}}(x_0)$, then for $\lambda \sim \|w\|$ one has $\|x^\star - x_0\| = O(\|w\|)$.
[Figure: the affine space $\Phi x = \Phi x_0$ and a certificate $\eta \in \partial J(x_0)$.]
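For $J = \|\cdot\|_1$, a standard candidate element of $\bar{\mathcal{D}}(x_0)$ is the pre-certificate $\eta_F = \Phi^* \Phi_I (\Phi_I^* \Phi_I)^{-1} \operatorname{sign}(x_{0,I})$ (often called the linearized or Fuchs pre-certificate): it lies in $\operatorname{Im}(\Phi^*)$ and equals $\operatorname{sign}(x_0)$ on the support $I$, so it is non-degenerate exactly when $|\eta_{F,j}| < 1$ for all $j \notin I$. The sketch below (an addition to the slides, with illustrative dimensions) builds it and tests this condition.

```python
import numpy as np

def l1_precertificate(Phi, x0, tol=1e-12):
    """Pre-certificate eta_F = Phi^T Phi_I (Phi_I^T Phi_I)^{-1} sign(x0_I).
    It lies in Im(Phi^*) with eta_I = sign(x0_I); it is a non-degenerate dual
    certificate whenever max_{j not in I} |eta_j| < 1 (needs Phi_I injective)."""
    I = np.flatnonzero(np.abs(x0) > tol)
    Phi_I = Phi[:, I]
    q = Phi_I @ np.linalg.solve(Phi_I.T @ Phi_I, np.sign(x0[I]))
    eta = Phi.T @ q
    off = np.setdiff1d(np.arange(Phi.shape[1]), I)
    return eta, np.max(np.abs(eta[off])) < 1.0

rng = np.random.default_rng(1)
N, P, s = 200, 80, 5
Phi = rng.standard_normal((P, N))
x0 = np.zeros(N); x0[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
eta, nondegenerate = l1_precertificate(Phi, x0)
print(nondegenerate)   # typically True here, since P is well above 2 s log(N)
```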
Compressed Sensing Setting

Theorem: Random matrix $\Phi \in \mathbb{R}^{P \times N}$, $\Phi_{i,j} \sim \mathcal{N}(0,1)$ i.i.d.
Sparse vectors: $J = \|\cdot\|_1$. Let $s = \|x_0\|_0$. If $P \geq 2 s \log(N/s)$ [Rudelson, Vershynin 2006] [Chandrasekaran et al. 2011],
Low-rank matrices: $J = \|\cdot\|_*$, $x_0 \in \mathbb{R}^{N_1 \times N_2}$. Let $r = \operatorname{rank}(x_0)$. If $P \geq 3 r (N_1 + N_2 - r)$ [Chandrasekaran et al. 2011],
then $\exists\, \eta \in \bar{\mathcal{D}}(x_0)$ with high probability on $\Phi$.
→ Similar results hold for $\|\cdot\|_{1,2}$ and $\|\cdot\|_\infty$.
[Figure from "The Geometry of Phase Transitions in Convex Optimization": phase transitions for linear inverse problems. Left panel: $J = \|\cdot\|_1$, empirical probability that $\ell^1$ minimization recovers a sparse $x_0 \in \mathbb{R}^{100}$ from random linear measurements, plotted against $s/N$ and $P/N$. Right panel: $J = \|\cdot\|_*$, the analogous transition plotted against $r/\sqrt{N}$ and $P/N$.]
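This transition can be reproduced empirically. The sketch below (an added, much coarser version of such an experiment) estimates, for a few sample counts $P$, the probability that noiseless $\ell^1$ minimization $\min_{\Phi x = y} \|x\|_1$ recovers an $s$-sparse $x_0$, solving the minimization as a linear program; all dimensions and trial counts are arbitrary.

```python
import numpy as np
from scipy.optimize import linprog

def l1_recover(Phi, y):
    """Solve min ||x||_1 s.t. Phi x = y as a linear program (split x = u - v, u, v >= 0)."""
    P, N = Phi.shape
    c = np.ones(2 * N)
    A_eq = np.hstack([Phi, -Phi])
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    return res.x[:N] - res.x[N:]

rng = np.random.default_rng(0)
N, s, trials = 100, 8, 20
for P in (20, 40, 60, 80):
    successes = 0
    for _ in range(trials):
        Phi = rng.standard_normal((P, N))
        x0 = np.zeros(N); x0[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
        x_rec = l1_recover(Phi, Phi @ x0)
        successes += np.linalg.norm(x_rec - x0) < 1e-5
    print(P, successes / trials)   # recovery probability jumps from ~0 to ~1 as P grows
```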
Model Stability

Theorem [Vaiter et al. 2014]: If $\eta_F \in \bar{\mathcal{D}}(x_0)$, there exists $C$ such that if $\max(\lambda, \|w\|/\lambda) \leq C$, the unique solution $x^\star$ of $\mathcal{P}_\lambda(y)$ for $y = \Phi x_0 + w$ satisfies
$$x^\star \in \mathcal{M}_{x_0} \quad \text{and} \quad \|x^\star - x_0\| = O(\|w\|, \lambda).$$
Previous works: [Vaiter et al. 2011] for $J = \|D^* \cdot\|_1$, and analogous results for $J = \|\cdot\|_{1,2}$ and $J = \|\cdot\|_*$.
Compressed Sensing Setting

Theorem: Random matrix $\Phi \in \mathbb{R}^{P \times N}$, $\Phi_{i,j} \sim \mathcal{N}(0,1)$ i.i.d.
Sparse vectors: $J = \|\cdot\|_1$. Let $s = \|x_0\|_0$. If $P \geq 2 s \log(N)$ [Dossal et al. 2011] [Wainwright 2009],
then $\eta_0 \in \bar{\mathcal{D}}(x_0)$ with high probability on $\Phi$.
Phase transitions: $P \sim 2 s \log(N/s)$ for $L^2$ stability vs. $P \sim 2 s \log(N)$ for model stability.
→ Not using RIP techniques (non-uniform result on $x_0$).
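To get a rough feel for the gap between the two sample-count regimes, the following added sketch re-uses the $\ell^1$ pre-certificate check from the dual-certificate slide and estimates how often it is non-degenerate (the condition driving model stability) at $P \approx 2 s \log(N/s)$ versus $P \approx 2 s \log(N)$; the dimensions and trial counts are arbitrary, and the output is only indicative.

```python
import numpy as np

def precert_nondegenerate(Phi, x0, tol=1e-12):
    """True if the l1 pre-certificate is non-degenerate (|eta_j| < 1 off the support)."""
    I = np.flatnonzero(np.abs(x0) > tol)
    Phi_I = Phi[:, I]
    q = Phi_I @ np.linalg.solve(Phi_I.T @ Phi_I, np.sign(x0[I]))
    eta = Phi.T @ q
    off = np.setdiff1d(np.arange(Phi.shape[1]), I)
    return np.max(np.abs(eta[off])) < 1.0

rng = np.random.default_rng(0)
N, s, trials = 1000, 10, 50
for label, P in (("2 s log(N/s)", int(2 * s * np.log(N / s))),
                 ("2 s log(N)  ", int(2 * s * np.log(N)))):
    hits = 0
    for _ in range(trials):
        Phi = rng.standard_normal((P, N))
        x0 = np.zeros(N); x0[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
        hits += precert_nondegenerate(Phi, x0)
    print(label, P, hits / trials)   # non-degeneracy frequency should increase with P
```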