Robust Sparse Analysis Recovery

Samuel Vaiter
September 12, 2011

GT Image (MSc defense), Paris-Dauphine, Paris, September 2011

Transcript

  1. Robust Sparse Analysis Recovery Samuel Vaiter

  2. Inverse Problems

  3. Inverse Problems Several problems: inpainting, super-resolution

  4. Inverse Problems: ill-posed. Linear hypothesis, one model: y = Φx0 + w (observations = operator applied to the unknown signal, plus noise). Several problems: inpainting, super-resolution.

  5. Regularization: x* ∈ argmin_{x ∈ R^N} 1/2 ||y − Φx||_2^2 + λ J(x)

  6. Noiseless (λ → 0): x* ∈ argmin_{Φx = y} J(x)

  7. Image Priors Sobolev: J(x) = 1/2 ∫ ||∇x||^2

  8. Total variation: J(x) = ∫ ||∇x||

  9. Wavelet sparsity (ideal prior): J(x) = |{i : ⟨x, ψ_i⟩ ≠ 0}|

  10. Overview • Analysis vs. Synthesis Regularization • Local Parameterization of

    Analysis Regularization • Identifiability and Stability • Numerical Evaluation • Perspectives
  11. Dictionary Redundant dictionary of R^N: {d_i}_{i=0}^{P−1}, P ≥ N

  12. Example: identity Id

  13. Example: shift-invariant wavelet frame

  14. Example: finite difference operator Ω_DIF, the bidiagonal matrix with −1 on the diagonal and +1 on the superdiagonal:
      ( −1 +1          )
      (    −1 +1       )
      (       ⋱  ⋱     )
      (          −1 +1 )

  15. Example: fused lasso, D = [Ω_DIF, ε Id]

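The two operators above are easy to materialize. A minimal numpy sketch (the function names and the value of ε are illustrative choices, not from the slides; both functions return the analysis operator, i.e. D* in the notation of the talk):

    import numpy as np

    def finite_difference_operator(n):
        # (n-1) x n forward differences: -1 on the diagonal, +1 to its right
        D = np.zeros((n - 1, n))
        np.fill_diagonal(D, -1.0)
        np.fill_diagonal(D[:, 1:], 1.0)
        return D

    def fused_lasso_operator(n, eps=0.5):
        # stack the finite differences on top of a scaled identity
        return np.vstack([finite_difference_operator(n), eps * np.eye(n)])

    x = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 0.0])
    print(finite_difference_operator(6) @ x)  # nonzero only at the two jumps
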
  16. Analysis versus Synthesis Two points of view

  17. "Generate" x. Synthesis: x = Dα, mapping α ∈ R^P to x ∈ R^N; the representation is non-unique if P > N

  18. "Analyze" x. Analysis: α = D*x. Generate OR analyze?

  19. A Bird's Eye View of Sparsity "Ideal" sparsity prior: J0(α) = |{i : α_i ≠ 0}|

  20. ℓ0 minimization is NP-hard

  21. ℓq prior: Jq(α) = Σ_i |α_i|^q, convex (norms) for q ≥ 1

  22. [Figure: ℓq balls in the (d0, d1) plane for q = 0, 0.5, 1, 1.5, 2]

  23. ℓ1 norm: convexification of the ℓ0 prior

  24. Sparse Regularizations Synthesis: α* ∈ argmin_{α ∈ R^Q} 1/2 ||y − ΦDα||_2^2 + λ||α||_1, with x = Dα

  25. Analysis: x* ∈ argmin_{x ∈ R^N} 1/2 ||y − Φx||_2^2 + λ||D*x||_1

  26.–27. [Diagram: synthesis generates x = Dα from sparse coefficients α ≠ 0; analysis computes correlations α = D*x]

  28. Synthesis: sparse approximation of x* in D

  29. Analysis: the correlations of x* with D are sparse

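To make the distinction concrete, here is a small sketch evaluating both objectives on random data (all sizes, Phi, D and lam are placeholder choices):

    import numpy as np

    rng = np.random.default_rng(0)
    N, P, Q = 8, 12, 6
    Phi = rng.standard_normal((Q, N))   # measurement operator
    D = rng.standard_normal((N, P))     # redundant dictionary, P > N
    y = rng.standard_normal(Q)
    lam = 0.1

    def synthesis_objective(alpha):
        return 0.5 * np.sum((y - Phi @ D @ alpha) ** 2) + lam * np.abs(alpha).sum()

    def analysis_objective(x):
        return 0.5 * np.sum((y - Phi @ x) ** 2) + lam * np.abs(D.T @ x).sum()

    alpha = np.zeros(P); alpha[3] = 1.0
    # same candidate signal x = D alpha, but the two penalties differ:
    print(synthesis_objective(alpha), analysis_objective(D @ alpha))
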
  30. Support and Signal Model P(y, λ): x* ∈ argmin_{x ∈ R^N} 1/2 ||y − Φx||_2^2 + λ||D*x||_1; support I = supp(D*x*), J = I^c

  31. Definition: G_J = Ker D*_J

  32. x* ∈ G_J. Signal model ("union of subspaces"): Θ = ∪_{k ∈ {1…P}} Θ_k, where Θ_k = {G_J : dim G_J = k}

  33. Hypothesis: Ker Φ ∩ G_J = {0}

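A numpy sketch of these objects: the support I, the cosupport J, and an orthonormal basis of G_J = Ker D*_J via an SVD (the function names and the tolerance are assumptions):

    import numpy as np

    def cosupport(D, x, tol=1e-10):
        corr = D.T @ x
        I = np.flatnonzero(np.abs(corr) > tol)   # support of D* x
        J = np.flatnonzero(np.abs(corr) <= tol)  # cosupport
        return I, J

    def GJ_basis(D, J, tol=1e-10):
        # orthonormal basis of G_J = {x : <d_j, x> = 0 for all j in J}
        DJt = D[:, J].T
        _, s, Vt = np.linalg.svd(DJt, full_matrices=True)
        rank = int(np.sum(s > tol))
        return Vt[rank:].T                       # columns span Ker D*_J
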
  34. Examples of Signal Model Identity: Θ_k = k-sparse signals

  35. Shift-invariant wavelet frame [figure, credit: G. Peyré]

  36. Finite difference operator: Θ_k = piecewise constant signals with k − 1 steps [figure: levels a1, a2, a3]

  37. Fused lasso: Θ_k = sums of k interval characteristic functions [figure: values a1 … a8]

  38. Remember! Synthesis: α* = argmin_{α ∈ R^Q} 1/2 ||y − ΦDα||_2^2 + λ||α||_1. Analysis P(y, λ): x* = argmin_{x ∈ R^N} 1/2 ||y − Φx||_2^2 + λ||D*x||_1. Noiseless (λ → 0), P(y, 0): x* = argmin_{Φx = y} ||D*x||_1

  39. Toward a Better Understanding Local behavior? Properties of x*, solution of P(y, λ), as a function of y

  40. Noiseless identifiability? Is x0 the unique solution of P(Φx0, 0)?

  41. Noise robustness? What can we say about ||x* − x0|| for noisy observations?

  42. From Synthesis to Analysis Results Previous works in synthesis: [Fuchs, Tropp, Dossal] address these questions

  43. Similar problem, but many more difficulties in analysis. What is the geometry of the problem?

  44.–51. From Synthesis to Analysis Results [Figure build, synthesis case: atoms d1, d2; subspaces G1, G2; the ℓ1 ball ||α||_1 = 1 grows until it touches the constraint set y = Φα; the contact point α* is the sparsest solution]

  52.–58. From Synthesis to Analysis Results [Figure build, analysis case: atoms d1, d2, d3; subspaces G1, G2, G3; the ball ||D*x||_1 = 1 grows until it touches the constraint set y = Φx, giving the solution x*]

  59. Overview • Analysis vs. Synthesis Regularization • Local Parameterization of

    Analysis Regularization • Identifiability and Stability • Numerical Evaluation • Perspectives
  60. Analysis is Piecewise Affine Main idea: G_J is stable, i.e. the solutions of P(y, λ) and P(y + ε, λ) live in the same G_J

  61.–63. [Figure: subspaces G_J, G_J′ and the constraint sets y = Φx and y + ε = Φx]

  64. Affine function: ȳ ↦ x(ȳ) = AΦ*ȳ − λ A D_I s

  65. Theorem 1. Except for y ∈ H, if ȳ is close enough to y, then x(ȳ) is a solution of P(ȳ, λ).

  66. (H: definition in a few minutes)

  67. Sketch of the Proof Problem: Lasso P(y, λ): x* ∈ argmin_{x ∈ R^N} 1/2 ||y − Φx||_2^2 + λ||D*x||_1

  68. Support: I = supp(D*x*), J = I^c

  69. Subspace of analysis: G_J = Ker D*_J

  70. Hypothesis: Ker Φ ∩ G_J = {0}

  71. I, J, s = sign(D*_I x*) are fixed by x*; we fix the observations y

  72. First-Order Conditions P(y, λ): x* ∈ argmin_{x ∈ R^N} 1/2 ||y − Φx||_2^2 + λ||D*x||_1

  73. Non-differentiable problem: x* is a minimizer of P(y, λ) if, and only if, 0 ∈ ∂f(x*)

  74. First-order conditions of the Lasso: x* is a solution of P(y, λ) ⟺ ∃σ ∈ Σ_y(x*) with ||σ||_∞ ≤ 1, where Σ_y(x) = {σ ∈ R^{|J|} : Φ*(Φx − y) + λ D_I s + λ D_J σ = 0} (gradient plus subdifferential)

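This optimality condition can be tested numerically. A sketch (Phi, D, y, lam and the candidate x are assumed given; lstsq returns only one σ solving the linear system, so ||σ||_∞ ≤ 1 for that σ is a sufficient certificate, not the minimal one):

    import numpy as np

    def certificate(Phi, D, y, x, lam, tol=1e-8):
        corr = D.T @ x
        I = np.flatnonzero(np.abs(corr) > tol)
        J = np.flatnonzero(np.abs(corr) <= tol)
        s = np.sign(corr[I])
        rhs = -(Phi.T @ (Phi @ x - y) + lam * (D[:, I] @ s)) / lam
        sigma, *_ = np.linalg.lstsq(D[:, J], rhs, rcond=None)
        return sigma  # optimality holds if some valid sigma has ||sigma||_inf <= 1
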
  75. A Solution of the Lasso x(y) ∈ argmin_{x ∈ G_J} 1/2 ||y − Φx||_2^2 + λ||D*x||_1

  76. How to make a solution explicit?

  77. Φ*Φ x(y) = Φ*y − λ D_I s − λ D_J σ

  78. Φ*Φ is non-invertible…

  79. …so define AΦ*, the inverse of Φ on G_J: (AΦ*)|_{Φ(G_J)} = (Φ|_{G_J})^{−1} and (AΦ*)|_{Φ(G_J)^⊥} = 0 [diagram: AΦ*: R^Q → R^N]

  80. x(y) = AΦ*y − λ A D_I s − λ A D_J σ

  81. …and λ A D_J σ = 0 (since x(y) ∈ G_J), hence x(y) = AΦ*y − λ A D_I s

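A sketch of this closed form, with A = U (U* Φ*Φ U)^{-1} U* built from an orthonormal basis U of G_J, as in the appendix slides (the function name is an assumption):

    import numpy as np

    def analysis_solution(Phi, D, U, y, lam, s, I):
        # U: columns form an orthonormal basis of G_J
        M = U.T @ Phi.T @ Phi @ U         # invertible when Ker Phi ∩ G_J = {0}
        A = U @ np.linalg.solve(M, U.T)   # A = U (U* Phi* Phi U)^{-1} U*
        return A @ (Phi.T @ y) - lam * (A @ (D[:, I] @ s))
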
  82. Transition Space H = {y ∈ R^Q : ∃x ∈ R^N, min_{σ ∈ Σ_y(x)} ||σ||_∞ = 1}

  83. H: saturation of the first-order conditions → "jump" from G_J to G_J′

  84. [Figure: the constraint set Φx = y crossing H between G_J and G_J′]

  85. Open question: what is the smallest union of subspaces containing H?

  86. End of the Proof Consider x(y) as a mapping of the observations: ȳ ↦ x(ȳ) = AΦ*ȳ − λ A D_I s

  87. Fix ȳ close enough to y so that sign(D*x(y)) = sign(D*x(ȳ)) (sign stability)

  88. Check that x(ȳ) is indeed a solution of P(ȳ, λ), using the first-order conditions

  89. Remember! x(ȳ) = AΦ*ȳ − λ A D_I s

  90. AΦ*: inverse of Φ on G_J

  91.–93. y ↦ x*(y) is: continuous, and locally affine (property given by sign stability)

  94. Useful for: robustness study, SURE denoising risk estimation, inverse problems on x

  95. Overview • Analysis vs. Synthesis Regularization • Local Parameterization of

    Analysis Regularization • Identifiability and Stability • Numerical Evaluation • Perspectives
  96. Identifiability Identifiability: x0 is the unique solution of P(Φx0, 0). Does {x0} = argmin_{Φx = Φx0} ||D*x||_1 hold?

  97. Strategy: P(y, λ) is almost P(y, 0) for small values of λ

  98. Assumption: G_J must be stable for small values of λ → a restrictive condition, but it gives a stability result for small noise

  99. Noiseless and Sign Criterion Ω = D_J^+ (Φ*ΦA − Id) D_I, F(s) = min_{w ∈ Ker D_J} ||Ωs − w||_∞: an algebraic criterion on the sign vector

  100. (convex → computable)

  101. Theorem 2. Let x0 ∈ R^N be a fixed vector, and J = I^c where I = I(D*x0). Suppose that Ker Φ ∩ G_J = {0}. If F(sign(D*_I x0)) < 1, then x0 is identifiable.

  102. Specializes to Fuchs' result for synthesis (D = Id)

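Since F is convex, it can be evaluated exactly as a small linear program. A sketch with scipy (Z is any basis of Ker D_J, e.g. computed by SVD as earlier; the epigraph formulation below is one standard way to minimize an ℓ∞ norm):

    import numpy as np
    from scipy.optimize import linprog

    def F_criterion(Omega, Z, s):
        r = Omega @ s                     # lives in R^{|J|}
        m, k = Z.shape
        # variables: v in R^k (w = Z v) and t = ||Omega s - Z v||_inf
        c = np.zeros(k + 1); c[-1] = 1.0
        A_ub = np.block([[-Z, -np.ones((m, 1))],
                         [ Z, -np.ones((m, 1))]])
        b_ub = np.concatenate([-r, r])
        bounds = [(None, None)] * k + [(0, None)]
        return linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds).fun
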
  103. Nam et al. Results The only other work on analysis recovery [Nam 2011]: the "cosparse" model

  104. G(s) = ||(M D_J)^+ M D_I s||_∞, where M* is an orthonormal basis of Ker Φ

  105. Theorem. Let x0 ∈ R^N be a fixed vector, and J = I^c where I = I(D*x0). Suppose that Ker Φ ∩ G_J = {0}. If G(sign(D*_I x0)) < 1, then x0 is identifiable.

  106. A more intrinsic criterion, but no noise robustness, even for small noise

  107. Sketch of the Proof Idea: study P(y, λ) for λ ≈ 0, with x_λ(Φx0) = AΦ*Φx0 − λ A D_I s

  108. Take λ small enough to have sign(D*x_λ(Φx0)) = sign(D*x0)

  109. lim_{λ→0} x_λ(Φx0) = AΦ*Φx0 = x0

  110. x_λ(Φx0) is a solution of P(λ), and x_λ(Φx0) → x0(Φx0) as λ → 0, where x0(Φx0) is a solution of P(0)

  111. F(sign(D*x_λ(Φx0))) < 1 ⇒ x_λ(Φx0) is the unique solution

  112. Small Noise Recovery Suppose we observe y = Φx0 + w. Does argmin_{Φx = y} ||D*x||_1 recover x0 + AΦ*w?

  113. Generalization of Theorem 2: yes, if ||w|| is small enough. Condition: sign(D*x_λ(y)) = sign(D*x0)

  114. F(sign(D*x0)) < 1 gives: identifiability, and small-noise robustness

  115. Question: and for an arbitrary noise?

  116. Noisy and Support Criterion Setting: y = Φx0 + w, with w a bounded noise

  117. From identifiability of the vector to identifiability of the support

  118. ARC(I) = max_{x ∈ G_J} F(sign(D*_I x))

  119. Theorem 3. Suppose that ARC(I) < 1 and λ > K ||w|| / (1 − ARC(I)). Then x_λ(y) is the unique solution of P(y, λ), and ||x_λ(y) − x0|| = O(λ).

  120. Remember! Noiseless / sign: F(s) = min_{w ∈ Ker D_J} ||Ωs − w||_∞ → vector identifiability. Noisy / support: ARC(I) = max_{x ∈ G_J} F(sign(D*_I x)) → support identifiability

  121. From Theory to Numerics We give a sufficient condition for identifiability.

  122. How far are we from a necessary condition?

  123. Overview • Analysis vs. Synthesis Regularization • Local Parameterization of

    Analysis Regularization • Identifiability and Stability • Numerical Evaluation • Perspectives
  124. Proximal Operator Let f be an l.s.c. convex function from a convex subset C of a Hilbert space H to R.

  125. Proximal operator: prox_f(x) = argmin_{u ∈ R^N} f(u) + 1/2 ||u − x||_2^2

  126. Fundamental examples: prox_{ι_C} = P_C (projection onto C), prox_{λ||·||_1} = S_λ (soft thresholding)

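Both examples are one-liners in numpy (a box [lo, hi]^N is an illustrative choice for the convex set C):

    import numpy as np

    def prox_l1(x, lam):
        # soft thresholding: prox of lam * ||.||_1
        return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

    def proj_box(x, lo, hi):
        # prox of the indicator of C = [lo, hi]^N is the projection onto C
        return np.clip(x, lo, hi)
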
  127. How to Solve These Regularizations? Primal-dual schemes: min_{x ∈ R^N} L(K(x)) where L(g, u) = 1/2 ||y − g||^2 + λ||u||_1 and K(x) = (Φx, D*x)

  128. Alternating Direction Method of Multipliers [Chambolle, Pock]: u_n = prox_{σL*}(u_{n−1} + σ K(z_{n−1})); x_n = prox_{τG}(x_{n−1} − τ K*(u_n)); z_n = x_n + θ(x_n − x_{n−1})

  129. For P(y, 0), replace 1/2 ||y − g||^2 by the indicator ι_{{y}}

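A sketch of these iterations for the analysis Lasso, with θ = 1 and the two dual proximal maps written out explicitly; the step sizes σ, τ are assumed to satisfy στ||K||^2 ≤ 1 (this illustrates the scheme quoted above, it is not the authors' code):

    import numpy as np

    def analysis_lasso_cp(Phi, D, y, lam, sigma, tau, n_iter=500):
        x = np.zeros(Phi.shape[1]); z = x.copy()
        g = np.zeros(Phi.shape[0])        # dual variable for the data term
        u = np.zeros(D.shape[1])          # dual variable for ||D* x||_1
        for _ in range(n_iter):
            # dual step: prox of sigma * L*, separable in (g, u)
            g = (g + sigma * (Phi @ z) - sigma * y) / (1.0 + sigma)
            u = np.clip(u + sigma * (D.T @ z), -lam, lam)
            # primal step (G = 0) and over-relaxation
            x_new = x - tau * (Phi.T @ g + D @ u)
            z = 2.0 * x_new - x
            x = x_new
        return x
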
  130. Computing the Criteria Unconstrained formulation: F(s) = min_w ||Ωs − w||_∞ + ι_{Ker D_J}(w); the prox steps reduce to projections (onto the ℓ1 ball and onto Ker D_J), solved with the primal-dual (PD) scheme above

  131. ARC(I) = max_{x ∈ G_J} F(sign(D*_I x)) ≤ wARC(I) = max_{s ∈ {−1,+1}^{|I|}} F(s) ≤ oARC(I) = ||Ω||_{∞→∞}. ARC is difficult to compute (non-convex), wARC is also non-convex, oARC is easy

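The outer bound is immediate, since ||Ω||_{∞→∞} is the largest ℓ1 norm of a row of Ω; wARC can be brute-forced for tiny supports by reusing F_criterion from the sketch above (both helpers are illustrative):

    import numpy as np
    from itertools import product

    def oARC(Omega):
        return np.abs(Omega).sum(axis=1).max()   # ||Omega||_{inf -> inf}

    def wARC(Omega, Z):
        # exhaustive search over sign vectors, tractable only for tiny |I|
        k = Omega.shape[1]
        return max(F_criterion(Omega, Z, np.array(sv))
                   for sv in product([-1.0, 1.0], repeat=k))
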
  132. More on Signal Models Signal model ("union of subspaces"): Θ = ∪_{k ∈ {1…P}} Θ_k, where Θ_k = {G_J : dim G_J = k}

  133. The sparsity ||D*x0||_0 is not a good parameter…

  134. …since for D a redundant Gaussian i.i.d. matrix of size N × P: ||D*x0||_0 < P − N ⇒ x0 = 0!

  135. The good one: DOF(x) = dim G_J

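A sketch of the two counts side by side, using dim G_J = N − rank(D_J) (the tolerance is an assumption):

    import numpy as np

    def naive_sparsity(D, x, tol=1e-10):
        return int(np.sum(np.abs(D.T @ x) > tol))    # ||D* x||_0

    def dof(D, x, tol=1e-10):
        J = np.flatnonzero(np.abs(D.T @ x) <= tol)   # cosupport
        return D.shape[0] - np.linalg.matrix_rank(D[:, J].T)
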
  136. Random Settings 1) Synthesis results. Compressed sensing: Q ≪ N [Figure, credit: C. Dossal — recovery rate as a function of ||x||_0 = DOF(x): empirical identifiability, F(sign(D*x)) < 1, ARC(I(D*x)) < 1]

  137. Random Settings 2) Analysis results: D, Φ Gaussian i.i.d. random matrices

  138. Strong instability! Many dependencies between the columns

  139. …and the ball {x : ||D*x||_1 ≤ 1} is close to an ℓ2 ball!

  140. Limits: TV Instability D* = ∇, Φ = Id

  141. Θ_k: piecewise constant signals with k − 1 steps

  142. "Box" signal: F(s) = 1 − ε [figure: box signal with levels +1/−1]

  143. "Staircase" signal: F(s) = 1 → no noise stability, even for a small one [figure: staircase signal]

  144. Fused Lasso argmin_{x ∈ R^N} 1/2 ||y − Φx||_2^2 subject to ||∇x||_1 ≤ s1 and ||x||_1 ≤ s2; dictionary D = [Ω_DIF, ε Id]

  145. Signal model, sums of characteristic functions. Θ2: x0 = 1_{[a,b]} + 1_{[c,d]}

  146. [Figure: the two intervals, with and without overlap]

  147. [a, b] ∩ [c, d] ≠ ∅ ⇒ F(sign(D*x0)) > 1: no noise robustness

  148. [a, b] ∩ [c, d] = ∅ ⇒ two situations:

  149. |c − b| ≤ ξ(ε): F(sign(D*x0)) > 1, no noise robustness

  150. |c − b| > ξ(ε): F(sign(D*x0)) = ARC(I) < 1, strong noise robustness

  151. Haar wavelets: similar results

  152. Take-Away Messages

  153. Analysis regularization is robust

  154. Geometry (union of subspaces): the key concept for recovery

  155. Sparsity is not uniquely defined

  156. Overview • Analysis vs. Synthesis Regularization • Local Parameterization of

    Analysis Regularization • Identifiability and Stability • Numerical Evaluation • Perspectives
  157. What's Next? Deterministic theorem → treat the noise as a random variable: support identifiability with Gaussian, Poisson noise

  158. Total Variation identifiability: is there a better criterion to ensure noisy recovery?

  159. Continuous model: work initiated by Chambolle on TV

  160. Larger class of priors J: block sparsity ||·||_{p,q}

  161. Real-world recovery results: almost-equal support recovery

  162. Joint work with: Gabriel Peyré (CEREMADE, Dauphine), Charles Dossal (IMB, Bordeaux I), Jalal Fadili (GREYC, ENSICAEN). Any questions? Thanks!

  163. An Affine Implicit Mapping x(ȳ) = AΦ*ȳ − λ A D_I s

  164. with s = sign(D*_I x(y))

  165. B = AΦ*: inverse of Φ on G_J [diagram: B: R^Q → R^N, with G_J ≅ I_J = Φ(G_J)]

  166. B|_{I_J} = (Φ|_{G_J})^{−1} and B|_{I_J^⊥} = 0; explicitly, B = U (U*Φ*ΦU)^{−1} U*Φ*, where U is an orthonormal basis (BON) of G_J

  167. Efficient computation: ȳ = Bx = argmin_{D*_J z = 0} ||Φz − x||_2^2

  168. Equivalently, solve C (ȳ, μ)ᵀ = (Φ*x, 0)ᵀ, where C = ( Φ*Φ  D_J ; D*_J  0 )

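A sketch of this computation via the saddle-point system (a dense numpy solve stands in for whatever structured solver one would actually use; invertibility requires Ker Φ ∩ G_J = {0} and D_J of full column rank):

    import numpy as np

    def B_apply(Phi, D, J, x):
        # least squares over G_J: argmin_{D*_J z = 0} ||Phi z - x||_2^2
        N = Phi.shape[1]
        DJ = D[:, J]
        k = DJ.shape[1]
        C = np.block([[Phi.T @ Phi, DJ],
                      [DJ.T, np.zeros((k, k))]])
        rhs = np.concatenate([Phi.T @ x, np.zeros(k)])
        return np.linalg.solve(C, rhs)[:N]
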