Robust Sparse Analysis Recovery

Samuel Vaiter

September 12, 2011

GT Image (MSc defense), Paris-Dauphine, Paris, September 2011

Transcript

  1–3. Inverse Problems. Ill-posed; linear hypothesis. One model: $y = \Phi x_0 + w$ (observations = operator applied to the unknown signal, plus noise). Several problems: inpainting, super-resolution. Regularization: $x^\star \in \operatorname{argmin}_{x \in \mathbb{R}^N} \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda J(x)$. Noiseless ($\lambda \to 0$): $x^\star \in \operatorname{argmin}_{\Phi x = y} J(x)$.

  4–5. Image Priors. Sobolev: $J(x) = \frac{1}{2}\int \|\nabla x\|^2$. Total variation: $J(x) = \int \|\nabla x\|$. Wavelet sparsity (ideal prior): $J(x) = |\{i : \langle x, \psi_i \rangle \neq 0\}|$.

  6. Overview • Analysis vs. Synthesis Regularization • Local Parameterization of Analysis Regularization • Identifiability and Stability • Numerical Evaluation • Perspectives

  7–9. Dictionary. Redundant dictionary of $\mathbb{R}^N$: $\{d_i\}_{i=0}^{P-1}$, $P > N$. Examples: identity $\mathrm{Id}$; shift-invariant wavelet frame; finite difference operator
$$\Omega_{\mathrm{DIF}} = \begin{pmatrix} -1 & +1 & & \\ & -1 & +1 & \\ & & \ddots & \ddots \\ & & & -1 & +1 \end{pmatrix};$$
fused lasso: $\Omega_{\mathrm{DIF}}$ stacked with $\varepsilon\,\mathrm{Id}$.

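To make the operator concrete, here is a minimal NumPy sketch (my illustration; the function names and the value of $\varepsilon$ are mine) building $\Omega_{\mathrm{DIF}}$ and the fused-lasso analysis operator:

```python
import numpy as np

def finite_difference_operator(n):
    """Build the (n-1) x n finite difference matrix Omega_DIF:
    row i computes x[i+1] - x[i]."""
    omega = np.zeros((n - 1, n))
    idx = np.arange(n - 1)
    omega[idx, idx] = -1.0
    omega[idx, idx + 1] = 1.0
    return omega

def fused_lasso_analysis_operator(n, eps=0.1):
    """Stack Omega_DIF with eps*Id, as in the fused lasso prior."""
    return np.vstack([finite_difference_operator(n), eps * np.eye(n)])

D_star = fused_lasso_analysis_operator(8)
print(D_star.shape)  # (15, 8): P = 15 > N = 8, a redundant analysis operator
```
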
  10. Analysis versus Synthesis. Two points of view. "Generate" $x$: synthesis, $x = D\alpha$, $D : \mathbb{R}^P \to \mathbb{R}^N$, non-unique if $P > N$. OR "Analyze" $x$: analysis, $D^* x = \alpha$.

  11–15. A Bird's Eye View of Sparsity. "Ideal" sparsity prior: $J_0(\alpha) = |\{i : \alpha_i \neq 0\}|$. $\ell^0$ minimization is NP-hard. $\ell^q$ prior: $J_q(\alpha) = \sum_i |\alpha_i|^q$, convex (norms) for $q \geq 1$. The $\ell^1$ norm is the convexification of the $\ell^0$ prior. [Figure: unit balls of $J_q$ in the $(d_0, d_1)$ plane for $q = 0,\, 0.5,\, 1,\, 1.5,\, 2$.]

  16–21. Sparse Regularizations. Synthesis: $\alpha^\star \in \operatorname{argmin}_{\alpha \in \mathbb{R}^Q} \frac{1}{2}\|y - \Phi D \alpha\|_2^2 + \lambda\|\alpha\|_1$, with $x^\star = D\alpha^\star$ — a sparse approximation of $x^\star$ in $D$. Analysis: $x^\star \in \operatorname{argmin}_{x \in \mathbb{R}^N} \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$ — the correlations $\alpha = D^* x^\star$ of $x^\star$ with $D$ are sparse.

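For the synthesis formulation, a minimal ISTA sketch (my illustration, not the deck's algorithm; `Phi`, `D`, `y`, `lam` are assumed given as NumPy arrays):

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t*||.||_1 (soft thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def synthesis_lasso_ista(Phi, D, y, lam, n_iter=500):
    """Solve min_a 0.5*||y - Phi D a||^2 + lam*||a||_1 by ISTA."""
    A = Phi @ D
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
    alpha = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ alpha - y)
        alpha = soft_threshold(alpha - grad / L, lam / L)
    return D @ alpha                     # x* = D alpha*
```
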
  22–25. Support and Signal Model. $x^\star \in \operatorname{argmin}_{x \in \mathbb{R}^N} \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$ — problem $P(y, \lambda)$. Support: $I = \operatorname{supp}(D^* x^\star)$, cosupport $J = I^c$. Definition: $G_J = \operatorname{Ker} D_J^*$. Hypothesis: $\operatorname{Ker} \Phi \cap \operatorname{Ker} D^* = \{0\}$. Then $x^\star \in G_J$. Signal model, a "union of subspaces": $\Theta = \bigcup_{k \in \{1 \dots P\}} \Theta_k$ where $\Theta_k = \{G_J : \dim G_J = k\}$.

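To make $I$, $J$ and $G_J$ concrete, a small sketch under my assumptions (`D_star` holds the rows $d_i^*$ of the analysis operator $D^*$):

```python
import numpy as np
from scipy.linalg import null_space

def support_and_subspace(D_star, x, tol=1e-10):
    """Return the support I of D* x, the cosupport J, and an
    orthonormal basis U of G_J = Ker D_J* (rows of D* indexed by J)."""
    corr = D_star @ x
    I = np.flatnonzero(np.abs(corr) > tol)
    J = np.setdiff1d(np.arange(D_star.shape[0]), I)
    U = null_space(D_star[J, :])   # columns span G_J; dim G_J = U.shape[1]
    return I, J, U
```
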
  26–28. Examples of Signal Model (credit to G. Peyré). Identity: $\Theta_k$ = $k$-sparse signals. Finite difference operator: $\Theta_k$ = piecewise constant signals with $k - 1$ steps. Fused lasso: $\Theta_k$ = sums of $k$ interval characteristic functions.

  29. Remember! Synthesis: $\alpha^\star = \operatorname{argmin}_{\alpha \in \mathbb{R}^Q} \frac{1}{2}\|y - \Phi D\alpha\|_2^2 + \lambda\|\alpha\|_1$ — $P(y, \lambda)$. Analysis: $x^\star = \operatorname{argmin}_{x \in \mathbb{R}^N} \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$; as $\lambda \to 0$: $x^\star = \operatorname{argmin}_{\Phi x = y} \|D^* x\|_1$ — $P(y, 0)$.

  30–32. Toward a Better Understanding. Local behavior? Properties of $x^\star$, solution of $P(y, \lambda)$, as a function of $y$. Noiseless identifiability? Is $x_0$ the unique solution of $P(\Phi x_0, 0)$? Noise robustness? What can we say about $\|x^\star - x_0\|$ for noisy observations?

  33–36. From Synthesis to Analysis Results. [Fuchs, Tropp, Dossal]: previous works in synthesis address these questions. The problem is similar in analysis, but much harder: what is the geometry of the problem? [Figures: in synthesis, the sparsest solution $\alpha^\star$ where the ball $\{\|\alpha\|_1 = 1\}$ meets the constraint set $\{y = \Phi\alpha\}$, with subspaces $G_1, G_2$ generated by $d_1, d_2$; in analysis, $x^\star$ where the ball $\{\|D^* x\|_1 = 1\}$ meets $\{y = \Phi x\}$, with subspaces $G_1, G_2, G_3$ for $d_1, d_2, d_3$.]

  37. Overview • Analysis vs. Synthesis Regularization • Local Parameterization of Analysis Regularization • Identifiability and Stability • Numerical Evaluation • Perspectives

  38–44. Analysis is Piecewise Affine. Main idea: $G_J$ is stable, i.e. the solutions of $P(y, \lambda)$ and $P(y + \varepsilon, \lambda)$ live in the same $G_J$. Affine function: $\bar y \mapsto x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$. Theorem 1: Except for $y \in \mathcal{H}$, if $\bar y$ is close enough to $y$, then $x(\bar y)$ is a solution of $P(\bar y, \lambda)$. ($\mathcal{H}$ is defined in a few minutes.)

  45–49. Sketch of the Proof. Problem: lasso $x^\star \in \operatorname{argmin}_{x \in \mathbb{R}^N} \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$ — $P(y, \lambda)$. Support: $I = \operatorname{supp}(D^* x^\star)$, $J = I^c$. Subspace of analysis: $G_J = \operatorname{Ker} D_J^*$. Hypothesis: $\operatorname{Ker} \Phi \cap G_J = \{0\}$. Note that $I$, $J$, $s = \operatorname{sign}(D^* x^\star)$ are fixed by $x^\star$, and we fix the observations $y$.

  50–52. First-Order Conditions. $x^\star \in \operatorname{argmin}_{x \in \mathbb{R}^N} \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$ — $P(y, \lambda)$. Non-differentiable problem: $x^\star$ is a minimum of $P(y, \lambda)$ if, and only if, $0 \in \partial f(x^\star)$. First-order conditions of the lasso (gradient + subdifferential): $x^\star$ is a solution of $P(y, \lambda)$ $\Leftrightarrow$ $\exists\, u \in \Sigma_y(x^\star)$ with $\|u\|_\infty \leq 1$, where $\Sigma_y(x) = \{u \in \mathbb{R}^{|J|} : \Phi^*(\Phi x - y) + \lambda D_I s + \lambda D_J u = 0\}$.

  53–59. A Solution of the Lasso. $x(y) \in \operatorname{argmin}_{x \in G_J} \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$. How to write a solution in closed form? The first-order conditions give $\Phi^*\Phi\, x(y) = \Phi^* y - \lambda D_I s - \lambda D_J u$; the $D_J u$ term vanishes on $G_J$ since $G_J = (\operatorname{Im} D_J)^\perp$ (and $x(y) \in G_J$). But $\Phi^*\Phi$ is not invertible. Define $A\Phi^*$ by $(A\Phi^*)|_{\Phi(G_J)} = (\Phi|_{G_J})^{-1}$ and $(A\Phi^*)|_{\Phi(G_J)^\perp} = 0$: the inverse of $\Phi$ on $G_J$. Then $x(y) = A\Phi^* y - \lambda A D_I s$.

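A numerical sketch of this closed form, assuming `U` is an orthonormal basis of $G_J$ (e.g. from the snippet above), `D_I` the columns of $D$ indexed by $I$, and $\operatorname{Ker}\Phi \cap G_J = \{0\}$; forming $A$ densely is for illustration only (the appendix slides give a more efficient linear-system formulation):

```python
import numpy as np

def implicit_solution(Phi, U, D_I, s, y, lam):
    """x(y) = A Phi* y - lam * A D_I s, with A = U (U* Phi* Phi U)^{-1} U*
    the inverse of Phi* Phi restricted to G_J = span(U)."""
    M = U.T @ Phi.T @ Phi @ U        # invertible if Ker Phi and G_J only share 0
    A = U @ np.linalg.solve(M, U.T)  # A = U M^{-1} U*
    return A @ (Phi.T @ y) - lam * (A @ (D_I @ s))
```
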
  60–63. Transition Space. $\mathcal{H} = \{\, y \in \mathbb{R}^Q : \exists\, x \in \mathbb{R}^N,\ \min_{u \in \Sigma_y(x)} \|u\|_\infty = 1 \,\}$. $\mathcal{H}$: saturation of the first-order conditions → "jump" from $G_J$ to $G_{J'}$. [Figure: crossing $\mathcal{H}$ along $\Phi x = y$ switches the solution subspace from $G_J$ to $G_{J'}$.] Open question: what is the smallest union of subspaces containing $\mathcal{H}$?

  64–66. End of the Proof. Consider $x(y)$ as a mapping of the observations: $\bar y \mapsto x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$. Fix $\bar y$ close enough to $y$ to have $\operatorname{sign}(D^* x(y)) = \operatorname{sign}(D^* x(\bar y))$ (sign stability). Check that $x(\bar y)$ is indeed a solution of $P(\bar y, \lambda)$, using the first-order conditions.

  67–71. Remember! $x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$, with $A\Phi^*$ the inverse of $\Phi$ on $G_J$. Hence $y \mapsto x^\star(y)$ is: continuous, and locally affine (the latter given by sign stability). Useful for: robustness study; SURE denoising risk estimation; inverse problems on $x$.

  72. Overview • Analysis vs. Synthesis Regularization • Local Parameterization of Analysis Regularization • Identifiability and Stability • Numerical Evaluation • Perspectives

  73–75. Identifiability. Identifiability: $x_0$ is the unique solution of $P(\Phi x_0, 0)$, i.e. $\{x_0\} \overset{?}{=} \operatorname{argmin}_{\Phi x = \Phi x_0} \|D^* x\|_1$. Strategy: $P(y, \lambda)$ is almost $P(y, 0)$ for small values of $\lambda$. Assumption: $G_J$ must be stable for small values of $\lambda$. A restrictive condition, but it gives a stability result for small noise.

  76–79. Noiseless and Sign Criterion. $\Omega = D_J^+ (\Phi^* \Phi A - \mathrm{Id}) D_I$, $F(s) = \min_{w \in \operatorname{Ker} D_J} \|\Omega s - w\|_\infty$: an algebraic criterion on the sign vector (convex → computable). Theorem 2: Let $x_0 \in \mathbb{R}^N$ be a fixed vector, and $J = I^c$ where $I = I(D^* x_0)$. Suppose that $\operatorname{Ker} \Phi \cap G_J = \{0\}$. If $F(\operatorname{sign}(D_I^* x_0)) < 1$ then $x_0$ is identifiable. Specializes to Fuchs's result for synthesis ($D = \mathrm{Id}$).

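Since $F(s)$ is a convex program, one possible way to evaluate it is as a linear program (my formulation: write $w = Zc$ with $Z$ a basis of $\operatorname{Ker} D_J$ and minimize the $\ell^\infty$ residual):

```python
import numpy as np
from scipy.linalg import null_space
from scipy.optimize import linprog

def criterion_F(Omega, D_J, s):
    """Evaluate F(s) = min_{w in Ker D_J} ||Omega s - w||_inf as an LP.
    D_J: columns of the dictionary D indexed by J, so Ker D_J lives in R^{|J|}."""
    r = Omega @ s
    Z = null_space(D_J)                  # w = Z c ranges over Ker D_J
    if Z.size == 0:                      # Ker D_J = {0}: F(s) = ||Omega s||_inf
        return np.abs(r).max()
    m, k = Z.shape
    # variables (c_1..c_k, t): minimize t subject to |r - Z c| <= t
    cost = np.r_[np.zeros(k), 1.0]
    A_ub = np.block([[-Z, -np.ones((m, 1))],
                     [ Z, -np.ones((m, 1))]])
    b_ub = np.r_[-r, r]
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (k + 1))
    return res.fun
```
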
  80–82. Nam et al. Results. Only other work on analysis recovery [Nam 2011], the "cosparse" model: $G(s) = \|\Psi s\|_\infty$ with $\Psi = (M D_J)^+ M D_I$ and $M^*$ an orthonormal basis of $\operatorname{Ker} \Phi$. Theorem: Let $x_0 \in \mathbb{R}^N$ be a fixed vector, and $J = I^c$ where $I = I(D^* x_0)$. Suppose that $\operatorname{Ker} \Phi \cap G_J = \{0\}$. If $G(\operatorname{sign}(D_I^* x_0)) < 1$ then $x_0$ is identifiable. A more intrinsic criterion, but no noise robustness, even for small noise.

  83–87. Sketch of the Proof. Idea: study $P(y, \lambda)$ for $\lambda \approx 0$. $x_\lambda(\Phi x_0) = A\Phi^* \Phi x_0 - \lambda A D_I s$; take $\lambda$ small enough to have $\operatorname{sign}(D^* x_\lambda(\Phi x_0)) = \operatorname{sign}(D^* x_0)$. Then $\lim_{\lambda \to 0} x_\lambda(\Phi x_0) = A\Phi^* \Phi x_0 = x_0$. Since $x_\lambda(\Phi x_0)$ is a solution of $P(\lambda)$ and $x_\lambda(\Phi x_0) \to x_0(\Phi x_0)$ as $\lambda \to 0$, $x_0(\Phi x_0)$ is a solution of $P(0)$. Finally, $F(\operatorname{sign}(D^* x_\lambda(\Phi x_0))) < 1$ implies that $x_\lambda(\Phi x_0)$ is the unique solution.

  88–91. Small Noise Recovery. Suppose we observe $y = \Phi x_0 + w$. Does $\operatorname{argmin}_{\Phi x = y} \|D^* x\|_1$ recover $x_0 + A\Phi^* w$? Generalization of Theorem 2: yes, if $\|w\|$ is small enough. Condition: $\operatorname{sign}(D^* x_\lambda(y)) = \operatorname{sign}(D^* x_0)$. So $F(\operatorname{sign}(D^* x_0)) < 1$ gives both identifiability and small-noise robustness. Question: and for an arbitrary noise?

  92–95. Noisy and Support Criterion. Setting: $y = \Phi x_0 + w$, with $w$ a bounded noise. From identifiability of a vector to identifiability of a support: $\mathrm{ARC}(I) = \max_{x \in G_J} F(\operatorname{sign}(D_I^* x))$. Theorem 3: Suppose $\mathrm{ARC}(I) < 1$ and $\lambda > \frac{K\|w\|}{1 - \mathrm{ARC}(I)}$; then $x_\lambda(y)$ is the unique solution of $P(y, \lambda)$ and $\|x_\lambda(y) - x_0\| = O(\lambda)$.

  96. Remember! Sign vs. support, noiseless vs. noisy: $F(s) = \min_{w \in \operatorname{Ker} D_J} \|\Omega s - w\|_\infty$ — vector identifiability; $\mathrm{ARC}(I) = \max_{x \in G_J} F(\operatorname{sign}(D_I^* x))$ — support identifiability.

  97. From Theory to Numerics. We give a sufficient condition for identifiability. How far are we from a necessary condition?

  98. Overview • Analysis vs. Synthesis Regularization • Local Parameterization of Analysis Regularization • Identifiability and Stability • Numerical Evaluation • Perspectives

  99–100. Proximal Operator. $\operatorname{prox}_f(x) = \operatorname{argmin}_{u \in \mathbb{R}^N} \{ f(u) + \frac{1}{2}\|u - x\|_2^2 \}$, for $f$ an l.s.c. convex function from a convex $C$ of a Hilbert space $H$ to $\mathbb{R}$. Fundamental examples: $\operatorname{prox}_{\iota_C} = P_C$ (projection onto $C$), $\operatorname{prox}_{\lambda\|\cdot\|_1} = S_\lambda$ (soft thresholding).

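A tiny sketch of the two fundamental examples (the toy instance, with $C$ an $\ell^\infty$ box so that $P_C$ is a componentwise clip, is mine):

```python
import numpy as np

def prox_indicator_box(x, radius):
    """prox of the indicator of C = {u : ||u||_inf <= radius} is the
    projection P_C, i.e. a componentwise clip."""
    return np.clip(x, -radius, radius)

def prox_l1(x, lam):
    """prox of lam*||.||_1 is soft thresholding S_lam."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

x = np.array([2.0, -0.3, 0.8])
print(prox_indicator_box(x, 1.0))   # [ 1.  -0.3  0.8]
print(prox_l1(x, 0.5))              # [ 1.5 -0.   0.3]
```
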
  101–103. How to Solve These Regularizations? Primal-dual schemes: $\min_{x \in \mathbb{R}^N} L(K(x))$ where $L(g, u) = \frac{1}{2}\|y - g\|^2 + \lambda\|u\|_1$ and $K(x) = (\Phi x, D^* x)$. Alternating direction / primal-dual iterations [Chambolle, Pock]: $u_n = \operatorname{prox}_{\sigma L^*}(u_{n-1} + \sigma K(z_{n-1}))$, $x_n = \operatorname{prox}_{\tau G}(x_{n-1} - \tau K^*(u_n))$, $z_n = x_n + \theta(x_n - x_{n-1})$. For $P(y, 0)$, replace $\frac{1}{2}\|y - g\|^2$ by $\iota_{\{y\}}(g)$.

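A compact sketch of these iterations for $P(y, \lambda)$ (my implementation under the slide's notation: here $G = 0$, so $\operatorname{prox}_{\tau G}$ is the identity, and the step sizes $\sigma, \tau$, which the extraction lost, are chosen so that $\sigma\tau\|K\|^2 < 1$):

```python
import numpy as np

def chambolle_pock_analysis_lasso(Phi, D_star, y, lam, n_iter=1000):
    """Primal-dual iterations for
    min_x 0.5*||y - Phi x||^2 + lam*||D* x||_1, with K(x) = (Phi x, D* x)."""
    N = Phi.shape[1]
    K_norm = np.linalg.norm(np.vstack([Phi, D_star]), 2)
    sigma = tau = 0.9 / K_norm           # sigma * tau * ||K||^2 = 0.81 < 1
    theta = 1.0
    x = np.zeros(N); z = x.copy()
    p = np.zeros(Phi.shape[0])           # dual variable of the data-fit term
    q = np.zeros(D_star.shape[0])        # dual variable of the l1 term
    for _ in range(n_iter):
        # prox of sigma*L* applied componentwise to the dual ascent step
        p = (p + sigma * (Phi @ z) - sigma * y) / (1.0 + sigma)
        q = np.clip(q + sigma * (D_star @ z), -lam, lam)
        x_prev = x
        x = x - tau * (Phi.T @ p + D_star.T @ q)   # G = 0: prox is identity
        z = x + theta * (x - x_prev)
    return x
```
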
  104. Computing the Criteria. $\mathrm{ARC}(I) = \max_{x \in G_J} F(\operatorname{sign}(D_I^* x)) \leq \mathrm{wARC}(I) = \max_{s \in \{-1, 1\}^{|I|}} F(s) \leq \mathrm{oARC}(I) = \|\Omega\|_{\infty \to \infty}$. ARC is difficult to compute (non-convex); wARC is non-convex as well; oARC is easy. Unconstrained formulation: $F(s) = \min_{w \in \mathbb{R}^{|J|}} \|\Omega s - w\|_\infty + \iota_{\operatorname{Ker} D_J}(w)$, solved with a primal-dual scheme (prox via projection on the $\ell^1$ ball).

  105–108. More on Signal Models. Signal model, a "union of subspaces": $\Theta = \bigcup_{k \in \{1 \dots P\}} \Theta_k$ where $\Theta_k = \{G_J : \dim G_J = k\}$. The sparsity $\|D^* x_0\|_0$ is not a good parameter: for $D$ a redundant Gaussian i.i.d. matrix of size $N \times P$, $\|D^* x_0\|_0 < P - N \Rightarrow x_0 = 0$. The good parameter: $\mathrm{DOF}(x) = \dim G_J$.

  109–111. Random Settings. 1) Synthesis results: for synthesis, $\|x\|_0 = \mathrm{DOF}(x)$. [Figure, credit to C. Dossal: recovery rate in compressed sensing ($Q \ll N$), comparing identifiability, $F(\operatorname{sign}(D^* x)) < 1$ and $\mathrm{ARC}(I(D^* x)) < 1$.] 2) Analysis results: $D$, $\Phi$ Gaussian i.i.d. random matrices → strong instability! Many dependencies between the columns of $D$; the $\|D^* \cdot\|_1$ ball is close to an $\ell^2$ ball.

  112–114. Limits: TV Instability. $D^* = \nabla$, $\Phi = \mathrm{Id}$; $\Theta_k$: piecewise constant signals with $k - 1$ steps. "Box" signal (levels $+1$, $-1$): $F(s) = 1 - \varepsilon$. "Staircase" signal: $F(s) = 1$ — no noise stability, even for small noise.

  115–122. Fused Lasso. $\operatorname{argmin}_{x \in \mathbb{R}^N} \frac{1}{2}\|y - \Phi x\|_2^2$ subject to $\|\nabla x\|_1 \leq s_1$ and $\|x\|_1 \leq s_2$; analysis operator: $\varepsilon\,\mathrm{Id}$ stacked with $\Omega_{\mathrm{DIF}}$. Signal model: sums of characteristic functions, $\Theta_2$: $x_0 = 1_{[a,b]} + 1_{[c,d]}$. Overlap, $[a,b] \cap [c,d] \neq \emptyset$: $F(x_0) \geq 1$, no noise robustness. No overlap, $[a,b] \cap [c,d] = \emptyset$: two situations — $|c - b| \leq \xi(\varepsilon)$ gives $F(\operatorname{sign}(D^* x_0)) \geq 1$, no noise robustness; $|c - b| > \xi(\varepsilon)$ gives $F(\operatorname{sign}(D^* x_0)) = \mathrm{ARC}(I) < 1$, strong noise robustness. Haar wavelets: similar results.

  123–124. Take-Away Messages. Analysis regularization is robust. Geometry (union of subspaces) is the key concept for recovery. Sparsity is not unambiguously defined.

  125. Overview • Analysis vs. Synthesis Regularization • Local Parameterization of Analysis Regularization • Identifiability and Stability • Numerical Evaluation • Perspectives

  126–130. What's Next? Deterministic theorem → treat the noise as a random variable: support identifiability with Gaussian or Poisson noise. Total variation identifiability: does a better criterion exist to ensure noisy recovery? Continuous model, work initiated by Chambolle on TV. Larger class of priors $J$: block sparsity $\|\cdot\|_{p,q}$. Real-world recovery results: almost-equal support recovery.

  131. Joint work with Gabriel Peyré (CEREMADE, Dauphine), Charles Dossal (IMB, Bordeaux I), Jalal Fadili (GREYC, ENSICAEN). Any questions? Thanks!

  132–137. An Affine Implicit Mapping. $x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$, with $s = \operatorname{sign}(D_I^* x(y))$. $B = A\Phi^*$ is the inverse of $\Phi$ on $G_J$: with $I_J = \Phi(G_J)$ (so $G_J \cong I_J$), $B|_{I_J} = (\Phi|_{G_J})^{-1}$ and $B|_{I_J^\perp} = 0$. Explicitly, $B = U (U^* \Phi^* \Phi U)^{-1} U^* \Phi^*$ with $U$ an orthonormal basis of $G_J$. Efficient computation: $Bx = \operatorname{argmin}_{D_J^* z = 0} \|\Phi z - x\|_2^2$, i.e. solve $C \begin{pmatrix} y \\ \mu \end{pmatrix} = \begin{pmatrix} \Phi^* x \\ 0 \end{pmatrix}$ where $C = \begin{pmatrix} \Phi^* \Phi & D_J \\ D_J^* & 0 \end{pmatrix}$.

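A sketch of this efficient computation (my reconstruction of the slide's saddle-point system; it assumes $C$ is solvable, e.g. $\operatorname{Ker}\Phi \cap G_J = \{0\}$, and uses a least-squares solve for safety):

```python
import numpy as np

def apply_B(Phi, D_J, x):
    """Compute B x = argmin_{D_J* z = 0} ||Phi z - x||^2 by solving the
    saddle-point system [[Phi*Phi, D_J], [D_J*, 0]] (z, mu) = (Phi* x, 0).
    D_J: columns of the dictionary D indexed by J."""
    N = Phi.shape[1]
    k = D_J.shape[1]
    C = np.block([[Phi.T @ Phi, D_J],
                  [D_J.T, np.zeros((k, k))]])
    rhs = np.r_[Phi.T @ x, np.zeros(k)]
    sol = np.linalg.lstsq(C, rhs, rcond=None)[0]
    return sol[:N]   # z component; the multiplier mu is discarded
```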