Recovery Guarantees for Low Complexity Models

Samuel Vaiter
October 24, 2013
GT Image, ENSICAEN, Caen, October 2013.

Transcript

  1. Recovery Guarantees for Low Complexity Models. Samuel Vaiter, CEREMADE, Univ. Paris-Dauphine, vaiter@ceremade.dauphine.fr. October 24, 2013, Séminaire Image, GREYC, ENSICAEN.
  2. People: J. Fadili, G. Peyré, C. Dossal, M. Golbabaee, C. Deledalle (IMB, GREYC, CEREMADE).
  3. Papers: V., M. Golbabaee, M. J. Fadili and G. Peyré, Model Selection with Piecewise Regular Gauges, tech. report, http://arxiv.org/abs/1307.2342, 2013; J. Fadili, V. and G. Peyré, Linear Convergence Rates for Gauge Regularization, ongoing work.
  4. Outline: Variational Estimator; Gauge and Model Space; ℓ2 Robustness and Model Selection; Some Examples.
  5. Linear Inverse Problem: denoising, inpainting, deblurring.
  6. Linear Inverse Problem: Forward Model. y = Φx0 + w, where y ∈ R^Q are the observations, Φ ∈ R^(Q×N) a linear operator, x0 ∈ R^N the unknown vector, and w ∈ R^Q the realization of a noise (bounded here).
  7. Linear Inverse Problem: Forward Model. y = Φx0 + w, where y ∈ R^Q are the observations, Φ ∈ R^(Q×N) a linear operator, x0 ∈ R^N the unknown vector, and w ∈ R^Q the realization of a noise (bounded here). Objective: recover x0 from y.
  8. Linear Inverse Problem: Forward Model. y = Φx0 + w, where y ∈ R^Q are the observations, Φ ∈ R^(Q×N) a linear operator, x0 ∈ R^N the unknown vector, and w ∈ R^Q the realization of a noise (bounded here). Objective: recover x0 from y. Φ ill-posed.
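To make the forward model concrete, here is a minimal numerical sketch (not from the slides; the dimensions, the Gaussian Φ, and the noise level are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
Q, N, k = 32, 128, 4                      # observations, ambient dimension, sparsity

# Ill-posed linear operator: Q < N, so Phi has a non-trivial kernel
Phi = rng.standard_normal((Q, N)) / np.sqrt(Q)

# Unknown k-sparse vector x0 with entries bounded away from zero
x0 = np.zeros(N)
support = rng.choice(N, size=k, replace=False)
x0[support] = np.sign(rng.standard_normal(k)) * (1.0 + rng.random(k))

# Small noise realization w and observations y = Phi x0 + w
w = 0.01 * rng.standard_normal(Q)
y = Phi @ x0 + w
```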
  9. Designing an Estimator. x0 → y (degradation).
  10. Designing an Estimator. x0 → y (degradation) → xM(y) (model M).
  11. Designing an Estimator. x0 → y (degradation) → xM(y) (model M), with the estimation error measured between xM(y) and x0.
  12. The Variational Approach. x⋆ ∈ argmin_{x ∈ R^N} F(y, x) + λ J(x)   (Pλ(y)). Trade-off between data fidelity and prior regularization.
  13. The Variational Approach. x⋆ ∈ argmin_{x ∈ R^N} F(y, x) + λ J(x)   (Pλ(y)). Trade-off between data fidelity and prior regularization. • Data fidelity: ℓ2 loss, logistic, etc., e.g. F(y, x) = 1/2 ||y − Φx||₂².
  14. The Variational Approach. x⋆ ∈ argmin_{x ∈ R^N} F(y, x) + λ J(x)   (Pλ(y)). Trade-off between data fidelity and prior regularization. • Data fidelity: ℓ2 loss, logistic, etc., e.g. F(y, x) = 1/2 ||y − Φx||₂². • Parameter: set by hand, or automatically (e.g. SURE).
  15. The Variational Approach. x⋆ ∈ argmin_{x ∈ R^N} F(y, x) + λ J(x)   (Pλ(y)). Trade-off between data fidelity and prior regularization. • Data fidelity: ℓ2 loss, logistic, etc., e.g. F(y, x) = 1/2 ||y − Φx||₂². • Parameter: set by hand, or automatically (e.g. SURE). • Regularization: ?
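For the ℓ2 data fidelity with J = ||·||₁ (the Lasso instance of (Pλ(y))), a basic proximal gradient scheme (ISTA) solves the problem in a few lines. This is one standard solver, sketched under the setup above; the iteration count and λ are illustrative:

```python
def soft_threshold(u, tau):
    """Proximal operator of tau * ||.||_1 (soft thresholding)."""
    return np.sign(u) * np.maximum(np.abs(u) - tau, 0.0)

def ista(Phi, y, lam, n_iter=1000):
    """Solve x* in argmin_x 1/2 ||y - Phi x||_2^2 + lam ||x||_1 by proximal gradient."""
    step = 1.0 / np.linalg.norm(Phi, 2) ** 2   # 1/L, L = Lipschitz constant of the gradient
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ x - y)           # gradient of the data fidelity F
        x = soft_threshold(x - step * grad, step * lam)
    return x

x_star = ista(Phi, y, lam=0.02)                # reuses Phi, y from the sketch above
```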
  16. A Zoo...? Block Sparsity, Nuclear Norm, Trace Lasso, Polyhedral, Antisparsity, Total Variation.
  17. Relations to Previous Works. • Fuchs, J. J. (2004). On sparse representations in arbitrary redundant bases. • Tropp, J. A. (2006). Just relax: Convex programming methods for identifying sparse signals in noise. • Grasmair, M. et al. (2008). Sparse regularization with ℓq penalty term. • Bach, F. R. (2008). Consistency of the group lasso and multiple kernel learning & Consistency of trace norm minimization. • V. et al. (2011). Robust sparse analysis regularization. • Grasmair, M. et al. (2011). Necessary and sufficient conditions for linear convergence of ℓ1-regularization. • Grasmair, M. (2011). Linear convergence rates for Tikhonov regularization with positively homogeneous functionals. (and more!)
  18. Outline: Variational Estimator; Gauge and Model Space; ℓ2 Robustness and Model Selection; Some Examples.
  19. The Sparse Way. Sparse approximation: most wavelet coefficients are 0.
  20. Back to the Source: Union of Linear Models. A signal with 2 components (x ∈ R²).
  21. Back to the Source: Union of Linear Models. A signal with 2 components. T0: 0 is the only 0-sparse vector.
  22. Back to the Source: Union of Linear Models. A signal with 2 components. Te1, Te2: axis points are 1-sparse (except 0).
  23. Back to the Source: Union of Linear Models. A signal with 2 components. The whole space minus the axes is 2-sparse.
  24. ℓ0 to ℓ1. Combinatorial penalty associated to the previous union of models: J(x) = ||x||₀ = |{i : xi ≠ 0}|.
  25. ℓ0 to ℓ1. Combinatorial penalty associated to the previous union of models: J(x) = ||x||₀ = |{i : xi ≠ 0}|. → non-convex, → no regularity, → NP-hard regularization. Goal: encode the union of models in a good functional.
  26. ℓ0 to ℓ1. Combinatorial penalty associated to the previous union of models: J(x) = ||x||₀ = |{i : xi ≠ 0}|. → non-convex, → no regularity, → NP-hard regularization. Goal: encode the union of models in a good functional. [Figure: graphs of x → ||x||₀ and its convex relaxation x → ||x||₁.]
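The NP-hardness is already visible in the naive algorithm: minimizing with the ℓ0 penalty means enumerating supports. A hypothetical brute-force sketch (only feasible for tiny N), for contrast with the convex solver above:

```python
from itertools import combinations
import numpy as np

def l0_bruteforce(Phi, y, k_max):
    """Best least-squares fit over all supports of size <= k_max.
    The number of candidate supports grows combinatorially with N."""
    N = Phi.shape[1]
    best_x, best_res = np.zeros(N), np.linalg.norm(y)
    for k in range(1, k_max + 1):
        for S in combinations(range(N), k):
            cols = list(S)
            coeffs, *_ = np.linalg.lstsq(Phi[:, cols], y, rcond=None)
            res = np.linalg.norm(y - Phi[:, cols] @ coeffs)
            if res < best_res:
                best_x = np.zeros(N)
                best_x[cols] = coeffs
                best_res = res
    return best_x
```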
  27. Union of Linear Models to Regularizations. Union of Models ↔ Gauges: combinatorial world ↔ functional world.
  28. Gauge: J(x) ≥ 0, J(λx) = λJ(x) for λ ≥ 0, J convex. Equivalently, C = {x : J(x) ≤ 1} is a convex set.
  29. Gauge: J(x) ≥ 0, J(λx) = λJ(x) for λ ≥ 0, J convex. C = {x : J(x) ≤ 1} is a convex set.
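Equivalently, a gauge is the Minkowski functional of its unit ball C. As a sanity check, J(x) = inf{λ > 0 : x ∈ λC} can be evaluated numerically from a membership oracle for C; an illustrative sketch (the bisection bounds and tolerance are arbitrary assumptions):

```python
import numpy as np

def gauge(x, in_C, hi=1e6, tol=1e-9):
    """J(x) = inf{lam > 0 : x in lam * C}, computed by bisection
    from a membership oracle in_C for the convex set C (with 0 in C)."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_C(x / mid):
            hi = mid          # x lies in mid*C, so the gauge is at most mid
        else:
            lo = mid
    return hi

x = np.array([0.5, -1.5, 0.0])
print(gauge(x, lambda z: np.abs(z).sum() <= 1.0))   # ~2.0, i.e. ||x||_1
```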
  30. Subdifferential. [Figure: graph of f with the point (x, x²).]
  31. Subdifferential. [Figure: graph of f with the point (x, x²).]
  32. Subdifferential. [Figure: graph of f at the point (x, f(x)).]
  33. Subdifferential. ∂f(x) = {η : ∀x′, f(x′) ≥ f(x) + ⟨η, x′ − x⟩}: the slopes of the lines below the graphical representation of f.
  34. Some Properties of the Subdifferential. f bounded ⇒ ∂f(x) is a non-empty convex set. f Gateaux-differentiable ⇔ ∂f(x) = {∇f(x)}. 0 ∈ ∂f(x) ⇔ x is a minimum of f.
  35. Some Properties of the Subdifferential. f bounded ⇒ ∂f(x) is a non-empty convex set. f Gateaux-differentiable ⇔ ∂f(x) = {∇f(x)}. 0 ∈ ∂f(x) ⇔ x is a minimum of f. Example: ∂|·|(x) = {sign(x)} if x ≠ 0, [−1, 1] if x = 0.
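These facts are easy to test numerically. A small sketch checking the subgradient inequality for f = |·| on a grid (the grid and tolerance are illustrative):

```python
import numpy as np

def is_subgradient(eta, x, f, grid):
    """Check f(x') >= f(x) + eta * (x' - x) for all x' on the grid."""
    return all(f(xp) >= f(x) + eta * (xp - x) - 1e-12 for xp in grid)

grid = np.linspace(-2.0, 2.0, 401)
print(is_subgradient(1.0, 1.5, abs, grid))    # True: sign(x) at x != 0
print(is_subgradient(0.3, 0.0, abs, grid))    # True: any eta in [-1, 1] at x = 0
print(is_subgradient(1.2, 0.0, abs, grid))    # False: |eta| > 1 is not a subgradient at 0
```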
  36. The Model Linear Space. [Figure: the origin 0 and a point x.]
  37. The Model Linear Space. [Figure: 0, x, and the subdifferential ∂J(x).]
  38. The Model Linear Space. [Figure: 0, x, and the subdifferential ∂J(x).]
  39. The Model Linear Space. Tx = VectHull(∂J(x))⊥.
  40. The Model Linear Space. Tx = VectHull(∂J(x))⊥, ex = PTx(∂J(x)).
  41. Special cases. Sparsity: Tx = {η : supp(η) ⊆ supp(x)}, ex = sign(x).
  42. Special cases. Sparsity: Tx = {η : supp(η) ⊆ supp(x)}, ex = sign(x). (Aniso/Iso)tropic Total Variation: Tx = {η : supp(∇η) ⊆ supp(∇x)}, ex = sign(∇x) or ex = ∇x/||∇x||.
  43. Special cases. Sparsity: Tx = {η : supp(η) ⊆ supp(x)}, ex = sign(x). (Aniso/Iso)tropic Total Variation: Tx = {η : supp(∇η) ⊆ supp(∇x)}, ex = sign(∇x) or ex = ∇x/||∇x||. Trace Norm: with the SVD x = UΛV∗, Tx = {η : U⊥∗ η V⊥ = 0}, ex = UV∗.
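For the sparsity case these objects are explicit and cheap to compute. A sketch (ℓ1 case only; the tolerance for deciding the numerical support is an assumption):

```python
import numpy as np

def model_space_l1(x, tol=1e-10):
    """For J = ||.||_1: Tx = {eta : supp(eta) subset supp(x)}, ex = sign(x).
    Returns the orthogonal projector onto Tx and the vector ex."""
    on_support = np.abs(x) > tol
    P_T = np.diag(on_support.astype(float))   # projects onto coordinates in supp(x)
    e_x = np.where(on_support, np.sign(x), 0.0)
    return P_T, e_x

P_T, e_x = model_space_l1(np.array([0.0, 3.0, 0.0, -1.0]))
print(e_x)    # [ 0.  1.  0. -1.]
```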
  44. Algebraic Stability. Composition with a linear operator: • ||∇ · ||₁ (anisotropic TV), • ||∇ · ||₁,₂ (isotropic TV), • ||U diag(·)||∗ (Trace Lasso).
  45. Algebraic Stability. Composition with a linear operator: • ||∇ · ||₁ (anisotropic TV), • ||∇ · ||₁,₂ (isotropic TV), • ||U diag(·)||∗ (Trace Lasso). Sum of gauges (composite priors): • || · ||₁ + ||∇ · ||₁ (sparse TV), • || · ||₁ + || · ||₂ (Elastic net), • || · ||₁ + || · ||∗ (sparse + low-rank).
  46. Outline: Variational Estimator; Gauge and Model Space; ℓ2 Robustness and Model Selection; Some Examples.
  47. What's the Robustness? x0 → y (degradation) → xM(y) (model M), with estimation error.
  48. What's the Robustness? x0 → y (degradation) → xM(y) (model M), with estimation error. Data fidelity loss: ||x − x0||. Prediction loss: ||Φx − Φx0||. Regularization loss: J(x − x0), a Taylor/Bregman metric. Model selection: Tx = Tx0.
  49. Certificate. x⋆ ∈ argmin_{Φx = Φx0} J(x)   (P0(y)). [Figure: the affine constraint set {x : Φx = Φx0}.]
  50. Certificate. x⋆ ∈ argmin_{Φx = Φx0} J(x)   (P0(y)). [Figure: the constraint set, ∂J(x⋆), and a vector η.] Dual certificates: D = Im Φ∗ ∩ ∂J(x0).
  51. Certificate. x⋆ ∈ argmin_{Φx = Φx0} J(x)   (P0(y)). Dual certificates: D = Im Φ∗ ∩ ∂J(x0). Proposition: ∃η ∈ D ⇔ x0 is a solution of (P0(y)).
  52. Tight Certificate and Restricted Injectivity. Tight dual certificates: D̄ = Im Φ∗ ∩ ri ∂J(x0).
  53. Tight Certificate and Restricted Injectivity. Tight dual certificates: D̄ = Im Φ∗ ∩ ri ∂J(x0). Restricted injectivity: Ker Φ ∩ Tx = {0}   (RICx).
  54. Tight Certificate and Restricted Injectivity. Tight dual certificates: D̄ = Im Φ∗ ∩ ri ∂J(x0). Restricted injectivity: Ker Φ ∩ Tx = {0}   (RICx). Proposition: ∃η ∈ D̄ ∧ (RICx0) ⇒ x0 is the unique solution of (P0(y)).
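In the ℓ1 case, (RICx) says that the columns of Φ indexed by supp(x) are linearly independent, which a singular value check makes concrete. A sketch (the threshold is an assumption):

```python
import numpy as np

def restricted_injectivity(Phi, x, tol=1e-10):
    """(RIC_x) for J = ||.||_1: Ker(Phi) and Tx intersect trivially iff
    the columns of Phi on supp(x) have full column rank."""
    Phi_T = Phi[:, np.abs(x) > tol]
    return np.linalg.svd(Phi_T, compute_uv=False).min() > tol

print(restricted_injectivity(Phi, x0))   # reuses Phi, x0 from the earlier sketch
```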
  55. ℓ2 Stability. Theorem: if ∃η ∈ D̄ ∧ (RICx0), then λ ∼ ||w|| ⇒ ||x⋆ − x0|| = O(||w||).
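Empirically, the O(||w||) rate can be observed by scaling the noise level and the parameter together, λ ∼ ||w||. A sketch reusing `ista` and the setup above (the constant in λ = 2σ is an arbitrary choice):

```python
# Error should decay roughly linearly with the noise level when lam ~ ||w||
for sigma in [0.08, 0.04, 0.02, 0.01]:
    w = sigma * rng.standard_normal(Q)
    x_star = ista(Phi, Phi @ x0 + w, lam=2.0 * sigma)
    print(f"sigma = {sigma:.2f}   ||x* - x0|| = {np.linalg.norm(x_star - x0):.4f}")
```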
  56. Minimal-norm Certificate. η ∈ D ⇐⇒ η = Φ∗q with ηT = e and J◦(η) ≤ 1.
  57. Minimal-norm Certificate. η ∈ D ⇐⇒ η = Φ∗q with ηT = e and J◦(η) ≤ 1. Minimal-norm precertificate: η0 = Φ∗q0 where q0 = argmin {||q|| : η = Φ∗q, ηT = e}.
  58. Minimal-norm Certificate. η ∈ D ⇐⇒ η = Φ∗q with ηT = e and J◦(η) ≤ 1. Minimal-norm precertificate: η0 = Φ∗q0 where q0 = argmin {||q|| : η = Φ∗q, ηT = e}. Proposition: if (RICx) holds, then η0 = (ΦT⁺ Φ)∗ e.
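In the ℓ1 case the proposition gives a closed form that is one pseudo-inverse away. A sketch computing η0 and checking whether it is a tight certificate (a Fuchs-type condition; the tolerance is an assumption, and the setup comes from the earlier sketch):

```python
import numpy as np

def minimal_norm_precertificate(Phi, x0, tol=1e-10):
    """eta0 = (Phi_T^+ Phi)^* e for J = ||.||_1, with T = Tx0 and e = sign(x0):
    Phi^* q for the minimal-norm q such that (Phi^* q) = sign(x0) on supp(x0)."""
    I = np.abs(x0) > tol
    q = np.linalg.pinv(Phi[:, I]).T @ np.sign(x0[I])
    return Phi.T @ q

eta0 = minimal_norm_precertificate(Phi, x0)
I = np.abs(x0) > 1e-10
print(np.allclose(eta0[I], np.sign(x0[I])))      # eta0 restricted to T equals e
print(np.abs(eta0[~I]).max() < 1.0)              # True  =>  eta0 lies in the tight set D
```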
  59. Model Selection. Theorem: if η0 ∈ D̄, the noise-to-signal ratio is low enough, and λ ∼ ||w||, then the unique solution x⋆ of (Pλ(y)) satisfies Tx⋆ = Tx0 and ||x⋆ − x0|| = O(||w||).
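The model selection claim Tx⋆ = Tx0 can likewise be tested on the running ℓ1 example: at low noise and λ ∼ ||w||, the recovered support should match (the thresholds are illustrative, and the test can fail when the precertificate is not tight):

```python
x_star = ista(Phi, y, lam=0.05, n_iter=5000)
recovered = set(np.flatnonzero(np.abs(x_star) > 1e-4))
print(recovered == set(np.flatnonzero(x0)))   # support recovery, i.e. Tx* = Tx0
```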
  60. A Better Certificate? • With model selection: no.
  61. A Better Certificate? • With model selection: no. • Without: ongoing work (Duval-Peyré: sparse deconvolution; Dossal: sparse tomography).
  62. Outline: Variational Estimator; Gauge and Model Space; ℓ2 Robustness and Model Selection; Some Examples.
  63. Sparse Spike Deconvolution (Dossal, 2005). [Figure: spike train x0.]
  64. Sparse Spike Deconvolution (Dossal, 2005). Φx = Σi xi φ(· − Δi), J(x) = ||x||₁. [Figure: spikes x0 with separation γ and the blurred observation Φx0.]
  65. Sparse Spike Deconvolution (Dossal, 2005). Φx = Σi xi φ(· − Δi), J(x) = ||x||₁. η0 ∈ D̄ ⇔ ||ΦIc∗ (ΦI⁺)∗ s||∞ < 1, with I = supp(x0) and s = sign(x0,I). [Figure: ||η0,Ic||∞ as a function of the separation γ, crossing 1 at a critical value γcrit.]
  66. 1D TV Denoising (V. et al., 2011). Φ = Id, J(x) = ||∇x||₁. [Figure: two piecewise-constant signals (xi)i.]
  67. 1D TV Denoising (V. et al., 2011). Φ = Id, J(x) = ||∇x||₁. [Figure: two piecewise-constant signals with jump locations mk and jump signs +1/−1.]
  68. 1D TV Denoising (V. et al., 2011). Φ = Id, J(x) = ||∇x||₁. [Figure: two piecewise-constant signals with jump locations mk and jump signs +1/−1.] Support stability for one signal, no support stability for the other.
  69. 1D TV Denoising (V. et al., 2011). Φ = Id, J(x) = ||∇x||₁. [Figure: same two signals.] Support stability for one signal, no support stability for the other, but both are ℓ2-stable.
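The 1D TV experiment is easy to reproduce. A sketch of one standard solver (projected gradient on the dual, Chambolle-style; the step size uses ||DD∗|| ≤ 4, and the test signal and parameters are illustrative):

```python
import numpy as np

def tv_denoise_1d(y, lam, n_iter=2000):
    """min_x 1/2 ||y - x||^2 + lam ||D x||_1 with D the finite-difference
    operator, solved by projected gradient on the dual box ||u||_inf <= lam;
    the primal solution is recovered as x = y - D^T u."""
    n = len(y)
    D = np.diff(np.eye(n), axis=0)               # (n-1) x n finite differences
    u = np.zeros(n - 1)
    for _ in range(n_iter):
        grad = D @ (D.T @ u - y)
        u = np.clip(u - grad / 4.0, -lam, lam)   # step 1/L with L = ||D D^T|| <= 4
    return y - D.T @ u

t = np.arange(200)
x_true = (t >= 60).astype(float) - (t >= 140).astype(float)   # staircase signal
noisy = x_true + 0.1 * np.random.default_rng(1).standard_normal(200)
print(np.linalg.norm(tv_denoise_1d(noisy, lam=1.0) - x_true))
```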
  70. 2D TV Denoising. Φ = Id, J(x) = ||(∇→, ∇↑)x||₁. [Figure: horizontal (∇→) and vertical (∇↑) finite differences.]
  71. 2D TV Denoising. Φ = Id, J(x) = ||(∇→, ∇↑)x||₁. [Figure: horizontal (∇→) and vertical (∇↑) finite differences.]
  72. Open Problems. Union of Models ↔ Gauges: combinatorial world ↔ functional world.
  73. Thanks for your attention!