
Recovery Guarantees for Low Complexity Models

Samuel Vaiter
October 17, 2013

GT Image, IMB, Bordeaux, October 2013.


Transcript

  1. Recovery Guarantees for Low Complexity Models. Samuel Vaiter, CEREMADE,
     Univ. Paris-Dauphine. Séminaire Image, IMB, Bordeaux, October 17, 2013.
  2. People & Papers
     • V., M. Golbabaee, M. J. Fadili and G. Peyré, Model Selection with
       Piecewise Regular Gauges, Tech. report, http://arxiv.org/abs/1307.2342, 2013
     • J. Fadili, V. and G. Peyré, Linear Convergence Rates for Gauge
       Regularization, ongoing work
  3. Our work
     • Study of models, not algorithms
     • Impact of the noise
     • Understand the geometry
     Three parts of this talk:
     1. Defining our framework of interest
     2. Noise robustness results
     3. Some examples
  4. Linear Inverse Problem: Forward Model
     y = Φx0 + w
     • y ∈ R^Q: observations
     • Φ ∈ R^{Q×N}: linear operator
     • x0 ∈ R^N: unknown vector
     • w ∈ R^Q: realization of the noise (bounded here)
  5. Linear Inverse Problem: Forward Model
     y = Φx0 + w
     • y ∈ R^Q: observations
     • Φ ∈ R^{Q×N}: linear operator
     • x0 ∈ R^N: unknown vector
     • w ∈ R^Q: realization of the noise (bounded here)
     Objective: recover x0 from y.
  6. The Variational Approach
     x⋆ ∈ argmin_{x ∈ R^N} F(y, x) + λ J(x)   (Pλ(y))
     Trade-off between data fidelity and prior regularization.
  7. The Variational Approach
     x⋆ ∈ argmin_{x ∈ R^N} F(y, x) + λ J(x)   (Pλ(y))
     Trade-off between data fidelity and prior regularization.
     • Data fidelity: ℓ2 loss, logistic, etc. F(y, x) = (1/2) ||y − Φx||_2^2
  8. The Variational Approach
     x⋆ ∈ argmin_{x ∈ R^N} F(y, x) + λ J(x)   (Pλ(y))
     Trade-off between data fidelity and prior regularization.
     • Data fidelity: ℓ2 loss, logistic, etc. F(y, x) = (1/2) ||y − Φx||_2^2
     • Parameter: by hand, or automatic (e.g. SURE).
  9. The Variational Approach
     x⋆ ∈ argmin_{x ∈ R^N} F(y, x) + λ J(x)   (Pλ(y))
     Trade-off between data fidelity and prior regularization.
     • Data fidelity: ℓ2 loss, logistic, etc. F(y, x) = (1/2) ||y − Φx||_2^2
     • Parameter: by hand, or automatic (e.g. SURE).
     • Regularization: ?
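     For concreteness: with the ℓ2 loss and J = ||·||_1, (Pλ(y)) is the Lasso
     and can be solved, e.g., by iterative soft thresholding. A minimal Python
     sketch of one possible solver (the talk itself studies models, not
     algorithms):

         import numpy as np

         def ista(y, Phi, lam, n_iter=500):
             """Solve (P_lambda(y)) with F(y, x) = 0.5 ||y - Phi x||_2^2
             and J = ||.||_1 by iterative soft thresholding (ISTA)."""
             L = np.linalg.norm(Phi, 2) ** 2      # Lipschitz constant of grad F
             x = np.zeros(Phi.shape[1])
             for _ in range(n_iter):
                 grad = Phi.T @ (Phi @ x - y)     # gradient of the data fidelity
                 z = x - grad / L                 # forward (gradient) step
                 x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0)  # prox of (lam/L) J
             return x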
  10. A Zoo...? Block Sparsity, Nuclear Norm, Trace Lasso, Polyhedral,
      Antisparsity, Total Variation.
  11. Relations to Previous Works
      • Fuchs, J. J. (2004). On sparse representations in arbitrary redundant bases.
      • Tropp, J. A. (2006). Just relax: Convex programming methods for identifying
        sparse signals in noise.
      • Grasmair, M. et al. (2008). Sparse regularization with ℓq penalty term.
      • Bach, F. R. (2008). Consistency of the group lasso and multiple kernel
        learning & Consistency of trace norm minimization.
      • V. et al. (2011). Robust sparse analysis regularization.
      • Grasmair, M. et al. (2011). Necessary and sufficient conditions for linear
        convergence of ℓ1-regularization.
      • Grasmair, M. (2011). Linear convergence rates for Tikhonov regularization
        with positively homogeneous functionals.
      (and more!)
  12. Back to the Source: Union of Linear Models
      [Figure: two-component signal] 0 is the only 0-sparse vector (T0 = {0}).
  13. Back to the Source: Union of Linear Models
      [Figure: two-component signal] Axis points are 1-sparse (except 0); their
      models are the axes Te1 and Te2.
  14. Back to the Source: Union of Linear Models
      [Figure: two-component signal] Points of the whole space off the axes are
      2-sparse.
  15. ℓ0 to ℓ1
      Combinatorial penalty associated to the previous union of models:
      J(x) = ||x||_0 = |{i : x_i ≠ 0}|
  16. ℓ0 to ℓ1
      Combinatorial penalty associated to the previous union of models:
      J(x) = ||x||_0 = |{i : x_i ≠ 0}|
      → non-convex → no regularity → NP-hard regularization
      Encode the union of models in a good functional.
  17. ℓ0 to ℓ1
      Combinatorial penalty associated to the previous union of models:
      J(x) = ||x||_0 = |{i : x_i ≠ 0}|
      → non-convex → no regularity → NP-hard regularization
      Encode the union of models in a good functional.
      [Figure: the graph of ||x||_0 next to its convex surrogate ||x||_1]
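      The relaxation can be made precise (a standard fact, stated here for
      completeness): on the unit ℓ∞ ball, ℓ1 is the convex envelope, i.e. the
      tightest convex lower bound, of ℓ0. In LaTeX:

          % convex envelope (co) of the counting penalty on the ell_infty ball
          \|x\|_1 \;=\; \bigl(\operatorname{co}\,\|\cdot\|_0\bigr)(x)
          \qquad \text{for } \|x\|_\infty \le 1 .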
  18. Gauge
      J(x) ≥ 0, J(λx) = λJ(x) for λ ≥ 0, J convex.
      C = {x : J(x) ≤ 1} is a convex set.
  19. Gauge
      J(x) ≥ 0, J(λx) = λJ(x) for λ ≥ 0, J convex.
      C = {x : J(x) ≤ 1} is a convex set.
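      Conversely (standard convex analysis, added for completeness), J is
      recovered from its unit ball C as the Minkowski gauge of C:

          J(x) \;=\; \gamma_C(x) \;=\; \inf \{ \lambda \ge 0 : x \in \lambda C \}.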
  20. The Model Linear Space
      [Figure: x, its subdifferential ∂J(x), the model space Tx and ex]
      Tx = VectHull(∂J(x))^⊥
      ex = P_{Tx}(∂J(x))
  21. Special cases
      Sparsity: Tx = {η : supp(η) ⊆ supp(x)}, ex = sign(x)
      (Aniso/Iso)tropic Total Variation: Tx = {η : supp(∇η) ⊆ supp(∇x)},
      ex = sign(∇x) (anisotropic) or ex = ∇x / ||∇x|| (isotropic)
  22. Special cases
      Sparsity: Tx = {η : supp(η) ⊆ supp(x)}, ex = sign(x)
      (Aniso/Iso)tropic Total Variation: Tx = {η : supp(∇η) ⊆ supp(∇x)},
      ex = sign(∇x) (anisotropic) or ex = ∇x / ||∇x|| (isotropic)
      Trace Norm: with the SVD x = UΛV∗, Tx = {η : U⊥∗ η V⊥ = 0}, ex = UV∗
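      A small numerical companion for two of these special cases (a sketch; the
      support threshold tol is an implementation detail, not part of the theory):

          import numpy as np

          def model_space_sparsity(x, tol=1e-10):
              """Model data for J = ||.||_1: T_x = {eta : supp(eta) ⊆ supp(x)}
              is encoded by the boolean mask of the support, and e_x = sign(x)
              restricted to that support."""
              support = np.abs(x) > tol
              e_x = np.sign(x) * support
              return support, e_x

          def model_space_trace_norm(X, tol=1e-10):
              """Model data for J = ||.||_* (trace/nuclear norm): from the thin
              SVD X = U Lambda V*, T_X = {H : U_perp* H V_perp = 0} and
              e_X = U V*.  Returns (U, V) spanning the model, and e_X."""
              U, s, Vt = np.linalg.svd(X, full_matrices=False)
              r = int(np.sum(s > tol))          # numerical rank
              U, Vt = U[:, :r], Vt[:r, :]
              e_X = U @ Vt
              return (U, Vt.T), e_X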
  23. Algebraic Stability
      Composition by a linear operator:
      • ||∇·||_1: Anisotropic TV
      • ||∇·||_{1,2}: Isotropic TV
      • ||U diag(·)||_∗: Trace Lasso
  24. Algebraic Stability
      Composition by a linear operator:
      • ||∇·||_1: Anisotropic TV
      • ||∇·||_{1,2}: Isotropic TV
      • ||U diag(·)||_∗: Trace Lasso
      Sum of gauges (composite priors):
      • ||·||_1 + ||∇·||_1: sparse TV
      • ||·||_1 + ||·||_2: Elastic net
      • ||·||_1 + ||·||_∗: Sparse + Low-rank
  25. Certificate
      x⋆ ∈ argmin_{Φx = Φx0} J(x)   (P0(y))
      [Figure: ∂J(x), the feasible set Φx = Φx0 and a dual certificate η]
      Dual certificates: D = Im Φ∗ ∩ ∂J(x0)
  26. Certificate
      x⋆ ∈ argmin_{Φx = Φx0} J(x)   (P0(y))
      [Figure: ∂J(x), the feasible set Φx = Φx0 and a dual certificate η]
      Dual certificates: D = Im Φ∗ ∩ ∂J(x0)
      Proposition: ∃η ∈ D ⇔ x0 is a solution of (P0(y))
  27. Tight Certificate and Restricted Injectivity
      Tight dual certificates: D̄ = Im Φ∗ ∩ ri ∂J(x)
      Restricted injectivity: Ker Φ ∩ Tx = {0}   (RICx)
  28. Tight Certificate and Restricted Injectivity
      Tight dual certificates: D̄ = Im Φ∗ ∩ ri ∂J(x)
      Restricted injectivity: Ker Φ ∩ Tx = {0}   (RICx)
      Proposition: ∃η ∈ D̄ ∧ (RICx) ⇒ x is the unique solution of (Pλ(y))
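      For the sparsity model, (RICx) reduces to the support columns of Φ being
      linearly independent, which is easy to test numerically (a sketch for
      this special case only):

          import numpy as np

          def restricted_injectivity(Phi, x, tol=1e-10):
              """Check (RIC_x) for J = ||.||_1: Ker(Phi) ∩ T_x = {0} holds iff
              the columns of Phi on the support of x are linearly independent."""
              I = np.abs(x) > tol
              Phi_I = Phi[:, I]
              return np.linalg.matrix_rank(Phi_I) == Phi_I.shape[1]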
  29. ℓ2 stability
      Theorem: If ∃η ∈ D̄ ∧ (RICx0), then λ ∼ ||w|| ⇒ ||x⋆ − x0|| = O(||w||)
  30. Minimal-norm Certificate
      η ∈ D ⇐⇒ η = Φ∗q, η_T = e and J◦(η) ≤ 1
      Minimal-norm precertificate: η0 = argmin_{η = Φ∗q, η_T = e} ||q||
  31. Minimal-norm Certificate
      η ∈ D ⇐⇒ η = Φ∗q, η_T = e and J◦(η) ≤ 1
      Minimal-norm precertificate: η0 = argmin_{η = Φ∗q, η_T = e} ||q||
      Proposition: If (RICx), then η0 = (Φ_T^+ Φ)∗ e
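      In the ℓ1 case (T = {η : supp(η) ⊆ supp(x0)}, e = sign(x0)), η0 is a
      pseudo-inverse computation and tightness reduces to a sup-norm test off
      the support. A Python sketch of that special case:

          import numpy as np

          def minimal_norm_precertificate(Phi, x0, tol=1e-10):
              """Minimal-norm precertificate eta_0 = (Phi_I^+ Phi)* sign(x0_I)
              for J = ||.||_1.  eta_0 is a valid tight certificate iff
              ||eta_0||_inf < 1 off the support I of x0 (Fuchs-type test)."""
              I = np.abs(x0) > tol                      # support of x0
              Phi_I = Phi[:, I]
              s = np.sign(x0[I])
              q0 = np.linalg.pinv(Phi_I).T @ s          # q0 = (Phi_I^+)* s
              eta0 = Phi.T @ q0                         # eta0 = Phi* q0, so eta0_I = s
              is_tight = np.max(np.abs(eta0[~I])) < 1
              return eta0, is_tight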
  32. Model Selection
      Theorem: If η0 ∈ D̄, the noise-to-signal ratio is low enough and
      λ ∼ ||w||, then the unique solution x⋆ of (Pλ(y)) satisfies
      T_{x⋆} = T_{x0} and ||x⋆ − x0|| = O(||w||)
  33. A Better Certificate?
      • With model selection: no
      • Without: ongoing works (Duval-Peyré: sparse deconvolution;
        Dossal: sparse tomography)
  34. Sparse Spike Deconvolution (Dossal, 2005)
      Φx = Σ_i x_i φ(· − Δ_i), J(x) = ||x||_1
      [Figure: spikes x0 with separation γ and observation Φx0]
      η0 ∈ D̄ ⇔ ||Φ_{Ic}∗ Φ_I^{+,∗} s||_∞ < 1, where s = sign(x0,I)
      [Figure: ||η0,Ic||_∞ as a function of γ, crossing the level 1 at γcrit]
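      A toy illustration of this criterion (hypothetical sizes and kernel; it
      reuses minimal_norm_precertificate from the sketch above, with Φ built
      from shifted samples of a Gaussian φ on a grid):

          import numpy as np

          n, width = 100, 4.0
          t = np.arange(n)
          # Column j of Phi is the kernel phi shifted to position j.
          Phi = np.exp(-(t[:, None] - t[None, :]) ** 2 / (2 * width ** 2))

          for gamma in (3, 10, 30):                    # spike separations
              x0 = np.zeros(n)
              x0[n // 2] = x0[n // 2 + gamma] = 1.0    # two spikes, gap gamma
              eta0, tight = minimal_norm_precertificate(Phi, x0)
              off = np.abs(x0) == 0
              print(gamma, np.max(np.abs(eta0[off])), tight)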
  35. TV Denoising (V. et al., 2011)
      Φ = Id, J(x) = ||∇x||_1
      [Figure: two piecewise-constant signals (x_i versus i): a staircase with
      levels m_k, m_{k+1}, and a ±1 oscillating pattern]
  36. TV Denoising (V. et al., 2011)
      Φ = Id, J(x) = ||∇x||_1
      [Figure: the same two piecewise-constant signals; panel captions:
      "Support stability" and "No support stability"]
  37. TV Denoising (V. et al., 2011)
      Φ = Id, J(x) = ||∇x||_1
      [Figure: the same two piecewise-constant signals; panel captions:
      "Support stability" and "No support stability"]
      Both are ℓ2-stable.
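      A minimal numerical companion to this example (a sketch, assuming the
      discrete forward difference for ∇ and a projected-gradient solver on the
      dual; any 1D TV solver would do):

          import numpy as np

          def tv_denoise_1d(y, lam, n_iter=2000):
              """1D TV denoising: min_x 0.5 ||y - x||_2^2 + lam ||D x||_1,
              with D the forward finite difference, solved by projected
              gradient on the dual variable u, ||u||_inf <= lam."""
              D = lambda x: np.diff(x)                             # R^n -> R^{n-1}
              Dt = lambda u: np.concatenate(([-u[0]], -np.diff(u), [u[-1]]))  # adjoint
              u = np.zeros(len(y) - 1)
              tau = 0.25                       # step size, since ||D D^T|| <= 4
              for _ in range(n_iter):
                  u = np.clip(u - tau * D(Dt(u) - y), -lam, lam)
              return y - Dt(u)                 # primal solution x = y - D^T u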