
Model Selection with Partly Smooth Functions


ITWIST'14, The Arsenal, Namur, August 2014.


Samuel Vaiter

August 27, 2014

Transcript

  1. Model Selection with Partly Smooth Functions
    Samuel Vaiter, Gabriel Peyré and Jalal Fadili, vaiter@ceremade.dauphine.fr
    August 27, 2014, ITWIST'14
    Model Consistency of Partly Smooth Regularizers, arXiv:1405.1004, 2014.
  2. Linear Inverse Problems
    Forward model: $y = \Phi x_0 + w$, with a linear forward operator $\Phi : \mathbb{R}^n \to \mathbb{R}^q$ ($q \ll n$) → an ill-posed problem.
    Examples: denoising, inpainting, deblurring.
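As an illustration of this forward model (not from the slides), here is a minimal numpy sketch in which a random Gaussian Φ stands in for a generic measurement operator; all sizes and values are made up:

```python
import numpy as np

rng = np.random.default_rng(0)
n, q = 200, 50                        # q << n: far fewer measurements than unknowns
Phi = rng.standard_normal((q, n))     # illustrative linear forward operator

x0 = np.zeros(n)                      # unknown signal to recover
x0[[10, 50, 120]] = [2.0, -1.5, 3.0]

w = 0.01 * rng.standard_normal(q)     # measurement noise
y = Phi @ x0 + w                      # observations: y = Phi x0 + w

# Ill-posedness: Ker(Phi) has dimension >= n - q, so y alone cannot pin down x0.
print(n - np.linalg.matrix_rank(Phi))
```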
  3. Variational Regularization
    Trade-off between prior regularization and data fidelity:
    $x^\star \in \operatorname{Argmin}_{x \in \mathbb{R}^n} J(x) + \frac{1}{2\lambda} \|y - \Phi x\|^2$   (P_{y,λ})
    As $\lambda \to 0^+$ this becomes
    $x^\star \in \operatorname{Argmin}_{x \in \mathbb{R}^n} J(x)$ subject to $y = \Phi x$   (P_{y,0})
    Throughout, J is a convex, bounded from below and finite-valued function, typically non-smooth.
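In the special case J = ‖·‖₁, (P_{y,λ}) is the Lasso (minimizing J + (1/2λ)‖y − Φx‖² has the same solutions as minimizing λJ + ½‖y − Φx‖²), which proximal gradient descent (ISTA) solves. A minimal sketch, with step size and iteration count chosen for illustration:

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (component-wise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(Phi, y, lam, n_iter=2000):
    """Solve min_x lam * ||x||_1 + 0.5 * ||y - Phi @ x||^2 by proximal gradient."""
    tau = 1.0 / np.linalg.norm(Phi, 2) ** 2    # step size <= 1 / Lipschitz constant
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad_step = x - tau * Phi.T @ (Phi @ x - y)   # gradient step on the fidelity
        x = soft_threshold(grad_step, tau * lam)      # proximal step on lam * ||.||_1
    return x
```

With the Φ and y from the previous sketch, `ista(Phi, y, lam=0.1)` returns a sparse estimate, and the support shrinks as λ grows.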
  4. Objective
    [Figure: recovering $x_0$ from the observations $y$ via the estimate $x^\star$.]

  5. Low Complexity Models
    Sparsity: $J(x) = \sum_{i=1}^{n} |x_i|$, with model $\mathcal{M}_x = \{x' : \operatorname{supp}(x') \subseteq \operatorname{supp}(x)\}$.
    Group sparsity: $J(x) = \sum_{b \in \mathcal{B}} \|x_b\|$, with model $\mathcal{M}_x = \{x' : \operatorname{supp}(x') \subseteq \operatorname{supp}(x)\}$.
    Low rank: $J(x) = \sum_{i=1}^{n} |\sigma_i(x)|$, with model $\mathcal{M}_x = \{x' : \operatorname{rank}(x') = \operatorname{rank}(x)\}$.
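These penalties and model-membership tests are direct to evaluate numerically; a small sketch (the vector, the block structure B, and the test matrix are made up):

```python
import numpy as np

x = np.array([0.0, 2.0, -1.0, 0.0, 0.5, 0.0])

# Sparsity: J(x) = sum_i |x_i|
J_sparse = np.sum(np.abs(x))

# Group sparsity: J(x) = sum_{b in B} ||x_b||_2 for a block structure B
blocks = [[0, 1], [2, 3], [4, 5]]
J_group = sum(np.linalg.norm(x[b]) for b in blocks)

# Low rank: J(X) = sum_i sigma_i(X), the nuclear norm
X = np.outer([1.0, 2.0, 3.0], [1.0, -1.0])        # a rank-1 matrix
J_nuclear = np.sum(np.linalg.svd(X, compute_uv=False))

# Model membership for sparsity: x' in M_x iff supp(x') is inside supp(x)
def in_sparse_model(x_prime, x, tol=1e-12):
    return bool(np.all(np.abs(x)[np.abs(x_prime) > tol] > tol))
```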
  6. Partly Smooth Functions [Lewis 2002]
    J is partly smooth at x relative to a $C^2$-manifold $\mathcal{M}$ if:
    Smoothness. J restricted to $\mathcal{M}$ is $C^2$ around x.
    Sharpness. $\forall h \in (T_x\mathcal{M})^\perp$, $t \mapsto J(x + th)$ is non-smooth at $t = 0$.
    Continuity. $\partial J$ on $\mathcal{M}$ is continuous around x.
    Calculus: if J and G are partly smooth, then so are $J + G$, $J \circ D^*$ (D a linear operator), and the spectral lift $J \circ \sigma$.
    Examples: $\|\cdot\|_1$, $\|\nabla\cdot\|_1$, $\|\cdot\|_{1,2}$, $\|\cdot\|_*$, $\|\cdot\|_\infty$, and $\max_i (\langle d_i, x\rangle)_+$ are partly smooth.
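As a worked instance of the definition (a standard computation, not spelled out on the slide), one can verify the three axioms for $J = \|\cdot\|_1$ at $x$ with $I = \operatorname{supp}(x)$, relative to the subspace $\mathcal{M}_x$ of the previous slide:

```latex
% Partial smoothness of the l1 norm at x, with I = supp(x) and
% M_x = { x' : supp(x') \subseteq I } (a linear subspace, hence a C^2-manifold).
\begin{align*}
  &\text{Smoothness: on } \mathcal{M}_x \text{ near } x,\quad
    J(x') = \sum_{i \in I} \operatorname{sign}(x_i)\, x_i'
    \quad\text{is linear, hence } C^2. \\
  &\text{Sharpness: } h \in (T_x \mathcal{M}_x)^{\perp} \iff \operatorname{supp}(h) \subseteq I^c,
    \text{ and then } J(x + t h) = J(x) + |t|\, \lVert h \rVert_1. \\
  &\text{Continuity: } \partial J(x') =
    \{ \eta : \eta_I = \operatorname{sign}(x_I'),\ \lVert \eta_{I^c} \rVert_\infty \le 1 \}
    \text{ is constant on } \mathcal{M}_x \text{ near } x.
\end{align*}
```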
  7. Dual Certificates
    $x^\star \in \operatorname{Argmin}_{x \in \mathbb{R}^n} J(x)$ subject to $y = \Phi x$   (P_{y,0})
    Source condition: $\Phi^* p \in \partial J(x)$.
    Non-degenerate source condition: $\Phi^* p \in \operatorname{ri} \partial J(x)$.
    [Figure: the subdifferential $\partial J(x)$, the feasible set $\Phi x = \Phi x_0$, and the vector $\Phi^* p$.]
    Proposition. There exists a dual certificate p if, and only if, $x_0$ is a solution of (P_{y,0}).
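For concreteness (a standard specialization, not on the slide): with $J = \|\cdot\|_1$ and $I = \operatorname{supp}(x)$, the two conditions read:

```latex
\Phi^* p \in \partial \lVert x \rVert_1
  \iff (\Phi^* p)_I = \operatorname{sign}(x_I)
       \ \text{and}\ \lVert (\Phi^* p)_{I^c} \rVert_\infty \le 1,
\qquad
\Phi^* p \in \operatorname{ri} \partial \lVert x \rVert_1
  \iff (\Phi^* p)_I = \operatorname{sign}(x_I)
       \ \text{and}\ \lVert (\Phi^* p)_{I^c} \rVert_\infty < 1.
```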
  8. Linearized Precertificate
    Minimal norm certificate: $p_0 = \operatorname{argmin}_p \|p\|$ subject to $\Phi^* p \in \partial J(x_0)$.
    Linearized precertificate: $p_F = \operatorname{argmin}_p \|p\|$ subject to $\Phi^* p \in \operatorname{aff} \partial J(x_0)$.
    Proposition. Assume $\operatorname{Ker} \Phi \cap T_{x_0}\mathcal{M} = \{0\}$. Then $\Phi^* p_F \in \operatorname{ri} \partial J(x_0) \Rightarrow p_F = p_0$.
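For $J = \|\cdot\|_1$, $\operatorname{aff} \partial J(x_0)$ is the affine space $\{\eta : \eta_I = \operatorname{sign}(x_{0,I})\}$ with $I = \operatorname{supp}(x_0)$, so $p_F$ is the least-norm solution of the linear system $\Phi_I^* p = \operatorname{sign}(x_{0,I})$. A sketch (the function name is mine; it assumes $\Phi_I$ has full column rank so the system is consistent):

```python
import numpy as np

def linearized_precertificate(Phi, x0, tol=1e-12):
    """p_F = argmin ||p|| s.t. Phi_I^T p = sign(x0_I), for J = ||.||_1."""
    I = np.abs(x0) > tol
    s = np.sign(x0[I])
    pF = np.linalg.pinv(Phi[:, I].T) @ s            # least-norm solution via pseudo-inverse
    eta = Phi.T @ pF
    nondegenerate = np.max(np.abs(eta[~I])) < 1.0   # ||(Phi^* pF)_{I^c}||_inf < 1 ?
    return pF, nondegenerate
```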
  9. Manifold Selection
    Theorem. Assume J is partly smooth at $x_0$ relative to $\mathcal{M}$. If $\Phi^* p_F \in \operatorname{ri} \partial J(x_0)$ and $\operatorname{Ker} \Phi \cap T_{x_0}\mathcal{M} = \{0\}$, then there exists $C > 0$ such that whenever $\max(\lambda, \|w\|/\lambda) \le C$, the unique solution $x^\star$ of (P_{y,λ}) satisfies $x^\star \in \mathcal{M}$ and $\|x^\star - x_0\| = O(\|w\|)$.
    Almost sharp analysis: $\Phi^* p_F \notin \partial J(x_0) \Rightarrow x^\star \notin \mathcal{M}_{x_0}$.
    Special cases: [Fuchs 2004] for $\ell_1$; [Bach 2008] for $\ell_1 - \ell_2$ and the nuclear norm.
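A small self-contained numerical check of the theorem in the ℓ₁ case (sizes, sparsity, λ, and noise level are all illustrative; with these random draws the non-degeneracy condition typically holds):

```python
import numpy as np

rng = np.random.default_rng(1)
q, n = 50, 100
Phi = rng.standard_normal((q, n)) / np.sqrt(q)
x0 = np.zeros(n); x0[[0, 1, 2]] = [2.0, -3.0, 1.5]
w = 1e-3 * rng.standard_normal(q)
y = Phi @ x0 + w

# Certificate condition for J = ||.||_1
I = np.abs(x0) > 0
pF = np.linalg.pinv(Phi[:, I].T) @ np.sign(x0[I])
print(np.max(np.abs((Phi.T @ pF)[~I])))      # should be < 1 (non-degeneracy)

# Solve (P_{y,lambda}) with lambda of the order of the noise level, via ISTA
lam = 10 * np.linalg.norm(w)
tau = 1.0 / np.linalg.norm(Phi, 2) ** 2
x = np.zeros(n)
for _ in range(5000):
    g = x - tau * Phi.T @ (Phi @ x - y)
    x = np.sign(g) * np.maximum(np.abs(g) - tau * lam, 0.0)

# Manifold selection: same support as x0, and error of the order of ||w||
print(np.flatnonzero(np.abs(x) > 1e-6))      # expected: [0 1 2] = supp(x0)
print(np.linalg.norm(x - x0), np.linalg.norm(w))
```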
  10. Sparse Spike Deconvolution
    $\Phi x = \sum_i x_i\, \varphi(\cdot - \Delta i)$, $J(x) = \|x\|_1$.
    [Figure: the spikes $x_0$ and the observation $\Phi x_0$ for a kernel of width γ.]
    With $I = \operatorname{supp}(x_0)$:
    $\Phi^* p_F \in \operatorname{ri} \partial J(x_0) \iff \|\Phi_{I^c}^* \Phi_I^{+,*} \operatorname{sign}(x_{0,I})\|_\infty < 1 \iff$ stable recovery.
    [Figure: $\|\eta_{0,I^c}\|_\infty$ as a function of γ, crossing the level 1 at a critical width $\gamma_{\text{crit}}$.]
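A discrete sketch of this experiment (grid size, spike positions, and kernel widths are made up): build Φ whose columns are shifted Gaussian kernels, then track the criterion as the width γ grows; recovery degrades past a critical width, as the slide indicates.

```python
import numpy as np

def deconv_operator(n, gamma):
    """Columns are Gaussian kernels of width gamma centered on each grid point."""
    t = np.arange(n)
    return np.exp(-(t[:, None] - t[None, :]) ** 2 / (2.0 * gamma ** 2))

n = 100
x0 = np.zeros(n); x0[[40, 55]] = [1.0, -1.0]   # two nearby spikes of opposite sign
I = np.abs(x0) > 0

for gamma in [1.0, 2.0, 4.0, 8.0]:
    Phi = deconv_operator(n, gamma)
    pF = np.linalg.pinv(Phi[:, I].T) @ np.sign(x0[I])
    crit = np.max(np.abs((Phi.T @ pF)[~I]))    # Fuchs criterion; < 1 => stable recovery
    print(f"gamma={gamma}: criterion={crit:.3f}")
```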
  11. 1D Total Variation and Jump Set
    $J = \|\nabla_d \cdot\|_1$, $\mathcal{M}_x = \{x' : \operatorname{supp}(\nabla_d x') \subseteq \operatorname{supp}(\nabla_d x)\}$, $\Phi = \operatorname{Id}$.
    [Figure: a signal $x_i$ and the dual vector $u_k$ with $\Phi^* p_F = \operatorname{div} u$, illustrating a stable jump (±1) versus an unstable jump.]
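On a discrete grid $\nabla_d$ is just forward differences, so the jump set and the model $\mathcal{M}_x$ are easy to inspect; a minimal sketch with a made-up piecewise-constant signal:

```python
import numpy as np

def jump_set(x, tol=1e-12):
    """Support of the discrete gradient: indices where x jumps."""
    return np.flatnonzero(np.abs(np.diff(x)) > tol)

x = np.array([0.0, 0.0, 2.0, 2.0, 2.0, -1.0, -1.0])
print(jump_set(x))                                   # jumps at indices 1 and 4

# x' belongs to M_x iff its jump set is contained in that of x
x_prime = np.array([0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0])
print(set(jump_set(x_prime)) <= set(jump_set(x)))    # True: one jump, at index 1
```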
  12. Take-away Message
    Partial smoothness encodes models using singularities.

  13. Future Work
    Extended-valued functions, minimization under constraints:
    $\min_{x \in \mathbb{R}^n} \frac{1}{2}\|y - \Phi x\|^2 + \lambda J(x)$ subject to $x \ge 0$.
    Non-convexity, in the fidelity and the regularization, e.g. dictionary learning:
    $\min_{x_k \in \mathbb{R}^n,\, D \in \mathcal{D}} \sum_k \frac{1}{2}\|y - \Phi D x_k\|^2 + \lambda J(x_k)$.
    Infinite dimensional problems, partial smoothness for BV and Besov spaces:
    $\min_{f \in \mathrm{BV}(\Omega) \cap L^2(\Omega)} \frac{1}{2}\|g - \Psi f\|_{L^2(\Omega)}^2 + \lambda\, |Df|(\Omega)$.
    Compressed sensing: optimal bounds for partly smooth regularizers.
  14. Thanks for your attention