Slide 1

Model Selection with Partly Smooth Functions
Samuel Vaiter, Gabriel Peyré and Jalal Fadili ([email protected])
ITWIST'14, August 27, 2014
Reference: Model Consistency of Partly Smooth Regularizers, arXiv:1405.1004, 2014

Slide 2

Linear Inverse Problems
Forward model: $y = \Phi x_0 + w$
Forward operator: $\Phi : \mathbb{R}^n \to \mathbb{R}^q$ linear ($q \leq n$)

Slide 3

Linear Inverse Problems
Forward model: $y = \Phi x_0 + w$
Forward operator: $\Phi : \mathbb{R}^n \to \mathbb{R}^q$ linear ($q \leq n$) → ill-posed problem

Slide 4

Linear Inverse Problems
Forward model: $y = \Phi x_0 + w$
Forward operator: $\Phi : \mathbb{R}^n \to \mathbb{R}^q$ linear ($q \leq n$) → ill-posed problem
Examples: denoising, inpainting, deblurring

Slide 5

Variational Regularization
Trade-off between prior regularization and data fidelity

Slide 6

Variational Regularization
Trade-off between prior regularization and data fidelity
$x \in \operatorname{Argmin}_{x \in \mathbb{R}^n} \; J(x) + \frac{1}{2\lambda} \|y - \Phi x\|^2 \qquad (\mathcal{P}_{y,\lambda})$

Slide 7

Variational Regularization
Trade-off between prior regularization and data fidelity
$x \in \operatorname{Argmin}_{x \in \mathbb{R}^n} \; J(x) + \frac{1}{2\lambda} \|y - \Phi x\|^2 \qquad (\mathcal{P}_{y,\lambda})$
As $\lambda \to 0^+$:
$x \in \operatorname{Argmin}_{x \in \mathbb{R}^n} \; J(x) \ \text{subject to}\ y = \Phi x \qquad (\mathcal{P}_{y,0})$

Slide 8

Variational Regularization
Trade-off between prior regularization and data fidelity
$x \in \operatorname{Argmin}_{x \in \mathbb{R}^n} \; J(x) + \frac{1}{2\lambda} \|y - \Phi x\|^2 \qquad (\mathcal{P}_{y,\lambda})$
As $\lambda \to 0^+$:
$x \in \operatorname{Argmin}_{x \in \mathbb{R}^n} \; J(x) \ \text{subject to}\ y = \Phi x \qquad (\mathcal{P}_{y,0})$
$J$ is a convex, finite-valued function, bounded from below and typically non-smooth.
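
To make $(\mathcal{P}_{y,\lambda})$ concrete, here is a minimal Python sketch solving it for $J = \|\cdot\|_1$ by forward-backward splitting (ISTA); the random instance, step size, and iteration count are illustrative assumptions, not part of the talk.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 (soft thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def solve_Pylam(Phi, y, lam, n_iter=5000):
    """Forward-backward (ISTA) for (P_{y,lam}) with J = ||.||_1, using the
    equivalent rescaled form  min_x  lam*||x||_1 + 0.5*||y - Phi x||^2."""
    t = 1.0 / np.linalg.norm(Phi, 2) ** 2      # step size 1/L, L = ||Phi||^2
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x - t * (Phi.T @ (Phi @ x - y)), t * lam)
    return x

# Usage: a small random instance with a 3-sparse x0.
rng = np.random.default_rng(0)
q, n = 20, 50
Phi = rng.standard_normal((q, n)) / np.sqrt(q)
x0 = np.zeros(n); x0[[3, 17, 31]] = [1.5, -2.0, 1.0]
y = Phi @ x0 + 0.01 * rng.standard_normal(q)
x_hat = solve_Pylam(Phi, y, lam=0.05)
print(np.flatnonzero(np.abs(x_hat) > 1e-6))    # support of the solution
```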

Slide 9

Objective
[Figure: from the signal $x_0$ and the observations $y$, recover an estimate $x$]

Slide 10

Low Complexity Models
Sparsity: $J(x) = \sum_{i=1}^{n} |x_i|$, model $\mathcal{M}_x = \{ x' : \operatorname{supp}(x') \subseteq \operatorname{supp}(x) \}$

Slide 11

Low Complexity Models
Sparsity: $J(x) = \sum_{i=1}^{n} |x_i|$, model $\mathcal{M}_x = \{ x' : \operatorname{supp}(x') \subseteq \operatorname{supp}(x) \}$
Group sparsity: $J(x) = \sum_{b \in \mathcal{B}} \|x_b\|$

Slide 12

Low Complexity Models
Sparsity: $J(x) = \sum_{i=1}^{n} |x_i|$, model $\mathcal{M}_x = \{ x' : \operatorname{supp}(x') \subseteq \operatorname{supp}(x) \}$
Group sparsity: $J(x) = \sum_{b \in \mathcal{B}} \|x_b\|$
Low rank: $J(x) = \sum_{i=1}^{n} |\sigma_i(x)|$, model $\mathcal{M}_x = \{ x' : \operatorname{rank}(x') = \operatorname{rank}(x) \}$
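
As a quick illustration (my own, not on the slides), the three regularizers are a few lines of NumPy each; encoding groups as index lists is an assumed convention.

```python
import numpy as np

def l1(x):
    """Sparsity: J(x) = sum_i |x_i|."""
    return np.sum(np.abs(x))

def group_l1(x, groups):
    """Group sparsity: J(x) = sum_{b in B} ||x_b||, groups as index lists."""
    return sum(np.linalg.norm(x[b]) for b in groups)

def nuclear(X):
    """Low rank: J(X) = sum_i sigma_i(X) (nuclear norm)."""
    return np.sum(np.linalg.svd(X, compute_uv=False))

x = np.array([0.0, 2.0, -1.0, 0.0])
print(l1(x))                          # 3.0
print(group_l1(x, [[0, 1], [2, 3]]))  # ||(0,2)|| + ||(-1,0)|| = 3.0
print(nuclear(np.outer(x, x)))        # rank-1 matrix: ||x||^2 = 5.0
```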

Slide 13

Partly Smooth Functions [Lewis 2002]
[Figure: a manifold $\mathcal{M}$ through $x$ with tangent space $T_x\mathcal{M}$]
$J$ is partly smooth at $x$ relative to a $C^2$-manifold $\mathcal{M}$ if:
Smoothness. $J$ restricted to $\mathcal{M}$ is $C^2$ around $x$.
Sharpness. $\forall h \in (T_x\mathcal{M})^\perp$, $t \mapsto J(x + th)$ is non-smooth at $t = 0$.
Continuity. $\partial J$ on $\mathcal{M}$ is continuous around $x$.

Slide 14

Partly Smooth Functions [Lewis 2002]
[Figure: a manifold $\mathcal{M}$ through $x$ with tangent space $T_x\mathcal{M}$]
$J$ is partly smooth at $x$ relative to a $C^2$-manifold $\mathcal{M}$ if:
Smoothness. $J$ restricted to $\mathcal{M}$ is $C^2$ around $x$.
Sharpness. $\forall h \in (T_x\mathcal{M})^\perp$, $t \mapsto J(x + th)$ is non-smooth at $t = 0$.
Continuity. $\partial J$ on $\mathcal{M}$ is continuous around $x$.
$J$, $G$ partly smooth ⇒ $J + G$, $J \circ D^*$ ($D$ a linear operator), and $J \circ \sigma$ (spectral lift) are partly smooth.
In particular $\|\cdot\|_1$, $\|\nabla \cdot\|_1$, $\|\cdot\|_{1,2}$, $\|\cdot\|_*$, $\|\cdot\|_\infty$ and $\max_i \langle d_i, x \rangle_+$ are partly smooth.
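
A worked example may help here; the following sketch, my own expansion following the standard argument, checks the three properties for $J = \|\cdot\|_1$ relative to $\mathcal{M}_x = \{x' : \operatorname{supp}(x') \subseteq \operatorname{supp}(x)\}$:

```latex
% Sketch: partial smoothness of J = \|\cdot\|_1 at x; write I = \operatorname{supp}(x).
%
% Smoothness: near x on \mathcal{M}_x the signs on I are locally constant, so
\[ J(x') = \sum_{i \in I} \operatorname{sign}(x_i)\, x'_i \]
% is linear, hence C^2 on \mathcal{M}_x around x.
%
% Sharpness: T_x\mathcal{M}_x = \{h : \operatorname{supp}(h) \subseteq I\}, so any
% h \in (T_x\mathcal{M}_x)^\perp vanishes on I, and for such h \neq 0
\[ J(x + t h) = J(x) + |t|\,\|h\|_1 \]
% is non-smooth at t = 0.
%
% Continuity: for x' \in \mathcal{M}_x close to x, \operatorname{supp}(x') = I and
\[ \partial J(x') = \{\eta : \eta_I = \operatorname{sign}(x_I),\ \|\eta_{I^c}\|_\infty \le 1\} \]
% is locally constant, hence continuous on \mathcal{M}_x around x.
```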

Slide 15

Dual Certificates
$x \in \operatorname{Argmin}_{x \in \mathbb{R}^n} \; J(x) \ \text{subject to}\ y = \Phi x \qquad (\mathcal{P}_{y,0})$

Slide 16

Dual Certificates
$x \in \operatorname{Argmin}_{x \in \mathbb{R}^n} \; J(x) \ \text{subject to}\ y = \Phi x \qquad (\mathcal{P}_{y,0})$
Source condition: $\Phi^* p \in \partial J(x)$
[Figure: the subdifferential $\partial J(x)$ at $x$, the feasible set $\Phi x = \Phi x_0$, and $\Phi^* p$]

Slide 17

Dual Certificates
$x \in \operatorname{Argmin}_{x \in \mathbb{R}^n} \; J(x) \ \text{subject to}\ y = \Phi x \qquad (\mathcal{P}_{y,0})$
Source condition: $\Phi^* p \in \partial J(x)$
[Figure: the subdifferential $\partial J(x)$ at $x$, the feasible set $\Phi x = \Phi x_0$, and $\Phi^* p$]
Proposition. There exists a dual certificate $p$ if and only if $x_0$ is a solution of $(\mathcal{P}_{y,0})$.

Slide 18

Dual Certificates
$x \in \operatorname{Argmin}_{x \in \mathbb{R}^n} \; J(x) \ \text{subject to}\ y = \Phi x \qquad (\mathcal{P}_{y,0})$
Source condition: $\Phi^* p \in \partial J(x)$
Non-degenerate source condition: $\Phi^* p \in \operatorname{ri} \partial J(x)$
[Figure: the subdifferential $\partial J(x)$ at $x$, the feasible set $\Phi x = \Phi x_0$, and $\Phi^* p$]
Proposition. There exists a dual certificate $p$ if and only if $x_0$ is a solution of $(\mathcal{P}_{y,0})$.

Slide 19

Linearized Precertificate
Minimal-norm certificate: $p_0 = \operatorname{argmin}_p \|p\| \ \text{subject to}\ \Phi^* p \in \partial J(x_0)$

Slide 20

Linearized Precertificate
Minimal-norm certificate: $p_0 = \operatorname{argmin}_p \|p\| \ \text{subject to}\ \Phi^* p \in \partial J(x_0)$
Linearized precertificate: $p_F = \operatorname{argmin}_p \|p\| \ \text{subject to}\ \Phi^* p \in \operatorname{aff} \partial J(x_0)$

Slide 21

Linearized Precertificate
Minimal-norm certificate: $p_0 = \operatorname{argmin}_p \|p\| \ \text{subject to}\ \Phi^* p \in \partial J(x_0)$
Linearized precertificate: $p_F = \operatorname{argmin}_p \|p\| \ \text{subject to}\ \Phi^* p \in \operatorname{aff} \partial J(x_0)$
Proposition. Assume $\operatorname{Ker} \Phi \cap T_{x_0}\mathcal{M} = \{0\}$. Then $\Phi^* p_F \in \operatorname{ri} \partial J(x_0) \Rightarrow p_F = p_0$.
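
For $J = \|\cdot\|_1$ the precertificate has a closed form, since $\operatorname{aff} \partial J(x_0) = \{\eta : \eta_I = \operatorname{sign}(x_{0,I})\}$ with $I = \operatorname{supp}(x_0)$. A minimal Python sketch (function name and tolerance are my own):

```python
import numpy as np

def linearized_precertificate_l1(Phi, x0):
    """J = ||.||_1: aff(dJ(x0)) = {eta : eta_I = sign(x0_I)}, I = supp(x0),
    so p_F = argmin ||p|| s.t. Phi_I^T p = sign(x0_I) is the minimal-norm
    solution, given by the pseudo-inverse (Phi_I has full column rank when
    Ker(Phi) meets the model tangent space only at {0})."""
    I = np.flatnonzero(x0)
    s = np.sign(x0[I])
    pF = np.linalg.pinv(Phi[:, I].T) @ s
    eta = Phi.T @ pF
    Ic = np.setdiff1d(np.arange(Phi.shape[1]), I)
    nondegenerate = np.max(np.abs(eta[Ic])) < 1.0  # Phi^* p_F in ri dJ(x0)?
    return pF, nondegenerate
```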

Slide 22

Manifold Selection
Theorem. Assume $J$ is partly smooth at $x_0$ relative to $\mathcal{M}$, $\Phi^* p_F \in \operatorname{ri} \partial J(x_0)$, and $\operatorname{Ker} \Phi \cap T_{x_0}\mathcal{M} = \{0\}$. Then there exists $C > 0$ such that if $\max(\lambda, \|w\|/\lambda) \leq C$, the unique solution $x$ of $(\mathcal{P}_{y,\lambda})$ satisfies $x \in \mathcal{M}$ and $\|x - x_0\| = O(\|w\|)$.

Slide 23

Manifold Selection
Theorem. Assume $J$ is partly smooth at $x_0$ relative to $\mathcal{M}$, $\Phi^* p_F \in \operatorname{ri} \partial J(x_0)$, and $\operatorname{Ker} \Phi \cap T_{x_0}\mathcal{M} = \{0\}$. Then there exists $C > 0$ such that if $\max(\lambda, \|w\|/\lambda) \leq C$, the unique solution $x$ of $(\mathcal{P}_{y,\lambda})$ satisfies $x \in \mathcal{M}$ and $\|x - x_0\| = O(\|w\|)$.
Almost sharp analysis: $\Phi^* p_F \notin \partial J(x_0) \Rightarrow x \notin \mathcal{M}_{x_0}$.

Slide 24

Manifold Selection
Theorem. Assume $J$ is partly smooth at $x_0$ relative to $\mathcal{M}$, $\Phi^* p_F \in \operatorname{ri} \partial J(x_0)$, and $\operatorname{Ker} \Phi \cap T_{x_0}\mathcal{M} = \{0\}$. Then there exists $C > 0$ such that if $\max(\lambda, \|w\|/\lambda) \leq C$, the unique solution $x$ of $(\mathcal{P}_{y,\lambda})$ satisfies $x \in \mathcal{M}$ and $\|x - x_0\| = O(\|w\|)$.
Almost sharp analysis: $\Phi^* p_F \notin \partial J(x_0) \Rightarrow x \notin \mathcal{M}_{x_0}$.
Special cases: [Fuchs 2004] for $\ell^1$; [Bach 2008] for $\ell^1\!-\!\ell^2$ and the nuclear norm.
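
A numerical sanity check of the theorem for $J = \|\cdot\|_1$ (a sketch reusing `solve_Pylam` and `linearized_precertificate_l1` from the earlier snippets; the scaling $\lambda \propto \|w\|$ and the thresholds are assumptions):

```python
import numpy as np

# When the certificate is non-degenerate, the support of the solution
# should match supp(x0) once the noise is small enough.
rng = np.random.default_rng(1)
q, n = 30, 60
Phi = rng.standard_normal((q, n)) / np.sqrt(q)
x0 = np.zeros(n); x0[[5, 20, 44]] = [2.0, -1.5, 1.0]
pF, ok = linearized_precertificate_l1(Phi, x0)
print("non-degenerate certificate:", ok)
for sigma in [1e-1, 1e-2, 1e-3]:
    w = sigma * rng.standard_normal(q)
    x_hat = solve_Pylam(Phi, Phi @ x0 + w, lam=4 * np.linalg.norm(w))
    recovered = np.array_equal(np.flatnonzero(np.abs(x_hat) > 1e-6),
                               np.flatnonzero(x0))
    print(f"sigma={sigma:.0e}: supp(x_hat) == supp(x0)? {recovered}")
```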

Slide 25

Sparse Spike Deconvolution
[Figure: spike train $x_0$]

Slide 26

Sparse Spike Deconvolution
$\Phi x = \sum_i x_i\, \varphi(\cdot - \Delta i)$, $\quad J(x) = \|x\|_1$
[Figure: spikes $x_0$ and the blurred observation $\Phi x_0$, kernel scale $\gamma$]

Slide 27

Sparse Spike Deconvolution
$\Phi x = \sum_i x_i\, \varphi(\cdot - \Delta i)$, $\quad J(x) = \|x\|_1$
[Figure: spikes $x_0$ and the blurred observation $\Phi x_0$, kernel scale $\gamma$]
$\eta_F = \Phi^* p_F \in \operatorname{ri} \partial J(x_0) \;\Leftrightarrow\; \|\Phi_{I^c}^* \Phi_I^{+,*} \operatorname{sign}(x_{0,I})\|_\infty < 1 \;\Leftrightarrow\;$ stable recovery, where $I = \operatorname{supp}(x_0)$
[Plot: $\|\eta_{0,I^c}\|_\infty$ versus $\gamma$, crossing the level 1 at $\gamma_{\mathrm{crit}}$]
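
The criterion is easy to evaluate on a discretized grid; in this sketch the Gaussian kernel, grid, and spike layout are illustrative assumptions:

```python
import numpy as np

def deconv_criterion(gamma, n=64, Delta=1.0, spikes=(20, 28), signs=(1.0, -1.0)):
    """||Phi_{I^c}^* Phi_I^{+,*} sign(x0_I)||_inf for a sampled convolution
    operator Phi[:, i] = phi(. - Delta*i), here with a Gaussian phi of
    width gamma."""
    t = Delta * np.arange(n)
    Phi = np.exp(-(t[:, None] - t[None, :]) ** 2 / (2 * gamma ** 2))
    I = np.array(spikes)
    pF = np.linalg.pinv(Phi[:, I].T) @ np.array(signs)  # precertificate
    eta = Phi.T @ pF
    Ic = np.setdiff1d(np.arange(n), I)
    return np.max(np.abs(eta[Ic]))

# The value crossing the level 1 as gamma grows locates gamma_crit.
for gamma in [0.5, 1.0, 2.0, 4.0]:
    print(f"gamma={gamma}: criterion={deconv_criterion(gamma):.3f}")
```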

Slide 28

1D Total Variation and Jump Set
$J = \|\nabla_d \cdot\|_1$, $\quad \mathcal{M}_x = \{ x' : \operatorname{supp}(\nabla_d x') \subseteq \operatorname{supp}(\nabla_d x) \}$, $\quad \Phi = \operatorname{Id}$

Slide 29

1D Total Variation and Jump Set
$J = \|\nabla_d \cdot\|_1$, $\quad \mathcal{M}_x = \{ x' : \operatorname{supp}(\nabla_d x') \subseteq \operatorname{supp}(\nabla_d x) \}$, $\quad \Phi = \operatorname{Id}$
$\Phi^* p_F = \operatorname{div} u$
[Figure: a piecewise-constant signal $x_i$ with the dual vector $u_k$ interpolating $\pm 1$ at the jumps; stable vs. unstable jumps]
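
For 1D TV the precertificate computation is fully explicit; here is a sketch under my own sign conventions (forward differences $G = \nabla_d$, with the slide's $\operatorname{div}$ read as $-\nabla_d^*$ up to sign):

```python
import numpy as np

def tv_jump_stability(x0, tol=1e-12):
    """1D TV with Phi = Id. With G the forward-difference operator,
    aff(dJ(x0)) = {G^T u : u_J = sign((G x0)_J)}, J the jump set. Since G^T
    is injective, p_F corresponds to the u minimizing ||G^T u|| over the
    free entries u_{J^c}; the jumps are stable iff ||u_{J^c}||_inf < 1."""
    n = x0.size
    G = np.diff(np.eye(n), axis=0)          # (n-1) x n finite differences
    g = G @ x0
    J = np.flatnonzero(np.abs(g) > tol)
    Jc = np.setdiff1d(np.arange(n - 1), J)
    s = np.sign(g[J])
    if Jc.size == 0:
        return True
    A, B = G[Jc].T, G[J].T                  # split the columns of G^T
    u_free = np.linalg.lstsq(A, -B @ s, rcond=None)[0]
    return bool(np.all(np.abs(u_free) < 1.0))

# A single bump away from the boundary: the interpolating u stays strictly
# inside (-1, 1), so both jumps are stable (prints True).
print(tv_jump_stability(np.array([0., 0., 1., 1., 1., 0., 0.])))
```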

Slide 30

Take-away Message
Partial smoothness: encodes models using singularities

Slide 31

Future Work
Extended-valued functions: minimization under constraints
$\min_{x \in \mathbb{R}^n} \frac{1}{2}\|y - \Phi x\|^2 + \lambda J(x) \ \text{subject to}\ x \geq 0$

Slide 32

Future Work
Extended-valued functions: minimization under constraints
$\min_{x \in \mathbb{R}^n} \frac{1}{2}\|y - \Phi x\|^2 + \lambda J(x) \ \text{subject to}\ x \geq 0$
Non-convexity: fidelity and regularization, dictionary learning
$\min_{x_k \in \mathbb{R}^n,\, D \in \mathcal{D}} \sum_k \frac{1}{2}\|y - \Phi D x_k\|^2 + \lambda J(x_k)$

Slide 33

Future Work
Extended-valued functions: minimization under constraints
$\min_{x \in \mathbb{R}^n} \frac{1}{2}\|y - \Phi x\|^2 + \lambda J(x) \ \text{subject to}\ x \geq 0$
Non-convexity: fidelity and regularization, dictionary learning
$\min_{x_k \in \mathbb{R}^n,\, D \in \mathcal{D}} \sum_k \frac{1}{2}\|y - \Phi D x_k\|^2 + \lambda J(x_k)$
Infinite-dimensional problems: partial smoothness for BV, Besov spaces
$\min_{f \in \mathrm{BV}(\Omega) \cap L^2(\Omega)} \frac{1}{2}\|g - \Psi f\|_{L^2(\Omega)}^2 + \lambda\, |Df|(\Omega)$

Slide 34

Future Work
Extended-valued functions: minimization under constraints
$\min_{x \in \mathbb{R}^n} \frac{1}{2}\|y - \Phi x\|^2 + \lambda J(x) \ \text{subject to}\ x \geq 0$
Non-convexity: fidelity and regularization, dictionary learning
$\min_{x_k \in \mathbb{R}^n,\, D \in \mathcal{D}} \sum_k \frac{1}{2}\|y - \Phi D x_k\|^2 + \lambda J(x_k)$
Infinite-dimensional problems: partial smoothness for BV, Besov spaces
$\min_{f \in \mathrm{BV}(\Omega) \cap L^2(\Omega)} \frac{1}{2}\|g - \Psi f\|_{L^2(\Omega)}^2 + \lambda\, |Df|(\Omega)$
Compressed sensing: optimal bounds for partly smooth regularizers

Slide 35

Thanks for your attention