Slide 1

Recovery Guarantees for Low Complexity Models
Samuel Vaiter, CEREMADE, Univ. Paris-Dauphine
October 17, 2013, Séminaire Image, IMB, Bordeaux I

Slide 2

People & Papers
• V., M. Golbabaee, M. J. Fadili and G. Peyré, Model Selection with Piecewise Regular Gauges, Tech. report, http://arxiv.org/abs/1307.2342, 2013
• J. Fadili, V. and G. Peyré, Linear Convergence Rates for Gauge Regularization, ongoing work

Slide 3

Our work
• Study of models, not algorithms
• Impact of the noise
• Understand the geometry

3 parts of this talk:
1. Defining our framework of interest
2. Noise robustness results
3. Some examples

Slide 4

Linear Inverse Problem
[Example images: denoising, inpainting, deblurring]

Slide 5

Linear Inverse Problem: Forward Model

y = Φ x_0 + w

• y ∈ R^Q: observations
• Φ ∈ R^(Q×N): linear operator
• x_0 ∈ R^N: unknown vector
• w ∈ R^Q: realization of a noise (bounded here)

Slide 6

Linear Inverse Problem: Forward Model

y = Φ x_0 + w

• y ∈ R^Q: observations
• Φ ∈ R^(Q×N): linear operator
• x_0 ∈ R^N: unknown vector
• w ∈ R^Q: realization of a noise (bounded here)

Objective: Recover x_0 from y.
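
A minimal numerical sketch of this forward model; the Gaussian Φ, the sparsity of x_0, the noise level and the seed are illustrative choices, not from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
Q, N = 48, 128                                   # fewer observations than unknowns
Phi = rng.standard_normal((Q, N)) / np.sqrt(Q)   # linear operator
x0 = np.zeros(N)
x0[rng.choice(N, size=5, replace=False)] = rng.standard_normal(5)  # a 5-sparse x0
w = 0.01 * rng.standard_normal(Q)                # bounded noise realization
y = Phi @ x0 + w                                 # observations
```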

Slide 7

The Variational Approach

x⋆ ∈ argmin_{x ∈ R^N} F(y, x) + λ J(x)    (Pλ(y))

Trade-off between data fidelity and prior regularization.

Slide 8

The Variational Approach

x⋆ ∈ argmin_{x ∈ R^N} F(y, x) + λ J(x)    (Pλ(y))

Trade-off between data fidelity and prior regularization.
• Data fidelity: ℓ2 loss, logistic, etc. F(y, x) = (1/2) ||y − Φx||_2^2

Slide 9

The Variational Approach

x⋆ ∈ argmin_{x ∈ R^N} F(y, x) + λ J(x)    (Pλ(y))

Trade-off between data fidelity and prior regularization.
• Data fidelity: ℓ2 loss, logistic, etc. F(y, x) = (1/2) ||y − Φx||_2^2
• Parameter: by hand, or automatic like SURE.

Slide 10

The Variational Approach

x⋆ ∈ argmin_{x ∈ R^N} F(y, x) + λ J(x)    (Pλ(y))

Trade-off between data fidelity and prior regularization.
• Data fidelity: ℓ2 loss, logistic, etc. F(y, x) = (1/2) ||y − Φx||_2^2
• Parameter: by hand, or automatic like SURE.
• Regularization: ?
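
As a concrete instance, a sketch of solving (Pλ(y)) with F the ℓ2 loss and J = ||·||_1, by proximal gradient descent (ISTA). The choice of regularizer anticipates the next slides; λ and the iteration count are illustrative:

```python
# Solve min_x 0.5 ||y - Phi x||_2^2 + lam ||x||_1 by ISTA.
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||.||_1 (soft thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def ista(Phi, y, lam, n_iter=2000):
    L = np.linalg.norm(Phi, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ x - y)         # gradient of the data fidelity
        x = soft_threshold(x - grad / L, lam / L)
    return x

# x_star = ista(Phi, y, lam=0.01)   # Phi, y from the forward-model sketch
```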

Slide 11

A Zoo...?
Block Sparsity, Nuclear Norm, Trace Lasso, Polyhedral, Antisparsity, Total Variation

Slide 12

Relations to Previous Works
• Fuchs, J. J. (2004). On sparse representations in arbitrary redundant bases.
• Tropp, J. A. (2006). Just relax: Convex programming methods for identifying sparse signals in noise.
• Grasmair, M. et al. (2008). Sparse regularization with ℓq penalty term.
• Bach, F. R. (2008). Consistency of the group lasso and multiple kernel learning & Consistency of trace norm minimization.
• V. et al. (2011). Robust sparse analysis regularization.
• Grasmair, M. et al. (2011). Necessary and sufficient conditions for linear convergence of ℓ1-regularization.
• Grasmair, M. (2011). Linear convergence rates for Tikhonov regularization with positively homogeneous functionals.
(and more!)

Slide 13

First Part: Gauges and Model Space

Slide 14

Back to the Source: Union of Linear Models
[Figure: a signal with 2 components, x ∈ R^2]

Slide 15

Back to the Source: Union of Linear Models
[Figure: a signal with 2 components; the subspace T_0]
0 is the only 0-sparse vector.

Slide 16

Back to the Source: Union of Linear Models
[Figure: a signal with 2 components; the axes T_{e_1} and T_{e_2}]
Axis points are 1-sparse (except 0).

Slide 17

Back to the Source: Union of Linear Models
[Figure: a signal with 2 components; the whole plane]
The points of the whole space minus the axes are 2-sparse.

Slide 18

ℓ0 to ℓ1

Combinatorial penalty associated to the previous union of models:
J(x) = ||x||_0 = |{i : x_i ≠ 0}|

Slide 19

ℓ0 to ℓ1

Combinatorial penalty associated to the previous union of models:
J(x) = ||x||_0 = |{i : x_i ≠ 0}|
→ non-convex
→ no regularity
→ NP-hard regularization

Encode the union of models in a good functional.

Slide 20

ℓ0 to ℓ1

Combinatorial penalty associated to the previous union of models:
J(x) = ||x||_0 = |{i : x_i ≠ 0}|
→ non-convex
→ no regularity
→ NP-hard regularization

Encode the union of models in a good functional.
[Figure: graphs of x ↦ ||x||_0 and x ↦ ||x||_1]
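
To see the relaxation at work: in the scalar denoising case, each penalized problem has a closed-form solution, hard thresholding for ℓ0 and soft thresholding for ℓ1. A small sketch (λ is illustrative):

```python
# min_x 0.5 (y - x)^2 + lam * ||x||_0  ->  hard thresholding at sqrt(2 lam)
# min_x 0.5 (y - x)^2 + lam * |x|      ->  soft thresholding at lam
import numpy as np

def prox_l0(y, lam):
    return np.where(np.abs(y) > np.sqrt(2 * lam), y, 0.0)

def prox_l1(y, lam):
    return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

y = np.linspace(-3, 3, 7)
print(prox_l0(y, 0.5))   # keeps large entries unchanged, kills small ones
print(prox_l1(y, 0.5))   # shrinks every entry toward 0
```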

Slide 21

Union of Linear Models to Regularizations
Union of models (combinatorial world) ↔ Gauges (functional world)

Slide 22

Gauge
• J(x) ≥ 0
• J(λx) = λJ(x) for λ ≥ 0
• J convex
C = {x : J(x) ≤ 1} is a convex set.

Slide 23

Gauge
• J(x) ≥ 0
• J(λx) = λJ(x) for λ ≥ 0
• J convex
C = {x : J(x) ≤ 1} is a convex set.
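
The definition suggests a generic computation: the gauge of C is J_C(x) = inf {λ ≥ 0 : x ∈ λC}, which a bisection on λ can evaluate from a membership oracle for C. A sketch (the ℓ1 ball is used as a check, since its gauge is the ℓ1 norm):

```python
import numpy as np

def gauge(x, in_C, lam_max=1e6):
    """Gauge of C at x by bisection, given a membership oracle in_C."""
    lo, hi = 0.0, lam_max
    for _ in range(100):                # bisection on lam
        mid = 0.5 * (lo + hi)
        if in_C(x / mid):
            hi = mid                    # x lies in mid*C, so the gauge is <= mid
        else:
            lo = mid
    return hi

in_l1_ball = lambda z: np.abs(z).sum() <= 1.0
x = np.array([1.0, -2.0, 0.5])
print(gauge(x, in_l1_ball), np.abs(x).sum())    # both approximately 3.5
```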

Slide 24

Subdifferential
[Figure: graph of f(t) = t² with the point (t, t²) marked]

Slide 25

Subdifferential
[Figure: graph of f(t) = t² with the point (t, t²) marked]

Slide 26

Subdifferential
[Figure: graph of f(t)]

Slide 27

Subdifferential
[Figure: graph of f(t)]

∂f(t) = {η : ∀t', f(t') ≥ f(t) + ⟨η, t' − t⟩}
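
A small numerical check of this inequality for f = |·| at t = 0, where the subdifferential is the whole interval [−1, 1]:

```python
# Verify f(t') >= f(t) + eta * (t' - t) for f = |.|, t = 0, eta in [-1, 1].
import numpy as np

f = np.abs
t, etas = 0.0, np.linspace(-1.0, 1.0, 5)      # candidate subgradients at 0
tp = np.linspace(-2.0, 2.0, 401)              # test points t'
for eta in etas:
    assert np.all(f(tp) >= f(t) + eta * (tp - t) - 1e-12)
print("every eta in [-1, 1] is a subgradient of |.| at 0")
```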

Slide 28

The Model Linear Space
[Figure: the point x and the origin 0]

Slide 29

The Model Linear Space
[Figure: the point x and the subdifferential ∂J(x)]

Slide 30

The Model Linear Space
[Figure: the point x and the subdifferential ∂J(x)]

Slide 31

The Model Linear Space
[Figure: x, ∂J(x) and the model space T_x]

T_x = VectHull(∂J(x))^⊥

Slide 32

The Model Linear Space
[Figure: x, ∂J(x), T_x and e_x]

T_x = VectHull(∂J(x))^⊥
e_x = P_{T_x}(∂J(x))

Slide 33

Special cases
Sparsity: T_x = {η : supp(η) ⊆ supp(x)}, e_x = sign(x)

Slide 34

Special cases
Sparsity: T_x = {η : supp(η) ⊆ supp(x)}, e_x = sign(x)
(Aniso/Iso)tropic Total Variation: T_x = {η : supp(∇η) ⊆ supp(∇x)}, e_x = sign(∇x) or e_x = ∇x / ||∇x||

Slide 35

Special cases
Sparsity: T_x = {η : supp(η) ⊆ supp(x)}, e_x = sign(x)
(Aniso/Iso)tropic Total Variation: T_x = {η : supp(∇η) ⊆ supp(∇x)}, e_x = sign(∇x) or e_x = ∇x / ||∇x||
Trace Norm: with the SVD x = UΛV^*, T_x = {η : U_⊥^* η V_⊥ = 0}, e_x = UV^*
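
A sketch of these objects in two of the cases above; the vectors and the matrix are illustrative:

```python
import numpy as np

# Sparsity: T_x is the set of vectors supported on supp(x), e_x = sign(x).
x = np.array([0.0, 3.0, 0.0, -1.5])
support = x != 0
e_x = np.sign(x)                          # lives in T_x
P_T = np.diag(support.astype(float))      # orthogonal projector onto T_x

# Trace norm: from a reduced SVD x = U Lambda V^*, e_x = U V^*.
X = np.outer([1.0, 2.0], [3.0, 4.0])      # a rank-1 matrix
U, s, Vt = np.linalg.svd(X)
r = np.sum(s > 1e-10)                     # numerical rank
E_X = U[:, :r] @ Vt[:r, :]                # e_X = U V^*
```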

Slide 36

Algebraic Stability
Composition with a linear operator:
• ||∇ · ||_1 — Anisotropic TV
• ||∇ · ||_{1,2} — Isotropic TV
• ||U diag(·)||_* — Trace Lasso

Slide 37

Algebraic Stability
Composition with a linear operator:
• ||∇ · ||_1 — Anisotropic TV
• ||∇ · ||_{1,2} — Isotropic TV
• ||U diag(·)||_* — Trace Lasso

Sum of gauges (composite priors):
• || · ||_1 + ||∇ · ||_1 — sparse TV
• || · ||_1 + || · ||_2 — Elastic net
• || · ||_1 + || · ||_* — Sparse + Low-rank

Slide 38

Second Part: ℓ2 Robustness and Model Selection

Slide 39

Certificate

x⋆ ∈ argmin_{Φx = Φx_0} J(x)    (P_0(y))

[Figure: the affine space Φx = Φx_0 and the point x]

Slide 40

Certificate

x⋆ ∈ argmin_{Φx = Φx_0} J(x)    (P_0(y))

[Figure: the affine space Φx = Φx_0, the subdifferential ∂J(x) and a certificate η]

Dual certificates: D = Im Φ^* ∩ ∂J(x_0)

Slide 41

Certificate

x⋆ ∈ argmin_{Φx = Φx_0} J(x)    (P_0(y))

[Figure: the affine space Φx = Φx_0, the subdifferential ∂J(x) and a certificate η]

Dual certificates: D = Im Φ^* ∩ ∂J(x_0)

Proposition: ∃ η ∈ D ⇔ x_0 is a solution of (P_0(y))
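
The noiseless program (P_0(y)) with J = ||·||_1 is basis pursuit, which can be checked numerically as a linear program over x = u − v with u, v ≥ 0. A sketch (sizes and seed are illustrative; scipy is assumed available):

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(Phi, b):
    """min ||x||_1 s.t. Phi x = b, as an LP over x = u - v with u, v >= 0."""
    Q, N = Phi.shape
    c = np.ones(2 * N)                     # objective: sum(u) + sum(v)
    A_eq = np.hstack([Phi, -Phi])          # constraint: Phi (u - v) = b
    res = linprog(c, A_eq=A_eq, b_eq=b, bounds=(0, None))
    return res.x[:N] - res.x[N:]

rng = np.random.default_rng(0)
Q, N = 48, 128
Phi = rng.standard_normal((Q, N)) / np.sqrt(Q)
x0 = np.zeros(N); x0[rng.choice(N, 5, replace=False)] = rng.standard_normal(5)
x_bp = basis_pursuit(Phi, Phi @ x0)
print(np.allclose(x_bp, x0, atol=1e-6))    # expected True in this regime
```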

Slide 42

Tight Certificate and Restricted Injectivity

Tight dual certificates: D̄ = Im Φ^* ∩ ri ∂J(x_0)

Slide 43

Tight Certificate and Restricted Injectivity

Tight dual certificates: D̄ = Im Φ^* ∩ ri ∂J(x_0)

Restricted Injectivity: Ker Φ ∩ T_{x_0} = {0}    (RIC_{x_0})

Slide 44

Tight Certificate and Restricted Injectivity

Tight dual certificates: D̄ = Im Φ^* ∩ ri ∂J(x_0)

Restricted Injectivity: Ker Φ ∩ T_{x_0} = {0}    (RIC_{x_0})

Proposition: ∃ η ∈ D̄ ∧ (RIC_{x_0}) ⇒ x_0 is the unique solution of (P_0(y))
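
For the sparsity case, T_{x_0} is the set of vectors supported on supp(x_0), so (RIC_{x_0}) amounts to the independence of the corresponding columns of Φ. A sketch (the helper name ric_holds is mine):

```python
import numpy as np

def ric_holds(Phi, x, tol=1e-10):
    """Check Ker(Phi) inter T_x = {0} for the l1 model space of x."""
    I = np.flatnonzero(x)                 # support of x
    Phi_I = Phi[:, I]                     # Phi restricted to T_x
    return np.linalg.matrix_rank(Phi_I, tol=tol) == len(I)

rng = np.random.default_rng(0)
Phi = rng.standard_normal((48, 128)) / np.sqrt(48)
x = np.zeros(128); x[[3, 17, 42]] = [1.0, -2.0, 0.5]
print(ric_holds(Phi, x))   # True: 3 generic columns of Phi are independent
```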

Slide 45

ℓ2 stability

Theorem: If ∃ η ∈ D̄ ∧ (RIC_{x_0}), then λ ∼ ||w|| ⇒ ||x⋆ − x_0|| = O(||w||)

Slide 46

Minimal-norm Certificate

η ∈ D ⇔ η = Φ^* q, η_T = e and J°(η) ≤ 1

Slide 47

Minimal-norm Certificate

η ∈ D ⇔ η = Φ^* q, η_T = e and J°(η) ≤ 1

Minimal-norm precertificate:
η_0 = argmin_{η = Φ^* q, η_T = e} ||q||

Slide 48

Minimal-norm Certificate

η ∈ D ⇔ η = Φ^* q, η_T = e and J°(η) ≤ 1

Minimal-norm precertificate:
η_0 = argmin_{η = Φ^* q, η_T = e} ||q||

Proposition: If (RIC_{x_0}) holds, then η_0 = (Φ_T^+ Φ)^* e
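
A sketch of this formula in the ℓ1 case, where T is the support of x_0 and e = sign(x_0); the helper name is mine and the instance is illustrative:

```python
import numpy as np

def minimal_norm_precertificate(Phi, x0):
    """eta_0 = (Phi_T^+ Phi)^* e for J = ||.||_1, T = supp(x0), e = sign(x0)."""
    I = np.flatnonzero(x0)
    Phi_I = Phi[:, I]                        # Phi restricted to T
    e_I = np.sign(x0[I])
    q = np.linalg.pinv(Phi_I).T @ e_I        # minimal-norm q with (Phi^* q)_T = e
    return Phi.T @ q                         # eta_0 = Phi^* q

rng = np.random.default_rng(0)
Q, N = 48, 128
Phi = rng.standard_normal((Q, N)) / np.sqrt(Q)
x0 = np.zeros(N); x0[[10, 50, 90]] = [2.0, -1.0, 3.0]
eta0 = minimal_norm_precertificate(Phi, x0)
off = np.setdiff1d(np.arange(N), np.flatnonzero(x0))
print(np.abs(eta0[off]).max())   # typically < 1 here: eta0 is a certificate
```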

Slide 49

Model Selection

Theorem: If η_0 ∈ D̄, the noise-to-signal ratio is low enough and λ ∼ ||w||, then the unique solution x⋆ of (Pλ(y)) satisfies
T_{x⋆} = T_{x_0} and ||x⋆ − x_0|| = O(||w||)
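
An end-to-end illustration of the theorem in the ℓ1 case: with small noise and λ proportional to ||w||, the computed solution recovers the support of x_0. All constants are illustrative, and ISTA stands in for an exact solver:

```python
import numpy as np

rng = np.random.default_rng(1)
Q, N = 64, 128
Phi = rng.standard_normal((Q, N)) / np.sqrt(Q)
x0 = np.zeros(N); x0[rng.choice(N, 4, replace=False)] = [2.0, -2.0, 3.0, 1.5]
w = 0.02 * rng.standard_normal(Q)
y = Phi @ x0 + w

lam = 2.0 * np.linalg.norm(w)            # lambda ~ ||w||
L = np.linalg.norm(Phi, 2) ** 2
x = np.zeros(N)
for _ in range(5000):                    # ISTA iterations
    z = x - Phi.T @ (Phi @ x - y) / L
    x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)

# Expected True: the support (the model T_{x0}) is selected exactly.
print(np.array_equal(np.flatnonzero(np.abs(x) > 1e-8), np.flatnonzero(x0)))
```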

Slide 50

A Better Certificate?
• With model selection: no

Slide 51

A Better Certificate?
• With model selection: no
• Without: ongoing works (Duval-Peyré: Sparse Deconvolution; Dossal: Sparse Tomography)

Slide 52

Third Part: Some Examples

Slide 53

Sparse Spike Deconvolution (Dossal, 2005)
[Figure: the spike train x_0]

Slide 54

Sparse Spike Deconvolution (Dossal, 2005)

Φx = Σ_i x_i φ(· − Δi)    J(x) = ||x||_1

[Figure: the spike train x_0, with spike separation γ, and the observation Φx_0]

Slide 55

Sparse Spike Deconvolution (Dossal, 2005)

Φx = Σ_i x_i φ(· − Δi)    J(x) = ||x||_1

[Figure: the spike train x_0, with spike separation γ, and the observation Φx_0]

η_0 ∈ D̄ ⇔ ||Φ_{I^c}^* Φ_I^{+,*} s||_∞ < 1

[Plot: ||η_{0,I^c}||_∞ as a function of γ; it crosses the level 1 at γ_crit]
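
A sketch of this criterion for a Gaussian blur and two same-sign spikes at distance Δ (kernel width, grid size and the helper name are illustrative assumptions): the sup-norm of η_0 off the support drops below 1 once the spikes are far enough apart, which is the γ_crit phenomenon of the plot.

```python
import numpy as np

n = 256
t = np.arange(n)
phi = lambda c: np.exp(-0.5 * ((t - c) / 4.0) ** 2)   # Gaussian atom, width 4

def sup_norm_off_support(delta):
    I = np.array([n // 2, n // 2 + delta])             # two spikes, distance delta
    Phi = np.stack([phi(c) for c in range(n)], axis=1) # dictionary of shifted atoms
    Phi_I = Phi[:, I]
    q = np.linalg.pinv(Phi_I).T @ np.array([1.0, 1.0]) # s = sign(x_{0,I})
    eta0 = Phi.T @ q                                   # minimal-norm precertificate
    off = np.setdiff1d(np.arange(n), I)
    return np.abs(eta0[off]).max()

for delta in [4, 8, 16, 32]:
    print(delta, sup_norm_off_support(delta))   # < 1 once the spikes separate
```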

Slide 56

TV Denoising (V. et al., 2011)

Φ = Id    J(x) = ||∇x||_1

[Figure: two piecewise constant signals, x_i against i]

Slide 57

TV Denoising (V. et al., 2011)

Φ = Id    J(x) = ||∇x||_1

[Figure: two piecewise constant signals, x_i against i; below each, the certificate values m_k, between −1 and +1]

Slide 58

TV Denoising (V. et al., 2011)

Φ = Id    J(x) = ||∇x||_1

[Figure: two piecewise constant signals, x_i against i; below each, the certificate values m_k, between −1 and +1]
First signal: support stability. Second signal: no support stability.

Slide 59

TV Denoising (V. et al., 2011)

Φ = Id    J(x) = ||∇x||_1

[Figure: two piecewise constant signals, x_i against i; below each, the certificate values m_k, between −1 and +1]
First signal: support stability. Second signal: no support stability.
Both are ℓ2-stable.
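
A sketch of 1D TV denoising, solved on the dual by projected gradient: the solution is x = y − D^T z with ||z||_∞ ≤ λ, where D is the finite-difference operator. The algorithm choice and the test signal are mine, not from the talk:

```python
# min_x 0.5 ||y - x||^2 + lam ||D x||_1, with (D x)_k = x_{k+1} - x_k.
import numpy as np

def tv_denoise_1d(y, lam, n_iter=3000):
    z = np.zeros(len(y) - 1)                             # dual variable
    tau = 0.25                                           # step size; ||D D^T|| <= 4
    for _ in range(n_iter):
        x = y + np.diff(z, prepend=0, append=0)          # x = y - D^T z
        z = np.clip(z + tau * np.diff(x), -lam, lam)     # gradient step + projection
    return y + np.diff(z, prepend=0, append=0)

rng = np.random.default_rng(0)
y = np.concatenate([np.zeros(50), np.ones(50)]) + 0.1 * rng.standard_normal(100)
x = tv_denoise_1d(y, lam=1.0)   # piecewise constant estimate of the noisy step
```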

Slide 60

Thanks for your attention!