
Covariant LEAst-square Re-fitting for image restoration

npapadakis
March 30, 2017


In this talk, a framework to remove parts of the systematic errors affecting popular restoration algorithms is presented, with a special focus on image processing tasks. Generalizing ideas that emerged for ℓ1 regularization, an approach re-fitting the results of standard methods towards the input data is developed. Total variation regularization and non-local means are special cases of interest. Important covariant information that should be preserved by the re-fitting method is identified, and the importance of preserving the Jacobian (w.r.t. the observed signal) of the original estimator is emphasized. Then, a numerical approach is proposed. It has a twicing flavor and allows re-fitting the restored signal by adding back a local affine transformation of the residual term. The benefits of the method are illustrated on numerical simulations for image restoration tasks.

Transcript

  1. Covariant LEAst-square Re-fitting for image restoration
    N. Papadakis (1), C.-A. Deledalle (2), J. Salmon (3) & S. Vaiter (4)
    (1) CNRS, Institut de Mathématiques de Bordeaux; (2) University of California, San Diego; (3) Université de Montpellier, IMAG; (4) CNRS, Institut de Mathématiques de Bourgogne
  2. Outline (part 1: Introduction to Re-fitting)
    Introduction to Re-fitting · Invariant LEAst-square Re-fitting · Covariant LEAst-square Re-fitting · Practical considerations and experiments · Conclusions
  3. 1. Introduction to Re-fitting: What is Re-fitting?
    [1D plot, Position vs. Value: original x0]
    True piecewise constant signal x0
  4. 1. Introduction to Re-fitting: What is Re-fitting?
    [1D plot, Position vs. Value: original x0, observation y = x0 + w]
    Observations y = x0 + w
  5. 1. Introduction to Re-fitting: What is Re-fitting?
    [1D plot, Position vs. Value: original x0, observation y = x0 + w, TV solution x̂(y)]
    Total Variation (TV) restoration: x̂(y) = argmin_x (1/2)||y − x||² + λ||∇x||_1
  6. 1. Introduction to Re-fitting: What is Re-fitting?
    [1D plot, Position vs. Value: original x0, observation y = x0 + w, TV solution x̂(y)]
    Total Variation (TV) restoration: x̂(y) = argmin_x (1/2)||y − x||² + λ||∇x||_1
    Problem: x̂(y) is biased [Strong and Chan 1996], i.e., E[x̂(y)] ≠ x0.
  7. 1. Introduction to Re-fitting: What is Re-fitting?
    [1D plot, Position vs. Value: original x0, observation y = x0 + w, TV solution x̂(y), re-fitting R_x̂(y)]
    Problem: x̂(y) is biased [Strong and Chan 1996], i.e., E[x̂(y)] ≠ x0.
    A solution: compute R_x̂(y) as the mean of y on each piece of x̂(y).
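    A minimal NumPy sketch of this piece-wise re-fitting idea for 1D TV denoising. The TV solution x̂(y) is assumed to be given by some solver, and the tolerance used to detect jumps is an illustrative choice, not a value from the talk:

      import numpy as np

      def refit_piecewise_mean(y, xhat, tol=1e-6):
          """Replace each constant piece of the TV solution xhat by the mean of y on that piece."""
          refit = np.empty_like(y, dtype=float)
          start = 0
          for i in range(1, len(xhat) + 1):
              # a jump of xhat (or the end of the signal) closes the current piece
              if i == len(xhat) or abs(xhat[i] - xhat[i - 1]) > tol:
                  refit[start:i] = y[start:i].mean()
                  start = i
          return refit

    The jump locations of x̂(y) are kept, while the amplitude loss induced by the λ penalty is removed.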
  8. 1. Introduction to Re-fitting: Which bias reduction with Re-fitting?
    [Image panels: noise-free image, noisy data, anisotropic TV, error]
  9. 1. Introduction to Re-fitting: Which bias reduction with Re-fitting?
    [Image panels: noise-free image, noisy data, re-fitting, error]
  10. 1. Introduction to Re-fitting: Re-fitting arbitrary models
    Can we generalize this approach to estimators other than TV?
    General setting: x̂(y) = argmin_x F(x, y) + λG(x)
    Data fidelity w.r.t. observations y: F convex
    Prior model on the solution: G convex
    Regularization parameter: λ > 0
    For inverse problems, F(x, y) = F(Φx − y), where Φ is a linear operator: convolution, mask...
  11. 1. Introduction to Re-fitting: Related works
    Twicing [Tukey, 1977], Bregman iterations [Osher et al. 2005, 2015, Burger et al. 2006], Boosting [Elad et al. 2007, 2015, Milanfar 2013]...
    [1D plots: observation y = x0 + w with TV solution x̂(y); residual y − Φx̂(y)]
    Apply the model to the residual y − Φx̂(y)
  12. 1. Introduction to Re-fitting: Related works
    Twicing [Tukey, 1977], Bregman iterations [Osher et al. 2005, 2015, Burger et al. 2006], Boosting [Elad et al. 2007, 2015, Milanfar 2013]...
    [1D plots: observation y = x0 + w with TV solution x̂(y); residual y − Φx̂(y)]
    Apply the model to the residual y − Φx̂(y)
    Iterative process: x^0 = argmin_x F(Φx − y) + λG(x); x^k = x^{k−1} + argmin_x F(Φx − (y − Φx^{k−1})) + λ_k G(x)
  13. 1. Introduction to Re-fitting: Related works
    Twicing [Tukey, 1977], Bregman iterations [Osher et al. 2005, 2015, Burger et al. 2006], Boosting [Elad et al. 2007, 2015, Milanfar 2013]...
    Iterative process: x^0 = argmin_x F(Φx − y) + λG(x); x^k = x^{k−1} + argmin_x F(Φx − (y − Φx^{k−1})) + λ_k G(x)
    Limitations: support not preserved; how many iterations?; varying parameters λ_k
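    For concreteness, a rough sketch of this residual-feeding iteration in the denoising case (Φ = Id), assuming a generic black-box solver solve(data, lam) for the variational problem; the function name and the schedule of λ_k are placeholders, not values specified in the talk:

      def boosted_estimate(y, solve, lams):
          """Twicing/Bregman-style iterations: apply the model to the residual and add it back.
          solve(data, lam) is assumed to return argmin_x F(x - data) + lam * G(x); Phi = Id here,
          and y (and the solver outputs) are assumed to be NumPy arrays."""
          x = solve(y, lams[0])
          for lam_k in lams[1:]:
              x = x + solve(y - x, lam_k)   # feed the current residual back through the model
          return x

    The limitations listed above (support not preserved, number of iterations, choice of the λ_k) are precisely what the proposed re-fitting aims to avoid.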
  14. 1. Introduction to Re-fitting: Objectives
    Automatic re-fitting process for x̂(y) = argmin_x F(x, y) + λG(x):
    keep structures and regularity present in the biased solution; correct the model bias; no additional parameter.
  15. 1. Introduction to Re-fitting: Objectives
    Automatic re-fitting process for x̂(y) = argmin_x F(x, y) + λG(x):
    keep structures and regularity present in the biased solution; correct the model bias; no additional parameter.
    Handle existing "black box" algorithms x̂(y): Non-Local Means [Buades et al. 2005], BM3D [Dabov et al. 2007], DDID [Knaus & Zwicker 2013], trained deep networks.
  16. 1. Introduction to Re-fitting: Objectives
    Automatic re-fitting process for x̂(y) = argmin_x F(x, y) + λG(x):
    keep structures and regularity present in the biased solution; correct the model bias; no additional parameter.
    Handle existing "black box" algorithms x̂(y): Non-Local Means [Buades et al. 2005], BM3D [Dabov et al. 2007], DDID [Knaus & Zwicker 2013], trained deep networks.
    Bias-variance trade-off: bias reduction is not always favorable in terms of MSE; re-fitting re-injects part of the variance.
  17. 1. Introduction to Re-fitting: Objectives
    Automatic re-fitting process for x̂(y) = argmin_x F(x, y) + λG(x):
    keep structures and regularity present in the biased solution; correct the model bias; no additional parameter.
    Handle existing "black box" algorithms x̂(y): Non-Local Means [Buades et al. 2005], BM3D [Dabov et al. 2007], DDID [Knaus & Zwicker 2013], trained deep networks.
    Bias-variance trade-off: bias reduction is not always favorable in terms of MSE; re-fitting re-injects part of the variance; the MSE is not expected to be reduced by re-fitting.
  18. Outline (part 2: Invariant LEAst-square Re-fitting)
    Introduction to Re-fitting · Invariant LEAst-square Re-fitting · Covariant LEAst-square Re-fitting · Practical considerations and experiments · Conclusions
  19. 2. Invariant LEAst-square Re-fitting: Formalizing Re-fitting for TV
    Linear inverse problem: y = Φx0 + w, with y ∈ R^p the observation, x0 ∈ R^n the signal of interest, Φ ∈ R^{p×n} a linear operator, and E[w] = 0_p (white noise).
    TV regularization: x̂(y) = argmin_x (1/2)||Φx − y||² + λ||∇x||_1
    Re-fitting TV (constrained least-square [Efron et al. 2004, Lederer 2013]): x̃(y) ∈ argmin_{x ∈ M_x̂(y)} ||Φx − y||²
    with M_x̂(y) the model subspace: M_x̂(y) = {x ∈ R^n : ∀i, (∇x̂(y))_i = 0 ⇒ (∇x)_i = 0}
    Set of signals with the same jumps (co-support).
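    As a small 1D illustration of this constrained least-square step (NumPy; the helper below is hypothetical, not from the talk): the jump pattern of x̂(y) defines segments, every x in M_x̂(y) is constant on those segments, and the re-fit reduces to an ordinary least-squares problem in the segment values.

      import numpy as np

      def refit_constrained_ls(y, xhat, Phi, tol=1e-6):
          """Least-square re-fit over the model subspace of 1D TV:
          signals whose jumps (co-support) are included in those of xhat."""
          jumps = np.abs(np.diff(xhat)) > tol              # jump pattern of the TV solution
          labels = np.concatenate(([0], np.cumsum(jumps)))
          n_seg = labels[-1] + 1
          # S maps a vector of segment values to a signal constant on each segment
          S = (labels[:, None] == np.arange(n_seg)[None, :]).astype(float)
          c, *_ = np.linalg.lstsq(Phi @ S, y, rcond=None)  # min_c ||Phi S c - y||^2
          return S @ c

    With Φ = Id this reduces to the per-piece means of slide 7.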
  20. 2. Invariant LEAst-square Re-fitting: Generalizing the TV Re-fitting procedure
    Reformulating without the notion of jumps: understanding what is captured by M_x̂(y).
    Idea: M_x̂(y) captures linear invariances of x̂(y) w.r.t. small perturbations of y.
  21. 2. Invariant LEAst-square Re-fitting: Invariant Re-fitting [Deledalle, P. & Salmon 2015]
    Piece-wise affine mapping y → x̂(y), with x̂(y) = argmin_x F(Φx, y) + λG(x)
    [1D illustration of the mapping]
  24. 2. Invariant LEAst-square Re-fitting: Invariant Re-fitting [Deledalle, P. & Salmon 2015]
    Piece-wise affine mapping y → x̂(y), with x̂(y) = argmin_x F(Φx, y) + λG(x)
    Jacobian of the estimator: (J_x̂(y))_{ij} = ∂x̂(y)_i / ∂y_j
  25. 2. Invariant LEAst-square Re-fitting: Invariant Re-fitting [Deledalle, P. & Salmon 2015]
    Piece-wise affine mapping y → x̂(y), with x̂(y) = argmin_x F(Φx, y) + λG(x)
    Jacobian of the estimator: (J_x̂(y))_{ij} = ∂x̂(y)_i / ∂y_j
    Model subspace: M_x̂(y) = x̂(y) + Im[J_x̂(y)] (tangent space of the mapping)
  26. 2. Invariant LEAst-square Re-fitting: Invariant Re-fitting [Deledalle, P. & Salmon 2015]
    Piece-wise affine mapping y → x̂(y), with x̂(y) = argmin_x F(Φx, y) + λG(x)
    Jacobian of the estimator: (J_x̂(y))_{ij} = ∂x̂(y)_i / ∂y_j
    Model subspace: M_x̂(y) = x̂(y) + Im[J_x̂(y)] (tangent space of the mapping)
    Invariant LEAst-square Re-fitting: R^inv_x̂(y) = argmin_{x ∈ M_x̂(y)} ||Φx − y||²
  27. 2. Invariant LEAst-square Re-fitting: Practical Re-fitting for ℓ1 analysis minimization
    x̂(y) = argmin_x (1/2)||Φx − y||² + λ||Γx||_1   (1)
    Remark: Γ = Id gives the LASSO; Γ = [∇_x, ∇_y] gives the anisotropic TV.
    Numerical stability issue: the solution is piecewise constant [Strong & Chan 2003, Caselles et al. 2009].
    Re-fitting x̂(y) requires the support Î = {i ∈ [1, m] s.t. |Γx̂|_i > 0}.
    But in practice x̂ is only approximated through a converging sequence x̂^k, and unfortunately x̂^k ≈ x̂ does not imply Î^k ≈ Î.
    Illustration for anisotropic TV denoising (Φ = Id): [image panels: blurry obs. y, biased x̂, estimate x̂^k, re-fitting of x̂, re-fitting of x̂^k]
  28. 2. Invariant LEAst-square Re-fitting: Practical Re-fitting for ℓ1 analysis minimization
    x̂(y) = argmin_x (1/2)||Φx − y||² + λ||Γx||_1   (1)
    Remark: Γ = Id gives the LASSO; Γ = [∇_x, ∇_y] gives the anisotropic TV.
    Proposed approach: provided Ker Φ ∩ Ker Γ = {0}, one has for (1) [Vaiter et al. 2016]: R^inv_x̂(y) = J_y[y], with J_y = ∂x̂(y)/∂y evaluated at y.
    Interpretation: R^inv_x̂(y) is the derivative of x̂(y) in the direction of y.
    Algorithm: compute x̃^k by the chain rule as the derivative of x̂^k(y) in the direction of y.
    Question: does x̃^k converge towards R^inv_x̂(y)?
  29. 2. Invariant LEAst-square Re-fitting: Practical Re-fitting for ℓ1 analysis minimization
    x̂(y) = argmin_x (1/2)||Φx − y||² + λ||Γx||_1   (1)
    Proposed approach: provided Ker Φ ∩ Ker Γ = {0}, one has for (1) [Vaiter et al. 2016]: R^inv_x̂(y) = J_y[y], with J_y = ∂x̂(y)/∂y evaluated at y.
    Algorithm: compute x̃^k by the chain rule as the derivative of x̂^k(y) in the direction of y.
    Question: does x̃^k converge towards R^inv_x̂(y)? Yes, at least for the ADMM or Chambolle-Pock sequences.
  30. 2. Invariant LEAst-square Re-fitting: Implementation for anisotropic TV
    x̂(y) = argmin_x (1/2)||y − x||² + λ||Γx||_1
    Primal-dual implementation [Chambolle and Pock 2011]:
    z^{k+1} = Proj_{Bλ}(z^k + σΓx^k)
    x^{k+1} = (x^k + τ(y − Γ^T z^{k+1})) / (1 + τ)
    Projection: Proj_{Bλ}(z)_i = z_i if |z_i| ≤ λ, and λ sign(z_i) otherwise.
  31. 2. Invariant LEAst-square Re-fitting: Implementation for anisotropic TV
    x̂(y) = argmin_x (1/2)||y − x||² + λ||Γx||_1
    Primal-dual implementation [Chambolle and Pock 2011], with its differentiated iterates:
    z^{k+1} = Proj_{Bλ}(z^k + σΓx^k),  z̃^{k+1} = P_{z^k+σΓx^k}(z̃^k + σΓx̃^k)
    x^{k+1} = (x^k + τ(y − Γ^T z^{k+1})) / (1 + τ),  x̃^{k+1} = (x̃^k + τ(y − Γ^T z̃^{k+1})) / (1 + τ)
    Projection: Proj_{Bλ}(z)_i = z_i if |z_i| ≤ λ, and λ sign(z_i) otherwise.
    Derivative of the projection: (P_z)_i = Id if |z_i| ≤ λ + β, and 0 otherwise.
  32. 2. Invariant LEAst-square Re-fitting: Implementation for anisotropic TV
    (same coupled iterations as slide 31)
    Theorem: the sequence x̃^k converges to the re-fitting R^inv_x̂(y) of x̂(y), for all β > 0 such that β < σ|Γx̂(y)|_i for all i ∈ Î.
  33. 2. Invariant LEAst-square Re-fitting: Implementation for anisotropic TV
    (same coupled iterations as slide 31)
    Theorem: the sequence x̃^k converges to the re-fitting R^inv_x̂(y) of x̂(y), for all β > 0 such that β < σ|Γx̂(y)|_i for all i ∈ Î.
    Complexity: twice that of the Chambolle-Pock algorithm.
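    A compact NumPy sketch of these coupled iterations for 1D anisotropic TV denoising, with Γ = ∇ implemented as forward differences. The step sizes, the safeguard β and the iteration count are illustrative choices, not values prescribed in the talk:

      import numpy as np

      def grad(x):                 # forward differences: (Gamma x)_i = x[i+1] - x[i]
          return x[1:] - x[:-1]

      def grad_T(z):               # adjoint Gamma^T
          out = np.zeros(len(z) + 1)
          out[:-1] -= z
          out[1:] += z
          return out

      def tv_clear_1d(y, lam, beta=1e-3, sigma=0.25, tau=0.25, n_iter=2000):
          """Jointly compute the anisotropic TV solution and its re-fitting J y
          by differentiating each primal-dual iteration, mirroring the scheme above."""
          n = len(y)
          x, xt = np.zeros(n), np.zeros(n)          # xt tracks the differentiated iterate
          z, zt = np.zeros(n - 1), np.zeros(n - 1)
          for _ in range(n_iter):
              w = z + sigma * grad(x)
              z_new = np.clip(w, -lam, lam)                      # Proj onto {|z_i| <= lam}
              zt_new = np.where(np.abs(w) <= lam + beta,         # derivative of the projection
                                zt + sigma * grad(xt), 0.0)
              x = (x + tau * (y - grad_T(z_new))) / (1 + tau)
              xt = (xt + tau * (y - grad_T(zt_new))) / (1 + tau)
              z, zt = z_new, zt_new
          return x, xt    # x -> x_hat(y), xt -> R_inv(y) = J y

    As stated on slide 33, the cost is roughly twice that of the plain algorithm, since each step is duplicated for the differentiated variables.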
  34. 2. Invariant LEAst-square Re-fitting: Anisotropic TV, illustration
    [Image panels: y, x̂(y), R^inv_x̂(y)]
  35. 2. Invariant LEAst-square Re-fitting: Anisotropic TV, bias-variance trade-off
    [Image panels with PSNR/SSIM: 22.45/0.416, 24.80/0.545, 27.00/0.807]
    [Plot: quadratic cost vs. regularization parameter λ; curves: original x̂(y), re-fitted R_x̂(y); markers: optimum and sub-optimum, original and re-fitted]
    Anisotropic TV: PSNR 28.89, SSIM 0.809 · CLEAR: PSNR 28.28, SSIM 0.823
  36. 2. Invariant LEAst-square Re-fitting: Anisotropic TV, bias-variance trade-off
    [Image panels with PSNR/SSIM: 22.84/0.312, 35.99/0.938, 23.96/0.694]
    Anisotropic TV: PSNR 38.22, SSIM 0.935 · CLEAR: PSNR 46.42, SSIM 0.986
  38. 2. Invariant LEAst-square Re-fitting: Example, anisotropic TGV
    [Image panels: x0, y, x̂(y), R^inv_x̂(y)]
  39. 2. Invariant LEAst-square Re-fitting: Example, isotropic TV
    Restoration model for an image y: x̂(y) = argmin_x (1/2)||y − x||² + λ||∇x||_{1,2}
    The model subspace of isotropic TV is the same as for anisotropic TV: signals whose gradients share their support with ∇x̂(y).
  40. 2. Invariant LEAst-square Re-fitting: Example, isotropic TV
    Restoration model for an image y: x̂(y) = argmin_x (1/2)||y − x||² + λ||∇x||_{1,2}
    The model subspace of isotropic TV is the same as for anisotropic TV: signals whose gradients share their support with ∇x̂(y).
    [Image panels: noise-free image, noisy data y, x̂(y), R^inv_x̂(y)]
    Non-sparse support: noise is re-injected.
    Illustration done with an ugly, standard discretization of isotropic TV (i.e., neither the Condat nor the Chambolle-Pock scheme).
  41. 2. Invariant LEAst-square Re-fitting: Limitations
    Model subspace: only captures linear invariances w.r.t. small perturbations of y.
    Jacobian matrix: captures desirable covariant relationships between the entries of y and the entries of x̂(y) that should be preserved [Deledalle, P., Salmon and Vaiter, 2017, 2019].
  42. Outline (part 3: Covariant LEAst-square Re-fitting)
    Introduction to Re-fitting · Invariant LEAst-square Re-fitting · Covariant LEAst-square Re-fitting · Practical considerations and experiments · Conclusions
  43. 3. Covariant LEAst-square Re-fitting: Least-square Re-fitting
    General problem: x̂(y) = argmin_x F(Φx, y) + λG(x), with Φ a linear operator and F, G convex.
  44. 3. Covariant LEAst-square Re-fitting: Least-square Re-fitting
    General problem: x̂(y) = argmin_x F(Φx, y) + λG(x), with Φ a linear operator and F, G convex.
    Desirable properties of a re-fitting operator: h ∈ H_x̂ iff
    1. h ∈ M_x̂(y)
    2. Affine map: h(y) = Ay + b
    3. Preservation of covariants: J_h(y) = ρ J_x̂(y)
    4. Coherence: h(Φx̂(y)) = x̂(y)
  45. 3. Covariant LEAst-square Re-fitting: Least-square Re-fitting
    General problem: x̂(y) = argmin_x F(Φx, y) + λG(x), with Φ a linear operator and F, G convex.
    Desirable properties of a re-fitting operator: h ∈ H_x̂ iff (1) h ∈ M_x̂(y); (2) affine map h(y) = Ay + b; (3) preservation of covariants J_h(y) = ρ J_x̂(y); (4) coherence h(Φx̂(y)) = x̂(y).
    Covariant LEAst-square Re-fitting: R^cov_x̂(y) = argmin_{x ∈ H_x̂} (1/2)||Φx − y||²
  46. 3. Covariant LEAst-square Re-fitting: Covariant LEAst-square Re-fitting
    Proposition: the covariant re-fitting has an explicit formulation,
    R^cov_x̂(y) = x̂(y) + ρ J (y − Φx̂(y)) = argmin_{x ∈ H_x̂} (1/2)||Φx − y||²
    where, for δ = y − Φx̂(y): ρ = ⟨ΦJδ, δ⟩ / ||ΦJδ||² if ΦJδ ≠ 0, and ρ = 1 otherwise.
  47. 3. Covariant LEAst-square Re-fitting: Covariant LEAst-square Re-fitting
    Proposition: R^cov_x̂(y) = x̂(y) + ρ J (y − Φx̂(y)), with ρ as above.
    Property: if ΦJ is an orthogonal projector, then ρ = 1 and ΦR^cov_x̂(y) = ΦR^inv_x̂(y).
  48. 3. Covariant LEAst-square Re-fitting: Covariant LEAst-square Re-fitting
    Proposition: R^cov_x̂(y) = x̂(y) + ρ J (y − Φx̂(y)), with ρ as above.
    Property: if ΦJ is an orthogonal projector, then ρ = 1 and ΦR^cov_x̂(y) = ΦR^inv_x̂(y).
    Property: if F is convex, G is convex and 1-homogeneous, and x̂(y) = argmin_x F(Φx − y) + G(x), then JΦx̂(y) = x̂(y) a.e., so that R^cov_x̂(y) = (1 − ρ)x̂(y) + ρJy.
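    A minimal NumPy sketch of this explicit formula, assuming x̂(y) and the Jacobian-vector product Jδ are already available (how to compute Jδ is the topic of part 4):

      import numpy as np

      def covariant_refit(xhat, y, Phi, J_delta):
          """R_cov(y) = xhat + rho * J delta, with delta = y - Phi xhat and rho as above."""
          delta = y - Phi @ xhat
          Phi_J_delta = Phi @ J_delta
          nrm2 = np.dot(Phi_J_delta, Phi_J_delta)
          rho = np.dot(Phi_J_delta, delta) / nrm2 if nrm2 > 0 else 1.0
          return xhat + rho * J_delta

    For pure denoising (Φ = Id), Phi can simply be passed as the identity matrix.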
  50. 3. Covariant LEAst-square Re-fitting: Statistical interpretation
    Theorem (bias reduction): if ΦJ is an orthogonal projector, or if ρ satisfies technical conditions, then
    ||Φ(E[D_x̂(Y)] − x0)||² ≤ ||Φ(E[T_x̂(Y)] − x0)||²
  51. 3. Covariant LEAst-square Re-fitting: Example, isotropic TV
    [Image panels: noise-free image, noisy data y, x̂(y), R^inv_x̂(y), R^cov_x̂(y)]
  52. 3. Covariant LEAst-square Re-fitting: Why not iterate as boosting approaches do?
    Differentiable estimator w.r.t. y: x̃^0 = x̂(y) = argmin_x ||x − y||² + λ||∇x||_{1,2}
  53. 3. Covariant LEAst-square Re-fitting: Why not iterate as boosting approaches do?
    Differentiable estimator w.r.t. y: x̃^0 = x̂(y) = argmin_x ||x − y||² + λ||∇x||_{1,2}
    Bregman iterations: x̃^{k+1} = argmin_x ||x − y||² + λ D_{||∇·||_{1,2}}(x, x̃^k)
  56. 3. Covariant LEAst-square Re-fitting: Why not iterate as boosting approaches do?
    Differentiable estimator w.r.t. y: x̃^0 = x̂(y) = argmin_x ||x − y||² + λ||∇x||_{1,2}
    Bregman iterations: x̃^{k+1} = argmin_x ||x − y||² + λ D_{||∇·||_{1,2}}(x, x̃^k)
    Convergence: x̃^k → y
  57. 3. Covariant LEAst-square Re-fitting: Why not iterate as boosting approaches do?
    Differentiable estimator w.r.t. y: x̃^0 = x̂(y) = argmin_x ||x − y||² + λ||∇x||_{1,2}
    Covariant iterations: x̃^{k+1}(y) = x̃^k(y) + ρJ(y − Φx̃^k(y))
  59. 3. Covariant LEAst-square Re-fitting: Why not iterate as boosting approaches do?
    Differentiable estimator w.r.t. y: x̃^0 = x̂(y) = argmin_x ||x − y||² + λ||∇x||_{1,2}
    Covariant iterations: x̃^{k+1}(y) = x̃^k(y) + ρJ(y − Φx̃^k(y))
    Convergence: x̃^k(y) → R^inv_x̂(y)
  60. Outline (part 4: Practical considerations and experiments)
    Introduction to Re-fitting · Invariant LEAst-square Re-fitting · Covariant LEAst-square Re-fitting · Practical considerations and experiments · Conclusions
  61. 4. Practical considerations and experiments: How to compute the covariant Re-fitting?
    Explicit expression: R^cov_x̂(y) = x̂(y) + ρJδ, with J = ∂x̂(y)/∂y, δ = y − Φx̂(y), and ρ = ⟨ΦJδ, δ⟩ / ||ΦJδ||² if ΦJδ ≠ 0, ρ = 1 otherwise.
    Main issue: being able to compute Jδ.
  62. 4. Practical considerations and experiments: Application of the Jacobian matrix to a vector
    Algorithmic differentiation. Iterative algorithm producing x̂(y): x^{k+1} = ψ(x^k, y).
    Differentiation in the direction δ: x^{k+1} = ψ(x^k, y), x̃^{k+1} = ∂_1ψ(x^k, y) x̃^k + ∂_2ψ(x^k, y) δ, and J_{x^k}(y)δ = x̃^k.
    Joint estimation of x^k and J_{x^k}(y)δ; doubles the computational cost.
  63. 4. Practical considerations and experiments: Application of the Jacobian matrix to a vector
    Finite-difference based differentiation: x̂(y) can be any black-box algorithm.
    Directional derivative in the direction δ: J_x̂(y)δ ≈ (x̂(y + εδ) − x̂(y)) / ε
    Requires applying the algorithm twice.
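    A two-line sketch of this finite-difference directional derivative, treating the restoration method as a black-box function denoise(y); the function name and the step ε are placeholders:

      def jacobian_vector_product(denoise, y, xhat, delta, eps=1e-2):
          """Approximate J_xhat(y) applied to delta by a forward finite difference.
          xhat = denoise(y) is assumed to be already computed, so only one extra call is needed."""
          return (denoise(y + eps * delta) - xhat) / eps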
  64. 4. Practical considerations and experiments: Computation of the Re-fitting
    Covariant LEAst-square Re-fitting: R^cov_x̂(y) = x̂(y) + ρJδ, with δ = y − Φx̂(y) and ρ = ⟨Jδ, δ⟩ / ||Jδ||²_2
    Two steps with any denoising algorithm:
    1. Apply the algorithm to y to get x̂(y), and set δ = y − Φx̂(y)
    2. Compute Jδ (with algorithmic or finite-difference based differentiation)
  65. 4. Practical considerations and experiments: Computation of the Re-fitting
    Covariant LEAst-square Re-fitting: R^cov_x̂(y) = x̂(y) + ρJδ, with δ = y − Φx̂(y) and ρ = ⟨Jδ, δ⟩ / ||Jδ||²_2
    Two steps with any denoising algorithm:
    1. Apply the algorithm to y to get x̂(y), and set δ = y − Φx̂(y)
    2. Compute Jδ (with algorithmic or finite-difference based differentiation)
    One step for an absolutely 1-homogeneous regularizer: the re-fitting simplifies to R^cov_x̂(y) = (1 − ρ)x̂(y) + ρJy
    1. Estimate jointly x̂(y) and Jy with the differentiated algorithm
  66. 4. Practical considerations and experiments: 1-step implementation, anisotropic TV
    Model with 1-homogeneous regularizer: x̂(y) = argmin_x (1/2)||y − x||² + λ||∇x||_1
    Primal-dual implementation [Chambolle and Pock 2011], with its differentiated iterates:
    z^{k+1} = Proj_{Bλ}(z^k + σ∇x^k),  z̃^{k+1} = P_{z^k+σ∇x^k}(z̃^k + σ∇x̃^k)
    x^{k+1} = (x^k + τ(y + div(z^{k+1}))) / (1 + τ),  x̃^{k+1} = (x̃^k + τ(y + div(z̃^{k+1}))) / (1 + τ)
    Projection: Proj_{Bλ}(z)_i = z_i if |z_i| ≤ λ, and λ sign(z_i) otherwise; (P_z)_i = Id if |z_i| ≤ λ + β, and 0 otherwise.
    x^k → x̂(y) and x̃^k = J_{x^k} y → J_x̂ y
    J is an orthogonal projector: R^cov_x̂(y) = R^inv_x̂(y) = J_x̂ y
  67. 4. Practical considerations and experiments: 1-step implementation, isotropic TV
    Model with 1-homogeneous regularizer: x̂(y) = argmin_x (1/2)||y − x||² + λ||∇x||_{1,2}
    Primal-dual implementation [Chambolle and Pock 2011]:
    z^{k+1} = Proj_{Bλ}(z^k + σ∇x^k)
    x^{k+1} = (x^k + τ(y + div(z^{k+1}))) / (1 + τ)
    Projection: Proj_{Bλ}(z)_i = z_i if ||z_i||_2 ≤ λ, and λ z_i / ||z_i||_2 otherwise.
  68. 4. Practical considerations and experiments: 1-step implementation, isotropic TV
    Model with 1-homogeneous regularizer: x̂(y) = argmin_x (1/2)||y − x||² + λ||∇x||_{1,2}
    Primal-dual implementation [Chambolle and Pock 2011], with its differentiated iterates:
    z^{k+1} = Proj_{Bλ}(z^k + σ∇x^k),  z̃^{k+1} = P_{z^k+σ∇x^k}(z̃^k + σ∇x̃^k)
    x^{k+1} = (x^k + τ(y + div(z^{k+1}))) / (1 + τ),  x̃^{k+1} = (x̃^k + τ(y + div(z̃^{k+1}))) / (1 + τ)
    Projection: Proj_{Bλ}(z)_i = z_i if ||z_i||_2 ≤ λ, and λ z_i / ||z_i||_2 otherwise.
    Derivative of the projection: (P_z)_i = Id if ||z_i||_2 ≤ λ + β, and (λ/||z_i||_2)(Id − z_i z_i^T / ||z_i||²_2) otherwise.
    x^k → x̂(y) and x̃^k = J_{x^k} y → x̃
  69. 4. Practical considerations and experiments: 1-step implementation, isotropic TV
    (same coupled iterations as slide 68)
    x^k → x̂(y) and x̃^k = J_{x^k} y → x̃
    Covariant re-fitting: R^cov_x̂(y) = (1 − ρ)x̂ + ρx̃, with ρ = ⟨x̃ − x̂, y − x̂⟩ / ||x̃ − x̂||²_2
  70. 4. Practical considerations and experiments: Inpainting with isotropic TV
    [Image panels: y, x̂(y), R^cov_x̂(y), attenuated structures, residual lost structures, x0, ||x̂(y) − x0||², ||R^cov_x̂(y) − x0||²]
  71. 4. Practical considerations and experiments: Extension to chrominance [Pierre, Aujol, Deledalle, P., 2017]
    [Image panels: noise-free image, noisy image, denoised image]
  72. 4. Practical considerations and experiments: Extension to chrominance [Pierre, Aujol, Deledalle, P., 2017]
    [Image panels: noise-free image, noisy image, re-fitting]
  73. 4. Practical considerations and experiments: 2-step implementation, Non-Local Means
    Model without a 1-homogeneous regularizer: x̂(y)_i = Σ_j w^y_{ij} y_j / Σ_j w^y_{ij}, with w^y_{ij} = exp(−||P_i y − P_j y||²_2 / h²)
    Differentiate the NLM code.
    Algorithm (re-fitting with algorithmic differentiation):
    1: run the NLM code to get x̂(y), and set δ = y − x̂(y)
    2: run the differentiated NLM code in the direction δ to get Jδ
    3: set ρ = ⟨Jδ, δ⟩ / ||Jδ||²_2
    4: covariant re-fitting: R^cov_x̂(y) = x̂(y) + ρJδ
  74. 4. Practical considerations and experiments: Non-Local Means
    [Image panels: y, x̂(y), R^cov_x̂(y), attenuated structures, residual lost structures, x0, ||x̂(y) − x0||², ||R^cov_x̂(y) − x0||²]
  75. 4. Practical considerations and experiments: Non-Local Means, bias-variance trade-off
    [Image panels with PSNR/SSIM: 22.18/0.397, 25.07/0.564, 26.62/0.724]
    [Plot: quadratic cost vs. filtering parameter h; curves: original x̂(y), re-fitted R_x̂(y); markers: optimum and sub-optimum, original and re-fitted]
    Non-Local Means: PSNR 30.12, SSIM 0.815 · CLEAR: PSNR 29.20, SSIM 0.823
  76. 4. Practical considerations and experiments: Bias-variance trade-off, Non-Local Means
    [Image panels with PSNR/SSIM: 22.18/0.847, 24.68/0.914, 24.64/0.910]
    Non-Local Means: PSNR 27.04, SSIM 0.946 · CLEAR: PSNR 27.29, SSIM 0.955
  78. 4. Practical considerations and experiments: 2-step implementation for a black-box algorithm
    Denoising algorithm: y → x̂(y)
    Re-fitting with finite differences:
    1: δ = y − x̂(y)
    2: Jδ = (x̂(y + εδ) − x̂(y)) / ε
    3: ρ = ⟨Jδ, δ⟩ / ||Jδ||²_2
    4: R^cov_x̂(y) = x̂(y) + ρJδ
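    Putting the four lines above together, a possible NumPy wrapper around any black-box denoiser; the function name denoise and the step ε are placeholders, and the pure denoising case (Φ = Id) is assumed, as on the slide:

      import numpy as np

      def clear_refit_blackbox(denoise, y, eps=1e-2):
          """Two-step covariant re-fitting of a black-box denoiser via finite differences."""
          xhat = denoise(y)
          delta = y - xhat                                   # step 1
          J_delta = (denoise(y + eps * delta) - xhat) / eps  # step 2
          nrm2 = np.dot(J_delta.ravel(), J_delta.ravel())
          rho = np.dot(J_delta.ravel(), delta.ravel()) / nrm2 if nrm2 > 0 else 1.0  # step 3
          return xhat + rho * J_delta                        # step 4

    A wrapper of this form could be applied to the BM3D, DDID and CNN results shown next; as noted on slide 63, it costs two calls to the denoiser.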
  79. 4. Practical considerations and experiments: BM3D [Dabov et al. 2007, Lebrun 2012]
    [Image panels with PSNR/SSIM: 22.17/0.528, 25.00/0.7523, 26.92/0.861]
    [Plot: quadratic cost vs. filtering parameter log γ; curves: original x̂(y), re-fitted R_x̂(y); markers: optimum and sub-optimum, original and re-fitted]
    BM3D: PSNR 30.29, SSIM 0.920 · CLEAR: PSNR 29.41, SSIM 0.918
  80. 4. Practical considerations and experiments: DDID [Knaus & Zwicker 2013]
    [Image panels with PSNR/SSIM: 22.16/0.452, 26.33/0.716, 26.60/0.721]
    [Plot: quadratic cost vs. filtering parameter log γ]
    DDID: PSNR 31.02, SSIM 0.858 · CLEAR: PSNR 29.91, SSIM 0.845
  81. 4. Practical considerations and experiments: DnCNN [Zhang et al., 2017]
    Residual network that learns the noise to remove.
    [Image: result at noise level 25, DnCNN]
  82. 4. Practical considerations and experiments: DnCNN [Zhang et al., 2017]
    Residual network that learns the noise to remove.
    [Image: result at noise level 25, CLEAR]
  83. 4. Practical considerations and experiments: DnCNN [Zhang et al., 2017]
    Residual network that learns the noise to remove.
    [Image: result at noise level 50, DnCNN]
  84. 4. Practical considerations and experiments: DnCNN [Zhang et al., 2017]
    Residual network that learns the noise to remove.
    [Image: result at noise level 50, CLEAR]
  85. 4. Practical considerations and experiments: DnCNN [Zhang et al., 2017]
    Residual network that learns the noise to remove.
    [Image: result at noise level 150, DnCNN]
  86. 4. Practical considerations and experiments: DnCNN [Zhang et al., 2017]
    Residual network that learns the noise to remove.
    [Image: result at noise level 150, CLEAR]
  87. 4. Practical considerations and experiments: DnCNN [Zhang et al., 2017]
    Residual network that learns the noise to remove.
    [Image: result at noise level 150, CLEAR]
    No interesting structural information to recover from the noise model.
  88. 4. Practical considerations and experiments: FFDNet [Zhang et al., 2018]
    Network that learns the denoised image for Gaussian noise with noise level in [0, 75].
    [Image: result at noise level 25, FFDNet]
  89. 4. Practical considerations and experiments: FFDNet [Zhang et al., 2018]
    Network that learns the denoised image for Gaussian noise with noise level in [0, 75].
    [Image: result at noise level 25, CLEAR]
  90. 4. Practical considerations and experiments: FFDNet [Zhang et al., 2018]
    Network that learns the denoised image for Gaussian noise with noise level in [0, 75].
    [Image: result at noise level 50, FFDNet]
  91. 4. Practical considerations and experiments: FFDNet [Zhang et al., 2018]
    Network that learns the denoised image for Gaussian noise with noise level in [0, 75].
    [Image: result at noise level 50, CLEAR]
  92. 4. Practical considerations and experiments: FFDNet [Zhang et al., 2018]
    Network that learns the denoised image for Gaussian noise with noise level in [0, 75].
    [Image: result at noise level 150, FFDNet]
  93. 4. Practical considerations and experiments: FFDNet [Zhang et al., 2018]
    Network that learns the denoised image for Gaussian noise with noise level in [0, 75].
    [Image: result at noise level 150, CLEAR]
  94. Outline (part 5: Conclusions)
    Introduction to Re-fitting · Invariant LEAst-square Re-fitting · Covariant LEAst-square Re-fitting · Practical considerations and experiments · Conclusions
  95. 5. Conclusions
    Covariant LEAst-square Re-fitting: corrects part of the bias of restoration models; no additional parameter; stability for a larger range of parameters; doubles the computational cost.
  96. 5. Conclusions
    Covariant LEAst-square Re-fitting: corrects part of the bias of restoration models; no additional parameter; stability for a larger range of parameters; doubles the computational cost.
    When to use re-fitting? Differentiable estimators (no algorithms involving quantization); a regularization prior adapted to the data; when respecting the data range matters: oceanography, radiotherapy...
  97. 5. Conclusions: Main related references
    [Brinkmann, Burger, Rasch and Sutour] Bias-reduction in variational regularization. JMIV, 2017.
    [Deledalle, P. and Salmon] On debiasing restoration algorithms: applications to total-variation and nonlocal-means. SSVM, 2015.
    [Deledalle, P., Salmon and Vaiter] CLEAR: Covariant LEAst-square Re-fitting. SIAM SIIMS, 2017.
    [Deledalle, P., Salmon and Vaiter] Refitting solutions with block penalties. SSVM, 2019.
    [Osher, Burger, Goldfarb, Xu and Yin] An iterative regularization method for total variation-based image restoration. SIAM MMS, 2005.
    [Romano and Elad] Boosting of image denoising algorithms. SIAM SIIMS, 2015.
    [Talebi, Zhu and Milanfar] How to SAIF-ly boost denoising performance. IEEE TIP, 2013.
    [Vaiter, Deledalle, Peyré, Fadili and Dossal] The degrees of freedom of partly smooth regularizers. Annals of the Institute of Statistical Mathematics, 2016.