Claire Boyer

Spike deconvolution with unknown noise level Claire Boyer, Yohann de
Castro, Joseph Salmon, UPMC, LSTA, France January 20, 2017

1 Introduction 2 The proposed approach Results Some words about
the proof 3 Numerical aspects 4 Numerical experiments

Introduction Motivations Point sources in applications Astronomy Microscopy (ﬂuorescent molecules)
Spectroscopy 2 / 37 Spike deconvolution

Introduction Motivations Point sources in applications Astronomy Microscopy (ﬂuorescent molecules)
Spectroscopy Oﬀ-the-grid Point sources do not live on a cartesian grid. 2 / 37 Spike deconvolution

Introduction What is spike deconvolution? Setting Aim: reconstruct a discrete
measure µ0 = s i=1 a0 i δt0 i supp(µ0) = {t0 1 , . . . , t0 s } ⊂ T (a0 i )1≤i≤s ∈ Cs 3 / 37 Spike deconvolution

measure µ0 = s i=1 a0 i δt0 i supp(µ0) = {t0 1 , . . . , t0 s } ⊂ T (a0 i )1≤i≤s ∈ Cs From linear measurements y = Φµ0 + ε Φ is a ﬁlter ε is a complex Gaussian vector, ε = ε(1) + iε(2) with ε(i) ∼ N (0, σ2 0 Id) 3 / 37 Spike deconvolution

measure µ0 = s i=1 a0 i δt0 i supp(µ0) = {t0 1 , . . . , t0 s } ⊂ T (a0 i )1≤i≤s ∈ Cs From linear measurements y = Φµ0 + ε Φ is a ﬁlter ε is a complex Gaussian vector, ε = ε(1) + iε(2) with ε(i) ∼ N (0, σ2 0 Id) In this work Low-pass ﬁlter : Φ = Fn Fn(µ) := (ck (µ))|k|≤fc , ck (µ) := T exp(−2πıkt)µ(dt) 3 / 37 Spike deconvolution

Introduction How to reconstruct µ0? 1-deconvolution: Beurling lasso (Blasso) min
µ 1 2 y − Fn (µ) 2 2 + λ µ TV µ TV = sup | j µ(Bj )| over all ﬁnite partitions (Bj ) of [0, 1] µ TV = s i=1 |ai | when the measure is discrete 4 / 37 Spike deconvolution

Introduction Beurling minimal extrapolation Beurling minimal extrapolation (1938) µ ∈
argmin µ µ TV s.t. T Φdµ = y with Φ = (ϕ1, . . . , ϕn ) Inﬁnite-dimensional variable µ Finitely many constraints 5 / 37 Spike deconvolution

Introduction Beurling minimal extrapolation Beurling minimal extrapolation (1938) µ ∈
argmin µ µ TV s.t. T Φdµ = y with Φ = (ϕ1, . . . , ϕn ) Inﬁnite-dimensional variable µ Finitely many constraints Fenchel dual program ˆ u ∈ arg max y, u s.t. n k=0 uk ϕk ∞ ≤ 1 Finite-dimensional variable Inﬁnitely many constraints Strong duality 5 / 37 Spike deconvolution

µ 1 2 y − Fn (µ) 2 2 + λ µ TV µ TV = sup | j µ(Bj )| over all ﬁnite partitions (Bj ) of [0, 1] µ TV = s i=1 |ai | when the measure is discrete Algorithms proximal-based [Bredies, Pikkarainen 2012] root-ﬁnding [Candés, Fernandez-Granda 2012] 6 / 37 Spike deconvolution

µ 1 2 y − Fn (µ) 2 2 + λ µ TV µ TV = sup | j µ(Bj )| over all ﬁnite partitions (Bj ) of [0, 1] µ TV = s i=1 |ai | when the measure is discrete Algorithms proximal-based [Bredies, Pikkarainen 2012] root-ﬁnding [Candés, Fernandez-Granda 2012] Others based on Prony’s methods : MUSIC, ESPRIT, FRI 6 / 37 Spike deconvolution

µ 1 2 y − Fn (µ) 2 2 + λ µ TV µ TV = sup | j µ(Bj )| over all ﬁnite partitions (Bj ) of [0, 1] µ TV = s i=1 |ai | when the measure is discrete Algorithms proximal-based [Bredies, Pikkarainen 2012] root-ﬁnding [Candés, Fernandez-Granda 2012] Others based on Prony’s methods : MUSIC, ESPRIT, FRI Link with 1-regularization ? 6 / 37 Spike deconvolution

Introduction Link to compressed sensing Spike deconvolution Compressed sensing Setting
∞-dim ﬁnite-dim (or easy ∞-dim) Object of interest a discrete measure a sparse signal (mainly) µ = s i=1 ai δti x ∈ Cn 1illustrations from Carlos Fernandez-Granda 7 / 37 Spike deconvolution

∞-dim ﬁnite-dim (or easy ∞-dim) Object of interest a discrete measure a sparse signal (mainly) µ = s i=1 ai δti x ∈ Cn Measurements low-pass ﬁlter random 1 spectrum extrapolation spectrum interpolation 1illustrations from Carlos Fernandez-Granda 7 / 37 Spike deconvolution

∞-dim ﬁnite-dim (or easy ∞-dim) Object of interest a discrete measure a sparse signal (mainly) µ = s i=1 ai δti x ∈ Cn Measurements low-pass ﬁlter random 1 spectrum extrapolation spectrum interpolation Resolution minµ 1 2 y − Fn (µ) 2 2 + λ µ TV minx 1 2 y − Frd (x) 2 2 + λ x 1 Beurling Lasso estimator Lasso Key feature Minimum separation Measurement incoherence + degree of sparsity 1illustrations from Carlos Fernandez-Granda 7 / 37 Spike deconvolution

∞-dim finite-dim (or easy ∞-dim) Object of interest a discrete measure a sparse signal (mainly) µ = s i=1 ai δti x ∈ Cn Measurements low-pass filter random 1 spectrum extrapolation spectrum interpolation Resolution minµ 1 2 y − Fn (µ) 2 2 + λ µ TV minx 1 2 y − Frd (x) 2 2 + λ x 1 Beurling Lasso estimator Lasso Key feature Minimum separation Measurement incoherence + degree of sparsity Algorithm Root-finding (...) Easy convex programming 1illustrations from Carlos Fernandez-Granda 7 / 37 Spike deconvolution

Introduction Contributions Handling unknown noise level 8 / 37 Spike
deconvolution

Introduction Contributions Handling unknown noise level Assessing the noise level
using the Rice method for a non-Gaussian process 8 / 37 Spike deconvolution

using the Rice method for a non-Gaussian process Prediction & strong localization accuracy 8 / 37 Spike deconvolution

using the Rice method for a non-Gaussian process Prediction & strong localization accuracy Numerics : the root-ﬁnding can always be done (open question of [1, Eq. (4.5)]) E. J. Candès and C. Fernandez-Granda. Towards a mathematical theory of super-resolution. Communications on Pure and Applied Mathematics, 67(6):906–956, 2014. 8 / 37 Spike deconvolution

using the Rice method for a non-Gaussian process Prediction & strong localization accuracy Numerics : the root-ﬁnding can always be done (open question of [1, Eq. (4.5)]) Claire Boyer, Yohann de Castro, and Joseph Salmon. Adapting to unknown noise level in sparse deconvolution. Information and Inference, abs/1606.04760, 2016. E. J. Candès and C. Fernandez-Granda. Towards a mathematical theory of super-resolution. Communications on Pure and Applied Mathematics, 67(6):906–956, 2014. 8 / 37 Spike deconvolution

The proposed approach The proposed method Data y = Fnµ0
+ ε y ∈ Cn with n = 2fc + 1 ε is a complex Gaussian vector, ε = ε(1) + iε(2) with ε(j) ∼ N(0, σ2 0 Id) 9 / 37 Spike deconvolution

The proposed approach The proposed method Data y = Fnµ0
+ ε y ∈ Cn with n = 2fc + 1 ε is a complex Gaussian vector, ε = ε(1) + iε(2) with ε(j) ∼ N(0, σ2 0 Id) Handling unknown noise level : CBLasso (ˆ µ, ˆ σ) ∈ argmin (µ,σ)∈E∗×R++ 1 2nσ y − Fn (µ) 2 2 + σ 2 + λ µ TV , Owen 07, Antoniadis 10, Belloni Chernozhukov Wang 11, Sun Zhang 12, van de Geer 15 : square-root lasso and scaled-lasso. 9 / 37 Spike deconvolution

The proposed approach Where is the square-root? Why square-root? The
CBLasso reads ([2] ∼ scaled lasso) (ˆ µ, ˆ σ) ∈ argmin (µ,σ)∈E∗×R++ 1 2nσ y − Fn (µ) 2 2 + σ 2 + λ µ TV . When the solution is reached for ˆ σ > 0, ([1] ∼ square-root lasso) ˆ σ = y − Fn (ˆ µ) 2/ √ n ˆ µ ∈ argmin µ∈E∗ y − Fn (µ) 2/ √ n + λ µ TV A. Belloni, V. Chernozhukov, and L. Wang. Square-root Lasso: Pivotal recovery of sparse signals via conic programming. Biometrika, 98(4):791–806, 2011. T. Sun and C.-H. Zhang. Scaled sparse linear regression. Biometrika, 99(4):879–898, 2012. 10 / 37 Spike deconvolution

The proposed approach Compatibility limits Suﬃcient conditions for oracle inequalities
RIP =⇒ REC =⇒ Compatibility Compatibility condition : C(L, S) > 0 C(L, S) := inf |S| Fn(ν) 2 2 /n s.t. supp(ν) = S, νS TV = 1, νSc TV ≤ L . Consider ν as a diﬀerence of two measures: ˆ µ and µ0 11 / 37 Spike deconvolution

The proposed approach Compatibility limits Sufficient conditions for oracle inequalities
RIP =⇒ REC =⇒ Compatibility Compatibility condition : C(L, S) > 0 C(L, S) := inf |S| Fn(ν) 2 2 /n s.t. supp(ν) = S, νS TV = 1, νSc TV ≤ L . Consider ν as a difference of two measures: ˆ µ and µ0 Compatibility does not hold νε = δ−ε + δε @@@@@ Compatibility =⇒ $$ REC =⇒ ¨ ¨ RIP highly coherent designs: close Dirac masses share almost the same Fourier coefficients Non-uniform approach Measure-dependent reconstruction 11 / 37 Spike deconvolution

The proposed approach – Results Standard assumptions Assumption (Sampling rate
condition). λSNR ≤ √ 17 − 4 2 0.0616 with SNR := µ0 TV E[ ε 2 2 ]/n = µ0 TV √ 2σ0 . λ ≥ 2 2 log n/n =⇒ n/log n ≥ C SNR2 12 / 37 Spike deconvolution

condition). λSNR ≤ √ 17 − 4 2 0.0616 with SNR := µ0 TV E[ ε 2 2 ]/n = µ0 TV √ 2σ0 . λ ≥ 2 2 log n/n =⇒ n/log n ≥ C SNR2 Assumption (No-overﬁtting). ˆ σ > 0 12 / 37 Spike deconvolution

condition). λSNR ≤ √ 17 − 4 2 0.0616 with SNR := µ0 TV E[ ε 2 2 ]/n = µ0 TV √ 2σ0 . λ ≥ 2 2 log n/n =⇒ n/log n ≥ C SNR2 Assumption (No-overﬁtting). ˆ σ > 0 Assumption (Separation condition). For supp(µ0) = {t0 1 , . . . , t0 s0 }, ∀i = j, d(t0 i , t0 j ) ≥ 1.26 fc 12 / 37 Spike deconvolution

The proposed approach – Results Prediction result Theorem (B., de
Castro, Salmon, 2016). Let C > 2 √ 2. Set C > 0, that may depend on C. Assume the sampling rate condition, the separation condition. The estimator ˆ µ solution to CBLasso with a choice λ ≥ C log n/n satisﬁes 1 n Fn (ˆ µ − µ0) 2 2 ≤ C s0 λ2 σ0, with high probability. 13 / 37 Spike deconvolution

The proposed approach – Results Prediction result Theorem (B., de
Castro, Salmon, 2016). Let C > 2 √ 2. Set C > 0, that may depend on C. Assume the sampling rate condition, the separation condition. The estimator ˆ µ solution to CBLasso with a choice λ ≥ C log n/n satisﬁes 1 n Fn (ˆ µ − µ0) 2 2 ≤ C s0 λ2 σ2 0 = O s0σ2 0 log n n , with high probability. "fast rate of convergence" of [1] G. Tang, B. N. Bhaskar, and B. Recht. Near minimax line spectral estimation. Information Theory, IEEE Transactions on, 61(1):499–512, 2015. 13 / 37 Spike deconvolution

The proposed approach – Results Localization results Near regions ∀j
∈ [s0 ], Nj := t ∈ [0, 1]; d(t, t0 j ) ≤ c1 fc , with 0 < c1 < 1.26/2. Far region F := [0, 1] \ j∈[s0] Nj 14 / 37 Spike deconvolution

∈ [s0 ], Nj := t ∈ [0, 1]; d(t, t0 j ) ≤ c1 fc , with 0 < c1 < 1.26/2. Far region F := [0, 1] \ j∈[s0] Nj Theorem (B., de Castro, Salmon, 2016). Let C > 2 √ 2. Let C > 0 may depend on C. The estimator ˆ µ, solution to CBLasso with a choice λ ≥ C log n/n, satisﬁes 1 ∀j ∈ [s0], a0 j − {k: ˆ tk ∈Nj } ˆ ak ≤ C σ0s0 log n/n , 14 / 37 Spike deconvolution

∈ [s0 ], Nj := t ∈ [0, 1]; d(t, t0 j ) ≤ c1 fc , with 0 < c1 < 1.26/2. Far region F := [0, 1] \ j∈[s0] Nj Theorem (B., de Castro, Salmon, 2016). Let C > 2 √ 2. Let C > 0 may depend on C. The estimator ˆ µ, solution to CBLasso with a choice λ ≥ C log n/n, satisﬁes 1 ∀j ∈ [s0], a0 j − {k: ˆ tk ∈Nj } ˆ ak ≤ C σ0s0 log n/n , 2 ∀j ∈ [s0], {k: ˆ tk ∈Nj } |ˆ ak |d2(t0 j , ˆ tk ) ≤ C σ0s0 log n/n/n2 , 14 / 37 Spike deconvolution

∈ [s0 ], Nj := t ∈ [0, 1]; d(t, t0 j ) ≤ c1 fc , with 0 < c1 < 1.26/2. Far region F := [0, 1] \ j∈[s0] Nj Theorem (B., de Castro, Salmon, 2016). Let C > 2 √ 2. Let C > 0 may depend on C. The estimator ˆ µ, solution to CBLasso with a choice λ ≥ C log n/n, satisﬁes 1 ∀j ∈ [s0], a0 j − {k: ˆ tk ∈Nj } ˆ ak ≤ C σ0s0 log n/n , 2 ∀j ∈ [s0], {k: ˆ tk ∈Nj } |ˆ ak |d2(t0 j , ˆ tk ) ≤ C σ0s0 log n/n/n2 , 3 {k: ˆ tk ∈F} |ˆ ak | ≤ C σ0s0 log n/n , with high probability. 14 / 37 Spike deconvolution

The proposed approach – Results Localization results Corollary (B., de
Castro, Salmon, 2016). For any t0 j in the support of µ0 such that a0 j > C σ0 s0λ, there exists an element ˆ tk in the support of ˆ µ such that d(t0 j , ˆ tk ) ≤ C σ0 s0λ |a0 j | − C σ0 s0λ 1 n , with high probability. independent of the other spikes magnitude 15 / 37 Spike deconvolution

The proposed approach – Results Noise level estimation Proposition (B.,
de Castro, Salmon, 2016). Under the sampling rate assumption, if n > −8 log(α), √ nˆ σ ε 2 − 1 ≤ 1 2 , with probability larger than 1 − α 2 √ 2 n + 2 √ 3+3 3 . 16 / 37 Spike deconvolution

The proposed approach – Some words about the proof Sketch
of proof KKT condition 1 n F∗ n (y − Fn (ˆ µ)) = ˆ σλˆ p ˆ p ∞ ≤ 1 R T ˆ p(t)ˆ µ(dt) = ˆ µ TV (ˆ p subgradient of · TV at ˆ µ) 17 / 37 Spike deconvolution

of proof KKT condition 1 n F∗ n (y − Fn (ˆ µ)) = ˆ σλˆ p ˆ p ∞ ≤ 1 R T ˆ p(t)ˆ µ(dt) = ˆ µ TV (ˆ p subgradient of · TV at ˆ µ) Dual certiﬁcate Construction of q(t) = fc k=−fc ck e2iπkt obeying to q(tj ) = v0 j := a0 j |a0 j | ∀tj ∈ supp µ0 |q(t)| < 1 ∀t / ∈ supp µ0 17 / 37 Spike deconvolution

of proof KKT condition 1 n F∗ n (y − Fn (ˆ µ)) = ˆ σλˆ p ˆ p ∞ ≤ 1 R T ˆ p(t)ˆ µ(dt) = ˆ µ TV (ˆ p subgradient of · TV at ˆ µ) Dual certiﬁcate Construction of q(t) = fc k=−fc ck e2iπkt obeying to q(tj ) = v0 j := a0 j |a0 j | ∀tj ∈ supp µ0 q (tj ) = 0 |q(t)| < 1 ∀t / ∈ supp µ0 17 / 37 Spike deconvolution

of proof Spike localization Proofs are based on the work of [1, 2, 3] amended by Rice method Prediction adapt the proof of [3] to our setting Rice method for a non-Gaussian process J.-M. Azaïs, Y. De Castro, and F. Gamboa. Spike detection from inaccurate samplings. Applied and Computational Harmonic Analysis, 38(2):177–195, 2015. C. Fernandez-Granda. Support detection in super-resolution. arXiv preprint arXiv:1302.3921, 2013. G. Tang, B. N. Bhaskar, and B. Recht. Near minimax line spectral estimation. Information Theory, IEEE Transactions on, 61(1):499–512, 2015. 18 / 37 Spike deconvolution

Numerical aspects Primal-dual Proposition. Denoting Dn = c ∈ Cn
: F∗ n (c) ∞ ≤ 1, nλ2 c 2 ≤ 1 , the dual formulation of the CBLasso reads ˆ c ∈ arg max c∈Dn λ y, c . (1) Then, we have the link-equation between primal and dual solutions y = nλˆ σˆ c + Fn (ˆ µ) . (2) as well as a link between the coeﬃcient and the polynomial F∗ n (ˆ c) = ˆ p . (3) 19 / 37 Spike deconvolution

Numerical aspects Primal-dual Proposition. Denoting Dn = c ∈ Cn
: F∗ n (c) ∞ ≤ 1, nλ2 c 2 ≤ 1 , the dual formulation of the CBLasso reads ˆ c ∈ arg max c∈Dn λ y, c . (1) Then, we have the link-equation between primal and dual solutions y = nλˆ σˆ c + Fn (ˆ µ) . (2) as well as a link between the coefficient and the polynomial F∗ n (ˆ c) = ˆ p . (3) If the dual polynomial ˆ p associated is not constant the support of ˆ µ is finite, the support of ˆ µ is included in the set of its derivative roots, i.e. where the polynomial saturates at 1. the proof based on a dual certificate construction is possible. 19 / 37 Spike deconvolution

Numerical aspects Primal-dual : comparison with Blasso Dual of the
Blasso ˆ c ∈ arg max F∗ n (c) ∞≤1 y, c . Dual of the CBLasso ˆ c ∈ arg max F∗ n (c) ∞≤1 nλ2 c 2≤1 λ y, c . 20 / 37 Spike deconvolution

Numerical aspects Primal-dual : comparison with Blasso Dual of the
Blasso ˆ c ∈ arg max F∗ n (c) ∞≤1 y, c . Open question When the dual polynomial is non- constant ? Dual of the CBLasso ˆ c ∈ arg max F∗ n (c) ∞≤1 nλ2 c 2≤1 λ y, c . For the CBLasso We answer this. 20 / 37 Spike deconvolution

Numerical aspects Showing that the dual polynomial is non-constant |ˆ
p|2 is of constant modulus 1 ⇒ ˆ p = vϕk with v ∈ C and ϕk (·) = exp(2πık·) for some −fc ≤ k ≤ fc . if |v| < 1, using Holder’s inequality on R T ˆ p(t)ˆ µ(dt) = ˆ µ TV leads to ˆ µ = 0. if |v| = 1, we also have ˆ c ∈ Dn, so ˆ c 2 ≤ 1/( √ nλ), leading to |v| ≤ 1/( √ nλ). Since λ > 2 √ logn/ √ n ⇒ |v| < 1, which contradicts |v| = 1. One can then conclude that a dual polynomial of constant modulus never occurs in the CBLasso setup 21 / 37 Spike deconvolution

Numerical aspects Discussion on the value of λ 22 /
37 Spike deconvolution

Numerical aspects SDP formulation of the CBLasso One can represent
the dual feasible set Dn as an SDP condition. The dual problem can be cast as follows max c∈Cn λ y, c such that ∃Q ∈ Cn×n Q c c∗ 1 0 n−j i=1 Qi,i+j = 1 if j = 0 0 j = 1, . . . , n − 1. 23 / 37 Spike deconvolution

Numerical aspects SDP formulation of the CBLasso One can represent
the dual feasible set Dn as an SDP condition. The dual problem can be cast as follows max c∈Cn λ y, c such that ∃Q ∈ Cn×n Q c c∗ 1 0 n−j i=1 Qi,i+j = 1 if j = 0 0 j = 1, . . . , n − 1. In λ √ nc λ √ nc∗ 1 0 23 / 37 Spike deconvolution

Numerical aspects Proposed algorithm Algorithm Given the data y ∈
Cn 1 solve the dual problem to ﬁnd the coeﬃcients ˆ c of the dual polynomial ˆ p (cvx Matlab toolbox); 24 / 37 Spike deconvolution

Cn 1 solve the dual problem to ﬁnd the coeﬃcients ˆ c of the dual polynomial ˆ p (cvx Matlab toolbox); 2 identify supp(ˆ µ) = {ˆ t1, . . . , ˆ tˆ s } using the roots of 1 − |ˆ p|2; 24 / 37 Spike deconvolution

Cn 1 solve the dual problem to find the coefficients ˆ c of the dual polynomial ˆ p (cvx Matlab toolbox); 2 identify supp(ˆ µ) = {ˆ t1, . . . , ˆ tˆ s } using the roots of 1 − |ˆ p|2; 3 fix the design matrix X ∈ Cn×ˆ s , defined by Xk,j = ϕk (ˆ tj ); 24 / 37 Spike deconvolution

Cn 1 solve the dual problem to find the coefficients ˆ c of the dual polynomial ˆ p (cvx Matlab toolbox); 2 identify supp(ˆ µ) = {ˆ t1, . . . , ˆ tˆ s } using the roots of 1 − |ˆ p|2; 3 fix the design matrix X ∈ Cn×ˆ s , defined by Xk,j = ϕk (ˆ tj ); 4 solve now the following finite-dimensional problem (ˆ a, ˆ σ) ∈ argmin (a,σ)∈Cˆ s ×R++ 1 2nσ y − Xa 2 2 + σ 2 + λ a 1 , 24 / 37 Spike deconvolution

Cn 1 solve the dual problem to find the coefficients ˆ c of the dual polynomial ˆ p (cvx Matlab toolbox); 2 identify supp(ˆ µ) = {ˆ t1, . . . , ˆ tˆ s } using the roots of 1 − |ˆ p|2; 3 fix the design matrix X ∈ Cn×ˆ s , defined by Xk,j = ϕk (ˆ tj ); 4 solve now the following finite-dimensional problem (ˆ a, ˆ σ) ∈ argmin (a,σ)∈Cˆ s ×R++ 1 2nσ y − Xa 2 2 + σ 2 + λ a 1 , as follows: for an initial value of ˆ σ, until some stopping criterion, alternate the following steps solve the previous problem for a fixed ˆ σ to compute ˆ a, ˆ a = X+y − λˆ σ(X∗X)−1 sign(X∗ ˆ c) update ˆ σ = y − X ˆ a 2/ √ n using the new value of ˆ a, 24 / 37 Spike deconvolution

Numerical experiments Source code Code available at https://github.com/claireBoyer/CBLasso http://www.lsta.upmc.fr/boyer/ 25
/ 37 Spike deconvolution

Numerical experiments Measure reconstruction 0 0.2 0.4 0.6 0.8 1
−1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 Original BLasso CBLasso Reconstruction of a discrete measure. The original measure µ0 is composed of 3 spikes (in black). The reconstructed measure ˆ µ using our proposed CBLasso (in blue). In comparison, we plot the reconstructed measure using the BLasso, (in red). 26 / 37 Spike deconvolution

Numerical experiments Noise level estimation ε = ε(1) + iε(2)
with ε(j) ∼ N(0, σ0 Id) with σ0 = 1/ √ 2. 27 / 37 Spike deconvolution

Numerical experiments Noise level estimation ε = ε(1) + iε(2)
with ε(j) ∼ N(0, σ0 Id) with σ0 = 1/ √ 2. 0.95 1 1.05 1.1 1.15 1.2 1.25 1.3 BLasso CBLasso Boxplot on ˆ σ for 100 CBLasso consistent estimations of √ 2σ0 = 1. We compare our method to ˆ σBLasso = y − Fn (ˆ µBLasso) 2 √ n − ˆ sBLasso proposed in [1] where ˆ µBLasso is the reconstructed measure supported on ˆ sBLasso spikes via BLasso. S. Reid, R. Tibshirani, and J. Friedman. A study of error variance estimation in lasso regression. arXiv preprint arXiv:1311.5274, 2014. 27 / 37 Spike deconvolution

Numerical experiments Bias Noise level estimation √ nˆ σ ε
2 − 1 ≤ 1 2 , with high probability. Bias ˆ σ √ 2σ0 × E g 2 √ 2n = √ 2σ0 × Γ(n + 1/2) √ nΓ(n) → √ 2σ0 , showing that ˆ σ/ √ 2 is a consistent estimator of σ0 . 28 / 37 Spike deconvolution

Numerical experiments Conclusion The CBLasso new approach to handle unknown
noise level in spike detection theoretical contributions : prediction for CBLasso localization for CBLasso closing the question of constant polynomial in this setting numerical method is also proposed to tackle the CBLasso 29 / 37 Spike deconvolution

Numerical experiments Conclusion The CBLasso new approach to handle unknown
noise level in spike detection theoretical contributions : prediction for CBLasso localization for CBLasso closing the question of constant polynomial in this setting numerical method is also proposed to tackle the CBLasso Claire Boyer, Yohann de Castro, and Joseph Salmon. Adapting to unknown noise level in sparse deconvolution. Information and Inference, abs/1606.04760, 2016. 29 / 37 Spike deconvolution

Numerical experiments Thank you for your attention

Discussion on λ Discussion on λ λmax(y) = F∗ n
(y) ∞/ √ n y 2 (for Blasso F∗ n (y) ∞/n).

Proposition. Deﬁning λmin (y) = 1/( ˆ c(BME) 2 √
n) and the problem ˆ µ(BME) ∈ argmin Fn(µ)=y µ TV , (4) and its dual formulation ˆ c(BME) ∈ arg max c∈Cn y, c s.t. F∗ n (c) ∞ ≤ 1 . (5) the following statements are equivalent (i) λ ∈]0, λmin(y)], (ii) ˆ c = ˆ c(BME), (iii) ˆ σ = 0 (overﬁtting). Remark that λmin (y) = 1/( ˆ c(BME) 2 √ n) > 1/ √ n.

Proof tricks Rice method One needs to upper bound the
probabilities of the following events { F∗ n (ε) ∞ ≤ λ} and F∗ n (ε) ∞ √ n ε 2 ≤ R , where F∗ n (ε)(t) = fc k=−fc εk exp(2πıkt) , for all t ∈ [0, 1]. The ﬁrst event can be handled with a Rice formula for stationary Gaussian processes as in [1]. Due to the presence of the denominator, the second event cannot be described by a Gaussian process and its analysis is a bit more involved.

Proof tricks Rice method : notation We start with some
notation. Let z(1) = (z(1) −fc , . . . , z(1) 0 , . . . , z(1) fc ), z(2) = (z(2) −fc , . . . , z(2) 0 , . . . , z(2) fc ) , be i.i.d Nn (0, Idn ) random vectors. Set, for any t ∈ [0, 1], X(t) = z(1) 0 + fc k=1 (z(1) k + z(1) −k ) cos(2πkt) + fc k=1 (z(2) −k − z(2) k ) sin(2πkt) , Y (t) = z(2) 0 + fc k=1 (z(2) k + z(2) −k ) cos(2πkt) + fc k=1 (z(1) k − z(1) −k ) sin(2πkt) , Z(t) = X(t) + ıY (t) . Then, note that σ0 Z ∞ d = F∗ n (ε) ∞ and sup t∈[0,1] |Z(t)| √ n( z(1) 2 2 + z(2) 2 2 )1 2 d = F∗ n (ε) ∞ √ n ε 2 , where σ0 > 0 is the (unknown) standard deviation of the noise ε.

Proof tricks For the Gaussian process Lemma 1. For a
complex valued centered Gaussian random vector ε as deﬁned in (??), it holds ∀u > 0, P { F∗ n (ε) ∞ > u} ≤ n exp − u2 4nσ2 0 , where σ0 > 0 denotes the noise level. Observe that X(t) and Y (t) are two independent stationary Gaussian processes with the same auto-covariance function Σ given by ∀t ∈ [0, 1], Σ(t) = 1 + 2 fc k=1 cos(2πkt) =: Dfc (t) , where Dfc denotes the Dirichlet kernel. Set σn 2 = var X(t) = Dfc (0) = n . (6)

Proof tricks For the Gaussian process We use the following
inequalities, for any u > 0, P{ Z ∞ > u} ≤ P{ X ∞ > u/ √ 2}+P{ Y ∞ > u/ √ 2} = 2P{ X ∞ > u/ √ 2} , (7) and (by symmetry of the process X) P{ X ∞ > u/ √ 2} ≤ 2P{ sup t∈[0,1] X(t) > u/ √ 2} . (8) To give bounds to the right hand side of (8), we use the Rice method (see Azais & Wschebor [page 93]) P{ sup t∈[0,1] X(t) > u/ √ 2} = P{∀t ∈ [0, 1]; X(t) > u/ √ 2} + P{U u/ √ 2 > 0} , ≤ 1 √ 2π +∞ u/( √ 2σn) exp − v2 2 d + E(U u/ √ 2 ) , where Uv is the number of up-crossings of the level v by the process X(t) on the interval [0, 1]. By the Rice formula E(U u/ √ 2 ) = 1 2π var X (t)) 1 σn exp − u2 4σn 2 , where var X (t) = −Σ (0) = 2(2π)2 fc k=1 k2 = 4π2 3 fc (fc + 1)n . (9)

Proof tricks For the Gaussian process A Chernoﬀ argument provides
for any w > 0, +∞ w exp(−v2/2)d/ √ 2π ≤ exp(−w2/2), which yields P sup t∈[0,1] X(t) > u √ 2 ≤ exp − u2 4n + fc (fc + 1) 3 exp − u2 4n ≤ n 2 exp − u2 4n . The result follows with (8).

Claire Boyer

Claire Boyer

More Decks by S³ Seminar

Other Decks in Research

Featured

Transcript