Slide 1


Spike deconvolution with unknown noise level
Claire Boyer, Yohann de Castro, Joseph Salmon
UPMC, LSTA, France
January 20, 2017

Slide 2


1. Introduction
2. The proposed approach
   - Results
   - Some words about the proof
3. Numerical aspects
4. Numerical experiments

Slide 3


Introduction – Motivations
Point sources in applications: astronomy, microscopy (fluorescent molecules), spectroscopy.

Slide 4


Introduction – Motivations
Point sources in applications: astronomy, microscopy (fluorescent molecules), spectroscopy.
Off-the-grid: point sources do not live on a Cartesian grid.

Slide 5


Introduction – What is spike deconvolution?
Setting. Aim: reconstruct a discrete measure $\mu_0 = \sum_{i=1}^{s} a_i^0 \delta_{t_i^0}$ with support $\mathrm{supp}(\mu_0) = \{t_1^0, \dots, t_s^0\} \subset \mathbb{T}$ and amplitudes $(a_i^0)_{1\le i\le s} \in \mathbb{C}^s$.

Slide 6


Introduction – What is spike deconvolution?
Setting. Aim: reconstruct a discrete measure $\mu_0 = \sum_{i=1}^{s} a_i^0 \delta_{t_i^0}$ with support $\mathrm{supp}(\mu_0) = \{t_1^0, \dots, t_s^0\} \subset \mathbb{T}$ and amplitudes $(a_i^0)_{1\le i\le s} \in \mathbb{C}^s$.
From linear measurements $y = \Phi\mu_0 + \varepsilon$:
- $\Phi$ is a filter;
- $\varepsilon$ is a complex Gaussian vector, $\varepsilon = \varepsilon^{(1)} + i\varepsilon^{(2)}$ with $\varepsilon^{(j)} \sim \mathcal{N}(0, \sigma_0^2\,\mathrm{Id})$.

Slide 7


Introduction – What is spike deconvolution?
Setting. Aim: reconstruct a discrete measure $\mu_0 = \sum_{i=1}^{s} a_i^0 \delta_{t_i^0}$ with support $\mathrm{supp}(\mu_0) = \{t_1^0, \dots, t_s^0\} \subset \mathbb{T}$ and amplitudes $(a_i^0)_{1\le i\le s} \in \mathbb{C}^s$.
From linear measurements $y = \Phi\mu_0 + \varepsilon$:
- $\Phi$ is a filter;
- $\varepsilon$ is a complex Gaussian vector, $\varepsilon = \varepsilon^{(1)} + i\varepsilon^{(2)}$ with $\varepsilon^{(j)} \sim \mathcal{N}(0, \sigma_0^2\,\mathrm{Id})$.
In this work: low-pass filter, $\Phi = \mathcal{F}_n$, where $\mathcal{F}_n(\mu) := (c_k(\mu))_{|k|\le f_c}$ and $c_k(\mu) := \int_{\mathbb{T}} \exp(-2\pi\imath k t)\,\mu(\mathrm{d}t)$.
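A minimal simulation sketch of this measurement model (illustrative only, not the authors' code; the spike locations, amplitudes, cut-off frequency and noise level below are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
fc = 20                                   # cut-off frequency (illustrative)
n = 2 * fc + 1                            # number of Fourier coefficients
k = np.arange(-fc, fc + 1)

t0 = np.array([0.21, 0.48, 0.83])         # spike locations in [0, 1)
a0 = np.array([1.0, -0.7 + 0.3j, 0.5j])   # complex amplitudes
sigma0 = 0.1                              # noise level (unknown in practice)

# c_k(mu_0) = sum_i a_i^0 * exp(-2 i pi k t_i^0), |k| <= fc
F_mu0 = np.exp(-2j * np.pi * np.outer(k, t0)) @ a0
eps = sigma0 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
y = F_mu0 + eps                           # observed low-pass measurements
```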

Slide 8


Introduction – How to reconstruct $\mu_0$?
$\ell_1$-deconvolution: Beurling lasso (Blasso)
$\min_{\mu}\ \tfrac{1}{2}\|y - \mathcal{F}_n(\mu)\|_2^2 + \lambda\|\mu\|_{TV}$
- $\|\mu\|_{TV} = \sup \sum_j |\mu(B_j)|$ over all finite partitions $(B_j)$ of $[0,1]$
- $\|\mu\|_{TV} = \sum_{i=1}^{s} |a_i|$ when the measure is discrete

Slide 9


Introduction – Beurling minimal extrapolation
Beurling minimal extrapolation (1938)
$\hat\mu \in \operatorname{argmin}_{\mu}\ \|\mu\|_{TV}$ s.t. $\int_{\mathbb{T}} \Phi\,\mathrm{d}\mu = y$, with $\Phi = (\varphi_1, \dots, \varphi_n)$.
- Infinite-dimensional variable $\mu$
- Finitely many constraints

Slide 10


Introduction – Beurling minimal extrapolation
Beurling minimal extrapolation (1938)
$\hat\mu \in \operatorname{argmin}_{\mu}\ \|\mu\|_{TV}$ s.t. $\int_{\mathbb{T}} \Phi\,\mathrm{d}\mu = y$, with $\Phi = (\varphi_1, \dots, \varphi_n)$.
- Infinite-dimensional variable $\mu$
- Finitely many constraints
Fenchel dual program
$\hat u \in \operatorname{argmax}_{u}\ \langle y, u\rangle$ s.t. $\big\|\sum_{k} u_k \varphi_k\big\|_\infty \le 1$.
- Finite-dimensional variable
- Infinitely many constraints
- Strong duality

Slide 11


Introduction – How to reconstruct $\mu_0$?
$\ell_1$-deconvolution: Beurling lasso (Blasso)
$\min_{\mu}\ \tfrac{1}{2}\|y - \mathcal{F}_n(\mu)\|_2^2 + \lambda\|\mu\|_{TV}$
- $\|\mu\|_{TV} = \sup \sum_j |\mu(B_j)|$ over all finite partitions $(B_j)$ of $[0,1]$
- $\|\mu\|_{TV} = \sum_{i=1}^{s} |a_i|$ when the measure is discrete
Algorithms:
- proximal-based [Bredies, Pikkarainen 2012]
- root-finding [Candès, Fernandez-Granda 2012]


Slide 13


Introduction – How to reconstruct $\mu_0$?
$\ell_1$-deconvolution: Beurling lasso (Blasso)
$\min_{\mu}\ \tfrac{1}{2}\|y - \mathcal{F}_n(\mu)\|_2^2 + \lambda\|\mu\|_{TV}$
- $\|\mu\|_{TV} = \sup \sum_j |\mu(B_j)|$ over all finite partitions $(B_j)$ of $[0,1]$
- $\|\mu\|_{TV} = \sum_{i=1}^{s} |a_i|$ when the measure is discrete
Algorithms:
- proximal-based [Bredies, Pikkarainen 2012]
- root-finding [Candès, Fernandez-Granda 2012]
Other methods, based on Prony's method: MUSIC, ESPRIT, FRI.

Slide 14


Introduction – How to reconstruct $\mu_0$?
$\ell_1$-deconvolution: Beurling lasso (Blasso)
$\min_{\mu}\ \tfrac{1}{2}\|y - \mathcal{F}_n(\mu)\|_2^2 + \lambda\|\mu\|_{TV}$
- $\|\mu\|_{TV} = \sup \sum_j |\mu(B_j)|$ over all finite partitions $(B_j)$ of $[0,1]$
- $\|\mu\|_{TV} = \sum_{i=1}^{s} |a_i|$ when the measure is discrete
Algorithms:
- proximal-based [Bredies, Pikkarainen 2012]
- root-finding [Candès, Fernandez-Granda 2012]
Other methods, based on Prony's method: MUSIC, ESPRIT, FRI.
Link with $\ell_1$-regularization?

Slide 15


Introduction – Link to compressed sensing
Spike deconvolution vs. compressed sensing:
- Setting: infinite-dimensional vs. finite-dimensional (or easy infinite-dimensional).
- Object of interest: a discrete measure $\mu = \sum_{i=1}^{s} a_i \delta_{t_i}$ vs. a (mainly) sparse signal $x \in \mathbb{C}^n$.
(illustrations from Carlos Fernandez-Granda)

Slide 16


Introduction – Link to compressed sensing
Spike deconvolution vs. compressed sensing:
- Setting: infinite-dimensional vs. finite-dimensional (or easy infinite-dimensional).
- Object of interest: a discrete measure $\mu = \sum_{i=1}^{s} a_i \delta_{t_i}$ vs. a (mainly) sparse signal $x \in \mathbb{C}^n$.
- Measurements: low-pass filter (spectrum extrapolation) vs. random measurements (spectrum interpolation).
(illustrations from Carlos Fernandez-Granda)

Slide 17


Introduction – Link to compressed sensing
Spike deconvolution vs. compressed sensing:
- Setting: infinite-dimensional vs. finite-dimensional (or easy infinite-dimensional).
- Object of interest: a discrete measure $\mu = \sum_{i=1}^{s} a_i \delta_{t_i}$ vs. a (mainly) sparse signal $x \in \mathbb{C}^n$.
- Measurements: low-pass filter (spectrum extrapolation) vs. random measurements (spectrum interpolation).
- Resolution: $\min_{\mu}\ \tfrac12\|y - \mathcal{F}_n(\mu)\|_2^2 + \lambda\|\mu\|_{TV}$ (Beurling Lasso estimator) vs. $\min_{x}\ \tfrac12\|y - F_{rd}(x)\|_2^2 + \lambda\|x\|_1$ (Lasso).
- Key feature: minimum separation vs. measurement incoherence + degree of sparsity.
(illustrations from Carlos Fernandez-Granda)

Slide 18


Introduction – Link to compressed sensing
Spike deconvolution vs. compressed sensing:
- Setting: infinite-dimensional vs. finite-dimensional (or easy infinite-dimensional).
- Object of interest: a discrete measure $\mu = \sum_{i=1}^{s} a_i \delta_{t_i}$ vs. a (mainly) sparse signal $x \in \mathbb{C}^n$.
- Measurements: low-pass filter (spectrum extrapolation) vs. random measurements (spectrum interpolation).
- Resolution: $\min_{\mu}\ \tfrac12\|y - \mathcal{F}_n(\mu)\|_2^2 + \lambda\|\mu\|_{TV}$ (Beurling Lasso estimator) vs. $\min_{x}\ \tfrac12\|y - F_{rd}(x)\|_2^2 + \lambda\|x\|_1$ (Lasso).
- Key feature: minimum separation vs. measurement incoherence + degree of sparsity.
- Algorithm: root-finding (...) vs. easy convex programming.
(illustrations from Carlos Fernandez-Granda)

Slide 19


Introduction – Contributions
- Handling unknown noise level

Slide 20


Introduction – Contributions
- Handling unknown noise level
- Assessing the noise level using the Rice method for a non-Gaussian process

Slide 21


Introduction – Contributions
- Handling unknown noise level
- Assessing the noise level using the Rice method for a non-Gaussian process
- Prediction & strong localization accuracy

Slide 22


Introduction – Contributions
- Handling unknown noise level
- Assessing the noise level using the Rice method for a non-Gaussian process
- Prediction & strong localization accuracy
- Numerics: the root-finding step can always be performed (answering the open question of [1, Eq. (4.5)])
[1] E. J. Candès and C. Fernandez-Granda. Towards a mathematical theory of super-resolution. Communications on Pure and Applied Mathematics, 67(6):906–956, 2014.

Slide 23


Introduction – Contributions
- Handling unknown noise level
- Assessing the noise level using the Rice method for a non-Gaussian process
- Prediction & strong localization accuracy
- Numerics: the root-finding step can always be performed (answering the open question of [1, Eq. (4.5)])
Claire Boyer, Yohann de Castro, and Joseph Salmon. Adapting to unknown noise level in sparse deconvolution. Information and Inference, abs/1606.04760, 2016.
[1] E. J. Candès and C. Fernandez-Granda. Towards a mathematical theory of super-resolution. Communications on Pure and Applied Mathematics, 67(6):906–956, 2014.

Slide 24


1. Introduction
2. The proposed approach
   - Results
   - Some words about the proof
3. Numerical aspects
4. Numerical experiments

Slide 25


The proposed approach – The proposed method
Data: $y = \mathcal{F}_n\mu_0 + \varepsilon$, $y \in \mathbb{C}^n$ with $n = 2f_c + 1$; $\varepsilon$ is a complex Gaussian vector, $\varepsilon = \varepsilon^{(1)} + i\varepsilon^{(2)}$ with $\varepsilon^{(j)} \sim \mathcal{N}(0, \sigma_0^2\,\mathrm{Id})$.

Slide 26


The proposed approach – The proposed method
Data: $y = \mathcal{F}_n\mu_0 + \varepsilon$, $y \in \mathbb{C}^n$ with $n = 2f_c + 1$; $\varepsilon$ is a complex Gaussian vector, $\varepsilon = \varepsilon^{(1)} + i\varepsilon^{(2)}$ with $\varepsilon^{(j)} \sim \mathcal{N}(0, \sigma_0^2\,\mathrm{Id})$.
Handling unknown noise level: CBLasso
$(\hat\mu, \hat\sigma) \in \operatorname{argmin}_{(\mu,\sigma)\in\mathcal{E}^*\times\mathbb{R}_{++}}\ \frac{1}{2n\sigma}\|y - \mathcal{F}_n(\mu)\|_2^2 + \frac{\sigma}{2} + \lambda\|\mu\|_{TV}$.
Related work: Owen 07, Antoniadis 10, Belloni, Chernozhukov and Wang 11, Sun and Zhang 12, van de Geer 15 (square-root lasso and scaled lasso).

Slide 27


The proposed approach – Where is the square-root?
Why square-root? The CBLasso reads ([2] ~ scaled lasso)
$(\hat\mu, \hat\sigma) \in \operatorname{argmin}_{(\mu,\sigma)\in\mathcal{E}^*\times\mathbb{R}_{++}}\ \frac{1}{2n\sigma}\|y - \mathcal{F}_n(\mu)\|_2^2 + \frac{\sigma}{2} + \lambda\|\mu\|_{TV}$.
When the solution is reached for $\hat\sigma > 0$ ([1] ~ square-root lasso):
$\hat\sigma = \|y - \mathcal{F}_n(\hat\mu)\|_2/\sqrt{n}$ and $\hat\mu \in \operatorname{argmin}_{\mu\in\mathcal{E}^*}\ \|y - \mathcal{F}_n(\mu)\|_2/\sqrt{n} + \lambda\|\mu\|_{TV}$.
[1] A. Belloni, V. Chernozhukov, and L. Wang. Square-root Lasso: Pivotal recovery of sparse signals via conic programming. Biometrika, 98(4):791–806, 2011.
[2] T. Sun and C.-H. Zhang. Scaled sparse linear regression. Biometrika, 99(4):879–898, 2012.
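As a short check of the equivalence claimed above (elementary calculus, not spelled out on the slide): minimizing the CBLasso objective in $\sigma$ alone, for a fixed $\mu$, recovers the square-root form.

```latex
% Partial minimization in \sigma, for fixed \mu and r := y - \mathcal{F}_n(\mu):
\frac{\mathrm{d}}{\mathrm{d}\sigma}\Big[\frac{\|r\|_2^2}{2n\sigma}+\frac{\sigma}{2}\Big]
  = -\frac{\|r\|_2^2}{2n\sigma^2}+\frac{1}{2} = 0
  \;\Longrightarrow\; \sigma^\star = \frac{\|r\|_2}{\sqrt{n}},
\qquad
\frac{\|r\|_2^2}{2n\sigma^\star}+\frac{\sigma^\star}{2} = \frac{\|r\|_2}{\sqrt{n}} .
```

Plugging $\sigma^\star$ back shows that the objective, partially minimized in $\sigma$, is exactly $\|y - \mathcal{F}_n(\mu)\|_2/\sqrt{n} + \lambda\|\mu\|_{TV}$, i.e., the square-root (Beurling) lasso displayed above.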

Slide 28


The proposed approach – Compatibility limits
Sufficient conditions for oracle inequalities: RIP $\Rightarrow$ REC $\Rightarrow$ Compatibility.
Compatibility condition: $C(L, S) > 0$, where
$C(L, S) := \inf\big\{\, |S|\,\|\mathcal{F}_n(\nu)\|_2^2/n \ :\ \nu \text{ with } \|\nu_S\|_{TV} = 1,\ \|\nu_{S^c}\|_{TV} \le L \,\big\}$.
Consider $\nu$ as a difference of two measures: $\hat\mu$ and $\mu_0$.

Slide 29


The proposed approach – Compatibility limits
Sufficient conditions for oracle inequalities: RIP $\Rightarrow$ REC $\Rightarrow$ Compatibility.
Compatibility condition: $C(L, S) > 0$, where
$C(L, S) := \inf\big\{\, |S|\,\|\mathcal{F}_n(\nu)\|_2^2/n \ :\ \nu \text{ with } \|\nu_S\|_{TV} = 1,\ \|\nu_{S^c}\|_{TV} \le L \,\big\}$.
Consider $\nu$ as a difference of two measures: $\hat\mu$ and $\mu_0$.
Compatibility does not hold: take $\nu_\varepsilon = \delta_{-\varepsilon} - \delta_{\varepsilon}$. Then Compatibility fails, hence REC and RIP fail as well.
- Highly coherent designs: close Dirac masses share almost the same Fourier coefficients.
- Non-uniform approach: measure-dependent reconstruction.
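A quick numeric illustration (not from the slides; parameters are arbitrary) of why compatibility degenerates here: the Fourier coefficients of two close Dirac masses nearly cancel, so $\|\mathcal{F}_n(\delta_{-\varepsilon} - \delta_{\varepsilon})\|_2^2/n \to 0$ as $\varepsilon \to 0$.

```python
import numpy as np

fc = 20
n = 2 * fc + 1
k = np.arange(-fc, fc + 1)

for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    # c_k(delta_t) = exp(-2 i pi k t), so for nu = delta_{-eps} - delta_{eps}:
    ck = np.exp(2j * np.pi * k * eps) - np.exp(-2j * np.pi * k * eps)
    print(eps, np.linalg.norm(ck) ** 2 / n)   # tends to 0 with eps
```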

Slide 30


The proposed approach – Results
Standard assumptions
Assumption (Sampling rate condition). $\lambda\,\mathrm{SNR} \le \frac{\sqrt{17}-4}{2} \approx 0.0616$, with $\mathrm{SNR} := \frac{\|\mu_0\|_{TV}}{\sqrt{\mathbb{E}[\|\varepsilon\|_2^2]/n}} = \frac{\|\mu_0\|_{TV}}{\sqrt{2}\,\sigma_0}$.
With $\lambda \ge 2\sqrt{2\log n/n}$, this amounts to $n/\log n \ge C\,\mathrm{SNR}^2$.

Slide 31


The proposed approach – Results
Standard assumptions
Assumption (Sampling rate condition). $\lambda\,\mathrm{SNR} \le \frac{\sqrt{17}-4}{2} \approx 0.0616$, with $\mathrm{SNR} := \frac{\|\mu_0\|_{TV}}{\sqrt{\mathbb{E}[\|\varepsilon\|_2^2]/n}} = \frac{\|\mu_0\|_{TV}}{\sqrt{2}\,\sigma_0}$.
With $\lambda \ge 2\sqrt{2\log n/n}$, this amounts to $n/\log n \ge C\,\mathrm{SNR}^2$.
Assumption (No overfitting). $\hat\sigma > 0$.

Slide 32


The proposed approach – Results
Standard assumptions
Assumption (Sampling rate condition). $\lambda\,\mathrm{SNR} \le \frac{\sqrt{17}-4}{2} \approx 0.0616$, with $\mathrm{SNR} := \frac{\|\mu_0\|_{TV}}{\sqrt{\mathbb{E}[\|\varepsilon\|_2^2]/n}} = \frac{\|\mu_0\|_{TV}}{\sqrt{2}\,\sigma_0}$.
With $\lambda \ge 2\sqrt{2\log n/n}$, this amounts to $n/\log n \ge C\,\mathrm{SNR}^2$.
Assumption (No overfitting). $\hat\sigma > 0$.
Assumption (Separation condition). For $\mathrm{supp}(\mu_0) = \{t_1^0, \dots, t_{s_0}^0\}$: $\forall i \neq j$, $d(t_i^0, t_j^0) \ge 1.26/f_c$.

Slide 33


The proposed approach – Results
Prediction result
Theorem (B., de Castro, Salmon, 2016). Let $C > 2\sqrt{2}$ and let $C' > 0$ be a constant that may depend on $C$. Assume the sampling rate condition and the separation condition. The estimator $\hat\mu$, solution to the CBLasso with a choice $\lambda \ge C\sqrt{\log n/n}$, satisfies
$\frac{1}{n}\|\mathcal{F}_n(\hat\mu - \mu_0)\|_2^2 \le C'\, s_0\, \lambda^2\, \sigma_0^2$,
with high probability.

Slide 34


The proposed approach – Results
Prediction result
Theorem (B., de Castro, Salmon, 2016). Let $C > 2\sqrt{2}$ and let $C' > 0$ be a constant that may depend on $C$. Assume the sampling rate condition and the separation condition. The estimator $\hat\mu$, solution to the CBLasso with a choice $\lambda \ge C\sqrt{\log n/n}$, satisfies
$\frac{1}{n}\|\mathcal{F}_n(\hat\mu - \mu_0)\|_2^2 \le C'\, s_0\, \lambda^2\, \sigma_0^2 = O\Big(\frac{s_0\,\sigma_0^2\,\log n}{n}\Big)$,
with high probability: the "fast rate of convergence" of [1].
[1] G. Tang, B. N. Bhaskar, and B. Recht. Near minimax line spectral estimation. IEEE Transactions on Information Theory, 61(1):499–512, 2015.

Slide 35


The proposed approach – Results
Localization results
Near regions: $\forall j \in [s_0]$, $N_j := \{\, t \in [0,1] : d(t, t_j^0) \le c_1/f_c \,\}$, with $0 < c_1 < 1.26/2$.
Far region: $F := [0,1] \setminus \bigcup_{j\in[s_0]} N_j$.

Slide 36


The proposed approach – Results
Localization results
Near regions: $\forall j \in [s_0]$, $N_j := \{\, t \in [0,1] : d(t, t_j^0) \le c_1/f_c \,\}$, with $0 < c_1 < 1.26/2$.
Far region: $F := [0,1] \setminus \bigcup_{j\in[s_0]} N_j$.
Theorem (B., de Castro, Salmon, 2016). Let $C > 2\sqrt{2}$ and let $C' > 0$ be a constant that may depend on $C$. The estimator $\hat\mu$, solution to the CBLasso with a choice $\lambda \ge C\sqrt{\log n/n}$, satisfies
1. $\forall j \in [s_0]$, $\big|\, a_j^0 - \sum_{k:\,\hat t_k\in N_j} \hat a_k \,\big| \le C'\,\sigma_0\, s_0\,\sqrt{\log n/n}$,

Slide 37


The proposed approach – Results
Localization results
Near regions: $\forall j \in [s_0]$, $N_j := \{\, t \in [0,1] : d(t, t_j^0) \le c_1/f_c \,\}$, with $0 < c_1 < 1.26/2$.
Far region: $F := [0,1] \setminus \bigcup_{j\in[s_0]} N_j$.
Theorem (B., de Castro, Salmon, 2016). Let $C > 2\sqrt{2}$ and let $C' > 0$ be a constant that may depend on $C$. The estimator $\hat\mu$, solution to the CBLasso with a choice $\lambda \ge C\sqrt{\log n/n}$, satisfies
1. $\forall j \in [s_0]$, $\big|\, a_j^0 - \sum_{k:\,\hat t_k\in N_j} \hat a_k \,\big| \le C'\,\sigma_0\, s_0\,\sqrt{\log n/n}$,
2. $\forall j \in [s_0]$, $\sum_{k:\,\hat t_k\in N_j} |\hat a_k|\, d^2(t_j^0, \hat t_k) \le C'\,\sigma_0\, s_0\,\sqrt{\log n/n}\,/\,n^2$,

Slide 38


The proposed approach – Results
Localization results
Near regions: $\forall j \in [s_0]$, $N_j := \{\, t \in [0,1] : d(t, t_j^0) \le c_1/f_c \,\}$, with $0 < c_1 < 1.26/2$.
Far region: $F := [0,1] \setminus \bigcup_{j\in[s_0]} N_j$.
Theorem (B., de Castro, Salmon, 2016). Let $C > 2\sqrt{2}$ and let $C' > 0$ be a constant that may depend on $C$. The estimator $\hat\mu$, solution to the CBLasso with a choice $\lambda \ge C\sqrt{\log n/n}$, satisfies
1. $\forall j \in [s_0]$, $\big|\, a_j^0 - \sum_{k:\,\hat t_k\in N_j} \hat a_k \,\big| \le C'\,\sigma_0\, s_0\,\sqrt{\log n/n}$,
2. $\forall j \in [s_0]$, $\sum_{k:\,\hat t_k\in N_j} |\hat a_k|\, d^2(t_j^0, \hat t_k) \le C'\,\sigma_0\, s_0\,\sqrt{\log n/n}\,/\,n^2$,
3. $\sum_{k:\,\hat t_k\in F} |\hat a_k| \le C'\,\sigma_0\, s_0\,\sqrt{\log n/n}$,
with high probability.

Slide 39


The proposed approach – Results
Localization results
Corollary (B., de Castro, Salmon, 2016). For any $t_j^0$ in the support of $\mu_0$ such that $|a_j^0| > C'\,\sigma_0\, s_0\,\lambda$, there exists an element $\hat t_k$ in the support of $\hat\mu$ such that
$d(t_j^0, \hat t_k) \le \frac{C'\,\sigma_0\, s_0\,\lambda}{|a_j^0| - C'\,\sigma_0\, s_0\,\lambda}\cdot\frac{1}{n}$,
with high probability. This bound is independent of the magnitudes of the other spikes.

Slide 40


The proposed approach – Results
Noise level estimation
Proposition (B., de Castro, Salmon, 2016). Under the sampling rate assumption, if $n > -8\log(\alpha)$, then
$\Big|\frac{\sqrt{n}\,\hat\sigma}{\|\varepsilon\|_2} - 1\Big| \le \frac{1}{2}$,
with probability larger than $1 - \alpha\big(2\sqrt{2n} + \frac{2\sqrt{3}+3}{3}\big)$.

Slide 41


The proposed approach – Some words about the proof
Sketch of proof
KKT condition:
$\frac{1}{n}\mathcal{F}_n^*(y - \mathcal{F}_n(\hat\mu)) = \hat\sigma\lambda\hat p$, with $\|\hat p\|_\infty \le 1$ and $\int_{\mathbb{T}} \hat p(t)\,\hat\mu(\mathrm{d}t) = \|\hat\mu\|_{TV}$
($\hat p$ is a subgradient of $\|\cdot\|_{TV}$ at $\hat\mu$).


Slide 43


The proposed approach – Some words about the proof
Sketch of proof
KKT condition:
$\frac{1}{n}\mathcal{F}_n^*(y - \mathcal{F}_n(\hat\mu)) = \hat\sigma\lambda\hat p$, with $\|\hat p\|_\infty \le 1$ and $\int_{\mathbb{T}} \hat p(t)\,\hat\mu(\mathrm{d}t) = \|\hat\mu\|_{TV}$
($\hat p$ is a subgradient of $\|\cdot\|_{TV}$ at $\hat\mu$).
Dual certificate: construction of $q(t) = \sum_{k=-f_c}^{f_c} c_k e^{2i\pi k t}$ satisfying
- $q(t_j) = v_j^0 := a_j^0/|a_j^0|$ for all $t_j \in \mathrm{supp}\,\mu_0$,
- $|q(t)| < 1$ for all $t \notin \mathrm{supp}\,\mu_0$.

Slide 44


The proposed approach – Some words about the proof
Sketch of proof
KKT condition:
$\frac{1}{n}\mathcal{F}_n^*(y - \mathcal{F}_n(\hat\mu)) = \hat\sigma\lambda\hat p$, with $\|\hat p\|_\infty \le 1$ and $\int_{\mathbb{T}} \hat p(t)\,\hat\mu(\mathrm{d}t) = \|\hat\mu\|_{TV}$
($\hat p$ is a subgradient of $\|\cdot\|_{TV}$ at $\hat\mu$).
Dual certificate: construction of $q(t) = \sum_{k=-f_c}^{f_c} c_k e^{2i\pi k t}$ satisfying
- $q(t_j) = v_j^0 := a_j^0/|a_j^0|$ for all $t_j \in \mathrm{supp}\,\mu_0$,
- $q'(t_j) = 0$,
- $|q(t)| < 1$ for all $t \notin \mathrm{supp}\,\mu_0$.

Slide 45


The proposed approach – Some words about the proof
Sketch of proof
Spike localization: proofs are based on the work of [1, 2, 3], amended by the Rice method.
Prediction: adapt the proof of [3] to our setting; Rice method for a non-Gaussian process.
[1] J.-M. Azaïs, Y. De Castro, and F. Gamboa. Spike detection from inaccurate samplings. Applied and Computational Harmonic Analysis, 38(2):177–195, 2015.
[2] C. Fernandez-Granda. Support detection in super-resolution. arXiv preprint arXiv:1302.3921, 2013.
[3] G. Tang, B. N. Bhaskar, and B. Recht. Near minimax line spectral estimation. IEEE Transactions on Information Theory, 61(1):499–512, 2015.

Slide 46


1. Introduction
2. The proposed approach
   - Results
   - Some words about the proof
3. Numerical aspects
4. Numerical experiments

Slide 47


Numerical aspects – Primal-dual
Proposition. Denote $D_n = \{\, c \in \mathbb{C}^n : \|\mathcal{F}_n^*(c)\|_\infty \le 1,\ n\lambda^2\|c\|_2^2 \le 1 \,\}$. The dual formulation of the CBLasso reads
$\hat c \in \operatorname{argmax}_{c\in D_n}\ \lambda\langle y, c\rangle$.   (1)
Then we have the link equation between the primal and dual solutions,
$y = n\lambda\hat\sigma\,\hat c + \mathcal{F}_n(\hat\mu)$,   (2)
as well as a link between the coefficients and the polynomial,
$\mathcal{F}_n^*(\hat c) = \hat p$.   (3)

Slide 48


Numerical aspects – Primal-dual
Proposition. Denote $D_n = \{\, c \in \mathbb{C}^n : \|\mathcal{F}_n^*(c)\|_\infty \le 1,\ n\lambda^2\|c\|_2^2 \le 1 \,\}$. The dual formulation of the CBLasso reads
$\hat c \in \operatorname{argmax}_{c\in D_n}\ \lambda\langle y, c\rangle$.   (1)
Then we have the link equation between the primal and dual solutions,
$y = n\lambda\hat\sigma\,\hat c + \mathcal{F}_n(\hat\mu)$,   (2)
as well as a link between the coefficients and the polynomial,
$\mathcal{F}_n^*(\hat c) = \hat p$.   (3)
If the associated dual polynomial $\hat p$ is not constant:
- the support of $\hat\mu$ is finite;
- the support of $\hat\mu$ is included in the set of roots of the derivative of $|\hat p|^2$, i.e., in the points where the polynomial saturates at 1;
- the proof based on a dual certificate construction is possible.
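A minimal sketch (a hypothetical helper, not the authors' released code) of how the support can be read off the dual polynomial once its coefficients $\hat c$ are known: $1 - |\hat p|^2$ is a trigonometric polynomial of degree $2f_c$, and multiplying by $z^{2f_c}$ with $z = e^{2\pi\imath t}$ turns $1 - |\hat p(t)|^2 = 0$ into an algebraic root-finding problem on the unit circle.

```python
import numpy as np

def support_from_dual(c_hat, tol=1e-3):
    """Locations t where |p_hat(t)| saturates at 1, p_hat(t) = sum_k c_k e^{2 i pi k t}."""
    n = c_hat.size
    fc = (n - 1) // 2
    # Coefficients of |p_hat|^2: autocorrelation of c_hat, frequencies -2fc..2fc.
    auto = np.correlate(c_hat, c_hat, mode="full")
    coeffs = -auto
    coeffs[2 * fc] += 1.0                      # coefficients of 1 - |p_hat|^2
    # Roots of sum_m coeffs[m] z^m (np.roots expects highest degree first).
    roots = np.roots(coeffs[::-1])
    on_circle = roots[np.abs(np.abs(roots) - 1.0) < tol]
    t_hat = np.mod(np.angle(on_circle) / (2 * np.pi), 1.0)
    # Saturation points appear as double roots; merge near-duplicates.
    return np.unique(np.round(np.sort(t_hat), 6))
```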

Slide 49


Numerical aspects – Primal-dual: comparison with the Blasso
Dual of the Blasso:
$\hat c \in \operatorname{argmax}_{\|\mathcal{F}_n^*(c)\|_\infty\le 1}\ \langle y, c\rangle$.
Dual of the CBLasso:
$\hat c \in \operatorname{argmax}_{\|\mathcal{F}_n^*(c)\|_\infty\le 1,\ n\lambda^2\|c\|_2^2\le 1}\ \lambda\langle y, c\rangle$.

Slide 50


Numerical aspects – Primal-dual: comparison with the Blasso
Dual of the Blasso:
$\hat c \in \operatorname{argmax}_{\|\mathcal{F}_n^*(c)\|_\infty\le 1}\ \langle y, c\rangle$.
Open question (Blasso): when is the dual polynomial non-constant?
Dual of the CBLasso:
$\hat c \in \operatorname{argmax}_{\|\mathcal{F}_n^*(c)\|_\infty\le 1,\ n\lambda^2\|c\|_2^2\le 1}\ \lambda\langle y, c\rangle$.
For the CBLasso: we answer this question.

Slide 51


Numerical aspects – Showing that the dual polynomial is non-constant
If $|\hat p|$ is constant, then $\hat p = v\varphi_k$ with $v \in \mathbb{C}$ and $\varphi_k(\cdot) = \exp(2\pi\imath k\,\cdot)$ for some $-f_c \le k \le f_c$.
- If $|v| < 1$: using Hölder's inequality on $\int_{\mathbb{T}} \hat p(t)\,\hat\mu(\mathrm{d}t) = \|\hat\mu\|_{TV}$ leads to $\hat\mu = 0$.
- If $|v| = 1$: since $\hat c \in D_n$, we have $\|\hat c\|_2 \le 1/(\sqrt{n}\lambda)$, leading to $|v| \le 1/(\sqrt{n}\lambda)$. Since $\lambda > 2\sqrt{\log n}/\sqrt{n}$, this gives $|v| < 1$, which contradicts $|v| = 1$.
One can then conclude that a dual polynomial of constant modulus never occurs in the CBLasso setup.

Slide 52


Numerical aspects – Discussion on the value of λ

Slide 53


Numerical aspects – SDP formulation of the CBLasso
One can represent the dual feasible set $D_n$ through SDP conditions. The dual problem can be cast as follows:
$\max_{c\in\mathbb{C}^n}\ \lambda\langle y, c\rangle$ such that there exists $Q \in \mathbb{C}^{n\times n}$ with
$\begin{pmatrix} Q & c \\ c^* & 1 \end{pmatrix} \succeq 0$ and $\sum_{i=1}^{n-j} Q_{i,i+j} = \begin{cases} 1 & \text{if } j = 0,\\ 0 & \text{if } j = 1,\dots,n-1.\end{cases}$

Slide 54


Numerical aspects – SDP formulation of the CBLasso
One can represent the dual feasible set $D_n$ through SDP conditions. The dual problem can be cast as follows:
$\max_{c\in\mathbb{C}^n}\ \lambda\langle y, c\rangle$ such that there exists $Q \in \mathbb{C}^{n\times n}$ with
$\begin{pmatrix} Q & c \\ c^* & 1 \end{pmatrix} \succeq 0$ and $\sum_{i=1}^{n-j} Q_{i,i+j} = \begin{cases} 1 & \text{if } j = 0,\\ 0 & \text{if } j = 1,\dots,n-1,\end{cases}$
together with
$\begin{pmatrix} I_n & \lambda\sqrt{n}\,c \\ \lambda\sqrt{n}\,c^* & 1 \end{pmatrix} \succeq 0$,
which, by a Schur complement, encodes the additional constraint $n\lambda^2\|c\|_2^2 \le 1$.
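A minimal sketch of this SDP in Python with cvxpy (the slides use the cvx Matlab toolbox; this translation and all variable names are my own, and it assumes cvxpy with an SDP-capable solver such as SCS is installed):

```python
import numpy as np
import cvxpy as cp

def cblasso_dual_sdp(y, lam):
    """Solve the SDP form of the CBLasso dual; returns the dual coefficients c_hat."""
    n = y.shape[0]
    c = cp.Variable(n, complex=True)
    cons = []

    # [[Q, c], [c*, 1]] >> 0 with the Toeplitz-trace constraints on Q
    # encodes ||F_n^*(c)||_inf <= 1.
    M1 = cp.Variable((n + 1, n + 1), hermitian=True)
    cons += [M1 >> 0, M1[:n, n] == c, M1[n, n] == 1]
    for j in range(n):
        cons.append(sum(M1[i, i + j] for i in range(n - j)) == (1.0 if j == 0 else 0.0))

    # [[I, lam*sqrt(n)*c], [lam*sqrt(n)*c*, 1]] >> 0 encodes
    # n * lam^2 * ||c||_2^2 <= 1 via a Schur complement.
    M2 = cp.Variable((n + 1, n + 1), hermitian=True)
    cons += [M2 >> 0, M2[:n, :n] == np.eye(n),
             M2[:n, n] == lam * np.sqrt(n) * c, M2[n, n] == 1]

    prob = cp.Problem(cp.Maximize(lam * cp.real(np.conj(y) @ c)), cons)
    prob.solve(solver=cp.SCS)
    return c.value
```

From $\hat c$, the link equations above give the dual polynomial $\hat p = \mathcal{F}_n^*(\hat c)$.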

Slide 55


Numerical aspects – Proposed algorithm
Algorithm. Given the data $y \in \mathbb{C}^n$:
1. solve the dual problem to find the coefficients $\hat c$ of the dual polynomial $\hat p$ (cvx Matlab toolbox);

Slide 56


Numerical aspects – Proposed algorithm
Algorithm. Given the data $y \in \mathbb{C}^n$:
1. solve the dual problem to find the coefficients $\hat c$ of the dual polynomial $\hat p$ (cvx Matlab toolbox);
2. identify $\mathrm{supp}(\hat\mu) = \{\hat t_1, \dots, \hat t_{\hat s}\}$ using the roots of $1 - |\hat p|^2$;

Slide 57


Numerical aspects – Proposed algorithm
Algorithm. Given the data $y \in \mathbb{C}^n$:
1. solve the dual problem to find the coefficients $\hat c$ of the dual polynomial $\hat p$ (cvx Matlab toolbox);
2. identify $\mathrm{supp}(\hat\mu) = \{\hat t_1, \dots, \hat t_{\hat s}\}$ using the roots of $1 - |\hat p|^2$;
3. fix the design matrix $X \in \mathbb{C}^{n\times\hat s}$, defined by $X_{k,j} = \varphi_k(\hat t_j)$;

Slide 58


Numerical aspects – Proposed algorithm
Algorithm. Given the data $y \in \mathbb{C}^n$:
1. solve the dual problem to find the coefficients $\hat c$ of the dual polynomial $\hat p$ (cvx Matlab toolbox);
2. identify $\mathrm{supp}(\hat\mu) = \{\hat t_1, \dots, \hat t_{\hat s}\}$ using the roots of $1 - |\hat p|^2$;
3. fix the design matrix $X \in \mathbb{C}^{n\times\hat s}$, defined by $X_{k,j} = \varphi_k(\hat t_j)$;
4. solve the following finite-dimensional problem:
$(\hat a, \hat\sigma) \in \operatorname{argmin}_{(a,\sigma)\in\mathbb{C}^{\hat s}\times\mathbb{R}_{++}}\ \frac{1}{2n\sigma}\|y - Xa\|_2^2 + \frac{\sigma}{2} + \lambda\|a\|_1$,

Slide 59


Numerical aspects – Proposed algorithm
Algorithm. Given the data $y \in \mathbb{C}^n$:
1. solve the dual problem to find the coefficients $\hat c$ of the dual polynomial $\hat p$ (cvx Matlab toolbox);
2. identify $\mathrm{supp}(\hat\mu) = \{\hat t_1, \dots, \hat t_{\hat s}\}$ using the roots of $1 - |\hat p|^2$;
3. fix the design matrix $X \in \mathbb{C}^{n\times\hat s}$, defined by $X_{k,j} = \varphi_k(\hat t_j)$;
4. solve the following finite-dimensional problem:
$(\hat a, \hat\sigma) \in \operatorname{argmin}_{(a,\sigma)\in\mathbb{C}^{\hat s}\times\mathbb{R}_{++}}\ \frac{1}{2n\sigma}\|y - Xa\|_2^2 + \frac{\sigma}{2} + \lambda\|a\|_1$,
as follows: for an initial value of $\hat\sigma$, alternate the following steps until some stopping criterion is met:
- solve the previous problem for a fixed $\hat\sigma$ to compute $\hat a$: $\hat a = X^{+}y - \lambda\hat\sigma\,(X^*X)^{-1}\operatorname{sign}(X^*\hat c)$;
- update $\hat\sigma = \|y - X\hat a\|_2/\sqrt{n}$ using the new value of $\hat a$.
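A minimal sketch of step 4 (not the authors' released code; it assumes the design matrix X and the dual coefficients c_hat are available from the earlier steps, and mirrors the closed-form update displayed above, whose exact scaling should be checked against the paper):

```python
import numpy as np

def amplitudes_and_sigma(y, X, c_hat, lam, n_iter=100, tol=1e-10):
    """Alternate the closed-form amplitude update and the residual-based sigma update."""
    n = y.shape[0]
    z = X.conj().T @ c_hat
    signs = z / np.maximum(np.abs(z), 1e-12)      # entrywise phases, i.e. sign(X* c_hat)
    gram_inv = np.linalg.inv(X.conj().T @ X)
    X_pinv_y = np.linalg.pinv(X) @ y
    sigma = np.linalg.norm(y) / np.sqrt(n)        # crude initialization
    a = X_pinv_y
    for _ in range(n_iter):
        a = X_pinv_y - lam * sigma * gram_inv @ signs   # amplitude update (slide formula)
        sigma_new = np.linalg.norm(y - X @ a) / np.sqrt(n)
        if abs(sigma_new - sigma) < tol:
            sigma = sigma_new
            break
        sigma = sigma_new
    return a, sigma
```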

Slide 60


1. Introduction
2. The proposed approach
   - Results
   - Some words about the proof
3. Numerical aspects
4. Numerical experiments

Slide 61


Numerical experiments – Source code
Code available at https://github.com/claireBoyer/CBLasso and http://www.lsta.upmc.fr/boyer/

Slide 62


Numerical experiments – Measure reconstruction
[Figure] Reconstruction of a discrete measure. The original measure $\mu_0$ is composed of 3 spikes (in black); the measure $\hat\mu$ reconstructed using the proposed CBLasso (in blue); for comparison, the measure reconstructed using the BLasso (in red).

Slide 63


Numerical experiments – Noise level estimation
$\varepsilon = \varepsilon^{(1)} + i\varepsilon^{(2)}$ with $\varepsilon^{(j)} \sim \mathcal{N}(0, \sigma_0^2\,\mathrm{Id})$ and $\sigma_0 = 1/\sqrt{2}$.

Slide 64


Numerical experiments – Noise level estimation
$\varepsilon = \varepsilon^{(1)} + i\varepsilon^{(2)}$ with $\varepsilon^{(j)} \sim \mathcal{N}(0, \sigma_0^2\,\mathrm{Id})$ and $\sigma_0 = 1/\sqrt{2}$.
[Figure] Boxplots of $\hat\sigma$ over 100 runs: the CBLasso gives consistent estimates of $\sqrt{2}\sigma_0 = 1$. We compare our method to $\hat\sigma_{\mathrm{BLasso}} = \|y - \mathcal{F}_n(\hat\mu_{\mathrm{BLasso}})\|_2/\sqrt{n - \hat s_{\mathrm{BLasso}}}$ proposed in [1], where $\hat\mu_{\mathrm{BLasso}}$ is the measure reconstructed via the BLasso, supported on $\hat s_{\mathrm{BLasso}}$ spikes.
[1] S. Reid, R. Tibshirani, and J. Friedman. A study of error variance estimation in lasso regression. arXiv preprint arXiv:1311.5274, 2014.

Slide 65


Numerical experiments – Bias
Noise level estimation: $\big|\frac{\sqrt{n}\,\hat\sigma}{\|\varepsilon\|_2} - 1\big| \le \frac{1}{2}$ with high probability.
Bias: $\hat\sigma \approx \sqrt{2}\sigma_0 \times \frac{\mathbb{E}\|g\|_2}{\sqrt{2n}} = \sqrt{2}\sigma_0 \times \frac{\Gamma(n + 1/2)}{\sqrt{n}\,\Gamma(n)} \to \sqrt{2}\sigma_0$ (with $g$ a standard Gaussian vector of $\mathbb{R}^{2n}$), showing that $\hat\sigma/\sqrt{2}$ is a consistent estimator of $\sigma_0$.
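A quick numeric check (illustrative) that the bias factor tends to 1, so $\hat\sigma/\sqrt{2}$ is asymptotically unbiased for $\sigma_0$:

```python
import numpy as np
from scipy.special import gammaln

for n in (10, 100, 1000, 10000):
    # Gamma(n + 1/2) / (sqrt(n) * Gamma(n)), computed on the log scale for stability
    factor = np.exp(gammaln(n + 0.5) - gammaln(n) - 0.5 * np.log(n))
    print(n, factor)   # approaches 1 (roughly 1 - 1/(8 n))
```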

Slide 66


Numerical experiments – Conclusion
The CBLasso:
- a new approach to handle unknown noise level in spike detection;
- theoretical contributions: prediction for the CBLasso, localization for the CBLasso, closing the question of the constant dual polynomial in this setting;
- a numerical method is also proposed to solve the CBLasso.

Slide 67


Numerical experiments – Conclusion
The CBLasso:
- a new approach to handle unknown noise level in spike detection;
- theoretical contributions: prediction for the CBLasso, localization for the CBLasso, closing the question of the constant dual polynomial in this setting;
- a numerical method is also proposed to solve the CBLasso.
Claire Boyer, Yohann de Castro, and Joseph Salmon. Adapting to unknown noise level in sparse deconvolution. Information and Inference, abs/1606.04760, 2016.

Slide 68


Thank you for your attention.

Slide 69


Discussion on λ
$\lambda_{\max}(y) = \|\mathcal{F}_n^*(y)\|_\infty / (\sqrt{n}\,\|y\|_2)$ (for the Blasso: $\|\mathcal{F}_n^*(y)\|_\infty / n$).
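An illustrative way (my own sketch, not from the slides) to evaluate $\lambda_{\max}(y)$: the sup-norm of the trigonometric polynomial $\mathcal{F}_n^*(y)(t) = \sum_{|k|\le f_c} y_k e^{2\pi\imath k t}$ is approximated on a fine grid via a zero-padded FFT.

```python
import numpy as np

def lambda_max(y, oversample=64):
    n = y.size                         # y_k for k = -fc..fc, so n = 2*fc + 1
    fc = (n - 1) // 2
    m = oversample * n
    padded = np.zeros(m, dtype=complex)
    padded[:fc + 1] = y[fc:]           # frequencies 0..fc
    padded[-fc:] = y[:fc]              # frequencies -fc..-1
    # m * ifft evaluates sum_k y_k exp(2 i pi k l / m) on the grid t_l = l/m
    sup_norm = np.abs(m * np.fft.ifft(padded)).max()
    return sup_norm / (np.sqrt(n) * np.linalg.norm(y))
```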

Slide 70


Proposition. Define $\lambda_{\min}(y) = 1/(\|\hat c^{(\mathrm{BME})}\|_2\sqrt{n})$, and consider the problem
$\hat\mu^{(\mathrm{BME})} \in \operatorname{argmin}_{\mathcal{F}_n(\mu)=y}\ \|\mu\|_{TV}$,   (4)
and its dual formulation
$\hat c^{(\mathrm{BME})} \in \operatorname{argmax}_{c\in\mathbb{C}^n}\ \langle y, c\rangle$ s.t. $\|\mathcal{F}_n^*(c)\|_\infty \le 1$.   (5)
Then the following statements are equivalent:
(i) $\lambda \in (0, \lambda_{\min}(y)]$;
(ii) $\hat c = \hat c^{(\mathrm{BME})}$;
(iii) $\hat\sigma = 0$ (overfitting).
Remark that $\lambda_{\min}(y) = 1/(\|\hat c^{(\mathrm{BME})}\|_2\sqrt{n}) > 1/\sqrt{n}$.

Slide 71


Proof tricks – Rice method
One needs to control the probabilities of the following events:
$\{\|\mathcal{F}_n^*(\varepsilon)\|_\infty \le \lambda\}$ and $\Big\{\frac{\|\mathcal{F}_n^*(\varepsilon)\|_\infty}{\sqrt{n}\,\|\varepsilon\|_2} \le R\Big\}$,
where $\mathcal{F}_n^*(\varepsilon)(t) = \sum_{k=-f_c}^{f_c} \varepsilon_k \exp(2\pi\imath k t)$ for all $t \in [0,1]$.
The first event can be handled with a Rice formula for stationary Gaussian processes as in [1]. Due to the presence of the denominator, the second event cannot be described by a Gaussian process and its analysis is a bit more involved.

Slide 72


Proof tricks – Rice method: notation
We start with some notation. Let
$z^{(1)} = (z^{(1)}_{-f_c}, \dots, z^{(1)}_0, \dots, z^{(1)}_{f_c})$ and $z^{(2)} = (z^{(2)}_{-f_c}, \dots, z^{(2)}_0, \dots, z^{(2)}_{f_c})$
be i.i.d. $\mathcal{N}_n(0, \mathrm{Id}_n)$ random vectors. Set, for any $t \in [0,1]$,
$X(t) = z^{(1)}_0 + \sum_{k=1}^{f_c} (z^{(1)}_k + z^{(1)}_{-k})\cos(2\pi k t) + \sum_{k=1}^{f_c} (z^{(2)}_{-k} - z^{(2)}_k)\sin(2\pi k t)$,
$Y(t) = z^{(2)}_0 + \sum_{k=1}^{f_c} (z^{(2)}_k + z^{(2)}_{-k})\cos(2\pi k t) + \sum_{k=1}^{f_c} (z^{(1)}_k - z^{(1)}_{-k})\sin(2\pi k t)$,
$Z(t) = X(t) + \imath Y(t)$.
Then, note that
$\sigma_0\|Z\|_\infty \overset{d}{=} \|\mathcal{F}_n^*(\varepsilon)\|_\infty$ and $\sup_{t\in[0,1]} \frac{|Z(t)|}{\sqrt{n}\,(\|z^{(1)}\|_2^2 + \|z^{(2)}\|_2^2)^{1/2}} \overset{d}{=} \frac{\|\mathcal{F}_n^*(\varepsilon)\|_\infty}{\sqrt{n}\,\|\varepsilon\|_2}$,
where $\sigma_0 > 0$ is the (unknown) standard deviation of the noise $\varepsilon$.
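An illustrative Monte Carlo sketch (parameters made up): by the distributional identities above, $\sup_t |Z(t)|/\big(\sqrt{n}\,(\|z^{(1)}\|_2^2+\|z^{(2)}\|_2^2)^{1/2}\big)$ has the same law as $\|\mathcal{F}_n^*(\varepsilon)\|_\infty/(\sqrt{n}\,\|\varepsilon\|_2)$ and does not depend on $\sigma_0$, so its quantiles can be tabulated by simulation (the sup is approximated on a grid).

```python
import numpy as np

rng = np.random.default_rng(0)
fc = 20
n = 2 * fc + 1
k = np.arange(-fc, fc + 1)
t = np.linspace(0.0, 1.0, 2048, endpoint=False)
E = np.exp(2j * np.pi * np.outer(k, t))         # rows exp(2 i pi k t)

pivots = []
for _ in range(1000):
    z1 = rng.standard_normal(n)
    z2 = rng.standard_normal(n)
    Z = (z1 + 1j * z2) @ E                      # Z(t) = X(t) + i Y(t)
    pivots.append(np.abs(Z).max() / (np.sqrt(n) * np.sqrt(z1 @ z1 + z2 @ z2)))

print(np.quantile(pivots, [0.5, 0.9, 0.99]))    # empirical quantiles, free of sigma_0
```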

Slide 73


Proof tricks – For the Gaussian process
Lemma 1. For a complex-valued centered Gaussian random vector $\varepsilon$ as defined above, it holds that, for all $u > 0$,
$\mathbb{P}\{\|\mathcal{F}_n^*(\varepsilon)\|_\infty > u\} \le n\exp\Big(-\frac{u^2}{4n\sigma_0^2}\Big)$,
where $\sigma_0 > 0$ denotes the noise level.
Observe that $X(t)$ and $Y(t)$ are two independent stationary Gaussian processes with the same auto-covariance function $\Sigma$ given by
$\forall t \in [0,1],\quad \Sigma(t) = 1 + 2\sum_{k=1}^{f_c}\cos(2\pi k t) =: D_{f_c}(t)$,
where $D_{f_c}$ denotes the Dirichlet kernel. Set
$\sigma_n^2 = \operatorname{var} X(t) = D_{f_c}(0) = n$.   (6)

Slide 74


Proof tricks – For the Gaussian process
We use the following inequalities. For any $u > 0$,
$\mathbb{P}\{\|Z\|_\infty > u\} \le \mathbb{P}\{\|X\|_\infty > u/\sqrt{2}\} + \mathbb{P}\{\|Y\|_\infty > u/\sqrt{2}\} = 2\,\mathbb{P}\{\|X\|_\infty > u/\sqrt{2}\}$,   (7)
and, by symmetry of the process $X$,
$\mathbb{P}\{\|X\|_\infty > u/\sqrt{2}\} \le 2\,\mathbb{P}\{\sup_{t\in[0,1]} X(t) > u/\sqrt{2}\}$.   (8)
To bound the right-hand side of (8), we use the Rice method (see Azaïs & Wschebor, page 93):
$\mathbb{P}\Big\{\sup_{t\in[0,1]} X(t) > \frac{u}{\sqrt{2}}\Big\} \le \mathbb{P}\Big\{\forall t \in [0,1],\ X(t) > \frac{u}{\sqrt{2}}\Big\} + \mathbb{P}\{U_{u/\sqrt{2}} > 0\} \le \frac{1}{\sqrt{2\pi}}\int_{u/(\sqrt{2}\sigma_n)}^{+\infty} e^{-v^2/2}\,\mathrm{d}v + \mathbb{E}(U_{u/\sqrt{2}})$,
where $U_v$ is the number of up-crossings of the level $v$ by the process $X(t)$ on the interval $[0,1]$. By the Rice formula,
$\mathbb{E}(U_{u/\sqrt{2}}) = \frac{1}{2\pi}\sqrt{\operatorname{var} X'(t)}\,\frac{1}{\sigma_n}\exp\Big(-\frac{u^2}{4\sigma_n^2}\Big)$,
where
$\operatorname{var} X'(t) = -\Sigma''(0) = 2(2\pi)^2\sum_{k=1}^{f_c} k^2 = \frac{4\pi^2}{3}\,f_c(f_c+1)\,n$.   (9)
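A quick numeric check (illustrative) of equation (9), using $n = 2f_c + 1$:

```python
import numpy as np

fc = 30
n = 2 * fc + 1
k = np.arange(1, fc + 1)
minus_sigma_second_deriv = 2.0 * (2 * np.pi) ** 2 * np.sum(k ** 2)   # -Sigma''(0)
closed_form = (4 * np.pi ** 2 / 3) * fc * (fc + 1) * n
print(np.isclose(minus_sigma_second_deriv, closed_form))             # True
```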

Slide 75


Proof tricks – For the Gaussian process
A Chernoff argument gives, for any $w > 0$, $\frac{1}{\sqrt{2\pi}}\int_w^{+\infty} e^{-v^2/2}\,\mathrm{d}v \le \exp(-w^2/2)$, which yields
$\mathbb{P}\Big\{\sup_{t\in[0,1]} X(t) > \frac{u}{\sqrt{2}}\Big\} \le \exp\Big(-\frac{u^2}{4n}\Big) + \sqrt{\frac{f_c(f_c+1)}{3}}\,\exp\Big(-\frac{u^2}{4n}\Big) \le \frac{n}{2}\exp\Big(-\frac{u^2}{4n}\Big)$.
The result follows with (8).