Slide 1

Nonconvex Compressed Sensing with the Sum-of-Squares Method
Tasuku Soma (Univ. Tokyo)
Joint work with: Yuichi Yoshida (NII & PFI)

Slide 2

Compressed Sensing
Given: A ∈ R^{m×n} (m ≪ n) and y = Ax.
Task: Estimate the original sparse signal x ∈ R^n.
[Figure: y = Ax, where x has ≤ s nonzeros]

Slide 3

Compressed Sensing
Given: A ∈ R^{m×n} (m ≪ n) and y = Ax.
Task: Estimate the original sparse signal x ∈ R^n.
[Figure: y = Ax, where x has ≤ s nonzeros]
Applications: Image Processing, Statistics, Machine Learning, ...
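To make the setup concrete, here is a minimal NumPy sketch of such an instance (sizes and seed are illustrative, not from the talk); the Rademacher choice of A anticipates the matrix used later in the talk.

```python
import numpy as np

# Toy compressed-sensing instance: a Rademacher measurement matrix A
# with entries +-1/sqrt(m), and an s-sparse signal x observed only
# through the m << n linear measurements y = A x.
rng = np.random.default_rng(0)
m, n, s = 40, 200, 3
A = rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(m)
x = np.zeros(n)
support = rng.choice(n, size=s, replace=False)
x[support] = rng.standard_normal(s)
y = A @ x
```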

Slide 4

ℓ_1 Minimization (Basis Pursuit)
[Figure: |z|^q for q = 0 and q = 1]
min ‖z‖_1 sub. to Az = y

Slide 5

ℓ_1 Minimization (Basis Pursuit)
[Figure: |z|^q for q = 0 and q = 1]
min ‖z‖_1 sub. to Az = y
• Convex relaxation for ℓ_0 minimization
• For a subgaussian A with m = Ω(s log(n/s)), ℓ_1 minimization reconstructs x. [Candès-Romberg-Tao ’06, Donoho ’06]
s: sparsity of x (maybe unknown)
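Basis pursuit is solvable as a linear program via the standard split z = u − v with u, v ≥ 0, so that ‖z‖_1 = 1ᵀ(u + v). A minimal SciPy sketch of this textbook LP reduction, continuing the toy instance above (not code from the talk):

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(A, y):
    """Solve min ||z||_1 s.t. Az = y as the LP
    min 1^T (u + v)  s.t.  A(u - v) = y,  u, v >= 0."""
    m, n = A.shape
    c = np.ones(2 * n)
    A_eq = np.hstack([A, -A])
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
    u, v = res.x[:n], res.x[n:]
    return u - v

z = basis_pursuit(A, y)
print(np.max(np.abs(z - x)))  # near 0 once m = Omega(s log(n/s))
```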

Slide 6

Nonconvex Compressed Sensing
[Figure: |z|^q for q = 0, 1/2, 1]
ℓ_q minimization (0 < q ≤ 1): [Laska-Davenport-Baraniuk ’09, Cherian-Sra-Papanikolopoulos ’11]
min ‖z‖_q^q sub. to Az = y

Slide 7

Nonconvex Compressed Sensing
[Figure: |z|^q for q = 0, 1/2, 1]
ℓ_q minimization (0 < q ≤ 1): [Laska-Davenport-Baraniuk ’09, Cherian-Sra-Papanikolopoulos ’11]
min ‖z‖_q^q sub. to Az = y
• Requires fewer samples than ℓ_1 minimization
• Recovers arbitrary sparse signals as q → 0
• Nonconvex optimization!
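For intuition, the usual practical attack on this nonconvex program is iteratively reweighted least squares (IRLS, in the style of Chartrand-Yin); a minimal sketch follows. This is a common heuristic baseline, not the SoS algorithm of this talk.

```python
import numpy as np

def irls_lq(A, y, q=0.5, iters=50, eps=1.0):
    """Heuristic l_q minimizer via IRLS: repeatedly solve
    min z^T W z s.t. Az = y with W_ii = (z_i^2 + eps)^(q/2 - 1),
    shrinking eps so the objective approaches ||z||_q^q."""
    z = np.linalg.lstsq(A, y, rcond=None)[0]  # least-squares start
    for _ in range(iters):
        w_inv = (z**2 + eps) ** (1 - q / 2)   # entries of W^{-1}
        AW = A * w_inv                        # A @ diag(w_inv)
        z = w_inv * (A.T @ np.linalg.solve(AW @ A.T, y))
        eps = max(eps / 10, 1e-12)            # anneal toward l_q
    return z
```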

Slide 8

Stable Signal Recovery
x need not be sparse, only close to a sparse signal.

Slide 9

Stable Signal Recovery
x need not be sparse, only close to a sparse signal.
A ∈ R^{m×n} and ∆ : R^m → R^n give ℓ_p-stable recovery ⇐⇒ ‖∆(Ax) − x‖_p ≤ O(σ_s(x)_p) for any x ∈ R^n.

Slide 10

Stable Signal Recovery
x need not be sparse, only close to a sparse signal.
A ∈ R^{m×n} and ∆ : R^m → R^n give ℓ_p-stable recovery ⇐⇒ ‖∆(Ax) − x‖_p ≤ O(σ_s(x)_p) for any x ∈ R^n.
σ_s(x)_p: the ℓ_p distance from x to the nearest s-sparse vector

Slide 11

Stable Signal Recovery
x need not be sparse, only close to a sparse signal.
A ∈ R^{m×n} and ∆ : R^m → R^n give ℓ_p-stable recovery ⇐⇒ ‖∆(Ax) − x‖_p ≤ O(σ_s(x)_p) for any x ∈ R^n.
σ_s(x)_p: the ℓ_p distance from x to the nearest s-sparse vector
• A Gaussian matrix with m = Ω(s log(n/s)) and ℓ_1 minimization are ℓ_1-stable [Candès-Romberg-Tao ’06, Candès ’08]
• Gaussian A and ℓ_q minimization are ℓ_q-stable (0 < q ≤ 1) [Cohen-Dahmen-DeVore ’09]
• Smaller q yields a better bound when the noise is sparse.
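σ_s(x)_p is directly computable: keep the s largest-magnitude entries of x and measure what is left. A small NumPy helper (illustrative):

```python
import numpy as np

def sigma_s(x, s, p):
    """Best s-term approximation error sigma_s(x)_p: the l_p
    (quasi-)norm of x minus its s largest-magnitude entries."""
    tail = np.sort(np.abs(x))[:-s] if s > 0 else np.abs(x)
    return float((tail ** p).sum() ** (1.0 / p))
```

For an exactly s-sparse x, sigma_s(x, s, p) is 0, so the stability bound degrades gracefully as x moves away from sparsity.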

Slide 12

Our Result
Theorem. For ‖x‖_∞ ≤ 1 and fixed q = 2^{−k}, there exist A ∈ R^{m×n} (with A_ij ∼ {±1/√m}) and a polytime algorithm ∆ : R^m → R^n s.t.
‖∆(Ax) − x‖_q ≤ O(σ_s(x)_q) + ε,
provided that m = Ω(s^{2/q} log n).
• (Nearly) ℓ_q-stable recovery
• #samples ≫ O(s log(n/s)) (sample complexity trade-off)
• Uses the SoS method and the ellipsoid method

Slide 13

High Level Picture
Naive idea: Reduce ℓ_q minimization to polynomial optimization:
min ‖z‖_q^q sub. to Az = y
→ Does the SoS method find an “optimal solution”?

Slide 14

High Level Picture
Naive idea: Reduce ℓ_q minimization to polynomial optimization:
min ‖z‖_q^q sub. to Az = y
→ Does the SoS method find an “optimal solution”?
✗ No. [Figure: the set of relaxed solutions vs. the “optimal” point]

Slide 15

High Level Picture
Naive idea: Reduce ℓ_q minimization to polynomial optimization:
min ‖z‖_q^q sub. to Az = y
→ Does the SoS method find an “optimal solution”?
✗ No. [Figure: the set of relaxed solutions vs. the “optimal” point]
Idea: Add cuts to the SoS method:
min ‖z‖_q^q s.t. Az = y, additional constraints

Slide 16

SoS Method [Lasserre ’06, Parrilo ’00, Nesterov ’00, Shor ’87]
Polynomial Optimization: f, g_1, …, g_m ∈ R[z]: polynomials
min_z f(z) sub. to g_i(z) = 0 (i = 1, …, m)

Slide 17

SoS Method [Lasserre ’06, Parrilo ’00, Nesterov ’00, Shor ’87]
Polynomial Optimization: f, g_1, …, g_m ∈ R[z]: polynomials
min_z f(z) sub. to g_i(z) = 0 (i = 1, …, m)
SoS Relaxation (of degree d):
min_Ẽ Ẽ[f(z)] sub. to
Ẽ : R[z]_d → R, linear operator (“pseudoexpectation”)
Ẽ[1] = 1
Ẽ[p(z)^2] ≥ 0 (p ∈ R[z] : deg(p) ≤ d/2)
Ẽ[g_i(z)p(z)] = 0 (p ∈ R[z] : deg(g_i p) ≤ d, i = 1, …, m)
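For a feel of what the relaxation looks like as an SDP, here is a minimal degree-4 pseudoexpectation in CVXPY for the toy univariate problem min z^4 − z^2. This illustrates the general scheme only; it is not the relaxation built in this talk.

```python
import cvxpy as cp

# Moment matrix of the monomials (1, z, z^2): M[i, j] = E~[z^(i+j)].
# M >> 0 encodes E~[p(z)^2] >= 0 for every p with deg(p) <= 2.
M = cp.Variable((3, 3), symmetric=True)
constraints = [
    M >> 0,              # PSD-ness of the moment matrix
    M[0, 0] == 1,        # E~[1] = 1
    M[0, 2] == M[1, 1],  # both entries are E~[z^2] (Hankel structure)
]
# Objective E~[z^4 - z^2] = M[2, 2] - M[1, 1].
prob = cp.Problem(cp.Minimize(M[2, 2] - M[1, 1]), constraints)
prob.solve()
print(prob.value)  # approx -0.25 = min of z^4 - z^2, at z^2 = 1/2
```

Here degree 4 is already exact because every nonnegative univariate polynomial is a sum of squares; in general the relaxation only lower-bounds the optimum.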

Slide 18

Facts on SoS Method
• The SoS relaxation (of degree d) reduces to a semidefinite program (SDP) with an n^{O(d)}-size matrix.

Slide 19

Facts on SoS Method
• The SoS relaxation (of degree d) reduces to a semidefinite program (SDP) with an n^{O(d)}-size matrix.
• Dual view: SoS proof system
Any (low-degree) “proof” in the SoS proof system yields an algorithm via the SoS method.

Slide 20

Facts on SoS Method
• The SoS relaxation (of degree d) reduces to a semidefinite program (SDP) with an n^{O(d)}-size matrix.
• Dual view: SoS proof system
Any (low-degree) “proof” in the SoS proof system yields an algorithm via the SoS method.
• Very powerful tool in computer science:
  • Subexponential algorithm for Unique Games [Arora-Barak-Steurer ’10]
  • Planted sparse vector [Barak-Kelner-Steurer ’14]
  • Sparse PCA [Ma-Wigderson ’14]

Slide 21

Outline
Known ℓ_q-stability proof:
A is a Rademacher matrix ⟹ A has small coherence ⟹ ℓ_q-robust null space property ⟹ ℓ_q-stable: ‖x̂ − x‖_q^q ≤ O(1) · σ_s(x)_q^q

Slide 22

Outline
Known ℓ_q-stability proof:
A is a Rademacher matrix ⟹ A has small coherence ⟹ ℓ_q-robust null space property ⟹ ℓ_q-stable: ‖x̂ − x‖_q^q ≤ O(1) · σ_s(x)_q^q
Our proof:
A is a Rademacher matrix ⟹ A has small coherence ⟹(2) Ẽ version of the ℓ_q-robust null space property ⟹(1) Ẽ version of ℓ_q-stability: Ẽ‖z − x‖_q^q ≤ O(1) · σ_s(x)_q^q

Slide 23

Basic Idea
Formulate ℓ_q minimization as polynomial optimization:
min ‖z‖_q^q sub. to Az = y
Note: |z(i)|^q is not a polynomial, but it is representable by lifting; e.g. for q = 1/2, a new variable for |z(i)|^{1/2} with (|z(i)|^{1/2})^4 = z(i)^2 and |z(i)|^{1/2} ≥ 0.

Slide 24

Basic Idea
Formulate ℓ_q minimization as polynomial optimization:
min ‖z‖_q^q sub. to Az = y
Note: |z(i)|^q is not a polynomial, but it is representable by lifting; e.g. for q = 1/2, a new variable for |z(i)|^{1/2} with (|z(i)|^{1/2})^4 = z(i)^2 and |z(i)|^{1/2} ≥ 0.
✗ Solutions of the SoS method do not satisfy triangle inequalities:
Ẽ‖z + x‖_q^q ≰ Ẽ‖z‖_q^q + ‖x‖_q^q

Slide 25

Basic Idea
Formulate ℓ_q minimization as polynomial optimization:
min ‖z‖_q^q sub. to Az = y
Note: |z(i)|^q is not a polynomial, but it is representable by lifting; e.g. for q = 1/2, a new variable for |z(i)|^{1/2} with (|z(i)|^{1/2})^4 = z(i)^2 and |z(i)|^{1/2} ≥ 0.
✗ Solutions of the SoS method do not satisfy triangle inequalities:
Ẽ‖z + x‖_q^q ≰ Ẽ‖z‖_q^q + ‖x‖_q^q
Add valid constraints!
min ‖z‖_q^q s.t. Az = y, valid constraints
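A quick numeric sanity check of the lifting for q = 1/2 (illustrative only): over the reals, the constraints w^4 = z^2 and w ≥ 0 have exactly one solution, w = |z|^{1/2}.

```python
import numpy as np

# w stands for the lifted variable |z|^(1/2): check that |z|**0.5
# satisfies both polynomial constraints (w**4 == z**2, w >= 0);
# it is the unique nonnegative fourth root of z**2.
for z in np.linspace(-1.0, 1.0, 11):
    w = np.abs(z) ** 0.5
    assert np.isclose(w**4, z**2) and w >= 0
```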

Slide 26

Triangle Inequalities
‖z + x‖_q^q ≤ ‖z‖_q^q + ‖x‖_q^q
We have to add |z(i) + x(i)|^q, but do not know x(i).

Slide 27

Triangle Inequalities
‖z + x‖_q^q ≤ ‖z‖_q^q + ‖x‖_q^q
We have to add |z(i) + x(i)|^q, but do not know x(i).
Idea: Use a grid. L: the set of multiples of δ in [−1, 1].
[Figure: grid on [−1, 1] with spacing δ]
• a new variable for |z(i) − b|^q (b ∈ L)
• triangle inequalities for |z(i) − b|^q, |z(i) − b′|^q, and |b − b′|^q (b, b′ ∈ L)
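A small sketch of the bookkeeping this grid induces (δ and n are hypothetical parameters): one lifted variable per coordinate-gridpoint pair, and one family of triangle inequalities per coordinate and pair of grid points. These inequalities are valid because |·|^q is subadditive for 0 < q ≤ 1.

```python
import numpy as np

# Grid L: multiples of delta in [-1, 1].
delta = 0.1
L = np.arange(-1.0, 1.0 + delta / 2, delta)

# One lifted variable per (i, b), standing for |z(i) - b|^q.
# For each i and each pair b, b' in L, the relaxation adds
# |b - b'|^q <= |z(i) - b|^q + |z(i) - b'|^q (and permutations).
n = 5
num_vars = n * len(L)
num_triangle_pairs = n * len(L) * (len(L) - 1) // 2
print(num_vars, num_triangle_pairs)
```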

Slide 28

Robust ℓ_q Minimization
Instead of x, we will find x_L ∈ L^n closest to x.
Robust ℓ_q minimization:
min ‖z‖_q^q s.t. ‖y − Az‖_2^2 ≤ η², where η = σ_max(A) · √s · δ

Slide 29

Robust ℓ_q Minimization
Instead of x, we will find x_L ∈ L^n closest to x.
Robust ℓ_q minimization:
min ‖z‖_q^q s.t. ‖y − Az‖_2^2 ≤ η², where η = σ_max(A) · √s · δ
ℓ_q Robust Null Space Property
‖v_S‖_q^q ≤ ρ · ‖v_S̄‖_q^q + τ · ‖Av‖_2^q
for any v and S ⊆ [n] with |S| ≤ s (S̄ := [n] ∖ S).

Slide 30

Robust ℓ_q Minimization
Instead of x, we will find x_L ∈ L^n closest to x.
Robust ℓ_q minimization:
min ‖z‖_q^q s.t. ‖y − Az‖_2^2 ≤ η², where η = σ_max(A) · √s · δ
ℓ_q Pseudo Robust Null Space Property (ℓ_q-PRNSP)
Ẽ‖v_S‖_q^q ≤ ρ · Ẽ‖v_S̄‖_q^q + τ · (Ẽ‖Av‖_2^2)^{q/2}
for any v = z − b (b ∈ L^n) and S ⊆ [n] with |S| ≤ s.

Slide 31

(1) PRNSP ⟹ Stable Recovery
Theorem. If Ẽ satisfies ℓ_q-PRNSP, then
Ẽ‖z − x_L‖_q^q ≤ (2(1 + ρ)/(1 − ρ)) · σ_s(x_L)_q^q + (2^{1+q}τ/(1 − ρ)) · η^q,
where x_L is the closest vector in L^n to x.
Proof idea: A proof of stability only needs:
• ℓ_q^q triangle inequalities for z − x_L, x, and z + x_L
• the ℓ_2 triangle inequality

Slide 32

Rounding
Extract an actual vector x̂ from a pseudoexpectation Ẽ:
x̂(i) := argmin_{b∈L} Ẽ|z(i) − b|^q (i = 1, …, n)
Theorem. If Ẽ satisfies PRNSP,
‖x̂ − x_L‖_q^q ≤ 2 · [(2(1 + ρ)/(1 − ρ)) · σ_s(x_L)_q^q + (2^{1+q}τ/(1 − ρ)) · η^q]
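The rounding step is a per-coordinate argmin over the grid. A minimal sketch (the input format pE_abs, a table of pseudoexpectation values Ẽ|z(i) − b|^q reported by the SDP solver, is hypothetical):

```python
import numpy as np

def round_pseudoexpectation(pE_abs, L):
    """Round a pseudoexpectation to an actual vector: pE_abs[i, j]
    holds E~|z(i) - L[j]|^q; each coordinate is snapped to the grid
    point whose lifted variable is smallest."""
    pE_abs = np.asarray(pE_abs)
    return np.asarray(L)[np.argmin(pE_abs, axis=1)]
```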

Slide 33

Outline
Known ℓ_q-stability proof:
A is a Rademacher matrix ⟹ A has small coherence ⟹ ℓ_q-robust null space property ⟹ ℓ_q-stable: ‖x̂ − x‖_q^q ≤ O(1) · σ_s(x)_q^q
Our proof:
A is a Rademacher matrix ⟹ A has small coherence ⟹(2) Ẽ version of the ℓ_q-robust null space property ⟹(1) Ẽ version of ℓ_q-stability: Ẽ‖z − x‖_q^q ≤ O(1) · σ_s(x)_q^q

Slide 34

Imposing PRNSP
How can we obtain Ẽ satisfying PRNSP?
Idea: Follow known proofs for the robust NSP!
• From the Restricted Isometry Property (RIP) [Candès ’08]
• From coherence [Gribonval-Nielsen ’03, Donoho-Elad ’03]
• From lossless expanders [Berinde et al. ’08]

Slide 35

Coherence
The coherence of a matrix A = [a_1 … a_n] is
µ = max_{i≠j} |⟨a_i, a_j⟩| / (‖a_i‖_2 ‖a_j‖_2)
Facts:
• If µ^q < 1/(2s), the ℓ_q robust NSP holds.
• If A is a Rademacher matrix with m = O(s^{2/q} log n), then µ^q < 1/(2s) w.h.p.
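Coherence is cheap to compute from the Gram matrix of the normalized columns. A NumPy sketch with illustrative sizes (the theory above says the inequality holds w.h.p. once m = O(s^{2/q} log n)):

```python
import numpy as np

def coherence(A):
    """mu(A): max |<a_i, a_j>| / (||a_i|| ||a_j||) over distinct columns."""
    norms = np.linalg.norm(A, axis=0)
    G = np.abs(A.T @ A) / np.outer(norms, norms)
    np.fill_diagonal(G, 0.0)  # ignore the i = j diagonal
    return G.max()

rng = np.random.default_rng(0)
m, n, s, q = 10000, 50, 2, 0.5
A = rng.choice([-1.0, 1.0], size=(m, n)) / np.sqrt(m)
print(coherence(A) ** q, 1 / (2 * s))  # compare mu^q against 1/(2s)
```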

Slide 36

Small Coherence ⟹ PRNSP
Issue: A naive import needs exponentially many variables and constraints!
Lemma. If A is a Rademacher matrix,
• the additional variables are polynomially many
• the additional constraints have a separation oracle
Thus the ellipsoid method finds Ẽ with PRNSP.

Slide 37

Our Result
Theorem. For ‖x‖_∞ ≤ 1 and fixed q = 2^{−k}, there exist A ∈ R^{m×n} (with A_ij ∼ {±1/√m}) and a polytime algorithm ∆ : R^m → R^n s.t.
‖∆(Ax) − x‖_q ≤ O(σ_s(x)_q) + ε,
provided that m = Ω(s^{2/q} log n).
• (Nearly) ℓ_q-stable recovery
• #samples ≫ O(s log(n/s)) (sample complexity trade-off)
• Uses the SoS method and the ellipsoid method

Slide 38

Putting Things Together
Using a Rademacher matrix yields PRNSP:
Ẽ‖v_S‖_q^q ≤ O(1) · Ẽ‖v_S̄‖_q^q + O(s) · (Ẽ‖Av‖_2^2)^{q/2}

Slide 39

Putting Things Together
Using a Rademacher matrix yields PRNSP:
Ẽ‖v_S‖_q^q ≤ O(1) · Ẽ‖v_S̄‖_q^q + O(s) · (Ẽ‖Av‖_2^2)^{q/2}
This guarantees:
‖x̂ − x‖_q^q ≤ O(σ_s(x_L)_q^q) + O(s) · η^q

Slide 40

Putting Things Together
Using a Rademacher matrix yields PRNSP:
Ẽ‖v_S‖_q^q ≤ O(1) · Ẽ‖v_S̄‖_q^q + O(s) · (Ẽ‖Av‖_2^2)^{q/2}
This guarantees:
‖x̂ − x‖_q^q ≤ O(σ_s(x_L)_q^q) + O(s) · η^q
Theorem. If we take δ small enough, then the rounded vector x̂ satisfies
‖x̂ − x‖_q^q ≤ O(σ_s(x)_q^q) + ε.
(pf) η = σ_max(A) · √s · δ and σ_max(A) = O(√(n/m))
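A back-of-the-envelope version of this proof sketch, assuming the reconstructed bound σ_max(A) = O(√(n/m)):

```latex
O(s)\cdot\eta^{q}
  = O(s)\cdot\bigl(\sigma_{\max}(A)\,\sqrt{s}\,\delta\bigr)^{q}
  = O(s)\cdot\Bigl(\sqrt{\tfrac{ns}{m}}\,\delta\Bigr)^{q}
  \le \varepsilon
\quad\Longleftarrow\quad
\delta = \Theta\!\Bigl(\sqrt{\tfrac{m}{ns}}\,\bigl(\tfrac{\varepsilon}{s}\bigr)^{1/q}\Bigr).
```

So the grid spacing δ only needs to be polynomially small in ε, s, n/m, which keeps the grid (and hence the SDP) polynomial-size.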