Slide 1

Robust Sparse Analysis Recovery
Samuel Vaiter

Slide 2

Inverse Problems

Slide 3

Inverse Problems
Several problems: inpainting, super-resolution.

Slide 4

Inverse Problems
Several problems: inpainting, super-resolution.
Ill-posed. Linear hypothesis, one model: $y = \Phi x_0 + w$ (observations $y$, operator $\Phi$, unknown signal $x_0$, noise $w$).

Slide 5

Inverse Problems
Several problems: inpainting, super-resolution.
Ill-posed. Linear hypothesis, one model: $y = \Phi x_0 + w$ (observations $y$, operator $\Phi$, unknown signal $x_0$, noise $w$).
Regularization: $x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda J(x)$

Slide 6

Inverse Problems
Several problems: inpainting, super-resolution.
Ill-posed. Linear hypothesis, one model: $y = \Phi x_0 + w$ (observations $y$, operator $\Phi$, unknown signal $x_0$, noise $w$).
Regularization: $x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda J(x)$
Noiseless ($\lambda \to 0$): $x^\star \in \mathrm{argmin}_{\Phi x = y}\ J(x)$

Slide 7

Image Priors
Sobolev: $J(x) = \frac{1}{2}\int \|\nabla x\|^2$

Slide 8

Image Priors
Sobolev: $J(x) = \frac{1}{2}\int \|\nabla x\|^2$
Total variation: $J(x) = \int \|\nabla x\|$

Slide 9

Image Priors
Sobolev: $J(x) = \frac{1}{2}\int \|\nabla x\|^2$
Total variation: $J(x) = \int \|\nabla x\|$
Wavelet sparsity (ideal prior): $J(x) = |\{ i : \langle x, \psi_i \rangle \neq 0 \}|$
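
To make these priors concrete, here is a minimal numpy sketch of their discrete 1-D counterparts (a sum replaces the integral and finite differences replace $\nabla$; the random orthogonal matrix is only a stand-in for a wavelet basis, not the frame used in the talk):

```python
import numpy as np

x = np.array([0., 0., 1., 1., 1., 0.5, 0.5, 0.])

grad = np.diff(x)                       # discrete gradient (forward differences)
J_sobolev = 0.5 * np.sum(grad ** 2)     # Sobolev prior: (1/2) * integral ||grad x||^2
J_tv = np.sum(np.abs(grad))             # total variation: integral ||grad x||

# "Wavelet" sparsity with an arbitrary orthogonal matrix standing in for a
# wavelet basis, only to illustrate the counting prior |{i : <x, psi_i> != 0}|.
rng = np.random.default_rng(0)
W, _ = np.linalg.qr(rng.standard_normal((x.size, x.size)))
coeffs = W.T @ x
J_wavelet = np.count_nonzero(np.abs(coeffs) > 1e-12)
```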

Slide 10

Overview
• Analysis vs. Synthesis Regularization
• Local Parameterization of Analysis Regularization
• Identifiability and Stability
• Numerical Evaluation
• Perspectives

Slide 11

Dictionary
Redundant dictionary of $\mathbb{R}^N$: $\{d_i\}_{i=0}^{P-1}$, $P > N$

Slide 12

Dictionary
Redundant dictionary of $\mathbb{R}^N$: $\{d_i\}_{i=0}^{P-1}$, $P > N$
Identity: $\mathrm{Id}$

Slide 13

Dictionary
Redundant dictionary of $\mathbb{R}^N$: $\{d_i\}_{i=0}^{P-1}$, $P > N$
Identity: $\mathrm{Id}$
Shift-invariant wavelet frame

Slide 14

Dictionary
Redundant dictionary of $\mathbb{R}^N$: $\{d_i\}_{i=0}^{P-1}$, $P > N$
Identity: $\mathrm{Id}$
Shift-invariant wavelet frame
Finite difference operator: $D_{\mathrm{DIF}} = \begin{pmatrix} -1 & & & \\ +1 & -1 & & \\ & +1 & \ddots & \\ & & \ddots & -1 \\ & & & +1 \end{pmatrix}$

Slide 15

Dictionary
Redundant dictionary of $\mathbb{R}^N$: $\{d_i\}_{i=0}^{P-1}$, $P > N$
Identity: $\mathrm{Id}$
Shift-invariant wavelet frame
Finite difference operator: $D_{\mathrm{DIF}} = \begin{pmatrix} -1 & & & \\ +1 & -1 & & \\ & +1 & \ddots & \\ & & \ddots & -1 \\ & & & +1 \end{pmatrix}$
Fused lasso: $D = [\, D_{\mathrm{DIF}} \ \ \varepsilon\,\mathrm{Id} \,]$
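
A small sketch of the last two dictionaries, with the convention that $D^\top x$ returns forward differences; the shapes and the $[\,D_{\mathrm{DIF}}\ \ \varepsilon\,\mathrm{Id}\,]$ concatenation follow the slides, the helper names are mine:

```python
import numpy as np

def finite_difference_dict(n):
    """N x (N-1) dictionary whose analysis operator D^T computes
    forward differences: (D^T x)_i = x_{i+1} - x_i."""
    D = np.zeros((n, n - 1))
    D[np.arange(n - 1), np.arange(n - 1)] = -1.0
    D[np.arange(1, n), np.arange(n - 1)] = +1.0
    return D

def fused_lasso_dict(n, eps):
    """Fused-lasso dictionary: the concatenation [D_DIF, eps * Id]."""
    return np.hstack([finite_difference_dict(n), eps * np.eye(n)])
```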

Slide 16

Analysis versus Synthesis
Two points of view

Slide 17

Analysis versus Synthesis
Two points of view
“Generate” $x$ (synthesis): $x = D\alpha$ → non-unique if $P > N$

Slide 18

Analysis versus Synthesis
Two points of view
“Generate” $x$ (synthesis): $x = D\alpha$ → non-unique if $P > N$
OR “Analyze” $x$ (analysis): $D^* x = \alpha$

Slide 19

A Bird’s Eye View of Sparsity
“Ideal” sparsity prior: $J_0(\alpha) = |\{ i : \alpha_i \neq 0 \}|$

Slide 20

A Bird’s Eye View of Sparsity
“Ideal” sparsity prior: $J_0(\alpha) = |\{ i : \alpha_i \neq 0 \}|$
$\ell_0$ minimization is NP-hard

Slide 21

A Bird’s Eye View of Sparsity
“Ideal” sparsity prior: $J_0(\alpha) = |\{ i : \alpha_i \neq 0 \}|$
$\ell_0$ minimization is NP-hard
$\ell_q$ prior: $J_q(\alpha) = \sum_i |\alpha_i|^q$, convex (norms) for $q \geq 1$

Slide 22

A Bird’s Eye View of Sparsity
“Ideal” sparsity prior: $J_0(\alpha) = |\{ i : \alpha_i \neq 0 \}|$
$\ell_0$ minimization is NP-hard
$\ell_q$ prior: $J_q(\alpha) = \sum_i |\alpha_i|^q$, convex (norms) for $q \geq 1$
[Figure: $\ell_q$ unit balls for $q = 0,\ 0.5,\ 1,\ 1.5,\ 2$, with atoms $d_0$, $d_1$.]

Slide 23

A Bird’s Eye View of Sparsity
“Ideal” sparsity prior: $J_0(\alpha) = |\{ i : \alpha_i \neq 0 \}|$
$\ell_0$ minimization is NP-hard
$\ell_q$ prior: $J_q(\alpha) = \sum_i |\alpha_i|^q$, convex (norms) for $q \geq 1$
[Figure: $\ell_q$ unit balls for $q = 0,\ 0.5,\ 1,\ 1.5,\ 2$, with atoms $d_0$, $d_1$.]
$\ell_1$ norm: convexification of the $\ell_0$ prior

Slide 24

Sparse Regularizations
Synthesis: $\alpha^\star \in \mathrm{argmin}_{\alpha \in \mathbb{R}^Q}\ \frac{1}{2}\|y - \Phi' \alpha\|_2^2 + \lambda\|\alpha\|_1$, with $\Phi' = \Phi D$ and $x^\star = D\alpha^\star$

Slide 25

Sparse Regularizations
Synthesis: $\alpha^\star \in \mathrm{argmin}_{\alpha \in \mathbb{R}^Q}\ \frac{1}{2}\|y - \Phi' \alpha\|_2^2 + \lambda\|\alpha\|_1$, with $\Phi' = \Phi D$ and $x^\star = D\alpha^\star$
Analysis: $x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$

Slide 26

Sparse Regularizations
Synthesis: $\alpha^\star \in \mathrm{argmin}_{\alpha \in \mathbb{R}^Q}\ \frac{1}{2}\|y - \Phi' \alpha\|_2^2 + \lambda\|\alpha\|_1$, with $\Phi' = \Phi D$ and $x^\star = D\alpha^\star$
Analysis: $x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$
[Diagram: synthesis, $x = D\alpha$ with a sparse $\alpha$ (few entries $\neq 0$).]

Slide 27

Sparse Regularizations
Synthesis: $\alpha^\star \in \mathrm{argmin}_{\alpha \in \mathbb{R}^Q}\ \frac{1}{2}\|y - \Phi' \alpha\|_2^2 + \lambda\|\alpha\|_1$, with $\Phi' = \Phi D$ and $x^\star = D\alpha^\star$
Analysis: $x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$
[Diagram: synthesis, $x = D\alpha$ with a sparse $\alpha$ (few entries $\neq 0$).]
[Diagram: analysis, $\alpha = D^* x$.]

Slide 28

Sparse Regularizations
Synthesis: $\alpha^\star \in \mathrm{argmin}_{\alpha \in \mathbb{R}^Q}\ \frac{1}{2}\|y - \Phi' \alpha\|_2^2 + \lambda\|\alpha\|_1$, with $\Phi' = \Phi D$ and $x^\star = D\alpha^\star$
Analysis: $x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$
[Diagram: synthesis, $x = D\alpha$ with a sparse $\alpha$ (few entries $\neq 0$).]
[Diagram: analysis, $\alpha = D^* x$.]
Synthesis: sparse approximation of $x^\star$ in $D$

Slide 29

Sparse Regularizations
Synthesis: $\alpha^\star \in \mathrm{argmin}_{\alpha \in \mathbb{R}^Q}\ \frac{1}{2}\|y - \Phi' \alpha\|_2^2 + \lambda\|\alpha\|_1$, with $\Phi' = \Phi D$ and $x^\star = D\alpha^\star$
Analysis: $x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$
[Diagram: synthesis, $x = D\alpha$ with a sparse $\alpha$ (few entries $\neq 0$).]
[Diagram: analysis, $\alpha = D^* x$.]
Synthesis: sparse approximation of $x^\star$ in $D$
Analysis: correlations between $x^\star$ and $D$ are sparse
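
As an illustration of the synthesis formulation, a minimal ISTA (proximal gradient) sketch, assuming numpy arrays `Phi`, `D`, `y`; the talk itself uses a primal-dual scheme (quoted later), so this simpler iteration is only a stand-in:

```python
import numpy as np

def soft_threshold(a, t):
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def synthesis_lasso(y, Phi, D, lam, n_iter=500):
    """ISTA sketch for argmin_alpha 0.5 ||y - Phi D alpha||^2 + lam ||alpha||_1,
    returning x = D alpha."""
    A = Phi @ D
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    alpha = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ alpha - y)
        alpha = soft_threshold(alpha - grad / L, lam / L)
    return D @ alpha
```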

Slide 30

Support and Signal Model
$x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$  ($\mathcal{P}(y,\lambda)$)
$I = \mathrm{supp}(D^* x^\star)$, $J = I^c$

Slide 31

Support and Signal Model
$x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$  ($\mathcal{P}(y,\lambda)$)
$I = \mathrm{supp}(D^* x^\star)$, $J = I^c$
Definition: $\mathcal{G}_J = \mathrm{Ker}\, D_J^*$

Slide 32

Support and Signal Model
$x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$  ($\mathcal{P}(y,\lambda)$)
$I = \mathrm{supp}(D^* x^\star)$, $J = I^c$
Definition: $\mathcal{G}_J = \mathrm{Ker}\, D_J^*$
$x^\star \in \mathcal{G}_J$ — signal model (“union of subspaces”): $\Theta = \bigcup_{k \in \{1,\dots,P\}} \Theta_k$ where $\Theta_k = \{\mathcal{G}_J : \dim \mathcal{G}_J = k\}$

Slide 33

Support and Signal Model
$x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$  ($\mathcal{P}(y,\lambda)$)
$I = \mathrm{supp}(D^* x^\star)$, $J = I^c$
Definition: $\mathcal{G}_J = \mathrm{Ker}\, D_J^*$
$x^\star \in \mathcal{G}_J$ — signal model (“union of subspaces”): $\Theta = \bigcup_{k \in \{1,\dots,P\}} \Theta_k$ where $\Theta_k = \{\mathcal{G}_J : \dim \mathcal{G}_J = k\}$
Hypothesis: $\mathrm{Ker}\,\Phi \cap \mathrm{Ker}\,D^* = \{0\}$
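
A sketch of how $I$, $J$ and a basis of $\mathcal{G}_J = \mathrm{Ker}\, D_J^*$ can be computed numerically (scipy's `null_space`; the helper name and the tolerance are mine):

```python
import numpy as np
from scipy.linalg import null_space

def cosupport_subspace(D, x, tol=1e-8):
    """Support I = supp(D^* x), cosupport J = I^c, and an orthonormal
    basis U of G_J = Ker D_J^* (so that x lies in span(U))."""
    corr = D.T @ x
    I = np.flatnonzero(np.abs(corr) > tol)
    J = np.setdiff1d(np.arange(D.shape[1]), I)
    U = null_space(D[:, J].T) if J.size else np.eye(D.shape[0])
    return I, J, U
```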

Slide 34

Examples of Signal Model
Identity — $\Theta_k$: $k$-sparse signals

Slide 35

Examples of Signal Model
Identity — $\Theta_k$: $k$-sparse signals
Shift-invariant wavelet frame (figure credit: G. Peyré)

Slide 36

Examples of Signal Model
Identity — $\Theta_k$: $k$-sparse signals
Shift-invariant wavelet frame (figure credit: G. Peyré)
Finite difference operator — $\Theta_k$: piecewise constant signals with $k-1$ steps

Slide 37

Examples of Signal Model
Identity — $\Theta_k$: $k$-sparse signals
Shift-invariant wavelet frame (figure credit: G. Peyré)
Finite difference operator — $\Theta_k$: piecewise constant signals with $k-1$ steps
Fused lasso — $\Theta_k$: sums of $k$ interval characteristic functions

Slide 38

Remember!
Synthesis: $\alpha^\star = \mathrm{argmin}_{\alpha \in \mathbb{R}^Q}\ \frac{1}{2}\|y - \Phi'\alpha\|_2^2 + \lambda\|\alpha\|_1$
Analysis: $x^\star = \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$  ($\mathcal{P}(y,\lambda)$)
$\lambda \to 0$: $x^\star = \mathrm{argmin}_{\Phi x = y}\ \|D^* x\|_1$  ($\mathcal{P}(y,0)$)

Slide 39

Toward a Better Understanding
Local behavior? Properties of $x^\star$, solution of $\mathcal{P}(y,\lambda)$, as a function of $y$.

Slide 40

Toward a Better Understanding
Local behavior? Properties of $x^\star$, solution of $\mathcal{P}(y,\lambda)$, as a function of $y$.
Noiseless identifiability? Is $x_0$ the unique solution of $\mathcal{P}(\Phi x_0, 0)$?

Slide 41

Toward a Better Understanding
Local behavior? Properties of $x^\star$, solution of $\mathcal{P}(y,\lambda)$, as a function of $y$.
Noiseless identifiability? Is $x_0$ the unique solution of $\mathcal{P}(\Phi x_0, 0)$?
Noise robustness? What can we say about $\|x^\star - x_0\|$ for noisy observations?

Slide 42

From Synthesis to Analysis Results
— Previous works in synthesis: [Fuchs, Tropp, Dossal] address these questions

Slide 43

From Synthesis to Analysis Results
— Previous works in synthesis: [Fuchs, Tropp, Dossal] address these questions
— Similar problem, but with many more difficulties in analysis. Geometry of the problem?

Slide 44

From Synthesis to Analysis Results

Slide 45

From Synthesis to Analysis Results
[Figure: synthesis geometry with atoms $d_1$, $d_2$.]

Slide 46

From Synthesis to Analysis Results
[Figure: synthesis geometry with atoms $d_1$, $d_2$ and subspaces $\mathcal{G}_1$, $\mathcal{G}_2$.]

Slide 47

From Synthesis to Analysis Results
[Figure: synthesis geometry with atoms $d_1$, $d_2$, subspaces $\mathcal{G}_1$, $\mathcal{G}_2$, and the unit ball $\|\alpha\|_1 = 1$.]

Slide 48

From Synthesis to Analysis Results
[Figure: synthesis geometry with atoms $d_1$, $d_2$, subspaces $\mathcal{G}_1$, $\mathcal{G}_2$, the unit ball $\|\alpha\|_1 = 1$, and the constraint set $y = \Phi\alpha$.]

Slide 49

From Synthesis to Analysis Results
[Figure: synthesis geometry with atoms $d_1$, $d_2$, subspaces $\mathcal{G}_1$, $\mathcal{G}_2$, the unit ball $\|\alpha\|_1 = 1$, and the constraint set $y = \Phi\alpha$.]

Slide 50

From Synthesis to Analysis Results
[Figure: synthesis geometry with atoms $d_1$, $d_2$, subspaces $\mathcal{G}_1$, $\mathcal{G}_2$, the unit ball $\|\alpha\|_1 = 1$, the constraint set $y = \Phi\alpha$, and the solution $\alpha^\star$.]

Slide 51

From Synthesis to Analysis Results
[Figure: synthesis geometry with atoms $d_1$, $d_2$, subspaces $\mathcal{G}_1$, $\mathcal{G}_2$, the unit ball $\|\alpha\|_1 = 1$, the constraint set $y = \Phi\alpha$, and the solution $\alpha^\star$: the sparsest solution.]

Slide 52

From Synthesis to Analysis Results

Slide 53

From Synthesis to Analysis Results
[Figure: analysis geometry with atoms $d_1$, $d_2$, $d_3$.]

Slide 54

From Synthesis to Analysis Results
[Figure: analysis geometry with atoms $d_1$, $d_2$, $d_3$ and subspaces $\mathcal{G}_1$, $\mathcal{G}_2$, $\mathcal{G}_3$.]

Slide 55

From Synthesis to Analysis Results
[Figure: analysis geometry with atoms $d_1$, $d_2$, $d_3$, subspaces $\mathcal{G}_1$, $\mathcal{G}_2$, $\mathcal{G}_3$, and the unit ball $\|D^* x\|_1 = 1$.]

Slide 56

From Synthesis to Analysis Results
[Figure: analysis geometry with atoms $d_1$, $d_2$, $d_3$, subspaces $\mathcal{G}_1$, $\mathcal{G}_2$, $\mathcal{G}_3$, the unit ball $\|D^* x\|_1 = 1$, and the constraint set $y = \Phi x$.]

Slide 57

From Synthesis to Analysis Results
[Figure: analysis geometry with atoms $d_1$, $d_2$, $d_3$, subspaces $\mathcal{G}_1$, $\mathcal{G}_2$, $\mathcal{G}_3$, the unit ball $\|D^* x\|_1 = 1$, and the constraint set $y = \Phi x$.]

Slide 58

From Synthesis to Analysis Results
[Figure: analysis geometry with atoms $d_1$, $d_2$, $d_3$, subspaces $\mathcal{G}_1$, $\mathcal{G}_2$, $\mathcal{G}_3$, the unit ball $\|D^* x\|_1 = 1$, the constraint set $y = \Phi x$, and the solution $x^\star$.]

Slide 59

Overview
• Analysis vs. Synthesis Regularization
• Local Parameterization of Analysis Regularization
• Identifiability and Stability
• Numerical Evaluation
• Perspectives

Slide 60

Analysis is Piecewise Affine
Main idea: $\mathcal{G}_J$ is stable, i.e. the solutions of $\mathcal{P}(y,\lambda)$ and $\mathcal{P}(y+\varepsilon,\lambda)$ live in the same $\mathcal{G}_J$.

Slide 61

Analysis is Piecewise Affine
Main idea: $\mathcal{G}_J$ is stable, i.e. the solutions of $\mathcal{P}(y,\lambda)$ and $\mathcal{P}(y+\varepsilon,\lambda)$ live in the same $\mathcal{G}_J$.
[Figure: subspaces $\mathcal{G}_J$, $\mathcal{G}_{J'}$.]

Slide 62

Analysis is Piecewise Affine
Main idea: $\mathcal{G}_J$ is stable, i.e. the solutions of $\mathcal{P}(y,\lambda)$ and $\mathcal{P}(y+\varepsilon,\lambda)$ live in the same $\mathcal{G}_J$.
[Figure: subspaces $\mathcal{G}_J$, $\mathcal{G}_{J'}$, with the constraint set $y = \Phi x$.]

Slide 63

Analysis is Piecewise Affine
Main idea: $\mathcal{G}_J$ is stable, i.e. the solutions of $\mathcal{P}(y,\lambda)$ and $\mathcal{P}(y+\varepsilon,\lambda)$ live in the same $\mathcal{G}_J$.
[Figure: subspaces $\mathcal{G}_J$, $\mathcal{G}_{J'}$, with the constraint sets $y = \Phi x$ and $y + \varepsilon = \Phi x$.]

Slide 64

Analysis is Piecewise Affine
Main idea: $\mathcal{G}_J$ is stable, i.e. the solutions of $\mathcal{P}(y,\lambda)$ and $\mathcal{P}(y+\varepsilon,\lambda)$ live in the same $\mathcal{G}_J$.
[Figure: subspaces $\mathcal{G}_J$, $\mathcal{G}_{J'}$, with the constraint sets $y = \Phi x$ and $y + \varepsilon = \Phi x$.]
Affine function: $\bar y \mapsto x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$

Slide 65

Analysis is Piecewise Affine
Main idea: $\mathcal{G}_J$ is stable, i.e. the solutions of $\mathcal{P}(y,\lambda)$ and $\mathcal{P}(y+\varepsilon,\lambda)$ live in the same $\mathcal{G}_J$.
[Figure: subspaces $\mathcal{G}_J$, $\mathcal{G}_{J'}$, with the constraint sets $y = \Phi x$ and $y + \varepsilon = \Phi x$.]
Affine function: $\bar y \mapsto x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$
Theorem 1. Except for $y \in \mathcal{H}$, if $\bar y$ is close enough to $y$, then $x(\bar y)$ is a solution of $\mathcal{P}(\bar y, \lambda)$.

Slide 66

Analysis is Piecewise Affine
Main idea: $\mathcal{G}_J$ is stable, i.e. the solutions of $\mathcal{P}(y,\lambda)$ and $\mathcal{P}(y+\varepsilon,\lambda)$ live in the same $\mathcal{G}_J$.
[Figure: subspaces $\mathcal{G}_J$, $\mathcal{G}_{J'}$, with the constraint sets $y = \Phi x$ and $y + \varepsilon = \Phi x$.]
Affine function: $\bar y \mapsto x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$
Theorem 1. Except for $y \in \mathcal{H}$, if $\bar y$ is close enough to $y$, then $x(\bar y)$ is a solution of $\mathcal{P}(\bar y, \lambda)$.
($\mathcal{H}$: definition in a few minutes)

Slide 67

Sketch of the Proof
Problem, the Lasso: $x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$  ($\mathcal{P}(y,\lambda)$)

Slide 68

Sketch of the Proof
Problem, the Lasso: $x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$  ($\mathcal{P}(y,\lambda)$)
Support: $I = \mathrm{supp}(D^* x^\star)$, $J = I^c$

Slide 69

Sketch of the Proof
Problem, the Lasso: $x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$  ($\mathcal{P}(y,\lambda)$)
Support: $I = \mathrm{supp}(D^* x^\star)$, $J = I^c$
Subspace of analysis: $\mathcal{G}_J = \mathrm{Ker}\, D_J^*$

Slide 70

Sketch of the Proof
Problem, the Lasso: $x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$  ($\mathcal{P}(y,\lambda)$)
Support: $I = \mathrm{supp}(D^* x^\star)$, $J = I^c$
Subspace of analysis: $\mathcal{G}_J = \mathrm{Ker}\, D_J^*$
Hypothesis: $\mathrm{Ker}\,\Phi \cap \mathcal{G}_J = \{0\}$

Slide 71

Sketch of the Proof
Problem, the Lasso: $x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$  ($\mathcal{P}(y,\lambda)$)
Support: $I = \mathrm{supp}(D^* x^\star)$, $J = I^c$
Subspace of analysis: $\mathcal{G}_J = \mathrm{Ker}\, D_J^*$
Hypothesis: $\mathrm{Ker}\,\Phi \cap \mathcal{G}_J = \{0\}$
— $I$, $J$, $s = \mathrm{sign}(D^* x^\star)$ are fixed by $x^\star$
— We fix the observations $y$

Slide 72

First Order Conditions
$x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$  ($\mathcal{P}(y,\lambda)$)

Slide 73

First Order Conditions
$x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$  ($\mathcal{P}(y,\lambda)$)
Nondifferentiable problem: $x^\star$ is a minimum of $\mathcal{P}(y,\lambda)$ if, and only if, $0 \in \partial f(x^\star)$.

Slide 74

First Order Conditions
$x^\star \in \mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$  ($\mathcal{P}(y,\lambda)$)
Nondifferentiable problem: $x^\star$ is a minimum of $\mathcal{P}(y,\lambda)$ if, and only if, $0 \in \partial f(x^\star)$.
First-order conditions of the Lasso (gradient + subdifferential):
$x^\star$ is a solution of $\mathcal{P}(y,\lambda)$ $\Leftrightarrow$ $\exists\, \sigma \in \Sigma_y(x^\star)$, $\|\sigma\|_\infty \leq 1$
$\Sigma_y(x) = \{ \sigma \in \mathbb{R}^{|J|} : \Phi^*(\Phi x - y) + \lambda D_I s + \lambda D_J \sigma = 0 \}$
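
These conditions give a practical optimality test: solve for a certificate $\sigma$ in least squares and check $\|\sigma\|_\infty \leq 1$. A minimal sketch, with helper name and tolerances of my choosing:

```python
import numpy as np

def check_first_order(x, y, Phi, D, lam, tol=1e-6):
    """Find sigma solving Phi^*(Phi x - y) + lam D_I s + lam D_J sigma = 0
    (least squares), then test the certificate ||sigma||_inf <= 1."""
    corr = D.T @ x
    I = np.flatnonzero(np.abs(corr) > tol)
    J = np.setdiff1d(np.arange(D.shape[1]), I)
    s = np.sign(corr[I])
    rhs = -(Phi.T @ (Phi @ x - y) + lam * D[:, I] @ s)
    sigma, *_ = np.linalg.lstsq(lam * D[:, J], rhs, rcond=None)
    stationary = np.allclose(lam * D[:, J] @ sigma, rhs, atol=tol)
    return stationary and np.max(np.abs(sigma)) <= 1 + tol
```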

Slide 75

A Solution of the Lasso
$x(y) \in \mathrm{argmin}_{x \in \mathcal{G}_J}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$

Slide 76

A Solution of the Lasso
$x(y) \in \mathrm{argmin}_{x \in \mathcal{G}_J}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$
How to write a solution in implicit form?

Slide 77

A Solution of the Lasso
$x(y) \in \mathrm{argmin}_{x \in \mathcal{G}_J}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$
How to write a solution in implicit form?
$\Phi^*\Phi\, x(y) = \Phi^* y - \lambda D_I s - \lambda D_J \sigma$

Slide 78

A Solution of the Lasso
$x(y) \in \mathrm{argmin}_{x \in \mathcal{G}_J}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$
How to write a solution in implicit form?
$\Phi^*\Phi\, x(y) = \Phi^* y - \lambda D_I s - \lambda D_J \sigma$
$\Phi^*\Phi$ is non-invertible

Slide 79

A Solution of the Lasso
$x(y) \in \mathrm{argmin}_{x \in \mathcal{G}_J}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$
How to write a solution in implicit form?
$\Phi^*\Phi\, x(y) = \Phi^* y - \lambda D_I s - \lambda D_J \sigma$
$\Phi^*\Phi$ is non-invertible
$A\Phi^*$, inverse of $\Phi$ on $\mathcal{G}_J$: $A\Phi^*|_{\Phi(\mathcal{G}_J)} = (\Phi|_{\mathcal{G}_J})^{-1}$ and $A\Phi^*|_{\Phi(\mathcal{G}_J)^\perp} = 0$

Slide 80

A Solution of the Lasso
$x(y) \in \mathrm{argmin}_{x \in \mathcal{G}_J}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$
How to write a solution in implicit form?
$\Phi^*\Phi\, x(y) = \Phi^* y - \lambda D_I s - \lambda D_J \sigma$
$\Phi^*\Phi$ is non-invertible
$A\Phi^*$, inverse of $\Phi$ on $\mathcal{G}_J$: $A\Phi^*|_{\Phi(\mathcal{G}_J)} = (\Phi|_{\mathcal{G}_J})^{-1}$ and $A\Phi^*|_{\Phi(\mathcal{G}_J)^\perp} = 0$
$x(y) = A\Phi^* y - \lambda A D_I s - \lambda A D_J \sigma$

Slide 81

A Solution of the Lasso
$x(y) \in \mathrm{argmin}_{x \in \mathcal{G}_J}\ \frac{1}{2}\|y - \Phi x\|_2^2 + \lambda\|D^* x\|_1$
How to write a solution in implicit form?
$\Phi^*\Phi\, x(y) = \Phi^* y - \lambda D_I s - \lambda D_J \sigma$
$\Phi^*\Phi$ is non-invertible
$A\Phi^*$, inverse of $\Phi$ on $\mathcal{G}_J$: $A\Phi^*|_{\Phi(\mathcal{G}_J)} = (\Phi|_{\mathcal{G}_J})^{-1}$ and $A\Phi^*|_{\Phi(\mathcal{G}_J)^\perp} = 0$
$x(y) = A\Phi^* y - \lambda A D_I s - \lambda A D_J \sigma$, with $\lambda A D_J \sigma = 0$ (since $x(y) \in \mathcal{G}_J$)
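
Putting the pieces together, a sketch of the implicit solution $x(y) = A\Phi^* y - \lambda A D_I s$, using the explicit form $A = U(U^*\Phi^*\Phi U)^{-1}U^*$ with $U$ an orthonormal basis of $\mathcal{G}_J$ (this form of $A$ matches the appendix slides; the code itself is mine):

```python
import numpy as np
from scipy.linalg import null_space

def implicit_solution(y, Phi, D, I, s, lam):
    """x(y) = A Phi^* y - lam * A D_I s, where A Phi^* inverts Phi on
    G_J = Ker D_J^*; requires Ker(Phi) intersect G_J = {0}."""
    J = np.setdiff1d(np.arange(D.shape[1]), I)
    U = null_space(D[:, J].T)                      # orthonormal basis of G_J
    M = U.T @ (Phi.T @ Phi) @ U                    # invertible under the hypothesis
    A = U @ np.linalg.solve(M, U.T)
    return A @ (Phi.T @ y) - lam * (A @ (D[:, I] @ s))
```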

Slide 82

Transition Space
$\mathcal{H} = \left\{\, y \in \mathbb{R}^Q : \exists\, x \in \mathbb{R}^N,\ \min_{\sigma \in \Sigma_y(x)} \|\sigma\|_\infty = 1 \,\right\}$

Slide 83

Transition Space
$\mathcal{H} = \left\{\, y \in \mathbb{R}^Q : \exists\, x \in \mathbb{R}^N,\ \min_{\sigma \in \Sigma_y(x)} \|\sigma\|_\infty = 1 \,\right\}$
$\mathcal{H}$: saturation of the first-order conditions → “jump” from $\mathcal{G}_J$ to $\mathcal{G}_{J'}$

Slide 84

Transition Space
$\mathcal{H} = \left\{\, y \in \mathbb{R}^Q : \exists\, x \in \mathbb{R}^N,\ \min_{\sigma \in \Sigma_y(x)} \|\sigma\|_\infty = 1 \,\right\}$
$\mathcal{H}$: saturation of the first-order conditions → “jump” from $\mathcal{G}_J$ to $\mathcal{G}_{J'}$
[Figure: the constraint $\Phi x = y$ switching from $\mathcal{G}_J$ to $\mathcal{G}_{J'}$ as $y$ crosses $\mathcal{H}$.]

Slide 85

Transition Space
$\mathcal{H} = \left\{\, y \in \mathbb{R}^Q : \exists\, x \in \mathbb{R}^N,\ \min_{\sigma \in \Sigma_y(x)} \|\sigma\|_\infty = 1 \,\right\}$
$\mathcal{H}$: saturation of the first-order conditions → “jump” from $\mathcal{G}_J$ to $\mathcal{G}_{J'}$
[Figure: the constraint $\Phi x = y$ switching from $\mathcal{G}_J$ to $\mathcal{G}_{J'}$ as $y$ crosses $\mathcal{H}$.]
Open question: what is the smallest union of subspaces containing $\mathcal{H}$?

Slide 86

End of the Proof
— Consider $x(y)$ as a mapping of the observations: $\bar y \mapsto x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$

Slide 87

End of the Proof
— Consider $x(y)$ as a mapping of the observations: $\bar y \mapsto x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$
— Fix $\bar y$ close enough to have $\mathrm{sign}(D^* x(y)) = \mathrm{sign}(D^* x(\bar y))$ (sign stability)

Slide 88

End of the Proof
— Consider $x(y)$ as a mapping of the observations: $\bar y \mapsto x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$
— Fix $\bar y$ close enough to have $\mathrm{sign}(D^* x(y)) = \mathrm{sign}(D^* x(\bar y))$ (sign stability)
— Check that $x(\bar y)$ is indeed a solution of $\mathcal{P}(\bar y, \lambda)$ (use of the first-order conditions)

Slide 89

Remember!
$x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$

Slide 90

Remember!
$x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$
($A\Phi^*$: inverse of $\Phi$ on $\mathcal{G}_J$)

Slide 91

Remember!
$x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$
($A\Phi^*$: inverse of $\Phi$ on $\mathcal{G}_J$)
$y \mapsto x^\star(y)$ is:
— continuous

Slide 92

Remember!
$x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$
($A\Phi^*$: inverse of $\Phi$ on $\mathcal{G}_J$)
$y \mapsto x^\star(y)$ is:
— continuous
— locally affine

Slide 93

Remember!
$x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$
($A\Phi^*$: inverse of $\Phi$ on $\mathcal{G}_J$)
$y \mapsto x^\star(y)$ is:
— continuous
— locally affine (property given by sign stability)

Slide 94

Remember!
$x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$
($A\Phi^*$: inverse of $\Phi$ on $\mathcal{G}_J$)
$y \mapsto x^\star(y)$ is:
— continuous
— locally affine (property given by sign stability)
Useful for:
— robustness study
— SURE denoising risk estimation
— inverse problems on $x$

Slide 95

Overview
• Analysis vs. Synthesis Regularization
• Local Parameterization of Analysis Regularization
• Identifiability and Stability
• Numerical Evaluation
• Perspectives

Slide 96

Identifiability
Identifiability: $x_0$ unique solution of $\mathcal{P}(\Phi x_0, 0)$, i.e. $\{x_0\} \stackrel{?}{=} \mathrm{argmin}_{\Phi x = \Phi x_0} \|D^* x\|_1$

Slide 97

Identifiability
Identifiability: $x_0$ unique solution of $\mathcal{P}(\Phi x_0, 0)$, i.e. $\{x_0\} \stackrel{?}{=} \mathrm{argmin}_{\Phi x = \Phi x_0} \|D^* x\|_1$
Strategy: $\mathcal{P}(y,\lambda)$ is almost $\mathcal{P}(y,0)$ for small values of $\lambda$

Slide 98

Identifiability
Identifiability: $x_0$ unique solution of $\mathcal{P}(\Phi x_0, 0)$, i.e. $\{x_0\} \stackrel{?}{=} \mathrm{argmin}_{\Phi x = \Phi x_0} \|D^* x\|_1$
Strategy: $\mathcal{P}(y,\lambda)$ is almost $\mathcal{P}(y,0)$ for small values of $\lambda$
Assumption: $\mathcal{G}_J$ must be stable for small values of $\lambda$
→ Restrictive condition! But it gives a stability result for small noise.

Slide 99

Noiseless and Sign Criterion
$\Omega = D_J^+ (\Phi^*\Phi A - \mathrm{Id}) D_I$
$F(s) = \min_{w \in \mathrm{Ker}\, D_J} \|\Omega s - w\|_\infty$: an algebraic criterion on the sign vector

Slide 100

Noiseless and Sign Criterion
$\Omega = D_J^+ (\Phi^*\Phi A - \mathrm{Id}) D_I$
$F(s) = \min_{w \in \mathrm{Ker}\, D_J} \|\Omega s - w\|_\infty$: an algebraic criterion on the sign vector (convex → computable)

Slide 101

Noiseless and Sign Criterion
$\Omega = D_J^+ (\Phi^*\Phi A - \mathrm{Id}) D_I$
$F(s) = \min_{w \in \mathrm{Ker}\, D_J} \|\Omega s - w\|_\infty$: an algebraic criterion on the sign vector (convex → computable)
Theorem 2. Let $x_0 \in \mathbb{R}^N$ be a fixed vector, and $J = I^c$ where $I = I(D^* x_0)$. Suppose that $\mathrm{Ker}\,\Phi \cap \mathcal{G}_J = \{0\}$. If $F(\mathrm{sign}(D_I^* x_0)) < 1$, then $x_0$ is identifiable.

Slide 102

Noiseless and Sign Criterion
$\Omega = D_J^+ (\Phi^*\Phi A - \mathrm{Id}) D_I$
$F(s) = \min_{w \in \mathrm{Ker}\, D_J} \|\Omega s - w\|_\infty$: an algebraic criterion on the sign vector (convex → computable)
Theorem 2. Let $x_0 \in \mathbb{R}^N$ be a fixed vector, and $J = I^c$ where $I = I(D^* x_0)$. Suppose that $\mathrm{Ker}\,\Phi \cap \mathcal{G}_J = \{0\}$. If $F(\mathrm{sign}(D_I^* x_0)) < 1$, then $x_0$ is identifiable.
Specializes to the result of Fuchs for synthesis ($D = \mathrm{Id}$).
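
Since $F$ is convex it is computable; for a matrix $\Omega$ it reduces to a small linear program. A sketch assuming scipy (`linprog`, `null_space`); the parameterization $w = Zc$ over a basis $Z$ of $\mathrm{Ker}\, D_J$ is my implementation choice:

```python
import numpy as np
from scipy.linalg import null_space
from scipy.optimize import linprog

def F_criterion(Phi, D, I, s):
    """F(s) = min_{w in Ker D_J} ||Omega s - w||_inf as an LP,
    with Omega = D_J^+ (Phi^* Phi A - Id) D_I (notation of the slides)."""
    n, P = D.shape
    J = np.setdiff1d(np.arange(P), I)
    DI, DJ = D[:, I], D[:, J]
    U = null_space(DJ.T)                               # orthonormal basis of G_J
    A = U @ np.linalg.solve(U.T @ (Phi.T @ Phi) @ U, U.T)
    Omega = np.linalg.pinv(DJ) @ (Phi.T @ Phi @ A - np.eye(n)) @ DI
    v = Omega @ s
    Z = null_space(DJ)                                 # Ker D_J, inside R^{|J|}
    if Z.size == 0:                                    # trivial kernel: w = 0
        return np.max(np.abs(v))
    m, k = v.size, Z.shape[1]
    # LP variables [c, t]: minimize t subject to -t <= v - Z c <= t
    cost = np.r_[np.zeros(k), 1.0]
    A_ub = np.block([[ Z, -np.ones((m, 1))],
                     [-Z, -np.ones((m, 1))]])
    b_ub = np.r_[v, -v]
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * k + [(0, None)])
    return res.fun
```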

Slide 103

Nam et al. Results
Only other work on analysis recovery [Nam 2011]: the “cosparse” model

Slide 104

Nam et al. Results
Only other work on analysis recovery [Nam 2011]: the “cosparse” model
$G(s) = \|\Psi s\|_\infty$ with $\Psi = (M D_J)^+ M D_I$, where $M^*$ is an orthonormal basis of $\mathrm{Ker}\,\Phi$

Slide 105

Nam et al. Results
Only other work on analysis recovery [Nam 2011]: the “cosparse” model
$G(s) = \|\Psi s\|_\infty$ with $\Psi = (M D_J)^+ M D_I$, where $M^*$ is an orthonormal basis of $\mathrm{Ker}\,\Phi$
Theorem. Let $x_0 \in \mathbb{R}^N$ be a fixed vector, and $J = I^c$ where $I = I(D^* x_0)$. Suppose that $\mathrm{Ker}\,\Phi \cap \mathcal{G}_J = \{0\}$. If $G(\mathrm{sign}(D_I^* x_0)) < 1$, then $x_0$ is identifiable.

Slide 106

Nam et al. Results
Only other work on analysis recovery [Nam 2011]: the “cosparse” model
$G(s) = \|\Psi s\|_\infty$ with $\Psi = (M D_J)^+ M D_I$, where $M^*$ is an orthonormal basis of $\mathrm{Ker}\,\Phi$
Theorem. Let $x_0 \in \mathbb{R}^N$ be a fixed vector, and $J = I^c$ where $I = I(D^* x_0)$. Suppose that $\mathrm{Ker}\,\Phi \cap \mathcal{G}_J = \{0\}$. If $G(\mathrm{sign}(D_I^* x_0)) < 1$, then $x_0$ is identifiable.
More intrinsic criterion → but no noise robustness, even for small noise

Slide 107

Sketch of the Proof
Idea: study $\mathcal{P}(y,\lambda)$ for $\lambda \approx 0$
$x_\lambda(\Phi x_0) = A\Phi^* \Phi x_0 - \lambda A D_I s$

Slide 108

Sketch of the Proof
Idea: study $\mathcal{P}(y,\lambda)$ for $\lambda \approx 0$
$x_\lambda(\Phi x_0) = A\Phi^* \Phi x_0 - \lambda A D_I s$
$\lambda$ small enough to have $\mathrm{sign}(D^* x_\lambda(\Phi x_0)) = \mathrm{sign}(D^* x_0)$

Slide 109

Sketch of the Proof
Idea: study $\mathcal{P}(y,\lambda)$ for $\lambda \approx 0$
$x_\lambda(\Phi x_0) = A\Phi^* \Phi x_0 - \lambda A D_I s$
$\lambda$ small enough to have $\mathrm{sign}(D^* x_\lambda(\Phi x_0)) = \mathrm{sign}(D^* x_0)$
$\lim_{\lambda \to 0} x_\lambda(\Phi x_0) = A\Phi^* \Phi x_0 = x_0$

Slide 110

Sketch of the Proof
Idea: study $\mathcal{P}(y,\lambda)$ for $\lambda \approx 0$
$x_\lambda(\Phi x_0) = A\Phi^* \Phi x_0 - \lambda A D_I s$
$\lambda$ small enough to have $\mathrm{sign}(D^* x_\lambda(\Phi x_0)) = \mathrm{sign}(D^* x_0)$
$\lim_{\lambda \to 0} x_\lambda(\Phi x_0) = A\Phi^* \Phi x_0 = x_0$
$x_\lambda(\Phi x_0)$ solution of $\mathcal{P}(\lambda)$ and $x_\lambda(\Phi x_0) \to x_0(\Phi x_0)$ as $\lambda \to 0$ $\Rightarrow$ $x_0(\Phi x_0)$ solution of $\mathcal{P}(0)$

Slide 111

Sketch of the Proof
Idea: study $\mathcal{P}(y,\lambda)$ for $\lambda \approx 0$
$x_\lambda(\Phi x_0) = A\Phi^* \Phi x_0 - \lambda A D_I s$
$\lambda$ small enough to have $\mathrm{sign}(D^* x_\lambda(\Phi x_0)) = \mathrm{sign}(D^* x_0)$
$\lim_{\lambda \to 0} x_\lambda(\Phi x_0) = A\Phi^* \Phi x_0 = x_0$
$x_\lambda(\Phi x_0)$ solution of $\mathcal{P}(\lambda)$ and $x_\lambda(\Phi x_0) \to x_0(\Phi x_0)$ as $\lambda \to 0$ $\Rightarrow$ $x_0(\Phi x_0)$ solution of $\mathcal{P}(0)$
$F(\mathrm{sign}(D^* x_\lambda(\Phi x_0))) < 1$ $\Rightarrow$ $x_\lambda(\Phi x_0)$ unique solution

Slide 112

Small Noise Recovery
Suppose we observe $y = \Phi x_0 + w$.
Does $\mathrm{argmin}_{\Phi x = y} \|D^* x\|_1$ recover $x_0 + A\Phi^* w$?

Slide 113

Small Noise Recovery
Suppose we observe $y = \Phi x_0 + w$.
Does $\mathrm{argmin}_{\Phi x = y} \|D^* x\|_1$ recover $x_0 + A\Phi^* w$?
Generalization of Theorem 2: yes, if $\|w\|$ is small enough.
Condition: $\mathrm{sign}(D^* x(y)) = \mathrm{sign}(D^* x_0)$

Slide 114

Small Noise Recovery
Suppose we observe $y = \Phi x_0 + w$.
Does $\mathrm{argmin}_{\Phi x = y} \|D^* x\|_1$ recover $x_0 + A\Phi^* w$?
Generalization of Theorem 2: yes, if $\|w\|$ is small enough.
Condition: $\mathrm{sign}(D^* x(y)) = \mathrm{sign}(D^* x_0)$
$F(\mathrm{sign}(D^* x_0)) < 1$ gives • identifiability • small-noise robustness

Slide 115

Small Noise Recovery
Suppose we observe $y = \Phi x_0 + w$.
Does $\mathrm{argmin}_{\Phi x = y} \|D^* x\|_1$ recover $x_0 + A\Phi^* w$?
Generalization of Theorem 2: yes, if $\|w\|$ is small enough.
Condition: $\mathrm{sign}(D^* x(y)) = \mathrm{sign}(D^* x_0)$
$F(\mathrm{sign}(D^* x_0)) < 1$ gives • identifiability • small-noise robustness
Question: and for an arbitrary noise?

Slide 116

Noisy and Support Criterion
Setting: $y = \Phi x_0 + w$, with $w$ a bounded noise.

Slide 117

Noisy and Support Criterion
Setting: $y = \Phi x_0 + w$, with $w$ a bounded noise.
identifiability of the vector → identifiability of the support

Slide 118

Noisy and Support Criterion
Setting: $y = \Phi x_0 + w$, with $w$ a bounded noise.
identifiability of the vector → identifiability of the support
$\mathrm{ARC}(I) = \max_{x \in \mathcal{G}_J} F(\mathrm{sign}(D_I^* x))$

Slide 119

Noisy and Support Criterion
Setting: $y = \Phi x_0 + w$, with $w$ a bounded noise.
identifiability of the vector → identifiability of the support
$\mathrm{ARC}(I) = \max_{x \in \mathcal{G}_J} F(\mathrm{sign}(D_I^* x))$
Theorem 3. Suppose $\mathrm{ARC}(I) < 1$ and $\lambda > \dfrac{K\,\|w\|}{1 - \mathrm{ARC}(I)}$. Then $x_\lambda(y)$ is the unique solution of $\mathcal{P}(y,\lambda)$, and $\|x_\lambda(y) - x_0\| = O(\lambda)$.

Slide 120

Remember!
Noiseless — sign criterion, vector identifiability: $F(s) = \min_{w \in \mathrm{Ker}\, D_J} \|\Omega s - w\|_\infty$
Noisy — support criterion, support identifiability: $\mathrm{ARC}(I) = \max_{x \in \mathcal{G}_J} F(\mathrm{sign}(D_I^* x))$

Slide 121

From Theory to Numerics
We give a sufficient condition for identifiability.

Slide 122

From Theory to Numerics
We give a sufficient condition for identifiability.
How far are we from a necessary condition?

Slide 123

Overview
• Analysis vs. Synthesis Regularization
• Local Parameterization of Analysis Regularization
• Identifiability and Stability
• Numerical Evaluation
• Perspectives

Slide 124

Proximal Operator
$f$: l.s.c. convex function from a convex subset $C$ of a Hilbert space $H$ to $\mathbb{R}$.

Slide 125

Proximal Operator
$f$: l.s.c. convex function from a convex subset $C$ of a Hilbert space $H$ to $\mathbb{R}$.
Proximal operator: $\mathrm{prox}_f(x) = \mathrm{argmin}_{u \in \mathbb{R}^N} \left\{ f(u) + \frac{1}{2}\|u - x\|_2^2 \right\}$

Slide 126

Proximal Operator
$f$: l.s.c. convex function from a convex subset $C$ of a Hilbert space $H$ to $\mathbb{R}$.
Proximal operator: $\mathrm{prox}_f(x) = \mathrm{argmin}_{u \in \mathbb{R}^N} \left\{ f(u) + \frac{1}{2}\|u - x\|_2^2 \right\}$
Fundamental examples: $\mathrm{prox}_{\iota_C} = P_C$ (projection onto $C$), $\mathrm{prox}_{\lambda\|\cdot\|_1} = S_\lambda$ (soft thresholding)
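
Both fundamental examples are one-liners; a minimal sketch:

```python
import numpy as np

def prox_indicator_box(x, lo, hi):
    """prox of the indicator of C = [lo, hi]^N: the projection onto C."""
    return np.clip(x, lo, hi)

def prox_l1(x, lam):
    """prox of lam * ||.||_1: soft thresholding S_lam."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)
```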

Slide 127

How to Solve These Regularizations?
Primal-dual schemes: $\min_{x \in \mathbb{R}^N} L(K(x))$ where $L(g, u) = \frac{1}{2}\|y - g\|^2 + \lambda\|u\|_1$ and $K(x) = (\Phi x, D^* x)$

Slide 128

How to Solve These Regularizations?
Primal-dual schemes: $\min_{x \in \mathbb{R}^N} L(K(x))$ where $L(g, u) = \frac{1}{2}\|y - g\|^2 + \lambda\|u\|_1$ and $K(x) = (\Phi x, D^* x)$
Alternating direction method of multipliers [Chambolle, Pock]:
$u_n = \mathrm{prox}_{\sigma L^*}(u_{n-1} + \sigma K(z_{n-1}))$
$x_n = \mathrm{prox}_{\tau G}(x_{n-1} - \tau K^*(u_n))$
$z_n = x_n + \theta(x_n - x_{n-1})$

Slide 129

How to Solve These Regularizations?
Primal-dual schemes: $\min_{x \in \mathbb{R}^N} L(K(x))$ where $L(g, u) = \frac{1}{2}\|y - g\|^2 + \lambda\|u\|_1$ and $K(x) = (\Phi x, D^* x)$
Alternating direction method of multipliers [Chambolle, Pock]:
$u_n = \mathrm{prox}_{\sigma L^*}(u_{n-1} + \sigma K(z_{n-1}))$
$x_n = \mathrm{prox}_{\tau G}(x_{n-1} - \tau K^*(u_n))$
$z_n = x_n + \theta(x_n - x_{n-1})$
For $\mathcal{P}(y, 0)$: replace $\frac{1}{2}\|y - g\|^2$ by the indicator $\iota_{\{y\}}$
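
A sketch of the quoted iterations for the analysis problem $\mathcal{P}(y,\lambda)$, with $G = 0$ and the two blocks of $\mathrm{prox}_{\sigma L^*}$ computed through the Moreau identity; the step sizes (satisfying $\sigma\tau\|K\|^2 < 1$) are my choice, so this is an illustration, not the talk's code:

```python
import numpy as np

def analysis_lasso_pd(y, Phi, D, lam, n_iter=1000):
    """Chambolle-Pock sketch for min_x 0.5 ||y - Phi x||^2 + lam ||D^* x||_1,
    with K(x) = (Phi x, D^* x) and G = 0 (so prox_{tau G} = Id)."""
    K = np.vstack([Phi, D.T])
    sigma = tau = 0.9 / np.linalg.norm(K, 2)   # sigma * tau * ||K||^2 < 1
    theta = 1.0
    Q = Phi.shape[0]
    x = z = np.zeros(Phi.shape[1])
    u = np.zeros(K.shape[0])
    for _ in range(n_iter):
        p = u + sigma * (K @ z)
        # prox of sigma * L^*, split over the two blocks of L(g, w):
        g = (p[:Q] - sigma * y) / (1.0 + sigma)   # data-fit block
        w = np.clip(p[Q:], -lam, lam)             # projection onto ||.||_inf <= lam
        u = np.concatenate([g, w])
        x_new = x - tau * (K.T @ u)               # prox_{tau G} = Id since G = 0
        z = x_new + theta * (x_new - x)
        x = x_new
    return x
```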

Slide 130

Computing the Criteria
Unconstrained formulation: $F(s) = \min_{w \in \mathbb{R}^N} \|\Omega s - w\|_\infty + \iota_{\mathcal{D}}(w)$ (prox: the projection $P_{\|\cdot\|_\infty \leq 1}$; solved by a primal-dual scheme)

Slide 131

Computing the Criteria
Unconstrained formulation: $F(s) = \min_{w \in \mathbb{R}^N} \|\Omega s - w\|_\infty + \iota_{\mathcal{D}}(w)$ (prox: the projection $P_{\|\cdot\|_\infty \leq 1}$; solved by a primal-dual scheme)
ARC is difficult to compute (non-convex):
$\mathrm{ARC}(I) = \max_{x \in \mathcal{G}_J} F(\mathrm{sign}(D_I^* x))$ (non-convex) $\leq$ $\mathrm{wARC}(I) = \max_{s \in \{-1,1\}^{|I|}} F(s)$ (non-convex) $\leq$ $\mathrm{oARC}(I) = \|\Omega\|_{\infty \to \infty}$ (easy)
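
The two bounds are straightforward to evaluate: $\mathrm{oARC}$ is a matrix norm (the maximum row $\ell_1$ norm), and $\mathrm{wARC}$ can be brute-forced for small $|I|$ by enumerating sign vectors. A sketch reusing `F_criterion` from the earlier LP sketch:

```python
import numpy as np
from itertools import product

def oARC(Omega):
    """oARC(I) = ||Omega||_{inf->inf}: the maximum row l1-norm (easy bound)."""
    return np.max(np.sum(np.abs(Omega), axis=1))

def wARC_bruteforce(Phi, D, I):
    """wARC(I) = max over s in {-1,1}^{|I|} of F(s): exponential enumeration,
    only feasible for small |I| (F_criterion from the earlier sketch)."""
    best = -np.inf
    for s in product([-1.0, 1.0], repeat=len(I)):
        best = max(best, F_criterion(Phi, D, I, np.array(s)))
    return best
```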

Slide 132

More on Signal Models
Signal model (“union of subspaces”): $\Theta = \bigcup_{k \in \{1,\dots,P\}} \Theta_k$ where $\Theta_k = \{\mathcal{G}_J : \dim \mathcal{G}_J = k\}$

Slide 133

More on Signal Models
Signal model (“union of subspaces”): $\Theta = \bigcup_{k \in \{1,\dots,P\}} \Theta_k$ where $\Theta_k = \{\mathcal{G}_J : \dim \mathcal{G}_J = k\}$
Sparsity $\|D^* x_0\|_0$ is not a good parameter

Slide 134

More on Signal Models
Signal model (“union of subspaces”): $\Theta = \bigcup_{k \in \{1,\dots,P\}} \Theta_k$ where $\Theta_k = \{\mathcal{G}_J : \dim \mathcal{G}_J = k\}$
Sparsity $\|D^* x_0\|_0$ is not a good parameter
$D$ redundant Gaussian i.i.d. matrix ($N \times P$): $\|D^* x_0\|_0 < P - N \Rightarrow x_0 = 0$!

Slide 135

More on Signal Models
Signal model (“union of subspaces”): $\Theta = \bigcup_{k \in \{1,\dots,P\}} \Theta_k$ where $\Theta_k = \{\mathcal{G}_J : \dim \mathcal{G}_J = k\}$
Sparsity $\|D^* x_0\|_0$ is not a good parameter
$D$ redundant Gaussian i.i.d. matrix ($N \times P$): $\|D^* x_0\|_0 < P - N \Rightarrow x_0 = 0$!
The good one: $\mathrm{DOF}(x) = \dim \mathcal{G}_J$
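
In code, the good parameter is just the dimension of the cosupport subspace; a tiny sketch reusing `cosupport_subspace` from the earlier sketch (`D` and a signal `x0` assumed given):

```python
# Compare the misleading and the meaningful complexity measures.
I, J, U = cosupport_subspace(D, x0)
sparsity = len(I)    # ||D^* x0||_0: misleading when D is redundant
dof = U.shape[1]     # DOF(x0) = dim G_J: the parameter that matters
```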

Slide 136

Random Settings
1) Synthesis results. Compressed sensing: $Q \ll N$; here $\|x\|_0 = \mathrm{DOF}(x)$.
[Figure: recovery rate vs. the identifiability criteria $F(\mathrm{sign}(D^* x)) < 1$ and $\mathrm{ARC}(I(D^* x)) < 1$; credit: C. Dossal.]

Slide 137

Random Settings
2) Analysis results. $D$, $\Phi$: Gaussian i.i.d. random matrices.

Slide 138

Random Settings
2) Analysis results. $D$, $\Phi$: Gaussian i.i.d. random matrices.
Many dependencies between columns → strong instability

Slide 139

Random Settings
2) Analysis results. $D$, $\Phi$: Gaussian i.i.d. random matrices.
Many dependencies between columns → strong instability
Close to the $\ell_2$ ball!

Slide 140

Limits: TV Instability
$D^* = \nabla$, $\Phi = \mathrm{Id}$

Slide 141

Limits: TV Instability
$D^* = \nabla$, $\Phi = \mathrm{Id}$
$\Theta_k$: piecewise constant signals with $k-1$ steps

Slide 142

Limits: TV Instability
$D^* = \nabla$, $\Phi = \mathrm{Id}$
$\Theta_k$: piecewise constant signals with $k-1$ steps
“Box”: $F(s) = 1 - \varepsilon$

Slide 143

Limits: TV Instability
$D^* = \nabla$, $\Phi = \mathrm{Id}$
$\Theta_k$: piecewise constant signals with $k-1$ steps
“Box”: $F(s) = 1 - \varepsilon$
“Staircase”: $F(s) = 1$ → no noise stability, even for a small one

Slide 144

Fused Lasso
$\mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2$ subject to $\|\nabla x\|_1 \leq s_1$ and $\|x\|_1 \leq s_2$; dictionary $D = [\, D_{\mathrm{DIF}} \ \ \varepsilon\,\mathrm{Id} \,]$

Slide 145

Fused Lasso
$\mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2$ subject to $\|\nabla x\|_1 \leq s_1$ and $\|x\|_1 \leq s_2$; dictionary $D = [\, D_{\mathrm{DIF}} \ \ \varepsilon\,\mathrm{Id} \,]$
Signal model, sums of characteristic functions: $\Theta_2$, $x_0 = 1_{[a,b]} + 1_{[c,d]}$

Slide 146

Fused Lasso
$\mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2$ subject to $\|\nabla x\|_1 \leq s_1$ and $\|x\|_1 \leq s_2$; dictionary $D = [\, D_{\mathrm{DIF}} \ \ \varepsilon\,\mathrm{Id} \,]$
Signal model, sums of characteristic functions: $\Theta_2$, $x_0 = 1_{[a,b]} + 1_{[c,d]}$
[Figure: overlapping vs. non-overlapping intervals.]

Slide 147

Fused Lasso
$\mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2$ subject to $\|\nabla x\|_1 \leq s_1$ and $\|x\|_1 \leq s_2$; dictionary $D = [\, D_{\mathrm{DIF}} \ \ \varepsilon\,\mathrm{Id} \,]$
Signal model, sums of characteristic functions: $\Theta_2$, $x_0 = 1_{[a,b]} + 1_{[c,d]}$
$[a,b] \cap [c,d] \neq \emptyset \Rightarrow F(x_0) \geq 1$: no noise robustness

Slide 148

Fused Lasso
$\mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2$ subject to $\|\nabla x\|_1 \leq s_1$ and $\|x\|_1 \leq s_2$; dictionary $D = [\, D_{\mathrm{DIF}} \ \ \varepsilon\,\mathrm{Id} \,]$
Signal model, sums of characteristic functions: $\Theta_2$, $x_0 = 1_{[a,b]} + 1_{[c,d]}$
$[a,b] \cap [c,d] \neq \emptyset \Rightarrow F(x_0) \geq 1$: no noise robustness
$[a,b] \cap [c,d] = \emptyset \Rightarrow$ two situations

Slide 149

Fused Lasso
$\mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2$ subject to $\|\nabla x\|_1 \leq s_1$ and $\|x\|_1 \leq s_2$; dictionary $D = [\, D_{\mathrm{DIF}} \ \ \varepsilon\,\mathrm{Id} \,]$
Signal model, sums of characteristic functions: $\Theta_2$, $x_0 = 1_{[a,b]} + 1_{[c,d]}$
$[a,b] \cap [c,d] \neq \emptyset \Rightarrow F(x_0) \geq 1$: no noise robustness
$[a,b] \cap [c,d] = \emptyset \Rightarrow$ two situations
$|c - b| \leq \xi(\varepsilon)$: $F(\mathrm{sign}(D^* x_0)) \geq 1$, no noise robustness

Slide 150

Fused Lasso
$\mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2$ subject to $\|\nabla x\|_1 \leq s_1$ and $\|x\|_1 \leq s_2$; dictionary $D = [\, D_{\mathrm{DIF}} \ \ \varepsilon\,\mathrm{Id} \,]$
Signal model, sums of characteristic functions: $\Theta_2$, $x_0 = 1_{[a,b]} + 1_{[c,d]}$
$[a,b] \cap [c,d] \neq \emptyset \Rightarrow F(x_0) \geq 1$: no noise robustness
$[a,b] \cap [c,d] = \emptyset \Rightarrow$ two situations
$|c - b| \leq \xi(\varepsilon)$: $F(\mathrm{sign}(D^* x_0)) \geq 1$, no noise robustness
$|c - b| > \xi(\varepsilon)$: $F(\mathrm{sign}(D^* x_0)) = \mathrm{ARC}(I) < 1$, strong noise robustness

Slide 151

Fused Lasso
$\mathrm{argmin}_{x \in \mathbb{R}^N}\ \frac{1}{2}\|y - \Phi x\|_2^2$ subject to $\|\nabla x\|_1 \leq s_1$ and $\|x\|_1 \leq s_2$; dictionary $D = [\, D_{\mathrm{DIF}} \ \ \varepsilon\,\mathrm{Id} \,]$
Signal model, sums of characteristic functions: $\Theta_2$, $x_0 = 1_{[a,b]} + 1_{[c,d]}$
$[a,b] \cap [c,d] \neq \emptyset \Rightarrow F(x_0) \geq 1$: no noise robustness
$[a,b] \cap [c,d] = \emptyset \Rightarrow$ two situations
$|c - b| \leq \xi(\varepsilon)$: $F(\mathrm{sign}(D^* x_0)) \geq 1$, no noise robustness
$|c - b| > \xi(\varepsilon)$: $F(\mathrm{sign}(D^* x_0)) = \mathrm{ARC}(I) < 1$, strong noise robustness
Haar: similar results

Slide 152

Take-Away Messages

Slide 153

Take-Away Messages
— Analysis regularization is robust

Slide 154

Take-Away Messages
— Analysis regularization is robust
— Geometry (union of subspaces): key concept for recovery

Slide 155

Take-Away Messages
— Analysis regularization is robust
— Geometry (union of subspaces): key concept for recovery
— Sparsity is not univocally defined

Slide 156

Overview
• Analysis vs. Synthesis Regularization
• Local Parameterization of Analysis Regularization
• Identifiability and Stability
• Numerical Evaluation
• Perspectives

Slide 157

What’s Next?
Deterministic theorem → treat the noise as a random variable:
— Support identifiability under Gaussian, Poisson noise

Slide 158

What’s Next?
Deterministic theorem → treat the noise as a random variable:
— Support identifiability under Gaussian, Poisson noise
— Total variation identifiability: existence of a better criterion to ensure noisy recovery?

Slide 159

What’s Next?
Deterministic theorem → treat the noise as a random variable:
— Support identifiability under Gaussian, Poisson noise
— Total variation identifiability: existence of a better criterion to ensure noisy recovery?
— Continuous model: work initiated by Chambolle on TV

Slide 160

What’s Next?
Deterministic theorem → treat the noise as a random variable:
— Support identifiability under Gaussian, Poisson noise
— Total variation identifiability: existence of a better criterion to ensure noisy recovery?
— Continuous model: work initiated by Chambolle on TV
— Larger class of priors $J$: block sparsity $\|\cdot\|_{p,q}$

Slide 161

What’s Next?
Deterministic theorem → treat the noise as a random variable:
— Support identifiability under Gaussian, Poisson noise
— Total variation identifiability: existence of a better criterion to ensure noisy recovery?
— Continuous model: work initiated by Chambolle on TV
— Larger class of priors $J$: block sparsity $\|\cdot\|_{p,q}$
— Real-world recovery results: almost-equal support recovery

Slide 162

Thanks
Joint work with:
— Gabriel Peyré (CEREMADE, Dauphine)
— Charles Dossal (IMB, Bordeaux I)
— Jalal Fadili (GREYC, ENSICAEN)
Any questions?

Slide 163

An Affine Implicit Mapping
$x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$

Slide 164

An Affine Implicit Mapping
$x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$
$s = \mathrm{sign}(D_I^*\, x(y))$

Slide 165

An Affine Implicit Mapping
$x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$
$s = \mathrm{sign}(D_I^*\, x(y))$
$B = A\Phi^*$, inverse of $\Phi$ on $\mathcal{G}_J$: with $\mathcal{I}_J = \Phi(\mathcal{G}_J)$, $B : \mathcal{I}_J \xrightarrow{\sim} \mathcal{G}_J$

Slide 166

An Affine Implicit Mapping
$x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$
$s = \mathrm{sign}(D_I^*\, x(y))$
$B = A\Phi^*$, inverse of $\Phi$ on $\mathcal{G}_J$: with $\mathcal{I}_J = \Phi(\mathcal{G}_J)$, $B : \mathcal{I}_J \xrightarrow{\sim} \mathcal{G}_J$
$B|_{\mathcal{I}_J} = (\Phi|_{\mathcal{G}_J})^{-1}$, $B|_{\mathcal{I}_J^\perp} = 0$; explicitly $B = U (U^* \Phi^* \Phi\, U)^{-1} U^* \Phi^*$, with $U$ an orthonormal basis of $\mathcal{G}_J$

Slide 167

An Affine Implicit Mapping
$x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$
$s = \mathrm{sign}(D_I^*\, x(y))$
$B = A\Phi^*$, inverse of $\Phi$ on $\mathcal{G}_J$: with $\mathcal{I}_J = \Phi(\mathcal{G}_J)$, $B : \mathcal{I}_J \xrightarrow{\sim} \mathcal{G}_J$
$B|_{\mathcal{I}_J} = (\Phi|_{\mathcal{G}_J})^{-1}$, $B|_{\mathcal{I}_J^\perp} = 0$; explicitly $B = U (U^* \Phi^* \Phi\, U)^{-1} U^* \Phi^*$, with $U$ an orthonormal basis of $\mathcal{G}_J$
Efficient computation: $B x = \mathrm{argmin}_{D_J^* z = 0} \|\Phi z - x\|_2^2$

Slide 168

An Affine Implicit Mapping
$x(\bar y) = A\Phi^* \bar y - \lambda A D_I s$
$s = \mathrm{sign}(D_I^*\, x(y))$
$B = A\Phi^*$, inverse of $\Phi$ on $\mathcal{G}_J$: with $\mathcal{I}_J = \Phi(\mathcal{G}_J)$, $B : \mathcal{I}_J \xrightarrow{\sim} \mathcal{G}_J$
$B|_{\mathcal{I}_J} = (\Phi|_{\mathcal{G}_J})^{-1}$, $B|_{\mathcal{I}_J^\perp} = 0$; explicitly $B = U (U^* \Phi^* \Phi\, U)^{-1} U^* \Phi^*$, with $U$ an orthonormal basis of $\mathcal{G}_J$
Efficient computation: $B x = \mathrm{argmin}_{D_J^* z = 0} \|\Phi z - x\|_2^2$
$C \begin{pmatrix} z \\ \mu \end{pmatrix} = \begin{pmatrix} \Phi^* x \\ 0 \end{pmatrix}$ where $C = \begin{pmatrix} \Phi^*\Phi & D_J \\ D_J^* & 0 \end{pmatrix}$
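
A sketch of this efficient computation, solving the KKT system above with a single linear solve (least squares for robustness when $D_J$ is rank-deficient; the helper name is mine):

```python
import numpy as np

def B_apply(Phi, D, J, x):
    """Compute B x by solving the KKT system of
    min_z ||Phi z - x||^2 subject to D_J^* z = 0."""
    n = Phi.shape[1]
    DJ = D[:, J]
    k = DJ.shape[1]
    C = np.block([[Phi.T @ Phi, DJ],
                  [DJ.T, np.zeros((k, k))]])
    rhs = np.concatenate([Phi.T @ x, np.zeros(k)])
    sol, *_ = np.linalg.lstsq(C, rhs, rcond=None)  # robust if D_J is rank-deficient
    return sol[:n]
```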