
Sébastien Bourguignon

(École Centrale de Nantes, LS2N, SIMS team)

https://s3-seminar.github.io/seminars/sebastien-bourguignon/

Title — Global optimization for l0-norm-based sparse solutions to least squares problems

Abstract — Finding solutions to least-squares problems with low cardinality (with small l0 “norm”) has found many applications, including sparsity-enhancing inverse problems in signal processing, subset selection in statistics, and cardinality-constrained portfolio optimization. This problem is NP-hard, so most works propose suboptimal (continuous relaxation or greedy) methods able to address high-dimensional problems. On the other hand, several recent works have studied the global optimization of l0-norm problems through a mixed integer programming (MIP) reformulation of the l0 norm. Resolution is performed via MIP solvers, which essentially implement branch-and-bound methods.

We propose branch-and-bound algorithms specifically built for three possible formulations: cardinality-constrained and cardinality-penalized least squares, and cardinality minimization under a limited error constraint. A specific tree exploration strategy is built, inspired by forward greedy methods in stepwise regression. We then show that the relaxation problems involved at each node of the search tree can be reformulated as l1-norm optimization problems. We propose a dedicated algorithm for such problems, based on the homotopy continuation principle. Branch-and-bound algorithms are then constructed, which are guaranteed to find the optimal subset without resorting to MIP reformulations or binary variables.

The obtained certified solutions are shown to better estimate sparsity patterns than standard methods on simulated problems involving 1 000 unknowns and up to a few tens of non-zero components. For problems with higher complexity, unguaranteed solutions obtained by limiting the computing time are also shown to provide more satisfactory estimates. The resulting global optimization procedure is finally shown to strongly improve over the CPLEX MIP solver in terms of computing time.

Joint work with Ramzi Ben Mhenni and Jordan Ninin.

Biography — Sébastien Bourguignon has been an assistant professor (HDR) at École Centrale de Nantes since 2011. His research focuses on the design of signal and image processing algorithms for the resolution of inverse problems encountered in various fields of science and engineering, including ultrasonic, microwave, and hyperspectral imaging, in particular for non-destructive testing, planetary science, and astronomy. He is particularly interested in exploiting sparsity-based information, both in the physical modeling of data and in the related computational aspects. He is currently the coordinator of the MIMOSA “young researchers” project (Mixed Integer programming Methods for Optimization of Sparse Approximation criteria), funded by the French National Research Agency (ANR, 2017-2021).

S³ Seminar

April 09, 2021

Transcript

  1. Exact optimization for ℓ0-norm sparse solutions to least-squares problems
    Sébastien Bourguignon
    Laboratoire des Sciences du Numérique de Nantes
    École Centrale de Nantes
    S³: Séminaire Signal @Paris-Saclay, April 9th, 2021
    Joint work with
    Ramzi Ben Mhenni (LS2N-ECN, now LITIS, Université de Rouen)
    Jordan Ninin (Lab-STICC / ENSTA Bretagne)
    Marcel Mongeau (Université de Toulouse, ENAC)
    Hervé Carfantan (Université de Toulouse, IRAP)

  2. Outline
    1 Why?
    2 Who?
    3 How?
    4 Where?
    S. Bourguignon Exact ℓ0-norm optimization 2 / 23

  3. Outline
    1 Why? exact solutions to ℓ0-norm problems may achieve better estimates
    2 Who? small to moderate size sparse problems can be solved exactly
    3 How? a dedicated branch-and-bound strategy
    4 Where? directions for further work

  4. Exactness: exact criterion, exact optimization
    True, unrelaxed, ℓ0-“norm” criterion¹:
    ||x||_1 = Σ_p |x_p|,  ||x||_q^q = Σ_p |x_p|^q,  ||x||_0 := Card{p | x_p ≠ 0}
    [Figure: some sparsity-enhancing functions Σ_p ϕ(|x_p|) and their unit balls.]
    Global optimization: optimality guaranteed by the algorithm
    ¹ On (re)lâche rien! (French pun: “we don’t relax anything”)
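The three sparsity measures above are easy to check numerically. A minimal sketch in plain Python (not part of the talk; the function names are ours):

```python
def l1_norm(x):
    """||x||_1 = sum of absolute values (the convex relaxation of ||x||_0)."""
    return sum(abs(v) for v in x)

def lq_norm_q(x, q):
    """||x||_q^q = sum of |x_p|^q; for 0 < q < 1 it interpolates towards ||x||_0."""
    return sum(abs(v) ** q for v in x)

def l0_norm(x, tol=0.0):
    """||x||_0 = number of non-zero entries: a cardinality, not a true norm
    (it is not homogeneous, since ||2x||_0 = ||x||_0)."""
    return sum(1 for v in x if abs(v) > tol)

x = [3.0, 0.0, -1.0, 0.0, 0.5]
print(l1_norm(x), l0_norm(x))  # 4.5 3
```

The last docstring records why the slide puts “norm” in quotes: scaling a vector changes ||x||_1 but never ||x||_0.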

  5. Low-cardinality least-squares problems
    Cardinality constrained
    P2/0: min_{x∈R^n} (1/2)||y − Ax||² s.t. Card{i | x_i ≠ 0} ≤ K
    (CC portfolio optimization, subset selection, ...)
    Error constrained
    P0/2: min_{x∈R^n} Card{i | x_i ≠ 0} s.t. (1/2)||y − Ax||² ≤ ε
    (Sparse approximation, subset selection, ...)
    Penalized / scalarized
    P2+0: min_{x∈R^n} (1/2)||y − Ax||² + λ Card{i | x_i ≠ 0}
    (Inverse problems, compressed sensing, subset selection, ...)
    Card{i | x_i ≠ 0} = ||x||_0: the ℓ0 “norm”
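For concreteness, the quantities appearing in the three formulations can be evaluated for a candidate x as follows (our illustrative sketch in plain Python; the names are not from the talk):

```python
def card(x, tol=1e-12):
    """Card{i | x_i != 0}, i.e. the l0 'norm' of x (with a numerical tolerance)."""
    return sum(1 for v in x if abs(v) > tol)

def half_sq_error(y, A, x):
    """(1/2) ||y - Ax||^2, with A given as a dense list of rows."""
    r = [yi - sum(a * xj for a, xj in zip(row, x)) for yi, row in zip(y, A)]
    return 0.5 * sum(v * v for v in r)

def p2_plus_0(y, A, x, lam):
    """Objective of the penalized problem P2+0."""
    return half_sq_error(y, A, x) + lam * card(x)

# A candidate feasible for P2/0 with K = 1: at most one non-zero entry.
A = [[1.0, 2.0], [0.0, 1.0]]
y = [3.0, 1.0]
x = [3.0, 0.0]
print(half_sq_error(y, A, x), card(x))  # 0.5 1
```

P2/0 constrains `card(x)`, P0/2 constrains `half_sq_error(y, A, x)`, and P2+0 trades them off through λ.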

  6. Some sparse inverse problems in signal processing
    Deconvolution of spike trains (seismology [Mendel 83], non-destructive testing [Zala 92])
    [Figure: reflectivity sequence x ∗ instrument response h → data y = h ∗ x + noise]
    Atoms h(t − nT_s), n = 1, ..., N
    x̂ = arg min_x (1/2)||y − Ax||² + λ||x||_0
    x̂ ∈ Arg min_x ||x||_0 s.t. (1/2)||y − Ax||² ≤ ε

  7. Some sparse inverse problems in signal processing
    Spectral unmixing (hyperspectral imaging) [Iordache et al., 11]
    â = arg min_{a ≥ 0, Σ_i a_i = 1} (1/2)||s − Sa||² s.t. ||a||_0 ≤ K
    â ∈ Arg min_{a ≥ 0, Σ_i a_i = 1} ||a||_0 s.t. (1/2)||s − Sa||² ≤ ε
    Atoms: reference spectra s_ref(λ)
    [Figure: pixel spectra as sparse mixtures, e.g. s1(λ) ≈ a_soil s_soil(λ) + a_rock s_rock(λ); s2(λ) ≈ a_water s_water(λ); s3(λ) ≈ a_tree s_tree(λ) + a_soil s_soil(λ)]

  8. Some sparse inverse problems in signal processing
    Identification of non-linear models [Tang et al., 13]
    Non-linear model: y ≈ Σ_{k=1}^{K} α_k M(θ_k) → estimate θ, α and K? (e.g., sparse spectral analysis)
    Linear model + sparsity: y ≈ Σ_{q=1}^{Q} x_q M(θ_q^d) = M x, with discretized parameters θ_q^d = q∆θ
    → x sparse: Q = {q | x_q ≠ 0}, θ̂ = {θ_q^d}_{q∈Q}, α̂ = {x_q}_{q∈Q}, K̂ = Card Q
    Variable selection / subset selection / sparse regression [Miller, 02]

  9. Exactness may be worth it...
    Natural formulation for many problems:
    P2/0: min_{x∈R^P} (1/2)||y − Ax||² s.t. ||x||_0 ≤ K
    P0/2: min_{x∈R^P} ||x||_0 s.t. (1/2)||y − Ax||² ≤ ε
    P2+0: min_{x∈R^P} (1/2)||y − Ax||² + λ||x||_0
    Global optimum → better solution [Bertsimas et al., 16, Bourguignon et al., 16]
    [Figure: five spike-train estimates. Data and truth: ||y − Hx̊||² = 1.62; OMP: ||y − Hx̂||² = 6.07; ℓ1 relaxation: 2.36; SBR: 2.22; Global optimum: 1.43]
    Results taken from [Bourguignon et al., 16]

  10. ... but exactness has a price

  11. ... but exactness has a price

  12. ... but exactness has a price
    NP-hard: ||x||_0 ≤ K → C(P, K) possible combinations... in the worst-case scenario!
    Branch-and-bound: eliminate (hopefully huge) sets of possible combinations without resorting to their evaluation
    Small data problems!
    one-dimensional problems: deconvolution, time-series spectral analysis, spectral unmixing, ...
    variable/subset selection in statistics
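To make the C(P, K) combinatorics concrete, here is a brute-force solver for P2/0 that enumerates every support of size at most K. This is our own illustrative sketch (not the talk's method), viable only for tiny P and K; each small least-squares subproblem is solved through its normal equations:

```python
import itertools

def support_residual(y, A, support):
    """||y - A_S x_S||^2 minimized over x_S, for the columns S of A (list of rows).
    Solves the k x k normal equations by Gaussian elimination with partial
    pivoting; assumes the selected columns are linearly independent."""
    if not support:
        return sum(v * v for v in y)
    cols = [[row[j] for row in A] for j in support]
    k, n = len(cols), len(y)
    G = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)] for i in range(k)]
    c = [sum(cols[i][t] * y[t] for t in range(n)) for i in range(k)]
    for p in range(k):                      # forward elimination
        piv = max(range(p, k), key=lambda r: abs(G[r][p]))
        G[p], G[piv], c[p], c[piv] = G[piv], G[p], c[piv], c[p]
        for r in range(p + 1, k):
            f = G[r][p] / G[p][p]
            for q in range(p, k):
                G[r][q] -= f * G[p][q]
            c[r] -= f * c[p]
    x = [0.0] * k
    for p in reversed(range(k)):            # back-substitution
        x[p] = (c[p] - sum(G[p][q] * x[q] for q in range(p + 1, k))) / G[p][p]
    return sum((y[t] - sum(cols[j][t] * x[j] for j in range(k))) ** 2 for t in range(n))

def exhaustive_p20(y, A, K):
    """Brute-force P2/0: best support of size <= K, trying C(P, k) combinations."""
    P = len(A[0])
    best = (support_residual(y, A, ()), ())
    for k in range(1, K + 1):
        for S in itertools.combinations(range(P), k):
            r = support_residual(y, A, S)
            if r < best[0]:
                best = (r, S)
    return best

# y is built exactly from columns {0, 2}, so the optimal residual is ~0.
A = [[1.0, 0.0, 2.0], [0.0, 1.0, 3.0], [1.0, 1.0, 0.0]]
y = [0.0, -3.0, 2.0]  # = 2*col0 - 1*col2
res, S = exhaustive_p20(y, A, 2)
print(res, S)  # 0.0 (0, 2)
```

Branch-and-bound keeps this guarantee of optimality while pruning most of the C(P, K) supports instead of evaluating them.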

  13. Mixed Integer Programming (MIP) reformulation
    (see [Bienstock 96, Bertsimas et al., 16, Bourguignon et al., 16])
    Big-M assumption: ∀p, |x_p| ≤ M. Then:
    min_{x∈R^P} ||y − Ax||² s.t. ||x||_0 ≤ K and ∀p, |x_p| ≤ M
    ⇔ min_{b∈{0,1}^P, x∈R^P} ||y − Ax||² s.t. Σ_p b_p ≤ K and ∀p, |x_p| ≤ M b_p
    Can be addressed by MIP solvers (CPLEX, GUROBI, ...), but computation time grows quickly / limited to small sizes
    Here:
    no need for MIP reformulation nor binary variables
    specific branch-and-bound construction for problems P2/0, P0/2, and P2+0

  14. Specificity of ℓ0-norm problems as a very particular MIP
    P2/0: min_{x∈R^P, b∈{0,1}^P} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ K and ∀p, −M b_p ≤ x_p ≤ M b_p
    binary variables appear in a simple sum (separable)
    one binary variable ↔ one continuous variable
    no quadratic coupling mixing x and b, or between b_i and b_j
    cost function in x / constraints in b (or conversely)
    finally, no need for binary variables!
    [Bienstock, 96]: many ideas, few results
    [Jokar & Pfetsch, 08]: MIP mentioned, but too heavy
    [Bertsimas & Shioda, 09]: dedicated MIP solver (#x ≤ #y)
    [Karahanoglu et al., 13]: noise-free problems (y = Ax)
    [Hazimeh et al., 20]: ℓ0 + ℓ2 penalization

  15. Branch-and-Bound resolution [Land & Doig, 60]
    Decision tree on the binary variables.
    At each node, a lower bound on all subproblems contained by this node: the remaining binary variables are relaxed into [0, 1].
    If this bound exceeds the best known solution, the branch is pruned.
    [Tree figure: P(0) splits on b_{p0} into P(1) (b_{p0} = 1) and P(4) (b_{p0} = 0); P(1) splits on b_{p1} into P(2) and P(3); P(4) splits on b_{p4} into P(5) and P(6)]
    Which variable b_p to branch on?
    Which side to explore first?
    Which node to explore first?

  16. Branch-and-Bound resolution [Land & Doig, 60]
    Decision tree on the binary variables.
    At each node, a lower bound on all subproblems contained by this node: the remaining binary variables are relaxed into [0, 1].
    If this bound exceeds the best known solution, the branch is pruned.
    [Tree figure: P(0) splits on b_{p0} into P(1) and P(4); P(1) splits on b_{p1} into P(2) and P(3); P(4) splits on b_{p4} into P(5) and P(6)]
    Which variable b_p to branch on? the highest relaxed variable
    Which side to explore first? b_p = 1
    Which node to explore first? depth-first search

  17. Branch-and-Bound resolution [Land & Doig, 60]
    Decision tree on the binary variables.
    At each node, a lower bound on all subproblems contained by this node: the remaining binary variables are relaxed into [0, 1].
    If this bound exceeds the best known solution, the branch is pruned.
    [Tree figure: P(0) splits on b_{p0} into P(1) and P(4); P(1) splits on b_{p1} into P(2) and P(3); P(4) splits on b_{p4} into P(5) and P(6)]
    Which variable b_p to branch on? the highest relaxed variable
    Which side to explore first? b_p = 1
    Which node to explore first? depth-first search
    Computation of relaxed solutions?

  18. Branch-and-Bound resolution [Land & Doig, 60]
    Decision tree on the binary variables.
    At each node, a lower bound on all subproblems contained by this node: the remaining binary variables are relaxed into [0, 1].
    If this bound exceeds the best known solution, the branch is pruned.
    [Tree figure: P(0) splits on b_{p0} into P(1) and P(4); P(1) splits on b_{p1} into P(2) and P(3); P(4) splits on b_{p4} into P(5) and P(6)]
    Which variable b_p to branch on? the highest relaxed variable
    Which side to explore first? b_p = 1
    Which node to explore first? depth-first search
    Computation of relaxed solutions? related to ℓ1-norm optimization...
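Before moving to the ℓ1-based relaxations, the overall mechanics can be sketched end to end. The toy solver below is our own illustration, not the talk's algorithm: to stay self-contained, it replaces the ℓ1-relaxation bound developed next with the weaker (but still valid) lower bound obtained by simply dropping the cardinality constraint on the still-free variables, and it branches on the free variable with the largest relaxed amplitude (standing in for the slide's "highest relaxed variable" rule), exploring the b_p = 1 side first, depth-first:

```python
def ls_fit(y, A, cols_idx):
    """Least-squares fit of y on the selected columns of A (list of rows), via
    the normal equations; assumes the selected columns are linearly independent.
    Returns (residual ||y - A_S x||^2, {column index: coefficient})."""
    if not cols_idx:
        return sum(v * v for v in y), {}
    cols = [[row[j] for row in A] for j in cols_idx]
    k, n = len(cols), len(y)
    G = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)] for i in range(k)]
    c = [sum(cols[i][t] * y[t] for t in range(n)) for i in range(k)]
    for p in range(k):
        piv = max(range(p, k), key=lambda r: abs(G[r][p]))
        G[p], G[piv], c[p], c[piv] = G[piv], G[p], c[piv], c[p]
        for r in range(p + 1, k):
            f = G[r][p] / G[p][p]
            for q in range(p, k):
                G[r][q] -= f * G[p][q]
            c[r] -= f * c[p]
    x = [0.0] * k
    for p in reversed(range(k)):
        x[p] = (c[p] - sum(G[p][q] * x[q] for q in range(p + 1, k))) / G[p][p]
    res = sum((y[t] - sum(cols[j][t] * x[j] for j in range(k))) ** 2 for t in range(n))
    return res, dict(zip(cols_idx, x))

def branch_and_bound(y, A, K):
    """Depth-first branch-and-bound for min ||y - Ax||^2 s.t. ||x||_0 <= K.
    Lower bound at a node: LS residual with the cardinality constraint dropped
    on the free variables (weaker than the talk's l1-relaxation bound, but valid).
    Returns (best residual, best support)."""
    P = len(A[0])
    best = [float("inf"), ()]

    def recurse(S1, S0):
        free = [p for p in range(P) if p not in S1 and p not in S0]
        if len(S1) == K or not free:
            res, _ = ls_fit(y, A, sorted(S1))
            if res < best[0]:
                best[0], best[1] = res, tuple(sorted(S1))
            return
        bound, x = ls_fit(y, A, sorted(S1) + free)
        if bound >= best[0]:
            return  # prune: no completion of this node can beat the incumbent
        p = max(free, key=lambda q: abs(x[q]))  # branch on largest relaxed amplitude
        recurse(S1 | {p}, S0)                   # explore the b_p = 1 side first
        recurse(S1, S0 | {p})

    recurse(frozenset(), frozenset())
    return best[0], best[1]

# y is built exactly from columns {0, 3}, so the optimum has residual ~0.
A = [[1.0, 0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0, 1.0],
     [0.0, 0.0, 1.0, 1.0],
     [1.0, 1.0, 0.0, 0.0],
     [0.0, 1.0, 1.0, 0.0]]
y = [1.0, -1.0, -1.0, 2.0, 0.0]  # = 2*col0 - 1*col3
res, S = branch_and_bound(y, A, K=2)
```

Once an incumbent with residual 0 is found, every remaining node's bound (being non-negative) triggers pruning, which is exactly the behavior walked through in the slides that follow.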

  19. MIP continuous relaxation and the ℓ1 norm
    P2/0: min_{x∈R^P, b∈{0,1}^P} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ K and ∀p, −M b_p ≤ x_p ≤ M b_p
    R2/0: min_{x∈R^P, b∈[0,1]^P} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ K and ∀p, −M b_p ≤ x_p ≤ M b_p
    [Figure: for one coordinate, the relaxed constraint set {(b_p, x_p) : b_p ∈ [0, 1], |x_p| ≤ M b_p}]
    We have²: min R2/0 = min_{x∈R^P} (1/2)||y − Ax||² s.t. Σ_p |x_p| ≤ MK and ∀p, |x_p| ≤ M.
    ² Proof: for a solution (x*, b*) of R2/0, we have |x*| = M b*...

  20. Continuous relaxation within the branch-and-bound procedure
    At a given node:
    b_{S0} = 0 and x_{S0} = 0
    b_{S1} = 1 and |x_{S1}| ≤ M
    b_S free and |x_S| ≤ M b_S
    Σ_p b_p = Card S1 + Σ_{p∈S} b_p
    [Tree figure: nodes P(0)–P(6), as on slide 15]
    The relaxed problem at node i reads equivalently:
    R(i)2/0: min_{x_S, x_S1} (1/2)||y − A_S x_S − A_S1 x_S1||² s.t. ||x_S||_1 ≤ M(K − Card S1), ||x_S||_∞ ≤ M, ||x_S1||_∞ ≤ M
    Least squares, (partially) ℓ1-norm and box constraints. No binary variables!

  21. Optimization with (partial) ℓ1-norm and box constraints
    Homotopy continuation principle
    Standard case [Osborne et al., 00]:
    min_x (1/2)||y − Ax||² + λ||x||_1
    With free variables and box constraints:
    min_{x_S, x_S1} (1/2)||y − A_S x_S − A_S1 x_S1||² + λ||x_S||_1 s.t. ||x_S||_∞ ≤ M, ||x_S1||_∞ ≤ M
    [Figure: piecewise-linear solution paths λ → x*(λ) with breakpoints λ(0), λ(1), ..., down to λ*; with box constraints the paths saturate at ±M]

  22. Homotopy continuation
    Similarly solves relaxations of the sparsity-constrained problem:
    R(i)2/0: min_{x_S, x_S1} (1/2)||y − A_S x_S − A_S1 x_S1||² s.t. ||x_S||_1 ≤ τ, ||x_S||_∞ ≤ M, ||x_S1||_∞ ≤ M
    and of the error-constrained problem:
    R(i)0/2: min_{x_S, x_S1} ||x_S||_1 s.t. (1/2)||y − A_S x_S − A_S1 x_S1||² ≤ ε, ||x_S||_∞ ≤ M, ||x_S1||_∞ ≤ M
    [Figure: Pareto curve of (1/2)||y − A_S1 x*_S1 − A_S x*_S||² versus ||x*_S||_1, traced over breakpoints λ(0), ..., λ(4) down to λ*]

  23. Branch-and-Bound: example (P = 5, K = 2)
    y = A x + ε, with
    y = [0.0489, 8.1035, 8.0727, −2.0303]ᵀ
    A = [  1  5  2 −1  1
           3  4  5  0 −1
           2  1  4  2 −2
          −1  0 −2  1  0 ]
    x = [2, 0, 0, 0, −2]ᵀ
    min_{x∈R⁵, b∈{0,1}⁵} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ 2 and −Mb ≤ x ≤ Mb, with M = 5

  24. Branch-and-Bound: example
    [Tree: root node P(0)]
    P(0)_R: min_{x∈R⁵, b∈[0,1]⁵} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ 2, −Mb ≤ x ≤ Mb
    f = 0, x = [2.2, 0.16, 0.3, −0.072, −2.2]ᵀ, b = [0.58, 0.24, 0.27, 0.24, 0.56]ᵀ

  25. Branch-and-Bound: example
    [Tree: P(0) branches on b1 (b1 = 1 / b1 = 0)]
    P(0)_R: min_{x∈R⁵, b∈[0,1]⁵} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ 2, −Mb ≤ x ≤ Mb
    f = 0, x = [2.2, 0.16, 0.3, −0.072, −2.2]ᵀ, b = [0.58, 0.24, 0.27, 0.24, 0.56]ᵀ

  26. Branch-and-Bound: example
    [Tree: node P(1), child of P(0) with b1 = 1]
    P(1)_R: min_{x∈R⁵, b∈[0,1]⁵} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ 2, −Mb ≤ x ≤ Mb, b1 = 1
    f = 0, x = [3.1, −0.05, 0.14, 0.51, −1.2]ᵀ, b = [1, 0.25, 0.23, 0.2, 0.32]ᵀ

  27. Branch-and-Bound: example
    [Tree: P(1) branches on b5 (b5 = 1 / b5 = 0)]
    P(1)_R: min_{x∈R⁵, b∈[0,1]⁵} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ 2, −Mb ≤ x ≤ Mb, b1 = 1
    f = 0, x = [3.1, −0.05, 0.14, 0.51, −1.2]ᵀ, b = [1, 0.25, 0.23, 0.2, 0.32]ᵀ

  28. Branch-and-Bound: example
    [Tree: node P(2), child of P(1) with b5 = 1]
    P(2): min_{x∈R⁵, b∈[0,1]⁵} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ 2, −Mb ≤ x ≤ Mb, b1 = 1, b5 = 1
    f = 0.07, x = [3.1, −0, −0, −0, −1.9]ᵀ, b = [1, 0, 0, 0, 1]ᵀ (binary b: first incumbent)

  29. Branch-and-Bound: example
    [Tree: node P(3), child of P(1) with b5 = 0]
    P(3)_R: min_{x∈R⁵, b∈[0,1]⁵} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ 2, −Mb ≤ x ≤ Mb, b1 = 1, b5 = 0
    f = 0, x = [4.2, −0.29, −0.041, 1.2, 0]ᵀ, b = [1, 0.33, 0.3, 0.36, 0]ᵀ

  30. Branch-and-Bound: example
    [Tree: P(3) branches on b4 (b4 = 1 / b4 = 0)]
    P(3)_R: min_{x∈R⁵, b∈[0,1]⁵} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ 2, −Mb ≤ x ≤ Mb, b1 = 1, b5 = 0
    f = 0, x = [4.2, −0.29, −0.041, 1.2, 0]ᵀ, b = [1, 0.33, 0.3, 0.36, 0]ᵀ

  31. Branch-and-Bound: example
    [Tree: node P(4), child of P(3) with b4 = 1]
    P(4): min_{x∈R⁵, b∈[0,1]⁵} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ 2, −Mb ≤ x ≤ Mb, b1 = 1, b5 = 0, b4 = 1
    f = 0.527, x = [3.7, −0, −0, 1.4, 0]ᵀ, b = [1, 0, 0, 1, 0]ᵀ

  32. Branch-and-Bound: example
    [Tree: node P(4); same relaxation and solution as slide 31]
    f = 0.527, x = [3.7, −0, −0, 1.4, 0]ᵀ, b = [1, 0, 0, 1, 0]ᵀ

  33. Branch-and-Bound: example
    [Tree: node P(5), child of P(3) with b4 = 0]
    P(5)_R: min_{x∈R⁵, b∈[0,1]⁵} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ 2, −Mb ≤ x ≤ Mb, b1 = 1, b5 = 0, b4 = 0
    f = 2.52, x = [3, −0.64, 0.95, 0, 0]ᵀ, b = [1, 0.47, 0.53, 0, 0]ᵀ

  34. Branch-and-Bound: example
    [Tree: node P(5); same relaxation and solution as slide 33]
    f = 2.52, x = [3, −0.64, 0.95, 0, 0]ᵀ, b = [1, 0.47, 0.53, 0, 0]ᵀ

  35. Branch-and-Bound: example
    [Tree: node P(6), child of P(0) with b1 = 0]
    P(6)_R: min_{x∈R⁵, b∈[0,1]⁵} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ 2, −Mb ≤ x ≤ Mb, b1 = 0
    f = 0, x = [0, 0.69, 0.69, −1.5, −4.9]ᵀ, b = [0, 0.24, 0.27, 0.41, 0.99]ᵀ

  36. Branch-and-Bound: example
    [Tree: P(6) branches on b5 (b5 = 1 / b5 = 0)]
    P(6)_R: min_{x∈R⁵, b∈[0,1]⁵} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ 2, −Mb ≤ x ≤ Mb, b1 = 0
    f = 0, x = [0, 0.69, 0.69, −1.5, −4.9]ᵀ, b = [0, 0.24, 0.27, 0.41, 0.99]ᵀ

  37. Branch-and-Bound: example
    [Tree: node P(7), child of P(6) with b5 = 1]
    P(7): min_{x∈R⁵, b∈[0,1]⁵} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ 2, −Mb ≤ x ≤ Mb, b1 = 0, b5 = 1
    f = 0, x = [0, 0.69, 0.69, −1.5, −4.9]ᵀ, b = [0, 0.31, 0.27, 0.42, 1]ᵀ

  38. Branch-and-Bound: example
    [Tree: P(7) branches on b4 (b4 = 1 / b4 = 0)]
    P(7): min_{x∈R⁵, b∈[0,1]⁵} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ 2, −Mb ≤ x ≤ Mb, b1 = 0, b5 = 1
    f = 0, x = [0, 0.69, 0.69, −1.5, −4.9]ᵀ, b = [0, 0.31, 0.27, 0.42, 1]ᵀ

  39. Branch-and-Bound: example
    [Tree: node P(8), child of P(7) with b4 = 1]
    P(8)_R: min_{x∈R⁵, b∈[0,1]⁵} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ 2, −Mb ≤ x ≤ Mb, b1 = 0, b5 = 1, b4 = 1
    f = 37, x = [0, −0, −0, −1.5, −5]ᵀ, b = [0, 0, 0, 1, 1]ᵀ

  40. Branch-and-Bound: example
    [Tree: node P(8); same relaxation and solution as slide 39]
    f = 37, x = [0, −0, −0, −1.5, −5]ᵀ, b = [0, 0, 0, 1, 1]ᵀ

  41. Branch-and-Bound: example
    [Tree: node P(9), child of P(7) with b4 = 0]
    P(9)_R: min_{x∈R⁵, b∈[0,1]⁵} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ 2, −Mb ≤ x ≤ Mb, b1 = 0, b5 = 1, b4 = 0
    f = 0.327, x = [0, 0.19, 1.5, 0, −2.2]ᵀ, b = [0, 0.41, 0.59, 0, 1]ᵀ

  42. Branch-and-Bound: example
    [Tree: node P(9); same relaxation and solution as slide 41]
    f = 0.327, x = [0, 0.19, 1.5, 0, −2.2]ᵀ, b = [0, 0.41, 0.59, 0, 1]ᵀ

  43. Branch-and-Bound: example
    [Tree: node P(10), child of P(6) with b5 = 0]
    P(10)_R: min_{x∈R⁵, b∈[0,1]⁵} (1/2)||y − Ax||² s.t. Σ_p b_p ≤ 2, −Mb ≤ x ≤ Mb, b1 = 0, b5 = 0
    f = 0.992, x = [0, −0.27, 2.2, 1, 0]ᵀ, b = [0, 0.47, 0.67, 0.52, 0]ᵀ

  44. Branch-and-Bound: example
    [Tree: node P(10); same relaxation and solution as slide 43]
    f = 0.992, x = [0, −0.27, 2.2, 1, 0]ᵀ, b = [0, 0.47, 0.67, 0.52, 0]ᵀ

  45. Branch-and-Bound: example
    [Tree: exploration complete; the optimum is the incumbent found at node P(2)]
    f = 0.07, x = [3.1, −0, −0, −0, −1.9]ᵀ, b = [1, 0, 0, 0, 1]ᵀ

  46. Sparse deconvolution: computing time
    Implementation in C++, UNIX desktop machine (single core)
    N = 120, P = 100, SNR = 10 dB, 50 instances

    Problem          Branch-and-Bound               MIP CPLEX 12.8
                 Time (s)  Nodes (×10³)  Failed   Time (s)  Nodes (×10³)  Failed
    P2/0 K = 5      0.7       1.28         0         3.0      1.71          0
         K = 7     11.6      17.89         0        16.6     21.51          0
         K = 9     43.5      57.37         9        53.8     72.04          6
    P0/2 K = 5      0.1       0.21         0        25.7      6.71          0
         K = 7      0.9       2.32         0       114.8     49.54          2
         K = 9      2.5       5.22         0       328.2    101.07         17
    P2+0 K = 5      1.8       2.01         0         3.2      1.98          0
         K = 7      7.3      10.20         0         7.4      9.61          0
         K = 9     25.6      31.80         5        17.3     23.74          2

    Remark: C(100, 5) ∼ 7.5·10⁷, C(100, 7) ∼ 1.6·10¹⁰, C(100, 9) ∼ 1.9·10¹²
    Competitive exploration strategy (despite being simple)
    Efficient node evaluation (especially for small K)
    Performance ≥ CPLEX (≫ CPLEX for P0/2)

  47. Subset selection problems (framework in [Bertsimas et al., 16])
    N = 500, P = 1 000, SNR = 8 dB, 50 instances
    a_n ∼ N(0, Σ), σ_ij = ρ^|i−j|
    y = Ax⁰ + ε, with equispaced non-zero values in x⁰ equal to 1
    [Figure: exact recovery (%) versus sparsity degree K, for ρ = 0.8 (K up to 40) and ρ = 0.95 (K up to 20)]
    Better estimation than standard methods (even with limited time)
    Performance ≫ CPLEX

  48. Subset selection: computing times
    N = 500, P = 1 000, SNR = 8 dB, ρ = 0.8, 50 instances

    Problem          Branch-and-Bound                        CPLEX
                 Time (s)  Nodes (×10³)  T/N (s)  Failed   Time (s)  Nodes (×10³)  T/N (s)  Failed
    P2/0 K = 5      0.3       0.01        0.02      0        80.0       0.01        6.07      0
         K = 9      7.4       0.15        0.05      0       265.8       0.18        1.45      0
         K = 13    54.6       0.69        0.08      6       786.0       0.84        0.93     46
         K = 17   624.4       5.82        0.11     43          -          -           -      50
    P0/2 K = 5      0.2       0.01        0.02      1       495.7       0.32        1.54     11
         K = 9     16.0       0.21        0.08      0          -          -           -      50
         K = 13   311.5       2.32        0.13     12          -          -           -      50
         K = 17   694.3       4.90        0.14     47          -          -           -      50
    P2+0 K = 5      0.3       0.03        0.01      0       235.3       0.04        6.00      0
         K = 9     14.3       0.95        0.02      0       549.1       1.29        0.42      4
         K = 13   855.3      34.52        0.02     32       982.6       3.61        0.27     49
         K = 17   996.9      42.49        0.02     41          -          -           -      50

    CPLEX performance ↓↓
    Resolution guaranteed up to K ∼ 15; C(1000, 15) ∼ 10³²...

  49. Conclusions
    Exact ℓ0 solutions may be worth it
    Specific branch-and-bound method ≫ generic MIP resolution
    Cardinality-constrained, error-constrained, and penalized problems
    Most computing time = proving optimality
    partial exploration → competitive solutions

  50. Conclusions
    Exact ℓ0 solutions may be worth it
    Specific branch-and-bound method ≫ generic MIP resolution
    Cardinality-constrained, error-constrained, and penalized problems
    Most computing time = proving optimality
    partial exploration → competitive solutions
    Alternate formulations to big-M (SOS, set covering, ...)

  51. Conclusions
    Exact ℓ0 solutions may be worth it
    Specific branch-and-bound method ≫ generic MIP resolution
    Cardinality-constrained, error-constrained, and penalized problems
    Most computing time = proving optimality
    partial exploration → competitive solutions
    Alternate formulations to big-M (SOS, set covering, ...)
    More sophisticated branching rules, local heuristics, cutting planes, ...

  52. Conclusions
    Exact ℓ0 solutions may be worth it
    Specific branch-and-bound method ≫ generic MIP resolution
    Cardinality-constrained, error-constrained, and penalized problems
    Most computing time = proving optimality
    partial exploration → competitive solutions
    Alternate formulations to big-M (SOS, set covering, ...)
    More sophisticated branching rules, local heuristics, cutting planes, ...
    Improve relaxations → non-convex solutions!
    [Figure: the ℓ0 indicator 1_{x≠0} and sparsity surrogates |x|, |x|^p, CEL0(x)]

  53. Thank you.
    ANR JCJC program, 2016-2021, MIMOSA: Mixed Integer programming Methods for Optimization of Sparse Approximation criteria.
    S. Bourguignon, J. Ninin, H. Carfantan and M. Mongeau. IEEE Transactions on Signal Processing, 2016.
    R. Ben Mhenni, S. Bourguignon and J. Ninin. Optimization Methods and Software, submitted.
    Matlab and C codes available upon request (Git will come soon!)

  54. Homotopy continuation principle [Osborne et al., 00, Efron et al., 04]
    Consider the penalized formulation:
    x^λ = arg min_x J(x) := (1/2)||y − Ax||² + λ||x||_1.
    x^λ minimizes J if and only if 0 ∈ ∂J(x^λ), with
    ∂J(x^λ) = { −Aᵀ(y − Ax^λ) + λz : z ∈ R^Q, z_i = sgn(x^λ_i) if x^λ_i ≠ 0, z_i ∈ [−1, 1] if x^λ_i = 0 }.
    With NZ and Z indexing the non-zero and the zero components of x^λ, this reads:
    |Aᵀ_Z (y − A_NZ x^λ_NZ)| ≤ λ (componentwise),
    −Aᵀ_NZ (y − A_NZ x^λ_NZ) + λ sgn(x^λ_NZ) = 0.
    If λ ≥ λ_max := ||Aᵀ y||_∞, then x^λ = 0.
    The non-zero components satisfy x^λ_NZ = (Aᵀ_NZ A_NZ)⁻¹ (Aᵀ_NZ y − λ sgn(x^λ_NZ)).
    The solution path λ → x^λ is piecewise linear: it is linear in λ as long as the support configuration {NZ, Z, sgn(x^λ_NZ)} does not change.
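As a sanity check of the path's shape, consider the special case of an orthonormal A (AᵀA = I), where the penalized problem above decouples and its solution is soft-thresholding of z = Aᵀy. This is our illustrative special case, not the general homotopy algorithm:

```python
def soft(v, lam):
    """Scalar soft-thresholding: the proximal operator of lam*|.|."""
    if v > lam:
        return v - lam
    if v < -lam:
        return v + lam
    return 0.0

z = [3.0, -1.5, 0.4]              # z = A^T y for some orthonormal A
lam_max = max(abs(v) for v in z)  # = ||A^T y||_inf = 3.0

def x_at(lam):
    """Penalized solution x^lam = soft(z, lam), componentwise."""
    return [soft(v, lam) for v in z]

print(x_at(1.0))       # [2.0, -0.5, 0.0]: the support shrinks as lam grows
print(x_at(lam_max))   # [0.0, 0.0, 0.0]: x^lam = 0 for lam >= lam_max
```

Each component is linear in λ until it hits zero (a breakpoint of the support configuration), exactly the piecewise-linear behavior stated on the slide; the general homotopy algorithm tracks these breakpoints without the orthonormality assumption.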