Descent dynamic(s) for multi-objective optimization

GdR MOA 2015
December 02, 2015

by G. Garrigos

Transcript
Transcript

  1. Descent dynamic(s) for multi-objective optimization. Guillaume Garrigos, Hédy Attouch

    Istituto Italiano di Tecnologia & Massachusetts Institute of Technology, Genoa, Italy. Journées du GdR MOA 2015, December 2, 2015, Dijon.
  2. Introduction/Motivation. Multi-objective problem. In engineering and decision sciences, it often happens that

    several objective functions must be minimized simultaneously: f1, ..., fm −→ this calls for appropriate tools: multi-objective optimization.
  4. The multi-objective optimization problem. Let F = (f1, ..., fm) : H → R^m be locally Lipschitz, H a Hilbert space. Solve

    MIN (f1(x), ..., fm(x)) : x ∈ C ⊂ H, C convex. We consider the usual order(s) on R^m: a ⪯ b ⇔ ai ≤ bi for all i = 1, ..., m; a ≺ b ⇔ ai < bi for all i = 1, ..., m. x is a Pareto point if there is no y ∈ C such that F(y) ⪇ F(x) (i.e. F(y) ⪯ F(x) with F(y) ≠ F(x)); x is a weak Pareto point if there is no y ∈ C such that F(y) ≺ F(x).
  7. The multi-objective optimization problem. Let F = (f1, ..., fm) : H → R^m be locally Lipschitz. Solve

    MIN (f1(x), ..., fm(x)) : x ∈ C ⊂ H, C convex. How to solve it? Genetic algorithms −→ no theoretical guarantees. Scalarization method: minimize fθ := Σ_{i=1}^m θi fi, with θ = (θi)_{i=1..m} ∈ Δm (the unit simplex), because ∪_{θ∈Δm} argmin_{x∈H} fθ(x) ⊂ {weak Pareto points}.
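The scalarization idea above can be sketched numerically. Below is a minimal illustration on a hypothetical one-dimensional toy problem (not from the talk): for each weight θ ∈ Δ2 we run plain gradient descent on fθ = θ·f1 + (1−θ)·f2, and sweeping θ over the simplex traces out weak Pareto points. All function names and constants are illustrative.

```python
# Scalarization sketch on a toy convex bi-objective problem:
# f1(x) = x^2 (pulls x toward 0), f2(x) = (x - 1)^2 (pulls x toward 1).

def f1(x):
    return x * x

def f2(x):
    return (x - 1.0) ** 2

def grad_f_theta(x, theta):
    # gradient of the scalarized objective theta*f1 + (1 - theta)*f2
    return theta * 2.0 * x + (1.0 - theta) * 2.0 * (x - 1.0)

def minimize_scalarized(theta, x0=0.5, step=0.1, iters=500):
    # plain gradient descent on f_theta (step < 1/L, so it converges here)
    x = x0
    for _ in range(iters):
        x -= step * grad_f_theta(x, theta)
    return x

# one minimizer per weight vector (theta, 1 - theta) in the simplex:
# each of these points is a weak Pareto point of (f1, f2)
pareto_points = [minimize_scalarized(k / 10.0) for k in range(11)]
```

For this toy pair the minimizer of fθ is x = 1 − θ, so the sweep recovers the whole Pareto set [0, 1]; in general, however, scalarization need not reach every (weak) Pareto point unless the objectives are convex.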
  10. The multi-objective optimization problem. Let F = (f1, ..., fm) : H → R^m be locally Lipschitz. Solve

    MIN (f1(x), ..., fm(x)) : x ∈ C ⊂ H, C convex. We are going to present a method which: generalizes the steepest descent dynamic ẋ(t) + ∇f(x(t)) = 0; is cooperative, i.e. all objective functions decrease simultaneously; is independent of any choice of parameters.
  14. Towards a descent dynamic for multi-objective optimization. Historical review: Smale (1975).
  17. Towards a descent dynamic for multi-objective optimization. Historical review: Cornet

    (1981). s(x) := −[∇f1(x), ∇f2(x)]⁰, the minimal-norm element of the segment between the two gradients, which satisfies ⟨s(x), ∇fi(x)⟩ < 0. (Figure: the direction s(x) relative to ∇f1(x) and ∇f2(x).)
  18. Multi-objective steepest descent. Let F = (f1, ..., fm) : H −→ R^m be locally Lipschitz, C = H a Hilbert space.

    Definition. For all x ∈ H, s(x) := −(co {∂C fi(x)}_{i=1,...,m})⁰, minus the minimal-norm element of the convex hull of the Clarke subdifferentials, is the (common) steepest descent direction at x. Remarks in the smooth case: If m = 1 then s(x) = −∇f1(x). At each x, s(x) selects a convex combination: s(x) = −Σ_{i=1}^m θi(x) ∇fi(x) = −∇f_{θ(x)}(x), where f_{θ(x)} = Σ_{i=1}^m θi(x) fi. s(x) is the steepest descent direction: s(x)/‖s(x)‖ = argmin_{d ∈ B_H} max_{i=1,...,m} ⟨∇fi(x), d⟩.
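For m = 2 smooth objectives, the minimal-norm element of co{∇f1(x), ∇f2(x)} is the projection of the origin onto a segment, which has a closed form. The sketch below (illustrative names, not from the talk) computes s(x) from two given gradient vectors.

```python
# Common steepest-descent direction for m = 2 smooth objectives:
# s(x) = -(minimal-norm element of the segment co{g1, g2}).

def min_norm_on_segment(g1, g2):
    # minimize ||g1 + t*(g2 - g1)||^2 over t in [0, 1] (closed form)
    d = [b - a for a, b in zip(g1, g2)]
    dd = sum(v * v for v in d)
    if dd == 0.0:                 # both gradients coincide
        return list(g1)
    t = -sum(a * v for a, v in zip(g1, d)) / dd
    t = min(1.0, max(0.0, t))     # project the parameter onto [0, 1]
    return [a + t * v for a, v in zip(g1, d)]

def steepest_descent_direction(g1, g2):
    p = min_norm_on_segment(g1, g2)
    return [-v for v in p]

# example: orthogonal unit gradients -> the bisecting descent direction
s = steepest_descent_direction([1.0, 0.0], [0.0, 1.0])  # -> [-0.5, -0.5]
```

By the variational inequality of the projection, ⟨s, ∇fi(x)⟩ ≤ −‖s‖² for each i, so s is a common descent direction whenever it is nonzero; for m > 2 the same projection problem becomes a small quadratic program over the simplex.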
  22. The (multi-objective) Steepest Descent dynamic. We consider the continuous steepest

    descent dynamic: (SD) ẋ(t) = s(x(t)), i.e. (SD) ẋ(t) + (co {∂C fi(x(t))}_i)⁰ = 0. A solution is a trajectory x : [0, +∞[ −→ H which is absolutely continuous (on bounded intervals) and satisfies (SD) for a.e. t ≥ 0. It is the continuous version of the steepest descent algorithm studied by Svaiter, Fliege, Iusem, ...
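An explicit Euler discretization of (SD) gives a minimal numerical sketch of the cooperative behaviour. The two convex objectives below are toy stand-ins chosen for illustration (they are not the talk's examples); each step moves along the minimal-norm element of −co{∇f1, ∇f2}, so both objectives decrease along the discrete trajectory.

```python
# Explicit Euler discretization of (SD) x'(t) = s(x(t)) for the toy pair
# f1(x) = x1^2 + x2^2 and f2(x) = (x1 - 1)^2 + x2^2 on R^2.

def f1(x): return x[0] ** 2 + x[1] ** 2
def f2(x): return (x[0] - 1.0) ** 2 + x[1] ** 2

def grad_f1(x): return [2.0 * x[0], 2.0 * x[1]]
def grad_f2(x): return [2.0 * (x[0] - 1.0), 2.0 * x[1]]

def direction(x):
    # s(x): minus the minimal-norm element of the gradient segment
    g1, g2 = grad_f1(x), grad_f2(x)
    d = [b - a for a, b in zip(g1, g2)]
    dd = sum(v * v for v in d)
    t = 0.0 if dd == 0.0 else min(1.0, max(0.0, -sum(a * v for a, v in zip(g1, d)) / dd))
    return [-(a + t * v) for a, v in zip(g1, d)]

x, h = [2.0, 1.5], 0.05          # initial point and Euler step (h < 1/L)
history = [(f1(x), f2(x))]
for _ in range(200):
    s = direction(x)
    x = [xi + h * si for xi, si in zip(x, s)]
    history.append((f1(x), f2(x)))
# the trajectory approaches the weak Pareto set [0, 1] x {0},
# and both objective values are nonincreasing along the way
```

For this pair the trajectory settles at (1, 0), a weak Pareto point; the monotone decrease of both objectives is exactly the cooperative property of the dynamic.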
  23. The (multi-objective) Steepest Descent dynamic. Example: (SD) ẋ(t) = s(x(t)) with f1(x) = ‖x‖² and f2(x) = x1.
  29. The (multi-objective) Steepest Descent dynamic. Example: (SD) ẋ(t) = s(x(t)) with f1(x) = x1² and f2(x) = x2².
  32. The (multi-objective) Steepest Descent dynamic. Main results (Attouch, G., Goudou, 2014).

    A cooperative dynamic: let x : R+ −→ H be a solution of (SD) ẋ(t) = s(x(t)). For all i = 1, ..., m, the function t ↦ fi(x(t)) is decreasing. Convergence in the convex case: assume that the objective functions are convex. Then any bounded trajectory weakly converges to a weak Pareto point. Existence: suppose that H is finite dimensional. Then, for any initial data, there exists a global solution to (SD).
  35. The (multi-objective) Steepest Descent dynamic. Going further. In case of a

    convex constraint C ⊂ H: (SD) ẋ(t) + (N_C(x(t)) + co {∂C fi(x(t))}_i)⁰ = 0. Uniqueness? Yes, if {∇fi(x(·))}_{i=1,...,m} are affinely independent. Convergence to Pareto points? Guaranteed by endowing R^m with a different order (recall F = (f1, ..., fm) : H −→ R^m).
  38. Numerical results. Recovering the Pareto front. f1(x, y) = x + y,

    f2(x, y) = x² + y² + 1/x + 3e^{−100(x−0.3)²} + 3e^{−100(x−0.6)²}, (x, y) ∈ C = [0.1, 1]². (Figures: plot of F(C), F = (f1, f2) : C −→ R², and its Pareto front; gradient method (right) vs. scalarization method (left), 100 samples.)
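A discrete Pareto front like the one plotted above can be recovered from samples by filtering out dominated value vectors. The sketch below uses simple stand-in objectives with a genuine trade-off on C = [0.1, 1]² (not the talk's exact f2) and a naive O(n²) scan.

```python
# Recover the discrete Pareto front of a sampled image F(C)
# by removing dominated points.

def dominates(a, b):
    # a dominates b: a <= b componentwise, with at least one strict inequality
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    # keep the non-dominated value vectors (naive O(n^2) scan)
    return [p for p in points if not any(dominates(q, p) for q in points)]

# stand-in objectives with conflicting minimizers on C = [0.1, 1]^2
def f1(x, y): return x + y
def f2(x, y): return (1.0 - x) ** 2 + (1.0 - y) ** 2

grid = [(0.1 + 0.9 * i / 19, 0.1 + 0.9 * j / 19)
        for i in range(20) for j in range(20)]
values = [(f1(x, y), f2(x, y)) for x, y in grid]
front = pareto_front(values)
```

The individual minimizers of f1 and of f2 always survive the filter, and no two retained points dominate each other, which is the defining property of a (discrete) Pareto front.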
  41. Numerical results. Pareto selection with Tikhonov penalization. Can we select,

    among the weak Pareto points (= the zeros of x ↦ s(x)), the one closest to a desired state? −→ Tikhonov penalization: ẋ(t) − s(x(t)) + ε(t)(x(t) − x_d) = 0, with ε(t) ↓ 0 and ∫₀^∞ ε(t) dt = +∞. See the works of Attouch, Cabot, Czarnecki, Peypouquet (...) in the monotone case.
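The selection mechanism can be sketched numerically on the same toy pair as before (f1 = x1² + x2², f2 = (x1−1)² + x2², whose weak Pareto set is the segment [0, 1] × {0}): with ε(t) = 1/(1+t), the penalized Euler iteration should drift toward the Pareto point nearest the desired state x_d. All constants below are illustrative.

```python
# Euler discretization of the Tikhonov-penalized dynamic
# x'(t) = s(x(t)) - eps(t) * (x(t) - x_d), eps(t) = 1/(1+t).

def s_dir(x):
    # common steepest-descent direction for the toy pair
    # f1(x) = x1^2 + x2^2 and f2(x) = (x1 - 1)^2 + x2^2
    g1 = [2.0 * x[0], 2.0 * x[1]]
    g2 = [2.0 * (x[0] - 1.0), 2.0 * x[1]]
    d = [b - a for a, b in zip(g1, g2)]
    dd = sum(v * v for v in d)
    th = 0.0 if dd == 0.0 else min(1.0, max(0.0, -sum(a * v for a, v in zip(g1, d)) / dd))
    return [-(a + th * v) for a, v in zip(g1, d)]

x_d = [0.7, 0.3]     # desired state; the nearest weak Pareto point is (0.7, 0)
x, h, t = [0.2, 0.8], 0.05, 0.0
for _ in range(20000):
    eps = 1.0 / (1.0 + t)     # eps -> 0 slowly, with divergent integral
    s = s_dir(x)
    x = [xi + h * (si - eps * (xi - di)) for xi, si, di in zip(x, s, x_d)]
    t += h
# x should end up near (0.7, 0), the projection of x_d onto the Pareto set
```

The slow decay of ε matters: it must vanish (so the Pareto property is not destroyed) while its integral diverges (so the penalization has enough time to steer the selection).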
  44. Numerical results. Pareto selection with Tikhonov penalization. (Figures.)
  47. Convergence rates: empirical observation. ẋ(t) + ∇f(x(t)) = 0 vs. ẍ(t) + γ ẋ(t) + ∇f(x(t)) = 0.

    Inertia promotes: faster trajectories (with varying γ(t)); exploratory properties.
  50. Convergence rates: empirical observation. f1(x) = (Σ_{i=1}^{10} (xi² − 10cos(2πxi) + 10))^{1/4},

    f2(x) = (Σ_{i=1}^{10} ((xi − 1.5)² − 10cos(2π(xi − 1.5)) + 10))^{1/4}. Convergence rate of ‖F(xn) − F(x∞)‖_∞: Steepest Descent vs. Inertial Steepest Descent.
  51. Inertial (multi-objective) Steepest Descent. Let f1, ..., fm be smooth,

    with L-Lipschitz gradient. (ISD) ẍ(t) = −γ ẋ(t) + s(x(t)). Example: f1(x) = ‖x‖² and f2(x) = x1.
  53. Inertial (multi-objective) Steepest Descent. Main results (Attouch, G., 2015). Let

    f1, ..., fm be smooth, with L-Lipschitz gradient. (ISD) ẍ(t) = −γ ẋ(t) + s(x(t)). Assume that γ ≥ L. Existence: suppose that H is finite dimensional. Then, for any initial data, there exists a global solution to (ISD). Convergence in the convex case: let f1, ..., fm be convex. Then any bounded trajectory weakly converges to a weak Pareto point.
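A semi-implicit Euler scheme gives a quick sketch of (ISD) on the same toy convex pair used earlier (f1 = x1² + x2², f2 = (x1−1)² + x2², so L = 2), with damping γ ≥ L as in the theorem's assumption. Constants and step size are illustrative.

```python
# Semi-implicit Euler discretization of (ISD)
# x''(t) = -gamma * x'(t) + s(x(t)) with gamma >= L (= 2 here).

gamma = 2.5

def s_dir(x):
    # common steepest-descent direction for the toy pair
    # f1(x) = x1^2 + x2^2 and f2(x) = (x1 - 1)^2 + x2^2
    g1 = [2.0 * x[0], 2.0 * x[1]]
    g2 = [2.0 * (x[0] - 1.0), 2.0 * x[1]]
    d = [b - a for a, b in zip(g1, g2)]
    dd = sum(v * v for v in d)
    th = 0.0 if dd == 0.0 else min(1.0, max(0.0, -sum(a * v for a, v in zip(g1, d)) / dd))
    return [-(a + th * v) for a, v in zip(g1, d)]

x = [2.0, 1.5]        # position
v = [0.0, 0.0]        # velocity x'(t)
h = 0.02
for _ in range(5000):
    s = s_dir(x)
    v = [vi + h * (-gamma * vi + si) for vi, si in zip(v, s)]
    x = [xi + h * vi for xi, vi in zip(x, v)]
# the trajectory is damped toward a weak Pareto point on [0, 1] x {0},
# with the velocity vanishing in the limit
```

Unlike the first-order dynamic, the inertial trajectory can overshoot and oscillate before settling, which is the price of the exploratory behaviour mentioned above; the damping γ ≥ L is what guarantees convergence in the convex case.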
  56. Conclusion. The steepest descent provides a flexible tool once adapted

    to multi-objective optimization problems. Open questions: Uniqueness of the trajectories for ẋ(t) = s(x(t))? Understanding the asymptotic behaviour of ẋ(t) − s(x(t)) + ε(t)x(t) = 0 (the set of weak Pareto points is non-convex). Obtaining convergence rates for the first- and second-order dynamics (the critical values are not unique).
  60. Thank you for your attention!