
SHADE with Iterative Local Search for Large-Scale Global Optimization

dmolina
July 10, 2018


Global optimization is a very important topic in research due to its wide applications in many real-world problems in science and engineering. Among optimization problems, dimensionality is one of the most crucial issues that increases the difficulty of the optimization process. Thus, Large-Scale Global Optimization, optimization with a great number of variables, arises as a field that is attracting increasing interest. In this paper, we propose a new hybrid algorithm especially designed to tackle this type of optimization problems. The proposal combines, in an iterative way, a modern Differential Evolution algorithm with one local search method chosen from a set of different search methods. The selection of the local search method is dynamic and takes into account the improvement obtained by each of them in the previous intensification phase, to identify the most adequate one in each case for the problem. Experiments are carried out using the CEC'2013 Large-Scale Global Optimization benchmark, and the proposal is compared with other state-of-the-art algorithms, showing that the synergy among the different components of our proposal leads to better and more robust results than more complex algorithms. In particular, it improves the results of the current winner of previous Large-Scale Global Optimization competitions, Multiple Offspring Sampling (MOS), obtaining very good results, especially in the most difficult problems.


Transcript

  1. SHADE with Iterative Local Search for Large-Scale Global Optimization
     Daniel Molina¹, Antonio LaTorre², Francisco Herrera¹
     ¹ University of Granada, Spain; ² Universidad Politécnica de Madrid, Spain
  2. Large-Scale Optimization Problems
     Global Optimization: find x* such that f(x*) ≤ f(x) ∀x ∈ Domain.
     Real-parameter optimization: Domain ⊆ R^D, x* = [x_1, x_2, ..., x_D].
     Large-Scale Global Optimization (LSGO): when D ≥ 1000.
     The search domain grows exponentially with the dimension.
  3. Interest of LSGO Problems
     Real-world problems: optimization of many parameters.
     Scalability of algorithms: can existing algorithms scale? Are specific scalable optimization algorithms needed?
     Study of variable separability: some variables can have a stronger influence; completely separable or completely non-separable problems are unusual; there are different degrees of separability.
  4. MOS: Current State of the Art
     Features: combines several evolutionary algorithms and local searches; applies each one with an adaptive probability; includes specific local search methods for high dimensionality.
     Disadvantages: complex (9 components); several components are only useful for a few functions; many parameters, so it requires a lot of tuning.
     Citation: Antonio LaTorre, Santiago Muelas, José María Peña: A comprehensive comparison of large scale global optimizers. Inf. Sci. 316: 517-549 (2015)
  5. Previous Proposal: IHDELS
     Previous proposal (CEC'2015): Iterative Hybridization of DE with LS (IHDELS).
     Reference: Daniel Molina, Francisco Herrera: Iterative hybridization of DE with Local Search for the CEC'2015 special session on large scale global optimization. Proceedings of IEEE CEC 2015: 1974-1978 (2015)
  6. IHDELS Problems
     Problems of IHDELS: complex self-adaptation; a restart mechanism that was useless in practice.
     We designed the new proposal to simplify the algorithm while improving results at the same time.
  7. SHADE-ILS: Memetic Algorithm
     Evolutionary Algorithm (DE) for diversity; Local Search to enforce exploitation.
     Evolutionary Algorithm: explores efficiently; few parameters.
     Local Search: particularly good for high-dimensional problems; robust.
  8. Components: Evolutionary Algorithm
     Differential Evolution: SHADE.
     Parameters F and CR are adapted: F and CR values are sampled from a distribution around a stored mean; the means are self-adapted according to the fitness improvement obtained.
     Movement is guided by the best solutions. Scalable, exploring the search domain well. No population reduction: exploitation is enforced through the LS.
     Changes from IHDELS: IHDELS used SaDE; different mutation operator; different parameter adaptation.
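A minimal sketch of that parameter sampling, not the authors' code: the slide only says F and CR come from "a distribution and a mean" stored in a memory; the reference SHADE draws CR from a normal distribution and F from a Cauchy distribution, which is the assumption made here (the names memory_f, memory_cr, sample_f_cr are illustrative).

```python
import numpy as np

def sample_f_cr(memory_f, memory_cr, rng):
    """Sample F and CR for one individual from the historical memory.

    A random memory slot gives the means; CR is drawn from a normal
    distribution and F from a Cauchy distribution (Cauchy for F follows
    the reference SHADE and is an assumption here).
    """
    k = rng.integers(len(memory_f))                      # pick a random memory slot
    cr = float(np.clip(rng.normal(memory_cr[k], 0.1), 0.0, 1.0))
    f = 0.0
    while f <= 0.0:                                      # resample until F is positive
        f = memory_f[k] + 0.1 * np.tan(np.pi * (rng.random() - 0.5))
    return min(f, 1.0), cr

# Example usage with a memory of size 5 initialised to 0.5:
rng = np.random.default_rng(42)
print(sample_f_cr([0.5] * 5, [0.5] * 5, rng))
```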
  9. Components: Local Search Method
     Local search methods: MTS-LS1, a specific method for large-scale optimization; the well-known L-BFGS-B, for more robustness.
     Combination: both are complementary; each iteration applies only one of them; an adaptive method selects which one to apply in each iteration.
     Changes from IHDELS: the selection method is completely different.
  10. Why These Two? They Are Complementary
      MTS-LS1: explores dimension by dimension; quick; very sensitive to the coordinate system.
      L-BFGS-B: guided by a gradient estimation; not sensitive to the coordinate system.
  11. Global Scheme
      Flowchart of one iteration:
      1. Init the population and get the best solution.
      2. Apply SHADE for FE_DE evaluations.
      3. Choose the LS with the best last improvement ratio and apply it to the best solution for FE_LS evaluations.
      4. Update the improvement ratio of the applied LS.
      5. If a restart is required, perform the restart step (restart the population).
      6. Repeat until MaxEvals is reached.
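A minimal sketch of that iterative scheme, not the authors' code. The callables shade_step and the entries of local_searches are supplied by the caller and stand in for SHADE, MTS-LS1 and L-BFGS-B, which are not implemented here.

```python
import numpy as np

def shade_ils(fitness, dim, shade_step, local_searches, bounds=(-100.0, 100.0),
              max_evals=3_000_000, fe_de=25_000, fe_ls=25_000,
              popsize=100, restart_min=0.01, max_no_improvement=3, seed=None):
    """Illustrative sketch of the global scheme (hypothetical helper signatures).

    shade_step(fitness, population, best, evals) -> (population, best);
    local_searches: dict name -> callable(fitness, best, evals) -> best.
    """
    rng = np.random.default_rng(seed)
    population = rng.uniform(bounds[0], bounds[1], size=(popsize, dim))
    best = min(population, key=fitness)
    ratios = {name: np.inf for name in local_searches}    # force trying every LS once
    evals = no_improvement = 0
    while evals < max_evals:
        start_fitness = fitness(best)
        population, best = shade_step(fitness, population, best, fe_de)   # exploration
        ls_name = max(ratios, key=ratios.get)              # LS with best last improvement ratio
        before_ls = fitness(best)
        best = local_searches[ls_name](fitness, best, fe_ls)              # exploitation
        ratios[ls_name] = (before_ls - fitness(best)) / before_ls         # update its ratio
        evals += fe_de + fe_ls
        iter_ratio = (start_fitness - fitness(best)) / start_fitness
        no_improvement = 0 if iter_ratio >= restart_min else no_improvement + 1
        if no_improvement >= max_no_improvement:           # restart step
            population = rng.uniform(bounds[0], bounds[1], size=(popsize, dim))
            no_improvement = 0
    return best
```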
  12. Component: SHADE
      [Same global-scheme flowchart, with the SHADE step highlighted.]
  13. SHADE: Mutation (Initialization → Mutation → Crossover → Selection)
      Mutation strategy: u_i = x_i + F_i · (x_pbest − x_i) + F_i · (x_r1 − a_r2)
      x_pbest: random individual chosen among the pbest best individuals of the population.
      x_r1: random solution from the population.
      a_r2: random solution from Population ∪ A.
      F_i: randomly obtained following a normal distribution with mean F_meanK.
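An illustrative sketch of that current-to-pbest mutation, not the authors' code; the function and argument names are hypothetical.

```python
import numpy as np

def current_to_pbest_mutation(pop, archive, fitness_vals, i, f_i, p=0.1, rng=None):
    """Build the mutant v_i = x_i + F_i*(x_pbest - x_i) + F_i*(x_r1 - a_r2)."""
    rng = rng or np.random.default_rng()
    n = len(pop)
    # x_pbest: one of the ceil(p * n) best individuals, chosen at random
    n_pbest = max(1, int(round(p * n)))
    pbest_idx = rng.choice(np.argsort(fitness_vals)[:n_pbest])
    # x_r1 from the population, a_r2 from population + archive, distinct indices
    r1 = rng.choice([j for j in range(n) if j != i and j != pbest_idx])
    pool = np.vstack([pop, archive]) if len(archive) else pop
    r2 = rng.choice([j for j in range(len(pool)) if j != i and j != r1])
    return pop[i] + f_i * (pop[pbest_idx] - pop[i]) + f_i * (pop[r1] - pool[r2])
```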
  14. SHADE: Crossover (Initialization → Mutation → Crossover → Selection)
      Crossover (per component): u_i = v_i if rand[0,1] < CR_i, x_i otherwise.
      CR_i ← distribution with mean CR_meank; CR_meank is randomly obtained from the memory.
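A small sketch of that binomial crossover; forcing at least one mutant component (j_rand) is a standard DE detail assumed here.

```python
import numpy as np

def binomial_crossover(x, v, cr, rng=None):
    """Take each mutant component with probability CR, parent component otherwise."""
    rng = rng or np.random.default_rng()
    mask = rng.random(len(x)) < cr
    mask[rng.integers(len(x))] = True   # guarantee at least one mutant component
    return np.where(mask, v, x)
```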
  15. SHADE: Selection (Initialization → Mutation → Crossover → Selection)
      Selection: x_i^{t+1} = u_i if u_i improves x_i^t, otherwise x_i^t.
      Update of the CR and F means: the F and CR means are updated periodically; several means are stored in a memory (more diversity).
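A sketch of the selection plus the periodic memory update. The slide only says the means are updated from the fitness obtained; the aggregation below (Lehmer mean for F, arithmetic mean for CR, unweighted) is a simplifying assumption, and all names are illustrative.

```python
import numpy as np

def selection_and_memory_update(pop, trials, fit_pop, fit_trials,
                                used_f, used_cr, memory_f, memory_cr, k):
    """Greedy selection, then store the means of the successful F/CR in slot k."""
    improved = fit_trials < fit_pop
    pop[improved] = trials[improved]
    fit_pop[improved] = fit_trials[improved]
    if improved.any():
        sf, scr = used_f[improved], used_cr[improved]
        memory_f[k] = np.sum(sf ** 2) / np.sum(sf)     # Lehmer mean of successful F (assumption)
        memory_cr[k] = np.mean(scr)                    # mean of successful CR (assumption)
        k = (k + 1) % len(memory_f)                    # advance the memory slot cyclically
    return pop, fit_pop, memory_f, memory_cr, k
```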
  16. LS Method
      [Same global-scheme flowchart, with the LS step highlighted.]
  17. Method of Local Search
      MTS-LS1:
      Define a random group C of variables to improve.
      Define SR = 10% (local search step ratio).
      Update the solution x during Istr evaluations, ∀i ∈ C:
        1. x'_i ← x_i + SR.
        2. If fitness(x') ≤ fitness(x), x ← x', go to step 1.
        3. Otherwise, x'_i ← x_i − 0.5 · SR.
        4. If fitness(x') ≤ fitness(x), x ← x', go to step 1.
        5. Reduce SR (SR ← SR/2) if x did not improve.
      L-BFGS-B: well-known mathematical method; approximates the gradient at each point.
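A simplified sketch that follows the steps listed on the slide; the full MTS-LS1 keeps a step size per dimension and other details not reproduced here, and the bounds and names are illustrative.

```python
import numpy as np

def mts_ls1(fitness, x, max_evals, sr_ratio=0.1, lower=-100.0, upper=100.0, rng=None):
    """Dimension-by-dimension local search, single shared step size SR (simplification)."""
    rng = rng or np.random.default_rng()
    x = x.copy()
    fx = fitness(x)
    sr = sr_ratio * (upper - lower)
    evals = 0
    dims = rng.permutation(len(x))          # random order of the variables to improve
    while evals < max_evals:
        improved = False
        for i in dims:
            old = x[i]
            x[i] = old + sr                  # try a positive step
            fnew = fitness(x); evals += 1
            if fnew <= fx:
                fx, improved = fnew, True
                continue
            x[i] = old - 0.5 * sr            # otherwise try a smaller negative step
            fnew = fitness(x); evals += 1
            if fnew <= fx:
                fx, improved = fnew, True
            else:
                x[i] = old                   # restore the component
        if not improved:
            sr /= 2.0                        # shrink the step when nothing improved
    return x, fx
```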
  18. Selection of the Local Search
      [Same global-scheme flowchart, with the LS selection step highlighted.]
  19. LS Selection
      IHDELS applied an adaptive probability.
      Initial probability: P_{LS_M} = 1 / |LS|, ∀M ∈ LS.
      Update (after Freq_LS evaluations): P_{LS_M} = I_{LS_M} / Σ_{m ∈ LS} I_{LS_m}, with I_{LS_M} = Σ_{i=1}^{Freq_LS} Improvement_{LS_M}.
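A small sketch of that probability update (the IHDELS model); the names and the uniform fallback when no LS improved are illustrative assumptions.

```python
import numpy as np

def ihdels_probabilities(improvements):
    """Probability of applying each LS, proportional to its accumulated improvement I_LS."""
    total = sum(improvements.values())
    if total == 0:                       # no improvement at all: fall back to uniform
        return {name: 1.0 / len(improvements) for name in improvements}
    return {name: value / total for name, value in improvements.items()}

# Example: pick the next LS according to those probabilities.
probs = ihdels_probabilities({"mts-ls1": 0.7, "l-bfgs-b": 0.3})
rng = np.random.default_rng(0)
next_ls = rng.choice(list(probs), p=list(probs.values()))
```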
  20. Local Search Selection
      Problems of the previous model: complex to compute; minimum probabilities must be handled with care (to avoid always choosing the same LS).
      SHADE-ILS model:
        1. Each time an LS is applied, obtain its improvement ratio.
        2. Store that ratio for each LS method: Ratio_{LS_M} = (Fitness_old − Fitness_new) / Fitness_old.
        3. In each iteration, apply the LS with the best last improvement ratio.
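A minimal sketch of the SHADE-ILS selection rule; the data structure (a dict of last ratios) is an illustrative choice.

```python
def choose_ls(last_ratios):
    """Pick the LS with the best improvement ratio from its last application."""
    return max(last_ratios, key=last_ratios.get)

def update_ratio(last_ratios, name, f_old, f_new):
    """Store Ratio_LS = (Fitness_old - Fitness_new) / Fitness_old for that LS."""
    last_ratios[name] = (f_old - f_new) / f_old
    return last_ratios

# Example matching the next slide: after MTS-LS1 scores 0.78 and L-BFGS-B 0.8,
# the next iteration applies L-BFGS-B.
print(choose_ls({"mts-ls1": 0.78, "l-bfgs-b": 0.80}))
```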
  21. Example
      1. Apply MTS-LS1 ⇒ Ratio_MTS = 0.9.
      2. Apply L-BFGS-B ⇒ Ratio_bfgs = 0.8.
      3. Apply MTS-LS1 ⇒ Ratio_MTS = 0.85.
      4. Apply MTS-LS1 ⇒ Ratio_MTS = 0.78.
      5. Apply L-BFGS-B ⇒ Ratio_bfgs = ...
      6. ...
      Advantages: much simpler; better results.
  22. Restart Mechanism
      IHDELS model: restart when there is no improvement in an iteration. It was almost never applied and never improved the results.
      SHADE-ILS model: it counts the iterations that do not achieve a minimum improvement ratio (1%), and restarts when 3 consecutive iterations fail to improve by that 1%.
      Partial restarts in each iteration: LS not improving ⇒ restart the LS parameters (SR); SHADE not improving ⇒ randomly restart the population.
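A small sketch of the restart bookkeeping described above; the history list is an illustrative structure.

```python
def should_restart(history, min_ratio=0.01, max_iters=3):
    """True when the last `max_iters` per-iteration improvement ratios are all below `min_ratio`."""
    if len(history) < max_iters:
        return False
    return all(ratio < min_ratio for ratio in history[-max_iters:])

# Example: three iterations in a row below 1% trigger a restart.
print(should_restart([0.20, 0.005, 0.003, 0.009]))   # True
```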
  23. Benchmark
      About the benchmark: 15 functions with dimension 1000.
      Different groups of variables:
        Fully separable: F1 - F3.
        Partially separable: F4 - F11.
        Overlapping functions: F12 - F14.
        Non-separable: F15.
      Runs of 3 · 10^6 evaluations.
      Results measured at different evaluation counts: 1.25 · 10^5 (4%), 6 · 10^5 (20%), 3 · 10^6 (100%).
  24. SHADEILS Parameters
      Parameter       Description                        Value
      popsize_DE      Population size                    100
      FE_DE           Evaluations for each DE run        25000
      FE_LS           Evaluations for each LS run        25000
      MTS_SR          Initial step size for MTS-LS1      10%
      Restart_min     Minimum improvement                1%
      Iter_noimprov   Iterations without improvement     3
  25. Influence of the Different Components
      Func.   SHADE+New Restart   SaDE+New Restart   SHADE+Old Restart   IHDELS
      F1      2.69e-24            1.21e-24           1.76e-28            4.80e-29
      F2      1.00e+03            1.26e+03           1.40e+03            1.27e+03
      F3      2.01e+01            2.01e+01           2.01e+01            2.00e+01
      F4      1.48e+08            1.58e+08           2.99e+08            3.09e+08
      F5      1.39e+06            3.07e+06           1.76e+06            9.68e+06
      F6      1.02e+06            1.03e+06           1.03e+06            1.03e+06
      F7      7.41e+01            8.35e+01           2.44e+02            3.18e+04
      F8      3.17e+11            3.59e+11           8.55e+11            1.36e+12
      F9      1.64e+08            2.48e+08           2.09e+08            7.12e+08
      F10     9.18e+07            9.19e+07           9.25e+07            9.19e+07
      F11     5.11e+05            4.76e+05           5.20e+05            9.87e+06
      F12     6.18e+01            1.10e+02           3.42e+02            5.16e+02
      F13     1.00e+05            1.34e+05           9.61e+05            4.02e+06
      F14     5.76e+06            6.14e+06           7.40e+06            1.48e+07
      F15     6.25e+05            8.69e+05           1.01e+06            3.13e+06
      Better  12                  1                  0                   2
  26. Influence of the Different Components
      Differences due to the new restart mechanism (using TACO).
      [Convergence plots of mean error vs. evaluations, new restart vs. old restart, for functions F4, F9, F5 and F12.]
  27. Comparison against MOS, MaxEvals = 1.25 · 10^5 (4%)
      Functions   SHADEILS   MOS
      F1          6.10e+04   2.71e+07
      F2          2.65e+03   2.64e+03
      F3          2.03e+01   7.85e+00
      F4          3.13e+10   3.47e+10
      F5          2.50e+06   6.96e+06
      F6          1.05e+06   3.11e+05
      F7          3.95e+08   3.46e+08
      F8          2.12e+14   3.72e+14
      F9          2.88e+08   4.29e+08
      F10         9.43e+07   1.16e+06
      F11         6.55e+09   3.13e+09
      F12         2.67e+03   1.16e+04
      F13         1.29e+10   8.37e+09
      F14         1.62e+11   4.61e+10
      F15         9.12e+07   1.45e+07
      Best        6          9
  28. Comparison against MOS, MaxEvals = 6 · 10^5 (20%)
      Functions   SHADEILS   MOS
      F1          3.71e-23   3.48e+00
      F2          1.80e+03   1.78e+03
      F3          2.01e+01   1.33e-10
      F4          1.54e+09   2.56e+09
      F5          2.29e+06   6.95e+06
      F6          1.04e+06   1.48e+05
      F7          9.25e+05   8.19e+06
      F8          6.93e+12   8.41e+13
      F9          2.50e+08   3.84e+08
      F10         9.29e+07   9.03e+05
      F11         1.37e+08   8.05e+08
      F12         1.28e+03   2.20e+03
      F13         5.68e+07   8.10e+08
      F14         6.97e+07   2.03e+08
      F15         1.22e+07   6.26e+06
      Best        10         5
  29. Comparison against MOS, MaxEvals = 3 · 10^6 (100%)
      Functions   SHADEILS   MOS
      F1          2.69e-24   0.00e+00
      F2          1.00e+03   8.32e+02
      F3          2.01e+01   9.17e-13
      F4          1.48e+08   1.74e+08
      F5          1.39e+06   6.94e+06
      F6          1.02e+06   1.48e+05
      F7          7.41e+01   1.62e+04
      F8          3.17e+11   8.00e+12
      F9          1.64e+08   3.83e+08
      F10         9.18e+07   9.02e+05
      F11         5.11e+05   5.22e+07
      F12         6.18e+01   2.47e+02
      F13         1.00e+05   3.40e+06
      F14         5.76e+06   2.56e+07
      F15         6.25e+05   2.35e+06
      Best        10         5
  30. Comparison against MOS, Considering the Number of Evaluations
      Alg         4%   20%   100%
      MOS         9    5     5
      SHADEILS    6    10    10
      Conclusions about SHADEILS: from 20% of the evaluations onwards it obtains better results; better in the more complex functions (worse in the separable ones); very competitive in overlapping/non-separable functions.
  31. Improvement in the More Complex Functions
      Ratio of improvement:
      Fun   MOS ⇒ SHADEILS      4%       20%      100%
      F7    1.6e+4 ⇒ 7.4e+1     -12.7%   88.6%    99.4%
      F8    8e+12 ⇒ 3.1e+11     53.8%    91.8%    96.0%
      F9    3.8e+8 ⇒ 1.6e+8     23.3%    34.9%    57.2%
      F10   9.0e+5 ⇒ 9.2e+7     -95.4%   -99.0%   -99.0%
      F11   5.2e+7 ⇒ 5.1e+5     -45.8%   83.0%    99.0%
      F12   2.5e+2 ⇒ 6.2e+1     77.6%    41.8%    74.9%
      F13   3.4e+6 ⇒ 1.0e+5     -38.8%   93.0%    97.1%
      F14   2.6e+7 ⇒ 5.8e+6     -61.7%   65.5%    77.3%
      F15   2.3e+6 ⇒ 6.2e+5     -87.7%   -48.7%   73.2%
      Ratio = 100 · (Error_MOS − Error_SHADEILS) / max(Error_MOS, Error_SHADEILS)
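The ratio formula above can be checked directly; a short worked example using the errors reported in the tables:

```python
def improvement_ratio(error_mos, error_shadeils):
    """Ratio = 100 * (Error_MOS - Error_SHADEILS) / max(Error_MOS, Error_SHADEILS).

    Positive values mean SHADE-ILS improves over MOS; negative values the opposite.
    """
    return 100.0 * (error_mos - error_shadeils) / max(error_mos, error_shadeils)

# Example with the F10 errors at 100% of the evaluations (slide 29):
print(round(improvement_ratio(9.02e+05, 9.18e+07), 1))   # -99.0, as in the table
```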
  32. Comparing with the CEC'2013 Criterion
      For each function: algorithms are sorted by their fitness/error; algorithms are ranked in that order; each algorithm receives points based on its ranking (more points for the best ones).
      Global result: the scores over all functions are accumulated.
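A hypothetical sketch of that scoring scheme. The slide does not give the point table; Formula-1 style points, as used in the CEC LSGO comparisons, are assumed here, and all names are illustrative.

```python
POINTS = [25, 18, 15, 12, 10, 8, 6, 4, 2, 1]   # assumed point table

def competition_scores(errors_per_function):
    """Accumulate points per algorithm: rank by error on each function, award POINTS in order."""
    totals = {}
    for errors in errors_per_function:
        ranking = sorted(errors, key=errors.get)           # best (lowest error) first
        for position, alg in enumerate(ranking):
            points = POINTS[position] if position < len(POINTS) else 0
            totals[alg] = totals.get(alg, 0) + points
    return totals

# Example with two functions and three algorithms:
print(competition_scores([
    {"SHADE-ILS": 1e2, "MOS": 3e2, "VMODE": 9e5},
    {"SHADE-ILS": 5e5, "MOS": 2e3, "VMODE": 8e6},
]))
```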
  33. Comparison with the CEC'2013 Criterion (accuracy: 1.200e+05 evaluations)
      [Bar chart of accumulated scores for IHDELS_2015, MOS, SHADE-ILS and VMODE, broken down by category: unimodal, functions with a separable subcomponent, functions with no separable subcomponents, overlapping functions, non-separable functions.]
  34. Comparison with the CEC'2013 Criterion (accuracy: 6.000e+05 evaluations)
      [Same bar chart of accumulated scores per algorithm and function category, at 6 · 10^5 evaluations.]
  35. Comparison with the CEC'2013 Criterion (accuracy: 3.000e+06 evaluations)
      [Same bar chart of accumulated scores per algorithm and function category, at 3 · 10^6 evaluations.]
  36. Conclusions
      We have proposed a new algorithm for LSGO: SHADE-ILS.
      It iteratively applies DE + LS, with SHADE as the DE algorithm, and in each iteration it selects the LS with the best last improvement ratio.
      Results: SHADE-ILS is more competitive, especially in the more complex functions, and beats the previous state of the art, MOS.