Slide 1

SHADE with Iterative Local Search for Large-Scale Global Optimization
Daniel Molina¹, Antonio LaTorre², Francisco Herrera¹
¹ University of Granada, Spain
² Universidad Politécnica de Madrid, Spain

Slide 2

Contents
1. Introduction
2. SHADE-ILS
3. Results and comparisons

Slide 3

Large-Scale Global Optimization Problems
- Global optimization: find x* such that f(x*) ≤ f(x) ∀x ∈ Domain.
- Real-parameter optimization: Domain ⊆ R^D, x* = [x1, x2, ..., xD].
- Large-Scale Global Optimization (LSGO): when D ≥ 1000.
- The search domain grows exponentially with the dimension.

Slide 4

Interest of LSGO problems
- Real-world problems: optimization of many parameters.
- Scalability of algorithms: can existing algorithms scale? Are specific scalable optimization algorithms needed?
- Study of variable separability: several variables can have a stronger influence. Completely separable or completely non-separable problems are unusual, so problems show different degrees of separability.

Slide 5

MOS: the current state of the art
Features:
- Combines several evolutionary algorithms and local searches.
- Applies each one with an adaptive probability.
- Includes specific local search methods for high dimensionality.
Disadvantages:
- Complex: 9 components.
- Several components are useful only for a few functions.
- Many parameters: it requires a lot of tuning.
Reference: Antonio LaTorre, Santiago Muelas, José María Peña: A comprehensive comparison of large scale global optimizers. Inf. Sci. 316: 517-549 (2015).

Slide 6

Previous proposal: IHDELS
Previous proposal (CEC'2015): Iterative Hybridization of DE with Local Search (IHDELS).
Reference: Daniel Molina, Francisco Herrera: Iterative hybridization of DE with Local Search for the CEC'2015 special session on large scale global optimization. Proceedings of IEEE CEC 2015: 1974-1978 (2015).

Slide 7

IHDELS problems
Problems:
- Complex self-adaptation.
- The restart mechanism was useless.
We designed the new proposal to:
- Simplify the algorithm.
- Improve results at the same time.

Slide 8

Contents
1. Introduction
2. SHADE-ILS
3. Results and comparisons

Slide 9

SHADE-ILS
Memetic algorithm:
- An evolutionary algorithm (DE) for diversity.
- Local search to enforce exploitation.
Evolutionary algorithm: explores efficiently, with few parameters.
Local search: particularly good for high-dimensional problems; robust.

Slide 10

Components: Evolutionary Algorithm
Differential Evolution: SHADE.
- Parameters F and CR are adapted: F and CR values are drawn from a distribution around a mean, and the means self-adapt according to the fitness obtained.
- Movement is guided by the best solutions.
- Scalable, exploring the search domain well.
- No population reduction: exploitation is enforced through the LS.
Changes from IHDELS:
- IHDELS used SaDE.
- Different mutation operator.
- Different parameter adaptation.
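The parameter adaptation above can be sketched as follows. The slide only says that F and CR are drawn from a distribution around stored means; the concrete choices below (a random memory slot, a Cauchy distribution with scale 0.1 for F, a normal distribution for CR) follow standard SHADE and are an assumption here, not details taken from the slides.

```python
import math
import random

def sample_f_cr(memory_f, memory_cr, rng=random):
    """Sample (F, CR) for one individual, SHADE-style (a sketch).

    A memory slot k is picked at random; F is drawn from a Cauchy
    distribution and CR from a normal distribution around the stored
    means -- the distributions used by standard SHADE (assumption)."""
    k = rng.randrange(len(memory_f))
    # F ~ Cauchy(memory_f[k], 0.1), regenerated while non-positive, capped at 1.
    f = 0.0
    while f <= 0.0:
        f = memory_f[k] + 0.1 * math.tan(math.pi * (rng.random() - 0.5))
    f = min(f, 1.0)
    # CR ~ Normal(memory_cr[k], 0.1), clipped to [0, 1].
    cr = min(1.0, max(0.0, rng.gauss(memory_cr[k], 0.1)))
    return f, cr
```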

Slide 11

Components: Local Search Method
Local search methods:
- MTS-LS1, a method specifically designed for large-scale optimization.
- The well-known L-BFGS-B, for more robustness.
Combination:
- Both are complementary.
- Each iteration applies only one of them.
- An adaptive method selects which one to apply in each iteration.
Changes from IHDELS: the selection method is completely different.

Slide 12

Why these two? They are complementary.
MTS-LS1: explores dimension by dimension; quick; very sensitive to the coordinate system.
L-BFGS-B: guided by a gradient estimation; not sensitive to the coordinate system.

Slide 13

Global Scheme
[Flowchart: initialize the population; repeat until MaxEvals: apply SHADE for FE_DE evaluations, choose the LS with the best last improvement ratio, apply it to the best solution for FE_LS evaluations, update the improvement ratios; restart the population when required.]
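The control flow in the flowchart can be sketched as below. The DE phase and both local searches are replaced by one crude random-probe stand-in (`improve`), so only the loop structure, the ratio-based LS choice, and the stagnation-based restart reflect the slides; operator details, budgets, and step sizes here are placeholders.

```python
import random

def shade_ils_loop(fitness, dim, max_evals, seed=None):
    """Toy sketch of the SHADE-ILS global scheme (stand-in operators)."""
    rng = random.Random(seed)
    lo, hi = -100.0, 100.0

    def random_solution():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    def improve(x, fx, step, budget):
        # Stand-in for SHADE / MTS-LS1 / L-BFGS-B: random local probes.
        for _ in range(budget):
            y = [xi + rng.uniform(-step, step) for xi in x]
            fy = fitness(y)
            if fy < fx:
                x, fx = y, fy
        return x, fx

    best = random_solution()
    best_fit = fitness(best)
    # inf ratios force each LS to be tried once before ratios take over.
    ratios = {"MTS-LS1": float("inf"), "L-BFGS-B": float("inf")}
    steps = {"MTS-LS1": 10.0, "L-BFGS-B": 1.0}
    evals, failures = 0, 0
    while evals < max_evals:
        # DE phase (FE_DE evaluations in the real algorithm).
        best, best_fit = improve(best, best_fit, 5.0, 50)
        # LS phase: the method with the best last improvement ratio.
        ls = max(ratios, key=ratios.get)
        before = best_fit
        best, best_fit = improve(best, best_fit, steps[ls], 50)
        ratios[ls] = (before - best_fit) / abs(before) if before else 0.0
        evals += 100
        # Restart after 3 iterations below a 1% improvement ratio.
        failures = 0 if ratios[ls] >= 0.01 else failures + 1
        if failures >= 3:
            cand = random_solution()
            cand_fit = fitness(cand)
            if cand_fit < best_fit:   # keep the best-so-far otherwise
                best, best_fit = cand, cand_fit
            failures = 0
    return best, best_fit
```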

Slide 14

Component: SHADE
[The global-scheme flowchart again, with the SHADE step (apply SHADE for FE_DE evaluations) highlighted.]

Slide 15

SHADE: Mutation
Pipeline: Initialization → Mutation → Crossover → Selection.
Mutation strategy:
u_i = x_i + F_i · (x_pbest − x_i) + F_i · (x_r1 − a_r2)
- x_pbest: a random individual chosen from the pbest best individuals in the population.
- x_r1: a random solution from the population.
- a_r2: a random solution from Population ∪ A.
- F_i: randomly obtained following a distribution with mean F_meanK.
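The mutation formula above can be sketched directly. This is an illustrative helper, not the authors' code: `pop` is a list of vectors, `fits` their fitness values (lower is better), `archive` the external archive A, and the fraction `p` of the population treated as "pbest" is an assumed parameter.

```python
import random

def current_to_pbest1(pop, fits, archive, i, f, p=0.1):
    """Sketch of u_i = x_i + F_i*(x_pbest - x_i) + F_i*(x_r1 - a_r2)."""
    n = len(pop)
    # x_pbest: random individual among the p*N best of the population.
    n_best = max(1, int(p * n))
    best_idx = sorted(range(n), key=lambda j: fits[j])[:n_best]
    pbest = pop[random.choice(best_idx)]
    # x_r1: random population member distinct from x_i.
    xr1 = pop[random.choice([j for j in range(n) if j != i])]
    # a_r2: random member of Population U Archive, distinct from x_i and x_r1.
    pool = [v for v in pop + archive if v is not pop[i] and v is not xr1]
    ar2 = random.choice(pool)
    return [x + f * (pb - x) + f * (a - b)
            for x, pb, a, b in zip(pop[i], pbest, xr1, ar2)]
```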

Slide 16

SHADE: Crossover
Pipeline: Initialization → Mutation → Crossover → Selection.
Crossover:
u_i = v_i if rand[0,1) < CR_i, otherwise x_i
- CR_i is drawn from a distribution with mean CRmean_k.
- CRmean_k is randomly obtained from the memory.
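A minimal sketch of this crossover rule follows. Forcing one component (`jrand`) to come from the mutant is standard DE practice and an addition not stated on the slide.

```python
import random

def binomial_crossover(x, v, cr, rng=random):
    """Sketch of DE binomial crossover: each component of the trial comes
    from the mutant v with probability CR, otherwise from x; component
    jrand is always taken from v (standard DE convention, assumed)."""
    jrand = rng.randrange(len(x))
    return [v[j] if (j == jrand or rng.random() < cr) else x[j]
            for j in range(len(x))]
```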

Slide 17

SHADE: Selection
Pipeline: Initialization → Mutation → Crossover → Selection.
Selection:
x_i^{t+1} = u_i if u_i improves x_i^t, otherwise x_i^t
Update of the CR and F means:
- The F and CR means are updated periodically.
- Several means are stored in a memory (more diversity).
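The periodic update of one stored mean can be sketched as below. The slide only says the means self-adapt by the fitness obtained; the concrete rule here (weighted Lehmer mean of the successful F values, weighted by the fitness improvement each produced) is the one used by standard SHADE and is an assumption in this context.

```python
def update_f_memory(memory, k, good_f, improvements):
    """Replace memory slot k with the weighted Lehmer mean of the F values
    (`good_f`) that produced an improvement; `improvements` are the fitness
    gains used as weights. Slot k is left unchanged if nothing succeeded."""
    if good_f:
        num = sum(w * f * f for w, f in zip(improvements, good_f))
        den = sum(w * f for w, f in zip(improvements, good_f))
        memory[k] = num / den
    return memory
```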

Slide 18

LS Method
[The global-scheme flowchart again, with the local search step (apply LS to the best solution for FE_LS evaluations) highlighted.]

Slide 19

Method of Local Search
MTS-LS1:
- Define C, a random group of variables to improve.
- Define SR = 10% of the search range as the local search step.
- Update the solution x for Istr evaluations, ∀i ∈ C:
  1. x_i' ← x_i + SR.
  2. If fitness(x') ≤ fitness(x): x ← x', go to step 1.
  3. Otherwise, x_i' ← x_i − 0.5 · SR.
  4. If fitness(x') ≤ fitness(x): x ← x', go to step 1.
  5. If x did not improve, reduce SR (SR ← SR/2).
L-BFGS-B:
- Well-known mathematical method.
- Approximates the gradient at each point.
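The MTS-LS1 steps above can be sketched as follows. This sketch sweeps every dimension rather than a random group C, and budgets evaluations loosely; both simplifications are assumptions for illustration.

```python
def mts_ls1(x, fitness, sr, max_evals):
    """Sketch of MTS-LS1 following the slide's steps: per dimension, probe
    x_i + SR; if that does not improve, probe x_i - 0.5*SR; halve SR when
    a full sweep brings no improvement."""
    x = list(x)
    fx = fitness(x)
    evals = 1
    while evals < max_evals:
        improved = False
        for i in range(len(x)):
            old = x[i]
            x[i] = old + sr                  # step 1: probe x_i + SR
            fy = fitness(x); evals += 1
            if fy <= fx:                     # step 2: accept
                fx = fy
                improved = True
                continue
            x[i] = old - 0.5 * sr            # step 3: probe x_i - 0.5*SR
            fy = fitness(x); evals += 1
            if fy <= fx:                     # step 4: accept
                fx = fy
                improved = True
            else:
                x[i] = old                   # revert on failure
        if not improved:
            sr *= 0.5                        # step 5: reduce SR
    return x, fx
```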

Slide 20

Selection of the Local Search
[The global-scheme flowchart again, with the LS-selection step (choose the LS by its best last improvement ratio) highlighted.]

Slide 21

LS Selection
IHDELS applied an adaptive probability.
Initial probability:
P_LSM = 1 / |LS|, ∀M ∈ LS
Update (after FreqLS evaluations):
P_LSM = I_LSM / Σ_{m ∈ LS} I_LSm, where I_LSM = Σ_{i=1}^{FreqLS} Improvement_LSM(i)
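The IHDELS update rule can be sketched in a few lines. `improvements` maps each LS method to its accumulated improvement I_LSM over the last FreqLS evaluations; falling back to uniform probabilities when nothing improved is an assumption added here to keep the sketch well-defined.

```python
def update_probabilities(improvements):
    """IHDELS-style update (sketch): P_LSM = I_LSM / sum_m I_LSm;
    uniform probabilities when no method improved (assumption)."""
    total = sum(improvements.values())
    if total == 0:
        n = len(improvements)
        return {m: 1.0 / n for m in improvements}
    return {m: v / total for m, v in improvements.items()}
```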

Slide 22

Local Search selection
Problems of the probability-based scheme:
- Complex to calculate.
- Minimum probabilities must be handled with care, to avoid always choosing the same method.
SHADE-ILS model:
1. Each time an LS is applied, obtain its improvement ratio.
2. Store that ratio for each LS method: Ratio_LSM = (Fitness_old − Fitness_new) / Fitness_old.
3. In each iteration, apply the LS with the best last improvement ratio.

Slide 23

Example
1. Apply MTS-LS1 ⇒ Ratio_MTS = 0.9.
2. Apply L-BFGS-B ⇒ Ratio_bfgs = 0.8.
3. Apply MTS-LS1 ⇒ Ratio_MTS = 0.85.
4. Apply MTS-LS1 ⇒ Ratio_MTS = 0.78.
5. Apply L-BFGS-B ⇒ Ratio_bfgs = ...
6. ...
Advantages: a lot easier; better results.
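The example sequence can be replayed mechanically: pick the LS with the best last ratio, then overwrite its ratio with the new one. The helper below is illustrative; the seeded ratios and the two observed MTS-LS1 outcomes (0.85, 0.78) are taken from the slide's example.

```python
def replay_example():
    """Replay the slide's example of ratio-based LS selection.

    Both methods are applied once to seed their ratios; afterwards the
    method with the best last improvement ratio is chosen each time."""
    ratios = {"MTS-LS1": 0.9, "L-BFGS-B": 0.8}   # after steps 1 and 2
    mts_outcomes = [0.85, 0.78]                  # observed new MTS ratios
    trace = []
    for new_ratio in mts_outcomes:
        ls = max(ratios, key=ratios.get)         # best last ratio wins
        trace.append(ls)
        ratios[ls] = new_ratio
    trace.append(max(ratios, key=ratios.get))    # next choice after 0.78
    return trace
```

Running it reproduces steps 3-5 of the example: MTS-LS1 twice, then L-BFGS-B once its ratio (0.8) beats the decayed MTS-LS1 ratio (0.78).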

Slide 24

Restart mechanism
IHDELS model:
- Restart when there is no improvement in an iteration.
- It was almost never applied and never improved results.
SHADE-ILS model:
- Counts the iterations that do not achieve a minimum ratio of improvement (1%).
- Restarts when, after 3 iterations, that 1% is never reached.
Partial restart in each iteration:
- If the LS is not improving ⇒ restart the LS parameters (SR).
- If SHADE is not improving ⇒ randomly restart the population.
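The SHADE-ILS restart test reduces to a small predicate over the recent improvement ratios; the helper name and the history-list interface are illustrative choices, while the 1% threshold and 3-iteration patience come from the slides.

```python
def should_restart(ratio_history, min_ratio=0.01, patience=3):
    """Restart when the last `patience` iterations all improved by less
    than `min_ratio` (1% in SHADE-ILS)."""
    if len(ratio_history) < patience:
        return False
    return all(r < min_ratio for r in ratio_history[-patience:])
```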

Slide 25

Contents
1. Introduction
2. SHADE-ILS
3. Results and comparisons

Slide 26

Benchmark
About the benchmark:
- 15 functions with dimension 1000.
- Different groups of variables:
  - Fully separable: F1-F3.
  - Partially separable: F4-F11.
  - Overlapping functions: F12-F14.
  - Non-separable: F15.
- Run for 3·10^6 evaluations.
- Results measured at different evaluation counts: 1.25·10^5 (4%), 6·10^5 (20%), 3·10^6 (100%).

Slide 27

SHADE-ILS parameter values

Parameter       Description                       Value
popsize         DE population size                100
FE_DE           Evaluations for each DE run       25000
FE_LS           Evaluations for each LS run       25000
MTS_SR          Initial step size for MTS-LS1     10%
Restart_min     Minimum improvement               1%
Iter_noimprov   Iterations without improvement    3

Slide 28

Influence of the different components

Func    SHADE + New Restart   SaDE + New Restart   SHADE + Old Restart   IHDELS
F1      2.69e-24              1.21e-24             1.76e-28              4.80e-29
F2      1.00e+03              1.26e+03             1.40e+03              1.27e+03
F3      2.01e+01              2.01e+01             2.01e+01              2.00e+01
F4      1.48e+08              1.58e+08             2.99e+08              3.09e+08
F5      1.39e+06              3.07e+06             1.76e+06              9.68e+06
F6      1.02e+06              1.03e+06             1.03e+06              1.03e+06
F7      7.41e+01              8.35e+01             2.44e+02              3.18e+04
F8      3.17e+11              3.59e+11             8.55e+11              1.36e+12
F9      1.64e+08              2.48e+08             2.09e+08              7.12e+08
F10     9.18e+07              9.19e+07             9.25e+07              9.19e+07
F11     5.11e+05              4.76e+05             5.20e+05              9.87e+06
F12     6.18e+01              1.10e+02             3.42e+02              5.16e+02
F13     1.00e+05              1.34e+05             9.61e+05              4.02e+06
F14     5.76e+06              6.14e+06             7.40e+06              1.48e+07
F15     6.25e+05              8.69e+05             1.01e+06              3.13e+06
Better  12                    1                    0                     2

Slide 29

Influence of the different components
Differences caused by the new restart mechanism (plots generated with TACO).
[Plots of mean error vs. evaluations (1.2·10^5 to 3·10^6) for functions F4, F5, F9, and F12, comparing the new and the old restart mechanisms.]

Slide 30

Comparison against MOS, MaxEvals = 1.25·10^5 (4%)

Func   SHADE-ILS   MOS
F1     6.10e+04    2.71e+07
F2     2.65e+03    2.64e+03
F3     2.03e+01    7.85e+00
F4     3.13e+10    3.47e+10
F5     2.50e+06    6.96e+06
F6     1.05e+06    3.11e+05
F7     3.95e+08    3.46e+08
F8     2.12e+14    3.72e+14
F9     2.88e+08    4.29e+08
F10    9.43e+07    1.16e+06
F11    6.55e+09    3.13e+09
F12    2.67e+03    1.16e+04
F13    1.29e+10    8.37e+09
F14    1.62e+11    4.61e+10
F15    9.12e+07    1.45e+07
Best   6           9

Slide 31

Comparison against MOS, MaxEvals = 6·10^5 (20%)

Func   SHADE-ILS   MOS
F1     3.71e-23    3.48e+00
F2     1.80e+03    1.78e+03
F3     2.01e+01    1.33e-10
F4     1.54e+09    2.56e+09
F5     2.29e+06    6.95e+06
F6     1.04e+06    1.48e+05
F7     9.25e+05    8.19e+06
F8     6.93e+12    8.41e+13
F9     2.50e+08    3.84e+08
F10    9.29e+07    9.03e+05
F11    1.37e+08    8.05e+08
F12    1.28e+03    2.20e+03
F13    5.68e+07    8.10e+08
F14    6.97e+07    2.03e+08
F15    1.22e+07    6.26e+06
Best   10          5

Slide 32

Comparison against MOS, MaxEvals = 3·10^6 (100%)

Func   SHADE-ILS   MOS
F1     2.69e-24    0.00e+00
F2     1.00e+03    8.32e+02
F3     2.01e+01    9.17e-13
F4     1.48e+08    1.74e+08
F5     1.39e+06    6.94e+06
F6     1.02e+06    1.48e+05
F7     7.41e+01    1.62e+04
F8     3.17e+11    8.00e+12
F9     1.64e+08    3.83e+08
F10    9.18e+07    9.02e+05
F11    5.11e+05    5.22e+07
F12    6.18e+01    2.47e+02
F13    1.00e+05    3.40e+06
F14    5.76e+06    2.56e+07
F15    6.25e+05    2.35e+06
Best   10          5

Slide 33

Comparison against MOS
Number of best results by evaluation budget:

Alg        4%   20%   100%
MOS        9    5     5
SHADE-ILS  6    10    10

Conclusions about SHADE-ILS:
- From 20% of the evaluations onward, it obtains better results.
- Better in the more complex functions, worse in the separable ones.
- Very competitive in overlapping/non-separable functions.

Slide 34

Improvement in more complex functions
Ratio of improvement, MOS ⇒ SHADE-ILS:

Fun   MOS ⇒ SHADE-ILS       4%       20%      100%
F7    1.6e+4 ⇒ 7.4e+1       -12.7%   88.6%    99.4%
F8    8e+12 ⇒ 3.1e+11       53.8%    91.8%    96.0%
F9    3.8e+8 ⇒ 1.6e+8       23.3%    34.9%    57.2%
F10   9.0e+5 ⇒ 9.2e+7       -95.4%   -99.0%   -99.0%
F11   5.2e+7 ⇒ 5.1e+5       -45.8%   83.0%    99.0%
F12   2.5e+2 ⇒ 6.2e+1       77.6%    41.8%    74.9%
F13   3.4e+6 ⇒ 1.0e+5       -38.8%   93.0%    97.1%
F14   2.6e+7 ⇒ 5.8e+6       -61.7%   65.5%    77.3%
F15   2.3e+6 ⇒ 6.2e+5       -87.7%   -48.7%   73.2%

Ratio = 100 · (Error_MOS − Error_SHADEILS) / max(Error_MOS, Error_SHADEILS)
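The ratio formula above is easy to check against the tables: it is positive when SHADE-ILS is better and negative when MOS is better. A one-line helper (illustrative naming):

```python
def improvement_ratio(error_mos, error_shadeils):
    """Ratio = 100 * (Error_MOS - Error_SHADEILS) / max(both errors);
    positive favors SHADE-ILS, negative favors MOS."""
    return 100.0 * (error_mos - error_shadeils) / max(error_mos, error_shadeils)
```

For F8 at 100% (MOS 8.00e+12, SHADE-ILS 3.17e+11) this gives roughly the 96.0% in the table; for F10 (MOS 9.02e+5, SHADE-ILS 9.18e+7) it gives roughly -99.0%.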

Slide 35

Comparing with the CEC'2013 criterion
For each function:
- Algorithms are sorted by their fitness/error.
- Algorithms are ranked in that order.
- Each algorithm receives points based on its rank (more points to the best ones).
Global result: the scores are summed over all functions.
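The scoring criterion can be sketched as below. The per-rank point table (25, 18, 15, ..., the Formula 1 scheme used by the LSGO competitions) is an assumption; the slide only says that better-ranked algorithms receive more points.

```python
# Formula 1 point table (assumed; not stated on the slide).
F1_POINTS = [25, 18, 15, 12, 10, 8, 6, 4, 2, 1]

def cec_scores(errors_per_function):
    """Per function, sort algorithms by error (best first), award points
    by rank, and accumulate the totals over all functions."""
    totals = {}
    for errors in errors_per_function:      # one {algorithm: error} dict
        ranked = sorted(errors, key=errors.get)
        for rank, alg in enumerate(ranked):
            pts = F1_POINTS[rank] if rank < len(F1_POINTS) else 0
            totals[alg] = totals.get(alg, 0) + pts
    return totals
```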

Slide 36

Comparison with the CEC'2013 criterion
[Bar chart of accumulated scores at accuracy 1.200e+05 evaluations, broken down by category (unimodal, functions with a separable subcomponent, functions with no separable subcomponents, overlapping, non-separable), for IHDELS_2015, MOS, SHADE-ILS, and VMODE.]

Slide 37

Comparison with the CEC'2013 criterion
[The same bar chart at accuracy 6.000e+05 evaluations.]

Slide 38

Comparison with the CEC'2013 criterion
[The same bar chart at accuracy 3.000e+06 evaluations.]

Slide 39

Conclusions
We have proposed a new algorithm for LSGO: SHADE-ILS.
- It iteratively applies DE + LS.
- SHADE is the DE algorithm.
- In each iteration, it selects the LS with the best last improvement ratio.
Results:
- SHADE-ILS is more competitive, especially in the more complex functions.
- SHADE-ILS beats the previous state of the art, MOS.

Slide 40

Questions?
Thank you for your attention!