July 18, 2017

Algorithm Configuration Data Mining for CMA Evolution Strategies

Slides presented at GECCO 2017

Transcript

  1. Discover the world at Leiden University

    Algorithm Configuration Data Mining for CMA Evolution Strategies
    Sander van Rijn, Hao Wang, Bas van Stein and Thomas Bäck
  2. Introduction

    • EA popularity → many variants
    • Focus: CMA-ES
    • Few combinations are tested
  3. Modular CMA-ES Framework

    Selected CMA-ES Modules
    • Active Update
    • Elitism
    • Mirrored Sampling
    • Orthogonal Sampling
    • Sequential Selection
    • Threshold Convergence
    • Two-Point step-size Adaptation (TPA)
    • Pairwise Selection
    • Recombination Weights
    • Quasi-Gaussian Sampling
    • Increasing Population

    S. van Rijn, H. Wang, M. van Leeuwen and T. Bäck, "Evolving the structure of Evolution Strategies," 2016 IEEE Symposium Series on Computational Intelligence (SSCI), Athens, 2016, pp. 1-8. DOI: 10.1109/SSCI.2016.7850138
  4. Modular CMA-ES Framework

    Selected CMA-ES Modules: Active Update, Elitism, Mirrored Sampling, Orthogonal Sampling, Sequential Selection, Threshold Convergence, TPA, Pairwise Selection, Recombination Weights, Quasi-Gaussian Sampling, Increasing Population

    →
    [0,0,0,0,0,0,0,0,0,0,0]
    [0,0,0,0,0,0,0,0,0,0,1]
    [0,0,0,0,0,0,0,0,0,0,2]
    ...
    [1,1,1,1,1,1,1,1,1,2,0]
    [1,1,1,1,1,1,1,1,1,2,1]
    [1,1,1,1,1,1,1,1,1,2,2]

    $2^9 \times 3^2 = 4\,608$
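The count of 4 608 follows from nine two-valued modules and two three-valued ones. A minimal Python sketch of that enumeration, assuming (as the listed vectors suggest) that the last two positions are the three-valued ones; the names `CHOICES` and `all_configurations` are illustrative and not taken from the authors' code:

```python
from itertools import product

# Assumption: positions 0-8 are binary switches and positions 9-10 take the
# values 0, 1 or 2, matching the vectors listed on the slide.
CHOICES = [(0, 1)] * 9 + [(0, 1, 2)] * 2


def all_configurations():
    """Return every module configuration as a list, in lexicographic order."""
    return [list(cfg) for cfg in product(*CHOICES)]


configs = all_configurations()
print(len(configs))   # 4608
print(configs[0])     # [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(configs[-1])    # [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2]
```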
  6. Modular CMA-ES Framework

    Algorithm 1 Modular CMA-ES Framework
      options ← which modules are active
      init-params ← initial/default parameter values
      while not terminate do                        // Local restart loop
        params ← Initialize(init-params)
        t ← 0
        x̄ ← randomly generated individual
        while not terminate_local do                // ES execution loop
          x ← Mutate(x̄, options)                    // Sampler, Threshold
          f ← Evaluate(x, options)                  // Sequential
          P^(t+1) ← Select(x, f, options)           // Elitism, Pairwise
          x̄ ← Recombine(P^(t+1), options)           // Weights
          UpdateParams(params, options)             // Active, TPA
          t ← t + 1
        end while
        AdaptParams(init-params, options)           // (B)IPOP
      end while
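To make the control flow of Algorithm 1 concrete, here is a heavily simplified Python sketch of the inner ES loop. It is not the authors' framework and not CMA-ES: there is no covariance matrix, no restart loop, and the step size merely decays. The function names, the `options` keys `"mirrored"` and `"elitism"`, and all default values are assumptions made for illustration only.

```python
import random


def sphere(x):
    """Toy objective: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def toy_modular_es(f, dim, options, budget=2000, lam=10, mu=5, sigma=1.0, seed=1):
    """One run of a simplified ES whose steps are switched by `options`.

    Only two hypothetical switches are honoured: 'mirrored' and 'elitism'.
    """
    rng = random.Random(seed)
    mean = [rng.uniform(-5, 5) for _ in range(dim)]   # random start individual
    best_f, best_x = f(mean), mean
    evals, parents = 1, []
    while evals + lam <= budget:                      # ES execution loop
        # Mutate: draw Gaussian steps; mirrored sampling reuses each step negated.
        steps = []
        while len(steps) < lam:
            z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            steps.append(z)
            if options.get("mirrored") and len(steps) < lam:
                steps.append([-v for v in z])
        offspring = [[m + sigma * v for m, v in zip(mean, z)] for z in steps]
        # Evaluate every offspring once.
        scored = [(f(x), x) for x in offspring]
        evals += lam
        # Select: with elitism the previous parents compete with the offspring.
        pool = scored + (parents if options.get("elitism") else [])
        pool.sort(key=lambda pair: pair[0])
        parents = pool[:mu]
        # Recombine: equally weighted mean of the selected individuals.
        mean = [sum(x[i] for _, x in parents) / mu for i in range(dim)]
        # UpdateParams: placeholder decay instead of real step-size adaptation.
        sigma *= 0.98
        if parents[0][0] < best_f:
            best_f, best_x = parents[0]
    return best_f, best_x


for opts in ({}, {"mirrored": True}, {"elitism": True}):
    print(opts, toy_modular_es(sphere, dim=5, options=opts)[0])
```

Passing the switches in as a plain dictionary mirrors the `options` argument of the pseudocode: each step consults it rather than being replaced by a different algorithm variant.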
  7. Algorithm Quality

    Problem: How to determine the quality of a configuration c?
    • FCE (fixed-cost error): arbitrary values
    • ERT (expected running time): only defined if FCE(c) < FCE_target
  8. Algorithm Quality

    Problem: How to determine the quality of a configuration c?
    • FCE: arbitrary values
    • ERT: only defined if FCE(c) < FCE_target

    Solution: scaled combination ERT × FCE → [0, 2]
  9. Example

    [Figure: Algorithm quality vs. Rank for F10. x-axis: Rank (0 to 4000); y-axis: Quality (0.00 to 2.00); one curve each for 2D, 3D, 5D, 10D, 15D, 20D, 25D, 30D, 35D and 40D]
  10. Random Forest Regression

    • Forest predicts q per experiment
    • Forest of 250 trees
    • Mean feature importance¹ over all experiments

    ¹ A measure of how pure the split according to a feature is
  11. Random Forest Regression

    • Forest predicts q per experiment
    • Forest of 250 trees
    • Mean feature importance¹ over all experiments

    Module        Importance
    Active        0.04
    Elitism       0.06
    Mirrored      0.04
    Orthogonal    0.05
    Sequential    0.16
    Threshold     0.31
    TPA           0.03
    Pairwise      0.02
    Weights       0.02
    Base-Sampler  0.20
    (B)IPOP       0.07

    ¹ A measure of how pure the split according to a feature is
  12. Impact

    Compare $x_{\text{on}}$ and $x_{\text{off}}$:

    $I_x = \bar{q}(C_{x_{\text{off}}}) - \bar{q}(C_{x_{\text{on}}})$

    $\bar{q}(C)$: mean $q$ of set $C$
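The impact measure is a difference of two group means, which a few lines of Python can illustrate. This is a sketch on made-up quality values, not the authors' notebook code; `module_impact` is a hypothetical name, and treating any non-zero setting of a three-valued module as "on" is an assumption.

```python
from statistics import mean


def module_impact(configs, qualities, module_index):
    """I_x: mean q over configurations with module x off, minus mean q with x on."""
    # Assumption: a setting of 0 means "off", anything else counts as "on".
    q_off = [q for c, q in zip(configs, qualities) if c[module_index] == 0]
    q_on = [q for c, q in zip(configs, qualities) if c[module_index] != 0]
    return mean(q_off) - mean(q_on)


# Made-up example with two binary modules; q values are illustrative only.
toy_configs = [[0, 0], [0, 1], [1, 0], [1, 1]]
toy_q = [1.2, 1.0, 0.6, 0.4]

print(module_impact(toy_configs, toy_q, 0))
print(module_impact(toy_configs, toy_q, 1))
```

If a lower q is better, as it is for both ERT and FCE, a positive value indicates that switching the module on improved the mean quality; in the toy data the first module has the larger impact.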
  14. Impact: Module Interaction

    Compare $x_{\text{on}} \land y_{\text{on}}$ and $\lnot(x_{\text{on}} \land y_{\text{on}})$

    [Figure: Module Interaction Impact for every pair of modules, scale from −0.3 to 0.1; both axes list Active, Elitism, Mirrored, Orthogonal, Sequential, Threshold, TPA, Pairwise, Weights, Base-Sampler and (B)IPOP]
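The pairwise comparison splits the configurations into those where both modules are on and all the rest. A minimal Python sketch on made-up data follows; the slide does not state the sign convention, so the "rest minus both" order is an assumption, and `interaction_impact` is a hypothetical name.

```python
from itertools import product
from statistics import mean


def interaction_impact(configs, qualities, x, y):
    """Mean q where not both modules are on, minus mean q where both are on."""
    both = [q for c, q in zip(configs, qualities) if c[x] and c[y]]
    rest = [q for c, q in zip(configs, qualities) if not (c[x] and c[y])]
    return mean(rest) - mean(both)


# Made-up data: three binary modules, where only the pair (0, 1) helps jointly.
toy_configs = list(product((0, 1), repeat=3))
toy_q = [0.5 if c[0] and c[1] else 1.0 for c in toy_configs]

print(interaction_impact(toy_configs, toy_q, 0, 1))  # 0.5
print(interaction_impact(toy_configs, toy_q, 0, 2))
```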
  15. Impact: Module Interaction

    [Figure: four panels, "Impact of module interaction for F2", "Impact of module interaction for F23", "Impact of module interaction for F7" and "Impact of module interaction for F24"; in each panel both axes list Active, Elitism, Mirrored, Orthogonal, Sequential, Threshold, TPA, Pairwise, Weights, Base-Sampler and (B)IPOP, with a Module Interaction Impact scale from −0.8 to 0.4]
  16. Module Progression

    Configuration ranking:
    #1: [0 0 1 1 0 1 1 0 0 2 0]
    #2: [0 0 1 1 0 1 1 0 0 2 2]
    #3: [0 0 1 1 0 1 1 0 0 2 1]
    #4: [0 0 1 1 0 1 1 1 0 2 2]
    ...
    #4608: ...

    Which modules are active
    • in the best configuration?
  17. Module Progression

    Configuration ranking:
    #1: [0 0 1 1 0 1 1 0 0 2 0]
    #2: [0 0 1 1 0 1 1 0 0 2 2]
    #3: [0 0 1 1 0 1 1 0 0 2 1]
    #4: [0 0 1 1 0 1 1 1 0 2 2]
    ...
    #4608: ...

    Which modules are active
    • in the best configuration?
    • in the 10 best configurations?
  18. Module Progression

    Configuration ranking:
    #1: [0 0 1 1 0 1 1 0 0 2 0]
    #2: [0 0 1 1 0 1 1 0 0 2 2]
    #3: [0 0 1 1 0 1 1 0 0 2 1]
    #4: [0 0 1 1 0 1 1 1 0 2 2]
    ...
    #4608: ...

    Which modules are active
    • in the best configuration?
    • in the 10 best configurations?
    • in the 100 best configurations?
    • ...
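The question "which modules are active in the k best configurations?" can be answered for every k at once with a running fraction. The Python sketch below applies this to the four top-ranked vectors shown on the slide. The function name is hypothetical, counting any non-zero setting as active is an assumption, and so is reading the vector positions in the order of the module list (index 2 as Mirrored, index 7 as Pairwise, index 10 as Increasing Population).

```python
def activation_progression(ranked_configs, module_index):
    """Fraction of the top-k configurations with the module active, for k = 1..n."""
    progression, active = [], 0
    for k, cfg in enumerate(ranked_configs, start=1):
        # Assumption: any non-zero setting counts as "active".
        active += 1 if cfg[module_index] != 0 else 0
        progression.append(active / k)
    return progression


# The four best configurations as listed on the slide.
top4 = [
    [0, 0, 1, 1, 0, 1, 1, 0, 0, 2, 0],
    [0, 0, 1, 1, 0, 1, 1, 0, 0, 2, 2],
    [0, 0, 1, 1, 0, 1, 1, 0, 0, 2, 1],
    [0, 0, 1, 1, 0, 1, 1, 1, 0, 2, 2],
]

print(activation_progression(top4, 2))   # [1.0, 1.0, 1.0, 1.0]
print(activation_progression(top4, 7))   # [0.0, 0.0, 0.0, 0.25]
print(activation_progression(top4, 10))  # [0.0, 0.5, 0.6666666666666666, 0.75]
```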
  19. Progression Correlation

    [Figure: three panels plotting relative activation frequency (0.0 to 1.0) against rank, with rank ticks from 0 to 100 and from 1000 to 4000]
    • Progression for experiment similarity cluster (c > 0.978): (B)IPOP on F7 in 10D, 15D, 20D, 25D, 30D, 35D and 40D
    • Progression for experiment similarity cluster (c > 0.993): Elitism on F2, F10 and F11 in 2D
    • Progression for experiment similarity cluster (c > 0.997): Threshold on F2, F10 and F11 in 2D
  20. Progression Correlation

    [Figure: two panels plotting relative activation frequency (0.0 to 1.0) against rank, with rank ticks from 0 to 100 and from 1000 to 4000]
    • Progression for module cooperation cluster (c > 0.921): Elitism and TPA on 5D F14
    • Progression for module cooperation cluster (c > 0.962): Base-Sampler and (B)IPOP on 5D F15
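One way to find such clusters is to correlate progression curves pairwise and keep the pairs above a threshold. The slides do not say which correlation measure c denotes, so the Pearson coefficient used in this Python sketch is an assumption, as are the function names and the made-up curves.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def correlated_pairs(curves, threshold):
    """Return all pairs of curve names whose correlation exceeds the threshold."""
    names = sorted(curves)
    return [
        (p, q)
        for i, p in enumerate(names)
        for q in names[i + 1:]
        if pearson(curves[p], curves[q]) > threshold
    ]


# Made-up progression curves (relative activation frequency at five ranks).
toy_curves = {
    "Elitism (2D F2)": [1.0, 1.0, 0.9, 0.7, 0.5],
    "Elitism (2D F10)": [1.0, 0.95, 0.9, 0.7, 0.55],
    "TPA (2D F2)": [0.0, 0.2, 0.3, 0.45, 0.5],
}

print(correlated_pairs(toy_curves, 0.9))
```

Curves for the same module on different experiments correspond to the "experiment similarity" clusters, while curves for different modules on one experiment correspond to the "module cooperation" clusters.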
  21. Summary & Outlook

    Summary
    • We can successfully identify useful options
    • Similar landscapes show similar impact/progression behavior
  22. Summary & Outlook

    Summary
    • We can successfully identify useful options
    • Similar landscapes show similar impact/progression behavior

    Outlook
    • Combine with landscape features
    • Include parameter tuning
    • Expand to include more modules
  23. Summary & Outlook

    Summary
    • We can successfully identify useful options
    • Similar landscapes show similar impact/progression behavior

    Outlook
    • Combine with landscape features
    • Include parameter tuning
    • Expand to include more modules

    Code on GitHub
    S. van Rijn, "module analysis.ipynb", hosted at: https://github.com/Energya/cma-es-configuration-data-mining
  24. Appendix: Module Impact p-values

    All $I_x$ outside [−0.144, 0.229] have p < 0.01 (442 / 2 640 values)
  25. Appendix: Interaction Impact p-values

    All $I_x$ outside [−0.551, 0.371] have p < 0.01 (1 383 / 29 040 values)