• [Eggensperger et al., 2013] K. Eggensperger, M. Feurer, F. Hutter, J. Bergstra, J. Snoek, H. Hoos, and K. Leyton-Brown, “Towards an Empirical Foundation for Assessing Bayesian Optimization of Hyperparameters,” NIPS Workshop on Bayesian Optimization in Theory and Practice, vol.10, p.3, 2013.
• [Hutter et al., 2013] F. Hutter, H. Hoos, and K. Leyton-Brown, “An Evaluation of Sequential Model-based Optimization for Expensive Blackbox Functions,” Proceedings of the 15th Annual Conference Companion on Genetic and Evolutionary Computation, ACM, pp.1209–1216, 2013.
• [Snoek et al., 2012] J. Snoek, H. Larochelle, and R.P. Adams, “Practical Bayesian Optimization of Machine Learning Algorithms,” Advances in Neural Information Processing Systems, pp.2951–2959, 2012.
• [Bergstra et al., 2011] J.S. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl, “Algorithms for Hyper-Parameter Optimization,” Advances in Neural Information Processing Systems, pp.2546–2554, 2011.
• [Hutter et al., 2011] F. Hutter, H.H. Hoos, and K. Leyton-Brown, “Sequential Model-Based Optimization for General Algorithm Configuration,” International Conference on Learning and Intelligent Optimization, Springer, pp.507–523, 2011.
• [Wang et al., 2016] Z. Wang, F. Hutter, M. Zoghi, D. Matheson, and N. de Freitas, “Bayesian Optimization in a Billion Dimensions via Random Embeddings,” Journal of Artificial Intelligence Research, vol.55, pp.361–387, 2016.
• [Calandra et al., 2014] R. Calandra, A. Seyfarth, J. Peters, and M.P. Deisenroth, “An Experimental Comparison of Bayesian Optimization for Bipedal Locomotion,” 2014 IEEE International Conference on Robotics and Automation (ICRA), IEEE, pp.1951–1958, 2014.
• [Hansen and Ostermeier, 1996] N. Hansen and A. Ostermeier, “Adapting Arbitrary Normal Mutation Distributions in Evolution Strategies: The Covariance Matrix Adaptation,” Proceedings of IEEE International Conference on Evolutionary Computation, IEEE, pp.312–317, 1996.
• [Loshchilov and Hutter, 2016] I. Loshchilov and F. Hutter, “CMA-ES for Hyperparameter Optimization of Deep Neural Networks,” ICLR Workshop, 2016.
• [Friedrichs and Igel, 2005] F. Friedrichs and C. Igel, “Evolutionary Tuning of Multiple SVM Parameters,” Neurocomputing, vol.64, pp.107–117, 2005.
• [Watanabe and Le Roux, 2014] S. Watanabe and J. Le Roux, “Black Box Optimization for Automatic Speech Recognition,” 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, pp.3256–3260, 2014.
• [Akimoto et al., 2010] Y. Akimoto, Y. Nagata, I. Ono, and S. Kobayashi, “Bidirectional Relation between CMA Evolution Strategies and Natural Evolution Strategies,” Parallel Problem Solving from Nature (PPSN), 2010.
• [Jamieson and Talwalkar, 2016] K. Jamieson and A. Talwalkar, “Non-stochastic Best Arm Identification and Hyperparameter Optimization,” AISTATS, pp.240–248, 2016.
• [Li et al., 2018] L. Li, K. Jamieson, G. DeSalvo, A. Rostamizadeh, and A. Talwalkar, “Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization,” Journal of Machine Learning Research, vol.18, no.185, pp.1–52, 2018.
• [Falkner et al., 2018] S. Falkner, A. Klein, and F. Hutter, “BOHB: Robust and Efficient Hyperparameter Optimization at Scale,” ICML, pp.1437–1446, 2018.
• [Domhan et al., 2015] T. Domhan, J.T. Springenberg, and F. Hutter, “Speeding Up Automatic Hyperparameter Optimization of Deep Neural Networks by Extrapolation of Learning Curves,” Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015.
• [Klein et al., 2017] A. Klein, S. Falkner, J.T. Springenberg, and F. Hutter, “Learning Curve Prediction with Bayesian Neural Networks,” International Conference on Learning Representations (ICLR) 2017 Conference Track, April 2017.
• [Lockett and Miikkulainen, 2016] A. Lockett and R. Miikkulainen, “A Probabilistic Reformulation of No Free Lunch: Continuous Lunches Are Not Free,” Evolutionary Computation, 2016.
• [Jaderberg et al., 2017] M. Jaderberg, V. Dalibard, S. Osindero, W.M. Czarnecki, J. Donahue, A. Razavi, O. Vinyals, T. Green, I. Dunning, K. Simonyan, et al., “Population Based Training of Neural Networks,” arXiv preprint arXiv:1711.09846, 2017.
• [Swersky et al., 2013] K. Swersky, J. Snoek, and R. Adams, “Multi-Task Bayesian Optimization,” Advances in Neural Information Processing Systems, pp.2004–2012, 2013.
• [Akiba et al., 2019] T. Akiba, S. Sano, T. Yanase, T. Ohta, and M. Koyama, “Optuna: A Next-generation Hyperparameter Optimization Framework,” Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2019.
• [Li and Talwalkar, 2019] L. Li and A. Talwalkar, “Random Search and Reproducibility for Neural Architecture Search,” arXiv preprint arXiv:1902.07638, 2019.
• [Sciuto et al., 2019] C. Sciuto, K. Yu, M. Jaggi, C. Musat, and M. Salzmann, “Evaluating the Search Phase of Neural Architecture Search,” arXiv preprint arXiv:1902.08142, 2019.
• [Klein and Hutter, 2019] A. Klein and F. Hutter, “Tabular Benchmarks for Joint Architecture and Hyperparameter Optimization,” arXiv preprint arXiv:1905.04970, 2019.
• [Ying et al., 2019] C. Ying, A. Klein, E. Christiansen, E. Real, K. Murphy, and F. Hutter, “NAS-Bench-101: Towards Reproducible Neural Architecture Search,” ICML, vol.97 of Proceedings of Machine Learning Research, pp.7105–7114, 2019.