The presence of computationally demanding problems and the current inability to automatically transfer experience from past experiments to new ones delay the evolution of knowledge itself. In this paper we present the Automated Data Scientist, a system that employs meta-learning for hyperparameter selection and builds a rich ensemble of models through forward model selection in order to automate binary classification tasks. Preliminary evaluation shows that the system is capable of coping with classification problems of medium complexity.
Predicting hyperparameters from meta-features in binary classification problems
• Proposal: Automated Data Scientist – a system that employs meta-learning for hyperparameter prediction and builds a
rich ensemble of models through forward model selection in order to automate binary classification tasks.
• Design requirement: the user only supplies a dataset. Default setting: full automation with opinionated choices
AutoML 2018, Stockholm. Nisioti E., Chatzidimitriou K., Symeonidis A. - Aristotle University of Thessaloniki
• data cleaning (inappropriate
value removal, data type
• data preprocessing
• data splitting
• Hyperparameter selection is performed using prediction models, trained on data
consisting of meta-features and optimal hyperparameters, produced through
Bayesian optimization on a repository of 100 binary classification datasets
• 31 meta-features extracted
from the dataset (from 77
• Forward selection ensembler
• Result: performance equivalent to both well-established and state-of-the-art hyperparameter optimization
techniques, with the added benefits of meta-knowledge generation and speed, since the time-consuming
search is replaced by a simple prediction.
• Autogenerated, intuitive reporting,
to help guide manual tweaks
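The hyperparameter selection step described above (prediction models trained on pairs of meta-features and optimal hyperparameters) can be illustrated with a minimal sketch. The poster does not state which prediction model is used, so a k-nearest-neighbour regressor stands in here as an assumption. The three meta-features, the repository entries, and the target hyperparameter (log10 of an SVM-style `C`) are all hypothetical values chosen for illustration, not the system's actual 31 meta-features or its 100-dataset repository.

```python
import math


def extract_meta_features(X, y):
    """Compute three simple, illustrative meta-features of a binary dataset."""
    n_rows = len(X)
    n_cols = len(X[0])
    positive_rate = sum(y) / n_rows  # fraction of rows labelled 1
    return [math.log10(n_rows), math.log10(n_cols), positive_rate]


def euclidean(a, b):
    """Euclidean distance between two equally long vectors."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def predict_hyperparameter(meta_features, repository, k=2):
    """Assumed stand-in predictor: average the optimal hyperparameter of the
    k repository datasets that are closest in meta-feature space."""
    by_distance = sorted(repository, key=lambda entry: euclidean(entry[0], meta_features))
    nearest = by_distance[:k]
    return sum(value for _, value in nearest) / len(nearest)


# Hypothetical repository of (meta-features, optimal log10(C)) pairs.
# In the described system the optimal values come from Bayesian optimization.
REPOSITORY = [
    ([2.0, 1.0, 0.50], -1.0),
    ([2.3, 1.0, 0.45], -0.5),
    ([3.0, 1.3, 0.30], 0.5),
    ([3.5, 1.7, 0.20], 1.0),
    ([4.0, 2.0, 0.10], 2.0),
]

# A new dataset: 200 rows, 10 columns, balanced classes.
X_new = [[0.0] * 10 for _ in range(200)]
y_new = [0, 1] * 100

meta = extract_meta_features(X_new, y_new)
predicted_log_c = predict_hyperparameter(meta, REPOSITORY, k=2)
print(predicted_log_c)
```

The point of the sketch is the cost profile: once the repository exists, selecting a hyperparameter for a new dataset is a single cheap lookup rather than a fresh search.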
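The forward selection ensembler mentioned above can be sketched as a greedy loop over a library of already-trained models. The poster gives no details of the procedure, so the variant below (selection with replacement, averaging of predicted probabilities, accuracy on a validation set as the criterion, stopping when the score no longer improves) is an assumption. The model names and validation predictions are made up for illustration.

```python
def accuracy(probabilities, labels):
    """Fraction of validation rows classified correctly at a 0.5 threshold."""
    correct = sum((p >= 0.5) == bool(t) for p, t in zip(probabilities, labels))
    return correct / len(labels)


def average(prediction_lists):
    """Element-wise mean of several lists of predicted probabilities."""
    n = len(prediction_lists)
    return [sum(column) / n for column in zip(*prediction_lists)]


def forward_select(library, labels, max_size=10):
    """Greedy forward selection with replacement.

    library maps a model name to its predicted probabilities on a
    validation set. At each step the model whose addition gives the best
    averaged-ensemble accuracy is added; the loop stops when no candidate
    strictly improves the score or when max_size is reached.
    """
    chosen = []
    best_score = float("-inf")
    while len(chosen) < max_size:
        scores = {}
        for name in library:
            members = [library[m] for m in chosen + [name]]
            scores[name] = accuracy(average(members), labels)
        best_name = max(scores, key=scores.get)
        if scores[best_name] <= best_score:
            break  # no strict improvement: stop growing the ensemble
        chosen.append(best_name)
        best_score = scores[best_name]
    return chosen, best_score


# Hypothetical validation labels and per-model predicted probabilities.
labels = [1, 1, 1, 0, 0, 0]
library = {
    "model_a": [0.9, 0.8, 0.4, 0.2, 0.1, 0.3],
    "model_b": [0.6, 0.3, 0.9, 0.3, 0.6, 0.1],
    "model_c": [0.4, 0.6, 0.3, 0.7, 0.4, 0.6],
}

ensemble, score = forward_select(library, labels)
print(ensemble, score)
```

In this toy library no single model classifies every validation row correctly, but the two selected models compensate for each other's errors once their probabilities are averaged, which is the behaviour forward selection is meant to exploit.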