Slide 1

Predicting hyperparameters from meta-features in binary classification problems

Nisioti E., Chatzidimitriou K., Symeonidis A. - Aristotle University of Thessaloniki
AutoML 2018, Stockholm

• Proposal: Automated Data Scientist, a system that uses meta-learning to predict hyperparameters and builds a rich ensemble of models through forward model selection, in order to automate binary classification tasks.
• Design requirement: the user just supplies a dataset. Default setting: full automation with opinionated choices.
• data cleaning (inappropriate value removal, data type recognition, compression)
• data preprocessing (normalization, compression, feature engineering)
• data splitting
• Hyperparameter selection is performed by prediction models. These are trained on pairs of meta-features and optimal hyperparameters, where the optimal hyperparameters were found through Bayesian optimization on a repository of 100 binary classification datasets.
• 31 meta-features extracted from the dataset (selected from 77 studied)
• Forward selection ensembler
• Result: performance equivalent to both well-established and state-of-the-art hyperparameter optimization techniques, with the additional benefits of generated meta-knowledge and speed, since the time-consuming search is replaced by a simple prediction.
• Autogenerated, intuitive reporting to help guide manual tweaks
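
The idea of replacing hyperparameter search with a prediction from meta-features can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the slide does not say which prediction model or which 31 meta-features are used, so the sketch assumes a k-nearest-neighbour regressor and three simple meta-features. The function names, the toy repository values and the target hyperparameter (log10 of a regularization strength C) are all hypothetical.

```python
import math
from statistics import mean


def extract_meta_features(X, y):
    """Return three illustrative meta-features of a binary dataset.

    Assumption: these stand in for the 31 meta-features on the slide,
    which are not listed there.
    """
    n_samples = len(X)
    n_features = len(X[0])
    positive_rate = sum(y) / n_samples  # fraction of class-1 labels
    return [math.log10(n_samples), math.log10(n_features), positive_rate]


def predict_hyperparameter(meta_repo, query, k=2):
    """k-NN regression over a meta-repository.

    meta_repo: list of (meta_features, best_value) pairs, where best_value
    would come from an offline search such as Bayesian optimization.
    Returns the mean best_value of the k datasets closest to `query`.
    """
    ranked = sorted(meta_repo, key=lambda entry: math.dist(entry[0], query))
    return mean(value for _, value in ranked[:k])


# Toy repository: the numbers are invented for illustration only.
# Each entry is (meta-features, best log10(C) found offline).
toy_repo = [
    ([2.0, 1.0, 0.50], -1.0),
    ([2.3, 1.2, 0.45], -0.5),
    ([3.0, 1.5, 0.30], 0.5),
    ([3.5, 2.0, 0.20], 1.0),
    ([4.0, 2.0, 0.10], 1.5),
]

# A new dataset: 100 samples, 10 features, balanced classes.
X_new = [[0.0] * 10 for _ in range(100)]
y_new = [0, 1] * 50

query = extract_meta_features(X_new, y_new)
predicted_log_c = predict_hyperparameter(toy_repo, query, k=2)
print("predicted log10(C):", predicted_log_c)
```

At prediction time the only work is one meta-feature extraction and a lookup over the repository, which is where the speed benefit over a per-dataset search comes from. In practice the meta-features would usually be scaled before computing distances, since they live on different ranges.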