Mining Robust Default Configurations for Resource-constrained AutoML
- Moe Kayali
- Chi Wang
MSR-TR-2022-28
Published by Microsoft
Automatic machine learning (AutoML) is a key enabler of the mass deployment of the next generation of machine learning systems. A key desideratum for future ML systems is the automatic selection of models and hyperparameters. We present a novel method of selecting performant configurations for a given task by performing offline AutoML and mining over a diverse set of tasks. By mining the training tasks, we can select a compact portfolio of configurations that perform well over a wide variety of tasks, as well as learn a strategy to select portfolio configurations for yet-unseen tasks. The algorithm runs in a zero-shot manner, that is, without training any models online except the chosen one. In a compute- or time-constrained setting, this virtually instant selection is highly performant. Further, we show that our approach is effective for warm-starting existing AutoML platforms. In both settings, we demonstrate an improvement over the state of the art by testing on 62 classification and regression datasets. We also demonstrate the utility of recommending data-dependent default configurations that outperform widely used hand-crafted defaults.
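The abstract does not spell out how the portfolio is mined or how a configuration is chosen for a new task, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a hypothetical offline matrix `regret[t, c]` (normalized regret of configuration `c` on training task `t`), builds a portfolio by greedy selection, and picks a configuration for an unseen task by nearest neighbor over hypothetical meta-features (here, number of rows and number of features). The function names, the greedy criterion, the nearest-neighbor rule, and all numbers are illustrative assumptions, not the paper's confirmed algorithm or data.

```python
import numpy as np


def greedy_portfolio(regret, k):
    """Greedily pick up to k configs minimizing mean best-in-portfolio regret.

    regret[t, c]: hypothetical normalized regret of config c on training task t.
    This greedy criterion is an assumption made for illustration.
    """
    n_tasks, n_configs = regret.shape
    chosen = []
    best = np.full(n_tasks, np.inf)  # best regret per task within the portfolio
    for _ in range(min(k, n_configs)):
        # Mean regret over tasks if each candidate config were added.
        scores = np.minimum(best[:, None], regret).mean(axis=0)
        if chosen:
            scores[chosen] = np.inf  # never pick the same config twice
        c = int(np.argmin(scores))
        chosen.append(c)
        best = np.minimum(best, regret[:, c])
    return chosen


def select_config(meta_train, regret, portfolio, meta_new):
    """Zero-shot choice: find the nearest training task in standardized
    meta-feature space, then return the portfolio config that did best there.
    No model is trained during selection."""
    mu = meta_train.mean(axis=0)
    sigma = meta_train.std(axis=0)
    sigma[sigma == 0] = 1.0  # guard against constant meta-features
    z_train = (meta_train - mu) / sigma
    z_new = (meta_new - mu) / sigma
    nearest = int(np.argmin(np.linalg.norm(z_train - z_new, axis=1)))
    best_pos = int(np.argmin(regret[nearest, portfolio]))
    return portfolio[best_pos]


# Made-up offline results: 4 training tasks x 3 candidate configurations.
regret = np.array([
    [0.0, 0.5, 0.9],
    [0.1, 0.4, 0.8],
    [0.9, 0.6, 0.0],
    [0.7, 0.5, 0.1],
])
# Made-up meta-features per training task: (number of rows, number of features).
meta_train = np.array([
    [100.0, 5.0],
    [200.0, 8.0],
    [10000.0, 50.0],
    [20000.0, 40.0],
])

portfolio = greedy_portfolio(regret, k=2)
large_task_choice = select_config(meta_train, regret, portfolio, np.array([18000.0, 42.0]))
small_task_choice = select_config(meta_train, regret, portfolio, np.array([150.0, 6.0]))
```

In this toy setup the selection step costs only a distance computation over the training tasks, which is what makes a zero-shot choice virtually instant; only the returned configuration would then be trained on the new task.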
Publication Downloads
FLAML: A Fast Library for AutoML and Tuning
December 15, 2020
FLAML is a Python library designed to automatically produce accurate machine learning models with low computational cost. It frees users from selecting learners and hyperparameters for each learner. FLAML is powered by a new, cost-effective hyperparameter optimization and learner selection method invented by Microsoft Research.