Retiarii: A Deep Learning Exploratory-Training Framework
- Quanlu Zhang
- Zhenhua Han
- Fan Yang
- Yuge Zhang
- Zhe Liu
- Mao Yang
- Lidong Zhou
2020 USENIX Symposium on Operating Systems Design and Implementation (OSDI '20)
Traditional deep learning frameworks such as TensorFlow and PyTorch support training a single deep neural network (DNN) model, which involves iteratively computing the weights of that model. Designing a DNN model for a task remains an experimental science and is typically a practice of deep learning model exploration, dovetailed with training and validation, aiming to find the model, among a set of explored candidates, that yields the best result. Retrofitting such exploratory-training into the training process of a single DNN model, as supported by current deep learning frameworks, is unintuitive, cumbersome, and inefficient, because of the fundamental mismatch between exploring a set of models and training a single one. Retiarii is the first framework to support deep learning exploratory-training. In particular, Retiarii (i) provides a new programming interface to specify a DNN model space for exploration, as well as an interface to describe the exploration strategy that decides the order in which to instantiate and train models, how to prioritize model training, and when to terminate the training of certain models; (ii) offers a Just-In-Time (JIT) engine that instantiates models, manages the training of the instantiated models, gathers the information for the exploration strategy to consume, and executes the decisions accordingly; (iii) identifies the correlations between the instantiated models and develops a set of cross-model optimizations to improve the overall exploratory-training process. Retiarii does so by introducing a key abstraction, Mutator, that connects the specifications of DNN model spaces and exploration strategies, while exposing the correlations between models for optimization. As a result, Retiarii's clean separation of DNN model space specification, exploration strategy, and cross-model optimizations, connected through the single mutator abstraction, leads to ease of programming, reuse of components, and vastly improved (up to 8.58x) overall exploratory-training efficiency.
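To make the mutator idea concrete, the following is a minimal, illustrative PyTorch sketch, not Retiarii's actual API: it shows how a base model plus a set of candidate operators can be turned into individually instantiated models, with the concrete choice supplied by an exploration strategy. The class names `BaseModel` and `OpMutator` and their methods are hypothetical.

```python
import copy
import random
import torch.nn as nn

# Hypothetical base model: the `op` block is left open for exploration.
class BaseModel(nn.Module):
    def __init__(self, op):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.op = op                      # operator chosen by a mutator
        self.head = nn.AdaptiveAvgPool2d(1)

    def forward(self, x):
        return self.head(self.op(self.stem(x)))

# Illustrative mutator: it owns a set of candidate operators and, given a
# choice made by an exploration strategy, instantiates one concrete model.
class OpMutator:
    def __init__(self, candidates):
        self.candidates = candidates

    def mutate(self, choice=None):
        # A random-search strategy simply samples a choice uniformly.
        if choice is None:
            choice = random.randrange(len(self.candidates))
        op = copy.deepcopy(self.candidates[choice])   # fresh weights per model
        return BaseModel(op)

mutator = OpMutator([
    nn.Conv2d(16, 16, 3, padding=1),       # candidate 1: 3x3 conv
    nn.Conv2d(16, 16, 5, padding=2),       # candidate 2: 5x5 conv
    nn.MaxPool2d(3, stride=1, padding=1),  # candidate 3: max pooling
])
model = mutator.mutate()   # one instantiated model from the space, ready to train
```

In Retiarii, an analogous pairing of a base model with mutators is what lets the JIT engine see that models produced from the same mutator are correlated, which is the hook for its cross-model optimizations.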
Introducing Retiarii: A deep learning exploratory-training framework on NNI
Traditional deep learning frameworks such as TensorFlow and PyTorch support training on a single deep neural network (DNN) model, which involves computing the weights iteratively for the DNN model. Designing a DNN model for a task remains an experimental science and is typically a practice of deep learning model exploration. Retrofitting such exploratory-training into the training process of a single DNN model, as supported by current deep learning frameworks, is unintuitive, cumbersome, and inefficient. In this webinar, Microsoft Research Asia Senior Researcher Quanlu Zhang and Principal Program Manager Scarlett Li will analyze these challenges within the context of Neural Architecture Search (NAS). The first part of the webinar will focus on Retiarii, a deep learning exploratory-training framework for DNN models. Retiarii also offers a just-in-time…