The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding

  • Xiaodong Liu,
  • Yu Wang,
  • Jianshu Ji,
  • Hao Cheng,
  • Xueyun Zhu,
  • Emmanuel Awa,
  • Pengcheng He,
  • Weizhu Chen,
  • Hoifung Poon,
  • Guihong Cao,
  • Jianfeng Gao

Meeting of the Association for Computational Linguistics

Published by Association for Computational Linguistics


We present MT-DNN, an open-source natural language understanding (NLU) toolkit that makes it easy for researchers and developers to train customized deep learning models. Built upon PyTorch and Transformers, MT-DNN is designed to facilitate rapid customization for a broad spectrum of NLU tasks, using a variety of objectives (classification, regression, structured prediction) and text encoders (e.g., RNNs, BERT, RoBERTa, UniLM). A unique feature of MT-DNN is its built-in support for robust and transferable learning using the adversarial multi-task learning paradigm. To enable efficient production deployment, MT-DNN supports multi-task knowledge distillation, which can substantially compress a deep neural model without significant performance drop. We demonstrate the effectiveness of MT-DNN on a wide range of NLU applications across general and biomedical domains. The software and pre-trained models will be publicly available at https://github.com/namisan/mt-dnn.
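The multi-task knowledge distillation mentioned above compresses a large teacher model by training a small student against the teacher's softened output distribution rather than hard labels. The sketch below illustrates the core temperature-scaled soft-target idea in plain Python; the function names and the use of a temperature-softened cross-entropy are illustrative assumptions for exposition, not MT-DNN's actual API.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax: higher temperature yields a softer
    # (more uniform) distribution, exposing the teacher's "dark knowledge"
    # about relative class similarities.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Cross-entropy between the teacher's softened distribution (targets)
    # and the student's softened distribution. Minimizing this pushes the
    # student's full output distribution toward the teacher's.
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(teacher_probs, student_probs))
```

In a multi-task setting, one such loss is computed per task head against the corresponding task-specific teacher (or teacher ensemble), and the student's shared encoder is trained on the sum of these per-task distillation losses.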