Logical Transformers: Infusing Logical Structures into Pre-Trained Language Models
- Borui Wang
- Qiuyuan Huang
- Budhaditya Deb
- Aaron L. Halfaker
- Liqun Shao
- Daniel McDuff
- Ahmed Awadallah
- Dragomir Radev
- Jianfeng Gao
Natural language contains rich logical structures and logical information, the understanding of which is essential for many language-based tasks. Existing pre-trained language models based on transformer architectures mostly adopt a classical design for constructing their input embeddings that ignores the logical structures underlying natural language texts, which limits their ability to capture and encode key logical information in the input sequences. To overcome these limitations, we first propose a novel approach to constructing logic-aware input embeddings for transformer language models through a combination of logic detection, logic mapping, and hierarchical logical projections, and then develop a corresponding new modeling paradigm that can upgrade existing transformer language models into logical transformers to consistently boost their performance on different NLU and NLG tasks. Our empirical experiments on three challenging abstractive text summarization tasks demonstrate that our logical transformer models consistently outperform their baseline transformer counterparts through a deeper understanding of the logical structures of the source texts.
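The abstract only names the three ingredients of the proposed embedding construction (logic detection, logic mapping, and hierarchical logical projections) without specifying their exact form. The following is a minimal PyTorch sketch, not the paper's actual implementation, of what a logic-aware input embedding layer along these lines might look like: per-token logical-role IDs (assumed to come from an external detection and mapping step that is not shown) are embedded and folded into the ordinary token embeddings through a stack of learned projections. All names and hyperparameters here (`LogicAwareEmbedding`, `num_logic_roles`, `num_levels`) are hypothetical placeholders.

```python
import torch
import torch.nn as nn


class LogicAwareEmbedding(nn.Module):
    """Illustrative sketch only: combines token embeddings with embeddings of
    detected logical roles via hierarchical projections. The role inventory,
    the number of hierarchy levels, and the projection form are assumptions,
    not the design described in the paper."""

    def __init__(self, vocab_size, num_logic_roles, d_model, num_levels=2):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        # One logical-role embedding table per (assumed) level of the hierarchy.
        self.logic_emb = nn.ModuleList(
            nn.Embedding(num_logic_roles, d_model) for _ in range(num_levels)
        )
        # Hierarchical projections that fold the logical information of each
        # level back into the token representation.
        self.projections = nn.ModuleList(
            nn.Linear(2 * d_model, d_model) for _ in range(num_levels)
        )

    def forward(self, token_ids, logic_role_ids):
        # token_ids:      (batch, seq_len)
        # logic_role_ids: (batch, seq_len, num_levels), produced by an external
        #                 logic detection / logic mapping step (not shown here).
        x = self.token_emb(token_ids)
        for level, (emb, proj) in enumerate(zip(self.logic_emb, self.projections)):
            role = emb(logic_role_ids[..., level])
            x = proj(torch.cat([x, role], dim=-1))
        return x  # can be fed into any standard transformer encoder/decoder


if __name__ == "__main__":
    layer = LogicAwareEmbedding(vocab_size=32000, num_logic_roles=16, d_model=768)
    tokens = torch.randint(0, 32000, (2, 10))
    roles = torch.randint(0, 16, (2, 10, 2))
    print(layer(tokens, roles).shape)  # torch.Size([2, 10, 768])
```

Because the layer only changes how input embeddings are built, a drop-in replacement of this kind is one plausible way an existing pre-trained transformer could be "upgraded" without altering its attention stack, which is consistent with the abstract's claim that the paradigm applies to existing transformer language models.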