Contrastive Multi-document Question Generation

European Chapter of the Association for Computational Linguistics (EACL)

Organized by ACL

Web search engines today return a ranked list of document links in response to a user’s query. However, when a user query is vague, the resulting documents span multiple subtopics. In such a scenario, it would be helpful if the search engine provided clarification options for the user’s initial query, such that each clarification option is closely related to the documents in one subtopic and far from the documents in all other subtopics. Motivated by this scenario, we address the task of contrastive common question generation: given a “positive” set of documents and a “negative” set of documents, we generate a question that is closely related to the “positive” set and far from the “negative” set. We propose the Multi-Source Coordinated Question Generator (MSCQG), a novel coordinator model trained using reinforcement learning to optimize a reward based on a document-question ranker score. We also develop an effective auxiliary objective, named Set-induced Contrastive Regularization (SCR), that draws the coordinator’s generation behavior more closely toward “positive” documents and away from “negative” documents. We show that our model significantly outperforms strong retrieval baselines, as well as a baseline model developed for a similar task, as measured by various metrics.
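To make the contrastive objective concrete, the following is a minimal sketch in Python. It is not the MSCQG model, its reward, or SCR. It only illustrates the idea of preferring a question that a ranker scores highly against the “positive” documents and poorly against the “negative” ones. The function names, the token-overlap ranker, and the mean-difference scoring rule are all assumptions introduced here for illustration. The abstract does not specify how the ranker score is turned into a reward.

```python
from statistics import mean


def toy_ranker_score(question: str, document: str) -> float:
    """Hypothetical stand-in for a learned document-question ranker.

    Uses Jaccard overlap of lowercase whitespace tokens. A real ranker
    would be a trained model; this is an assumption for illustration.
    """
    q_tokens = set(question.lower().split())
    d_tokens = set(document.lower().split())
    if not q_tokens or not d_tokens:
        return 0.0
    return len(q_tokens & d_tokens) / len(q_tokens | d_tokens)


def contrastive_score(question, positive_docs, negative_docs,
                      ranker=toy_ranker_score):
    """Assumed scoring rule: mean ranker score on the positive set
    minus mean ranker score on the negative set.

    Both document sets are assumed to be non-empty.
    """
    pos = mean(ranker(question, d) for d in positive_docs)
    neg = mean(ranker(question, d) for d in negative_docs)
    return pos - neg


# Toy data: an ambiguous query ("jaguar") whose results fall into two subtopics.
positive_docs = [
    "the jaguar car has a fast engine",
    "jaguar car price and engine specs",
]
negative_docs = [
    "the jaguar animal lives in the rainforest",
    "jaguar animal diet and habitat",
]
candidates = [
    "how fast is the jaguar car",
    "what does the jaguar animal eat",
]

# Pick the candidate question that best separates the two sets.
best = max(
    candidates,
    key=lambda q: contrastive_score(q, positive_docs, negative_docs),
)
```

In this toy setup a question is selected from fixed candidates. The paper instead generates the question and, per the abstract, trains the generator with reinforcement learning against a ranker-based reward. A score of this kind is one plausible shape such a reward could take.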