February 6, 2017

The WSDM 2017 Tutorial on Neural Text Embeddings for Information Retrieval

Lieu: Cambridge, UK

In the last few years, neural representation learning approaches have achieved very good performance on many natural language processing tasks, such as language modelling and machine translation. This suggests that neural models will also achieve good performance on information retrieval (IR) tasks, such as relevance ranking, addressing the query-document vocabulary mismatch problem by using a semantic rather than lexical matching. Although initial iterations of neural models do not outperform traditional lexical-matching baselines, the level of interest and effort in this area is increasing, potentially leading to a breakthrough. The popularity of the recent SIGIR 2016 workshop on Neural Information Retrieval provides evidence to the growing interest in neural models for IR. While recent tutorials have covered some aspects of deep learning for retrieval tasks, there is a significant scope for organizing a tutorial that focuses on the fundamentals of representation learning for text retrieval. The goal of this tutorial will be to introduce state-of-the-art neural embedding models and bridge the gap between these neural models with early representation learning approaches in IR (e.g., LSA). We will discuss some of the key challenges and insights in making these models work in practice, and demonstrate one of the toolsets available to researchers interested in this area.