Direct Optimization of Evaluation Measures in Learning to Rank
- Jun Xu,
- Hang Li,
- Tie-Yan Liu,
- Yisha Peng,
- Min Lu,
- Wei-Ying Ma
MSR-TR-2010-171
One of the central issues in learning to rank for Information Retrieval (IR) is to develop algorithms that construct ranking models by directly optimizing the evaluation measures used in IR, such as Mean Average Precision (MAP) and Normalized Discounted Cumulative Gain (NDCG). In this paper, we conduct a comprehensive study of the approach of directly optimizing evaluation measures in learning to rank for IR. We focus on methods that minimize loss functions upper bounding the basic loss function defined on the IR measures. We first provide a general framework for the study, based on upper bound analysis, in which two types of upper bounds are discussed. We then theoretically analyze the two types of upper bounds, show that new algorithms can be derived on the basis of this analysis, and present two such algorithms, AdaRank and PermuRank. We compare the direct optimization methods AdaRank, PermuRank, and SVMmap on benchmark datasets, and also compare them with the conventional methods Ranking SVM and RankBoost. Experimental results show that the methods based on direct optimization of ranking measures consistently outperform the conventional methods; however, no significant difference exists among the performances of the direct optimization methods themselves.
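For concreteness, the two evaluation measures named above can be computed as follows. This is a minimal Python sketch under our own assumptions (binary relevance for average precision, gain 2^rel − 1 with a log2 discount for NDCG), not code from the report. The "basic loss" the paper refers to is then simply one minus such a measure, which the studied methods bound from above with smoother surrogate losses.

```python
import math

def dcg_at_k(rels, k):
    """Discounted cumulative gain of a ranked list of relevance grades."""
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(rels[:k]))

def ndcg_at_k(rels, k):
    """NDCG: DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(rels, reverse=True), k)
    return dcg_at_k(rels, k) / ideal if ideal > 0 else 0.0

def average_precision(rels):
    """AP for one query, assuming binary relevance and that every relevant
    document appears in the ranked list; MAP averages AP over queries."""
    hits, total = 0, 0.0
    for i, r in enumerate(rels):
        if r:
            hits += 1
            total += hits / (i + 1)
    return total / hits if hits else 0.0

# The basic loss on a query is 1 - measure, e.g. 1 - ndcg_at_k(rels, 10);
# it is non-continuous in the model scores, hence the surrogate upper bounds.

print(ndcg_at_k([3, 2, 3, 0, 1, 2], k=6))   # ~0.949
print(average_precision([1, 0, 1, 1, 0]))   # (1/1 + 2/3 + 3/4) / 3 ~= 0.806
```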
Publication Downloads
AdaRank
April 11, 2011
The basic idea of AdaRank is to repeatedly construct "weak rankers" on re-weighted training queries and to linearly combine these weak rankers to make ranking predictions. In learning, AdaRank minimizes a loss function directly defined on the performance measures. The details of AdaRank can be found in the paper "AdaRank: A Boosting Algorithm for Information Retrieval."
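A minimal sketch of that loop, under our own interface assumptions (queries as (docs, labels) pairs, weak rankers as document-scoring functions, and an `evaluate` callback returning a measure such as NDCG in [0, 1]); the coefficient and weight updates follow the formulas in the AdaRank paper, but this is an illustration, not the released implementation.

```python
import math

def adarank(queries, weak_rankers, evaluate, rounds=50):
    """AdaRank-style boosting sketch.

    queries      : list of (docs, labels) pairs, one per training query
    weak_rankers : candidate scoring functions, each mapping a doc to a float
    evaluate     : evaluate(scorer, docs, labels) -> IR measure of the list
                   ranked by scorer, assumed to lie in [0, 1) here
    """
    n = len(queries)
    p = [1.0 / n] * n               # uniform initial weights over queries
    ensemble = []                   # (alpha, weak_ranker) pairs

    for _ in range(rounds):
        # 1. Select the weak ranker with the best weighted performance.
        def weighted(h):
            return sum(w * evaluate(h, d, l) for w, (d, l) in zip(p, queries))
        h = max(weak_rankers, key=weighted)

        # 2. Set its combination coefficient from its weighted performance
        #    (assumes perf < 1, otherwise the log diverges).
        perf = weighted(h)
        alpha = 0.5 * math.log((1 + perf) / (1 - perf))
        ensemble.append((alpha, h))

        # 3. The ranking model is the linear combination of weak rankers.
        def f(doc):
            return sum(a * g(doc) for a, g in ensemble)

        # 4. Re-weight queries: those the current model ranks poorly gain
        #    weight, so later weak rankers focus on them.
        exps = [math.exp(-evaluate(f, d, l)) for d, l in queries]
        z = sum(exps)
        p = [e / z for e in exps]

    return ensemble
```

In practice the weak rankers are typically built from single features, so the selection in step 1 reduces to picking the feature with the best weighted measure on the current query distribution.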