RLNF: Reinforcement Learning based Noise Filtering for Click-Through Rate Prediction
Click-through rate (CTR) prediction aims to recall the advertisements that users are interested in and thereby lead users to click, which is critically important for a wide range of online advertising systems. In practice, CTR prediction is generally formulated as a conventional binary classification problem, in which clicked advertisements are positive samples and all others are negative samples. However, directly treating unclicked advertisements as negative samples suffers from severe label noise, since there are many reasons why a user may be interested in an advertisement yet not click it. For instance, the layout of the advertisement may not be eye-catching enough to draw the user’s attention. Moreover, in many online advertising systems, positive samples are only observed after relatively long delays. To address this issue, we propose a reinforcement learning based noise filtering approach, dubbed RLNF, which employs a noise filter to select effective negative samples. In RLNF, the selected negative samples are used to enhance the CTR prediction model, while the noise filter is in turn enhanced through reinforcement learning, with the performance of the CTR prediction model serving as the reward. By alternating between these two enhancements, the performance of both the noise filter and the CTR prediction model is improved. In our experiments, we equip 7 state-of-the-art CTR prediction models with RLNF. Extensive experiments on a public dataset and an industrial dataset show that RLNF significantly improves the performance of all 7 CTR prediction models, which indicates both the effectiveness and the generality of RLNF.
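The alternating scheme described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the CTR model is a plain logistic regression, the noise filter is a per-sample Bernoulli keep/drop policy updated with a REINFORCE-style gradient, the reward is the negative log loss on a held-out set, and the names `rlnf_sketch`, `train_ctr`, `theta`, and `w_ctr` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(y, p, eps=1e-7):
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def train_ctr(w, X, y, lr=0.1, steps=20):
    """A few gradient steps of logistic regression (stand-in CTR model)."""
    for _ in range(steps):
        p = sigmoid(X @ w)
        w = w - lr * X.T @ (p - y) / len(y)
    return w

def rlnf_sketch(X, y, X_val, y_val, rounds=30, filter_lr=0.5, seed=0):
    """Alternate between updating the CTR model and the noise filter."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    w_ctr = np.zeros(d)   # CTR model parameters (assumed linear)
    theta = np.zeros(d)   # noise-filter policy parameters (assumed linear)
    pos, neg = X[y == 1], X[y == 0]
    baseline = None
    for _ in range(rounds):
        # Noise filter: sample a keep/drop action for every unclicked sample.
        keep_prob = sigmoid(neg @ theta)
        keep = rng.random(len(neg)) < keep_prob

        # CTR step: train on the positives plus the kept negatives only.
        X_sel = np.vstack([pos, neg[keep]])
        y_sel = np.concatenate([np.ones(len(pos)), np.zeros(keep.sum())])
        w_ctr = train_ctr(w_ctr, X_sel, y_sel)

        # Reward: performance of the CTR model (assumed: negative
        # held-out log loss), with a moving-average baseline.
        reward = -log_loss(y_val, sigmoid(X_val @ w_ctr))
        baseline = reward if baseline is None else 0.9 * baseline + 0.1 * reward

        # Filter step: REINFORCE update; for a Bernoulli policy the
        # gradient of log pi(action) w.r.t. theta is (action - prob) * x.
        grad = neg.T @ (keep.astype(float) - keep_prob) / len(neg)
        theta = theta + filter_lr * (reward - baseline) * grad
    return w_ctr, theta
```

Calling `rlnf_sketch(X_train, y_train, X_val, y_val)` with NumPy feature matrices and 0/1 label vectors returns the parameters of both components. The moving-average baseline is a common variance-reduction choice for policy gradients and is likewise an assumption of this sketch.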