Reinforcing Pretrained Models for Generating Attractive Text Advertisements
- Xiting Wang
- Xinwei Gu
- Jie Cao
- Zihua Zhao
- Yulan Yan
- Bhuvan Middha
- Xing Xie
ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM SIGKDD), Applied Data Science Track
We study how pretrained language models can be enhanced with deep reinforcement learning to generate attractive text advertisements that meet the high quality standards of real-world advertising media. To improve ad attractiveness without hampering user experience, we propose a model-based reinforcement learning framework for text ad generation, which builds a model of the environment dynamics and thereby avoids high sample complexity. Based on this framework, we develop Masked-Sequence Policy Gradient, a reinforcement learning algorithm that integrates efficiently with pretrained models and explores the action space effectively. Our method has been deployed to production in Microsoft Bing. Automatic offline experiments, human evaluation, and online experiments demonstrate the superior performance of our method.