NICE: Neural Image Commenting Evaluation with an Emphasis on Emotion and Empathy
- Kezhen Chen
- Qiuyuan Huang
- Daniel McDuff
- Jianfeng Wang
- Hamid Palangi
- Xiang Gao
- Kevin Li
- Kenneth Forbus
- Jianfeng Gao
NeurIPS 2020, Contributed talk in HLDS workshop.
Emotion and empathy are examples of human qualities lacking in many human-machine interactions. The goal of our work is to generate engaging dialogue grounded in a user-shared image with increased emotion and empathy while minimizing socially inappropriate or offensive outputs. We release the Neural Image Commenting Evaluation (NICE) dataset, consisting of almost two million images and their corresponding human-generated comments, as well as a set of baseline models and over 28,000 human-annotated samples. Instead of relying on manually labeled emotions, we also use automatically generated linguistic representations as a source of weakly supervised labels. Based on the annotations, we define two different task settings on the NICE dataset. We then propose a novel model, Modeling Affect Generation for Image Comments (MAGIC), which generates comments for images conditioned on linguistic representations that capture style and affect, with the aim of producing more empathetic, emotional, engaging, and socially appropriate comments. Using this model, we achieve state-of-the-art performance on one setting and set a benchmark for the NICE dataset. Experiments show that our proposed method can generate more human-like and engaging image comments.