Video Annotation Through Search and Graph Reinforcement Mining

Emily Moxley; Tao Mei; B. S. Manjunath

Video Annotation Through Search and Graph Reinforcement Mining

Emily Moxley ,
Tao Mei ,
B. S. Manjunath

IEEE Trans. on Multimedia | August 2010

2013 IEEE Transactions on Multimedia Prize Paper Award

下载 BibTex

Unlimited vocabulary annotation of multimedia documents remains elusive despite progress solving the problem in the case of a small, fixed lexicon. Taking advantage of the repetitive nature of modern information and online media databases with independent annotation instances, we present an approach to automatically annotate multimedia documents that uses mining techniques to discover new annotations from similar documents and to filter existing incorrect annotations. The annotation set is not limited to words that have training data or for which models have been created. It is limited only by the words in the collective annotation vocabulary of all the database documents. A graph reinforcement method driven by a particular modality (e.g., visual) is used to determine the contribution of a similar document to the annotation target. The graph supplies possible annotations of a different modality (e.g., text) that can be mined for annotations of the target. Experiments are performed using videos crawled from YouTube. A customized precision-recall metric shows that the annotations obtained using the proposed method are superior to those originally existing for the document. These extended, filtered tags are also superior to a state-of-the-art semi-supervised technique for graph reinforcement learning on the initial user-supplied annotations.