Multimodal semi-supervised image classification by combining tag refinement, graph-based learning and support vector regression
We investigate an image classification task where the training images come along with tags, but only a subset being labeled, and the goal is to predict the class label of test images without tags. This task is crucial for image search engine on photo sharing websites. In previous work, it is handled by first learning a multiple kernel learning classifier using both image content and tags to score unlabeled training images, and then building up a least-squares regression (LSR) model on visual features to predict the label of test images. However, there exist three important issues in the task: 1) Image tags on photo sharing websites tend to be inaccurate and incomplete, and thus refining them is beneficial; 2) Supervised learning with a limited number of labeled samples may be unreliable to some extent, while a graph-based semi-supervised approach can be adopted by alsoconsideringsimilaritiesofunlabeleddata;3) LSR is established upon centered visual kernel columns and breaks the symmetry of kernel matrix, whereas support vector regression can readily use the original visual kernel and thus leverage its full power. To handle the task more effectively, we propose to combine tag refinement, graph-based learning and support vector regression together. Experimental results on the PASCAL VOC’07 and MIR Flickr datasets show the superior performance of the proposed approach.