Shi Weiwei, Gong Yihong, Tao Xiaoyu, Zheng Nanning
IEEE Trans Neural Netw Learn Syst. 2018 Jul;29(7):2896-2908. doi: 10.1109/TNNLS.2017.2705222. Epub 2017 Jun 13.
In this paper, we build a multilabel image classifier using a general deep convolutional neural network (DCNN). We propose a novel objective function that consists of three parts: a max-margin objective, a max-correlation objective, and a correntropy loss. The max-margin objective explicitly enforces that the minimum score of the positive labels must exceed the maximum score of the negative labels by a predefined margin, which not only improves the accuracy of the multilabel classifier but also eases threshold determination. The max-correlation objective makes the DCNN model learn a latent semantic space that maximizes the correlations between the feature vectors of the training samples and their corresponding ground-truth label vectors projected into this space. Instead of the traditional softmax loss, we adopt the correntropy loss from the information theory field to minimize the training errors of the DCNN model. The proposed framework can be trained end-to-end. Comprehensive experimental evaluations on the Pascal VOC 2007 and MIR Flickr 25K multilabel benchmark data sets with four DCNN models, i.e., AlexNet, VGG-16, GoogLeNet, and ResNet, demonstrate that the proposed objective function can remarkably improve the accuracy of a DCNN model on the task of multilabel image classification.
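The two loss terms described in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's exact formulation: the function names, the `margin` and `sigma` values, and the per-sample averaging are assumptions; the max-margin term follows the stated constraint (minimum positive-label score must exceed the maximum negative-label score by a margin), and the correntropy term uses the standard Gaussian-kernel form `1 - exp(-e²/2σ²)`, which saturates for large errors and is thus robust to outliers.

```python
import math

def max_margin_loss(scores, labels, margin=1.0):
    # Hinge penalty: the lowest-scoring positive label must beat the
    # highest-scoring negative label by at least `margin`.
    # scores: per-label real-valued outputs; labels: binary {0, 1} list.
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    return max(0.0, margin - (min(pos) - max(neg)))

def correntropy_loss(preds, targets, sigma=1.0):
    # Correntropy-induced loss: 1 minus a Gaussian kernel of the error.
    # Unlike squared error, it saturates as |pred - target| grows.
    return sum(1.0 - math.exp(-(p - t) ** 2 / (2.0 * sigma ** 2))
               for p, t in zip(preds, targets)) / len(preds)
```

For example, with scores `[2.0, 0.5, -1.0]` and labels `[1, 1, 0]`, the lowest positive score (0.5) already exceeds the highest negative score (-1.0) by more than the margin of 1.0, so the max-margin term is zero; the correntropy term is zero only when predictions match targets exactly.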