

Multi-pseudo Regularized Label for Generated Data in Person Re-Identification.

Author information

Huang Yan, Xu Jingsong, Wu Qiang, Zheng Zhedong, Zhang Zhaoxiang, Zhang Jian

Publication information

IEEE Trans Image Process. 2018 Oct 8. doi: 10.1109/TIP.2018.2874715.

Abstract

Training deeply learned models normally requires a large amount of training data. However, because manually labelling (annotating) a large number of images is expensive, the amount of available training data (i.e., real data) is always limited. To produce more data for training a deep network, a Generative Adversarial Network (GAN) can be used to generate artificial samples (i.e., generated data). However, generated data usually has no annotation labels. To solve this problem, we propose a virtual label called the Multi-pseudo Regularized Label (MpRL) and assign it to the generated data. With MpRL, the generated data supplements the real training data when training a deep neural network in a semi-supervised fashion. To relate the generated data to the real data, MpRL assigns each generated sample a virtual label that reflects how likely the sample is to belong to each of the predefined training classes in the real data domain. Unlike a traditional label, which is usually a single integer, the proposed virtual label is a set of weight-based values, each a number in (0,1] called a multi-pseudo label, that reflects the degree of relation between the generated sample and every predefined class of real data. To verify the effectiveness of MpRL, we carry out a comprehensive evaluation using two state-of-the-art convolutional neural networks (CNNs). Experiments demonstrate that assigning MpRL to generated data further improves person re-identification (re-ID) performance on five re-ID datasets: Market-1501, DukeMTMC-reID, CUHK03, VIPeR, and CUHK01. The proposed method improves rank-1 accuracy over a strong CNN baseline by 6.29%, 6.30%, 5.58%, 5.84%, and 3.48% on the five datasets, respectively, and outperforms state-of-the-art methods.
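
The abstract describes the virtual label only in words, so the following is a rough sketch of the general idea rather than the authors' formulation. It assumes a rank-based weighting (the class with the highest score gets weight 1, the lowest gets 1/K), which is one simple way to produce one value in (0,1] per predefined class; the function names `multi_pseudo_label` and `soft_label_loss` and the example scores are hypothetical.

```python
import numpy as np


def multi_pseudo_label(scores):
    """Assumed rank-based weighting, not the paper's formula.

    Returns one weight in (0, 1] per class: the best-scoring class
    gets 1.0 and the worst-scoring class gets 1/K.
    """
    scores = np.asarray(scores, dtype=float)
    k = scores.size
    order = np.argsort(-scores)      # class indices, best score first
    ranks = np.empty(k, dtype=int)
    ranks[order] = np.arange(k)      # rank 0 = best-scoring class
    return (k - ranks) / k


def soft_label_loss(logits, weights):
    """Cross-entropy against the weights normalised to sum to 1."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max()  # numerically stable log-softmax
    log_probs = shifted - np.log(np.exp(shifted).sum())
    target = weights / weights.sum()
    return float(-(target * log_probs).sum())


# Hypothetical class scores for one generated image over K = 4 identities.
example_scores = [0.1, 2.0, -1.0, 0.5]
example_weights = multi_pseudo_label(example_scores)
example_loss = soft_label_loss(example_scores, example_weights)
print(example_weights, example_loss)
```

For the example scores above, the weights are 0.5, 1.0, 0.25, and 0.75: every class receives a non-zero weight, in contrast to a one-hot label that would mark only a single identity.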

