Kan Shichao, Cen Yigang, He Zhihai, Zhang Zhi, Zhang Linna, Wang Yanhong
IEEE Trans Image Process. 2019 Dec;28(12):5809-5823. doi: 10.1109/TIP.2019.2901407. Epub 2019 Feb 25.
Image representation methods based on deep convolutional neural networks (CNNs) have achieved state-of-the-art performance in various computer vision tasks, such as image retrieval and person re-identification. We observe that more discriminative feature embeddings can be learned by combining supervised deep metric learning with handcrafted features for image retrieval and similar applications. In this paper, we propose a new supervised deep feature embedding that incorporates a handcrafted feature model. To fuse handcrafted feature information into CNNs and realize the feature embedding, we propose a general fusion unit, called Fusion-Net. We also define a network loss function based on image label information to realize supervised deep metric learning. Extensive experimental results on the Stanford Online Products dataset and the In-Shop Clothes Retrieval dataset demonstrate that the proposed methods outperform existing state-of-the-art image retrieval methods by a large margin. Moreover, we explore applications of the proposed methods to person re-identification and vehicle re-identification; the experimental results demonstrate both the effectiveness and the efficiency of the proposed methods.
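The abstract describes two ingredients: a fusion unit that merges a handcrafted feature vector into a CNN embedding, and a label-supervised metric-learning loss. The paper's actual Fusion-Net architecture and loss are not specified here, so the following is only a minimal illustrative sketch under assumed design choices: fusion by concatenation followed by a learned linear projection with L2 normalization, and a standard contrastive loss as a stand-in for the supervised metric-learning objective. All names (`fusion_unit`, `contrastive_loss`) and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def fusion_unit(cnn_feat, hand_feat, W, b):
    """Hypothetical fusion unit: concatenate the CNN embedding with the
    handcrafted feature, apply a learned linear projection, and
    L2-normalize the result (a common choice for retrieval embeddings)."""
    fused = np.concatenate([cnn_feat, hand_feat], axis=-1)
    emb = fused @ W + b
    return emb / np.linalg.norm(emb, axis=-1, keepdims=True)

def contrastive_loss(e1, e2, same_label, margin=0.5):
    """Stand-in supervised metric-learning loss: pull same-label pairs
    together, push different-label pairs at least `margin` apart."""
    d = np.linalg.norm(e1 - e2)
    return d ** 2 if same_label else max(0.0, margin - d) ** 2

# Toy dimensions; real embeddings would be far larger.
cnn_feat = rng.standard_normal(8)    # stand-in for a CNN feature
hand_feat = rng.standard_normal(4)   # stand-in for a handcrafted descriptor
W = rng.standard_normal((12, 6)) * 0.1
b = np.zeros(6)

emb = fusion_unit(cnn_feat, hand_feat, W, b)
print(emb.shape)  # (6,)
```

In a full training loop, the projection weights would be learned jointly with the CNN by minimizing the metric-learning loss over labeled image pairs; this sketch only shows the forward fusion step and the pairwise loss shape.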