Deng Cheng, Chen Zhaojia, Liu Xianglong, Gao Xinbo, Tao Dacheng
IEEE Trans Image Process. 2018 Aug;27(8):3893-3903. doi: 10.1109/TIP.2018.2821921. Epub 2018 Apr 4.
Given the benefits of its low storage requirements and high retrieval efficiency, hashing has recently received increasing attention. In particular, cross-modal hashing has been widely and successfully used in multimedia similarity search applications. However, almost all existing cross-modal hashing methods fail to produce powerful hash codes because they ignore the relative similarity between heterogeneous data, which contains richer semantic information, leading to unsatisfactory retrieval performance. In this paper, we propose a triplet-based deep hashing (TDH) network for cross-modal retrieval. First, we utilize triplet labels, which describe the relative relationships among three instances, as supervision in order to capture more general semantic correlations between cross-modal instances. We then establish a loss function from both the inter-modal view and the intra-modal view to boost the discriminative ability of the hash codes. Finally, graph regularization is introduced into our proposed TDH method to preserve the original semantic similarity between hash codes in Hamming space. Experimental results show that our proposed method outperforms several state-of-the-art approaches on two popular cross-modal datasets.
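To make the triplet supervision concrete, the following is a minimal sketch of a generic triplet margin loss on relaxed (real-valued) hash codes, the standard formulation such methods build on. It is illustrative only, not the paper's full TDH objective (which combines inter-modal and intra-modal terms with graph regularization); the function and variable names here are assumptions.

```python
import numpy as np

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet margin loss on relaxed hash codes.

    Encourages the anchor to be closer to the positive (a semantically
    matching instance, possibly from another modality) than to the
    negative by at least `margin`.
    """
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)  # squared distance to positive
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)  # squared distance to negative
    return np.maximum(0.0, d_pos - d_neg + margin)

# Toy example: 4-bit relaxed codes for an image anchor, a matching text
# positive, and a non-matching text negative (values are illustrative).
anchor = np.array([1.0, -1.0, 1.0, -1.0])
positive = np.array([1.0, -1.0, 1.0, 1.0])   # mostly agrees with the anchor
negative = np.array([-1.0, 1.0, -1.0, 1.0])  # disagrees on every bit
loss = triplet_margin_loss(anchor, positive, negative, margin=1.0)
```

In a cross-modal setting, the anchor might come from one modality (e.g., an image) while the positive and negative come from another (e.g., text), so the loss directly constrains relative similarity across heterogeneous data.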