Chen Shubai, Wu Song, Wang Li
College of Computer and Information Science, Southwest University, Chongqing, People's Republic of China.
College of Electronic and Information Engineering, Southwest University, Chongqing, People's Republic of China.
PeerJ Comput Sci. 2021 May 25;7:e552. doi: 10.7717/peerj-cs.552. eCollection 2021.
Owing to the efficiency of hashing and the high level of abstraction of deep networks, deep hashing has achieved appealing effectiveness and efficiency in large-scale cross-modal retrieval. However, two challenges remain for high-performance cross-modal hashing retrieval: how to efficiently measure the similarity of fine-grained multi-labels across multi-modal data, and how to thoroughly exploit the layer-specific information in the intermediate layers of the networks. In this paper, we therefore propose a novel Hierarchical Semantic Interaction-based Deep Hashing Network (HSIDHN) for large-scale cross-modal retrieval. In the proposed HSIDHN, multi-scale and fusion operations are first applied to each layer of the network. A Bidirectional Bi-linear Interaction (BBI) policy is then designed to achieve hierarchical semantic interaction among different layers, so that the capability of the hash representations is enhanced. Moreover, a dual-similarity measurement ("hard" similarity and "soft" similarity) is designed to calculate the semantic similarity of data from different modalities, aiming to better preserve the semantic correlation of multi-labels. Extensive experimental results on two large-scale public datasets show that the performance of HSIDHN is competitive with state-of-the-art deep cross-modal hashing methods.
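The abstract names the dual-similarity measurement without giving its formulas, so the following is only a minimal sketch under stated assumptions rather than the authors' definition. It assumes, as is common in the cross-modal hashing literature, that "hard" similarity is 1 when two samples share at least one label and 0 otherwise, and that "soft" similarity is the cosine similarity of their multi-hot label vectors. The function names and the example label vectors are hypothetical.

```python
import math


def hard_similarity(a, b):
    """Assumed definition: 1 if the multi-hot label vectors share any label, else 0."""
    return 1 if any(x and y for x, y in zip(a, b)) else 0


def soft_similarity(a, b):
    """Assumed definition: cosine similarity of two multi-hot label vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # A sample with no labels is treated as dissimilar to everything.
    return dot / norm if norm else 0.0


# Hypothetical multi-hot labels for an image, a matching text, and an unrelated text.
image_labels = [1, 1, 0, 1]
text_labels = [1, 0, 0, 1]
unrelated_labels = [0, 0, 1, 0]

print(hard_similarity(image_labels, text_labels), soft_similarity(image_labels, text_labels))
print(hard_similarity(image_labels, unrelated_labels), soft_similarity(image_labels, unrelated_labels))
```

Under these assumptions, the hard score only records whether a pair is related at all, while the soft score grades how much of the label set the two samples share, which is the fine-grained multi-label information the abstract says the dual measurement is meant to preserve.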