Jin Lu, Li Kai, Li Zechao, Xiao Fu, Qi Guo-Jun, Tang Jinhui
IEEE Trans Neural Netw Learn Syst. 2019 May;30(5):1429-1440. doi: 10.1109/TNNLS.2018.2869601. Epub 2018 Oct 1.
Cross-modal hashing has attracted increasing research attention because of its efficiency in large-scale multimedia retrieval. By learning feature representations and hash functions simultaneously, deep cross-modal hashing (DCMH) methods have shown superior performance. However, most existing DCMH methods adopt binary quantization functions (e.g., [Formula: see text]) to generate hash codes, which limits retrieval performance because binary quantization functions are sensitive to variations in numeric values. To this end, we propose a novel end-to-end ranking-based hashing framework, termed deep semantic-preserving ordinal hashing (DSPOH), which learns hash functions with deep neural networks by exploring the ranking structure of feature dimensions. In DSPOH, hash codes are generated from the ordinal representation, which encodes the relative rank ordering of feature dimensions. Such an ordinal embedding benefits from the numeric stability of rank correlation measures. To make the hash codes discriminative, the ordinal representation is expected to predict the class labels well, so that the ranking-based hash function learning is optimally compatible with label prediction. Meanwhile, the intermodality similarity is preserved to guarantee that the hash codes of different modalities are consistent. Importantly, DSPOH can be effectively integrated with different types of network architectures, which demonstrates the flexibility and scalability of the proposed hashing framework. Extensive experiments on three widely used multimodal data sets show that DSPOH outperforms state-of-the-art methods on cross-modal retrieval tasks.
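The contrast between binary quantization and rank-based encoding can be illustrated with a small sketch. It is not the DSPOH method itself: the abstract does not say how the ordinal codes are built, so the sketch assumes a simple winner-take-all scheme that records the index of the largest value in each group of feature dimensions. It also assumes thresholding at zero as a stand-in for the binary quantization function, which the abstract leaves unspecified. The names `sign_hash`, `ordinal_hash`, and `group_size` are hypothetical.

```python
import numpy as np

def sign_hash(features):
    """Binary quantization (assumed form): one bit per dimension, 1 where the value is positive."""
    return (features > 0).astype(np.int8)

def ordinal_hash(features, group_size=4):
    """Rank-based code (assumed winner-take-all form): index of the largest value in each group."""
    groups = features.reshape(-1, group_size)
    return groups.argmax(axis=1)

# A toy 8-dimensional feature vector and two perturbed versions of it.
x = np.array([0.2, -0.1, 0.9, 0.4, -0.3, 0.05, -0.6, -0.2])
shifted = x + 0.25   # small uniform offset: some values cross zero
scaled = 3.0 * x     # positive rescaling

# The thresholded bits change under the offset because values near zero flip.
print(sign_hash(x), sign_hash(shifted))

# The rank-based codes depend only on the relative order within each group,
# so they are unchanged by the offset and by the rescaling.
print(ordinal_hash(x), ordinal_hash(shifted), ordinal_hash(scaled))
```

In this toy case the uniform offset flips two of the eight thresholded bits, while the rank-based code stays the same, because any strictly increasing transformation of the values preserves their order within each group.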