Multimodal Discriminative Binary Embedding for Large-Scale Cross-Modal Retrieval.

Publication information

IEEE Trans Image Process. 2016 Oct;25(10):4540-54. doi: 10.1109/TIP.2016.2592800. Epub 2016 Jul 18.

DOI:10.1109/TIP.2016.2592800

Abstract

Multimodal hashing, which performs effective and efficient nearest neighbor search across heterogeneous data in large-scale multimedia databases, has attracted increasing interest given the explosive growth of multimedia content on the Internet. Recent multimodal hashing research mainly aims to learn compact binary codes that preserve the semantic information given by labels. The overwhelming majority of these methods are similarity-preserving approaches, which approximate the pairwise similarity matrix with Hamming distances between the to-be-learned binary hash codes. However, these methods ignore the discriminative property in the hash learning process, which leaves hash codes from different classes indistinguishable and therefore reduces the accuracy and robustness of nearest neighbor search. To this end, we present a novel multimodal hashing method, named multimodal discriminative binary embedding (MDBE), which focuses on learning discriminative hash codes. First, the proposed method formulates hash function learning in terms of classification, where the binary codes generated by the learned hash functions are expected to be discriminative. Second, it exploits the label information to discover the shared structures inside heterogeneous data. Finally, the learned structures are preserved in the hash codes so that samples in the same class receive similar binary codes. Hence, the proposed MDBE preserves both discriminability and similarity in the hash codes and thereby enhances retrieval accuracy. Thorough experiments on benchmark data sets demonstrate that the proposed method achieves excellent accuracy and competitive computational efficiency compared with state-of-the-art methods for the large-scale cross-modal retrieval task.
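
The abstract does not give MDBE's objective or optimization, so the following is only a minimal sketch of the retrieval step that multimodal hashing methods in general rely on: each modality has its own hash function that maps features into a shared Hamming space, and a query from one modality is answered by ranking database items from another modality by Hamming distance. The linear sign-based hash function, the function names, and the hand-picked projection weights are all assumptions made for illustration; they are not the learned hash functions of MDBE.

```python
from typing import List, Sequence


def hash_code(features: Sequence[float], projections: Sequence[Sequence[float]]) -> int:
    """Map a feature vector to a binary code packed into an int.

    Assumed form: bit k is 1 when the k-th linear projection is non-negative.
    """
    code = 0
    for row in projections:
        activation = sum(w * x for w, x in zip(row, features))
        code = (code << 1) | (1 if activation >= 0 else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two packed codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, database: List[int]) -> List[int]:
    """Return database indices sorted by Hamming distance to the query."""
    return sorted(range(len(database)), key=lambda i: hamming(query, database[i]))


# Hypothetical per-modality projections (hand-picked here, not learned).
W_IMAGE = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]            # 2-D image features -> 3 bits
W_TEXT = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, -1.0]]  # 3-D text features -> 3 bits

# Toy image database and a text query, both encoded into the same 3-bit space.
image_db = [hash_code(x, W_IMAGE) for x in ([2.0, -1.0], [-1.0, 3.0], [-2.0, -0.5])]
text_query = hash_code([-1.0, 2.0, 0.5], W_TEXT)
ranking = rank_by_hamming(text_query, image_db)
print(ranking)
```

Packing each code into an integer makes the distance computation a single XOR followed by a bit count, which is why compact binary codes suit large-scale search. A discriminative method such as the one described above would differ in how the projections are learned, not in this ranking step.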
