Yang Erkun, Deng Cheng, Li Chao, Liu Wei, Li Jie, Tao Dacheng
IEEE Trans Neural Netw Learn Syst. 2018 Nov;29(11):5292-5303. doi: 10.1109/TNNLS.2018.2793863. Epub 2018 Feb 14.
With the explosive growth of data volume and the ever-increasing diversity of data modalities, cross-modal similarity search, which conducts nearest neighbor search across different modalities, has been attracting increasing interest. This paper presents a deep compact code learning solution for efficient cross-modal similarity search. Many recent studies have shown that quantization-based approaches generally outperform hashing-based approaches on single-modal similarity search. In this paper, we propose a deep quantization approach, among the early attempts to leverage deep neural networks for quantization-based cross-modal similarity search. Our approach, dubbed shared predictive deep quantization (SPDQ), explicitly formulates a shared subspace across different modalities and two private subspaces for the individual modalities; representations in the shared and private subspaces are learned simultaneously by embedding them into a reproducing kernel Hilbert space, where the mean embeddings of different modality distributions can be explicitly compared. In addition, in the shared subspace, a quantizer is learned to produce semantics-preserving compact codes with the help of label alignment. Thanks to this novel network architecture combined with supervised quantization training, SPDQ preserves intramodal and intermodal similarities as much as possible and greatly reduces quantization error. Experiments on two popular benchmarks corroborate that our approach outperforms state-of-the-art methods.
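The abstract's key mechanism for aligning modalities is comparing the mean embeddings of their feature distributions in a reproducing kernel Hilbert space, i.e., a maximum mean discrepancy (MMD) criterion. The following is a minimal illustrative sketch of that comparison, not the paper's implementation: the Gaussian kernel bandwidth, feature dimensions, and the synthetic "image" and "text" features are all assumptions chosen for demonstration.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Gaussian (RBF) kernel matrix from pairwise squared Euclidean distances.
    d2 = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

def mmd2(X, Y, gamma=1.0):
    # Squared MMD = ||mu_X - mu_Y||^2 in the RKHS induced by the kernel:
    # E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)], estimated with sample means.
    return (rbf_kernel(X, X, gamma).mean()
            + rbf_kernel(Y, Y, gamma).mean()
            - 2.0 * rbf_kernel(X, Y, gamma).mean())

rng = np.random.default_rng(0)
# Stand-ins for learned shared-subspace features of two modalities
# (hypothetical data; the paper uses deep network outputs).
img_feats = rng.normal(loc=0.0, scale=1.0, size=(200, 16))
txt_feats = rng.normal(loc=0.5, scale=1.0, size=(200, 16))

# MMD of a distribution against itself is exactly zero for this estimator,
# while shifted distributions yield a strictly positive discrepancy.
print(mmd2(img_feats, img_feats))
print(mmd2(img_feats, txt_feats))
```

Driving this discrepancy toward zero for the shared subspace (while keeping modality-specific structure in the private subspaces) is the intuition behind making cross-modal representations directly comparable before quantization.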