Wang Min, Zhou Wengang, Yao Xin, Tian Qi, Li Houqiang
IEEE Trans Pattern Anal Mach Intell. 2024 Jan;46(1):626-640. doi: 10.1109/TPAMI.2023.3324021. Epub 2023 Dec 5.
As a classical feature compression technique, quantization is usually coupled with inverted indices for scalable image retrieval. Most quantization methods explicitly divide the feature space into Voronoi cells and quantize the feature vectors in each cell to centroids learned from the data distribution. However, Voronoi decomposition struggles to achieve a discriminative space partition for semantic image retrieval. In this paper, we explore semantic-aware feature space partition with a deep neural network instead of Voronoi cells. To this end, we propose a new deep probabilistic quantization method, abbreviated as DeepIndex, which constructs inverted indices without explicit centroid learning. In our method, the deep neural network takes an image as input and outputs its probability of being assigned to each inverted index list. During training, we progressively quantize each image into the inverted lists with the top-T maximal probabilities and calculate the reward of each trial based on retrieval accuracy. We then optimize the deep neural network to maximize the probability of the inverted list with the maximal reward. In this way, the retrieval performance is directly optimized, leading to a more semantically discriminative space partition than other quantization methods. Experiments on public image datasets demonstrate the effectiveness of our DeepIndex method for semantic image retrieval.
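The training loop described above can be sketched in a few lines. The following is a minimal NumPy sketch under stated assumptions, not the authors' implementation: a hypothetical linear scorer stands in for the deep network, a toy reward function stands in for retrieval accuracy, and the list sizes `K` and `T` are illustrative values not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 8   # number of inverted index lists (toy size, not from the paper)
T = 3   # quantize each image into the top-T most probable lists

# Hypothetical stand-in for the deep network: a linear scorer over features.
W = rng.normal(scale=0.1, size=(K, 16))

def list_probabilities(x):
    """Softmax over the K inverted lists for feature vector x."""
    logits = W @ x
    e = np.exp(logits - logits.max())
    return e / e.sum()

def training_step(x, reward_fn, lr=0.1):
    """One DeepIndex-style trial: take the top-T most probable lists,
    score each with the reward, then take a gradient-ascent step on
    log p[best] (softmax identity: d log p[b] / d logits = onehot(b) - p)."""
    global W
    p = list_probabilities(x)
    candidates = np.argsort(p)[-T:]          # top-T maximal probabilities
    best = max(candidates, key=reward_fn)    # inverted list with maximal reward
    grad_logits = -p
    grad_logits[best] += 1.0
    W += lr * np.outer(grad_logits, x)       # maximize log p[best]
    return best

# Toy run: one image feature; the toy reward pretends the list the model
# already prefers is the semantically correct one, so training should
# sharpen the output distribution around that list.
x = rng.normal(size=16)
target = int(np.argmax(list_probabilities(x)))
p_before = list_probabilities(x)[target]
for _ in range(50):
    training_step(x, reward_fn=lambda c: 1.0 if c == target else 0.0)
p_after = list_probabilities(x)[target]
```

In the actual method the reward comes from retrieval accuracy on training queries, so the partition is optimized directly for the end task rather than for reconstruction error as in centroid-based quantization.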