通用哈希问题的等效连续公式。

Equivalent Continuous Formulation of General Hashing Problem.

作者信息

Wang Shengnan, Li Chunguang, Shen Hui-Liang

出版信息

IEEE Trans Cybern. 2021 Aug;51(8):4089-4099. doi: 10.1109/TCYB.2019.2894020. Epub 2021 Aug 4.

DOI:10.1109/TCYB.2019.2894020

Abstract

Hashing-based approximate nearest neighbors search has attracted broad research interest, due to its low computational cost and fast retrieval speed. The hashing technique maps the data points into binary codes and, meanwhile, preserves the similarity in the original space. Generally, we need to solve a discrete optimization problem to learn the binary codes and hash functions, which is NP-hard. In the literature, most hashing methods choose to solve a relaxed problem by discarding the discrete constraints. However, such a relaxation scheme will cause large quantization error, which makes the learned binary codes less effective. In this paper, we present an equivalent continuous formulation of the discrete hashing problem. Specifically, we show that the discrete hashing problem can be transformed into a continuous optimization problem without any relaxations, while the transformed continuous optimization problem has the same optimal solutions and the same optimal value as the original discrete hashing problem. After transformation, the continuous optimization methods can be applied. We devise the algorithms based on the idea of DC (difference of convex functions) programming to solve this problem. The proposed continuous hashing scheme can be easily applied to the existing hashing models, including both supervised and unsupervised hashing. We evaluate the proposed method on several benchmarks and the results show the superiority of the proposed method compared with the state-of-the-art hashing methods.

摘要

基于哈希的近似最近邻搜索因其低计算成本和快速检索速度而吸引了广泛的研究兴趣。哈希技术将数据点映射为二进制代码，同时保留原始空间中的相似性。通常，我们需要解决一个离散优化问题来学习二进制代码和哈希函数，这是一个NP难问题。在文献中，大多数哈希方法选择通过丢弃离散约束来解决一个松弛问题。然而，这种松弛方案会导致较大的量化误差，使得学习到的二进制代码效果较差。在本文中，我们提出了离散哈希问题的一个等价连续形式。具体来说，我们表明离散哈希问题可以在不进行任何松弛的情况下转化为一个连续优化问题，而转化后的连续优化问题与原始离散哈希问题具有相同的最优解和相同的最优值。转化后，可以应用连续优化方法。我们基于DC（凸函数之差）规划的思想设计算法来解决这个问题。所提出的连续哈希方案可以很容易地应用于现有的哈希模型，包括有监督和无监督哈希。我们在几个基准上评估了所提出的方法，结果表明所提出的方法相对于现有最先进的哈希方法具有优越性。