IEEE Trans Pattern Anal Mach Intell. 2019 Jun;41(6):1501-1514. doi: 10.1109/TPAMI.2018.2833865. Epub 2018 May 8.
Binary descriptors have been widely used for efficient image matching and retrieval. However, most existing binary descriptors are designed with handcrafted sampling patterns or learned from label annotations provided by datasets. In this paper, we propose a new unsupervised deep learning approach, called DeepBit, to learn compact binary descriptors for efficient visual object matching. We enforce three criteria on the binary descriptors, which are learned at the top layer of the deep neural network: 1) minimal quantization loss, 2) evenly distributed codes, and 3) transformation-invariant bits. We then estimate the parameters of the network by optimizing the proposed objectives with back-propagation. Extensive experimental results on various visual recognition tasks demonstrate the effectiveness of the proposed approach. We further demonstrate that the proposed approach can be realized on a simplified deep neural network, enabling efficient image matching and retrieval with very competitive accuracy.
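The three criteria can be made concrete with a small sketch. The abstract does not give the exact objective, so everything below is an illustrative assumption rather than the DeepBit formulation: the function names, the 0.5 threshold, the squared-error form of each term, and the weights `alpha`, `beta`, and `gamma` are all hypothetical. Activations are assumed to lie in [0, 1], as they would after a sigmoid layer.

```python
from typing import List, Sequence

Vector = Sequence[float]


def binarize(activations: Vector, threshold: float = 0.5) -> List[int]:
    """Threshold real-valued activations into a binary code (assumed rule)."""
    return [1 if a >= threshold else 0 for a in activations]


def quantization_loss(activations: Vector, threshold: float = 0.5) -> float:
    """Criterion 1 (assumed form): mean squared gap between each
    activation and the bit it is quantized to."""
    bits = binarize(activations, threshold)
    return sum((a - b) ** 2 for a, b in zip(activations, bits)) / len(activations)


def even_distribution_loss(batch: Sequence[Vector], target: float = 0.5) -> float:
    """Criterion 2 (assumed form): penalize bit positions whose mean
    activation over the batch drifts away from `target`."""
    n_bits = len(batch[0])
    means = [sum(row[k] for row in batch) / len(batch) for k in range(n_bits)]
    return sum((m - target) ** 2 for m in means) / n_bits


def invariance_loss(original: Vector, transformed: Vector) -> float:
    """Criterion 3 (assumed form): mean squared difference between the
    activations of an image and of a transformed copy (e.g. rotated)."""
    return sum((a - b) ** 2 for a, b in zip(original, transformed)) / len(original)


def total_loss(
    batch: Sequence[Vector],
    transformed_batch: Sequence[Vector],
    alpha: float = 1.0,
    beta: float = 1.0,
    gamma: float = 1.0,
) -> float:
    """Weighted sum of the three terms; the weights are hypothetical."""
    n = len(batch)
    q = sum(quantization_loss(row) for row in batch) / n
    e = even_distribution_loss(batch)
    r = sum(invariance_loss(a, b) for a, b in zip(batch, transformed_batch)) / n
    return alpha * q + beta * e + gamma * r


def hamming_distance(code_a: Sequence[int], code_b: Sequence[int]) -> int:
    """Number of differing bits; the usual metric for matching binary codes."""
    return sum(x != y for x, y in zip(code_a, code_b))
```

In a real network these terms would be computed on the top-layer activations and minimized by back-propagation in an automatic-differentiation framework. The pure-Python version here only shows what each term measures, and why matching reduces to a cheap Hamming distance once the descriptors are binarized.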