Deep parameter-free attention hashing for image retrieval.

Affiliations

College of Software, Xinjiang University, Urumqi, 830046, China.

College of Information Science and Engineering, Xinjiang University, Urumqi, 830046, China.

Publication information

Sci Rep. 2022 Apr 30;12(1):7082. doi: 10.1038/s41598-022-11217-5.

DOI:10.1038/s41598-022-11217-5

PMID:35490175

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9056524/

Abstract

Deep hashing is widely used in image retrieval because of its low storage cost and fast retrieval speed. Existing deep hashing methods, however, extract insufficient features when they use a convolutional neural network (CNN) to capture the semantic features of images. Some studies propose adding channel-based or spatial-based attention modules, but embedding these modules into the network increases model complexity and can lead to overfitting during training. To address these problems, this study proposes a novel deep parameter-free attention hashing (DPFAH) method, which embeds a parameter-free attention (PFA) module in the ResNet18 network. PFA is a lightweight module that defines an energy function to measure the importance of each neuron and infers 3-D attention weights for the feature map in a layer. Because this energy function has a fast closed-form solution, the PFA module adds no parameters to the network. In addition, this paper designs a novel hashing framework that includes a hash code learning branch and a classification branch to exploit more label information. The binary-like codes are constrained by a regularization term to reduce the quantization error introduced by continuous relaxation. Experiments on CIFAR-10, NUS-WIDE and ImageNet-100 show that the DPFAH method achieves better performance.
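The abstract does not give the energy function or the regularization term, so the following is only a minimal NumPy sketch of how such components are commonly written. It assumes that the PFA weights follow the closed-form inverse energy used by SimAM-style parameter-free attention, and that the quantization penalty is the mean squared gap between the relaxed codes and their signs. The function names, the constant `lam`, and both formulas are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np


def parameter_free_attention(x, lam=1e-4):
    """Reweight a feature map x of shape (C, H, W) without learned parameters.

    Assumption: SimAM-style closed-form inverse energy, computed per channel.
    Requires H * W > 1 so that the variance estimate is defined.
    """
    n = x.shape[1] * x.shape[2] - 1                  # neurons per channel minus one
    mu = x.mean(axis=(1, 2), keepdims=True)          # per-channel mean
    d = (x - mu) ** 2                                # squared deviation of each neuron
    var = d.sum(axis=(1, 2), keepdims=True) / n      # per-channel variance
    inv_energy = d / (4.0 * (var + lam)) + 0.5       # larger value = more distinctive neuron
    weights = 1.0 / (1.0 + np.exp(-inv_energy))      # sigmoid gives one weight per neuron (3-D)
    return x * weights, weights


def quantization_penalty(h):
    """Mean squared distance between relaxed codes h and the nearest of -1 or +1.

    Assumption: a common regularizer form; the abstract only mentions
    "a regularization term".
    """
    return float(np.mean((np.abs(h) - 1.0) ** 2))
```

Under these assumptions the attention step uses only per-channel statistics of the input, so it introduces no trainable weights, and the penalty is zero exactly when every relaxed code entry is already -1 or +1.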
