Wang Zihao, Li Zhen, Li Xueyi, Chen Wenjie, Liu Xiangdong
IEEE Trans Neural Netw Learn Syst. 2024 Apr;35(4):4839-4851. doi: 10.1109/TNNLS.2022.3208837. Epub 2024 Apr 4.
In task environments full of repetitive textures, state-of-the-art description and detection methods for local features suffer greatly from "pseudo-negatives," which introduce inconsistent optimization objectives during training. To address this problem, this article develops a self-supervised graph-based contrastive learning framework, GCLFeat, to train a model for local features. The proposed approach learns to alleviate pseudo-negatives from three aspects: 1) designing a graph neural network (GNN) that mines local transformational invariance across different views and global textural knowledge within individual images; 2) generating dense correspondence annotations from a diverse natural-image dataset with a self-supervised paradigm; and 3) adopting a keypoint-aware sampling strategy to compute the loss across the whole dataset. Experimental results show that the unsupervised framework outperforms state-of-the-art supervised baselines on diverse downstream benchmarks, including image matching, 3-D reconstruction, and visual localization. The code will be made publicly available at https://github.com/RealZihaoWang/GCLFeat.
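To make the pseudo-negative problem concrete: in a contrastive loss over matched keypoints, every non-matching descriptor is treated as a negative, but under repetitive textures some of those "negatives" depict the same visual content as the positive, so pushing them apart conflicts with the training objective. The sketch below is a minimal, hypothetical illustration of an InfoNCE-style keypoint loss that masks such near-duplicate candidates out of the denominator; the function name, threshold, and masking rule are assumptions for illustration, not the GCLFeat implementation.

```python
import numpy as np

def keypoint_contrastive_loss(desc_a, desc_b, temperature=0.07,
                              pseudo_neg_thresh=0.95):
    """Illustrative InfoNCE-style loss over matched keypoint descriptors.

    desc_a[i] and desc_b[i] describe the same keypoint seen in two views;
    the other rows of desc_b act as negatives. Candidates whose descriptor
    is nearly identical to the true positive (cosine similarity above
    `pseudo_neg_thresh`) are masked out as likely pseudo-negatives, e.g.
    repeats of the same texture. Hypothetical sketch, not the paper's code.
    """
    # L2-normalize so dot products are cosine similarities
    a = desc_a / np.linalg.norm(desc_a, axis=1, keepdims=True)
    b = desc_b / np.linalg.norm(desc_b, axis=1, keepdims=True)
    n = a.shape[0]
    sim = a @ b.T                       # (n, n) anchor-vs-candidate logits
    # similarity of each candidate to the true positive within view B
    pos_sim = b @ b.T
    pseudo = (pos_sim > pseudo_neg_thresh) & ~np.eye(n, dtype=bool)
    logits = sim / temperature
    # numerically stable softmax denominator, with pseudo-negatives dropped
    exp_logits = np.exp(logits - logits.max(axis=1, keepdims=True))
    exp_logits[pseudo] = 0.0
    denom = exp_logits.sum(axis=1)
    pos = exp_logits[np.arange(n), np.arange(n)]
    return float(np.mean(-np.log(pos / denom)))
```

With random descriptors the loss sits near log(n); masking only shrinks the denominator, so filtering suspected pseudo-negatives can only lower (never raise) the per-anchor loss, which is the sense in which it removes the conflicting gradient signal.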