

Contextual Patch-NetVLAD: Context-Aware Patch Feature Descriptor and Patch Matching Mechanism for Visual Place Recognition.

Authors

Sun Wenyuan, Chen Wentang, Huang Runxiang, Tian Jing

Affiliations

Institute of Systems Science, National University of Singapore, Singapore 119615, Singapore.

State Key Laboratory of Fluid Power and Mechatronic Systems, School of Mechanical Engineering, Zhejiang University, Hangzhou 310027, China.

Publication Information

Sensors (Basel). 2024 Jan 28;24(3):855. doi: 10.3390/s24030855.

Abstract

The goal of visual place recognition (VPR) is to determine the location of a query image by identifying its place in a collection of image databases. Visual sensor technologies are crucial for VPR, as they allow for precise identification and localization of query images within a database. Global descriptor-based VPR methods struggle to accurately capture locally specific regions within a scene, which increases the probability of confusion during localization. To tackle the feature extraction and feature matching challenges in VPR, we propose a modified Patch-NetVLAD strategy that includes two new modules: a context-aware patch feature descriptor and a context-aware patch matching mechanism. Firstly, we propose a context-driven patch feature descriptor that overcomes the limitations of global and local descriptors in VPR; this descriptor aggregates features from each patch's surrounding neighborhood. Secondly, we introduce a context-driven feature matching mechanism that utilizes cluster and saliency context-driven weighting rules to assign higher weights to patches that are less similar to densely populated or locally similar regions, improving localization performance. We further incorporate both of these modules into the Patch-NetVLAD framework, resulting in a new approach called Contextual Patch-NetVLAD. Experimental results show that the proposed approach outperforms other state-of-the-art methods, achieving a Recall@10 score of 99.82 on Pittsburgh30k, 99.82 on FMDataset, and 97.68 on our benchmark dataset.
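The two ideas in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the mean-pooling neighborhood aggregation, the saliency rule (down-weighting patches that resemble the rest of the image), and all function names are illustrative assumptions standing in for the paper's context-driven descriptor and weighting mechanism.

```python
import numpy as np

def contextual_patch_descriptors(patches, radius=1):
    """Aggregate each patch descriptor with its spatial neighborhood.

    patches: (H, W, D) grid of L2-normalised patch descriptors.
    Returns a same-shaped grid where each cell is the mean of the
    descriptors in its (2*radius+1)^2 neighborhood, re-normalised.
    """
    H, W, D = patches.shape
    out = np.zeros_like(patches)
    for i in range(H):
        for j in range(W):
            i0, i1 = max(0, i - radius), min(H, i + radius + 1)
            j0, j1 = max(0, j - radius), min(W, j + radius + 1)
            agg = patches[i0:i1, j0:j1].reshape(-1, D).mean(axis=0)
            out[i, j] = agg / (np.linalg.norm(agg) + 1e-12)
    return out

def saliency_weights(patches):
    """Weight patches inversely to their mean similarity with the rest
    of the image, so distinctive patches contribute more to matching."""
    flat = patches.reshape(-1, patches.shape[-1])
    sim = flat @ flat.T                        # pairwise cosine similarities
    mean_sim = (sim.sum(axis=1) - 1.0) / (len(flat) - 1)
    w = 1.0 - mean_sim                         # less similar -> larger weight
    return w / w.sum()

def weighted_match_score(query, reference):
    """Score an image pair by saliency-weighted best-patch similarity."""
    q = query.reshape(-1, query.shape[-1])
    r = reference.reshape(-1, reference.shape[-1])
    sim = q @ r.T                              # query x reference similarities
    w = saliency_weights(query)
    return float((w * sim.max(axis=1)).sum())
```

With normalized descriptors, matching an image against itself yields a score of about 1.0, and distinctive patches (low similarity to the rest of the scene) dominate the score, which is the intuition behind the paper's saliency weighting.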


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c932/10857504/f02368d303a3/sensors-24-00855-g001.jpg
