GFA-Net: Geometry-Focused Attention Network for Six Degrees of Freedom Object Pose Estimation

Authors

Lin Shuai, Yu Junhui, Su Peng, Xue Weitao, Qin Yang, Fu Lina, Wen Jing, Huang Hong

Affiliations

Shandong Non-Metallic Materials Institute, Jinan 250031, China.

Key Laboratory of Optoelectronic Technology and Systems of the Education Ministry of China, Chongqing University, Chongqing 400044, China.

Publication

Sensors (Basel). 2024 Dec 31;25(1):168. doi: 10.3390/s25010168.

Abstract

Six degrees of freedom (6-DoF) object pose estimation is essential for robotic grasping and autonomous driving. While estimating pose from a single RGB image is highly desirable for real-world applications, it presents significant challenges. Many approaches incorporate supplementary information, such as depth data, to derive valuable geometric characteristics. However, deep neural networks still struggle to extract adequate features from object regions in RGB images. To overcome these limitations, we introduce the Geometry-Focused Attention Network (GFA-Net), a novel framework designed for more comprehensive feature extraction through analysis of critical geometric and textural object characteristics. GFA-Net leverages Point-wise Feature Attention (PFA) to capture subtle pose differences, guiding the network to localize object regions and identify point-wise discrepancies as pose shifts. In addition, a Geometry Feature Aggregation Module (GFAM) integrates multi-scale geometric feature maps to distill crucial geometric features. The resulting dense 2D-3D correspondences are then passed to a Perspective-n-Point (PnP) module for 6-DoF pose computation. Experimental results on the LINEMOD and Occlusion LINEMOD datasets indicate that the proposed method is highly competitive with state-of-the-art approaches, achieving 96.54% and 49.35% accuracy, respectively, under the ADD-S metric with a 0.10d threshold.
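
The abstract does not detail PFA's internals, so the following is only a hypothetical sketch of what a point-wise attention gate of this kind typically looks like in PyTorch: a per-pixel weight map is predicted from the features and multiplied back onto them, concentrating responses on object regions. Module and parameter names here are illustrative, not the authors' code.

```python
import torch
import torch.nn as nn

class PointwiseAttention(nn.Module):
    """Hypothetical stand-in for a PFA-style gate: predict a per-pixel
    weight map from the features and re-weight them, so responses
    concentrate on object regions."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels // 4, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W); gate(x): (B, 1, H, W), broadcast over channels
        return x * self.gate(x)
```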
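
Likewise, GFAM is described only as integrating multi-scale geometric feature maps. The generic upsample-project-fuse pattern below, sketched under that assumption, shows the general idea; the real GFAM may differ in every detail.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleAggregation(nn.Module):
    """Illustrative stand-in for a GFAM-style module: project feature
    maps from several backbone stages to a common width, upsample them
    to the finest resolution, concatenate, and fuse with a 3x3 conv."""
    def __init__(self, in_channels, out_channels: int = 256):
        super().__init__()
        self.proj = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels
        )
        self.fuse = nn.Conv2d(
            out_channels * len(in_channels), out_channels,
            kernel_size=3, padding=1,
        )

    def forward(self, feats):
        # feats: list of (B, C_i, H_i, W_i) maps, ordered coarse to fine
        target = feats[-1].shape[-2:]          # finest spatial size
        ups = [
            F.interpolate(p(f), size=target, mode="bilinear",
                          align_corners=False)
            for p, f in zip(self.proj, feats)
        ]
        return self.fuse(torch.cat(ups, dim=1))
```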
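
The final PnP stage is standard: given dense 2D-3D correspondences and the camera intrinsics, a RANSAC-based solver recovers rotation and translation. A minimal sketch using OpenCV follows; the authors' exact solver and parameters are not stated in the abstract, so the reprojection threshold and iteration count below are placeholders.

```python
import numpy as np
import cv2

def pose_from_correspondences(pts_3d, pts_2d, K):
    """Recover a 6-DoF pose from dense 2D-3D correspondences with
    RANSAC-based PnP.

    pts_3d : (N, 3) model points in the object frame
    pts_2d : (N, 2) matched image pixels
    K      : (3, 3) camera intrinsic matrix
    """
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts_3d.astype(np.float32), pts_2d.astype(np.float32),
        K.astype(np.float32), distCoeffs=None,
        reprojectionError=3.0,   # placeholder pixel threshold
        iterationsCount=100,     # placeholder RANSAC budget
    )
    if not ok:
        raise RuntimeError("PnP failed to find a pose")
    R, _ = cv2.Rodrigues(rvec)   # axis-angle -> 3x3 rotation matrix
    return R, tvec.reshape(3)
```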
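
For reference, the reported numbers follow the usual ADD(-S) definition: the mean distance between model points under the predicted and ground-truth poses, taken to the closest point for symmetric objects, with a pose counted correct when the error falls below 10% of the object diameter (the 0.10d threshold). A NumPy/SciPy sketch:

```python
import numpy as np
from scipy.spatial import cKDTree

def add_error(pts, R_gt, t_gt, R_pr, t_pr):
    """ADD: mean distance between *corresponding* model points under
    the ground-truth and predicted poses."""
    p_gt = pts @ R_gt.T + t_gt
    p_pr = pts @ R_pr.T + t_pr
    return np.linalg.norm(p_gt - p_pr, axis=1).mean()

def adds_error(pts, R_gt, t_gt, R_pr, t_pr):
    """ADD-S: for symmetric objects, match each ground-truth point to
    its *closest* predicted point rather than its counterpart."""
    p_gt = pts @ R_gt.T + t_gt
    p_pr = pts @ R_pr.T + t_pr
    dists, _ = cKDTree(p_pr).query(p_gt, k=1)
    return dists.mean()

def is_correct(error, diameter, thresh=0.10):
    """A pose counts as correct at the 0.10d threshold when the error
    is below 10% of the object's diameter."""
    return error < thresh * diameter
```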


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b91/11722991/ee270c4d4fbf/sensors-25-00168-g001.jpg
