MonoAMP: Adaptive Multi-Order Perceptual Aggregation for Monocular 3D Vehicle Detection.

Authors

Hu Xiaoxi, Chen Tao, Zhang Wentao, Ji Guangyi, Jia Hongxia

Affiliation

School of Mathematics and Computer Science, Shaanxi University of Technology, Hanzhong 723001, China.

Publication

Sensors (Basel). 2025 Jan 28;25(3):787. doi: 10.3390/s25030787.

Abstract

Monocular 3D object detection is rapidly emerging as a key research direction in autonomous driving, owing to its resource efficiency and ease of implementation. However, existing methods face certain limitations in cross-dimensional feature attention mechanisms and multi-order contextual information modeling, which constrain their detection performance in complex scenes. Thus, we propose MonoAMP, an adaptive multi-order perceptual aggregation algorithm for monocular 3D object detection. We first introduce triplet attention to enhance the interaction of cross-dimensional feature attention. Second, we design an adaptive multi-order perceptual aggregation module. It dynamically captures multi-order contextual information and employs an adaptive aggregation strategy to enhance target perception. Finally, we propose an uncertainty-guided adaptive depth ensemble strategy, which models the uncertainty distribution in depth estimation and effectively fuses multiple depth predictions. Experiments demonstrate that MonoAMP significantly enhances performance on the KITTI dataset at the moderate difficulty level, achieving 16.80% AP3D and 24.47% APBEV. Additionally, the ablation study shows a 3.78% improvement in object detection accuracy over the baseline method. Compared to other advanced methods, MonoAMP demonstrates superior detection capabilities, especially in complex scenarios.
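
The abstract does not give the formula behind the uncertainty-guided depth ensemble, so the following is only a minimal sketch of one common way to fuse several depth predictions by their predicted uncertainty: inverse-variance weighting. The weighting scheme and the names `fuse_depths`, `depths`, and `sigmas` are assumptions made for illustration and are not taken from the paper.

```python
import math


def fuse_depths(depths, sigmas):
    """Fuse depth predictions using inverse-variance weights.

    Assumed scheme, not the paper's: each prediction d_i has a predicted
    standard deviation sigma_i, and more certain predictions get more weight.
    Returns the fused depth and the standard deviation of the fused estimate.
    """
    if not depths or len(depths) != len(sigmas):
        raise ValueError("depths and sigmas must be non-empty and of equal length")
    if any(s <= 0 for s in sigmas):
        raise ValueError("every sigma must be positive")

    # Weight each prediction by 1 / sigma^2.
    weights = [1.0 / (s * s) for s in sigmas]
    total = sum(weights)

    # Weighted mean of the depth predictions.
    fused = sum(w * d for w, d in zip(weights, depths)) / total

    # Standard deviation of the inverse-variance-weighted mean.
    fused_sigma = math.sqrt(1.0 / total)
    return fused, fused_sigma


if __name__ == "__main__":
    # Two hypothetical depth estimates (in meters) for the same object.
    depth, sigma = fuse_depths([10.0, 20.0], [1.0, 3.0])
    print(depth, sigma)
```

In this sketch a prediction with three times the standard deviation contributes one ninth of the weight, so the fused depth stays close to the more confident estimate.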

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a67a/11819936/2bd739588091/sensors-25-00787-g001.jpg
