Yao Xinpeng, Liu Peiyuan, Zhou Jingmei, Wang Zijian, Fan Songhua, Wang Yuchen
Shandong Key Laboratory of Smart Transportation (Preparation), Jinan, China.
School of Electronics and Control Engineering, Chang'an University, Xi'an, China.
PLoS One. 2025 Jun 27;20(6):e0325373. doi: 10.1371/journal.pone.0325373. eCollection 2025.
To address the low detection accuracy and unreliable recognition of small, irregular targets such as cyclists by existing 3D object detection algorithms, MAT-PointPillars (Multi-scale Attention and Transformer PointPillars) extends PointPillars with multi-scale vision Transformers and attention mechanisms. First, the algorithm employs pillar coding for semantic point cloud encoding and introduces an attention mechanism to refine the backbone's upsampling process. In addition, a Transformer Encoder is introduced to improve the upsampling structure of the third stage of the backbone. On the KITTI dataset, the algorithm achieves 3D average detection accuracy (AP3D) of 81.15%, 62.02%, and 58.68% across the three difficulty levels, improving on the baseline model by 2.44%, 1.19%, and 1.23%, respectively. A real-time 3D object detection system built on ROS runs at an average of 22.63 frames per second, which exceeds the sampling frequency of conventional LiDAR. While maintaining sufficient detection speed, the MAT-PointPillars algorithm improves the detection accuracy of cyclists in real-world scenarios.
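The abstract's core idea is refining upsampled pillar features with attention so that small targets such as cyclists borrow context from neighboring pillars. The sketch below is not the authors' implementation; it is a minimal NumPy illustration of single-head scaled dot-product self-attention (the operation at the heart of a Transformer encoder) applied to a set of flattened pillar feature vectors, with identity Q/K/V projections for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(feats):
    """Single-head scaled dot-product self-attention over pillar
    feature vectors of shape (n_pillars, d). Identity Q/K/V
    projections are used here for brevity; a real Transformer
    encoder learns these projections plus a feed-forward sublayer."""
    d = feats.shape[-1]
    scores = feats @ feats.T / np.sqrt(d)   # (n, n) pairwise similarity
    weights = softmax(scores, axis=-1)      # attention weights, rows sum to 1
    return weights @ feats                  # context-mixed (refined) features

# Toy example: a BEV feature map flattened to 4 pillar tokens, 8 channels.
rng = np.random.default_rng(0)
tokens = rng.standard_normal((4, 8))
refined = self_attention(tokens)
print(refined.shape)  # same shape as the input: each pillar now mixes context
```

In the paper's pipeline this kind of block would sit inside the backbone's upsampling path, so the output keeps the feature-map shape while each location aggregates information from the whole scene.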