Dual-stage feature specialization network for robust visual object detection in autonomous vehicles.

Authors

Liu Ze, Wu Junhua, Cai Yingfeng, Wang Hai, Chen Long, Liu Qingchao

Affiliations

Automotive Engineering Research Institute, Jiangsu University, Zhenjiang, 212013, Jiangsu, China.

School of Automotive and Traffic Engineering, Jiangsu University, Zhenjiang, 212013, Jiangsu, China.

Publication information

Sci Rep. 2025 May 3;15(1):15501. doi: 10.1038/s41598-025-99363-4.

DOI:10.1038/s41598-025-99363-4

PMID:40319138

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12049487/

Abstract

Efficient feature representation is critical for accurate visual perception in autonomous vehicles. Existing two-stage object detection methods often suffer from feature interference between the candidate region generation task and the classification and regression tasks, leading to suboptimal performance in complex scenes. To address this, we propose a Dual-Stage Feature Specialization Network (DSFSN) that decouples feature extraction: MobileNetV3 is employed for lightweight candidate region generation, while ResNet-FPN enhances multi-scale feature fusion for precise classification. Extensive experiments on the PASCAL VOC and MS COCO datasets demonstrate state-of-the-art performance, achieving 81.6% mAP on PASCAL VOC (9.3% higher than Faster R-CNN) and 29.3% AP on MS COCO, with a 14.9% improvement in small object detection. Real-world tests under diverse conditions (e.g., rain, night) validate the robustness of our method for autonomous driving applications. This work provides a novel framework for balancing accuracy and efficiency in visual perception systems.
