PDT-YOLO: A Roadside Object-Detection Algorithm for Multiscale and Occluded Targets.

Author information

Liu Ruoying, Huang Miaohua, Wang Liangzi, Bi Chengcheng, Tao Ye

Affiliations

Hubei Key Laboratory of Advanced Technology for Automotive Components, Wuhan University of Technology, Wuhan 430070, China.

Hubei Collaborative Innovation Center for Automotive Components Technology, Wuhan University of Technology, Wuhan 430070, China.

Publication information

Sensors (Basel). 2024 Apr 4;24(7):2302. doi: 10.3390/s24072302.

DOI:10.3390/s24072302

PMID:38610513

Original link:https://pmc.ncbi.nlm.nih.gov/articles/PMC11014219/

Abstract

Intelligent roadside perception systems face three challenges in detection tasks: weak sensing capacity for multi-scale objects, high missed-detection rates for occluded targets, and difficult model deployment. To address them, we propose the PDT-YOLO algorithm, based on YOLOv7-tiny. First, we introduce the intra-scale feature interaction module (AIFI) and reconstruct the feature pyramid structure to improve detection accuracy for multi-scale targets. Second, we use a lightweight convolution module (GSConv) to construct a multi-scale efficient layer aggregation network module (ETG), which strengthens the network's feature extraction ability while keeping the model lightweight. Third, we integrate multiple attention mechanisms to improve the feature representation of occluded targets in complex scenarios. Finally, Wise-IoU with a dynamic non-monotonic focusing mechanism improves the accuracy and generalization ability of the model. Compared with YOLOv7-tiny, PDT-YOLO improves mAP50 and mAP50:95 by 4.6% and 12.8% on the DAIR-V2X-C dataset, with a parameter count of 6.1 million, and by 15.7% and 11.1% on the IVODC dataset. We deployed PDT-YOLO in an actual traffic environment using the Robot Operating System (ROS), where it reached a detection frame rate of 90 FPS, which meets the needs of roadside object detection and edge deployment in complex traffic scenes.
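The abstract names Wise-IoU with a dynamic non-monotonic focusing mechanism but does not give its formula. The sketch below illustrates the general idea under stated assumptions: it follows the Wise-IoU v3 form as commonly described, with `alpha = 1.9` and `delta = 3.0` as assumed hyperparameters, and `mean_iou_loss` standing in for the running mean of the IoU loss that a training loop would maintain. The function names are hypothetical and this is not the authors' implementation.

```python
import math


def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def focusing_gain(beta, alpha=1.9, delta=3.0):
    """Non-monotonic gain r = beta / (delta * alpha ** (beta - delta)).

    beta is the "outlier degree" of a box (its IoU loss divided by the
    mean IoU loss). The gain rises for moderate beta and falls again for
    large beta, so very poor-quality boxes contribute smaller gradients.
    alpha and delta are assumed values, not taken from the abstract.
    """
    return beta / (delta * alpha ** (beta - delta))


def wise_iou_loss(pred, target, mean_iou_loss, alpha=1.9, delta=3.0):
    """Sketch of a Wise-IoU-style loss for one predicted/target box pair."""
    l_iou = 1.0 - iou(pred, target)

    # Width and height of the smallest box enclosing both boxes.
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])

    # Squared distance between the two box centres.
    dx = (pred[0] + pred[2]) / 2 - (target[0] + target[2]) / 2
    dy = (pred[1] + pred[3]) / 2 - (target[1] + target[3]) / 2

    # Distance-based attention term; in a real training loop the
    # denominator would be treated as a constant (detached from the graph).
    denom = wg * wg + hg * hg
    r_wiou = math.exp((dx * dx + dy * dy) / denom) if denom > 0 else 1.0

    # Outlier degree relative to the running mean of the IoU loss.
    beta = l_iou / max(mean_iou_loss, 1e-9)
    return focusing_gain(beta, alpha, delta) * r_wiou * l_iou
```

With these assumed values the gain equals 1 when `beta == delta` and peaks near `beta = 1 / ln(alpha)`, which is what makes the focusing non-monotonic: boxes of ordinary quality are emphasized, while both very easy and very poor boxes are down-weighted.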
