China FAW Corporation Limited, Global R&D Center, Changchun 130013, China.
School of Vehicle and Energy, Yanshan University, Qinhuangdao 066000, China.
Sensors (Basel). 2023 Mar 23;23(7):3385. doi: 10.3390/s23073385.
Vehicle view object detection is key to the environment perception module of autonomous vehicles and is therefore crucial for driving safety. To address complex scenes characterized by dim light, occlusion, and long distances, this paper proposes VV-YOLO, an improved vehicle view object detection model based on YOLOv4. VV-YOLO adopts an anchor-box-based design. For anchor clustering, an improved K-means++ algorithm is used to reduce the instability in clustering results caused by the random selection of initial cluster centers, so that the model obtains a reasonable set of initial anchor boxes. First, a coordinate attention mechanism is added to the neck to form the CA-PAN network, which models the channel relationships of image features in multiple dimensions and improves feature extraction in complex scenes. Second, to ensure sufficient training, the loss function of VV-YOLO is reconstructed around the focal loss, which alleviates the training imbalance caused by the skewed distribution of the training data. Finally, quantitative experiments were conducted on the KITTI dataset. The precision and average precision of VV-YOLO reached 90.68% and 80.01%, respectively, 6.88% and 3.44% higher than those of YOLOv4, while the model's computation time on the same hardware platform did not increase significantly. Beyond KITTI, visual comparison tests on the BDD100K dataset and on typical complex traffic scene data collected in the field further verified the validity and robustness of the VV-YOLO model.
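To make the anchor-clustering step concrete, the following is a minimal sketch of K-means++ seeding over ground-truth box sizes with the 1 − IoU distance commonly used for YOLO anchor generation. The abstract does not detail the paper's specific improvement to K-means++, so this shows only the standard seeding-plus-Lloyd scheme it builds on; the function names and the median update are illustrative assumptions, not the authors' code.

```python
import numpy as np

def iou_wh(boxes, anchors):
    """Pairwise IoU between (N, 2) boxes and (K, 2) anchors, treating
    each (w, h) pair as a box sharing the same top-left corner."""
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0])
             * np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    union = ((boxes[:, 0] * boxes[:, 1])[:, None]
             + (anchors[:, 0] * anchors[:, 1])[None, :] - inter)
    return inter / union

def kmeanspp_anchors(boxes, k=9, iters=50, seed=0):
    """Cluster (w, h) pairs: K-means++ seeding with a 1 - IoU distance,
    then standard Lloyd updates. Returns k anchors sorted by area."""
    rng = np.random.default_rng(seed)
    centers = boxes[rng.integers(len(boxes))][None, :]   # first center: uniform
    for _ in range(k - 1):
        d = 1.0 - iou_wh(boxes, centers).max(axis=1)     # distance to nearest center
        probs = d ** 2 / (d ** 2).sum()                  # K-means++ d^2 weighting
        centers = np.vstack([centers, boxes[rng.choice(len(boxes), p=probs)]])
    for _ in range(iters):
        assign = iou_wh(boxes, centers).argmax(axis=1)   # assign by highest IoU
        centers = np.array([np.median(boxes[assign == j], axis=0)
                            if np.any(assign == j) else centers[j]
                            for j in range(k)])
    return centers[np.argsort(centers.prod(axis=1))]
```

Seeding new centers proportionally to the squared distance is what removes the dependence on a purely random initialization that the abstract cites as the source of unstable anchor clustering.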
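The coordinate attention mechanism added to the neck follows the design of Hou et al. (CVPR 2021): spatial attention is factorized into two 1-D encodings along height and width, so the learned channel weights retain positional information. Below is a minimal PyTorch sketch of such a module as it might be inserted into the PAN neck to form CA-PAN; the reduction ratio and activation are common defaults, not values confirmed by the abstract.

```python
import torch
import torch.nn as nn

class CoordAttention(nn.Module):
    """Coordinate attention: pool along H and W separately, encode jointly,
    then re-weight the input with two direction-aware attention maps."""
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn = nn.BatchNorm2d(mid)
        self.act = nn.Hardswish()
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        n, c, h, w = x.shape
        x_h = x.mean(dim=3, keepdim=True)                      # (n, c, h, 1): pool over width
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)  # (n, c, w, 1): pool over height
        y = self.act(self.bn(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                  # attention along height
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # attention along width
        return x * a_h * a_w                                   # broadcast re-weighting
```

Because each attention map keeps one spatial axis at full resolution, the module can localize occluded or distant objects better than purely channel-wise attention such as SE blocks, which matches the complex-scene motivation given in the abstract.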
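The loss reconstruction is based on the focal loss (Lin et al., 2017), which down-weights easy, abundant examples so that hard, rare ones dominate the gradient. A minimal sketch of the focal variant of binary cross-entropy follows; the alpha and gamma values are the common defaults, not the paper's tuned settings, and exactly which YOLOv4 loss terms it replaces is not specified in the abstract.

```python
import torch
import torch.nn.functional as F

def focal_bce(logits, targets, alpha=0.25, gamma=2.0):
    """Focal binary cross-entropy: scales BCE by (1 - p_t)^gamma so
    well-classified examples contribute little; alpha balances classes."""
    bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p = torch.sigmoid(logits)
    p_t = targets * p + (1 - targets) * (1 - p)            # prob. of the true class
    alpha_t = targets * alpha + (1 - targets) * (1 - alpha)
    return (alpha_t * (1 - p_t) ** gamma * bce).mean()
```

With gamma = 0 and alpha = 0.5 this reduces to (scaled) ordinary BCE, which makes explicit how the focusing term addresses the unbalanced distribution of training data mentioned in the abstract.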