Yan Dongqi, Zhang Tao
Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China.
Sensors (Basel). 2025 May 2;25(9):2876. doi: 10.3390/s25092876.
Although deep learning has achieved remarkable performance in lane detection, the task remains challenging in complex scenarios such as damaged lane markings, occlusions, and insufficient lighting. Moreover, a significant drawback of most existing lane-detection algorithms is their reliance on complex post-processing and strong priors. Inspired by the DETR architecture, we propose MHFS-FORMER, an end-to-end Transformer-based model, to address these issues. To counter the interference that complex scenes cause in lane detection, we design MHFNet, which fuses multi-scale features through the Transformer encoder to obtain enhanced multi-scale features; these are then fed into the Transformer decoder. A novel multi-reference deformable attention module disperses attention around the targets during training, strengthening the model's representational ability and better capturing both the elongated structure of lanes and the global context. We also design ShuffleLaneNet, which exploits the channel and spatial information of multi-scale lane features, significantly improving recognition accuracy. Our method achieves an accuracy of 96.88% and a real-time speed of 87 fps on the TuSimple dataset, and an F1 score of 77.38% on the CULane dataset, demonstrating excellent performance compared with both CNN-based and Transformer-based methods.
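The abstract does not detail ShuffleLaneNet's internals. As context for the channel/spatial information exchange it alludes to, the sketch below shows the generic channel-shuffle operation (popularized by ShuffleNet) that shuffle-style modules commonly use to mix information across channel groups; the function name and the list-of-channels representation are illustrative assumptions, not the paper's implementation.

```python
def channel_shuffle(channels, groups):
    """Interleave channel groups so later group-wise ops see mixed channels.

    channels: a list of per-channel feature maps (each element may be any
              object, e.g. a 2-D array); its length must divide evenly by
              `groups`. Returns a new list in shuffled order.
    """
    c = len(channels)
    assert c % groups == 0, "channel count must be divisible by groups"
    per_group = c // groups
    # Conceptually: reshape (C,) -> (groups, per_group), transpose to
    # (per_group, groups), then flatten back to (C,).
    grouped = [channels[g * per_group:(g + 1) * per_group] for g in range(groups)]
    return [grouped[g][i] for i in range(per_group) for g in range(groups)]


# Example: 6 channels in 2 groups -> channels from the two groups interleave.
print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))  # [0, 3, 1, 4, 2, 5]
```

In tensor-library code this is typically a reshape/transpose/reshape on the channel axis; the list form above only makes the index permutation explicit.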