Liu Ruijun, Zhang Yijun, Chen Jieying, Wu Zhigang, Zhu Yaohui, Liu Jun, Chen Min
Jiangxi Dongrui Intelligent Equipment Technology Co., LTD., Nanchang, 330028, China.
Gansu Power Transmission Engineering Co., Ltd., Lanzhou, 730050, China.
Sci Rep. 2025 Apr 16;15(1):13193. doi: 10.1038/s41598-025-95470-4.
Real-time semantic segmentation is one of the most actively researched areas in computer vision, and dual-branch networks have gradually become a popular direction in network architecture research. This paper proposes a dual-branch image segmentation network for autonomous driving that integrates spatial and channel attention mechanisms, named "BiAttentionNet". The network aims to balance accuracy and real-time performance by processing high-level semantic information and low-level detail information separately. BiAttentionNet consists of three main parts: the detail branch, the semantic branch, and the proposed attention-guided fusion layer. The detail branch uses the designed PCSD convolution module to process wide-channel, low-level feature information, extracting local and surrounding-context features. The semantic branch uses an improved lightweight U-Net to extract semantic information from deep, narrow channels. Finally, the proposed attention-guided fusion layer fuses the features of the two branches using detail attention and channel attention mechanisms to perform image segmentation in road scenes. Comparative experiments on the Cityscapes dataset against recent mainstream networks, including BiSeNet V2, Fast-SCNN, ConvNeXt, SegNeXt, SegFormer, and CGNet, show that BiAttentionNet achieves the highest accuracy among the compared networks, with an mIoU of 65.89% for the backbone network. This validates the effectiveness of the proposed BiAttentionNet.
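For readers unfamiliar with the mIoU metric reported above, the following is a minimal sketch of how mean intersection-over-union is commonly computed from predicted and ground-truth label maps. It is not the authors' evaluation code: the function name `mean_iou`, the flat-list input format, and the `ignore_index=255` default (the value conventionally used for unlabeled Cityscapes pixels) are assumptions made for illustration, and benchmark scripts typically accumulate the counts over a whole dataset rather than a single image.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over flat sequences of integer class labels.

    Illustrative sketch only; names and defaults are assumptions.
    Predictions are assumed to lie in the range [0, num_classes).
    """
    intersection = [0] * num_classes  # pixels where pred == target == c
    union = [0] * num_classes         # pixels where pred == c or target == c

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled ground-truth pixels are skipped
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            union[p] += 1  # false positive for class p
            union[t] += 1  # false negative for class t

    # Classes that never appear in pred or target are left out of the mean.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and one ignored pixel.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
print(mean_iou(pred, target, num_classes=3))
```

In the toy example the per-class IoU values are 1/2, 2/3, and 1, so the mean is 13/18, or about 0.722. A reported mIoU of 65.89% corresponds to the same average taken over the dataset's evaluation classes.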