Zhao Huaqi, Wang Su, Peng Xiang, Pan Jeng-Shyang, Wang Rui, Liu Xiaomin
The Heilongjiang Provincial Key Laboratory of Autonomous Intelligence and Information Processing, School of Information and Electronic Technology, Jiamusi University, Jiamusi, Heilongjiang, China.
College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, Shandong, China.
PeerJ Comput Sci. 2024 Sep 25;10:e2250. doi: 10.7717/peerj-cs.2250. eCollection 2024.
Although semantic segmentation is widely used in autonomous driving, its performance in segmenting road surfaces falls short in complex traffic environments. Motivated by the sensitivity of semantic segmentation to frequency information, this study proposes FSSFormer, a frequency-based semantic segmentation method built on a transformer. Specifically, we propose a weight-sharing factorized attention that selects important frequency features and thereby improves the segmentation of overlapping targets. To address the loss of boundary information, we use a cross-attention method that combines spatial and frequency features to obtain more detailed pixel information. To improve segmentation accuracy in complex road scenarios, we adopt a parallel-gated feedforward network segmentation method that encodes position information. Extensive experiments show that FSSFormer improves the mean intersection over union (mIoU) by 2% over existing segmentation methods on the Cityscapes dataset.
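For readers unfamiliar with the reported metric, mIoU is the per-class intersection over union, TP / (TP + FP + FN), averaged over classes. The following is a minimal sketch of that standard definition, not the authors' evaluation code. The function names, the flattened label lists, and the use of 255 as an ignored label are assumptions made for illustration.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Count pixels by (true class, predicted class).

    `pred` and `target` are flattened label maps of equal length.
    `ignore_index=255` is an assumed convention for unlabeled pixels.
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip pixels excluded from evaluation
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average IoU = TP / (TP + FP + FN) over classes that occur at all."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true class otherwise
        denom = tp + fp + fn
        if denom == 0:
            continue  # class absent from both prediction and ground truth
        ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Toy example with three classes; the last pixel is ignored.
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 255]
    print(mean_iou(confusion_matrix(pred, target, num_classes=3)))
```

In practice the confusion matrix is accumulated over every image in the validation set before the per-class IoU values are averaged, rather than averaging per-image scores.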