Yang Zhuolin, Cao Zhen, Cao Jianfang, Chen Zhiqiang, Peng Cunhe
Department of Computer Science and Technology, Xinzhou Normal University, Xinzhou, China.
School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan, China.
PLoS One. 2024 Dec 19;19(12):e0315621. doi: 10.1371/journal.pone.0315621. eCollection 2024.
In semantic image segmentation, most methods do not fully exploit features at different scales and levels and instead upsample directly. As a result, useful information may be mistaken for redundant information and discarded, which leads to confusion between object categories. In addition, as the convolutional layers deepen, the loss of spatial detail makes segmentation at object boundaries insufficiently accurate. To address these problems, we propose an edge optimization and category-aware multibranch semantic segmentation network (ECMNet). First, an attention-guided multibranch fusion backbone connects features of different resolutions in parallel and performs multiscale information interaction to reduce the loss of spatial detail. Second, a category perception module learns category feature representations and guides pixel classification through an attention mechanism to improve segmentation accuracy. Finally, an edge optimization module uses an adaptive algorithm to integrate edge features into the middle and deep supervision layers of the network, strengthening its representation of edge features and improving segmentation at object edges. Experimental results show that the method reaches an MIoU of 79.2% on the Cityscapes dataset and 79.6% on the CamVid dataset with significantly fewer parameters than other models, and that it effectively improves semantic segmentation performance and addresses segmentation confusion for some categories, which gives it practical application prospects.
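The abstract reports results as MIoU (mean intersection over union). As an illustration of how that metric is commonly computed, the following is a minimal sketch in plain Python. It is not the authors' evaluation code: the function name `mean_iou`, the flat label lists, and the `ignore_index=255` default (a common convention for unlabeled pixels) are assumptions made for this example, and the paper's exact evaluation protocol may differ.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: int = 255,  # assumed convention for unlabeled pixels
) -> float:
    """Average per-class IoU over classes that appear in pred or target."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where pred == target == c
    pred_count = [0] * num_classes    # pixels predicted as class c
    target_count = [0] * num_classes  # pixels labeled as class c

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not contribute to any class
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both pred and target
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example with six pixels and three classes.
# Per-class IoU: class 0 = 1/3, class 1 = 2/3, class 2 = 1/2.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In this sketch, classes that occur in neither the prediction nor the ground truth are left out of the average, so they do not count as a score of zero. Benchmark evaluations usually accumulate the per-class intersections and unions over the whole dataset before dividing, rather than averaging per-image scores.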