Lei Xiaochun, Cai Xiang, Lu Linjun, Cui Zihang, Jiang Zetao
School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin, 541010, Guangxi, China.
Guangxi Key Laboratory of Image and Graphic Intelligent Processing, Guilin University of Electronic Technology, Guilin, 541004, Guangxi, China.
Sci Rep. 2023 Aug 15;13(1):13263. doi: 10.1038/s41598-023-40175-9.
Salient object detection is vital for non-specific-class subject segmentation in computer vision applications. However, accurately segmenting foreground subjects against complex backgrounds and along intricate boundaries remains a challenge for existing methods. To address these limitations, our study proposes SUGE-Net, which introduces several novel improvements. We replace the traditional CNN-based backbone with the transformer-based Swin-TransformerV2, known for its effectiveness in capturing long-range dependencies and rich contextual information. To counter both under- and over-attention, we introduce Gated Channel Transformation (GCT). Furthermore, we adopt an edge-based loss (Edge Loss) during network training to capture spatial structural details. Additionally, we propose a Training-only Augmentation Loss (TTA Loss) that uses augmented data to enhance spatial stability. Our method is evaluated on six common datasets, achieving an [Formula: see text] score of 0.883 on DUTS-TE. Compared with other models, SUGE-Net demonstrates excellent performance across various segmentation scenarios.
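The abstract names GCT without spelling it out; for orientation, the following is a minimal PyTorch sketch of Gated Channel Transformation as formulated by Yang et al. (CVPR 2020), the module SUGE-Net builds on. The l2-norm embedding, channel normalization, and residual tanh gate below follow that paper's formulation, not necessarily SUGE-Net's exact implementation.

```python
import torch
import torch.nn as nn

class GCT(nn.Module):
    """Gated Channel Transformation (Yang et al., CVPR 2020) sketch."""

    def __init__(self, channels, eps=1e-5):
        super().__init__()
        # Per-channel embedding weight, gating weight, and gating bias.
        self.alpha = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.gamma = nn.Parameter(torch.zeros(1, channels, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, channels, 1, 1))
        self.eps = eps

    def forward(self, x):
        # Global context embedding: per-channel l2 norm over spatial dims.
        embedding = self.alpha * (x.pow(2).sum(dim=(2, 3), keepdim=True) + self.eps).sqrt()
        # Channel normalization: scale each embedding by the cross-channel l2 mean.
        norm = self.gamma / (embedding.pow(2).mean(dim=1, keepdim=True) + self.eps).sqrt()
        # Residual tanh gate: identity mapping when gamma and beta are zero.
        return x * (1.0 + torch.tanh(embedding * norm + self.beta))
```

Because the gate is initialized to the identity (gamma and beta start at zero), the module can be dropped into an existing backbone without disturbing pretrained features, which is what makes it attractive for re-weighting attention channel-wise.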
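The paper's Edge Loss is only named here, not defined. As a hedged illustration of the general idea of edge-aware supervision, the sketch below weights a pixel-wise BCE by a boundary band extracted from the ground-truth mask with a max-pool-based morphological gradient; this is a common recipe for such losses, not necessarily the Edge Loss used in SUGE-Net.

```python
import torch
import torch.nn.functional as F

def edge_loss(pred, target, kernel_size=3):
    """Hypothetical edge-aware loss: BCE restricted to boundary pixels.

    pred   -- saliency probabilities in (0, 1), shape (N, 1, H, W)
    target -- binary ground-truth mask as float, same shape
    """
    pad = kernel_size // 2
    # Morphological gradient: dilation minus erosion marks the boundary band.
    dilated = F.max_pool2d(target, kernel_size, stride=1, padding=pad)
    eroded = -F.max_pool2d(-target, kernel_size, stride=1, padding=pad)
    edge = (dilated - eroded).clamp(0, 1)
    # Pixel-wise BCE, kept only where the boundary band is active.
    bce = F.binary_cross_entropy(pred, target, reduction='none')
    return (bce * edge).sum() / edge.sum().clamp(min=1.0)
```

In training, a term like this would typically be added to a standard region loss (e.g., BCE or IoU over the whole mask) so that the network is penalized extra for blurry or misplaced object contours.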