Jiang Fan, Zhong Yanmei, Yang Simin
Rehabilitation Medicine Department, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China.
School of Foreign Languages and Cultures, Chongqing University, Chongqing, China.
Digit Health. 2024 Oct 14;10:20552076241289007. doi: 10.1177/20552076241289007. eCollection 2024 Jan-Dec.
Current tongue segmentation methods often struggle with extracting global features and performing selective filtering, particularly in complex environments where background objects resemble the tongue. These challenges significantly reduce segmentation efficiency. To address these issues, this article proposes a novel model for tongue segmentation in complex environments, combining Mamba and U-Net. By leveraging Mamba's global feature selection capabilities, this model assists U-Net in accurately excluding tongue-like objects from the background, thereby enhancing segmentation accuracy and efficiency.
To improve the segmentation accuracy of the U-Net backbone model, we incorporated the Mamba attention module along with a multi-stage feature fusion module. The Mamba attention module serially connects spatial and channel attention mechanisms at the U-Net's skip connections, selectively filtering the feature maps passed into the deep network. Additionally, the multi-stage feature fusion module integrates feature maps from different stages, further improving segmentation performance.
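The serial arrangement described above can be illustrated with a minimal sketch. The abstract gives neither the order of the two attention steps nor their internals, so the sketch assumes spatial gating first, then channel gating, and substitutes simple parameter-free sigmoid gates for the Mamba blocks. The names `spatial_gate`, `channel_gate`, and `filter_skip` are hypothetical and not taken from the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def spatial_gate(feat):
    """Scale each pixel by a gate computed from its mean across channels.

    feat is a nested list shaped [channels][height][width].
    Assumption: a plain sigmoid gate stands in for the Mamba spatial block.
    """
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    gate = [
        [sigmoid(sum(feat[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]
    return [
        [[feat[k][i][j] * gate[i][j] for j in range(w)] for i in range(h)]
        for k in range(c)
    ]


def channel_gate(feat):
    """Scale each channel by a gate computed from its global average.

    Assumption: a plain sigmoid gate stands in for the Mamba channel block.
    """
    out = []
    for channel in feat:
        mean = sum(map(sum, channel)) / (len(channel) * len(channel[0]))
        g = sigmoid(mean)
        out.append([[v * g for v in row] for row in channel])
    return out


def filter_skip(feat):
    """Serial connection: spatial gating first, then channel gating (assumed order)."""
    return channel_gate(spatial_gate(feat))


# Toy skip-connection feature map: 2 channels, 1 x 2 pixels.
skip = [[[1.0, -1.0]], [[3.0, -3.0]]]
filtered = filter_skip(skip)
```

In a U-Net, the filtered map would replace the raw encoder output that is concatenated with the decoder features at the same resolution. Because both gates lie strictly between 0 and 1, this stand-in can only attenuate features; a learned attention block would be more expressive.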
Compared with state-of-the-art semantic segmentation and tongue segmentation models, our model improved the mean intersection over union by 1.17%. Ablation experiments further demonstrated that each module proposed in this study contributes to enhancing the model's segmentation efficiency.
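For readers unfamiliar with the metric, mean intersection over union averages, over the classes, the ratio of overlapping pixels to the union of predicted and ground-truth pixels. The following is a minimal sketch on toy binary masks (0 for background, 1 for tongue); the function name `mean_iou` and the example masks are illustrative assumptions, not the authors' evaluation code.

```python
def mean_iou(pred, target, num_classes=2):
    """Average the per-class IoU over classes that occur in either mask.

    pred and target are equally sized 2D lists of integer class labels.
    """
    ious = []
    for cls in range(num_classes):
        inter = union = 0
        for p_row, t_row in zip(pred, target):
            for p, t in zip(p_row, t_row):
                p_hit, t_hit = p == cls, t == cls
                inter += p_hit and t_hit
                union += p_hit or t_hit
        if union:  # skip classes absent from both masks
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy masks: the prediction misses one tongue pixel.
pred = [[0, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]
target = [[0, 1, 1],
          [0, 1, 1],
          [0, 0, 0]]

# Tongue IoU = 3/4, background IoU = 5/6.
print(round(mean_iou(pred, target), 4))  # 0.7917
```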
This study constructs a tongue segmentation model based on U-Net and Mamba (TUMamba). The model effectively extracts global spatial and channel features using the Mamba attention module, captures local detail features through U-Net, and enhances image features via multi-stage feature fusion. The results demonstrate that the model performs exceptionally well in tongue segmentation tasks, demonstrating its value in handling complex environments.