Wang Xuan, Cao Yifang, Chen Yijia, Li Huixia, Han Aiqing, Tang Yan
School of Management, Beijing University of Chinese Medicine, Beijing, China.
College of Humanities, Beijing University of Chinese Medicine, Beijing, China.
Sci Rep. 2025 Jul 29;15(1):27622. doi: 10.1038/s41598-025-05410-5.
Tongue diagnosis is a core component of the Four Diagnostic Methods in Traditional Chinese Medicine (TCM): inspection, listening and smelling, inquiry, and palpation. Accurate tongue image segmentation is an important step toward automating tongue diagnosis. This paper introduces Parallel Attention and Progressive Upsampling for Tongue Segmentation (PAPU_TonSeg), an improved model based on the Segformer architecture that addresses inaccurate and blurred segmentation of tongue edges in tongue semantic segmentation. The model incorporates three key enhancements: (1) a Self-Attention Parallel Network that combines the self-attention mechanism with residual modules to extract local and global features simultaneously; (2) the integration of the Efficient Channel Attention (ECA) mechanism into the Mix-FFN component to make feature extraction more efficient; and (3) Multi-dimensional Feature Progressive Upsampling to mitigate precision loss during upsampling. On the BioHit public dataset, PAPU_TonSeg improves on the original Segformer by 2.42% in Mean Pixel Accuracy (MPA), 0.78% in Mean Intersection over Union (MIoU), and 2.02% in the Dice coefficient, while having a lower parameter count and lower computational complexity. On a second dataset, PAPU_TonSeg outperforms Segformer by 0.64% in MPA, 0.33% in MIoU, and 0.4% in the Dice coefficient. Compared with classical models, the improved model has fewer parameters and notably lower computational complexity. PAPU_TonSeg accurately segments tongue body details such as tooth marks, and it distributes attention more evenly, capturing both global and local features. These findings position PAPU_TonSeg as a valuable tool for clinical diagnosis and research in TCM tongue diagnosis.
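For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how MPA, MIoU, and the Dice coefficient are commonly computed from a confusion matrix for a two-class (background/tongue) mask. It is not the authors' evaluation code: the function names `confusion_matrix` and `segmentation_metrics`, the flattened-list input format, and the choice to average Dice over both classes are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int = 2) -> List[List[int]]:
    """Count pixels; rows index the ground-truth class, columns the predicted class."""
    if len(pred) != len(target):
        raise ValueError("pred and target must contain the same number of pixels")
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        cm[t][p] += 1
    return cm


def segmentation_metrics(pred: Sequence[int], target: Sequence[int],
                         num_classes: int = 2) -> Dict[str, float]:
    """Return class-averaged pixel accuracy (MPA), IoU (MIoU), and Dice."""
    cm = confusion_matrix(pred, target, num_classes)
    accs, ious, dices = [], [], []
    for c in range(num_classes):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                                  # class c pixels missed
        fp = sum(cm[r][c] for r in range(num_classes)) - tp   # pixels wrongly labeled c
        if tp + fn + fp == 0:
            continue  # class absent from both masks: skip it (an assumption)
        if tp + fn > 0:
            accs.append(tp / (tp + fn))        # per-class pixel accuracy
        ious.append(tp / (tp + fp + fn))       # intersection over union
        dices.append(2 * tp / (2 * tp + fp + fn))
    return {
        "MPA": sum(accs) / len(accs),
        "MIoU": sum(ious) / len(ious),
        "Dice": sum(dices) / len(dices),
    }


# Toy example: six pixels, 0 = background, 1 = tongue.
pred = [0, 0, 1, 1, 1, 0]
target = [0, 1, 1, 1, 0, 0]
print(segmentation_metrics(pred, target))
```

Because Dice equals 2·IoU / (1 + IoU) for each class, Dice is never lower than IoU on the same prediction, so the two metrics can change by different amounts for the same model change.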