Sha Xiaopeng, Guan Zheng, Wang Ying, Han Jinglu, Wang Yi, Chen Zhaojun
Hebei Key Laboratory of Micro-Nano Precision Optical Sensing and Measurement Technology, Qinhuangdao, China.
School of Control Engineering, Northeastern University at Qinhuangdao, Qinhuangdao, China.
Digit Health. 2025 May 21;11:20552076251343696. doi: 10.1177/20552076251343696. eCollection 2025 Jan-Dec.
Traditional Chinese medicine (TCM) tongue diagnosis, through comprehensive observation of the tongue's diverse characteristics, allows an understanding of the state of the body's viscera as well as Qi and blood levels. Automatic tongue image recognition methods could support TCM practitioners by providing auxiliary diagnostic suggestions. However, most learning-based methods address only a narrow scope of the tongue's attributes and fail to fully exploit the information contained in tongue images.
To classify multifaceted tongue characteristics and fully utilize the latent correlation between the tongue segmentation and classification tasks, we propose SSC-Net, a multi-task joint learning network for simultaneous tongue body segmentation and multi-label classification.
Firstly, the shared feature encoder extracts features for both segmentation and classification tasks, where the segmentation result is utilized to mask redundant features that may impede classification accuracy. Subsequently, the ROI extraction module locates and extracts the tongue body region, and the feature fusion module combines tongue body features from bottom to top. Finally, a fine-grained classification module is employed for multi-label classification on multiple tongue characteristics.
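The masking and ROI extraction steps described above can be illustrated with a minimal sketch. It is not the SSC-Net implementation: it assumes a single-channel feature map and a binary segmentation mask stored as nested lists, and the helper names `mask_features`, `roi_bounding_box`, and `crop` are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[float]]
Mask = List[List[int]]


def mask_features(features: Grid, mask: Mask) -> Grid:
    """Zero out feature values that fall outside the predicted tongue region."""
    return [
        [f * m for f, m in zip(f_row, m_row)]
        for f_row, m_row in zip(features, mask)
    ]


def roi_bounding_box(mask: Mask) -> Tuple[int, int, int, int]:
    """Return (top, left, bottom, right) of the tightest box around the mask.

    bottom and right are exclusive, so they can be used directly in slices.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        raise ValueError("mask contains no foreground pixels")
    return min(rows), min(cols), max(rows) + 1, max(cols) + 1


def crop(grid: Grid, box: Tuple[int, int, int, int]) -> Grid:
    """Cut the region described by box out of a 2D grid."""
    top, left, bottom, right = box
    return [row[left:right] for row in grid[top:bottom]]


# Toy 3x4 feature map and segmentation mask (assumed values, for illustration).
features = [
    [0.5, 0.2, 0.9, 0.1],
    [0.4, 0.8, 0.7, 0.3],
    [0.6, 0.1, 0.5, 0.2],
]
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]

masked = mask_features(features, mask)  # background features suppressed
box = roi_bounding_box(mask)            # locate the tongue body region
roi = crop(masked, box)                 # extract the region for classification
```

In a real network the same idea would be applied per channel on tensors, and the cropped region would typically be resized to a fixed shape before being passed to the classifier.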
To evaluate SSC-Net, we collected a tongue image dataset, BUCM, and conducted extensive experiments on it. When performing segmentation and classification simultaneously, the proposed method achieved a Dice similarity coefficient (DSC) of 0.9943 on the segmentation task, and a mean average precision (mAP) of 92.02 and an overall F1-score of 0.851 on the classification task.
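For readers unfamiliar with the reported metrics, the sketch below shows how a DSC and an overall F1-score can be computed on toy data. It assumes the overall F1-score is micro-averaged across all labels and samples, which the abstract does not state; the function names and the example values are illustrative only and are unrelated to the BUCM results.

```python
from typing import List


def dice_coefficient(pred: List[int], target: List[int]) -> float:
    """DSC = 2|A ∩ B| / (|A| + |B|) for flattened binary masks."""
    intersection = sum(p & t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def overall_f1(preds: List[List[int]], targets: List[List[int]]) -> float:
    """Micro-averaged F1 over all (sample, label) pairs (an assumption)."""
    tp = fp = fn = 0
    for p_row, t_row in zip(preds, targets):
        for p, t in zip(p_row, t_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2.0 * tp / denom


# Toy segmentation masks, flattened to 1D.
pred_mask = [1, 1, 1, 0]
true_mask = [1, 1, 0, 0]
dsc = dice_coefficient(pred_mask, true_mask)

# Toy multi-label predictions: 2 samples, 3 tongue characteristics each.
pred_labels = [[1, 0, 1], [0, 1, 0]]
true_labels = [[1, 0, 0], [0, 1, 1]]
f1 = overall_f1(pred_labels, true_labels)
```

With these toy inputs the masks overlap on two pixels out of five marked in total, and the label predictions contain two true positives, one false positive, and one false negative.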
The proposed method can effectively classify multiple tongue characteristics with the support of the multi-task learning strategy and the integration of a fine-grained classification module. Code is available here.