TongueNet: a multi-modal fusion and multi-label classification model for traditional Chinese Medicine tongue diagnosis.

Authors

Yang Lijuan, Dong Qiumei, Lin Da, Lü Xinliang

Affiliations

Department of Rheumatology, Inner Mongolia Autonomous Region Hospital of Traditional Chinese Medicine, Hohhot, China.

College of Traditional Chinese Medicine, Inner Mongolia Medical University, Hohhot, China.

Publication information

Front Physiol. 2025 Apr 25;16:1527751. doi: 10.3389/fphys.2025.1527751. eCollection 2025.

DOI:10.3389/fphys.2025.1527751
PMID:40352152
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12061702/
Abstract

Tongue diagnosis in Traditional Chinese Medicine (TCM) plays a crucial role in clinical practice. By observing the shape, color, and coating of the tongue, practitioners can assist in determining the nature and location of a disease. However, the field of tongue diagnosis currently faces challenges such as data scarcity and a lack of efficient multimodal diagnostic models, making it difficult to fully align with TCM theories and clinical needs. Additionally, existing methods generally lack multi-label classification capabilities, making it challenging to simultaneously meet the multidimensional requirements of TCM diagnosis for disease nature and location. To address these issues, this paper proposes TongueNet, a multimodal deep learning model that integrates tongue image data with text-based features. The model utilizes a Hierarchical Aggregation Network (HAN) and a Feature Space Projection Module to efficiently extract and fuse features while introducing consistency and complementarity constraints to optimize multimodal information fusion. Furthermore, the model incorporates a multi-scale attention mechanism (EMA) to enhance the diversity and accuracy of feature weighting and employs a Kolmogorov-Arnold Network (KAN) instead of traditional MLPs for output optimization, thereby improving the representation of complex features. For model training, this study integrates three publicly available tongue image datasets from the Roboflow platform and enlists multiple experts for multimodal annotation, incorporating multi-label information on disease nature and location to align with TCM clinical needs. Experimental results demonstrate that TongueNet outperforms existing models in both disease nature and disease location classification tasks. Specifically, in the disease nature classification task, it achieves 89.12% accuracy and an AUC of 83%; in the disease location classification task, it achieves 86.47% accuracy and an AUC of 81%. 
Moreover, TongueNet contains only 32.1 M parameters, significantly reducing computational resource requirements while maintaining high diagnostic performance. TongueNet provides a new approach for the intelligent development of TCM tongue diagnosis.
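
The abstract describes fusing image and text features under a consistency constraint and predicting several labels at once. The following is a minimal sketch of that general idea, not the authors' implementation: it uses plain linear projections in place of HAN, EMA, and KAN, cosine distance as a stand-in consistency term, and omits the complementarity constraint entirely. All function names, weights, the `lam` weighting, and the three-label setup are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def project(features, weights):
    """Linear projection; `weights` holds one row per output dimension."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def multilabel_bce(logits, targets):
    """Mean binary cross-entropy, one independent sigmoid per label."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(logits, targets):
        p = sigmoid(z)
        total -= t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
    return total / len(logits)


def predict_labels(logits, threshold=0.5):
    """Multi-label decision: each label is switched on independently."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


def total_loss(image_feat, text_feat, w_img, w_txt, w_head, targets, lam=0.1):
    """Classification loss plus an assumed cosine-based consistency penalty."""
    z_img = project(image_feat, w_img)   # image features in the shared space
    z_txt = project(text_feat, w_txt)    # text features in the shared space
    fused = z_img + z_txt                # fusion by concatenation (assumption)
    logits = project(fused, w_head)      # one logit per label
    consistency = 1.0 - cosine_similarity(z_img, z_txt)
    return multilabel_bce(logits, targets) + lam * consistency, logits


if __name__ == "__main__":
    # Toy inputs: a 3-dim image feature, a 2-dim text feature, 3 labels.
    image_feat = [1.0, 0.0, 2.0]
    text_feat = [0.5, 1.0]
    w_img = [[0.2, 0.1, 0.3], [0.0, 0.4, -0.1]]
    w_txt = [[0.5, 0.2], [-0.3, 0.1]]
    w_head = [[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, -0.5], [-1.0, 0.5, 0.2, 0.3]]
    targets = [1, 0, 1]

    loss, logits = total_loss(image_feat, text_feat, w_img, w_txt, w_head, targets)
    print("loss:", loss)
    print("predicted labels:", predict_labels(logits))
```

Using one sigmoid per label, rather than a softmax over all labels, is what lets a single tongue image receive several disease-nature or disease-location labels at the same time.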

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa3a/12061702/50b50ab0d9df/fphys-16-1527751-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa3a/12061702/98fe264742c4/fphys-16-1527751-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa3a/12061702/dc81dfb34974/fphys-16-1527751-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa3a/12061702/830eb69830b6/fphys-16-1527751-g003.jpg
