Zhu Ling, He Shan, Zheng Wanting, Tong Yuanyuan, Yang Feng
Institute of Information on Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing, China.
School of Computer Science, Centre for Computational Biology, The University of Birmingham, Birmingham, UK.
Sci Rep. 2025 Jul 1;15(1):21985. doi: 10.1038/s41598-025-04824-5.
In recent years, the prevalence of chronic diseases such as ulcerative colitis (UC) has increased, placing a heavy burden on healthcare systems. Traditional Chinese Medicine (TCM) offers cost-effective and efficient treatment modalities, which gives it distinct advantages in healthcare. However, syndrome differentiation of UC remains a longstanding challenge in TCM because of the disease's chronic nature and varied manifestations. While existing research has primarily explored machine learning applications for diagnosis and prognosis prediction, the critical issue of explainability in syndrome differentiation remains underexamined. To bridge this gap, we propose an ensemble prediction model enhanced with SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) to improve interpretability and clinical utility. Our study uses a dataset of 8078 electronic medical records from Dongfang Hospital, Beijing University of Chinese Medicine, collected between 2006 and 2019. Comprehensive evaluations demonstrate that our ensemble models outperform individual deep learning approaches, with the Gradient Boosting (GB) model achieving an F1 score of 83% in syndrome differentiation. Furthermore, SHAP and LIME reveal key features associated with different syndromes, such as frequent stools in spleen-kidney yang deficiency and lower abdominal coldness in spleen yang deficiency, offering valuable insights for intelligent syndrome differentiation. These findings hold significant promise for advancing TCM-based UC management, enhancing clinical decision-making, and improving patient outcomes.
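To make the idea behind SHAP more concrete, the sketch below computes exact Shapley values for a tiny, hypothetical scoring function. It is not the authors' model or pipeline: the function toy_score, its weights, and the feature names (frequent_stools, lower_abdominal_coldness, fatigue) are assumptions chosen only for illustration, and the SHAP library itself relies on faster approximations than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values of predict at instance, relative to baseline."""
    names = list(instance)
    n = len(names)

    def value(coalition):
        # Features in the coalition take the patient's values;
        # all other features stay at the baseline values.
        mixed = {f: (instance[f] if f in coalition else baseline[f]) for f in names}
        return predict(mixed)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for k in range(n):  # size of the coalition that f joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_score(x):
    # Hypothetical syndrome score with made-up weights (illustration only).
    return (0.1
            + 0.4 * x["frequent_stools"]
            + 0.2 * x["lower_abdominal_coldness"]
            + 0.2 * x["frequent_stools"] * x["lower_abdominal_coldness"])


# Hypothetical patient (all symptoms present) versus an all-absent baseline.
patient = {"frequent_stools": 1, "lower_abdominal_coldness": 1, "fatigue": 1}
baseline = {"frequent_stools": 0, "lower_abdominal_coldness": 0, "fatigue": 0}

phi = shapley_values(toy_score, patient, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.3f}")
```

The attributions sum to the difference between the patient's score and the baseline score, which is the additivity property that gives SHAP its name. The interaction term is split equally between the two symptoms involved, and a feature the score never uses (fatigue here) receives zero.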