Risk prediction of integrated traditional Chinese and western medicine for diabetes retinopathy based on optimized gradient boosting classifier model.

Authors

Xiao Li, Tang Lixuan, Kuang Wenxuan, Yang Yijing, Deng Ying, Lu Jing, Peng Qinghua, Yan Junfeng

Affiliations

School of Chinese Medicine, Hunan University of Chinese Medicine, Changsha, China.

School of Medicine, Hunan University of Chinese Medicine, Changsha, China.

Publication

Medicine (Baltimore). 2024 Dec 20;103(51):e40896. doi: 10.1097/MD.0000000000040896.

Abstract

This study combined traditional Chinese medicine (TCM) and western medicine with machine learning to identify risk factors for diabetic retinopathy (DR) and to build a better DR risk prediction model, providing a basis for DR screening and treatment. In a retrospective study of real-world DR cases, the electronic medical records of patients who met the screening criteria were collected. Recursive Feature Elimination with Cross-Validation (RFECV) was used for feature selection. A prediction model was then built with a Gradient Boosting Machine (GBM) and compared with 4 other popular machine learning techniques: Logistic Regression (LR), K-Nearest Neighbors (KNN), Random Forest, and Support Vector Machine (SVM). The models were evaluated by accuracy, precision, recall, F1 score, and area under the curve (AUC). Grid search was used to optimize the model, and the Shapley Additive exPlanation (SHAP) method was used to explain the model's results more intuitively. A total of 9034 patients with type 2 diabetes mellitus (T2DM) who met the screening criteria were included, of whom 1118 had DR. RFECV selected 19 features for model construction. Of the 5 models we constructed (GBM, LR, KNN, Random Forest, and SVM), GBM had the highest accuracy (0.85) and AUC (0.934) and was therefore the best prediction model. After hyperparameter optimization by grid search, its accuracy reached 0.88 and its AUC increased to 0.958. SHAP analysis showed that TCM syndrome type, albumin, low-density lipoprotein, triglyceride, total protein, and glycosylated hemoglobin were closely related to an increased risk of DR. It can be concluded that TCM syndrome type is a risk factor for DR. The GBM classifier optimized by grid search, with relevant TCM and western medicine risk factors as variables, can better predict the risk of DR.
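The workflow the abstract describes (RFECV feature selection, a GBM classifier, grid search tuning, then accuracy and AUC on held-out data) can be sketched roughly as follows. This is a minimal illustration that assumes scikit-learn is installed; it is not the study's code. The data are synthetic, and the sample size, feature count, class ratio, hyperparameter grid, fold count, and the function name `run_sketch` are all assumptions chosen to keep the example small. The SHAP step is omitted.

```python
# Illustrative sketch on synthetic data; not the study's code, data, or settings.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import RFECV
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split


def run_sketch(seed: int = 0) -> dict:
    """Hypothetical helper: RFECV -> grid-searched GBM -> test metrics."""
    # Synthetic stand-in for the cohort, with roughly 12% positives
    # (the abstract reports 1118 DR cases among 9034 patients).
    X, y = make_classification(
        n_samples=600, n_features=12, n_informative=5,
        weights=[0.88, 0.12], random_state=seed,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # Step 1: RFECV removes one feature per round and keeps the subset
    # with the best cross-validated AUC.
    selector = RFECV(
        GradientBoostingClassifier(n_estimators=30, random_state=seed),
        step=1, cv=3, scoring="roc_auc",
    )
    selector.fit(X_train, y_train)
    X_train_sel = selector.transform(X_train)
    X_test_sel = selector.transform(X_test)

    # Step 2: grid search over a small, assumed hyperparameter grid.
    search = GridSearchCV(
        GradientBoostingClassifier(random_state=seed),
        param_grid={"n_estimators": [30, 60], "max_depth": [2, 3]},
        cv=3, scoring="roc_auc",
    )
    search.fit(X_train_sel, y_train)

    # Step 3: evaluate the refitted best model on the held-out split.
    proba = search.predict_proba(X_test_sel)[:, 1]
    pred = search.predict(X_test_sel)
    return {
        "n_selected": int(selector.n_features_),
        "best_params": search.best_params_,
        "accuracy": float(accuracy_score(y_test, pred)),
        "auc": float(roc_auc_score(y_test, proba)),
    }


if __name__ == "__main__":
    print(run_sketch())
```

Running feature selection before the grid search, rather than nesting it inside each grid point, keeps the sketch fast. Whether the study ordered the steps this way is not stated in the abstract.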


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/062c/11666193/1bda1b400469/medi-103-e40896-g001.jpg
