Predicting diabetic retinopathy based on routine laboratory tests by machine learning algorithms.

Author information

Wan Xiaohua, Zhang Ruihuan, Wang Yanan, Wei Wei, Song Biao, Zhang Lin, Hu Yanwei

Affiliations

Department of Clinical Laboratory, Beijing Chao-Yang Hospital, Capital Medical University, Beijing, People's Republic of China.

Beijing Center for Clinical Laboratories, Beijing, People's Republic of China.

Publication information

Eur J Med Res. 2025 Mar 18;30(1):183. doi: 10.1186/s40001-025-02442-5.

Abstract

OBJECTIVES

This study aimed to identify risk factors for diabetic retinopathy (DR) and develop machine learning (ML)-based predictive models using routine laboratory data in patients with type 2 diabetes mellitus (T2DM).

METHODS

Clinical data from 4259 T2DM inpatients at Beijing Tongren Hospital were analyzed and divided into a model construction data set (N = 3936) and an external validation data set (N = 323). A prediction model was built from 39 optimal variables with the eXtreme Gradient Boosting (XGBoost) algorithm and compared with four other algorithms: support vector machine (SVM), gradient boosting decision tree (GBDT), neural network (NN), and logistic regression (LR). The SHapley Additive exPlanations (SHAP) method was used to interpret the XGBoost model, and model performance was assessed on the external validation data set.
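To illustrate the kind of attribution SHAP reports, the sketch below computes exact Shapley values for a single sample by enumerating feature coalitions. It is not the study's pipeline and does not use the SHAP library: the function `shapley_values`, the toy scoring function `toy_risk`, the three feature names, and all numbers are hypothetical and chosen only to keep the example small. SHAP applied to a tree model such as XGBoost estimates the same quantity far more efficiently.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one sample by enumerating feature subsets.

    Features outside a coalition are replaced by their baseline value.
    The cost grows as 2**n, so this is only practical for a handful of features.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        # Coalition members take the sample's value, all others the baseline.
        mixed = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(mixed)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(v):
    """Hypothetical risk score over three made-up, standardized lab features."""
    lab_a, lab_b, lab_c = v
    return 0.3 * lab_a + 0.2 * lab_b - 0.1 * lab_c + 0.15 * lab_a * lab_b


patient = [1.0, 2.0, -1.0]   # hypothetical sample
baseline = [0.0, 0.0, 0.0]   # hypothetical reference values
phi = shapley_values(toy_risk, patient, baseline)
```

The attributions sum to the difference between the score for the sample and the score for the baseline, which is the property that lets SHAP split one prediction into per-feature contributions. In this toy case the interaction term between `lab_a` and `lab_b` is shared equally between the two features.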

RESULTS

DR was present in 47.69% (N = 1877) of T2DM patients in the model construction data set. Among the models tested, the XGBoost model performed best, with an AUC of 0.831, accuracy of 0.757, sensitivity of 0.754, specificity of 0.759, and F1-score of 0.752. SHAP quantified feature importance for the XGBoost model and identified key risk factors for DR. External validation yielded an accuracy of 0.650 for the XGBoost model.
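The reported metrics are linked through DR prevalence, which a short calculation can make concrete. The sketch below assumes the evaluation split has the same DR prevalence as the model construction data set (1877 of 3936); the abstract does not describe the split, so that is an assumption, and the helper names `accuracy_from_rates` and `f1_from_rates` are illustrative.

```python
def accuracy_from_rates(sensitivity, specificity, prevalence):
    """Accuracy as the prevalence-weighted mean of sensitivity and specificity."""
    return sensitivity * prevalence + specificity * (1 - prevalence)


def f1_from_rates(sensitivity, specificity, prevalence):
    """F1-score for the positive (DR) class, derived from rates and prevalence."""
    true_pos = sensitivity * prevalence                # fraction of all patients
    false_pos = (1 - specificity) * (1 - prevalence)   # fraction of all patients
    precision = true_pos / (true_pos + false_pos)
    return 2 * precision * sensitivity / (precision + sensitivity)


# Assumption: evaluation-split prevalence equals that of the construction set.
prevalence = 1877 / 3936

acc = accuracy_from_rates(0.754, 0.759, prevalence)
f1 = f1_from_rates(0.754, 0.759, prevalence)

print(f"prevalence = {prevalence:.4f}")
print(f"accuracy   = {acc:.3f}")
print(f"F1-score   = {f1:.3f}")
```

Under this assumption the accuracy rounds to the reported 0.757, while the F1-score comes out a few thousandths below the reported 0.752. A somewhat different prevalence in the actual evaluation split, or a different averaging convention, could account for that gap.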

CONCLUSIONS

The XGBoost-based prediction model effectively assesses DR risk in T2DM patients using routine laboratory data, aiding clinicians in identifying high-risk individuals and guiding personalized management strategies, especially in medically underserved areas.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/80a9/11921716/9a7f5a5165e9/40001_2025_2442_Fig1_HTML.jpg
