South Campus Outpatient Clinic, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China.
Department of Ophthalmology, Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China.
Ophthalmic Res. 2024;67(1):537-548. doi: 10.1159/000541294. Epub 2024 Sep 4.
The aim of this study was to compare various machine learning algorithms for constructing a diabetic retinopathy (DR) prediction model among type 2 diabetes mellitus (DM) patients and to develop a nomogram based on the best model.
This cross-sectional study included type 2 DM patients receiving routine DR screening. Patients were randomly divided into training (n = 244) and validation (n = 105) sets. Least absolute shrinkage and selection operator (LASSO) regression was used to select clinical characteristics. Six machine learning algorithms were compared: decision tree (DT), k-nearest neighbours (KNN), logistic regression model (LM), random forest (RF), support vector machine (SVM), and XGBoost (XGB). Model performance was assessed via receiver-operating characteristic (ROC), calibration, and decision curve analyses (DCAs). A nomogram was then developed on the basis of the best model.
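As a rough illustration of the evaluation step described above, the following sketch splits 349 synthetic records into sets of 244 and 105 and computes AUC, recall, and the net benefit used in decision curve analysis, using only the Python standard library. The synthetic scores, the helper names (`split`, `auc`, `recall`, `net_benefit`), the 0.5 threshold, and the random seeds are all assumptions made for illustration; none of this is the study's data or code.

```python
import random


def split(records, n_train, seed=0):
    """Randomly split records into a training part and a validation part."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


def auc(labels, scores):
    """Area under the ROC curve as the probability that a random positive
    case scores higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def recall(labels, scores, threshold=0.5):
    """Fraction of true positive cases flagged at the given threshold."""
    flagged = [s >= threshold for y, s in zip(labels, scores) if y == 1]
    if not flagged:
        raise ValueError("no positive cases")
    return sum(flagged) / len(flagged)


def net_benefit(labels, scores, threshold):
    """Net benefit at one threshold probability (one point of a decision curve)."""
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Synthetic stand-in for model output: (label, predicted probability) pairs.
rng = random.Random(42)
records = []
for _ in range(349):
    label = 1 if rng.random() < 0.4 else 0
    score = min(1.0, max(0.0, rng.gauss(0.65 if label else 0.35, 0.15)))
    records.append((label, score))

train, valid = split(records, n_train=244)
valid_labels = [y for y, _ in valid]
valid_scores = [s for _, s in valid]

print("train/validation sizes:", len(train), len(valid))
print("validation AUC:", round(auc(valid_labels, valid_scores), 3))
print("validation recall:", round(recall(valid_labels, valid_scores), 3))
print("net benefit at 0.3:", round(net_benefit(valid_labels, valid_scores, 0.3), 3))
```

In a comparison like the one described, the same three functions would be applied to the validation-set predictions of each candidate algorithm, and net benefit would be evaluated over a range of thresholds rather than a single value.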
Compared with the five other machine learning algorithms (DT, KNN, RF, SVM, and XGB), the LM demonstrated the highest area under the ROC curve (AUC, 0.894) and recall (0.92) in the validation set. Its calibration curves and DCA results were also relatively favourable. Disease duration, diabetic peripheral neuropathy (DPN), insulin dosage, urinary protein, and albumin (ALB) were included in the LM. After 1,000 bootstrap resamples, the nomogram exhibited robust discrimination (AUC: 0.856 in the training set and 0.868 in the validation set), calibration, and clinical applicability in both datasets.
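To show how a logistic regression model can be turned into a nomogram, the sketch below converts each predictor's contribution to a 0 to 100 point scale and maps total points back to a predicted probability. The coefficients, intercept, value ranges, units, and binary coding of DPN and urinary protein are hypothetical placeholders chosen for illustration; they are not the fitted values reported by the study, and the helper names are likewise assumptions.

```python
import math

# Hypothetical model: name -> (coefficient, lowest value, highest value).
# These numbers are illustrative only, NOT the study's fitted model.
INTERCEPT = -4.0
PREDICTORS = {
    "disease_duration_years": (0.10, 0.0, 30.0),
    "dpn": (1.20, 0.0, 1.0),                # assumed coding: 0 = absent, 1 = present
    "insulin_dosage_units": (0.02, 0.0, 60.0),
    "urinary_protein": (0.90, 0.0, 1.0),    # assumed coding: 0 = negative, 1 = positive
    "alb_g_per_l": (-0.05, 25.0, 55.0),
}


def reference_value(name):
    """End of the predictor's range with the lowest risk (earns 0 points)."""
    coef, low, high = PREDICTORS[name]
    return low if coef >= 0 else high


def max_effect():
    """Largest change in log-odds any single predictor can produce."""
    return max(abs(coef) * (high - low) for coef, low, high in PREDICTORS.values())


def points(name, value):
    """Nomogram points: the predictor with the largest effect spans 0 to 100."""
    coef, _, _ = PREDICTORS[name]
    return 100.0 * coef * (value - reference_value(name)) / max_effect()


def probability_from_points(total_points):
    """Map total points back to the log-odds, then to a probability."""
    baseline = INTERCEPT + sum(
        coef * reference_value(name) for name, (coef, _, _) in PREDICTORS.items()
    )
    log_odds = baseline + total_points * max_effect() / 100.0
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical patient used only to exercise the functions.
patient = {
    "disease_duration_years": 12.0,
    "dpn": 1.0,
    "insulin_dosage_units": 20.0,
    "urinary_protein": 0.0,
    "alb_g_per_l": 38.0,
}

total = sum(points(name, value) for name, value in patient.items())
print("total points:", round(total, 1))
print("predicted probability:", round(probability_from_points(total), 3))
```

Because the point scale is a linear rescaling of the log-odds, reading a probability off the nomogram gives the same result as evaluating the logistic regression equation directly.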
Among the six machine learning algorithms compared, the LM performed best. A logistic regression-based nomogram for predicting DR in type 2 DM patients was established. This nomogram may serve as a valuable tool for DR detection, facilitating timely treatment.