Kim Min Seok, Choi Young Wook, Prakash Borghare Shubham, Lee Youngju, Lim Soo, Woo Se Joon
Department of Ophthalmology, Seoul National University College of Medicine, Seoul National University Bundang Hospital, Seongnam-si, Republic of Korea.
RetiMark R&D Center, Seoul, Republic of Korea.
Front Med (Lausanne). 2025 May 30;12:1542860. doi: 10.3389/fmed.2025.1542860. eCollection 2025.
Machine learning technology that uses available clinical data to predict diabetic retinopathy (DR) can be highly valuable in medical settings where fundus cameras are not accessible.
This study aimed to develop and compare machine learning algorithms for predicting DR without fundus images.
We used data from the Korea National Health and Nutrition Examination Survey (2008-2012 and 2017-2021) and enrolled individuals aged ≥ 20 years with diabetes who underwent fundus examination. Predictive models for DR were developed using logistic regression and three machine learning algorithms: extreme gradient boosting, decision tree, and random forest. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) and accuracy for the diagnosis of DR, and feature importance was determined using Shapley Additive Explanations (SHAP).
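As an illustration of the AUC metric named above, the following is a minimal sketch that computes it from its rank-based definition: the probability that a randomly chosen DR case receives a higher predicted score than a randomly chosen non-DR case, with ties counted as one half. The function name `roc_auc` and the labels and scores are hypothetical toy values, not data or code from the study.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = DR present, 0 = DR absent.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.81, 0.42, 0.35, 0.55, 0.30, 0.22, 0.35, 0.10]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of participants, which is fine for illustration; a sort-based implementation would be preferable for a cohort of several thousand.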
Among the 3,026 diabetic participants (male, 50.7%; mean age, 63.7 ± 10.5 years), 671 (22.2%) had DR. The random forest model, using 16 variables, achieved the highest AUC of 0.748 (95% confidence interval, 0.705-0.790), with a sensitivity of 0.669, a specificity of 0.729, and an accuracy of 0.715. As interpreted by SHAP, HbA1c, fasting glucose levels, duration of diabetes, and body mass index were identified as common key determinants influencing the models' outcomes.
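The reported sensitivity, specificity, and accuracy are linked by a simple identity: accuracy equals sensitivity weighted by DR prevalence plus specificity weighted by the non-DR proportion. The sketch below checks the reported figures against that identity. It assumes the evaluation set had the same DR prevalence as the full cohort (671 of 3,026), which the abstract does not state; the function name `accuracy_from_rates` is hypothetical.

```python
def accuracy_from_rates(sensitivity, specificity, prevalence):
    """Accuracy implied by sensitivity and specificity at a given prevalence."""
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


# Assumption: evaluation-set prevalence equals the cohort prevalence.
prevalence = 671 / 3026
implied = accuracy_from_rates(0.669, 0.729, prevalence)
print(f"prevalence = {prevalence:.3f}, implied accuracy = {implied:.3f}")
```

Under that assumption the implied accuracy is about 0.716, in line with the reported 0.715; a small gap is expected from rounding and from any difference in prevalence between the evaluation set and the full cohort.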
The DR prediction models using machine learning techniques demonstrated reliable performance even without fundus imaging, with the random forest model showing particularly strong results. These models could assist in managing DR by identifying high-risk patients, enabling timely ophthalmic referrals.