Development and validation of medical record-based logistic regression and machine learning models to diagnose diabetic retinopathy.

Affiliation

Beijing Tongren Eye Center, Beijing Key Laboratory of Intraocular Tumor Diagnosis and Treatment, Beijing Ophthalmology & Visual Sciences Key Lab, Medical Artificial Intelligence Research and Verification Key Laboratory of the Ministry of Industry and Information Technology, Beijing Tongren Hospital, Capital Medical University, 1 Dong Jiao Min Lane, Beijing, 100730, China.

Publication information

Graefes Arch Clin Exp Ophthalmol. 2023 Mar;261(3):681-689. doi: 10.1007/s00417-022-05854-9. Epub 2022 Oct 14.

Abstract

PURPOSES

Many factors were reported to be associated with diabetic retinopathy (DR); however, their contributions remained unclear. We aimed to evaluate the prognostic and diagnostic accuracy of logistic regression and three machine learning models based on various medical records.

METHODS

This was a cross-sectional study. We investigated the prevalence and associations of DR among 757 participants aged 40 years or older in the 2005-2006 National Health and Nutrition Examination Survey (NHANES). We trained the models to predict whether participants had DR using 15 predictor variables. The area under the receiver operating characteristic curve (AUROC) and the mean squared error (MSE) of each algorithm were compared in an external validation dataset, a replicate cohort from NHANES 2007-2008.
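As an illustration of the two comparison metrics named above, the following minimal sketch computes AUROC and MSE for two sets of predicted probabilities. The labels and scores are invented toy values, not NHANES data, and the helper names `auroc` and `mse` are hypothetical; the abstract does not say which software the authors used.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the chance that a random positive outranks a random negative.

    A tie between a positive and a negative counts as half a win
    (the Mann-Whitney U formulation of the area under the ROC curve).
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mse(y_true: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared error between predicted probabilities and 0/1 labels."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


# Toy example (assumed values): 1 = DR present, 0 = DR absent.
y = [0, 0, 0, 1, 0, 1]
model_a = [0.1, 0.2, 0.3, 0.8, 0.4, 0.9]  # ranks every positive above every negative
model_b = [0.1, 0.6, 0.3, 0.5, 0.4, 0.9]  # one negative outranks one positive

for name, probs in [("model_a", model_a), ("model_b", model_b)]:
    print(name, "AUROC =", round(auroc(y, probs), 3), "MSE =", round(mse(y, probs), 3))
```

AUROC depends only on how cases are ranked, whereas MSE also penalizes poorly calibrated probabilities, which is one reason to report both.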

RESULTS

Among the 757 participants, 53 (7.00%) subjects had DR, the mean (standard deviation, SD) age was 57.7 (13.04) years, and 78.0% were male (n = 42). Logistic regression revealed that female gender (OR = 4.130, 95% CI: 1.820-9.380; P < 0.05), HbA1c (OR = 1.665, 95% CI: 1.197-2.317; P < 0.05), serum creatinine level (OR = 2.952, 95% CI: 1.274-6.851; P < 0.05), and eGFR level (OR = 1.009, 95% CI: 1.000-1.014; P < 0.05) increased the risk of DR. Average performance in internal validation was similar across all models (AUROC ≥ 0.945), with k-nearest neighbors (KNN) achieving the highest AUROC at 0.984. In external validation, the models remained robust or showed only modest reductions in discrimination, with AUROC still ≥ 0.902; KNN again performed best, with an AUROC of 0.982. Both logistic regression and machine learning models performed well in the clinical diagnosis of DR.
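To show how an odds ratio and its 95% CI relate to a logistic regression coefficient, the sketch below back-calculates the coefficient and standard error from the reported HbA1c figures and then rebuilds the interval. It assumes a Wald interval, exp(β ± z·SE), which the abstract does not state; the function names are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def coef_from_or(odds_ratio: float, lower: float, upper: float, z: float = Z_95):
    """Recover (beta, SE) from an OR and its CI, assuming a Wald interval."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


def odds_ratio_ci(beta: float, se: float, z: float = Z_95):
    """Return (OR, lower, upper) for a logistic regression coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Reported HbA1c values: OR = 1.665, 95% CI 1.197-2.317.
beta, se = coef_from_or(1.665, 1.197, 2.317)
or_hat, ci_low, ci_high = odds_ratio_ci(beta, se)

print(f"beta = {beta:.3f}, SE = {se:.3f}")
print(f"OR = {or_hat:.3f}, 95% CI {ci_low:.3f}-{ci_high:.3f}")
```

The rebuilt interval matches the reported one to within rounding, so the HbA1c figures are consistent with a symmetric interval on the log-odds scale.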

CONCLUSIONS

This study highlights the utility of comparing traditional logistic regression to machine learning models. We found that logistic regression performed as well as optimized machine learning methods when classifying DR patients.
