Integrated machine learning and deep learning for predicting diabetic nephropathy model construction, validation, and interpretability.

Author affiliations

Department of Clinical Medicine, Bengbu Medical University, Bengbu, China.

Department of Oncology Surgery, the Second Affiliated Hospital of Bengbu Medical University, Bengbu, China.

Publication information

Endocrine. 2024 Aug;85(2):615-625. doi: 10.1007/s12020-024-03735-1. Epub 2024 Feb 23.

Abstract

OBJECTIVE

To construct a machine learning risk prediction model to assist the diagnosis of diabetic nephropathy (DN), and to validate it internally and externally.

METHODS

First, the data were cleaned and enhanced, then divided into training and test sets at a 7:3 ratio. Next, DN-related indicators were screened using difference analysis and the Least Absolute Shrinkage and Selection Operator (LASSO), Recursive Feature Elimination (RFE), and Max-Relevance and Min-Redundancy (MRMR) algorithms. Ten machine learning models were built on the resulting key variables. The best model was selected by comparing Receiver Operating Characteristic (ROC) and Precision-Recall (PR) curves, accuracy, Matthews Correlation Coefficient (MCC), and Kappa, and was then validated internally and externally. Finally, an online platform was built on the best model.
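As a rough illustration of the 7:3 split and three of the comparison metrics named above (accuracy, MCC, and Kappa), the following is a minimal pure-Python sketch. It is not the authors' code. The function names, the stratified splitting strategy, the random seed, and the toy labels are all assumptions made for this example.

```python
import random
from math import sqrt


def split_7_3(labels, seed=0):
    """Return (train_idx, test_idx) with roughly 70% of each class in train.

    Stratifying by class is an assumption; the abstract only states a 7:3 ratio.
    """
    rng = random.Random(seed)
    train, test = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(0.7 * len(idx))
        train += idx[:cut]
        test += idx[cut:]
    return sorted(train), sorted(test)


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def mcc(tp, fp, fn, tn):
    """Matthews Correlation Coefficient; returns 0.0 if any marginal is empty."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def cohen_kappa(tp, fp, fn, tn):
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    p_o = (tp + tn) / n
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (p_o - p_e) / (1 - p_e) if p_e != 1 else 0.0


# Toy labels invented for illustration only (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

train_idx, test_idx = split_7_3(y_true)
counts = confusion_counts(y_true, y_pred)
print(len(train_idx), len(test_idx))
print(accuracy(*counts), mcc(*counts), cohen_kappa(*counts))
```

MCC and Kappa both correct for class imbalance in ways that plain accuracy does not, which is one plausible reason to report them side by side when comparing candidate models.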

RESULTS

Fifteen key variables were selected. Among the ten machine learning models, the Random Forest model achieved the best predictive performance. The area under the ROC curve was 0.912 in the test set and 0.828 and 0.863 in the two external validation cohorts, indicating excellent predictive and generalization abilities.
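For readers unfamiliar with the area under the ROC curve, it can be read as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The sketch below computes it that way in pure Python. The function name `roc_auc` and the toy labels and scores are assumptions for illustration; they are not taken from the study.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; tied scores count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy labels and predicted probabilities, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

auc = roc_auc(y_true, scores)
print(round(auc, 3))
```

Here 11 of the 12 positive-negative pairs are ranked correctly, so the result is 11/12. The pairwise loop is quadratic in the number of cases, so a rank-based implementation would be preferable for cohorts of realistic size.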

CONCLUSION

The model has good predictive value and is expected to aid the early diagnosis and screening of DN in clinical practice.
