Comparing the accuracy of four machine learning models in predicting type 2 diabetes onset within the Chinese population: a retrospective study.

Author information

Liu Hongzhou, Dong Song, Yang Hua, Wang Linlin, Liu Jia, Du Yangfan, Liu Jing, Lyu Zhaohui, Wang Yuhan, Jiang Li, Yu Shasha, Fu Xiaomin

Affiliations

Department of Endocrinology, Aerospace Center Hospital, Beijing, China.

Department of Endocrinology, First Hospital of Handan City, Handan, China.

Publication information

J Int Med Res. 2024 Jun;52(6):3000605241253786. doi: 10.1177/03000605241253786.

Abstract

OBJECTIVE

To evaluate the effectiveness of machine learning (ML) models in predicting 5-year type 2 diabetes mellitus (T2DM) risk within the Chinese population by retrospectively analyzing annual health checkup records.

METHODS

We included 46,247 patients (32,372 and 13,875 in training and validation sets, respectively) from a national health checkup center database. Univariate and multivariate Cox analyses were performed to identify factors influencing T2DM risk. Extreme Gradient Boosting (XGBoost), support vector machine (SVM), logistic regression (LR), and random forest (RF) models were trained to predict 5-year T2DM risk. Model performances were analyzed using receiver operating characteristic (ROC) curves for discrimination and calibration plots for prediction accuracy.
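As an illustration of the discrimination metric named above, the following is a minimal sketch of computing the area under the ROC curve from predicted risk scores. It is not the authors' code: the function name `roc_auc` and the toy labels and scores are hypothetical, and it uses the standard rank-based (Mann-Whitney) formulation, in which the AUC is the probability that a randomly chosen case is scored above a randomly chosen non-case, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: P(score of a positive > score of a negative), ties = 0.5.

    labels: iterable of 0/1 outcomes (1 = developed T2DM in this toy setting)
    scores: iterable of predicted risks, same length as labels
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied scores share the average of the 1-based ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            if pairs[k][1] == 1:
                rank_sum_pos += avg_rank
        i = j

    # Mann-Whitney U statistic for the positives, scaled to [0, 1].
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(toy_labels, toy_scores))
```

Computing this value separately on the training and validation sets is what allows the two AUCs of a model to be compared.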

RESULTS

Key variables included fasting plasma glucose, age, and sedentary time. The LR model showed good accuracy, with areas under the ROC curve (AUCs) of 0.914 and 0.913 in the training and validation sets, respectively; the RF model exhibited favorable AUCs of 0.998 and 0.838 in the same sets. In calibration analysis, the LR model displayed good fit for low-risk patients; the RF model exhibited satisfactory fit for both low- and high-risk patients.
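The calibration analysis mentioned above compares predicted risk with observed outcome frequency. A minimal sketch of the table behind a calibration plot follows; it assumes equal-width risk bins, which the abstract does not specify, and the function name `calibration_table` and the toy data are hypothetical.

```python
def calibration_table(labels, probs, n_bins=5):
    """Group predictions into equal-width risk bins.

    Returns one (mean_predicted_risk, observed_event_rate, count) tuple
    per non-empty bin, ordered from lowest to highest risk.
    """
    bins = [[] for _ in range(n_bins)]
    for y, p in zip(labels, probs):
        # A probability of exactly 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    rows = []
    for members in bins:
        if not members:
            continue  # skip empty bins rather than dividing by zero
        count = len(members)
        mean_pred = sum(p for p, _ in members) / count
        observed = sum(y for _, y in members) / count
        rows.append((mean_pred, observed, count))
    return rows


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    toy_labels = [0, 0, 1, 1]
    toy_probs = [0.1, 0.1, 0.9, 0.9]
    for mean_pred, observed, count in calibration_table(toy_labels, toy_probs, n_bins=2):
        print(f"predicted={mean_pred:.2f} observed={observed:.2f} n={count}")
```

A model is well calibrated in a bin when the mean predicted risk is close to the observed event rate, so a statement such as "good fit for low-risk patients" corresponds to close agreement in the lowest bins.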

CONCLUSIONS

LR and RF models can effectively predict T2DM risk in the Chinese population. These models may help identify high-risk patients and guide interventions to prevent complications and disabilities.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e824/11179491/72af7c15fe14/10.1177_03000605241253786-fig1.jpg
