Risk Prediction of Diabetes Progression Using Big Data Mining with Multifarious Physical Examination Indicators.

Authors

Chen Xiaohong, Zhou Shiqi, Yang Lin, Zhong Qianqian, Liu Hongguang, Zhang Yongjian, Yu Hanyi, Cai Yongjiang

Affiliations

Center of Health Management, Peking University Shenzhen Hospital, Shenzhen, People's Republic of China.

School of Future Technology, South China University of Technology, Guangzhou, People's Republic of China.

Publication

Diabetes Metab Syndr Obes. 2024 Mar 11;17:1249-1265. doi: 10.2147/DMSO.S449955. eCollection 2024.

Abstract

PURPOSE

This study aims to identify the independent factors influencing progression from normal status to prediabetes and from prediabetes to diabetes, and to build diabetes prediction models with several different modeling approaches.

METHODS

The original data in this retrospective study were collected from participants who underwent physical examinations at the Health Management Center of Peking University Shenzhen Hospital. For feature selection, regression analysis was applied separately to the normal and prediabetes populations and to the prediabetes and diabetes populations. The independent influencing factors identified in this way were then used as predictors to construct the prediction models.
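As a rough illustration of how a logistic regression separates risk factors from protective factors, the sketch below fits a two-variable model on synthetic data and reads the direction of each effect from its odds ratio (greater than 1 suggests a risk factor, less than 1 a protective one). Everything in it is an assumption made for illustration: the data are randomly generated, the names `indicator_a` and `indicator_b` are hypothetical, and the fit uses plain gradient descent. It does not reproduce the study's univariate and multivariate screening, and it performs no significance testing.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Fit weights (intercept first) by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


# Synthetic, standardized indicators (NOT study data): indicator_a raises the
# probability of the outcome, indicator_b lowers it.
rng = random.Random(0)
X, y = [], []
for _ in range(400):
    a, b = rng.gauss(0, 1), rng.gauss(0, 1)
    p_true = sigmoid(1.0 * a - 1.0 * b)
    X.append([a, b])
    y.append(1 if rng.random() < p_true else 0)

weights = fit_logistic(X, y)
names = ["indicator_a", "indicator_b"]  # hypothetical indicator names
odds_ratios = {name: math.exp(c) for name, c in zip(names, weights[1:])}

for name, ratio in odds_ratios.items():
    direction = "risk" if ratio > 1 else "protective"
    print(f"{name}: odds ratio {ratio:.2f} ({direction} direction)")
```

In a real analysis, an indicator would be reported as an independent factor only if its coefficient stayed statistically significant after adjustment for the other variables, which requires standard errors or confidence intervals that this sketch does not compute.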

RESULTS

Physical examination indicators were selected through univariate and multivariate logistic regression and then used to train different ML models. For progression from normal status to prediabetes, Age, PRO, TP, and ALT are four independent risk factors, and GLB and HDL.C are two independent protective factors; logistic regression performs best on the testing set (Acc: 0.76, F-measure: 0.74, AUC: 0.78). For progression from prediabetes to diabetes, Age, Gender, BMI, SBP, U.GLU, PRO, ALT, and TG are independent risk factors, and AST is an independent protective factor; logistic regression again performs best on the testing set (Acc: 0.86, F-measure: 0.84, AUC: 0.74).
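For readers less familiar with the three reported metrics, the following minimal sketch shows how accuracy, F-measure, and AUC can be computed from a classifier's predicted probabilities. The labels and scores are made-up toy values, not study data, and the 0.5 decision threshold is an assumption. The F-measure is taken here to be the binary F1 score; the abstract does not state which variant the authors used.

```python
def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true labels.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred):
    # Binary F1: harmonic mean of precision and recall for the positive class.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    # Probability that a random positive scores higher than a random negative
    # (ties count as one half); equivalent to the area under the ROC curve.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, not from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

acc = accuracy(y_true, y_pred)
f1 = f_measure(y_true, y_pred)
area = auc(y_true, scores)
print(f"Acc: {acc:.4f}, F-measure: {f1:.4f}, AUC: {area:.4f}")
# prints "Acc: 0.7500, F-measure: 0.7500, AUC: 0.9375"
```

Accuracy and F-measure depend on the chosen threshold, while AUC depends only on how the scores rank the cases. This is one reason a model can show higher accuracy alongside lower AUC, as in the two sets of figures reported above.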

CONCLUSION

The discussion of the clinical relationships between these indicators and diabetes supports the interpretability of our feature selection. Among the four prediction models, the logistic regression model achieved the best performance on the testing set.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f31c/10942017/ccdc50234a35/DMSO-17-1249-g0001.jpg
