
Artificial intelligence survival models for identifying relevant risk factors for incident diabetes in Azar cohort population.

Author information

Gilani Neda, Somi Mohammadhossein, Hamidi Farzaneh, Santaguida Pasqualina, Faramarzi Elnaz, Arabi Belaghi Reza

Affiliations

Department of Statistics and Epidemiology, Faculty of Health, Tabriz University of Medical Sciences, Tabriz, Iran.

Liver and Gastrointestinal Diseases Research Center, Tabriz University of Medical Sciences, Tabriz, Iran.

Publication information

Health Promot Perspect. 2025 May 6;15(1):82-92. doi: 10.34172/hpp.025.43105. eCollection 2025 May.

Abstract

BACKGROUND

This study aimed to identify risk factors associated with time to onset of type 2 diabetes using artificial intelligence (AI) survival models (SM) in a population cohort from East Azerbaijan, Iran.

METHODS

Data from the Azar cohort, collected from 2014 to 2020, were analyzed using random forest (RF) variable selection together with Cox regression to identify the risk factors most strongly associated with diabetes. Prediction models were then developed using RF survival analysis. LASSO variable selection and RF variable selection were used to select the most important variables. The concordance index (C-index) was used to evaluate the discriminative performance of the prediction models.
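As an illustration of the evaluation metric named above, the following is a minimal sketch of Harrell's C-index for right-censored data. It is not the authors' code: the function name `concordance_index`, the toy follow-up times, event flags, and risk scores are all assumptions made for this example, and pairs with tied follow-up times are simply skipped, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (simplified sketch).

    times: observed follow-up time per subject
    events: 1 if the event (e.g. diabetes onset) was observed, 0 if censored
    risk_scores: model output, where a higher value means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject a has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time ends in an observed
        # event; tied times are skipped here for simplicity (an assumption).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up toy data: four subjects, the third is censored at time 6.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.2, 0.3, 0.5]
print(concordance_index(toy_times, toy_events, toy_risk))  # prints 0.6
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so C-index values in an abstract like this one are usually read on that scale.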

RESULTS

The LASSO-Cox regression identified six factors significantly associated with diabetes: age, mean corpuscular hemoglobin concentration (MCHC), waist circumference (WC), body mass index (BMI), use of sleep medication, and hypertension (stage 1 and stage 2). The model that included all variables had a C-index of 76.3%. In contrast, the RF analysis identified 21 important variables predicting a higher probability of developing diabetes. Of these, WC, MCHC, triglycerides, and age were the most important predictors. The RF model converged after 500 trees, with an out-of-bag (OOB) estimate of 0.28 and a C-index of 79.5%.
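The out-of-bag (OOB) figure reported above relies on the fact that each tree in a random forest is grown on a bootstrap sample, so some subjects are left out of that tree and can be used to evaluate it. The sketch below only illustrates how OOB sets arise; it does not fit a survival forest and is not the authors' procedure. The helper `bootstrap_with_oob` and the cohort size of 1000 are hypothetical; the tree count of 500 mirrors the abstract.

```python
import random


def bootstrap_with_oob(n_samples, rng):
    """Draw one bootstrap sample and return (in_bag, out_of_bag) index lists."""
    # Sample n_samples indices with replacement, as done for each tree.
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    # Subjects never drawn for this tree form its out-of-bag set.
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed, for reproducibility of the illustration
n_samples = 1000        # hypothetical cohort size, not taken from the study
n_trees = 500           # matches the number of trees quoted in the abstract

oob_fractions = []
for _ in range(n_trees):
    _, oob = bootstrap_with_oob(n_samples, rng)
    oob_fractions.append(len(oob) / n_samples)

mean_oob_fraction = sum(oob_fractions) / n_trees
# On average roughly 1/e (about 0.368) of subjects are out-of-bag per tree.
print(round(mean_oob_fraction, 3))
```

Because every subject is out-of-bag for roughly a third of the trees, predictions from only those trees give a performance estimate without a separate validation split.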

CONCLUSION

Both the RF machine learning algorithm and LASSO-Cox regression identified WC, hypertension, and MCHC as the main risk factors for developing diabetes. The RF approach showed slightly better performance (C-index 79.5% versus 76.3%) in predicting the likelihood of diabetes at different time points.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a09f/12125507/ff27488b50b8/hpp-15-82-g001.jpg
