Wei Hongcheng, Sun Jie, Shan Wenqi, Xiao Wenwen, Wang Bingqian, Ma Xuan, Hu Weiyue, Wang Xinru, Xia Yankai
State Key Laboratory of Reproductive Medicine, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China; Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing 211166, China.
Department of Endocrinology, Drum Tower Hospital Affiliated to Nanjing University Medical School, No. 321 Zhongshan Road, Nanjing 210008, China.
Sci Total Environ. 2022 Feb 1;806(Pt 2):150674. doi: 10.1016/j.scitotenv.2021.150674. Epub 2021 Sep 29.
With its dramatically increasing prevalence, diabetes mellitus takes a tremendous toll on individual well-being. Humans are exposed to a wide range of environmental chemicals, which have been postulated to be underappreciated but potentially modifiable risk factors for diabetes.
To determine the utility of environmental chemical exposure in predicting diabetes mellitus.
A total of 8501 eligible participants from NHANES 2005-2016 were randomly assigned to a discovery set (N = 5953) and a validation set (N = 2548). We applied random forest (RF) and least absolute shrinkage and selection operator (LASSO) regression with 10-fold cross-validation in the discovery set to select features, then built an optimal model to predict diabetes mellitus, blood insulin, fasting plasma glucose (FPG), and 2-h plasma glucose after an oral glucose tolerance test (2-h PG after OGTT).
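The workflow described here (random split, L1-penalized selection with 10-fold cross-validation, evaluation on held-out data) can be sketched roughly as follows. This is an illustration on synthetic data, not the authors' code: the simulated biomarkers, the effect sizes, the use of scikit-learn, and the choice of an L1-penalized logistic regression for the binary outcome are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for exposure data (NOT NHANES): 1000 participants and
# 20 hypothetical chemical biomarkers, of which only the first 3 affect the outcome.
n, p = 1000, 20
X = rng.normal(size=(n, p))
logit = 1.2 * X[:, 0] - 1.0 * X[:, 1] + 0.8 * X[:, 2] - 1.0
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

# Random split of roughly 70/30, similar in proportion to 5953/2548.
X_disc, X_val, y_disc, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# L1-penalized (LASSO-type) logistic regression; the penalty strength is
# chosen by 10-fold cross-validation within the discovery set only.
model = make_pipeline(
    StandardScaler(),
    LogisticRegressionCV(
        Cs=10, cv=10, penalty="l1", solver="liblinear",
        scoring="roc_auc", random_state=0,
    ),
)
model.fit(X_disc, y_disc)

# Features whose coefficients were not shrunk to exactly zero are "selected".
coef = model[-1].coef_.ravel()
selected = np.flatnonzero(coef)

auc_disc = roc_auc_score(y_disc, model.predict_proba(X_disc)[:, 1])
auc_val = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])

print("selected feature indices:", selected.tolist())
print(f"AUROC discovery={auc_disc:.2f}, validation={auc_val:.2f}")
```

Tuning the penalty inside the discovery set and scoring once on the validation set keeps the validation estimate free of selection bias, which is presumably why the abstract reports both figures side by side.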
The machine learning model using LASSO regression predicted diabetes with an area under the receiver operating characteristic curve (AUROC) of 0.80 in the discovery set and 0.78 in the validation set. The linear model predicted blood insulin level with an R of 0.42 in the discovery set and 0.40 in the validation set. For FPG, R was 0.16 and 0.15 in the discovery and validation sets, respectively. For 2-h PG after OGTT, R was 0.18 and 0.17, respectively.
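To make the two reported metrics concrete, the following minimal sketch computes them from scratch on toy numbers. It assumes that R denotes the Pearson correlation between observed and predicted values, which the abstract does not state explicitly; the function names and the data are illustrative only.

```python
from math import sqrt


def auroc(labels, scores):
    """Probability that a random case outscores a random non-case (ties count half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Toy classification example: 3 cases, 3 non-cases, predicted risk scores.
labels = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.2, 0.8, 0.7]
auc = auroc(labels, scores)  # 8 of 9 case/non-case pairs are ranked correctly

# Toy regression example: observed versus predicted values.
observed = [1, 2, 3, 4, 5]
predicted = [2, 1, 4, 3, 5]
r = pearson_r(observed, predicted)

print(f"AUROC = {auc:.3f}, R = {r:.2f}")  # AUROC = 0.889, R = 0.80
```

On this reading, an AUROC of 0.78 means that a randomly chosen participant with diabetes receives a higher predicted risk than a randomly chosen participant without it about 78% of the time.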
Using environmental chemical exposure data, we constructed machine learning models that achieved relatively accurate prediction of diabetes, underscoring the predictive value of widespread environmental chemicals for complex diseases.