Predicting the risk of cardiovascular disease in adults exposed to heavy metals: Interpretable machine learning.

Authors

Shen Meiyue, Zhang Yine, Zhan Runqing, Du Tingwei, Shen Peixuan, Lu Xiaochuan, Liu Shengnan, Guo Rongrong, Shen Xiaoli

Affiliations

Department of Epidemiology and Health Statistics, School of Public Health, Qingdao University, Qingdao 266071, China.

Ningxia Center for Disease Control and Prevention, Yinchuan, China.

Publication details

Ecotoxicol Environ Saf. 2025 Jan 15;290:117570. doi: 10.1016/j.ecoenv.2024.117570. Epub 2024 Dec 24.

Abstract

Machine learning offers strong predictive performance. We aimed to construct an interpretable machine learning model using National Health and Nutrition Examination Survey (NHANES) data to investigate the relationship between heavy metal exposure and cardiovascular disease (CVD). A total of 4600 adults were included in the analysis. Least Absolute Shrinkage and Selection Operator (LASSO) regression was used to select relevant feature variables. Six machine learning models were then constructed: random forest, decision tree, gradient boosting decision tree, k-nearest neighbor, support vector machine, and AdaBoost. Feature importance analysis, partial dependence plots, and SHapley Additive exPlanations (SHAP) were combined to enhance the interpretability of the CVD prediction model. Among all models, the random forest performed best, with an accuracy of 90%, an area under the curve (AUC) of 0.85, and an F1 score of 0.86. Urine cadmium (Cd), blood lead (Pb), urine thallium (Tl), and urine tungsten (W) were identified as the most significant predictors of CVD, with importance scores of 0.062, 0.057, 0.051, and 0.050, respectively. At the overall level, higher levels of urine Cd, blood Pb, and urine W were associated with an increased risk of CVD, whereas a lower level of urine Tl was linked to a reduced CVD risk. Additionally, the analysis of synergistic effects revealed that Cd was the predominant determinant of CVD risk. The random forest-based CVD prediction model demonstrated excellent predictive power and provided valuable insights for personalized patient care and optimal resource allocation in populations exposed to heavy metals.
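The abstract summarizes model quality with three metrics: accuracy, area under the ROC curve, and F1 score. As a reminder of what each one measures, here is a minimal sketch in plain Python. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration; they are not the study's data or the authors' code.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = CVD, 0 = no CVD.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]  # predicted risk
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed 0.5 threshold

print(accuracy(y_true, y_pred))   # 0.75
print(f1_score(y_true, y_pred))   # 0.75
print(roc_auc(y_true, scores))    # 0.9375
```

Accuracy and F1 depend on the chosen threshold, while AUC depends only on how the scores rank cases against non-cases. This is why a model can report a high accuracy alongside a more moderate AUC.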
