Identifying cardiovascular disease risk in the U.S. population using environmental volatile organic compounds exposure: A machine learning predictive model based on the SHAP methodology.

Affiliations

Cardiovascular medicine department, The Second Affiliated Hospital of Nanchang University, Nanchang, Jiangxi 330006, China.

Department of Neurosurgery, The Second Affiliated Hospital, Jiangxi Medical College, Nanchang University, Nanchang, Jiangxi 330006, China.

Publication information

Ecotoxicol Environ Saf. 2024 Nov 1;286:117210. doi: 10.1016/j.ecoenv.2024.117210. Epub 2024 Oct 23.

DOI: 10.1016/j.ecoenv.2024.117210
PMID: 39447292

Abstract

BACKGROUND

Cardiovascular disease (CVD) remains a leading cause of mortality globally. Environmental pollutants, specifically volatile organic compounds (VOCs), have been identified as significant risk factors. This study aims to develop a machine learning (ML) model to predict CVD risk based on VOC exposure and demographic data using SHapley Additive exPlanations (SHAP) for interpretability.
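
To make the interpretability step concrete, the following is a minimal sketch of the Shapley attribution that SHAP builds on. It is not the authors' code: the function `shapley_values`, the toy scoring function `toy_risk`, its coefficients, and the input values are all hypothetical. The sketch also assumes that features absent from a coalition are replaced by a fixed baseline value, which is only one common way to define the value function. SHAP libraries use faster approximations or model-specific algorithms, whereas this brute-force version enumerates every coalition and is practical only for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline.

    Assumption: a feature outside the coalition takes its baseline value.
    """
    n = len(x)

    def value(coalition):
        # Build an input that uses x for coalition members, baseline otherwise.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(z):
    # Hypothetical score: age raises risk, and the second feature lowers it
    # more strongly at higher age (an interaction term).
    age, marker = z
    return 0.02 * age - 0.001 * age * marker


x = [70.0, 5.0]          # hypothetical individual
baseline = [50.0, 2.0]   # hypothetical reference individual
phi = shapley_values(toy_risk, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is the additivity property that gives SHAP its name.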

METHODS

We used data from the National Health and Nutrition Examination Survey (NHANES) from 2011 to 2018, comprising 5098 participants. VOC exposure was assessed through 15 urinary metabolite metrics. The dataset was split into a training set (70%) and a test set (30%). Six ML models were developed: Random Forest (RF), Light Gradient Boosting Machine (LightGBM), Decision Tree (DT), Extreme Gradient Boosting (XGBoost), Multi-Layer Perceptron (MLP), and Support Vector Machine (SVM). Model performance was evaluated using the area under the receiver operating characteristic curve (AUROC), accuracy, balanced accuracy, F1 score, J-index, kappa, Matthews correlation coefficient (MCC), positive predictive value (PPV), negative predictive value (NPV), sensitivity (sens), and specificity (spec). SHAP was applied to interpret the best-performing model.
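
The evaluation metrics listed above can all be derived from a binary confusion matrix, plus a ranking of predicted scores for the AUROC. The sketch below shows the standard definitions in plain Python. It is an illustration only, not the study's pipeline: the function names and the ten toy labels, predictions, and scores are hypothetical, and the J-index and kappa are assumed here to mean Youden's J and Cohen's kappa. The code does not guard against empty classes, which would cause a division by zero.

```python
from math import sqrt


def binary_metrics(y_true, y_pred):
    """Confusion-matrix metrics for 0/1 labels and 0/1 predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    sens = tp / (tp + fn)            # sensitivity (recall)
    spec = tn / (tn + fp)            # specificity
    ppv = tp / (tp + fp)             # positive predictive value (precision)
    npv = tn / (tn + fn)             # negative predictive value
    acc = (tp + tn) / n
    # Agreement expected by chance, used by Cohen's kappa.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "accuracy": acc,
        "balanced_accuracy": (sens + spec) / 2,
        "f1": 2 * ppv * sens / (ppv + sens),
        "j_index": sens + spec - 1,  # Youden's J
        "kappa": (acc - p_e) / (1 - p_e),
        "mcc": (tp * tn - fp * fn)
        / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        "ppv": ppv,
        "npv": npv,
        "sens": sens,
        "spec": spec,
    }


def auroc(y_true, scores):
    """AUROC as the probability that a positive outranks a negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 cases, 6 non-cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.05]

metrics = binary_metrics(y_true, y_pred)
area = auroc(y_true, scores)
print(metrics, area)
```

Unlike the other metrics, the AUROC is computed from the continuous scores rather than the thresholded predictions, so it does not depend on the classification cutoff.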

RESULTS

The RF model exhibited the highest predictive performance, with an AUROC of 0.8143. SHAP analysis identified age and 2-aminothiazoline-4-carboxylic acid (ATCA) as the most significant predictors, with ATCA showing a protective effect against CVD, particularly in older adults and those with hypertension. The study found a significant interaction between ATCA levels and age, indicating that the protective effect of ATCA is more pronounced in older individuals due to increased oxidative stress and inflammatory responses associated with aging. E-value analysis suggested that the findings were robust to unmeasured confounding.
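
For readers unfamiliar with E-values, the sketch below shows the standard formula for a risk ratio, E = RR + sqrt(RR × (RR − 1)), with estimates below 1 inverted first. The abstract does not report the effect estimates that were used, so the input values here are hypothetical and the function name `e_value` is an assumption for illustration.

```python
from math import sqrt


def e_value(rr):
    """E-value for a risk ratio point estimate.

    Protective estimates (rr < 1) are inverted before applying the formula.
    """
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1 / rr
    return rr + sqrt(rr * (rr - 1))


# Hypothetical inputs, not values from the study.
print(e_value(2.0))   # harmful association
print(e_value(0.5))   # protective association of the same strength
```

The E-value is the minimum strength of association, on the risk ratio scale, that an unmeasured confounder would need with both exposure and outcome to fully explain away the observed estimate. Larger values therefore indicate greater robustness.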

CONCLUSIONS

This study is the first to utilize VOC exposure data to construct an ML model for predicting CVD risk. The findings highlight the potential of combining environmental exposure data with demographic information to enhance CVD risk prediction, supporting the development of personalized prevention and intervention strategies.

Similar articles

1. Identifying cardiovascular disease risk in the U.S. population using environmental volatile organic compounds exposure: A machine learning predictive model based on the SHAP methodology.
   Ecotoxicol Environ Saf. 2024 Nov 1;286:117210. doi: 10.1016/j.ecoenv.2024.117210. Epub 2024 Oct 23.
2. Personal exposure to mixtures of volatile organic compounds: modeling and further analysis of the RIOPA data.
   Res Rep Health Eff Inst. 2014 Jun(181):3-63.
3. A Risk Prediction Model for Physical Restraints Among Older Chinese Adults in Long-term Care Facilities: Machine Learning Study.
   J Med Internet Res. 2023 Apr 6;25:e43815. doi: 10.2196/43815.
4. Associations between specific volatile organic chemical exposures and cardiovascular disease risks: insights from NHANES.
   Front Public Health. 2024 May 23;12:1378444. doi: 10.3389/fpubh.2024.1378444. eCollection 2024.
5. Effects of Various Heavy Metal Exposures on Insulin Resistance in Non-diabetic Populations: Interpretability Analysis from Machine Learning Modeling Perspective.
   Biol Trace Elem Res. 2024 Dec;202(12):5438-5452. doi: 10.1007/s12011-024-04126-3. Epub 2024 Feb 26.
6. Associations of urinary volatile organic compounds with cardiovascular disease among the general adult population.
   Int J Environ Health Res. 2024 Nov;34(11):3876-3890. doi: 10.1080/09603123.2024.2331732. Epub 2024 Mar 24.
7. Building a predictive model for hypertension related to environmental chemicals using machine learning.
   Environ Sci Pollut Res Int. 2024 Jan;31(3):4595-4605. doi: 10.1007/s11356-023-31384-w. Epub 2023 Dec 17.
8. Predictive model and risk analysis for peripheral vascular disease in type 2 diabetes mellitus patients using machine learning and shapley additive explanation.
   Front Endocrinol (Lausanne). 2024 Feb 28;15:1320335. doi: 10.3389/fendo.2024.1320335. eCollection 2024.
9. Application of machine learning model in predicting the likelihood of blood transfusion after hip fracture surgery.
   Aging Clin Exp Res. 2023 Nov;35(11):2643-2656. doi: 10.1007/s40520-023-02550-4. Epub 2023 Sep 21.
10. Development of interpretable machine learning models to predict in-hospital prognosis of acute heart failure patients.
    ESC Heart Fail. 2024 Oct;11(5):2798-2812. doi: 10.1002/ehf2.14834. Epub 2024 May 15.

Cited by

1. Combined association of chewing capacity and depression with constipation: a cross-sectional study.
   BMC Gastroenterol. 2025 Jul 14;25(1):517. doi: 10.1186/s12876-025-04123-3.
2. Machine learning models integrating dietary data predict all-cause mortality in U.S. NAFLD patients: an NHANES-based study.
   Nutr J. 2025 Jul 1;24(1):100. doi: 10.1186/s12937-025-01170-0.
3. Machine learning prediction model with shap interpretation for chronic bronchitis risk assessment based on heavy metal exposure: a nationally representative study.
   BMC Pulm Med. 2025 May 22;25(1):252. doi: 10.1186/s12890-025-03724-8.
4. The mediating roles of obesity indicators and serum albumin in the association of DEET exposure with depression and sleep disorders in adults: evidence from NHANES 2007-2016.
   BMC Public Health. 2025 May 6;25(1):1658. doi: 10.1186/s12889-025-22880-4.
5. FOSB is a key factor in the genetic link between inflammatory bowel disease and acute myocardial infarction: multiple bioinformatics analyses and validation.
   BMC Med Genomics. 2025 Apr 3;18(1):63. doi: 10.1186/s12920-025-02129-0.
6. Association between pollinosis and obstructive sleep apnea hypopnea syndrome in the US population: evidence from the NHANES database 2005-2018.
   BMC Pulm Med. 2025 Mar 13;25(1):113. doi: 10.1186/s12890-025-03581-5.