Integrating Metabolomics Domain Knowledge with Explainable Machine Learning in Atherosclerotic Cardiovascular Disease Classification.

Authors

Santana Everton, Ibrahimi Eliana, Ntalianis Evangelos, Cauwenberghs Nicholas, Kuznetsova Tatiana

Affiliations

Research Unit Hypertension and Cardiovascular Epidemiology, KU Leuven Department of Cardiovascular Sciences, University of Leuven, 3000 Leuven, Belgium.

Department of Biology, University of Tirana, 1001 Tirana, Albania.

Publication information

Int J Mol Sci. 2024 Nov 30;25(23):12905. doi: 10.3390/ijms252312905.

DOI: 10.3390/ijms252312905
PMID: 39684618
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11641503/

Abstract

Metabolomic data often present challenges due to high dimensionality, collinearity, and variability in metabolite concentrations. Machine learning (ML) application in metabolomic analyses is enabling the extraction of meaningful information from complex data. Bringing together domain-specific knowledge from metabolomics with explainable ML methods can refine the predictive performance and interpretability of models used in atherosclerosis research. In this work, we aimed to identify the most impactful metabolites associated with the presence of atherosclerotic cardiovascular disease (ASCVD) in cross-sectional case-control studies using explainable ML methods integrated with metabolomics domain knowledge.

For this, a subset from the FLEMENGHO cohort with metabolomic data available was used as the training cohort, including 63 patients with a history of ASCVD and 52 non-smoking controls matched by age, sex, and body mass index from the same population. First, Partial Least Squares Discriminant Analysis (PLS-DA) was applied for dimensionality reduction. The selected metabolites' correlations were analyzed by considering their chemical categorization. Then, eXtreme Gradient Boosting (XGBoost) was used to identify metabolites that characterize ASCVD. Next, the selected metabolites were evaluated in an external cohort to determine their effectiveness in distinguishing between cases and controls.

A total of 56 metabolites were selected for ASCVD discrimination using PLS-DA. The primary identified metabolites' superclasses included lipids, organic acids, and organic oxygen compounds. Upon integrating these metabolites with the XGBoost model, the classification yielded a test area under the curve (AUC) of 0.75. SHAP analyses ranked cholesterol, 3-methylhistidine, and glucuronic acid among the most impactful features and showed the diversity of metabolites considered for building the ASCVD discriminator. Also using XGBoost, the selected metabolites achieved an AUC of 0.93 in an independent external validation cohort.

In conclusion, the combination of different metabolites has the potential to build classifiers for ASCVD. Integrating metabolite categorization within the SHAP analysis further enhanced the interpretability of the model, offering insights into metabolite-specific contributions to ASCVD risk.
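
The abstract reports performance as an area under the ROC curve and interprets the model by relating SHAP contributions to each metabolite's chemical superclass. As a minimal sketch of those two ideas, and not the authors' code, the Python below computes AUC from its rank-based definition and sums the mean absolute per-feature attribution within each superclass. The function names, the toy scores, the attribution values, and the fourth feature are hypothetical. The superclass assigned to each named metabolite is an assumption made for illustration. In a real analysis the attributions would come from a SHAP explainer applied to the fitted XGBoost model.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """AUC as the probability that a random case outscores a random control.

    Ties count as half a win (Mann-Whitney U formulation).
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def mean_abs_by_group(attributions, feature_names, groups):
    """Sum each feature's mean absolute attribution within its group.

    attributions: list of rows, one row per sample, one value per feature.
    groups: dict mapping feature name -> group label (e.g. superclass).
    """
    n_samples = len(attributions)
    totals = defaultdict(float)
    for j, name in enumerate(feature_names):
        mean_abs = sum(abs(row[j]) for row in attributions) / n_samples
        totals[groups[name]] += mean_abs
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical toy data: 1 = ASCVD case, 0 = control.
    labels = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
    print("AUC:", roc_auc(labels, scores))

    # Hypothetical SHAP-like attributions (rows = samples, columns = features).
    features = ["cholesterol", "3-methylhistidine",
                "glucuronic acid", "other_lipid"]
    # Assumed superclass labels, for illustration only.
    superclass = {
        "cholesterol": "lipids",
        "3-methylhistidine": "organic acids",
        "glucuronic acid": "organic oxygen compounds",
        "other_lipid": "lipids",
    }
    attributions = [
        [0.5, -0.25, 0.125, 0.25],
        [-0.5, 0.25, -0.125, -0.25],
    ]
    print(mean_abs_by_group(attributions, features, superclass))
```

The pairwise loop is quadratic in sample size, which is acceptable for cohorts of around a hundred participants such as the one described. For larger data, a library routine such as scikit-learn's `roc_auc_score` would be the usual choice. Summing mean absolute attributions by superclass is one simple way to read feature importance at the level of chemical categories. The abstract does not say whether the authors aggregated in this way.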
