Use machine learning models to identify and assess risk factors for coronary artery disease.

Affiliations

School of Management, Jinan University, Guangzhou, China.

The Second Affiliated Hospital of Xinxiang Medical University, Xinxiang, China.

Publication information

PLoS One. 2024 Sep 6;19(9):e0307952. doi: 10.1371/journal.pone.0307952. eCollection 2024.

Abstract

Accurate prediction of coronary artery disease (CAD) is crucial for early clinical diagnosis and for tailoring personalized treatment. This study aims to construct a machine learning (ML) model for predicting CAD risk and to examine the complex nonlinear relationships between the disease and its risk factors. Using the Z-Alizadeh Sani dataset, which includes records of 303 patients, univariate analysis and the Boruta algorithm were applied for feature selection, and nine ML techniques were then used to build predictive models. To investigate the pathogenesis of CAD, Shapley values were combined with generalized additive models for curve fitting to probe the nonlinear relationships between the disease and its risk factors. A piecewise linear regression model was then used to locate inflection points in these nonlinear relationships. Logistic regression (LR) was the best-performing model, achieving an Area Under the Receiver Operating Characteristic curve (AUROC) of 0.981 (95% CI: 0.952-1) and an Area Under the Precision-Recall Curve (AUPRC) of 0.993. The 14 most important features were used to construct a dynamic nomogram. Analysis of the Shapley smoothing curves revealed "S"-shaped and "C"-shaped relationships linking age and triglycerides to CAD, respectively. In summary, machine learning models could provide valuable insights for the early diagnosis of CAD, and the SHAP method may provide a personalized assessment of the relationship between CAD and its risk factors.
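The abstract mentions piecewise linear regression for locating inflection points but does not describe the procedure. One common way to do this is sketched below: a continuous two-segment ("hinge") model y = b0 + b1*x + b2*max(0, x - c) is fitted by least squares for each candidate breakpoint c, and the c with the smallest residual sum of squares is kept. The function names, the grid search over observed x values, and the data are all assumptions made for illustration; the data are synthetic and are not taken from the Z-Alizadeh Sani dataset or from the paper's results.

```python
from typing import List, Sequence, Tuple


def solve_3x3(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    sol = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][k] * sol[k] for k in range(i + 1, n))
        sol[i] = (m[i][n] - tail) / m[i][i]
    return sol


def fit_hinge(xs: Sequence[float], ys: Sequence[float],
              c: float) -> Tuple[List[float], float]:
    """Least-squares fit of y = b0 + b1*x + b2*max(0, x - c).

    Returns the coefficients [b0, b1, b2] and the residual sum of squares.
    """
    rows = [[1.0, x, max(0.0, x - c)] for x in xs]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    coefs = solve_3x3(xtx, xty)
    sse = sum(
        (y - sum(b * v for b, v in zip(coefs, r))) ** 2
        for r, y in zip(rows, ys)
    )
    return coefs, sse


def find_inflection(xs: Sequence[float],
                    ys: Sequence[float]) -> Tuple[float, List[float], float]:
    """Grid-search the breakpoint over interior observed x values."""
    candidates = sorted(set(xs))[1:-1]  # exclude the two extremes
    if not candidates:
        raise ValueError("need at least three distinct x values")
    best = None
    for c in candidates:
        coefs, sse = fit_hinge(xs, ys, c)
        if best is None or sse < best[2]:
            best = (c, coefs, sse)
    return best


# Synthetic illustration only: a flat effect up to x = 55, then a linear rise.
xs = [float(a) for a in range(30, 85, 5)]
ys = [0.1 * max(0.0, x - 55.0) for x in xs]
knot, coefs, sse = find_inflection(xs, ys)
print(f"estimated breakpoint: {knot}, slope change: {coefs[2]:.3f}")
```

In a setting like the one the abstract describes, x would be a risk factor such as age and y its Shapley value; a finer candidate grid or a dedicated segmented-regression routine would be a natural refinement of this sketch.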

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b2cd/11379138/cde6625e9ba4/pone.0307952.g001.jpg
