Enhanced cardiovascular risk prediction in the Western Pacific: A machine learning approach tailored to the Malaysian population.

Authors

Kasim Sazzli, Amir Rudin Putri Nur Fatin, Malek Sorayya, Ibrahim Nurulain, Kiew Xue Ning, Nasir Nafiza Mat, Ibrahim Khairul Shafiq, Raja Shariff Raja Ezman

Affiliations

Cardiology Department, Faculty of Medicine, Universiti Teknologi MARA (UiTM), Shah Alam, Malaysia.

Cardiac Vascular and Lung Research Institute, Universiti Teknologi MARA (UiTM), Shah Alam, Malaysia.

Publication information

PLoS One. 2025 Jun 17;20(6):e0323949. doi: 10.1371/journal.pone.0323949. eCollection 2025.

DOI:10.1371/journal.pone.0323949

PMID:40526616

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12173414/

Abstract

BACKGROUND

Cardiovascular disease (CVD) is a significant public health challenge in the Western Pacific region, including Malaysia.

OBJECTIVE

This study aimed to develop and validate machine learning (ML) models to predict 10-year CVD risk in a Malaysian cohort, which could serve as a model for other Asian populations with similar genetic and environmental backgrounds.

METHODS

Using data from the REDISCOVER Registry (5,688 participants from 2007 to 2017), 30 clinically relevant features were selected, and several ML algorithms were trained: Support Vector Machine (SVM), Logistic Regression (LR), Random Forest (RF), Extreme Gradient Boosting (XGBoost), Neural Network (NN) and Naive Bayes (NB). Ensemble models were also created using three commonly used meta-learners: RF, Generalized Linear Model (GLM), and Gradient Boosting Model (GBM). The dataset was split into training and test sets at a 70:30 ratio, with 5-fold cross-validation to ensure robust performance. Model evaluation was based primarily on the Area Under the Curve (AUC), with additional metrics such as sensitivity, specificity, and the Net Reclassification Index (NRI) used to compare the ML models against traditional risk scores such as the Framingham Risk Score (FRS) and the Revised Pooled Cohort Equations (RPCE).
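The evaluation protocol described above (a 70:30 split, 5-fold cross-validation, and AUC as the primary metric) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the function names `split_indices`, `k_folds`, and `auc` are hypothetical, the split is a plain random shuffle (the abstract does not say whether it was stratified), and applying the folds inside the training portion is an assumption.

```python
import random

def split_indices(n, test_fraction=0.30, seed=0):
    """Shuffle 0..n-1 and return (train, test) index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]

def k_folds(indices, k=5):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validate = folds[i]
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, validate

def auc(labels, scores):
    """AUC as the probability that a randomly chosen event is scored
    higher than a randomly chosen non-event (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# 5,688 is the cohort size reported in the abstract; the indices are placeholders.
train, test = split_indices(5688)
folds = list(k_folds(train))

# Toy labels and scores for illustration only, not study data.
toy_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise AUC is quadratic in sample size, which is acceptable for a sketch; a rank-based implementation from an established library would be the practical choice for a cohort of this size.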

RESULTS

The LR model achieved the highest AUC of 0.77, outperforming the FRS (AUC = 0.72) and RPCE (AUC = 0.74). The ensemble model provided robust performance, though it did not significantly exceed the best individual model. SHAP (SHapley Additive exPlanations) analysis identified key predictors such as systolic blood pressure, weight and waist circumference. The study showed a significant NRI of 13.15% relative to the FRS and 7.00% relative to the RPCE, highlighting the potential of ML approaches to enhance CVD risk prediction in Malaysia. The best-performing model was deployed on a web platform for real-time use, to support ongoing validation and clinical applicability.
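To make the NRI figures concrete, the following Python sketch computes a categorical net reclassification index for a new model against a reference score. It assumes the common category-based definition (net upward movement among events plus net downward movement among non-events); the abstract does not state which NRI variant or risk thresholds were used. The function name `categorical_nri` and the toy data are hypothetical and are not taken from the study.

```python
def categorical_nri(events, old_cat, new_cat):
    """Categorical NRI of a new model versus a reference score.

    events:  1 if the participant had a CVD event, else 0
    old_cat: ordinal risk category under the reference score (e.g. 0=low, 1=mid, 2=high)
    new_cat: ordinal risk category under the new model
    """
    up_e = down_e = n_e = 0   # reclassification counts among events
    up_n = down_n = n_n = 0   # reclassification counts among non-events
    for y, old, new in zip(events, old_cat, new_cat):
        if y == 1:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_n += 1
            up_n += new > old
            down_n += new < old
    if n_e == 0 or n_n == 0:
        raise ValueError("NRI needs at least one event and one non-event")
    nri_events = (up_e - down_e) / n_e        # events should move up
    nri_nonevents = (down_n - up_n) / n_n     # non-events should move down
    return nri_events + nri_nonevents

# Toy data for illustration only.
events  = [1, 1, 1, 1, 0, 0, 0, 0]
old_cat = [0, 1, 1, 2, 1, 1, 0, 2]
new_cat = [1, 2, 1, 1, 0, 1, 1, 1]
nri = categorical_nri(events, old_cat, new_cat)
```

In this toy case two of four events move up and one moves down (net 0.25), while two of four non-events move down and one moves up (net 0.25), giving an NRI of 0.5. A positive value means the new model reclassifies participants in the correct direction more often than the reference score does.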

CONCLUSIONS

These findings underscore the effectiveness of ML models in improving CVD risk stratification and decision-making in Malaysia and beyond.
