

A systematic comparison of short-term and long-term mortality prediction in acute myocardial infarction using machine learning models.

Author information

Yang Yawei, Tang Junjie, Ma Liping, Wu Feng, Guan Xiaoqing

Affiliations

Department of Cardiology, Yueyang Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai University of Traditional Chinese Medicine, Shanghai, 200437, China.

Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, 201203, China.

Publication information

BMC Med Inform Decis Mak. 2025 Jun 5;25(1):208. doi: 10.1186/s12911-025-03052-1.

Abstract

BACKGROUND AND OBJECTIVE

Machine learning (ML) models for acute myocardial infarction (AMI) are considered to predict mortality better than conventional risk scoring models. However, previous ML prediction models have mostly been short-term models (1 year or less). Here, we established ML models for long-term prediction of AMI mortality (5 years or 10 years) and systematically compared the predictive capabilities of short-term and long-term models across varying survival time periods.

METHODS

An observational retrospective study was conducted to analyse mortality prediction in patients with varying survival times. A total of 4,173 patients were enrolled from two hospitals in China. Based on survival duration, the dataset was allocated into three groups and an external test set: the 1-year group (n = 3,626), the 5-year group (n = 2,102), the 10-year group (n = 721), and the external test set (n = 545). A set of 53 variables was collected and used for model development. Mortality prediction was analysed using oversampling and feature selection methods coupled with machine learning algorithms. SHapley Additive exPlanations (SHAP) values were used to quantify feature importance for AMI risk. The best-performing model from each group was then selected for a systematic comparison of predictive accuracy on the external test set, whose patients had more than 10 years of follow-up but varying survival times.
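The abstract does not say which oversampling method was used, so the following is only a minimal sketch of one common option, random oversampling of the minority class. The function name `random_oversample`, the toy rows (age, Killip class), and the labels are hypothetical and not taken from the study.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw extra copies (with replacement) only for under-represented classes.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: (age, Killip class); label 1 = died within the horizon.
rows = [(54, 1), (61, 1), (58, 2), (66, 1), (79, 3), (83, 4)]
labels = [0, 0, 0, 0, 1, 1]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

In practice, oversampling is usually applied to the training split only, so that duplicated rows cannot leak into the data used for evaluation.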

RESULTS

For the 1-year group, the RF model achieved the best performance on the test dataset, with an F1 score of 97.81% using oversampling only, without feature selection. In the 5-year group, the combination of LASSO and RF performed best, achieving an F1 score of 91.35% with both feature selection and oversampling. The best model for the 10-year group was SVM with oversampling only, without feature selection, yielding an F1 score of 80.7%. Age, BNP, and the Killip classification of AMI were consistently identified as robust predictors across all three groups, which underscores aging as a critical AMI risk factor contributing to mortality. However, despite its strong test performance, the 1-year model failed to distinguish among the 545 patients of the external test set, of whom 64% survived beyond 5 years and 37% beyond 10 years: it predicted all of them as low risk. This highlights a limitation of short-term models, which are commonly used in AMI mortality prediction but cannot accurately predict actual long-term survival times.
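To illustrate why a model that labels every patient as low risk can look correct at a 1-year horizon yet carry no information about longer horizons, here is a small worked sketch. The survival times are invented for illustration, and `died_within`, `f1_score`, and `accuracy` are hypothetical helper names, not the authors' code.

```python
def died_within(survival_years, horizon_years):
    """Label 1 if the patient died before the horizon, else 0."""
    return [1 if s < horizon_years else 0 for s in survival_years]


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical survival times (years); every patient outlives the 1-year horizon.
survival_years = [1.5, 3, 7, 12, 15, 4, 11, 8]

# A prediction that flags nobody as high risk.
all_low_risk = [0] * len(survival_years)

labels_1y = died_within(survival_years, 1)
labels_5y = died_within(survival_years, 5)
labels_10y = died_within(survival_years, 10)

acc_1y = accuracy(labels_1y, all_low_risk)   # correct at the 1-year horizon
f1_5y = f1_score(labels_5y, all_low_risk)    # misses every death before 5 years
f1_10y = f1_score(labels_10y, all_low_risk)  # misses every death before 10 years
print(acc_1y, f1_5y, f1_10y)
```

With these toy values, the all-low-risk prediction is fully accurate against the 1-year labels but identifies none of the deaths that occur before the 5-year or 10-year horizons.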

CONCLUSIONS

The study identifies age, BNP, and Killip classification as consistent predictors of AMI mortality across all groups, with age being the most significant factor. CBC parameters and renal biomarkers were pivotal in short-term models, while therapeutic interventions gained prominence over time. The 10-year group emphasised disease severity and treatment history, indicating survivorship bias. Short-term models, which typically rely on data spanning 1 year or less and are the most commonly established predictive models for AMI risk, show limited capability in predicting actual long-term survival times. To issue effective early warnings of genuine long-term mortality risk, it is imperative to collect longer-term patient information and establish ML prediction models tailored to long-term outcomes. Further research is warranted to validate these findings in diverse populations.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cebb/12143097/ea843379ee78/12911_2025_3052_Fig1_HTML.jpg
