Kou Yanqi, Ye Shicai, Tian Yuan, Yang Ke, Qin Ling, Huang Zhe, Luo Botao, Ha Yanping, Zhan Liping, Ye Ruyin, Huang Yujie, Zhang Qing, He Kun, Liang Mouji, Zheng Jieming, Huang Haoyuan, Wu Chunyi, Ge Lei, Yang Yuping
Department of Gastroenterology, Affiliated Hospital of Guangdong Medical University, Zhanjiang, China.
Department of Pathology, Guangdong Medical University, Zhanjiang, China.
J Med Internet Res. 2025 Jan 30;27:e67346. doi: 10.2196/67346.
Gastrointestinal bleeding (GIB) is a severe and potentially life-threatening complication in patients with acute myocardial infarction (AMI), significantly affecting prognosis during hospitalization. Early identification of high-risk patients is essential to reduce complications, improve outcomes, and guide clinical decision-making.
This study aimed to develop and validate a machine learning (ML)-based model for predicting in-hospital GIB in patients with AMI, identify key risk factors, and evaluate the clinical applicability of the model for risk stratification and decision support.
A multicenter retrospective cohort study was conducted, including 1910 patients with AMI from the Affiliated Hospital of Guangdong Medical University (2005-2024). Patients were divided into training (n=1575) and testing (n=335) cohorts based on admission dates. For external validation, 1746 patients with AMI were drawn from the publicly available MIMIC-IV (Medical Information Mart for Intensive Care IV) database. Propensity score matching was used to adjust for demographics, and the Boruta algorithm identified key predictors. A total of 7 ML algorithms (logistic regression, k-nearest neighbors, support vector machine, decision tree, random forest [RF], extreme gradient boosting, and neural networks) were trained using 10-fold cross-validation. The models were evaluated using the area under the receiver operating characteristic curve, accuracy, sensitivity, specificity, recall, F-score, and decision curve analysis. Shapley additive explanations analysis ranked variable importance. Kaplan-Meier survival analysis evaluated the impact of GIB on short-term survival. Multivariate logistic regression assessed the relationship between coronary heart disease (CHD) and in-hospital GIB after adjusting for clinical variables.
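The evaluation metrics named above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy labels and scores, and the 0.5 threshold are all assumptions made for illustration, and only the Python standard library is used. The net benefit formula is the standard one from decision curve analysis, TP/n - (FP/n) * pt/(1 - pt), where pt is the threshold probability.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float
) -> Dict[str, float]:
    """Confusion-matrix metrics and decision-curve net benefit when a case
    is called positive if its predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    n = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / n,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,  # same quantity as recall
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
        "net_benefit": tp / n - (fp / n) * threshold / (1.0 - threshold),
    }


# Hypothetical toy data: 1 = bleeding event, 0 = no event; scores are predicted risks.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.1, 0.6]

auc = roc_auc(y_true, scores)
metrics = threshold_metrics(y_true, scores, threshold=0.5)
print(f"AUC = {auc:.4f}")
for name, value in metrics.items():
    print(f"{name} = {value:.3f}")
```

In this sketch sensitivity and recall are the same quantity, so a single entry covers both. A decision curve would repeat the net benefit calculation over a range of threshold probabilities and compare the model against the treat-all and treat-none strategies.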
The RF model outperformed the other ML models, achieving an area under the receiver operating characteristic curve of 0.77 in the training cohort, 0.77 in the testing cohort, and 0.75 in the validation cohort. Key predictors included red blood cell count, hemoglobin, maximal myoglobin, hematocrit, CHD, and other variables, all of which were strongly associated with GIB risk. Decision curve analysis demonstrated the clinical utility of the RF model for early risk stratification. Kaplan-Meier survival analysis showed no significant differences in 7- and 15-day survival rates between patients with AMI with and without GIB (P=.83 for 7-day survival and P=.87 for 15-day survival). Multivariate logistic regression showed that CHD was an independent risk factor for in-hospital GIB (odds ratio 2.79, 95% CI 2.09-3.74). Stratified analyses by sex, age, occupation, marital status, and other subgroups consistently showed that the association between CHD and GIB remained robust across all subgroups.
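As a reading aid for the reported odds ratio, the following sketch back-calculates the log-odds coefficient and its standard error from the published point estimate and 95% CI. It assumes the interval is a Wald interval that is symmetric on the log scale, which the abstract does not state; the function name is hypothetical.

```python
import math
from typing import Tuple


def log_odds_summary(
    odds_ratio: float, ci_low: float, ci_high: float, z_crit: float = 1.959964
) -> Tuple[float, float, float]:
    """Recover the logistic regression coefficient, its standard error, and
    the Wald z statistic from an odds ratio and its 95% CI.
    Assumes a Wald interval: exp(beta -/+ z_crit * se)."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    return beta, se, beta / se


# Values reported in the abstract for CHD and in-hospital GIB.
beta, se, wald_z = log_odds_summary(2.79, 2.09, 3.74)
print(f"beta = {beta:.3f}, SE = {se:.3f}, Wald z = {wald_z:.2f}")

# Consistency check: for a Wald interval, the geometric mean of the CI
# bounds should be close to the reported odds ratio.
print(f"geometric mean of CI bounds = {math.sqrt(2.09 * 3.74):.2f}")
```

Under that assumption the coefficient is about 1.03 on the log-odds scale with a standard error of about 0.15, and the geometric mean of the interval bounds (about 2.80) sits close to the reported 2.79, which is consistent with a log-symmetric interval.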
The ML-based RF model provides a robust and clinically applicable tool for predicting in-hospital GIB in patients with AMI. By leveraging routinely available clinical and laboratory data, the model supports early risk stratification and personalized preventive strategies.