State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China.
Ping An Healthcare and Technology, Beijing, China.
J Med Internet Res. 2024 Jul 30;26:e50067. doi: 10.2196/50067.
Machine learning (ML) risk prediction models, although much more accurate than traditional statistical methods, are inconvenient to use in clinical practice because they lack transparency and require a large number of input variables.
We aimed to develop a precise, explainable, and flexible ML model to predict the risk of in-hospital mortality in patients with ST-segment elevation myocardial infarction (STEMI).
This study included 18,744 patients enrolled in the 2013 China Acute Myocardial Infarction (CAMI) registry and 12,018 patients from the China Patient-Centered Evaluative Assessment of Cardiac Events (PEACE)-Retrospective Acute Myocardial Infarction Study. The Extreme Gradient Boosting (XGBoost) model was derived from 9616 patients in the CAMI registry (2014, 89 variables) with 5-fold cross-validation and validated on both 9125 patients in the CAMI registry (89 variables) and the independent China PEACE cohort (10 variables). The Shapley Additive Explanations (SHAP) approach was used to interpret the complex relationships embedded in the proposed model.
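The SHAP approach attributes each prediction to its input variables using Shapley values. As a minimal sketch of that idea (not the study's model or code), the following computes exact Shapley values for a single prediction by enumerating feature coalitions. The function `toy_risk`, its coefficients, and the patient and reference values are hypothetical and chosen only for illustration; in practice the SHAP library uses faster algorithms for tree models such as XGBoost.

```python
from itertools import combinations
from math import factorial


def exact_shapley(predict, x, baseline):
    """Exact Shapley values for one prediction.

    Features outside a coalition are set to their baseline value.
    Cost grows as 2^n, so this is only feasible for a handful of features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(z):
    """Hypothetical risk score with an age x Killip interaction (illustration only)."""
    age, lvef, killip = z
    return 0.02 * age - 0.01 * lvef + 0.1 * killip + 0.001 * age * killip


patient = [80.0, 35.0, 3.0]    # hypothetical: age, LVEF (%), Killip class
reference = [60.0, 55.0, 1.0]  # hypothetical reference patient
phi = exact_shapley(toy_risk, patient, reference)
```

The attributions sum to the difference between the patient's score and the reference score, which is the additivity property that gives SHAP its name. A variable that enters the score only additively (here, LVEF) receives exactly its own contribution.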
In the XGBoost model for predicting all-cause in-hospital mortality, the 8 variables with the highest importance scores were age, left ventricular ejection fraction, Killip class, heart rate, creatinine, blood glucose, white blood cell count, and use of angiotensin-converting enzyme inhibitors (ACEIs) or angiotensin II receptor blockers (ARBs). The area under the curve (AUC) on the CAMI validation set was 0.896 (95% CI 0.884-0.909), significantly higher than that of previous models: the AUC was 0.809 (95% CI 0.790-0.828) for the Global Registry of Acute Coronary Events (GRACE) model and 0.782 (95% CI 0.763-0.800) for the Thrombolysis in Myocardial Infarction (TIMI) model. Although the China PEACE validation set had only 10 available variables, the AUC reached 0.840 (95% CI 0.829-0.852), a substantial improvement over the GRACE (0.762, 95% CI 0.748-0.776) and TIMI (0.789, 95% CI 0.776-0.803) scores. Several novel and nonlinear relationships were discovered between patients' characteristics and in-hospital mortality, including a U-shaped pattern for high-density lipoprotein cholesterol (HDL-C).
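For readers unfamiliar with how an AUC and its 95% CI can be obtained, the sketch below computes the AUC as the probability that a randomly chosen death is assigned a higher predicted risk than a randomly chosen survivor, and estimates an interval with a percentile bootstrap. The abstract does not state how the reported intervals were computed, so the bootstrap is an assumption, and the labels and scores are made-up toy data.

```python
import random


def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5.

    Uses a simple pairwise comparison, which is quadratic in sample size.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, for illustration)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data: 1 = died in hospital, 0 = survived; scores are predicted risks.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.35, 0.6, 0.7, 0.8]
point_estimate = auc(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
```

With only eight toy observations the interval is very wide; the narrow intervals reported above reflect validation sets of several thousand patients.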
The proposed ML risk prediction model was highly accurate in predicting in-hospital mortality. Its flexibility and explainability make the model convenient to use in clinical practice, and it could help guide patient management.
ClinicalTrials.gov NCT01874691; https://clinicaltrials.gov/study/NCT01874691.