Division of Cardiology, Jinan Central Hospital, Cheeloo College of Medicine, Shandong University, No. 105 Jiefang Road, Jinan, 250013, China.
Heart Failure Center, The First Affiliated Hospital of USTC, University of Science and Technology of China, Hefei, China.
ESC Heart Fail. 2021 Dec;8(6):5363-5371. doi: 10.1002/ehf2.13627. Epub 2021 Sep 28.
Predicting the risk of malignant arrhythmias (MA) in hospitalized patients with heart failure (HF) is challenging. Machine learning (ML) can handle a large volume of complex data more effectively than traditional statistical methods. This study explored the feasibility of ML methods for predicting the risk of MA in hospitalized HF patients.
We evaluated the baseline data and MA events of 2794 hospitalized HF patients from the HF cohort in Anhui Province and randomly divided the study population into training and validation sets in a 7:3 ratio. Lasso-logistic regression, multivariate adaptive regression splines (MARS), classification and regression tree (CART), random forest (RF), and eXtreme gradient boosting (XGBoost) algorithms were used to construct risk prediction models in the training set, and model performance was verified in the validation set. The area under the receiver operating characteristic curve (AUC) and the Brier score were used to evaluate model discrimination and calibration, respectively. The clinical utility of the Lasso-logistic regression model was analysed using decision curve analysis (DCA).
The median (Q1, Q3) age of the study population was 70 (61, 77) years, and 39.5% were female. MA events occurred in 117 patients (4.2%) during hospitalization. In the training set (n = 1964), the AUC of the XGBoost model was 0.998 [95% confidence interval (CI) 0.997-1.000], higher than that of each of the other models (all P < 0.001). In the validation set (n = 830), the AUCs of Lasso-logistic model 1 [AUC: 0.867 (95% CI 0.819-0.915)], Lasso-logistic model 2 [AUC: 0.828 (95% CI 0.764-0.892)], the MARS model [AUC: 0.852 (95% CI 0.793-0.910)], the RF model [AUC: 0.804 (95% CI 0.726-0.881)], and the XGBoost model [AUC: 0.864 (95% CI 0.810-0.918)] did not differ significantly from one another (all P > 0.05), and each was higher than that of the CART model [AUC: 0.743 (95% CI 0.661-0.824); all P < 0.05]. Brier scores for all prediction models were less than 0.05. DCA results showed that the Lasso-logistic model had a net clinical benefit. Oral antiarrhythmic drug use, left bundle branch block, serum magnesium, d-dimer, and random blood glucose were significant predictors in half or more of the models.
These findings suggest that ML models based on the Lasso-logistic regression, MARS, RF, and XGBoost algorithms can effectively predict the risk of MA in hospitalized HF patients. The Lasso-logistic model had better clinical interpretability and ease of use than the other models.
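The three evaluation measures named in the abstract (AUC for discrimination, Brier score for calibration, and the net benefit plotted in DCA) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy labels and probabilities, and the 0.25 threshold are assumptions made purely for illustration, and the formulas are the standard textbook definitions rather than anything specified in the study.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Rank-based AUC: the probability that a randomly chosen event case
    receives a higher score than a randomly chosen non-event case
    (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit at one threshold probability, as used in decision curve
    analysis: TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical toy data (NOT from the study): 1 = MA event, 0 = no event.
toy_labels = [0, 0, 0, 0, 1, 0, 1, 1]
toy_probs = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.70, 0.90]

if __name__ == "__main__":
    print("AUC:", auc(toy_labels, toy_probs))
    print("Brier score:", brier_score(toy_labels, toy_probs))
    print("Net benefit at 0.25:", net_benefit(toy_labels, toy_probs, 0.25))
```

One point worth keeping in mind when reading a Brier score is its dependence on the event rate. A constant prediction equal to the prevalence p has a Brier score of p(1 - p), which for the 117 events in 2794 patients reported here is about 0.040, so reported scores are most informative when compared against that reference value.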