König Sebastian, Pellissier Vincent, Hohenstein Sven, Leiner Johannes, Meier-Hellmann Andreas, Kuhlen Ralf, Hindricks Gerhard, Bollmann Andreas
Heart Center Leipzig at University of Leipzig, Department of Electrophysiology, Strümpellstraße 39, 04289 Leipzig, Germany.
Leipzig Heart Institute, Leipzig, Germany.
Eur Heart J Digit Health. 2022 Mar 31;3(2):307-310. doi: 10.1093/ehjdh/ztac012. eCollection 2022 Jun.
Utilizing administrative data may facilitate risk prediction in heart failure inpatients. In this short report, we present different machine learning models that predict in-hospital mortality on an individual basis utilizing this widely available data source.
Inpatient cases with a main discharge diagnosis of heart failure, hospitalized between 1 January 2016 and 31 December 2018 in one of 86 German Helios hospitals, were examined. Comorbidities were defined by ICD-10 codes from administrative data. The data set was randomly split 75%/25% into portions for model development and testing. Five algorithms were evaluated: logistic regression [generalized linear model (GLM)], random forest (RF), gradient boosting machine (GBM), single-layer neural network (NNET), and extreme gradient boosting (XGBoost). After model tuning, the areas under the receiver operating characteristic curve (ROC AUCs) were calculated and compared using DeLong's test. A total of 59 074 inpatient cases (mean age 77.6 ± 11.1 years, 51.9% female, 89.4% NYHA Class III/IV) were included, and in-hospital mortality was 6.2%. In the test data set, calculated ROC AUCs were 0.853 [95% confidence interval (CI) 0.842-0.863] for GLM, 0.851 (95% CI 0.840-0.862) for RF, 0.855 (95% CI 0.844-0.865) for GBM, 0.836 (95% CI 0.823-0.849) for NNET, and 0.856 (95% CI 0.846-0.867) for XGBoost. XGBoost outperformed all models except GBM.
Machine learning-based processing of administrative data enables the creation of well-performing prediction models for in-hospital mortality in heart failure patients.
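The evaluation step described above (a random 75%/25% split, ROC AUC on the held-out portion, and DeLong's test to compare two models scored on the same cases) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the data are simulated, the two "models" are hypothetical noisy risk scores rather than fitted GLM or XGBoost models, and all function names (`roc_auc`, `delong_test`, `demo`) are assumptions introduced here.

```python
import math
import random


def _psi(pos_score, neg_score):
    """Mann-Whitney kernel: 1 if the death case ranks higher, 0.5 on ties."""
    if pos_score > neg_score:
        return 1.0
    return 0.5 if pos_score == neg_score else 0.0


def _components(scores, labels):
    """Per-case structural components (V10 for deaths, V01 for survivors)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    v10 = [sum(_psi(p, q) for q in neg) / len(neg) for p in pos]
    v01 = [sum(_psi(p, q) for p in pos) / len(pos) for q in neg]
    return v10, v01


def _cov(a, b):
    """Sample covariance with an (n - 1) denominator."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def roc_auc(scores, labels):
    """ROC AUC as the mean of the V10 components."""
    v10, _ = _components(scores, labels)
    return sum(v10) / len(v10)


def delong_test(scores_a, scores_b, labels):
    """Two-sided DeLong test for two correlated AUCs on the same cases.

    Returns (auc_a, auc_b, z, p_value).
    """
    v10a, v01a = _components(scores_a, labels)
    v10b, v01b = _components(scores_b, labels)
    auc_a, auc_b = sum(v10a) / len(v10a), sum(v10b) / len(v10b)
    var = (
        (_cov(v10a, v10a) + _cov(v10b, v10b) - 2 * _cov(v10a, v10b)) / len(v10a)
        + (_cov(v01a, v01a) + _cov(v01b, v01b) - 2 * _cov(v01a, v01b)) / len(v01a)
    )
    if var <= 0:  # identical rankings: no evidence of a difference
        return auc_a, auc_b, 0.0, 1.0
    z = (auc_a - auc_b) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return auc_a, auc_b, z, p_value


def demo(n_cases=2000, seed=0):
    """Simulated cohort; the scores stand in for predictions of two models."""
    rng = random.Random(seed)
    risk = [rng.random() for _ in range(n_cases)]
    # Simulated mortality of roughly 6% on average (assumption for illustration).
    died = [1 if rng.random() < 0.124 * r else 0 for r in risk]
    score_a = [r + rng.gauss(0, 0.1) for r in risk]  # hypothetical better model
    score_b = [r + rng.gauss(0, 0.5) for r in risk]  # hypothetical noisier model

    # Random 75%/25% split; the 75% portion would be used for fitting and
    # tuning, which is omitted here. Only the 25% test portion is evaluated.
    idx = list(range(n_cases))
    rng.shuffle(idx)
    test_idx = idx[int(0.75 * n_cases):]
    y = [died[i] for i in test_idx]
    a = [score_a[i] for i in test_idx]
    b = [score_b[i] for i in test_idx]
    return delong_test(a, b, y)


if __name__ == "__main__":
    auc_a, auc_b, z, p = demo()
    print(f"AUC A = {auc_a:.3f}, AUC B = {auc_b:.3f}, z = {z:.2f}, p = {p:.4f}")
```

The pairwise kernel makes this sketch quadratic in the number of test cases, which is acceptable for a small illustration; at the scale of the cohort reported here, a rank-based formulation or an established statistics package would be the more practical choice.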