Department of Computer Science and Software Engineering, The University of Western Australia, Perth, Australia.
Discipline of Information Technology, Mathematics & Statistics, Murdoch University, Perth, Australia.
PLoS One. 2019 Jun 26;14(6):e0218760. doi: 10.1371/journal.pone.0218760. eCollection 2019.
Predicting readmission or death after hospital discharge for heart failure (HF) remains a major challenge. Modern healthcare systems, electronic health records, and machine learning (ML) techniques make it possible to mine data for the most significant variables, reducing the number of variables without compromising the performance of models that predict readmission and death. Moreover, ML methods that transform the variables may further improve performance.
To use ML techniques to identify the most relevant variables, and to transform variables, for the prediction of 30-day readmission or death in HF patients.
We identified all Western Australian patients aged 65 years and above admitted for HF between 2003 and 2008 in linked administrative data. We evaluated variables associated with HF readmission or death using standard statistical and ML-based selection techniques. We also tested new variables produced by transforming the original variables. We developed multi-layer perceptron prediction models and compared their predictive performance using the area under the receiver operating characteristic curve (AUC), sensitivity, and specificity.
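The three comparison metrics named above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions chosen for illustration. AUC is computed here as the fraction of (event, non-event) pairs in which the event case receives the higher predicted score, with ties counted as one half.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at an assumed decision threshold."""
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one event and one non-event")
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the share of (event, non-event) pairs ranked correctly."""
    pos = [s for label, s in zip(y_true, y_score) if label == 1]
    neg = [s for label, s in zip(y_true, y_score) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = readmitted or died within 30 days.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_score = [0.9, 0.2, 0.6, 0.4, 0.7, 0.3, 0.1, 0.5]

sens, spec = sensitivity_specificity(y_true, y_score)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc(y_true, y_score):.2f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is one reason it is commonly used as the headline figure when comparing models.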
In our cohort of 10,757 HF patients, 23.7% were readmitted or died within 30 days of hospital discharge. The prediction model we developed using a smaller set of variables (n = 8) performed comparably (AUC 0.62) to the traditional model (n = 47, AUC 0.62). Transforming the original 47 variables further improved the performance of the predictive model (AUC 0.66, p < 0.001).
A small set of variables selected using ML matched the performance of the model that used the full set of 47 variables for predicting 30-day readmission or death in HF patients. Model performance can be significantly improved further by transforming the original variables using ML methods.