Department of Management Information Systems, College of Business, East Carolina University, USA.
Department of Management Information Systems, College of Business, East Carolina University, USA; Center for Healthcare Management Systems, College of Business, East Carolina University, USA; Big Data and Analytics Research Cluster, East Carolina University, USA.
J Biomed Inform. 2018 Oct;86:143-148. doi: 10.1016/j.jbi.2018.09.009. Epub 2018 Sep 18.
Readmission from inpatient rehabilitation facilities to acute care hospitals is a serious problem. This study aims to develop a predictive model based on machine learning algorithms to identify patients at high risk of readmission.
A retrospective dataset (2001-2017) covering 16,902 patients admitted to a large inpatient rehabilitation facility in North Carolina was collected in 2017. Three types of machine learning models with different predictors were compared in 2018. The model with the highest c-statistic was selected as the best model and further tested on five sets of training and validation data, each split at a different point in time. The optimum classification threshold was then identified.
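As an illustration of the evaluation described above, the following is a minimal sketch of how a c-statistic and a time-based split might be computed. It is not the authors' code: the function names, the `year` field, and the toy values are hypothetical, and the c-statistic is computed here by the standard pairwise-concordance definition (equivalent to the area under the ROC curve for a binary outcome).

```python
from typing import Dict, List, Sequence, Tuple


def c_statistic(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (readmitted, not readmitted) pairs in which the readmitted
    patient received the higher predicted risk; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def temporal_split(records: List[Dict], split_year: int) -> Tuple[List[Dict], List[Dict]]:
    """Train on admissions before split_year, validate on the rest.
    The 'year' key is a hypothetical field name used only for this sketch."""
    train = [r for r in records if r["year"] < split_year]
    valid = [r for r in records if r["year"] >= split_year]
    return train, valid


# Toy values, not data from the study.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_c = c_statistic(toy_labels, toy_scores)

toy_records = [{"year": 2011}, {"year": 2012}, {"year": 2013}]
toy_train, toy_valid = temporal_split(toy_records, split_year=2013)
```

Moving `split_year` forward enlarges the training set and shrinks the validation set, which is one way to produce several train/validation pairs with different split times.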
The logistic regression model using only functional independence measures had the highest validation c-statistic, at 0.852. Using this model to predict acute care readmissions in the most recent 5 years yielded high discriminative ability (c-statistics: 0.841-0.869). Larger training sets yielded better performance on the test data. The default cutoff (0.5) resulted in high specificity (>0.997) but low sensitivity (<0.07). The optimum threshold achieved a better balance between sensitivity (0.754-0.867) and specificity (0.747-0.780).
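The trade-off between the default cutoff and a tuned threshold can be sketched as follows. The abstract does not state which criterion was used to find the optimum threshold; this example assumes Youden's J (sensitivity + specificity - 1), a common choice, and uses hypothetical function names and toy values rather than study data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Classify as readmission when score >= threshold, then return
    (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Assumed criterion: pick the observed score that maximizes Youden's J."""
    best_t, best_j = None, float("-inf")
    for t in sorted(set(scores)):
        sens, spec = sensitivity_specificity(labels, scores, t)
        j = sens + spec - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t


# Toy values: readmissions are rare, so predicted risks rarely reach 0.5.
labels = [0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.02, 0.05, 0.08, 0.10, 0.12, 0.30, 0.15, 0.25]

default_sens, default_spec = sensitivity_specificity(labels, scores, 0.5)
tuned = youden_threshold(labels, scores)
tuned_sens, tuned_spec = sensitivity_specificity(labels, scores, tuned)
```

In this toy set no predicted risk reaches 0.5, so the default cutoff flags no one (perfect specificity, zero sensitivity), while the tuned threshold trades some specificity for sensitivity. That mirrors the pattern reported above for a rare outcome.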
This study demonstrates that functional independence measures can be analyzed with machine learning algorithms to predict acute care readmissions, which could improve the effectiveness of preventive medicine.