Wu Chih-Ying, Hsu Chien-Ning, Wang Charlotte, Chien Jung-Yien, Wang Chi-Chuan, Lin Fang-Ju
Graduate Institute of Clinical Pharmacy, College of Medicine, National Taiwan University, Taipei, Taiwan.
School of Pharmacy, College of Pharmacy, Kaohsiung Medical University, Kaohsiung, Taiwan.
ERJ Open Res. 2025 May 12;11(3). doi: 10.1183/23120541.00651-2024. eCollection 2025 May.
Early readmission and death are critical adverse outcomes following hospitalisation due to exacerbation of chronic obstructive pulmonary disease (ECOPD). This study aimed to develop and validate machine learning models to enhance the prediction of these outcomes after ECOPD hospitalisation.
Data from the index ECOPD hospitalisation and the preceding year were collected from a nationwide database. Prediction models for 30-day readmission and death were developed using logistic lasso regression, random forest, extreme gradient boosting (XGBoost) and a neural network, with the LACE index serving as a reference. Model performance was assessed in the validation dataset using receiver operating characteristic (ROC) curves and calibration plots. Key predictors were identified using SHapley Additive exPlanations.
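For readers unfamiliar with the reference score, the LACE index combines Length of stay, Acuity of admission, Comorbidity (Charlson index) and Emergency department visits in the previous 6 months into a 0-19 point total. The following is a minimal sketch of the commonly cited point scheme; the abstract does not state which variant or input definitions the authors used, so the function name, argument names and cut-offs below are assumptions for illustration only.

```python
def lace_score(length_of_stay_days, acute_admission, charlson_index, ed_visits_6mo):
    """Illustrative LACE index (0-19), assuming the commonly cited point scheme.

    All argument names are hypothetical; they are not taken from the study.
    """
    # L: length of stay in days (<1 -> 0, 1-3 -> same as days, 4-6 -> 4, 7-13 -> 5, >=14 -> 7)
    if length_of_stay_days < 1:
        l_points = 0
    elif length_of_stay_days <= 3:
        l_points = int(length_of_stay_days)
    elif length_of_stay_days <= 6:
        l_points = 4
    elif length_of_stay_days <= 13:
        l_points = 5
    else:
        l_points = 7

    # A: acute (emergent) admission scores 3 points
    a_points = 3 if acute_admission else 0

    # C: Charlson comorbidity index (0-3 -> same value, >=4 -> 5)
    c_points = charlson_index if charlson_index <= 3 else 5

    # E: emergency department visits in the previous 6 months, capped at 4
    e_points = min(ed_visits_6mo, 4)

    return l_points + a_points + c_points + e_points


# Hypothetical patient: 5-day acute admission, Charlson index 2, one prior ED visit
example = lace_score(5, True, 2, 1)
```

Because the score is a fixed sum of four coarse components, it cannot use laboratory values or prior-year utilisation detail, which is one plausible reason a learned model can discriminate better.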
The study included 101 011 hospitalisations in the development dataset and 17 565 in the validation dataset. The rates of 30-day readmission and death were 29.1% and 4.3%, respectively. XGBoost outperformed other models, achieving an area under the ROC curve of 0.721 (95% CI 0.713-0.729) for readmission and 0.809 (95% CI 0.794-0.824) for death, both exceeding the corresponding values for the LACE index (0.651 and 0.641). All machine learning models demonstrated good calibration. The number of hospitalisations in the previous year and the lowest haemoglobin level during the index hospitalisation were the top predictors of readmission and death, respectively.
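As an illustration of how an area under the ROC curve and its 95% CI can be obtained from validation-set predictions, the sketch below computes the AUC through the rank-sum (Mann-Whitney) identity and a percentile bootstrap interval. The abstract does not say how the authors derived their intervals, so the bootstrap, the function names and the toy data are all assumptions.

```python
import random


def roc_auc(labels, scores):
    """AUC via the rank-sum identity, using average ranks for tied scores."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(1 for _, y in pairs if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both outcome classes are required")
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied scores occupy ranks i+1..j and share the average rank
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_pos += avg_rank * sum(1 for k in range(i, j) if pairs[k][1] == 1)
        i = j
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u_stat / (n_pos * n_neg)


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, not the study's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[k] for k in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(roc_auc(ys, [scores[k] for k in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data: 1 = event (e.g. readmission), scores = predicted risk (hypothetical)
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)  # 0.75
toy_ci = bootstrap_auc_ci(toy_labels, toy_scores, n_boot=200)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, which gives a scale for comparing XGBoost (0.721 and 0.809) with the LACE index (0.651 and 0.641).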
Applying machine learning techniques to large-scale data effectively improves the prediction of early readmission and death following ECOPD hospitalisation. Identifying critical prognostic factors could enhance targeted post-discharge care for this high-risk patient group.