Hilton C Beau, Milinovich Alex, Felix Christina, Vakharia Nirav, Crone Timothy, Donovan Chris, Proctor Andrew, Nazha Aziz
1 Center for Clinical Artificial Intelligence, Cleveland Clinic, Cleveland, OH 44121 USA.
2 Cleveland Clinic Lerner College of Medicine of Case Western Reserve University, Cleveland, OH 44121 USA.
NPJ Digit Med. 2020 Apr 3;3:51. doi: 10.1038/s41746-020-0249-z. eCollection 2020.
Hospital systems, payers, and regulators have focused on reducing length of stay (LOS) and early readmission, with uncertain benefit. Interpretable machine learning (ML) may assist in transparently identifying the risk of important outcomes. We conducted a retrospective cohort study of hospitalizations at a tertiary academic medical center and its branches from January 2011 to May 2018. A consecutive sample of all hospitalizations in the study period was included. Algorithms were trained on medical, sociodemographic, and institutional variables to predict readmission, LOS, and death within 48-72 h. Prediction performance was measured by area under the receiver operating characteristic curve (AUC), Brier score loss (BSL), which measures how well predicted probability matches observed probability, and other metrics. Interpretations were generated using multiple feature extraction algorithms. The study cohort included 1,485,880 hospitalizations for 708,089 unique patients (median age 59 years, first and third quartiles (QI) [39, 73]; 55.6% female; 71% white). There were 211,022 30-day readmissions, for an overall readmission rate of 14% (for patients ≥65 years: 16%). Median LOS was 2.94 days (QI [1.67, 5.34]) when observation and labor and delivery patients were included, and 3.71 days (QI [2.15, 6.51]) when they were excluded. Predictive performance was as follows: 30-day readmission (AUC 0.76/BSL 0.11); LOS > 5 days (AUC 0.84/BSL 0.15); death within 48-72 h (AUC 0.91/BSL 0.001). Explanatory diagrams showed the factors that influenced each prediction.
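To make the two headline metrics concrete, the following is a minimal sketch of how AUC and Brier score loss can be computed from predicted probabilities and binary outcomes. It is not the authors' code: the function names `brier_score` and `roc_auc` and the toy data are hypothetical and chosen only for illustration.

```python
from itertools import product


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome.

    Lower is better; 0 means every prediction was exactly right.
    """
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """Probability that a random positive case is scored above a random negative one.

    Ties count as half a win. This pairwise form is quadratic in sample size,
    so it is meant for illustration, not for cohorts of millions of rows.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a, b in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = outcome occurred.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.3, 0.8, 0.2, 0.6, 0.4, 0.5, 0.9]

brier = brier_score(y_true, y_prob)
auc = roc_auc(y_true, y_prob)
print(f"Brier score loss: {brier:.4f}")
print(f"AUC: {auc:.4f}")
```

AUC reflects only how well the model ranks cases, while the Brier score also penalizes probabilities that are poorly calibrated, which is presumably why the abstract reports both for each outcome.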