Wan Yik-Ki Jacob, Abdelrahman Samir E, Facelli Julio C, Madaras-Kelly Karl, Kawamoto Kensaku, Dishman Deniz, Himes Samuel R, Cato Kenrick, Rossetti Sarah C, Del Fiol Guilherme
Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, USA.
College of Pharmacy, Idaho State University, Meridian, ID, USA.
J Biomed Inform. 2025 Jul 27;169:104887. doi: 10.1016/j.jbi.2025.104887.
The Health Process Model (HPM)-ExpertSignals Conceptual Framework posits that healthcare professionals' patient care behaviors can be used to predict in-hospital deterioration. Prediction models based on this framework have been validated using data from four hospitals within two healthcare systems. Because clinician-system interactions may differ across organizations, this study aimed to evaluate the reproducibility and generalizability of the underlying conceptual framework using data from over 200 hospitals across the US.
This study used eICU-CRD, a publicly accessible dataset with data from 208 US hospitals. A logistic regression model was developed to predict in-hospital deterioration following the HPM-ExpertSignals conceptual framework. To test its reproducibility, patients were randomly split into training and testing datasets. After bootstrap testing of the model, the mean area under the precision-recall curve (AUPRC) was compared with outcomes from previously published studies. For generalizability testing, the hospitals in the dataset were randomly assigned to either the model training set or the testing set. After the model was trained on the training hospitals' data, generalizability was measured as the percentage of testing hospitals with an AUPRC at or above a baseline performance obtained in the reproducibility experiment.
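The two evaluation designs described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the number of bootstrap resamples, the percentile confidence interval, and the 50% test fraction are all hypothetical, since the abstract does not specify them. AUPRC is computed here as step-wise average precision, with tied scores ranked in input order.

```python
import random


def average_precision(y_true, y_score):
    """Step-wise area under the precision-recall curve (average precision)."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("AUPRC is undefined without positive cases")
    # Rank cases from highest to lowest predicted risk.
    order = sorted(range(len(y_true)), key=lambda i: y_score[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank  # precision at the rank of each positive
    return total / n_pos


def bootstrap_auprc(y_true, y_score, n_boot=1000, seed=0):
    """Mean AUPRC and a percentile 95% interval over bootstrap resamples.

    n_boot and the percentile method are assumptions for illustration.
    """
    if sum(y_true) == 0:
        raise ValueError("AUPRC is undefined without positive cases")
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        labels = [y_true[i] for i in idx]
        if sum(labels) == 0:
            continue  # skip resamples that contain no positive case
        stats.append(average_precision(labels, [y_score[i] for i in idx]))
    stats.sort()
    mean = sum(stats) / len(stats)
    lower = stats[int(0.025 * (len(stats) - 1))]
    upper = stats[int(0.975 * (len(stats) - 1))]
    return mean, lower, upper


def split_hospitals(hospital_ids, test_fraction=0.5, seed=0):
    """Assign whole hospitals, not patients, to training or testing.

    test_fraction is a hypothetical value; the abstract does not report it.
    """
    rng = random.Random(seed)
    ids = sorted(set(hospital_ids))
    rng.shuffle(ids)
    n_test = int(round(len(ids) * test_fraction))
    return set(ids[n_test:]), set(ids[:n_test])  # (train, test)
```

Splitting by hospital rather than by patient keeps every patient from a testing hospital out of training, which is what allows the second experiment to speak to performance at unseen sites.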
The AUPRC in the reproducibility experiment was 0.10 (95% CI: 0.09, 0.11), comparable to the AUPRC of 0.093 (95% CI: 0.09, 0.096) reported in a previous study. In the generalizability experiment, 94% of the testing hospitals had an AUPRC at or above the baseline AUPRC of 0.10.
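The generalizability figure is a simple proportion: the share of testing hospitals whose AUPRC reaches the baseline. A hedged sketch follows; the function name, the hospital labels, and the AUPRC values in the usage lines are invented for illustration, and excluding hospitals with an undefined AUPRC (for example, no deterioration events) is an assumption the abstract does not confirm.

```python
def percent_at_or_above(auprc_by_hospital, baseline):
    """Percentage of hospitals with AUPRC >= baseline, plus those below it.

    Hospitals whose AUPRC is None (undefined) are excluded; this handling
    is an assumption, not something stated in the abstract.
    """
    scored = {h: v for h, v in auprc_by_hospital.items() if v is not None}
    if not scored:
        raise ValueError("no hospital has a defined AUPRC")
    below = sorted(h for h, v in scored.items() if v < baseline)
    pct = 100.0 * (len(scored) - len(below)) / len(scored)
    return pct, below


# Hypothetical per-hospital values, not results from the study.
example = {"H1": 0.12, "H2": 0.10, "H3": 0.07, "H4": None}
pct, below = percent_at_or_above(example, baseline=0.10)
```

Returning the list of hospitals below the baseline alongside the percentage makes it easy to flag the sites that would need local evaluation before adoption.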
The study provides evidence supporting the reproducibility of a predictive model following the HPM-ExpertSignals framework. This model also generalized to most hospitals without additional training. Nevertheless, some hospitals still obtained lower-than-expected performance, highlighting the need for model evaluation and potential fine-tuning before local adoption. Similar studies are needed to investigate the reproducibility and generalizability of other classes of machine learning models in healthcare.