Computer and Information Sciences, Temple University, Philadelphia, PA, United States of America.
Weill Cornell Medicine, New York, NY, United States of America.
Artif Intell Med. 2024 Dec;158:103010. doi: 10.1016/j.artmed.2024.103010. Epub 2024 Nov 10.
A model that predicts the risk of hospital readmission can help identify patients who may benefit from extra care. Developing hospital-specific readmission risk prediction models from local data is not feasible for many institutions, and models developed on data from one hospital may not generalize well to another. An end-to-end adaptable readmission model that generalizes to unseen test domains is lacking. We propose an early readmission risk domain generalization network, ERR-DGN, for cross-domain knowledge transfer. ERR-DGN internalizes the shared patterns and characteristics that are consistent across source domains, enabling it to adapt to a new domain. It transforms source datasets into a common embedding space while capturing the relevant long-term temporal dependencies of sequential data. Domain generalization is then applied on domain-specific fully connected linear layers. The model is optimized with a loss function that combines the task-specific loss with a distribution discrepancy loss that matches the mean embeddings of the multiple source distributions. The model was developed using electronic health record (EHR) data from 201,688 patients with diabetes across urban, suburban, rural, and mixed hospital systems to improve 30-day readmission prediction for 67,066 unseen patients with diabetes at a rural hospital. We also explored how model performance varied with the number of sites and over time. The proposed method outperformed the baseline models, yielding a 0.06 absolute increase in F1-score (0.79 ± 0.006 vs. 0.73 ± 0.007). Model performance peaked with the inclusion of three sites. Performance was relatively stable for 3 years and then declined at 4 years. ERR-DGN may be a proficient tool for learning from data at multiple sites and subsequently applying a hospital readmission prediction model to a new site. Including a relatively small number of varied sites may be sufficient to achieve peak performance. Periodic retraining at least every 3 years may mitigate model degradation over time.
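The loss described in the abstract, a task-specific term combined with a discrepancy term over the mean embeddings of the source domains, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the use of squared Euclidean distance between domain means, binary cross-entropy as the task loss, and the `weight` parameter are all assumptions made for illustration.

```python
import numpy as np

def mean_embedding_discrepancy(embeddings_by_domain):
    """Average squared Euclidean distance between the mean embeddings
    of every pair of source domains (assumed form of the discrepancy)."""
    means = [e.mean(axis=0) for e in embeddings_by_domain]
    total, pairs = 0.0, 0
    for i in range(len(means)):
        for j in range(i + 1, len(means)):
            total += float(np.sum((means[i] - means[j]) ** 2))
            pairs += 1
    return total / pairs if pairs else 0.0

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Task loss for a binary outcome such as 30-day readmission (assumed)."""
    y_prob = np.clip(y_prob, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(y_prob)
                          + (1 - y_true) * np.log(1 - y_prob)))

def combined_loss(embeddings_by_domain, labels_by_domain, probs_by_domain,
                  weight=1.0):
    """Task loss pooled over all source domains plus the weighted
    discrepancy term; `weight` is a hypothetical trade-off parameter."""
    y_true = np.concatenate(labels_by_domain)
    y_prob = np.concatenate(probs_by_domain)
    task = binary_cross_entropy(y_true, y_prob)
    discrepancy = mean_embedding_discrepancy(embeddings_by_domain)
    return task + weight * discrepancy
```

Under these assumptions, minimizing the discrepancy term pulls the per-domain mean embeddings toward one another, while the task term keeps the shared representation predictive of readmission.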