Division of Biomedical Informatics, University of California San Diego, San Diego, USA.
Division of Pulmonary, Critical Care, Allergy and Sleep Medicine, Emory University School of Medicine, Atlanta, USA.
Sci Rep. 2022 May 19;12(1):8380. doi: 10.1038/s41598-022-12497-7.
The inherent flexibility of machine learning-based clinical predictive models to learn from episodes of patient care at a new institution (site-specific training) comes at the cost of performance degradation when the models are applied to external patient cohorts. To exploit the full potential of cross-institutional clinical big data, machine learning systems must gain the ability to transfer their knowledge across institutional boundaries and learn from new episodes of patient care without forgetting previously learned patterns. In this work, we developed a privacy-preserving learning algorithm named WUPERR (Weight Uncertainty Propagation and Episodic Representation Replay) and validated it in the context of early prediction of sepsis using data from over 104,000 patients across four distinct healthcare systems. We tested the hypothesis that the proposed continual learning algorithm can maintain higher predictive performance than competing methods on previous cohorts once it has been trained on a new patient cohort. In the sepsis prediction task, after incremental training of a deep learning model across four hospital systems (namely hospitals H-A, H-B, H-C, and H-D), WUPERR maintained a higher positive predictive value at each of the first three hospitals than a baseline transfer learning approach (H-A: 39.27% vs. 31.27%, H-B: 25.34% vs. 22.34%, H-C: 30.33% vs. 28.33%). The proposed approach has the potential to construct more generalizable models that can learn from cross-institutional clinical big data in a privacy-preserving manner.
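As a reading aid for the positive predictive value (PPV) figures reported above, the following minimal sketch shows the standard definition of PPV (true positives divided by all cases flagged positive) and the percentage-point gap between WUPERR and the transfer learning baseline at each hospital. The function name `ppv` and the dictionary layout are illustrative assumptions, not part of the study's code; the only data used are the percentages quoted in the abstract.

```python
def ppv(true_positives: int, false_positives: int) -> float:
    """Standard PPV: fraction of positive predictions that are correct."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("PPV is undefined when no cases are flagged positive")
    return true_positives / flagged


# PPV percentages quoted in the abstract: (WUPERR, baseline transfer learning).
# The dictionary layout is a hypothetical convenience for this sketch.
reported_ppv = {
    "H-A": (39.27, 31.27),
    "H-B": (25.34, 22.34),
    "H-C": (30.33, 28.33),
}

# Percentage-point gap in favour of WUPERR at each hospital.
gaps = {
    hospital: round(wuperr - baseline, 2)
    for hospital, (wuperr, baseline) in reported_ppv.items()
}

for hospital, gap in gaps.items():
    print(f"{hospital}: WUPERR higher by {gap:.2f} percentage points")
```

The gaps are absolute differences in percentage points, not relative improvements; the abstract does not report the underlying true-positive and false-positive counts, so `ppv` is shown only to make the definition concrete.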