Zhu He, Bai Jun, Li Na, Li Xiaoxiao, Liu Dianbo, Buckeridge David L, Li Yue
School of Computer Science, McGill University, Montreal, QC, Canada.
Mila-Quebec AI Institute, Montreal, QC, Canada.
NPJ Digit Med. 2025 May 17;8(1):286. doi: 10.1038/s41746-025-01661-8.
Federated learning (FL) enables collaborative analysis of decentralized medical data while preserving patient privacy. However, the covariate shift from demographic and clinical differences can reduce model generalizability. We propose FedWeight, a novel FL framework that mitigates covariate shift by reweighting patient data from the source sites using density estimators, allowing the trained model to better align with the distribution of the target site. To support unsupervised applications, we introduce FedWeight ETM, a federated embedded topic model. We evaluated FedWeight in cross-site FL on the eICU dataset and cross-dataset FL between eICU and MIMIC III. FedWeight consistently outperforms standard FL baselines in predicting ICU mortality, ventilator use, sepsis diagnosis, and length of stay. SHAP-based interpretation and ETM-based topic modeling reveal improved identification of clinically relevant characteristics and disease topics associated with ICU readmission.
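The reweighting idea can be illustrated with a minimal sketch. The abstract does not specify which density estimators FedWeight uses or how the weights enter federated aggregation, so everything below is an assumption for illustration: a univariate Gaussian stands in for the density estimator, a single feature (age) stands in for patient covariates, and the function names are hypothetical. Each source sample receives the weight w(x) = p_target(x) / p_source(x), so source patients who resemble the target site count more in the local training loss.

```python
import math
from statistics import fmean, pstdev


def fit_gaussian(values):
    """Fit a 1-D Gaussian density estimator; returns (mean, std)."""
    mu = fmean(values)
    sigma = max(pstdev(values), 1e-6)  # guard against zero variance
    return mu, sigma


def log_density(x, params):
    """Log of the Gaussian density at x."""
    mu, sigma = params
    return -0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))


def importance_weights(source_x, source_params, target_params, clip=10.0):
    """Density-ratio weights p_target(x) / p_source(x), clipped for stability."""
    weights = []
    for x in source_x:
        ratio = math.exp(log_density(x, target_params) - log_density(x, source_params))
        weights.append(min(ratio, clip))
    return weights


def weighted_log_loss(y_true, y_prob, weights):
    """Binary cross-entropy where each sample is scaled by its weight."""
    eps = 1e-12
    total = 0.0
    for y, p, w in zip(y_true, y_prob, weights):
        total += -w * (y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return total / sum(weights)


# Hypothetical data: the source site is younger than the target site.
source_ages = [40, 45, 50, 55, 60]
target_ages = [60, 65, 70, 75, 80]

# In a federated setting only the fitted parameters (not the patient rows)
# would need to leave the target site; that exchange is assumed here.
target_params = fit_gaussian(target_ages)
source_params = fit_gaussian(source_ages)

weights = importance_weights(source_ages, source_params, target_params)

# Illustrative labels and predicted mortality probabilities for the source site.
labels = [0, 0, 1, 0, 1]
probs = [0.1, 0.2, 0.6, 0.3, 0.7]
loss = weighted_log_loss(labels, probs, weights)
```

In this toy setup, older source patients receive larger weights because they lie closer to the target age distribution. Clipping the ratio is a common safeguard against a few samples dominating the loss, not something the abstract describes.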