Li Haoyang, Zang Chengxi, Xu Zhenxing, Pan Weishen, Rajendran Suraj, Chen Yong, Wang Fei
Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA.
Tri-Institutional Computational Biology & Medicine Program, Weill Cornell Medicine, New York, NY, USA.
medRxiv. 2025 May 5:2025.05.02.25326905. doi: 10.1101/2025.05.02.25326905.
Target trial emulation (TTE) aims to estimate treatment effects by emulating randomized controlled trials using real-world observational data. Applying TTE across distributed datasets shows great promise for improving generalizability and statistical power but is often infeasible because of privacy and data-sharing constraints. Here we propose a federated learning-based TTE framework, FL-TTE, that enables TTE across multiple sites without sharing patient-level data. FL-TTE incorporates federated protocol design, federated inverse probability of treatment weighting, and a federated Cox proportional hazards model to estimate treatment effects on time-to-event outcomes across heterogeneous data. We validated FL-TTE by emulating sepsis trials using eICU and MIMIC-IV data from 192 hospitals, and Alzheimer's disease trials using INSIGHT Network data from five New York City health systems. With pooled-data results as the reference, FL-TTE produced less biased estimates than traditional meta-analysis methods, and the approach is theoretically supported. FL-TTE enables federated treatment effect estimation across distributed and heterogeneous data in a privacy-preserving way.
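The federated inverse probability of treatment weighting step named in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a logistic propensity model fitted by Newton-Raphson, where each site sends only its summed gradient and Hessian to a coordinator, and it uses simulated data. The function names (`local_grad_hess`, `federated_propensity_fit`, `stabilized_iptw`), the site sizes, and the coefficients are all hypothetical.

```python
import numpy as np

def local_grad_hess(X, t, beta):
    """Site side: gradient and Hessian of the logistic log-likelihood.
    Only these aggregates leave the site, never patient-level rows."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    grad = X.T @ (t - p)
    hess = -(X * (p * (1.0 - p))[:, None]).T @ X
    return grad, hess

def federated_propensity_fit(sites, n_iter=25):
    """Coordinator side: Newton-Raphson on the summed site aggregates."""
    beta = np.zeros(sites[0][0].shape[1])
    for _ in range(n_iter):
        stats = [local_grad_hess(X, t, beta) for X, t in sites]
        grad = sum(g for g, _ in stats)
        hess = sum(h for _, h in stats)
        beta = beta - np.linalg.solve(hess, grad)
    return beta

def stabilized_iptw(X, t, beta, p_treated):
    """Site side: stabilized weights from the shared propensity model."""
    ps = 1.0 / (1.0 + np.exp(-X @ beta))
    return np.where(t == 1, p_treated / ps, (1.0 - p_treated) / (1.0 - ps))

# Simulated example (assumption): three sites with shifted covariates.
rng = np.random.default_rng(0)
true_beta = np.array([-0.3, 0.8, -0.5])
sites = []
for shift in (-0.5, 0.0, 0.5):
    Z = rng.normal(shift, 1.0, size=(300, 2))
    X = np.column_stack([np.ones(300), Z])       # intercept + 2 covariates
    t = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))
    sites.append((X, t))

beta_fed = federated_propensity_fit(sites)

# Reference fit on pooled data, which a real deployment could not run.
X_all = np.vstack([X for X, _ in sites])
t_all = np.concatenate([t for _, t in sites])
beta_pooled = federated_propensity_fit([(X_all, t_all)])

# Marginal treatment probability from site-level counts only.
p_treated = sum(t.sum() for _, t in sites) / sum(len(t) for _, t in sites)
weights = [stabilized_iptw(X, t, beta_fed, p_treated) for X, t in sites]
```

Because the pooled gradient and Hessian are exact sums of the site-level ones, this particular scheme reproduces the pooled propensity fit up to floating-point error. That is one way a federated estimate can avoid the bias introduced by fitting each site separately and combining the results afterwards.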