Han Larry, Hou Jue, Cho Kelly, Duan Rui, Cai Tianxi
Department of Biostatistics, Harvard University.
Department of Public Health and Health Sciences, Northeastern University.
J Am Stat Assoc. 2025 Mar 17. doi: 10.1080/01621459.2025.2453249.
Federated learning of causal estimands may greatly improve estimation efficiency by leveraging data from multiple study sites, but robustness to heterogeneity and model misspecifications is vital for ensuring validity. We develop a Federated Adaptive Causal Estimation (FACE) framework to incorporate heterogeneous data from multiple sites to provide treatment effect estimation and inference for a flexibly specified target population of interest. FACE accounts for site-level heterogeneity in the distribution of covariates through density ratio weighting. To safely incorporate source sites and avoid negative transfer, we introduce an adaptive weighting procedure via a penalized regression, which achieves both consistency and optimal efficiency. Our strategy is communication-efficient and privacy-preserving, allowing participating sites to share summary statistics only once with other sites. We conduct both theoretical and numerical evaluations of FACE and apply it to conduct a comparative effectiveness study of BNT162b2 (Pfizer) and mRNA-1273 (Moderna) vaccines on COVID-19 outcomes in U.S. veterans using electronic health records from five VA regional sites. We show that compared to traditional methods, FACE meaningfully increases the precision of treatment effect estimates, with reductions in standard errors ranging from 26% to 67%.
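The adaptive weighting idea in the abstract (borrow from source sites, but shrink the weight of any site that would cause negative transfer) can be illustrated with a small sketch. This is not the authors' estimator: FACE is built on density-ratio-weighted summary statistics, whereas the toy version below assumes each site simply reports an independent point estimate and its variance. The function names (`adaptive_weights`, `combine`), the penalty form `lam * |eta_k| * delta_k**2`, and the coordinate-descent solver are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by L1-penalized coordinate descent."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def adaptive_weights(est_target, var_target, est_sources, var_sources,
                     lam, n_iter=500, tol=1e-12):
    """Toy adaptive weights for source sites (hypothetical, not the FACE code).

    Minimizes, over source weights eta (the target gets weight 1 - sum(eta)):
        var_target * (1 - sum(eta))**2 + sum(var_sources * eta**2)
        + lam * sum(|eta_k| * delta_k**2),
    where delta_k is the gap between source k's estimate and the target's.
    Assumes at least one source site and independent site estimates.
    """
    est_sources = np.asarray(est_sources, dtype=float)
    var_sources = np.asarray(var_sources, dtype=float)
    delta_sq = (est_sources - est_target) ** 2  # squared estimated bias per source
    eta = np.zeros(len(est_sources))
    for _ in range(n_iter):
        eta_old = eta.copy()
        for j in range(len(eta)):
            s = eta.sum() - eta[j]              # weight held by the other sources
            z = var_target * (1.0 - s)
            # Closed-form update for one coordinate of the penalized quadratic.
            eta[j] = soft_threshold(z, lam * delta_sq[j] / 2.0) / (
                var_target + var_sources[j])
        if np.max(np.abs(eta - eta_old)) < tol:
            break
    return eta

def combine(est_target, est_sources, eta):
    """Weighted combination of the target estimate and the source estimates."""
    eta = np.asarray(eta, dtype=float)
    est_sources = np.asarray(est_sources, dtype=float)
    return (1.0 - eta.sum()) * est_target + float(eta @ est_sources)

# Example: the second source disagrees strongly with the target site.
eta = adaptive_weights(est_target=1.0, var_target=1.0,
                       est_sources=[1.0, 6.0], var_sources=[1.0, 1.0], lam=10.0)
pooled = combine(1.0, [1.0, 6.0], eta)
```

In this toy setup, a source whose estimate matches the target incurs no penalty and receives an inverse-variance-style weight, while a source with a large gap is thresholded to exactly zero weight. That zeroing is the mechanism by which a penalized weighting scheme can guard against negative transfer.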