Tong Jiayi, Luo Chongliang, Islam Md Nazmul, Sheils Natalie E, Buresh John, Edmondson Mackenzie, Merkel Peter A, Lautenbach Ebbing, Duan Rui, Chen Yong
Perelman School of Medicine, The University of Pennsylvania, Philadelphia, PA, USA.
Division of Public Health Sciences, Department of Surgery, Washington University in St. Louis, St. Louis, MO, USA.
NPJ Digit Med. 2022 Jun 14;5(1):76. doi: 10.1038/s41746-022-00615-8.
Integrating real-world data (RWD) from multiple clinical sites can improve estimation and reflects a more general population than an analysis based on a single clinical site. However, sharing patient-level data across sites is difficult in practice because of patient privacy concerns. We develop a distributed algorithm that integrates heterogeneous RWD from multiple clinical sites without sharing patient-level data. The proposed distributed conditional logistic regression (dCLR) algorithm accounts for between-site heterogeneity and requires only one round of communication. A simulation study and a data application using 14,215 COVID-19 patients from 230 clinical sites in the UnitedHealth Group Clinical Research Database demonstrate that the proposed algorithm integrates data from multiple clinical sites efficiently and provides an estimator that is robust to heterogeneity in event rates. The algorithm is therefore a practical alternative to both meta-analysis and existing distributed algorithms for modeling heterogeneous multi-site binary outcomes.
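The robustness to site-specific event rates can be illustrated with a minimal sketch. It is not the authors' dCLR implementation, and it does not reproduce the one-round communication scheme. It assumes a logistic model with a separate intercept per site and uses a pairwise conditional likelihood over within-site (case, control) pairs, in which the intercept cancels. All names (`pairwise_cond_loglik`, `simulate_site`, `total_loglik`) and all simulated values are hypothetical.

```python
import numpy as np

def pairwise_cond_loglik(beta, X, y):
    """Pairwise conditional log-likelihood for one site (illustrative only).

    For a case i and a control j from the same site, the probability that
    i is the case, given exactly one of the two is, equals
    sigmoid((x_i - x_j) @ beta). The site-specific intercept cancels.
    """
    beta = np.asarray(beta, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    cases, controls = X[y == 1], X[y == 0]
    # Covariate differences for every (case, control) pair within the site.
    diffs = (cases[:, None, :] - controls[None, :, :]).reshape(-1, X.shape[1])
    # log sigmoid(d @ beta), written stably as -log(1 + exp(-d @ beta)).
    return -np.logaddexp(0.0, -(diffs @ beta)).sum()

def simulate_site(rng, n, intercept, beta):
    """Simulate one site with its own baseline event rate (hypothetical data)."""
    X = rng.normal(size=(n, len(beta)))
    p = 1.0 / (1.0 + np.exp(-(intercept + X @ beta)))
    y = rng.binomial(1, p)
    return X, y

rng = np.random.default_rng(0)
true_beta = np.array([1.0, -0.5])
# Three sites sharing covariate effects but with very different event rates.
sites = [simulate_site(rng, 300, a, true_beta) for a in (-3.0, -1.0, 0.5)]

def total_loglik(beta):
    """Sum of per-site contributions; each term uses only that site's data."""
    return sum(pairwise_cond_loglik(beta, X, y) for X, y in sites)
```

Because the objective is a sum of per-site terms that never mention the intercepts, the shared effects can be assessed without modeling each site's baseline event rate. In this sketch every evaluation of `total_loglik` would need a contribution from each site, which is a simplification; the abstract states that dCLR needs only one round of communication.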