The Center for Health Analytics and Synthesis of Evidence (CHASE), University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania; Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA. Electronic address: https://twitter.com/DazhengZ.
The Center for Health Analytics and Synthesis of Evidence (CHASE), University of Pennsylvania Perelman School of Medicine, Philadelphia, Pennsylvania; Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA. Electronic address: https://twitter.com/JiayiJessieTong.
J Biomed Inform. 2024 Feb;150:104595. doi: 10.1016/j.jbi.2024.104595. Epub 2024 Jan 18.
To characterize the interplay between multiple medical conditions and account for heterogeneity in patient population characteristics across sites within a distributed research network, we develop a one-shot algorithm that efficiently uses summary-level data from multiple institutions. Applying the proposed algorithm to a large pediatric cohort from four national children's hospitals, we replicated a recently published prospective cohort study, the RISK study, and quantified the impact of risk factors associated with the penetrating or stricturing behaviors of pediatric Crohn's disease (PCD).
In this study, we introduce the ODACoRH algorithm, a one-shot distributed algorithm designed for the competing risks model with heterogeneity. Our approach allows the baseline hazard functions of the multiple endpoints of interest to vary across sites. To accomplish this, we build a surrogate likelihood function by combining patient-level data from the local site with aggregated data from the other sites. We validated our method through extensive simulation studies and a replication of the RISK study, investigating the impact of risk factors on PCD in children and adolescents from four children's hospitals within PEDSnet, a national pediatric learning health system. To evaluate the ODACoRH algorithm, we compared its results with those from meta-analysis and with those derived from the pooled data.
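The surrogate-likelihood idea described above can be illustrated with a minimal sketch. This is not the ODACoRH implementation: it swaps the competing risks model for a one-coefficient logistic regression with no site-specific baseline terms, and it uses the generic first-order surrogate form (local objective plus a gradient correction built from shared summaries). All function names, sample sizes, and the simulated data are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def grad_hess(beta, data):
    """Gradient and Hessian of the average negative log-likelihood at beta."""
    n = len(data)
    g = sum((sigmoid(x * beta) - y) * x for x, y in data) / n
    h = sum(sigmoid(x * beta) * (1.0 - sigmoid(x * beta)) * x * x
            for x, _ in data) / n
    return g, h


def newton(data, offset=0.0, beta=0.0, iters=25):
    """Minimize (average negative log-likelihood + offset * beta) by Newton steps."""
    for _ in range(iters):
        g, h = grad_hess(beta, data)
        beta -= (g + offset) / h
    return beta


def simulate_site(rng, n, true_beta, x_shift):
    """Hypothetical site: covariate mean differs by site, same coefficient."""
    data = []
    for _ in range(n):
        x = rng.gauss(x_shift, 1.0)
        y = 1 if rng.random() < sigmoid(true_beta * x) else 0
        data.append((x, y))
    return data


rng = random.Random(0)
sites = [simulate_site(rng, n, 0.8, shift)
         for n, shift in [(1000, 0.0), (1200, 0.2), (900, -0.2)]]
local = sites[0]  # the lead site keeps its patient-level data

# Step 1: the lead site computes an initial estimate from local data only.
beta_bar = newton(local)

# Step 2: every site shares only (sample size, gradient at beta_bar).
summaries = [(len(d), grad_hess(beta_bar, d)[0]) for d in sites]
total_n = sum(n for n, _ in summaries)
pooled_grad = sum(n * g for n, g in summaries) / total_n
local_grad = grad_hess(beta_bar, local)[0]

# Step 3: the lead site minimizes the surrogate
#   L_local(beta) + (pooled_grad - local_grad) * beta.
beta_tilde = newton(local, offset=pooled_grad - local_grad, beta=beta_bar)

# Reference only: the estimate that would need all patient-level data pooled.
beta_pooled = newton([obs for d in sites for obs in d])

print("local-only:", round(beta_bar, 4))
print("surrogate :", round(beta_tilde, 4))
print("pooled    :", round(beta_pooled, 4))
```

In this sketch a single round of summary exchange moves the local estimate toward the pooled one, which is the sense in which such algorithms are called one-shot. Handling site-specific baseline hazards and multiple endpoints, as the abstract describes, would require a different local objective than the one assumed here.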
The ODACoRH algorithm had the smallest relative bias with respect to the gold standard method (-0.2%), outperforming the meta-analysis method (-11.4%). In the PCD association study, the estimated subdistribution hazard ratios obtained through the ODACoRH algorithm are on par with the results derived from pooled data, which demonstrates the high reliability of our federated learning algorithm. From a clinical standpoint, the identified risk factors for PCD align well with the RISK study published in The Lancet in 2017 and with other published studies, supporting the validity of our findings.
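The abstract does not define relative bias. A common definition, assumed here, is the signed difference between an estimate and a reference estimate, divided by the reference and expressed in percent. The function name and the input values below are hypothetical and are not the study's estimates.

```python
def relative_bias_pct(estimate, reference):
    """Signed relative bias of `estimate` against `reference`, in percent."""
    if reference == 0:
        raise ValueError("relative bias is undefined for a zero reference")
    return 100.0 * (estimate - reference) / reference


# Hypothetical values for illustration only.
reference_estimate = 0.50   # e.g. a coefficient from a pooled analysis
candidate_estimate = 0.45   # e.g. the same coefficient from another method
print(round(relative_bias_pct(candidate_estimate, reference_estimate), 1))  # -10.0
```

Under this definition a negative value means the method underestimates the reference, and values closer to zero indicate closer agreement.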
With the ODACoRH algorithm, we demonstrate that data from multiple sites can be integrated effectively in a decentralized data setting while accounting for between-site heterogeneity. Importantly, our study reveals several crucial clinical risk factors for PCD that merit further investigation.