Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States of America.
Observational Health Data Sciences and Informatics, New York, NY, United States of America; Janssen Research & Development, Titusville, NJ, United States of America; Department of Biostatistics, University of California, Los Angeles, CA, United States of America.
J Biomed Inform. 2023 Sep;145:104476. doi: 10.1016/j.jbi.2023.104476. Epub 2023 Aug 19.
We developed and evaluated a novel one-shot distributed algorithm for evidence synthesis in distributed research networks with rare outcomes.
Fed-Padé, motivated by a classic mathematical tool, the Padé approximant, reconstructs the multi-site data likelihood via a Padé approximant whose key parameters can be computed distributively. Because the [2,2] Padé approximant is simple, Fed-Padé asks little of data partners in both computation and communication. Specifically, each data partner only needs to compute and share the log-likelihood and its first four derivatives, evaluated at an initial estimator. We evaluated the performance of our algorithm with extensive simulation studies and four observational healthcare databases.
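The construction described above can be illustrated with a minimal sketch, assuming a single scalar parameter. This is not the authors' implementation: the function names `sum_site_derivatives`, `pade_2_2`, and `pade_eval` are hypothetical, and the sketch only shows how a [2,2] Padé approximant is built from a function value and its first four derivatives at an expansion point, after summing the per-site contributions.

```python
from math import exp, factorial


def sum_site_derivatives(site_derivs):
    """Add up per-site lists [l(t0), l'(t0), ..., l''''(t0)] elementwise.

    Assumption: the multi-site log-likelihood is the sum of site
    log-likelihoods, so its derivatives are sums as well.
    """
    return [sum(values) for values in zip(*site_derivs)]


def pade_2_2(derivs):
    """Return coefficients (a, b) of the [2,2] Pade approximant
    (a0 + a1*x + a2*x^2) / (1 + b1*x + b2*x^2), with x = t - t0,
    from the value and first four derivatives at t0."""
    # Taylor coefficients c_k = f^(k)(t0) / k!
    c = [d / factorial(k) for k, d in enumerate(derivs)]
    # Matching the x^3 and x^4 terms gives a 2x2 linear system for b1, b2:
    #   c2*b1 + c1*b2 = -c3
    #   c3*b1 + c2*b2 = -c4
    det = c[2] * c[2] - c[1] * c[3]
    if det == 0:
        raise ValueError("degenerate system: [2,2] approximant not defined")
    b1 = (c[1] * c[4] - c[2] * c[3]) / det
    b2 = (c[3] * c[3] - c[2] * c[4]) / det
    # Matching the x^0, x^1, x^2 terms gives the numerator.
    a0 = c[0]
    a1 = c[1] + b1 * c[0]
    a2 = c[2] + b1 * c[1] + b2 * c[0]
    return (a0, a1, a2), (1.0, b1, b2)


def pade_eval(coeffs, x):
    """Evaluate the rational approximant at offset x from t0."""
    (a0, a1, a2), (b0, b1, b2) = coeffs
    return (a0 + a1 * x + a2 * x * x) / (b0 + b1 * x + b2 * x * x)


# Sanity check on a known function: every derivative of exp at 0 equals 1.
coeffs = pade_2_2([1.0, 1.0, 1.0, 1.0, 1.0])
approx = pade_eval(coeffs, 0.5)
```

In this toy check, `approx` agrees with `exp(0.5)` to about four decimal places. In a distributed setting, each site would supply its own five-element list and the coordinating site would pass the output of `sum_site_derivatives` to `pade_2_2`. A multi-parameter model would need higher-order derivative arrays, which this sketch does not cover.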
Our simulation studies showed that a [2,2] Padé approximant reconstructs the multi-site likelihood accurately, so that Fed-Padé produces estimates nearly identical to those of the pooled analysis. Across all simulation scenarios considered, the median relative bias and the rate of instability of Fed-Padé were both <0.1%, whereas meta-analysis estimates had bias up to 50% and instability up to 75%. Furthermore, the confidence intervals derived from the Fed-Padé algorithm showed better coverage of the truth than confidence intervals based on the meta-analysis. In the real data analysis, Fed-Padé had a relative bias of <1% for all three comparisons for risks of acute liver injury and decreased libido, whereas the meta-analysis estimates had a substantially higher bias (around 10%).
The Fed-Padé algorithm is nearly lossless, stable, communication-efficient, and easy to implement for models with rare outcomes. It provides a well-suited and convenient approach for synthesizing evidence in distributed research networks with rare outcomes.