Center for Health AI and Synthesis of Evidence (CHASE), Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA.
Stat Med. 2024 Dec 20;43(29):5573-5582. doi: 10.1002/sim.10250. Epub 2024 Nov 3.
Sparse data bias, which arises when there are too few cases, is a common problem in data analysis, particularly when studying rare binary outcomes. Although a two-step meta-analysis can lessen the bias by combining summary statistics from multiple studies and thereby increasing the number of cases, it does not completely eliminate bias in effect estimation. In this paper, we propose a one-shot distributed algorithm for estimating relative risk using modified Poisson regression for binary data, named ODAP-B. We evaluate the performance of our method through both simulation studies and real-world case analyses of postacute sequelae of SARS-CoV-2 infection in children, using data from 184 501 children across eight national academic medical centers. Compared with the meta-analysis method, our method provides closer estimates of the relative risk for all outcomes considered, including syndromic and systemic outcomes. Our method is communication-efficient and privacy-preserving: it requires only aggregated data and yields less biased effect estimates than two-step meta-analysis methods. Overall, ODAP-B is an effective distributed learning algorithm for Poisson regression to study rare binary outcomes. The method provides inference on adjusted relative risk with a robust variance estimator.
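To make the building block concrete, the following is a minimal single-site sketch of modified Poisson regression for a binary outcome: a log-link Poisson fit whose variance is replaced by a robust (sandwich) estimator, so that the exponentiated coefficients can be read as relative risks. It is not the ODAP-B algorithm itself and omits the distributed, one-shot aggregation step. The function name `modified_poisson`, the plain Newton-Raphson solver, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def modified_poisson(X, y, n_iter=50, tol=1e-10):
    """Illustrative helper (hypothetical name): log-link Poisson fit to a
    binary outcome with a robust sandwich covariance.

    X: (n, p) design matrix that already includes an intercept column.
    y: (n,) array of 0/1 outcomes.
    Returns (beta, cov); exp(beta[j]) is the estimated relative risk.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)                # fitted risks under the log link
        score = X.T @ (y - mu)               # gradient of the Poisson log-likelihood
        info = X.T @ (X * mu[:, None])       # Fisher information
        step = np.linalg.solve(info, score)  # Newton-Raphson update
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    mu = np.exp(X @ beta)
    bread = np.linalg.inv(X.T @ (X * mu[:, None]))
    meat = X.T @ (X * ((y - mu) ** 2)[:, None])
    cov = bread @ meat @ bread               # robust (sandwich) covariance
    return beta, cov

# Toy data (assumed): 10/200 events among unexposed, 30/200 among exposed.
exposure = np.repeat([0.0, 1.0], 200)
outcome = np.concatenate([
    np.repeat([1.0, 0.0], [10, 190]),
    np.repeat([1.0, 0.0], [30, 170]),
])
design = np.column_stack([np.ones_like(exposure), exposure])

beta_hat, cov_hat = modified_poisson(design, outcome)
rr = np.exp(beta_hat[1])
se_log_rr = np.sqrt(cov_hat[1, 1])
ci = np.exp(beta_hat[1] + np.array([-1.96, 1.96]) * se_log_rr)
print("relative risk:", rr, "95% CI:", ci)
```

With a single binary covariate the model is saturated, so the fitted relative risk coincides with the crude risk ratio (0.15 / 0.05 = 3) and the sandwich variance of the log relative risk matches the usual delta-method formula. In an adjusted analysis, further covariate columns would be appended to the design matrix.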