Guha Subharup, Li Yi
Department of Biostatistics, University of Florida, 2004 Mowry Rd, Gainesville, 32603, Florida, U.S.A.
Department of Biostatistics, University of Michigan, 1415 Washington Heights, Ann Arbor, 48109, Michigan, U.S.A.
Stat Biosci. 2024 Dec 9. doi: 10.1007/s12561-024-09470-5.
Comparative meta-analyses that integrate multiple observational studies rely on estimated propensity scores (PSs) to mitigate covariate imbalances between groups of subjects. However, PS estimation faces theoretical and practical challenges when the covariates are high-dimensional. Motivated by an integrative analysis of breast cancer patients across seven medical centers, this paper addresses the problem of integrating multiple observational datasets. The proposed inferential technique, called Bayesian Motif Submatrices for Covariates (B-MSC), addresses the curse of dimensionality through a hybrid of Bayesian and frequentist approaches. B-MSC uses nonparametric Bayesian "Chinese restaurant" processes to eliminate redundancy in the high-dimensional covariates and discover latent, lower-dimensional structures (motifs). With these motifs as potential predictors, standard regression techniques can be used to accurately infer the PSs and facilitate covariate-balanced group comparisons. Simulations and a meta-analysis of the motivating cancer investigation demonstrate the efficacy of B-MSC in accurately estimating the propensity scores and efficiently addressing covariate imbalance when integrating observational health studies with high-dimensional covariates.
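As a rough illustration of the "Chinese restaurant" process mentioned in the abstract, the sketch below draws a random partition of items (for example, covariate columns) from a generic Chinese restaurant process prior. It is not the B-MSC procedure, which the abstract does not specify in detail; the function name `crp_partition`, the concentration parameter `alpha`, and the choice of 50 items are assumptions made only for this example.

```python
import random

def crp_partition(n_items, alpha, seed=0):
    """Sample cluster labels for n_items from a Chinese restaurant process prior.

    Hypothetical helper for illustration only; alpha > 0 is the
    concentration parameter (larger alpha tends to give more clusters).
    """
    rng = random.Random(seed)
    labels = []  # labels[i] = cluster index assigned to item i
    sizes = []   # sizes[k] = number of items currently in cluster k
    for _ in range(n_items):
        # An existing cluster k is chosen with weight sizes[k];
        # a brand-new cluster is opened with weight alpha.
        weights = sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(1)   # open a new cluster
        else:
            sizes[k] += 1     # join an existing cluster
        labels.append(k)
    return labels, sizes

# Example: partition 50 items (e.g., covariate columns) into clusters.
labels, sizes = crp_partition(n_items=50, alpha=2.0, seed=1)
```

Because larger clusters attract new items with higher probability, the number of clusters typically grows much more slowly than the number of items, which is the sense in which such a prior can collapse many redundant covariates into a smaller set of groups.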