Yu Tingting, Ye Shangyuan, Wang Rui
Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, Massachusetts, U.S.A.
Biostatistics Shared Resource, Knight Cancer Institute, Oregon Health and Science University, Portland, Oregon, U.S.A.
Can J Stat. 2024 Sep;52(3):900-923. doi: 10.1002/cjs.11793. Epub 2023 Aug 19.
When analyzing data combined from multiple sources (e.g., hospitals, studies), the heterogeneity across different sources must be accounted for. In this paper, we consider high-dimensional linear regression models for integrative data analysis. We propose a new adaptive clustering penalty (ACP) method to simultaneously select variables and cluster source-specific regression coefficients with sub-homogeneity. We show that the estimator based on the ACP method enjoys a strong oracle property under certain regularity conditions. We also develop an efficient algorithm based on the alternating direction method of multipliers (ADMM) for parameter estimation. We conduct simulation studies to compare the performance of the proposed method to three existing methods (a fused LASSO with adjacent fusion, a pairwise fused LASSO, and a multi-directional shrinkage penalty method). Finally, we apply the proposed method to the multi-center Childhood Adenotonsillectomy Trial to identify sub-homogeneity in the treatment effects across different study sites.
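To make the ADMM-based fusion idea concrete, here is a minimal sketch for one of the comparison penalties named above, a convex pairwise fused LASSO, applied to a toy problem. It is not the ACP penalty or the authors' algorithm. It assumes each source k contributes a single scalar estimate y_k (for example, a site-specific treatment effect) and solves min over β of 0.5·||y − β||² + λ·Σ_{j<k} |β_j − β_k| with the standard scaled-form ADMM updates. The function names, the penalty parameter `rho`, and the iteration count are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from itertools import combinations


def pairwise_difference_matrix(k):
    """Build D so that D @ beta stacks beta_i - beta_j for all pairs i < j."""
    pairs = list(combinations(range(k), 2))
    D = np.zeros((len(pairs), k))
    for row, (i, j) in enumerate(pairs):
        D[row, i] = 1.0
        D[row, j] = -1.0
    return D


def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * |.|."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def admm_pairwise_fusion(y, lam, rho=1.0, n_iter=500):
    """Scaled ADMM for 0.5 * ||y - beta||^2 + lam * ||D beta||_1.

    Hypothetical helper for illustration only: a convex pairwise fusion
    penalty on source-level estimates, not the ACP method from the paper.
    """
    y = np.asarray(y, dtype=float)
    k = y.size
    D = pairwise_difference_matrix(k)
    z = np.zeros(D.shape[0])  # auxiliary copy of the pairwise differences
    u = np.zeros(D.shape[0])  # scaled dual variable
    A = np.eye(k) + rho * D.T @ D  # fixed system matrix for the beta update
    beta = y.copy()
    for _ in range(n_iter):
        beta = np.linalg.solve(A, y + rho * D.T @ (z - u))  # quadratic step
        z = soft_threshold(D @ beta + u, lam / rho)         # shrink differences
        u = u + D @ beta - z                                # dual update
    return beta


if __name__ == "__main__":
    site_effects = np.array([0.0, 0.1, 5.0, 5.1])  # made-up toy inputs
    for lam in (0.0, 0.05, 100.0):
        print(lam, admm_pairwise_fusion(site_effects, lam))
```

In this toy setting, `lam=0.0` leaves the source-level estimates unchanged and a very large `lam` fuses every source to the overall mean; intermediate values shrink the estimates toward one another. The method described in the abstract additionally performs variable selection in a high-dimensional regression and uses an adaptive penalty, neither of which this sketch attempts.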