Wang Fei, Wang Lu, Song Peter X-K
Global Analytics, Ford Motor Credit, Dearborn, Michigan, U.S.A. 48126.
Department of Biostatistics, University of Michigan, Ann Arbor, Michigan, U.S.A. 48109.
Biometrics. 2016 Dec;72(4):1184-1193. doi: 10.1111/biom.12496. Epub 2016 Feb 22.
Combining multiple studies is frequently undertaken in biomedical research to increase sample size and thereby improve statistical power. We consider the marginal model for the regression analysis of repeated measurements collected in several similar studies with potentially different variances and correlation structures. It is of great importance to examine whether common parameters exist across study-specific marginal models, so that simpler models, sensible interpretations, and meaningful efficiency gains can be obtained. Combining multiple studies via classical hypothesis testing involves a large number of simultaneous tests over all possible subsets of common regression parameters, which results in unduly large degrees of freedom and low statistical power. We develop a new method, the fused lasso with the adaptation of parameter ordering (FLAPO), which scrutinizes only adjacent-pair parameter differences, leading to a substantial reduction in the number of constraints involved. Our method enjoys the oracle properties, as does the full fused lasso based on all pairwise parameter differences. We show that FLAPO gives estimators with smaller error bounds and better finite-sample performance than the full fused lasso. We also establish a regularized inference procedure based on bias-corrected FLAPO. We illustrate our method through both simulation studies and an analysis of HIV surveillance data collected over five geographic regions in China, in which the presence or absence of common covariate effects reflects the relative effectiveness of regional policies on HIV control and prevention.
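To make the reduction in constraints concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes one regression coefficient estimated separately in K = 5 studies and compares the adjacent-pair differences obtained after ordering the initial estimates (K - 1 terms) with all pairwise differences (K(K - 1)/2 terms). The function names, the initial estimates, and the unweighted L1 penalty are illustrative assumptions; the abstract does not specify the penalty weights or the estimating procedure.

```python
from itertools import combinations


def adjacent_pairs(initial_estimates):
    """Index pairs (j, k) of studies that are neighbours once the initial
    estimates of a single coefficient are sorted in ascending order."""
    order = sorted(range(len(initial_estimates)),
                   key=lambda k: initial_estimates[k])
    return list(zip(order[:-1], order[1:]))


def all_pairs(n_studies):
    """All unordered study pairs, as used by a full pairwise fused lasso."""
    return list(combinations(range(n_studies), 2))


def fusion_penalty(beta, pairs, lam):
    """Unweighted L1 fusion penalty: lam * sum of |beta_j - beta_k|.
    (Hypothetical simplification: any adaptive weights are omitted.)"""
    return lam * sum(abs(beta[j] - beta[k]) for j, k in pairs)


# Hypothetical initial estimates of one covariate effect in five studies.
initial = [0.52, -0.10, 0.49, 1.30, -0.08]

adj = adjacent_pairs(initial)      # K - 1 = 4 constraints
full = all_pairs(len(initial))     # K(K - 1)/2 = 10 constraints

print(len(adj), len(full))
print(adj)
```

With these illustrative values, studies whose estimates are close (for example, the second and fifth) become neighbours in the ordering, so a fusion penalty on adjacent differences can still merge them into a common parameter while penalizing 4 differences rather than 10.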