Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA.
Am J Hum Genet. 2013 Jul 11;93(1):42-53. doi: 10.1016/j.ajhg.2013.05.010. Epub 2013 Jun 13.
We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare-variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining regression coefficients and standard errors from different studies. In the analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare-variant tests, such as burden tests and variance component tests. Because estimation of regression coefficients for individual rare variants is often unstable or infeasible, the proposed methods instead calculate score statistics, which require fitting only the null model in each study, and then aggregate these score statistics across studies. The proposed meta-analysis tests are based on study-specific summary statistics: score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods can incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis that directly pools individual-level genotype data. We conduct extensive simulations to evaluate the performance of our methods under varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare-variant effects in a multicohort study of the genetics of blood lipid levels.
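The aggregation step described above can be illustrated with a minimal sketch. It assumes the simplest case, homogeneous genetic effects across studies, in which the per-variant score vectors and the between-variant covariance matrices are summed over studies before a burden-type or variance-component-type statistic is formed. The function names, the toy numbers, and the unit weights are hypothetical and chosen only for illustration. This is not the authors' software, and the heterogeneous-effects and multi-ancestry extensions are not covered.

```python
import math


def combine_studies(scores, covs):
    """Sum per-study score vectors and covariance matrices.

    Assumption: every study reports the same variants in the same order,
    and genetic effects are homogeneous across studies.
    """
    m = len(scores[0])
    s = [sum(sk[j] for sk in scores) for j in range(m)]
    phi = [[sum(ck[j][l] for ck in covs) for l in range(m)] for j in range(m)]
    return s, phi


def burden_test(s, phi, w):
    """Burden-type score test: (w'S)^2 / (w' Phi w), chi-square with 1 df."""
    m = len(s)
    num = sum(w[j] * s[j] for j in range(m)) ** 2
    den = sum(w[j] * phi[j][l] * w[l] for j in range(m) for l in range(m))
    stat = num / den
    # Survival function of a chi-square(1) variable, via the normal tail.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def skat_moments(s, phi, w):
    """Variance-component-type statistic Q = sum_j (w_j * S_j)^2.

    Returns Q with its null mean and variance. An exact p-value needs the
    mixture-of-chi-squares null distribution, which is omitted here.
    """
    m = len(s)
    q = sum((w[j] * s[j]) ** 2 for j in range(m))
    mean = sum(w[j] ** 2 * phi[j][j] for j in range(m))
    var = 2.0 * sum(
        (w[j] * w[l] * phi[j][l]) ** 2 for j in range(m) for l in range(m)
    )
    return q, mean, var


# Toy summary statistics for two hypothetical studies and three variants.
study_scores = [[1.0, -0.5, 2.0], [0.5, 1.5, 1.0]]
study_covs = [
    [[2.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 3.0]],
    [[1.0, 0.0, 0.0], [0.0, 2.0, 0.5], [0.0, 0.5, 1.0]],
]
weights = [1.0, 1.0, 1.0]  # illustrative; real analyses often weight by allele frequency

S, Phi = combine_studies(study_scores, study_covs)
print(burden_test(S, Phi, weights))
print(skat_moments(S, Phi, weights))
```

Only the summed score vector and covariance matrix enter the test statistics, so no individual-level genotypes need to be shared between studies, which is the practical point of working from study-specific summary statistics.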