Shi Jingchunzi, Lee Seunggeun
Department of Biostatistics, University of Michigan, Ann Arbor, Michigan 48109, U.S.A.
Biometrics. 2016 Sep;72(3):945-54. doi: 10.1111/biom.12481. Epub 2016 Feb 24.
Meta-analysis of trans-ethnic genome-wide association studies (GWAS) has proven to be a practical and profitable approach for identifying loci that contribute to the risk of complex diseases. However, the expected genetic effect heterogeneity cannot easily be accommodated through existing fixed-effects and random-effects methods. In response, we propose a novel random effect model for trans-ethnic meta-analysis with flexible modeling of the expected genetic effect heterogeneity across diverse populations. Specifically, we adopt a modified random effect model from the kernel regression framework, in which genetic effect coefficients are random variables whose correlation structure reflects the genetic distances across ancestry groups. In addition, we use the adaptive variance component test to achieve robust power regardless of the degree of genetic effect heterogeneity. Simulation studies show that our proposed method has well-calibrated type I error rates at very stringent significance levels and can improve power over the traditional meta-analysis methods. We reanalyzed the published type 2 diabetes GWAS meta-analysis (Consortium et al., 2014) and successfully identified one additional SNP that clearly exhibits genetic effect heterogeneity across different ancestry groups. Furthermore, our proposed method provides scalable computing time for genome-wide datasets, in which an analysis of one million SNPs would require less than 3 hours.
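The abstract gives no formulas, so the following is only a minimal sketch of the general idea it describes, not the authors' implementation. It assumes a kernel of the form K(rho) = rho * J + (1 - rho) * Psi, where J is the all-ones matrix (fully shared effect) and Psi is a hypothetical ancestry-similarity matrix, and it assumes a simple Monte Carlo minimum-p procedure over a small grid of rho values as a stand-in for the adaptive variance component test. All function names, the rho grid, the similarity values, and the summary statistics are illustrative assumptions.

```python
import bisect
import random


def kernel(psi, rho):
    """Assumed kernel: K(rho) = rho * J + (1 - rho) * Psi, with J all ones."""
    n = len(psi)
    return [[rho + (1.0 - rho) * psi[i][j] for j in range(n)] for i in range(n)]


def quad_form(s, k):
    """Variance-component style statistic Q = s^T K s."""
    n = len(s)
    return sum(s[i] * k[i][j] * s[j] for i in range(n) for j in range(n))


def adaptive_test(beta, se, psi, rhos=(0.0, 0.5, 1.0), n_sim=2000, seed=0):
    """Monte Carlo min-p test over a grid of rho values (illustrative only)."""
    rng = random.Random(seed)
    kernels = [kernel(psi, r) for r in rhos]

    # Score-like statistics s_k = beta_k / se_k^2; under H0, s_k ~ N(0, 1 / se_k^2).
    s_obs = [b / e ** 2 for b, e in zip(beta, se)]
    q_obs = [quad_form(s_obs, k) for k in kernels]

    # Simulate the null distribution of Q(rho) for every rho from the same draws.
    q_null = []
    for _ in range(n_sim):
        s = [rng.gauss(0.0, 1.0) / e for e in se]
        q_null.append([quad_form(s, k) for k in kernels])
    columns = [sorted(row[j] for row in q_null) for j in range(len(rhos))]

    def min_tail_count(q):
        # Smallest (over rho) number of null draws with Q(rho) >= q(rho).
        return min(n_sim - bisect.bisect_left(col, qj) for col, qj in zip(columns, q))

    # A null draw counts itself in its own tail; add 1 so the observed value does too.
    t_obs = min_tail_count(q_obs) + 1
    t_null = [min_tail_count(q) for q in q_null]
    exceed = sum(1 for t in t_null if t <= t_obs)
    return (exceed + 1) / (n_sim + 1)


# Hypothetical ancestry-similarity matrix for four groups (made-up values).
PSI = [
    [1.0, 0.6, 0.2, 0.1],
    [0.6, 1.0, 0.2, 0.1],
    [0.2, 0.2, 1.0, 0.5],
    [0.1, 0.1, 0.5, 1.0],
]

# Made-up per-study effect estimates and standard errors for one SNP.
p = adaptive_test(beta=[0.12, 0.10, -0.02, 0.01], se=[0.03, 0.03, 0.04, 0.04], psi=PSI)
print(f"adaptive p-value: {p:.4f}")
```

With rho = 1 the statistic reduces to the squared sum of the score-like statistics, which corresponds to a fixed-effects style test; with rho = 0 the effects are correlated only through Psi. Taking the minimum over the grid is one simple way to stay reasonably powered across different degrees of heterogeneity, at the cost of a simulation-based null distribution.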