Tsepilov Yakov A, Ried Janina S, Strauch Konstantin, Grallert Harald, van Duijn Cornelia M, Axenovich Tatiana I, Aulchenko Yurii S
Institute of Cytology and Genetics SD RAS, Novosibirsk, Russia; Novosibirsk State University, Novosibirsk, Russia.
Institute of Genetic Epidemiology, Helmholtz Zentrum München - German Research Center for Environmental Health, Neuherberg, Germany.
PLoS One. 2013 Dec 16;8(12):e81431. doi: 10.1371/journal.pone.0081431. eCollection 2013.
Genome-wide association studies (GWAS) are a powerful tool for mapping genes of complex traits. However, population substructure or cryptic relatedness can inflate the test statistic and cause spurious associations. If information on a large number of genetic markers is available, the analysis results can be adjusted using the method of genomic control (GC). GC was originally proposed to correct the Cochran-Armitage additive trend test. For non-additive models, the correction has been shown to depend on allele frequencies. The use of GC is therefore limited to situations where the allele frequencies of null markers and candidate markers are matched. In this work, we extended the GC method to non-additive models, which allows null markers with arbitrary allele frequencies to be used for GC. For recessive, dominant, and over-dominant models of inheritance, we obtained analytical expressions for the inflation of the test statistic that describe its dependency on allele frequency and several population parameters. We proposed a method to estimate these required population parameters. Furthermore, we suggested a GC method based on approximating the correction coefficient by a polynomial of allele frequency, and described procedures to correct the genotypic (two degrees of freedom) test for cases when the model of inheritance is unknown. The statistical properties of the described methods were investigated using simulated and real data. We demonstrated that all considered methods were effective in controlling type I error in the presence of genetic substructure. The proposed GC methods can be applied to statistical tests for GWAS with various models of inheritance. All methods developed and tested in this work were implemented in the R language as part of the GenABEL package.
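For orientation, the classic single-coefficient form of GC that this work builds on can be sketched as follows. This is a minimal illustration in Python, not the GenABEL implementation and not the frequency-dependent correction derived in the paper. It assumes one-degree-of-freedom chi-square statistics, estimates the inflation factor as the median of the null-marker statistics divided by the median of the chi-square distribution with one degree of freedom, and divides every statistic by that factor. The function names and the toy data are hypothetical.

```python
import random
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(null_stats):
    """Estimate lambda as observed median / expected median, never below 1."""
    return max(1.0, statistics.median(null_stats) / CHI2_1DF_MEDIAN)


def genomic_control(test_stats, null_stats):
    """Divide each 1-df test statistic by lambda estimated from null markers."""
    lam = inflation_factor(null_stats)
    return [t / lam for t in test_stats]


# Toy data (assumption): null statistics inflated by a constant factor of 1.3.
rng = random.Random(0)
null_stats = [1.3 * rng.gauss(0.0, 1.0) ** 2 for _ in range(20000)]

lam = inflation_factor(null_stats)
corrected = genomic_control(null_stats, null_stats)

print(f"estimated lambda: {lam:.3f}")
print(f"median after correction: {statistics.median(corrected):.4f}")
```

A single constant coefficient of this kind suits the additive trend test. For recessive, dominant, and over-dominant models the abstract states that the inflation depends on allele frequency, which is why the authors replace the constant with an allele-frequency-dependent correction.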