Wang Tao, Elston Robert C
Department of Epidemiology and Biostatistics, Case Western Reserve University, Cleveland, OH 44106, USA.
Am J Hum Genet. 2007 Feb;80(2):353-60. doi: 10.1086/511312. Epub 2006 Dec 21.
Association studies offer an exciting approach to finding the genetic variants that underlie complex human diseases. However, identifying these variants remains difficult, and it is important to develop powerful new statistical methods. Current association methods rely either on single-locus analysis, in which one locus (typically a single-nucleotide polymorphism, SNP) is tested for association at a time, or on multilocus analysis, in which multiple SNPs are used together to extract the maximum information about linkage disequilibrium (LD). It has been shown that single-locus analysis may have low power because a single SNP often carries limited LD information. Multilocus analysis is more informative and can be performed on the basis of either haplotypes or genotypes, but it may lose power because of the often large number of degrees of freedom involved. The ideal method would make full use of the important information from multiple loci without increasing the degrees of freedom. We therefore propose a method that captures information from multiple SNPs while using fewer degrees of freedom. When a set of SNPs in a block are correlated because of LD, we might expect the genotype variation among the different phenotypic groups to extend across all the SNPs, so that this information could be compressed into the low-frequency components of a Fourier transform. Accordingly, we develop a test based on weighted Fourier transformation coefficients, with more weight given to the low-frequency components. Our simulation results demonstrate the validity of the proposed method and its substantially higher power compared with other common methods. This method adds a tool to the existing methods for identifying causative genetic variants underlying complex diseases.
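The abstract does not spell out the test statistic, so the following is only a minimal sketch of the general idea, not the authors' exact method. It assumes genotypes coded as 0/1/2 minor-allele counts with SNPs in map order, a hypothetical weight of 1/(1 + k) for frequency index k, a single score per individual formed from the real and imaginary parts of the coefficients, and a permutation test on the case-control difference in mean score. The function names `fourier_scores` and `permutation_pvalue` are illustrative and do not come from the paper.

```python
import numpy as np

def fourier_scores(genotypes, decay=1.0):
    """Collapse each individual's SNP genotypes into one weighted Fourier score.

    genotypes: (n_individuals, n_snps) array of 0/1/2 allele counts (assumed coding).
    decay: exponent of the hypothetical weight 1 / (1 + k) ** decay.
    """
    g = np.asarray(genotypes, dtype=float)
    g = g - g.mean(axis=0)                      # center each SNP across individuals
    coeffs = np.fft.rfft(g, axis=1)             # frequencies 0 .. n_snps // 2
    k = np.arange(coeffs.shape[1])
    weights = 1.0 / (1.0 + k) ** decay          # more weight on low frequencies
    # One real-valued score per individual (an assumed way to combine parts).
    return (coeffs.real + coeffs.imag) @ weights

def permutation_pvalue(scores, is_case, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the case-control difference in mean score."""
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    is_case = np.asarray(is_case, dtype=bool)
    observed = abs(scores[is_case].mean() - scores[~is_case].mean())
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(is_case)         # shuffle phenotype labels
        diff = abs(scores[perm].mean() - scores[~perm].mean())
        hits += diff >= observed
    return (hits + 1) / (n_perm + 1)
```

Because every individual is reduced to a single score, the group comparison uses one degree of freedom regardless of how many SNPs are in the block, which is the trade-off the abstract describes. In practice the allele coding at each SNP would need to be oriented consistently for a shared signal to land in the low-frequency components; that step is omitted here.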