Wang Kai, Abbott Diana
Department of Biostatistics, College of Public Health, The University of Iowa, Iowa City, IA 52242, USA.
Genet Epidemiol. 2008 Feb;32(2):108-18. doi: 10.1002/gepi.20266.
With the rapid development of modern genotyping technology, it is becoming commonplace to genotype densely spaced genetic markers such as single nucleotide polymorphisms (SNPs) along the genome. This development has inspired a strong interest in using multiple markers located in the target region for the detection of association. We introduce a principal components (PCs) regression method for candidate gene association studies where multiple SNPs from the candidate region tend to be correlated. In this approach, the total variance in the original genotype scores is decomposed into parts that correspond to uncorrelated PCs. The PCs with the largest variances are then used as regressors in a multiple regression. Simulation studies suggest that this approach can have higher power than some popular methods. An application to CHI3L2 gene expression data confirms a significant association between CHI3L2 gene expression level and SNPs from this gene that has been previously reported by others.
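The procedure outlined in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. The function name `pc_regression`, the additive 0/1/2 genotype coding, the 80% cumulative-variance cutoff for choosing how many PCs to keep, and the overall F statistic used as the test are all assumptions made for the example; the abstract states only that the PCs with the largest variances are used as regressors.

```python
import numpy as np


def pc_regression(genotypes, trait, var_explained=0.8):
    """Regress a quantitative trait on the leading PCs of a SNP genotype matrix.

    genotypes     : (n_subjects, n_snps) array, assumed coded 0/1/2 (hypothetical coding)
    trait         : (n_subjects,) array of trait values
    var_explained : assumed cutoff on cumulative variance for keeping PCs
    """
    X = np.asarray(genotypes, dtype=float)
    y = np.asarray(trait, dtype=float)
    n = X.shape[0]

    # Center each SNP so the SVD yields the principal components.
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)

    # Variance carried by each PC, in decreasing order.
    pc_var = s ** 2 / (n - 1)
    cum = np.cumsum(pc_var) / pc_var.sum()

    # Smallest number of PCs whose cumulative variance reaches the cutoff.
    k = min(int(np.searchsorted(cum, var_explained)) + 1, len(s))

    # PC scores are mutually uncorrelated by construction.
    scores = Xc @ Vt[:k].T

    # Multiple regression of the trait on an intercept plus the k PC scores.
    design = np.column_stack([np.ones(n), scores])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())

    # Overall F statistic for the joint effect of the k PCs.
    df1, df2 = k, n - k - 1
    f_stat = ((tss - rss) / df1) / (rss / df2)
    return {"n_pcs": k, "f_stat": f_stat, "df": (df1, df2), "pc_variances": pc_var[:k]}
```

Because neighboring SNPs in a candidate region tend to be correlated, a few PCs often capture most of the genotype variance, so the regression spends fewer degrees of freedom than one that includes every SNP. The returned F statistic can be compared against an F distribution with the reported degrees of freedom to obtain a p-value.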