Gauderman W James, Murcray Cassandra, Gilliland Frank, Conti David V
Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California 90033, USA.
Genet Epidemiol. 2007 Jul;31(5):383-95. doi: 10.1002/gepi.20219.
Current technology allows investigators to obtain genotypes at multiple single nucleotide polymorphisms (SNPs) within a candidate locus. Many approaches have been developed for using such data in a test of association with disease, ranging from genotype-based to haplotype-based tests. We develop a new approach that involves two basic steps. In the first step, we use principal components (PCs) analysis to compute combinations of SNPs that capture the underlying correlation structure within the locus. The second step uses the PCs directly in a test of disease association. The PC approach captures linkage-disequilibrium information within a candidate region, but does not require the difficult computing implicit in a haplotype analysis. We demonstrate by simulation that the PC approach is typically as powerful as, or more powerful than, both genotype- and haplotype-based approaches. We also analyze association between respiratory symptoms in children and four SNPs in the Glutathione-S-Transferase P1 locus, based on data from the Children's Health Study. We observe stronger evidence of an association using the PC approach (p = 0.044) than using either a genotype-based (p = 0.13) or haplotype-based (p = 0.052) approach.
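The two steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation; several details are assumptions made for illustration only: the additive 0/1/2 genotype coding, the 80% explained-variance rule for choosing how many PCs to keep, the logistic model with a likelihood-ratio test, and all function names (simulate_data, principal_components, pc_association_test). The data are simulated, not taken from the Children's Health Study.

```python
import numpy as np
from scipy.stats import chi2


def simulate_data(n=500, seed=0):
    """Hypothetical toy data: four correlated SNPs (coded 0/1/2) and a binary trait."""
    rng = np.random.default_rng(seed)
    base = rng.binomial(2, 0.3, size=n)
    # Each SNP copies the base genotype, shifted by +/-1 about 10% of the time.
    flip = rng.binomial(1, 0.1, size=(n, 4)) * rng.choice([-1, 1], size=(n, 4))
    genotypes = np.clip(base[:, None] + flip, 0, 2).astype(float)
    logit = -1.0 + 0.5 * genotypes[:, 0]
    disease = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit))).astype(float)
    return genotypes, disease


def principal_components(genotypes, var_threshold=0.8):
    """Step 1: scores on the leading PCs that together explain at least
    var_threshold of the total variance (the threshold is an assumption)."""
    centered = genotypes - genotypes.mean(axis=0)
    # SVD of the centered matrix yields the PCs of the SNP covariance matrix.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    k = min(int(np.searchsorted(explained, var_threshold)) + 1, len(s))
    return u[:, :k] * s[:k]


def logistic_loglik(X, y, n_iter=50, tol=1e-8):
    """Maximized log-likelihood of a logistic model, fit by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return float(np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))


def pc_association_test(genotypes, disease, var_threshold=0.8):
    """Step 2: likelihood-ratio test of the retained PCs against an
    intercept-only model (an assumed choice of test)."""
    scores = principal_components(genotypes, var_threshold)
    intercept = np.ones((len(disease), 1))
    full = np.column_stack([intercept, scores])
    lrt = 2.0 * (logistic_loglik(full, disease) - logistic_loglik(intercept, disease))
    df = scores.shape[1]
    return lrt, df, float(chi2.sf(lrt, df))


if __name__ == "__main__":
    genotypes, disease = simulate_data()
    lrt, df, p_value = pc_association_test(genotypes, disease)
    print(f"LRT statistic = {lrt:.3f} on {df} df, p = {p_value:.4f}")
```

Because the retained PCs are mutually uncorrelated and usually fewer than the number of SNPs, the test in this sketch spends fewer degrees of freedom than a joint test of all SNP genotypes. This is one plausible reading of why such an approach can gain power when the SNPs are in strong linkage disequilibrium.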