Pickrell Joseph, Clerget-Darpoux Françoise, Bourgain Catherine
INSERM, U535, Villejuif, France.
Genet Epidemiol. 2007 Nov;31(7):748-62. doi: 10.1002/gepi.20238.
Though multiple interacting loci are likely involved in the etiology of complex diseases, early genome-wide association studies (GWAS) have depended on detecting the marginal effect of each locus. Here, we evaluate the power of GWAS in the presence of two linked and potentially associated causal loci for several models of interaction between them, and find that interacting loci may give rise to marginal relative risks that are not generally considered in a one-locus model. To derive power under realistic situations, we use empirical data generated by the HapMap ENCODE project for both allele frequencies and linkage disequilibrium (LD) structure. The power is also evaluated in situations where the causal single nucleotide polymorphisms (SNPs) may not be genotyped, but are instead detected by proxy through a tag SNP (tSNP) in LD with them. A common simplification for such power computations assumes that the sample size necessary to detect the effect at the tSNP is the sample size necessary to detect the causal locus directly, divided by the LD measure r² between the two. This assumption, which we call the "proportionality assumption", is a simplification of the many factors that contribute to the strength of association at a marker, and has recently been criticized as unreasonable (Terwilliger and Hiekkalinna [2006] Eur J Hum Genet 14(4):426-437), in particular in the presence of interacting and associated loci. We find that this assumption does not introduce much error in single-locus models of disease, but may do so in certain two-locus models.
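The proportionality assumption described above can be illustrated with a minimal sketch. The function names `ld_r_squared` and `proxy_sample_size` are hypothetical and are not taken from the paper. The sketch uses the standard definition of r² from a haplotype frequency and two allele frequencies, and it encodes only the simplified rule itself, which the abstract reports may be inaccurate in certain two-locus models.

```python
import math


def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Standard LD measure r^2 between two biallelic loci.

    p_ab is the frequency of the haplotype carrying allele A at the first
    locus and allele B at the second; p_a and p_b are the allele frequencies.
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def proxy_sample_size(n_causal: int, r2: float) -> int:
    """Sample size at a tag SNP under the proportionality assumption.

    Assumption (illustrative only): N_tag = N_causal / r^2, rounded up.
    """
    if not 0 < r2 <= 1:
        raise ValueError("r2 must lie in (0, 1]")
    return math.ceil(n_causal / r2)


# Example: a causal SNP needing 1000 samples, tagged by a SNP with r^2 = 0.5.
print(proxy_sample_size(1000, 0.5))  # 2000
```

Under this rule, halving r² doubles the required sample size. Whether the rule holds for a given pair of interacting causal loci is exactly what the study examines.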