Department of Zoology, University of Cambridge, Cambridge CB2 3EJ, UK.
Proc Biol Sci. 2011 Apr 22;278(1709):1183-8. doi: 10.1098/rspb.2010.1920. Epub 2010 Oct 6.
Technological developments allow increasing numbers of markers to be deployed in case-control studies searching for genetic factors that influence disease susceptibility. However, with vast numbers of markers, true 'hits' may become lost in a sea of false positives. This problem may be particularly acute for infectious diseases, where the control group may contain unexposed individuals with susceptible genotypes. To explore this effect, we used a series of stochastic simulations to model a scenario based loosely on bovine tuberculosis. We find that a candidate gene approach tends to have greater statistical power than genome-wide association tests that use large numbers of single nucleotide polymorphisms (SNPs), almost regardless of the number of SNPs deployed. With modest sample sizes (250 each of cases and controls), both approaches struggle to detect genetic effects when these are weak or when an appreciable proportion of individuals are unexposed to the disease, but these issues are largely mitigated if sample sizes can be increased to 2000 or more of each class. We conclude that the power of any genotype-phenotype association test will be improved if the sampling strategy takes account of exposure heterogeneity, though this is not necessarily easy to do.
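The interplay of exposure, sample size and multiple-testing burden described above can be illustrated with a minimal simulation sketch. This is not the authors' model: it assumes a single causal SNP in Hardy-Weinberg equilibrium, a multiplicative per-allele risk, an allelic chi-square test, and a Bonferroni correction to stand in for a genome-wide scan. All function names and parameter values (`base_risk`, `rr`, the allele frequency `p`, the 100,000 tests) are hypothetical choices made for illustration only.

```python
import math
import random


def genotype_weights(p, base_risk, rr, exposure):
    """Return (case_weights, control_weights) over 0, 1, 2 risk-allele copies."""
    hwe = [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]  # assumed Hardy-Weinberg frequencies
    # An animal becomes a case only if it is exposed AND succumbs to infection.
    infect = [exposure * min(1.0, base_risk * rr ** k) for k in range(3)]
    cases = [f * q for f, q in zip(hwe, infect)]
    # Controls are everyone else, including unexposed animals with susceptible genotypes.
    controls = [f * (1 - q) for f, q in zip(hwe, infect)]
    return cases, controls


def allelic_pvalue(a, b, c, d):
    """Chi-square test (1 df) on a 2x2 table of allele counts:
    a, b = risk / other alleles in cases; c, d = the same in controls."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # degenerate table: no evidence of association
    chi2 = n * (a * d - b * c) ** 2 / denom
    return math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df


def simulate_pvalues(n, p, base_risk, rr, exposure, reps, rng):
    """Draw n cases and n controls per replicate; return one p-value per replicate.
    Requires exposure > 0, otherwise no cases can arise."""
    case_w, ctrl_w = genotype_weights(p, base_risk, rr, exposure)
    pvals = []
    for _ in range(reps):
        # Each draw is a genotype coded as its number of risk alleles.
        a = sum(rng.choices(range(3), weights=case_w, k=n))
        c = sum(rng.choices(range(3), weights=ctrl_w, k=n))
        pvals.append(allelic_pvalue(a, 2 * n - a, c, 2 * n - c))
    return pvals


def power(pvals, alpha, n_tests=1):
    """Fraction of replicates significant after Bonferroni correction for n_tests."""
    threshold = alpha / n_tests
    return sum(pv < threshold for pv in pvals) / len(pvals)


if __name__ == "__main__":
    rng = random.Random(1)
    for exposure in (1.0, 0.2):
        pv = simulate_pvalues(250, 0.3, 0.2, 1.5, exposure, 200, rng)
        # Candidate gene (1 test) versus a hypothetical scan of 100,000 SNPs.
        print(exposure, power(pv, 0.05), power(pv, 0.05, n_tests=100_000))
```

In this sketch the exposure term cancels out of the case genotype distribution, so lowering `exposure` only pulls the control group back towards the population allele frequency, which is one way to picture how unexposed susceptible animals dilute the signal. Because both thresholds are applied to the same replicate p-values, the single-test power can never fall below the corrected one.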