Department of Epidemiology and Public Health, Yale University, New Haven, CT 06520, USA.
BMC Genet. 2005 Dec 30;6 Suppl 1(Suppl 1):S26. doi: 10.1186/1471-2156-6-S1-S26.
Single-nucleotide polymorphisms (SNPs) are an attractive class of genetic markers for population genetic studies and for identifying the genetic variation underlying complex traits. However, the usefulness and efficiency of SNPs relative to microsatellites in different scientific contexts, e.g., population structure inference or association analysis, must still be evaluated systematically through large empirical studies. In this article, we use the Collaborative Studies on Genetics of Alcoholism (COGA) data from Genetic Analysis Workshop 14 (GAW14) to compare the performance of microsatellites and SNPs across the whole human genome in the context of population structure inference. A total of 328 microsatellites and 15,840 SNPs are used to infer population structure in 236 unrelated individuals. We find that, on average, the informativeness of random microsatellites is four to twelve times that of random SNPs for various population comparisons, which is consistent with previous studies. Our results also indicate that, in the combined set of microsatellites and SNPs, SNPs constitute the majority of the most informative markers, and that using these SNPs leads to better inference of population structure than using microsatellites. We also find that including less informative markers may add noise and worsen the results.
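The abstract does not say which informativeness measure is used. As a minimal sketch, the code below assumes the informativeness for assignment statistic I_n of Rosenberg et al. (2003), a common choice for comparing markers in ancestry inference. The function names and the allele frequencies are hypothetical and chosen only for illustration; they are not taken from the COGA data.

```python
import math


def informativeness_for_assignment(freqs):
    """Compute I_n for one marker (assumed measure, see lead-in).

    freqs: list of K populations, each a list of allele frequencies
           (same allele order in every population, each summing to 1).
    Uses natural logarithms and the convention 0 * log(0) = 0.
    """
    k = len(freqs)
    n_alleles = len(freqs[0])
    total = 0.0
    for j in range(n_alleles):
        # Average frequency of allele j across the K populations.
        p_bar = sum(pop[j] for pop in freqs) / k
        if p_bar > 0:
            total -= p_bar * math.log(p_bar)
        for pop in freqs:
            if pop[j] > 0:
                total += (pop[j] / k) * math.log(pop[j])
    return total


def rank_markers(markers):
    """Return marker names sorted from most to least informative.

    markers: dict mapping a marker name to its per-population frequencies.
    """
    return sorted(
        markers,
        key=lambda name: informativeness_for_assignment(markers[name]),
        reverse=True,
    )


# Hypothetical two-population frequencies, for illustration only.
example_markers = {
    "snp_uninformative": [[0.5, 0.5], [0.5, 0.5]],
    "snp_divergent": [[0.9, 0.1], [0.2, 0.8]],
    "microsatellite": [[0.4, 0.3, 0.2, 0.1], [0.1, 0.2, 0.3, 0.4]],
}
ranking = rank_markers(example_markers)
```

Under this measure a marker with identical frequencies in every population scores 0, and a marker with a private, fixed allele in each of two populations scores log 2. Ranking markers this way and keeping only the top of the list is one way to act on the observation that less informative markers can add noise.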