Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA.
BMC Genet. 2005 Dec 30;6 Suppl 1(Suppl 1):S84. doi: 10.1186/1471-2156-6-S1-S84.
Accurately resolving population structure in a sample is important for both linkage and association studies. In this study we investigated the power of single-nucleotide polymorphisms (SNPs) in detecting population structure in a sample of 286 unrelated individuals. We varied the number of SNPs to determine how many are required to approach the degree of resolution obtained with the Collaborative Study on the Genetics of Alcoholism (COGA) short tandem repeat polymorphisms (STRPs). In addition, we selected SNPs with varying minor allele frequencies (MAFs) to determine whether low or high frequency SNPs are more efficient in resolving population structure. We conclude that a set of at least 100 evenly spaced SNPs with MAFs of 40-50% is required to resolve population structure in this dataset. If SNPs with lower MAFs are used, then more than 250 SNPs may be required to obtain reliable results.
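The MAF-based selection step described above can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes genotypes are coded as counts of one reference allele per individual (0, 1, or 2, with `None` for a missing call), and the names `minor_allele_frequency`, `select_snps_by_maf`, and the toy SNP labels are hypothetical, introduced only for illustration.

```python
def minor_allele_frequency(genotypes):
    """Return the MAF for one SNP, or None if no genotype was called.

    Assumption: each genotype is the count (0, 1, or 2) of one chosen
    allele in a diploid individual; None marks a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    # Frequency of the counted allele across 2 * N chromosomes.
    freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is rarer.
    return min(freq, 1 - freq)


def select_snps_by_maf(snp_genotypes, low=0.40, high=0.50):
    """Return names of SNPs whose MAF lies in [low, high].

    The default range mirrors the 40-50% MAF bin named in the abstract.
    """
    selected = []
    for name, genotypes in snp_genotypes.items():
        maf = minor_allele_frequency(genotypes)
        if maf is not None and low <= maf <= high:
            selected.append(name)
    return selected


# Toy data (hypothetical, eight individuals per SNP).
toy = {
    "snp_a": [0, 1, 2, 1, 1, 0, 2, 1],     # allele frequency 8/16
    "snp_b": [0, 0, 0, 1, 0, 0, 0, 0],     # allele frequency 1/16
    "snp_c": [2, 2, 1, 2, 1, None, 2, 2],  # one missing call
    "snp_d": [1, 1, 0, 2, 0, 1, 1, 0],     # allele frequency 6/16
}

high_maf_snps = select_snps_by_maf(toy)
```

In a real analysis, the selected markers would also need to be evenly spaced along the genome, as the abstract specifies; this sketch covers only the frequency filter.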