Payseur Bret A, Jing Peicheng
Laboratory of Genetics, University of Wisconsin, WI, USA.
Mol Biol Evol. 2009 Jun;26(6):1369-77. doi: 10.1093/molbev/msp052. Epub 2009 Mar 16.
Patterns of population structure provide insights into evolutionary processes and help identify groups of individuals for genotype-phenotype association studies. With increasing availability of polymorphic molecular markers across genomes, the examination of population structure using large numbers of unlinked loci has become a common practice in evolutionary biology and human genetics. The two classes of molecular variation most widely used for this purpose, short tandem repeat polymorphisms (STRPs) and single-nucleotide polymorphisms (SNPs), differ in mutational properties expected to affect population structure. To measure the relative ability of these loci to describe population structure, we compared diversity at neighboring STRPs and SNPs from 720 genomic regions in the four populations that comprise the Human HapMap. Comparing loci from the same genomic regions allowed us to focus on the contribution of mutational differences (rather than variation in genealogical history) to disparities in population structure between STRPs and SNPs. Relative to average values for SNPs from the same regions, STRPs had lower F(st), but higher G(st)' and I(n) values. STRP-SNP correlations in population structure across genomic regions were statistically significant but weak in magnitude. Separate analyses by repeat type showed that these correlations were driven primarily by tetranucleotide and trinucleotide STRPs; measures of population structure at dinucleotides and SNPs were not significantly correlated. Pairwise comparisons among populations revealed effects of divergence time on differences in population structure between STRPs and SNPs. Collectively, these results confirm that individual STRPs can provide more information about population structure than individual SNPs, but suggest that the difference in structure at STRPs and SNPs depends on local genealogical history. 
Our study motivates theoretical comparisons of population structure at loci with different mutational properties.
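The contrast reported above (lower F(st) but higher G(st)' and I(n) at STRPs) can be illustrated with a minimal sketch that computes three such statistics from per-population allele frequencies. Several assumptions apply: the abstract does not say which estimators were used, so this sketch uses Nei's G(st) as a simple stand-in for F(st), Hedrick's standardization for G(st)', and the Rosenberg et al. informativeness for assignment for I(n). The function names and the toy allele frequencies are hypothetical and are not data from the study; equal population weights and no sample-size correction are assumed.

```python
import math


def heterozygosity(freqs):
    """Expected heterozygosity, 1 - sum(p^2), for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def xlogx(p):
    """p * ln(p), with the usual convention that 0 * ln(0) = 0."""
    return p * math.log(p) if p > 0 else 0.0


def structure_stats(pops):
    """Return (G_st, G_st', I_n) for one locus.

    pops: list of allele-frequency lists, one list per population,
    all listing the same alleles in the same order.
    Assumption: populations are weighted equally; no sample-size correction.
    """
    k = len(pops)
    n_alleles = len(pops[0])
    # Mean allele frequencies across populations
    mean = [sum(pop[j] for pop in pops) / k for j in range(n_alleles)]

    h_s = sum(heterozygosity(pop) for pop in pops) / k  # mean within-population diversity
    h_t = heterozygosity(mean)                          # diversity of the pooled frequencies
    if h_t == 0:
        raise ValueError("locus is monomorphic; statistics are undefined")

    g_st = (h_t - h_s) / h_t
    # Standardized G_st: G_st divided by its maximum given h_s and k
    g_st_prime = g_st * (k - 1 + h_s) / ((k - 1) * (1 - h_s))
    # Informativeness for assignment
    i_n = sum(
        -xlogx(mean[j]) + sum(xlogx(pop[j]) for pop in pops) / k
        for j in range(n_alleles)
    )
    return g_st, g_st_prime, i_n


# Hypothetical biallelic (SNP-like) locus in two populations
snp_like = [[0.9, 0.1], [0.1, 0.9]]
# Hypothetical multiallelic (STRP-like) locus with no shared alleles
strp_like = [[0.25] * 4 + [0.0] * 4, [0.0] * 4 + [0.25] * 4]

for name, locus in (("SNP-like", snp_like), ("STRP-like", strp_like)):
    g_st, g_st_prime, i_n = structure_stats(locus)
    print(f"{name}: G_st={g_st:.3f}  G_st'={g_st_prime:.3f}  I_n={i_n:.3f}")
```

In this toy case the multiallelic locus shares no alleles between populations, yet its G(st) is only about 0.14 because within-population diversity is high, while its G(st)' reaches 1. This is one way high diversity can pull an unstandardized measure down while a standardized measure rises, in line with the direction of the difference the abstract reports.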