Hubbard Center for Genome Studies, University of New Hampshire, Durham, NH 03824, USA.
BMC Genomics. 2010 Dec 3;11:691. doi: 10.1186/1471-2164-11-691.
Simple sequence repeats (SSRs) are highly variable features of all genomes. Their rapid evolution makes them useful for tracing the evolutionary history of populations and investigating patterns of selection and mutation across genomes. The recently sequenced Daphnia pulex genome provides us with a valuable data set to study the mode and tempo of SSR evolution, without the inherent biases that accompany marker selection.
Here we catalogue SSR loci in the Daphnia pulex genome with motif sizes of 1-100 nucleotides and a minimum of 3 perfect repeats. We then use whole-genome shotgun reads to determine the average heterozygosity of each SSR type and its relationship to repeat number, motif size, motif sequence, and the distribution of SSR loci. We find that SSR heterozygosity is motif specific and positively correlated with both repeat number and motif size. For non-repeat-unit polymorphisms, we identify a motif-dependent end-nucleotide polymorphism bias that may contribute to the patterns of abundance for specific homopolymers, dimers, and trimers. Our observations confirm the high frequency of multiple-unit (multistep) variation at large microsatellite loci, and further show that the occurrence of multiple-unit variation depends on both repeat number and motif size. Using the Daphnia pulex genetic map, we show a positive correlation between dimer and trimer frequency and recombination.
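The cataloguing criteria above (motif sizes of 1-100 nucleotides, at least 3 perfect repeats) can be illustrated with a minimal Python sketch. It is not the software used in the study, which the abstract does not name. The function names, the tuple layout of the result, and the choice to skip motifs that are themselves repeats of a shorter unit (so a homopolymer is not also reported as a dimer) are assumptions made for illustration.

```python
def is_primitive(motif):
    """Return True if the motif is not itself a repetition of a shorter unit."""
    # A string s is a repeat of a shorter unit exactly when s occurs
    # inside s + s at an offset strictly between 0 and len(s).
    return (motif + motif).find(motif, 1) == len(motif)


def find_perfect_ssrs(seq, min_motif=1, max_motif=100, min_repeats=3):
    """Scan a sequence for perfect tandem repeats.

    Returns a list of (start, motif, repeat_count) tuples sorted by start,
    where start is a 0-based offset. Hypothetical helper, illustration only.
    """
    seq = seq.upper()
    n = len(seq)
    loci = []
    for k in range(min_motif, max_motif + 1):
        i = 0
        # Stop once min_repeats copies of a k-mer no longer fit in the sequence.
        while i + k * min_repeats <= n:
            motif = seq[i:i + k]
            # Assumption: skip ambiguous bases and non-primitive motifs.
            if "N" in motif or not is_primitive(motif):
                i += 1
                continue
            # Count consecutive exact copies of the motif.
            repeats = 1
            while seq[i + repeats * k:i + (repeats + 1) * k] == motif:
                repeats += 1
            if repeats >= min_repeats:
                loci.append((i, motif, repeats))
                i += repeats * k  # jump past the locus to avoid shifted duplicates
            else:
                i += 1
    return sorted(loci)
```

Calling `find_perfect_ssrs("GGTACACACACTTTTTGCAGCAGCAGT")` reports one dimer, one homopolymer, and one trimer locus. The scan is quadratic in the worst case and is meant only to make the selection criteria concrete, not to process a whole genome.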
This genome-wide analysis of SSR variation in Daphnia pulex indicates that several aspects of SSR variation are motif dependent, and suggests that a combination of unit-length variation and end-repeat-biased base substitution contributes to the unique spectrum of SSR loci.