Li ShuXian, Yin TongMing
College of Environment and Forest Resources, Nanjing Forestry University, Nanjing 210037, China.
Sci China C Life Sci. 2007 Oct;50(5):690-9. doi: 10.1007/s11427-007-0073-6.
We mapped and analyzed the microsatellites (simple sequence repeats, SSRs) across 284,295,605 base pairs of unambiguously assembled sequence scaffolds along the 19 chromosomes of the haploid poplar genome. In total, we found 150,985 SSRs with repeat unit lengths between 2 and 5 bp. The resulting microsatellite physical map showed that SSRs were distributed relatively evenly across the Populus genome. On average, these SSRs occurred once every 1883 bp, and SSRs in intergenic regions, introns, exons and UTRs accounted for 85.4%, 10.7%, 2.7% and 1.2% of the total, respectively. We treated di-, tri-, tetra- and pentamers as four classes of repeat units and found that SSR density decreased as repeat unit length increased, except for tetranucleotide repeats. Notably, the length diversification of microsatellite sequences was negatively correlated with repeat unit length, and SSRs with shorter repeat units gained repeats faster than those with longer repeat units. We also found that the GC content of poplar sequence correlated significantly with the densities of SSRs with odd repeat unit lengths (tri- and pentamers), but not with the densities of SSRs with even repeat unit lengths (di- and tetramers). In the poplar genome, there was evidence that the occurrence of different microsatellites was under selection, and the GC content of SSR sequences was significantly related to the functional importance of microsatellites.
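The average spacing quoted in the abstract follows directly from the two totals it reports, and the kind of scan that yields such counts can be illustrated in a few lines. The Python sketch below is only an illustration and not the authors' pipeline: the function names `mean_spacing`, `is_primitive` and `find_ssrs`, and the minimum repeat threshold used in the usage example, are assumptions, since the abstract does not state how SSRs were detected or what thresholds were applied.

```python
import re

# Totals quoted in the abstract.
ASSEMBLED_BP = 284_295_605
SSR_COUNT = 150_985


def mean_spacing(total_bp, n_ssrs):
    """Average number of base pairs per SSR."""
    return total_bp / n_ssrs


def is_primitive(unit):
    """True if the motif is not itself a repetition of a shorter motif.

    This keeps 'AA' or 'ATAT' from being counted as a di- or
    tetranucleotide unit.
    """
    n = len(unit)
    return not any(
        n % d == 0 and unit[:d] * (n // d) == unit for d in range(1, n)
    )


def find_ssrs(seq, min_repeats):
    """Return sorted (start, unit, repeat_count) tuples for perfect repeats.

    min_repeats maps unit length -> minimum number of tandem copies.
    The thresholds are hypothetical; the abstract does not give them.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_copies in min_repeats.items():
        # One motif of unit_len bases, followed by at least
        # (min_copies - 1) further exact copies of it.
        pattern = re.compile(
            r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1)
        )
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if is_primitive(unit):
                copies = len(match.group(0)) // unit_len
                hits.append((match.start(), unit, copies))
    return sorted(hits)


# About 1883 bp per SSR, matching the figure in the abstract.
print(round(mean_spacing(ASSEMBLED_BP, SSR_COUNT)))

# Toy sequence: five tandem copies of "AT" flanked by other bases.
print(find_ssrs("GG" + "AT" * 5 + "CC", {2: 4}))
```

The scan only reports perfect repeats; imperfect or compound microsatellites would need a different approach.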