Coenye Tom, Vandamme Peter
Laboratorium voor Microbiologie, Universiteit Gent K.L., Gent, Belgium.
DNA Res. 2005;12(4):221-33. doi: 10.1093/dnares/dsi009.
The increasing availability of prokaryotic genome sequences has shown that simple sequence repeats (SSRs) are widespread in prokaryotes and that their length, number and distribution vary extensively. Considering their potential importance in generating genomic diversity, we determined the distribution of a specific group of SSRs, mononucleotide repeats between 5 and 13 nt in length, in 157 sequenced prokaryotic genomes. The data obtained in the present study show that (i) a large number of mononucleotide SSRs are present in all prokaryotic genomes investigated, (ii) shorter repeats are much more abundant than longer repeats, and (iii) in the majority of the genomes, longer mononucleotide SSRs are excluded from coding regions, although we identified several organisms in which they are not. We also observed that some genomes contain more mononucleotide SSRs than expected, while others contain significantly fewer. Bacterial genomes that contain far fewer mononucleotide SSRs than expected are generally larger and more GC-rich, while bacterial genomes that contain far more mononucleotide SSRs than expected are generally smaller and more AT-rich. Finally, we also noted that genomes containing a high fraction of horizontally transferred genes have a lower mononucleotide SSR density and that A and T are generally overrepresented in mononucleotide SSRs.
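The abstract does not say how the repeats were detected, so the following is only a minimal sketch of one way to tally mononucleotide repeats of 5 to 13 nt in a sequence. The function names `count_mononucleotide_ssrs` and `ssr_density` are hypothetical, and two choices are assumptions rather than the authors' method: each maximal run of a single base is counted once, and runs longer than 13 nt are ignored instead of being truncated.

```python
import re
from collections import Counter

# Matches maximal runs of a single base (A, C, G or T).
# Ambiguity codes such as N are not matched and simply break a run.
RUN = re.compile(r"A+|C+|G+|T+")


def count_mononucleotide_ssrs(seq, min_len=5, max_len=13):
    """Count maximal single-base runs with min_len <= length <= max_len.

    Returns a Counter keyed by (base, run_length).
    Assumption: runs longer than max_len are skipped, not truncated.
    """
    counts = Counter()
    for match in RUN.finditer(seq.upper()):
        run = match.group()
        if min_len <= len(run) <= max_len:
            counts[(run[0], len(run))] += 1
    return counts


def ssr_density(seq, min_len=5, max_len=13):
    """Number of qualifying repeats per 1000 nt of sequence (0.0 if empty)."""
    if not seq:
        return 0.0
    total = sum(count_mononucleotide_ssrs(seq, min_len, max_len).values())
    return 1000 * total / len(seq)


if __name__ == "__main__":
    # Toy sequence: one A5 run, one T7 run, and a G4 run that is too short.
    toy = "AAAAA" + "GC" + "TTTTTTT" + "GGGG" + "CC"
    counts = count_mononucleotide_ssrs(toy)
    print(counts[("A", 5)], counts[("T", 7)], counts[("G", 4)])  # 1 1 0
    print(ssr_density(toy))  # 100.0
```

Splitting the counts by base and by length makes it straightforward to check observations of the kind reported above, such as shorter repeats outnumbering longer ones or A and T runs being overrepresented. Comparing coding and non-coding regions would additionally require gene coordinates, which this sketch does not handle.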