Wagner Andreas
Department of Biology, The University of New Mexico, USA.
Mol Biol Evol. 2006 Apr;23(4):723-33. doi: 10.1093/molbev/msj085. Epub 2005 Dec 22.
Most previous work on the evolution of mobile DNA was limited by incomplete sequence information. Whole-genome sequences allow us to overcome this limitation. I study the nucleotide diversity of prominent members of five insertion sequence families whose transposition activity is encoded by a single transposase gene. Eighteen of 376 completely sequenced bacterial genomes and plasmids carry between 3 and 20 copies of a given insertion sequence. I show that these copies generally have very low DNA divergence. Specifically, more than 68% of the transposase genes are identical within a genome. The average number of amino acid replacement substitutions per replacement site is Ka = 0.013; the average number of silent substitutions per silent site is Ks = 0.1. This low intragenomic diversity stands in stark contrast to the much higher divergence of the same insertion sequences among distantly related genomes. Gene conversion among protein-coding genes is unlikely to account for this lack of diversity. The relation between transposition frequencies and silent substitution rates suggests that most insertion sequences in a typical genome are evolutionarily young and have been recently acquired. They may undergo periodic extinction in bacterial lineages. By implication, they are detrimental to their host in the long run. This is also suggested by the highly skewed and patchy distribution of insertion sequences among genomes. In sum, one can think of insertion sequences as slow-acting infectious diseases of cell lineages.
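As an illustration of the kind of intragenomic comparison summarized above, the following minimal sketch computes two simple statistics over the copies of one insertion sequence within a single genome: the fraction of pairwise comparisons that are identical, and the mean pairwise proportion of differing sites (the p-distance). It is not the method of the paper. The function names, the pre-aligned input, and the toy sequences are all assumptions made for illustration, and the p-distance is a cruder measure than Ka or Ks, which require a codon-aware substitution model.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Assumes both sequences come from the same alignment (equal length);
    columns with a gap ('-') in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no ungapped sites to compare")
    return sum(x != y for x, y in compared) / len(compared)


def intragenomic_diversity(copies: list[str]) -> tuple[float, float]:
    """Return (fraction of identical pairs, mean pairwise p-distance).

    `copies` holds the aligned copies of one insertion sequence found
    in a single genome (hypothetical input format).
    """
    pairs = list(combinations(copies, 2))
    if not pairs:
        raise ValueError("need at least two copies to compare")
    distances = [p_distance(a, b) for a, b in pairs]
    identical_fraction = sum(d == 0 for d in distances) / len(distances)
    mean_distance = sum(distances) / len(distances)
    return identical_fraction, mean_distance


# Toy example with made-up sequences: two identical copies and one
# copy that differs at a single site.
toy_copies = ["ATGAAACGT", "ATGAAACGT", "ATGAAACGA"]
identical_fraction, mean_distance = intragenomic_diversity(toy_copies)
print(f"identical pairs: {identical_fraction:.3f}")
print(f"mean p-distance: {mean_distance:.4f}")
```

A low mean distance combined with a high fraction of identical pairs is the pattern the abstract describes for copies within a genome. Comparing copies from distantly related genomes with the same function would be expected to give larger distances.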