Huff Jason T, Zilberman Daniel, Roy Scott W
Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA.
California Institute for Quantitative Biosciences, University of California, Berkeley, California 94720, USA.
Nature. 2016 Oct 27;538(7626):533-536. doi: 10.1038/nature20110. Epub 2016 Oct 19.
The discovery of introns four decades ago was one of the most unexpected findings in molecular biology. Introns are sequences interrupting genes that must be removed as part of messenger RNA production. Genome sequencing projects have shown that most eukaryotic genes contain at least one intron, and frequently many. Comparison of these genomes reveals a history of long evolutionary periods during which few introns were gained, punctuated by episodes of rapid, extensive gain. However, although several detailed mechanisms for such episodic intron generation have been proposed, none has been empirically supported on a genomic scale. Here we show how short, non-autonomous DNA transposons independently generated hundreds to thousands of introns in the prasinophyte Micromonas pusilla and the pelagophyte Aureococcus anophagefferens. Each transposon carries one splice site. The other splice site is co-opted from the gene sequence that is duplicated upon transposon insertion, allowing perfect splicing out of the RNA. The distributions of sequences that can be co-opted are biased with respect to codons, and phasing of transposon-generated introns is similarly biased. These transposons insert between pre-existing nucleosomes, so that multiple nearby insertions generate nucleosome-sized intervening segments. Thus, transposon insertion and sequence co-option may explain the intron phase biases and prevalence of nucleosome-sized exons observed in eukaryotes. Overall, the two independent examples of proliferating elements illustrate a general DNA transposon mechanism that can plausibly account for episodes of rapid, extensive intron gain during eukaryotic evolution.
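The splice-site co-option described above can be illustrated with a toy string model. This is only a sketch under stated assumptions, not the authors' analysis: the gene and element sequences, the 4-base duplication length, the choice that the element carries the 3' `AG` while the 5' `GT` is co-opted from the duplicated gene sequence, and the helper names `insert_transposon`, `find_perfect_introns`, and `splice` are all hypothetical. Phase is taken as the intron start position modulo 3, assuming the coding sequence begins at index 0.

```python
def insert_transposon(gene: str, pos: int, tsd_len: int, element: str) -> str:
    """Insert `element` at `pos`, duplicating the `tsd_len` target-site bases on both sides."""
    tsd = gene[pos:pos + tsd_len]
    return gene[:pos] + tsd + element + tsd + gene[pos + tsd_len:]


def find_perfect_introns(inserted: str, pos: int, tsd_len: int, element_len: int):
    """Return (start, end, phase) for each GT...AG window whose removal restores the gene.

    Any window of length element_len + tsd_len that starts inside the first
    duplicated copy restores the original sequence when removed; we keep only
    windows bounded by canonical GT...AG splice sites.
    """
    intron_len = element_len + tsd_len
    hits = []
    for offset in range(tsd_len + 1):
        start = pos + offset
        end = start + intron_len
        intron = inserted[start:end]
        if intron.startswith("GT") and intron.endswith("AG"):
            hits.append((start, end, start % 3))  # phase: assumes CDS starts at index 0
    return hits


def splice(seq: str, start: int, end: int) -> str:
    """Remove the half-open interval [start, end) from the sequence."""
    return seq[:start] + seq[end:]


# Hypothetical sequences chosen for illustration only.
gene = "ATGGCAGGTGAACTGTAA"
element = "CCC" + "T" * 7 + "CAG"  # toy element carrying a 3' AG splice site
inserted = insert_transposon(gene, pos=7, tsd_len=4, element=element)

hits = find_perfect_introns(inserted, pos=7, tsd_len=4, element_len=len(element))
print(hits)  # -> [(7, 24, 1)]

start, end, phase = hits[0]
print(splice(inserted, start, end) == gene)  # -> True
```

In this toy case the duplicated bases `GTGA` supply the 5' `GT`, so the spliced transcript matches the original gene exactly, and the resulting phase (1) is fixed by where the co-optable sequence sits relative to the codons.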