National Center for Biotechnology Information NLM/NIH, Bethesda, MD, USA.
Wiley Interdiscip Rev RNA. 2013 Jan-Feb;4(1):93-105. doi: 10.1002/wrna.1143. Epub 2012 Nov 8.
In eukaryotes, protein-coding sequences are interrupted by non-coding sequences known as introns. During mRNA maturation, introns are excised by the spliceosome, and the coding regions, exons, are spliced together to form the mature coding region. Intron densities differ widely between eukaryotic lineages, from 6 to 7 introns per kb of coding sequence in vertebrates, some invertebrates and green plants, to only a few introns across the entire genome in many unicellular eukaryotes. Evolutionary reconstructions using maximum likelihood methods suggest intron-rich ancestors for each major group of eukaryotes. For the last common ancestor of animals, the highest intron density of all extant and extinct eukaryotes was inferred, at 120-130% of the human intron density. Furthermore, an intron density of 53-74% of the human value was inferred for the last eukaryotic common ancestor. Accordingly, evolution of eukaryotic genes in all lines of descent involved primarily intron loss, with substantial gain only at the bases of several branches, including plants and animals. These conclusions have substantial biological implications: they indicate that the common ancestor of all modern eukaryotes was a complex organism with a gene architecture resembling that of multicellular organisms. Alternative splicing most likely first appeared as an inevitable result of splicing errors and was only later employed to generate structural and functional diversification of proteins.