Okamura Kohji, Feuk Lars, Marquès-Bonet Tomàs, Navarro Arcadi, Scherer Stephen W
The Centre for Applied Genomics, Program in Genetics and Genomic Biology, The Hospital for Sick Children, Toronto, Canada ON M5G 1L7; Department of Molecular and Medical Genetics, University of Toronto, Toronto, Canada ON M5S 1A8.
Unitat de Biologia Evolutiva, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, 08003 Barcelona, Spain.
Genomics. 2006 Dec;88(6):690-697. doi: 10.1016/j.ygeno.2006.06.009. Epub 2006 Aug 4.
Genomic duplication, followed by divergence, contributes to organismal evolution. Several mechanisms, such as exon shuffling and alternative splicing, are responsible for novel gene functions, but they generate homologous domains and do not usually lead to drastic innovation. Major novelties can potentially be introduced by frameshift mutations and this idea can explain the creation of novel proteins. Here, we employ a strategy using simulated protein sequences and identify 470 human and 108 mouse frameshift events that originate new gene segments. No obvious interspecies overlap was observed, suggesting high rates of acquisition of evolutionary events. This inference is supported by a deficiency of TpA dinucleotides in the protein-coding sequences, which decreases the occurrence of translational termination, even on the complementary strand. Increased usage of the TGA codon as the termination signal in newer genes also supports our inference. This suggests that tolerated frameshift changes are a prevalent mechanism for the rapid emergence of new genes and that protein-coding sequences can be derived from existing or ancestral exons rather than from events that result in noncoding sequences becoming exons.
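The link the abstract draws between TpA deficiency and fewer translational stops can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the observed/expected normalization, and the toy sequences are assumptions made here for illustration. It relies only on the standard genetic code, in which two of the three stop codons (TAA, TAG) contain TpA, as do the reverse complements of those same two codons (TTA, CTA).

```python
# Illustrative sketch only; names and the obs/exp normalization are assumptions,
# not the method used in the paper.

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def tpa_obs_exp(seq: str) -> float:
    """Observed TpA count divided by the count expected from base frequencies."""
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        return 0.0
    # Expected count assumes independent neighbouring bases.
    expected = (seq.count("T") / n) * (seq.count("A") / n) * (n - 1)
    # Count overlapping-safe occurrences of the dinucleotide TA.
    observed = sum(1 for i in range(n - 1) if seq[i:i + 2] == "TA")
    return observed / expected if expected else 0.0


def stops_per_frame(seq: str) -> list[int]:
    """Count stop codons in each of the three forward reading frames."""
    seq = seq.upper()
    counts = []
    for frame in range(3):
        codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
        counts.append(sum(1 for codon in codons if codon in STOP_CODONS))
    return counts


if __name__ == "__main__":
    toy = "ATGTAAGTGA"  # hypothetical toy sequence
    print("TpA obs/exp:", round(tpa_obs_exp(toy), 3))
    print("stops, forward frames:", stops_per_frame(toy))
    print("stops, reverse frames:", stops_per_frame(reverse_complement(toy)))
```

A ratio below 1 from `tpa_obs_exp` indicates TpA depletion. Applying `stops_per_frame` to a sequence and to its reverse complement shows how often a shifted or antisense frame would run into a stop, which is the quantity the abstract's argument turns on.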