Wen-Yu Chung, Samir Wadhawan, Radek Szklarczyk, Sergei Kosakovsky Pond, Anton Nekrutenko
Center for Comparative Genomics and Bioinformatics, The Pennsylvania State University, University Park, Pennsylvania, United States of America.
PLoS Comput Biol. 2007 May;3(5):e91. doi: 10.1371/journal.pcbi.0030091.
Coding of multiple proteins by overlapping reading frames is not a feature one would associate with eukaryotic genes. Indeed, codependency between codons of overlapping protein-coding regions imposes a unique set of evolutionary constraints, making it a costly arrangement. Yet in cases of tightly coexpressed interacting proteins, dual coding may be advantageous. Here we show that although dual coding is nearly impossible by chance, a number of human transcripts contain overlapping coding regions. Using newly developed statistical techniques, we identified 40 candidate genes with evolutionarily conserved overlapping coding regions. Because our approach is conservative, we expect mammals to possess more dual-coding genes. Our results emphasize that the skepticism surrounding eukaryotic dual coding is unwarranted: rather than being artifacts, overlapping reading frames are often hallmarks of fascinating biology.
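To make the notion of overlapping reading frames concrete, the following minimal Python sketch scans the three forward frames of a nucleotide sequence for ATG-to-stop open reading frames and reports pairs from different frames that share nucleotides. It is an illustration only, not the statistical method used in the study: it ignores the reverse strand, splicing, and evolutionary conservation, and the function names (`find_orfs`, `dual_coding_regions`) and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, frame, min_codons=1):
    """Return half-open (start, end) intervals of ATG...stop ORFs in one forward frame.

    `min_codons` counts sense codons (the start ATG included, the stop excluded).
    """
    seq = seq.upper()
    orfs = []
    start = None
    # Step through complete codons of the chosen frame only.
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOP_CODONS:
            end = i + 3  # the interval includes the stop codon
            if (end - start) // 3 - 1 >= min_codons:
                orfs.append((start, end))
            start = None
    return orfs


def dual_coding_regions(seq, min_overlap=1):
    """Return (frame1, orf1, frame2, orf2, overlap) for ORFs in different frames
    that share at least `min_overlap` nucleotides."""
    by_frame = {f: find_orfs(seq, f) for f in range(3)}
    hits = []
    for f1 in range(3):
        for f2 in range(f1 + 1, 3):
            for a in by_frame[f1]:
                for b in by_frame[f2]:
                    lo, hi = max(a[0], b[0]), min(a[1], b[1])
                    if hi - lo >= min_overlap:
                        hits.append((f1, a, f2, b, (lo, hi)))
    return hits


# Toy sequence (an assumption for illustration, not real transcript data):
# frame 0 reads ATG CAT GGC TAA, frame 1 reads ATG GCT AAC TGA from position 4.
toy = "ATGCATGGCTAACTGA"
hits = dual_coding_regions(toy)
```

On the toy sequence the two ORFs share positions 4 to 12, so every nucleotide in that stretch is constrained by codons in both frames. This codependency is what makes such an arrangement unlikely to arise or persist by chance.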