Synthetic and Systems Biology Unit, Institute of Biochemistry, Biological Research Centre of the Hungarian Academy of Sciences, Szeged H-6701, Hungary.
Genetics. 2013 Dec;195(4):1407-17. doi: 10.1534/genetics.113.152256. Epub 2013 Sep 20.
It has recently been discovered that new genes can originate de novo from noncoding DNA, and that several biological traits, including expression or sequence composition, form a continuum from noncoding sequences to conserved genes. In this article, using yeast genes, I test whether the integration of new genes into cellular networks and their structural maturation show such a continuum by analyzing how they change with gene age. I show that (1) the number of regulatory, protein-protein, and genetic interactions increases continuously with gene age, although at very different rates. New regulatory interactions emerge rapidly, within a few million years, while the numbers of protein-protein and genetic interactions increase slowly, at rates of 2-2.25 × 10^-8 per year and 4.8 × 10^-8 per year, respectively. (2) Gene essentiality evolves relatively quickly: the youngest essential genes appear in proto-genes ~14 million years (MY) old. (3) In contrast to interactions, the secondary structure of proteins and their robustness to mutations indicate that new genes face a bottleneck in their evolution: proto-genes are characterized by high β-strand content, high aggregation propensity, and low robustness against mutations, while conserved genes are characterized by lower strand content and higher stability, most likely because of the higher probability of gene loss among young genes and the accumulation of neutral mutations.
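To give a sense of scale for the quoted rates, the short Python sketch below converts them into an expected number of interactions gained over a given gene age. It is only an illustration: it assumes the rates are per gene and constant over time (simple linear accumulation), which the abstract does not state, and the function name `expected_gain` and the example ages other than 14 MY are hypothetical.

```python
# Rates quoted in the abstract, read here as interactions gained per gene
# per year (the "per gene" reading is an assumption of this sketch).
PPI_RATE_RANGE = (2.0e-8, 2.25e-8)   # protein-protein interactions
GENETIC_RATE = 4.8e-8                # genetic interactions


def expected_gain(rate_per_year: float, age_years: float) -> float:
    """Expected interactions gained, assuming a constant (linear) rate."""
    return rate_per_year * age_years


if __name__ == "__main__":
    # Illustrative gene ages in million years; only 14 MY appears in the abstract.
    for age_my in (14, 100, 500):
        age_years = age_my * 1e6
        ppi_low = expected_gain(PPI_RATE_RANGE[0], age_years)
        ppi_high = expected_gain(PPI_RATE_RANGE[1], age_years)
        genetic = expected_gain(GENETIC_RATE, age_years)
        print(
            f"{age_my:>4} MY: protein-protein {ppi_low:.2f}-{ppi_high:.2f}, "
            f"genetic {genetic:.2f}"
        )
```

Under these assumptions, a gene would need on the order of tens of millions of years to gain a single protein-protein or genetic interaction, which is consistent with the abstract's description of these interactions as accumulating slowly.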