Dimitrieva Slavica, Anisimova Maria
Swiss Institute for Experimental Cancer Research (ISREC) and Swiss Federal Institute of Technology Lausanne (EPFL), Lausanne, Switzerland; Department of Computer Science, ETH Zürich, Zurich, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland.
Department of Computer Science, ETH Zürich, Zurich, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland.
PLoS One. 2014 Jun 4;9(6):e95034. doi: 10.1371/journal.pone.0095034. eCollection 2014.
In protein-coding genes, synonymous mutations are often thought not to affect fitness and therefore not to be subject to natural selection. Yet cases of non-neutral evolution at certain synonymous sites have been reported with increasing frequency over the last decade. To evaluate the extent and the nature of site-specific selection on synonymous codons, we computed the site-to-site synonymous rate variation (SRV) in a large database of protein-coding gene families and protein domains, and identified gene properties that make SRV more likely. To our knowledge, this is the first study that explores the determinants and patterns of SRV in real data. We show that SRV is widespread in the evolution of protein-coding sequences, putting in doubt the validity of the synonymous rate as a standard neutral proxy. While protein domains rarely undergo adaptive evolution, SRV appears to play an important role in optimizing domain function at the level of DNA. In contrast, protein families are more likely to evolve by positive selection, but are less likely to exhibit SRV. Stronger SRV was detected in genes with stronger codon bias and tRNA reusage, in those coding for proteins with a larger number of interactions or forming a larger number of structures, in those located in intracellular components, and in those involved in typically conserved complex processes and functions. Genes with extreme SRV show higher expression levels in nearly all tissues. This indicates that codon bias in a gene, which often correlates with gene expression, may often be a site-specific phenomenon regulating the speed of translation along the sequence, consistent with the co-translational folding hypothesis. Strikingly, genes with SRV were strongly overrepresented in metabolic pathways and among genes associated with several genetic diseases, particularly cancers and diabetes.