Savisaar Rosina, Hurst Laurence D
The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom.
Mol Biol Evol. 2017 May 1;34(5):1110-1126. doi: 10.1093/molbev/msx061.
While the principal force directing coding sequence (CDS) evolution is selection on protein function, to ensure correct gene expression CDSs must also maintain interactions with RNA-binding proteins (RBPs). Understanding how our genes are shaped by these RNA-level pressures is necessary for diagnostics and for improving transgenes. However, the evolutionary impact of the need to maintain RBP interactions remains unresolved. Are coding sequences constrained by the need to specify RBP binding motifs? If so, what proportion of mutations are affected? Might sequence evolution also be constrained by the need not to specify motifs that might attract unwanted binding, for instance because such binding would interfere with exon definition? Here, we have scanned human CDSs for motifs that have been experimentally determined to be recognized by RBPs. We observe two sets of motifs: those that are enriched over a nucleotide-controlled null and those that are depleted. Importantly, the depleted set is enriched for motifs recognized by non-CDS binding RBPs. Supporting the functional relevance of our observations, we find that motifs that are more enriched are also slower-evolving. The net effect of this selection to preserve motifs is a reduction in the overall rate of synonymous evolution of 2-3% in both primates and rodents. Stronger motif depletion, on the other hand, is associated with stronger selection against motif gain in evolution. The challenge faced by our CDSs is therefore not only one of attracting the right RBPs but also of avoiding the wrong ones, all while also evolving under selection pressures related to protein structure.
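To make the idea of enrichment or depletion "over a nucleotide-controlled null" concrete, here is a minimal Python sketch. It is an illustration under simplifying assumptions, not the procedure used in the paper: the null is built by shuffling the nucleotides of the sequence itself (which preserves base composition but not the encoded protein or higher-order nucleotide context), and the function names `motif_density`, `shuffled_null` and `normalized_density` are hypothetical.

```python
import random


def motif_density(seq, motifs):
    """Fraction of positions in seq covered by at least one motif hit."""
    covered = [False] * len(seq)
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            for i in range(start, start + len(motif)):
                covered[i] = True
            # Step by one so that overlapping hits are also found.
            start = seq.find(motif, start + 1)
    return sum(covered) / len(seq) if seq else 0.0


def shuffled_null(seq, motifs, n_shuffles=1000, seed=0):
    """Motif densities in shuffled copies of seq (same base composition).

    Assumption: a plain mononucleotide shuffle stands in for the null.
    A real analysis would need a more careful control.
    """
    rng = random.Random(seed)
    bases = list(seq)
    densities = []
    for _ in range(n_shuffles):
        rng.shuffle(bases)
        densities.append(motif_density("".join(bases), motifs))
    return densities


def normalized_density(seq, motifs, n_shuffles=1000, seed=0):
    """(observed - mean null) / mean null; None if the null mean is zero."""
    observed = motif_density(seq, motifs)
    null = shuffled_null(seq, motifs, n_shuffles, seed)
    mean_null = sum(null) / len(null)
    if mean_null == 0:
        return None
    return (observed - mean_null) / mean_null
```

Under these assumptions, a positive value from `normalized_density` would suggest that the motif set is enriched relative to base composition alone, and a negative value would suggest depletion. Toy inputs such as `normalized_density("AAGAAGCTGAAG", ["GAA"])` are enough to try it out.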