Biomedical Informatics Research Programme (GRIB), Fundació Institut Municipal d'Investigació Mèdica, Barcelona 08003, Spain.
Genome Res. 2010 Jun;20(6):745-54. doi: 10.1101/gr.101261.109. Epub 2010 Mar 24.
Amino acid tandem repeats are found in a large number of eukaryotic proteins. They are often encoded by trinucleotide repeats and exhibit high intra- and interspecies size variability due to the high mutation rate associated with replication slippage. The extent to which natural selection is important in shaping amino acid repeat evolution is a matter of debate. On one hand, their high frequency may simply reflect their high probability of expansion by slippage, and they could essentially evolve in a neutral manner. On the other hand, there is experimental evidence that changes in repeat size can influence protein-protein interactions, transcriptional activity, or protein subcellular localization, indicating that repeats could be functionally relevant and thus shaped by selection. To gauge the relative contribution of neutral and selective forces in amino acid repeat evolution, we have performed a comparative analysis of amino acid repeat conservation in a large set of orthologous proteins from 12 vertebrate species. As a neutral model of repeat evolution, we have used sequences with the same DNA triplet composition as the coding sequences (and thus expected to be subject to the same mutational forces) but located in syntenic noncoding genomic regions. The results strongly indicate that selection has played a more important role than previously suspected in amino acid tandem repeat evolution, by increasing the repeat retention rate and by modulating repeat size. The data obtained in this study have allowed us to identify a set of 92 repeats that are postulated to play important functional roles due to their strong selective signature, including five cases with direct experimental evidence.