Stockholm Bioinformatics Centre, Science for Life Laboratory, Box 1031, Solna, 17121 Sweden.
BMC Bioinformatics. 2011 Aug 5;12:326. doi: 10.1186/1471-2105-12-326.
As orthologous proteins are expected to retain function more often than other homologs, they are often used for functional annotation transfer between species. However, ortholog identification methods do not take into account changes in domain architecture, which are likely to modify a protein's function. By domain architecture we refer to the sequential arrangement of domains along a protein sequence. To assess the level of domain architecture conservation among orthologs, we carried out a large-scale study of domain architecture changes between human and 40 other species spanning the entire evolutionary range. We designed a score to measure domain architecture similarity and used it to analyze differences in domain architecture conservation between orthologs and paralogs relative to the conservation of primary sequence. We also statistically characterized the extent of different types of domain swapping events across pairs of orthologs and paralogs.
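The abstract does not specify how the similarity score is computed. As an illustration only, the sketch below shows one plausible way to score two domain architectures, treating each as an ordered list of domain identifiers and normalizing the length of their longest common subsequence. The function names, the normalization, and the example domain labels are assumptions made for this sketch, not the score used in the study.

```python
from typing import Sequence


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence of two domain lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def architecture_similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Hypothetical order-aware similarity in [0, 1].

    1.0 means identical architectures; domain insertions, deletions,
    and reorderings all lower the score. This is an assumed stand-in,
    not the scoring scheme from the paper.
    """
    if not a and not b:
        return 1.0  # two domain-free proteins are treated as identical
    return 2 * lcs_length(a, b) / (len(a) + len(b))


# Illustrative architectures (domain labels chosen only as examples).
human = ["SH3", "SH2", "Pkinase"]
other = ["SH2", "Pkinase"]  # one domain missing relative to `human`
print(architecture_similarity(human, other))
```

Because the measure respects domain order, a shuffled architecture such as `["A", "B"]` versus `["B", "A"]` scores lower than an identical one, which is the property a study of domain shuffling would need from any such score.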
The analysis shows that, for homologs that have diverged beyond a certain threshold, orthologs exhibit greater domain architecture conservation than paralogous homologs, even after differences in average sequence divergence are compensated for. We interpret this as an indication of a stronger selective pressure on orthologs than on paralogs to retain the domain architecture required for the proteins to perform a specific function. In general, orthologs as well as the closest paralogous homologs have very similar domain architectures, even at large evolutionary separation. The most common domain architecture changes observed in both ortholog and paralog pairs involved insertion/deletion of new domains, while domain shuffling and segment duplication/deletion were very infrequent.
On the whole, our results support the hypothesis that function conservation between orthologs demands higher domain architecture conservation than is seen among other types of homologs, relative to primary sequence conservation. This supports the notion that orthologs are functionally more similar than other types of homologs at the same evolutionary distance.