Ekman Diana, Björklund Asa K, Elofsson Arne
Stockholm Bioinformatics Center, SCFAB, Stockholm University, SE-10691 Stockholm, Sweden.
J Mol Biol. 2007 Oct 5;372(5):1337-48. doi: 10.1016/j.jmb.2007.06.022. Epub 2007 Jun 15.
Most eukaryotic proteins consist of multiple domains created through gene fusions or internal duplications. The most frequent change of a domain architecture (DA) is insertion or deletion of a domain at the N or C terminus. Still, the mechanisms underlying the evolution of multidomain proteins are not well studied. Here, we have studied the evolution of multidomain architectures (MDAs), guided by evolutionary information in the form of a phylogenetic tree. Our results show that Pfam domain families and MDAs have been created at comparable rates (0.1-1 per million years (My)). The major changes in DA evolution have occurred in the process of multicellularization and within the metazoan lineage. In contrast, creation of domains seems to have been frequent already in early evolution. Furthermore, most of the architectures have been created from older domains or architectures, whereas novel domains are mainly found in single-domain proteins. However, a particular group of exon-bordering domains may have contributed to the rapid evolution of novel multidomain proteins in metazoan organisms. Finally, MDAs have evolved predominantly through insertions of domains, whereas domain deletions are less common. In conclusion, the rate of creation of multidomain proteins has accelerated in the metazoan lineage, which may partly be explained by the frequent insertion of exon-bordering domains into new architectures. However, our results indicate that other factors have contributed as well.
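The terminal insertion and deletion events mentioned in the abstract can be illustrated with a minimal sketch. It assumes a domain architecture is represented as an ordered list of domain labels, N terminus first, and it compares a hypothetical ancestral architecture with a descendant one. The function name `classify_change`, the labels it returns, and the placeholder domain names are illustrative assumptions; this is not the authors' method, which is guided by a phylogenetic tree.

```python
from typing import Sequence


def classify_change(parent: Sequence[str], child: Sequence[str]) -> str:
    """Label a single-domain difference between two architectures.

    Architectures are ordered lists of domain labels, N terminus first.
    This is an illustrative assumption, not the procedure used in the paper.
    """
    parent, child = list(parent), list(child)
    if parent == child:
        return "identical"

    # Only differences of exactly one domain are classified here.
    if len(child) == len(parent) + 1:
        kind, longer, shorter = "insertion", child, parent
    elif len(parent) == len(child) + 1:
        kind, longer, shorter = "deletion", parent, child
    else:
        return "complex"

    # Dropping the first domain of the longer architecture gives the shorter one.
    # Note: for repeats such as [A] -> [A, A] the terminus is ambiguous and
    # this check reports the N terminus by convention.
    if longer[1:] == shorter:
        return f"N-terminal {kind}"
    # Dropping the last domain of the longer architecture gives the shorter one.
    if longer[:-1] == shorter:
        return f"C-terminal {kind}"
    # Otherwise look for a single internal position that explains the change.
    for i in range(1, len(longer) - 1):
        if longer[:i] + longer[i + 1:] == shorter:
            return f"internal {kind}"
    return "complex"


if __name__ == "__main__":
    # Placeholder domain labels, not real Pfam families.
    print(classify_change(["A"], ["A", "B"]))
    print(classify_change(["A", "B", "C"], ["B", "C"]))
```

Counting these labels along the branches of a species tree would give a rough tally of insertions versus deletions, but inferring the ancestral architectures at internal nodes is a separate step that this sketch does not attempt.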