Malay Kumar Basu, Liran Carmel, Igor B. Rogozin, Eugene V. Koonin
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.
Genome Res. 2008 Mar;18(3):449-61. doi: 10.1101/gr.6943508. Epub 2008 Jan 29.
Numerous eukaryotic proteins contain multiple domains. Certain domains show a tendency to occur in diverse domain architectures and can be considered "promiscuous." These promiscuous domains are, typically, involved in protein-protein interactions and play crucial roles in interaction networks, particularly those that contribute to signal transduction. A systematic comparative-genomic analysis of promiscuous domains in eukaryotes is described. Two quantitative measures of domain promiscuity are introduced and applied to the analysis of 28 genomes of diverse eukaryotes. Altogether, 215 domains are identified as strongly promiscuous. The fraction of promiscuous domains in animals is shown to be significantly greater than that in fungi or plants. Evolutionary reconstructions indicate that domain promiscuity is a volatile, relatively fast-changing feature of eukaryotic proteins, with few domains remaining promiscuous throughout the evolution of eukaryotes. Some domains appear to have attained promiscuity independently in different lineages, for example, animals and plants. It is proposed that promiscuous domains persist within a relatively small pool of evolutionarily stable domain combinations from which numerous rare architectures emerge during evolution. Domain promiscuity positively correlates with the number of experimentally detected domain interactions and with the strength of purifying selection affecting a domain. Thus, evolution of promiscuous domains seems to be constrained by the diversity of their interaction partners. The set of promiscuous domains is enriched for domains mediating protein-protein interactions that are involved in various forms of signal transduction, especially in the ubiquitin system and in chromatin. Thus, a limited repertoire of promiscuous domains makes a major contribution to the diversity and evolvability of eukaryotic proteomes and signaling networks.
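The abstract does not spell out the two promiscuity measures, so the following is only a minimal sketch of the underlying idea, not the authors' method. It assumes each protein is given as an ordered list of domain names and uses a naive proxy: the number of distinct domains found directly adjacent to a given domain across all architectures. The function name `neighbor_counts` and the toy architectures are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def neighbor_counts(architectures):
    """Return {domain: number of distinct domains directly adjacent to it}.

    A naive stand-in for domain promiscuity (an assumption for illustration,
    not one of the two measures introduced in the paper).
    """
    neighbors = defaultdict(set)
    for arch in architectures:
        # Register every domain so single-domain proteins get a count of 0.
        for domain in arch:
            neighbors[domain]
        # Record each adjacent pair in both directions, skipping tandem repeats.
        for left, right in zip(arch, arch[1:]):
            if left != right:
                neighbors[left].add(right)
                neighbors[right].add(left)
    return {domain: len(partners) for domain, partners in neighbors.items()}


# Hypothetical toy architectures (invented, not data from the paper).
toy_architectures = [
    ["SH3", "SH2", "Pkinase"],
    ["SH3", "PDZ"],
    ["PH", "SH3"],
    ["Pkinase"],
]

counts = neighbor_counts(toy_architectures)
for domain, n in sorted(counts.items(), key=lambda item: -item[1]):
    print(domain, n)
```

A raw neighbor count of this kind is biased toward abundant domains, which is one reason a real analysis would likely need some normalization for domain frequency before comparing domains or genomes.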