Department of Abiotic Stress, Integrative and Systems Biology Laboratory, Instituto de Biología Molecular y Celular de Plantas, Consejo Superior de Investigaciones Científicas (CSIC-UPV), Valencia, Spain.
Genome Biol Evol. 2012;4(12):1263-74. doi: 10.1093/gbe/evs101.
Genes show a bewildering variation in their patterns of molecular evolution, as a result of the action of different levels and types of selective forces. The factors underlying this variation are, however, still poorly understood. In the last decade, the position of proteins in the protein-protein interaction network has been put forward as a determinant factor of the evolutionary rate and duplicability of their encoding genes. This conclusion, however, has been based on the analysis of the limited number of microbes and animals for which interactome-level data are available (essentially, Escherichia coli, yeast, worm, fly, and humans). Here, we study, for the first time, the relationship between the position of proteins in the high-density interactome of a plant (Arabidopsis thaliana) and the patterns of molecular evolution of their encoding genes. We found that genes whose encoded products act at the center of the network are more evolutionarily constrained than those acting at the network periphery. This trend remains significant when potential confounding factors (gene expression level and breadth, duplicability, function, and length of the encoded products) are controlled for. Even though the correlation between centrality measures and rates of evolution is generally weak, for some functional categories, it is comparable in strength to (or even stronger than) the correlation between evolutionary rates and expression levels or breadths. In addition, genes encoding interacting proteins in the network evolve at relatively similar rates. Finally, Arabidopsis proteins encoded by duplicated genes are more highly connected than those encoded by singleton genes. This observation is in agreement with the patterns observed in humans, but in contrast with those observed in E. coli, yeast, worm, and fly (whose duplicated genes tend to act at the periphery of the network), implying that the relationship between duplicability and centrality inverted at least twice during eukaryote evolution. Taken together, these results indicate that the structure of the A. thaliana network constrains the evolution of its components at multiple levels.
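The comparison at the heart of the abstract, network centrality against rate of evolution, can be illustrated with a minimal sketch. The abstract does not state which centrality measures or which correlation statistic were used, so the sketch assumes the simplest choices: degree (number of distinct interaction partners) as the centrality measure and Spearman's rank correlation as the statistic. The protein names, the edge list, and the rate values are invented toy data, not results from the study.

```python
from collections import defaultdict


def degree(edges):
    """Count distinct interaction partners per protein (self-loops ignored)."""
    partners = defaultdict(set)
    for a, b in edges:
        if a != b:
            partners[a].add(b)
            partners[b].add(a)
    return {protein: len(found) for protein, found in partners.items()}


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy interactome and hypothetical evolutionary rates.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]
rates = {"P1": 0.05, "P2": 0.12, "P3": 0.10, "P4": 0.15, "P5": 0.30}

degrees = degree(edges)
proteins = sorted(rates)
rho = spearman([degrees[p] for p in proteins], [rates[p] for p in proteins])
print(f"Spearman rho between degree and rate: {rho:.3f}")
```

In this toy data the most connected protein has the lowest rate, so the coefficient is negative, which is the direction the abstract reports for central versus peripheral proteins. Controlling for confounders such as expression level would need an extra step (for example a partial correlation) that this sketch leaves out.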