Department of Chemical Engineering and Polymer Research Center, Bogazici University, Istanbul, Turkey.
School of Neurobiology, Biochemistry & Biophysics, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel.
Mol Biol Evol. 2024 Sep 4;41(9). doi: 10.1093/molbev/msae184.
Protein space is characterized by extensive recurrence, or "reuse," of parts, suggesting that new proteins and domains can evolve by mixing and matching existing segments. From an evolutionary perspective, for a given combination to persist, the protein segments should presumably not only match geometrically but also communicate dynamically with each other to allow the concerted motions that are key to function. Evidence from protein space supports the premise that domains indeed combine in this manner; we explore whether a similar phenomenon can be observed at the sub-domain level. To this end, we use Gaussian Network Models (GNMs) to calculate the so-called soft modes, or low-frequency modes of motion, for a dataset of 150 protein domains. Modes of motion can be used to decompose a domain into segments of consecutive amino acids that we call "dynamic elements," each of which belongs to one of two parts that move in opposite senses. We find that, in many cases, the dynamic elements detected by GNM analysis correspond to established "themes": sub-domain-level segments that have been shown to recur in protein space, and which were detected in previous research using sequence similarity alone (i.e., completely independently of the GNM analysis). This statistically significant correlation hints at the importance of dynamics in evolution. Overall, the results are consistent with an evolutionary scenario where proteins have emerged from themes that need to match each other both geometrically and dynamically, e.g., to facilitate allosteric regulation.
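The decomposition described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes one C-alpha coordinate per residue, a 7.0 Å contact cutoff (a common GNM choice, not stated in the abstract), and that only the single slowest non-trivial mode is used. The function names `kirchhoff`, `slowest_mode`, and `dynamic_elements` are hypothetical and introduced only for illustration.

```python
import numpy as np

def kirchhoff(coords, cutoff=7.0):
    """Build the GNM connectivity (Kirchhoff) matrix from residue coordinates.

    Off-diagonal entries are -1 for residue pairs within `cutoff` (assumed
    value), 0 otherwise; each diagonal entry is the residue's contact count.
    """
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    gamma = -(dist <= cutoff).astype(float)
    np.fill_diagonal(gamma, 0.0)
    np.fill_diagonal(gamma, -gamma.sum(axis=1))
    return gamma

def slowest_mode(coords, cutoff=7.0):
    """Return the eigenvector of the lowest non-zero eigenvalue (softest mode).

    Assumes the contact network is connected, so exactly one eigenvalue is
    zero and sits at index 0 of the ascending spectrum returned by eigh.
    """
    _, vecs = np.linalg.eigh(kirchhoff(coords, cutoff))
    return vecs[:, 1]

def dynamic_elements(mode):
    """Split residues into runs of consecutive residues with the same sign.

    Returns (start, end, sign) tuples with inclusive 0-based indices; the two
    signs mark the two parts that move in opposite senses in this mode.
    """
    signs = np.where(mode >= 0, 1, -1)
    segments, start = [], 0
    for i in range(1, len(signs) + 1):
        if i == len(signs) or signs[i] != signs[start]:
            segments.append((start, i - 1, int(signs[start])))
            start = i
    return segments

# Toy input (not real protein data): 20 pseudo-residues on a straight line,
# spaced 3.8 Angstrom apart, so only sequence neighbours are in contact.
coords = np.column_stack([3.8 * np.arange(20), np.zeros(20), np.zeros(20)])
segments = dynamic_elements(slowest_mode(coords))
print(segments)
```

For this toy chain the slowest mode changes sign once, at the midpoint, so the sketch reports two elements with opposite signs (which half is labeled +1 is arbitrary, since an eigenvector's overall sign is not determined). Real domains would typically yield several elements per mode, and comparing them with sequence-derived themes is a separate step not shown here.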