Coban Abdulbaki, Bornberg-Bauer Erich, Kemena Carsten
Institute for Evolution and Biodiversity, University of Münster, Münster, 48159, Germany.
Department of Protein Evolution, Max Planck Institute for Biology Tübingen, Tübingen, 72076, Germany.
BMC Ecol Evol. 2025 Jan 8;25(1):6. doi: 10.1186/s12862-024-02347-7.
Protein evolution is central to molecular adaptation and is largely characterized by modular rearrangements of domains, the evolutionary and structural building blocks of proteins. The genetic events underlying protein rearrangements are rare compared to amino acid changes. These events can therefore be used to characterize and reconstruct major events of molecular adaptation by comparing large data sets of proteomes.
Here we determine, at unprecedented completeness, the rates of fusion, fission, emergence and loss of domains in five eukaryotic clades (monocots, eudicots, fungi, insects, vertebrates). By characterizing rearrangements that were previously considered "ambiguous" or "complex", we raise the fraction of resolved rearrangement events from the previous ca. 60% to around 92%. We exemplify our method by analyzing the evolutionary histories of protein rearrangements in (i) the extracellular matrix, (ii) innate immunity across Eukaryota, Metazoa, and Vertebrata, and (iii) Toll-like receptors in the innate immune system of Eukaryota. In all three cases we find hot-spots of rearrangement events in the phylogeny which (i) can be related to major events of adaptation and (ii) follow the emergence of new domains that become integrated into existing arrangements.
Our results demonstrate that, akin to changes at the level of amino acids, domain rearrangements follow a clock-like dynamic which can be well quantified and which supports the concept of evolutionary tinkering. While many novel domain emergence events are ancient, emerged domains are quickly incorporated into a great number of proteins. In parallel, the observed rate of emergence of new domains decreases over time.