Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France.
IMPMC (UMR 7590), BiBiP, Sorbonne Université, CNRS, MNHN, Paris, France.
J Mol Evol. 2023 Dec;91(6):854-864. doi: 10.1007/s00239-023-10136-x. Epub 2023 Dec 7.
Folds are the architecture and topology of a protein domain. Fold categories are very few compared with the astronomical number of sequences. Eukaryotes have more protein folds than Archaea and Bacteria. These folds are of two types: those shared with Archaea and/or Bacteria, and those specific to eukaryotic clades. The first type is inherited from the first endosymbiosis and confirms the mixed origin of eukaryotes. In a dataset of 1073 folds whose presence or absence has been evidenced among 210 species equally distributed across the three super-kingdoms, we identified 28 eukaryotic folds unambiguously inherited from Bacteria and 40 eukaryotic folds unambiguously inherited from Archaea. Compared to previous studies, the proportion of informational functions is higher than expected for folds originating from Bacteria and as high as expected for folds inherited from Archaea. The second type is specifically eukaryotic and is associated with an increase in new folds within eukaryotes, distributed in particular clades. Reconstructed ancestral states, coupled with dating of each node on the tree of life, provided fold appearance rates. On average, the rate is twice as high within Eukaryota as within Bacteria or Archaea. The highest rates are found at the origins of eukaryotes, holozoans, metazoans, metazoans stricto sensu, and vertebrates: the roots of these clades correspond to bursts of fold evolution. We could correlate the functions of some of the fold synapomorphies within eukaryotes with significant evolutionary events. Among them, we find evidence for the rise of multicellularity, of the adaptive immune system, and of virus folds that could be linked to an ecological shift made by tetrapods.
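The idea of a fold appearance rate can be illustrated with a minimal sketch. It assumes the rate along a branch is the number of folds present at the descendant node but absent at its ancestor, divided by the time separating the two dated nodes. The abstract does not give the exact formula, so this definition is an assumption, and the `Node` class, the `fold_gain_rate` function, and the toy values are all hypothetical and not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A dated node of the tree with its reconstructed fold repertoire (hypothetical structure)."""
    name: str
    age_ma: float      # node age, in millions of years before present
    folds: frozenset   # folds inferred as present at this node


def fold_gain_rate(parent: Node, child: Node) -> float:
    """Return the number of folds gained along the parent -> child branch per million years."""
    duration = parent.age_ma - child.age_ma
    if duration <= 0:
        raise ValueError("parent must be older than child")
    gained = child.folds - parent.folds  # present in the child, absent in the parent
    return len(gained) / duration


# Toy values invented for illustration only; they are not data from the paper.
root = Node("root", 100.0, frozenset({"f1", "f2"}))
clade = Node("clade", 50.0, frozenset({"f1", "f2", "f3", "f4", "f5"}))
rate = fold_gain_rate(root, clade)
print(rate)
```

Under this assumed definition, comparing rates between clades amounts to averaging the per-branch values within each clade; a burst corresponds to a branch whose rate is far above that average.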