Institut Systématique Evolution Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, UA, Paris, France.
IMPMC (UMR 7590), BiBiP, Sorbonne Université, CNRS, MNHN, Paris, France.
Evolution. 2022 Aug;76(8):1706-1719. doi: 10.1111/evo.14550. Epub 2022 Jul 13.
Several studies have shown that the distribution of folds (the topology of protein secondary structures) in proteomes may serve as a global proxy for building phylogenies. If so, some folds should be synapomorphies (derived characters exclusively shared among taxa). However, previous studies used methods that did not allow synapomorphies to be identified, because this requires a congruence analysis of folds as individual characters. Here, we map SCOP folds onto a sample of 210 species across the tree of life (TOL). Congruence is assessed using the retention index of each fold for the TOL, and principal component analysis for the deeper branches. Using a bicluster mapping approach, we define synapomorphic blocks of folds (SBFs) that share similar presence/absence patterns. Among the 1232 folds, 20% are universally present in our TOL, whereas 54% are reliable synapomorphies. Similar results are obtained with the CATH and ECOD databases. Eukaryotes are characterized by a large number of these synapomorphies, and several SBFs clearly support nested eukaryotic clades (divergence times ranging from 1100 to 380 mya). Although clearly separated, the three superkingdoms reveal a strong mosaic pattern. This pattern is consistent with the dual origin of eukaryotes and bears witness to secondary endosymbiosis in their photosynthetic clades. Our study shows that fold synapomorphies, analyzed directly, are key characters for unraveling the evolutionary history of species.
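To make the congruence criterion concrete, the sketch below computes a retention index for a single fold coded as a presence/absence character. It uses the standard Farris definition, RI = (g - s) / (g - m), with s obtained by Fitch parsimony on a rooted, strictly binary tree. The abstract does not say how the index was computed in the study, so the tree encoding (nested tuples), the function names, and the toy taxa are all assumptions made for illustration only.

```python
def fitch_steps(tree, states):
    """Minimum number of state changes for one character (Fitch parsimony).

    tree   : nested 2-tuples with taxon names (str) as leaves (assumed encoding)
    states : dict mapping taxon name -> 0 (fold absent) or 1 (fold present)
    """
    steps = 0

    def visit(node):
        nonlocal steps
        if isinstance(node, str):          # leaf: its observed state
            return {states[node]}
        left, right = node                 # strictly binary internal node
        a, b = visit(left), visit(right)
        common = a & b
        if common:                         # children agree: no change needed
            return common
        steps += 1                         # children disagree: one change
        return a | b

    visit(tree)
    return steps


def retention_index(tree, states):
    """Farris retention index for a binary character; None if uninformative."""
    values = list(states.values())
    n_present = sum(values)
    n_absent = len(values) - n_present
    g = min(n_present, n_absent)           # maximum possible steps on any tree
    m = 1                                  # minimum steps for a variable binary character
    if g <= m:                             # constant or singleton character
        return None
    s = fitch_steps(tree, states)          # steps observed on this tree
    return (g - s) / (g - m)


# Toy tree with eight hypothetical taxa.
tree = ((("A", "B"), ("C", "D")), (("E", "F"), ("G", "H")))

# Fold present in exactly one clade: a perfect synapomorphy.
clade_fold = {t: int(t in "ABCD") for t in "ABCDEFGH"}
# Fold scattered across the tree: maximally homoplastic.
scattered_fold = {t: int(t in "ACEG") for t in "ABCDEFGH"}

print(retention_index(tree, clade_fold))      # 1.0
print(retention_index(tree, scattered_fold))  # 0.0
```

A value of 1 means the fold's distribution is explained by a single gain or loss on the tree, which is the behavior expected of a synapomorphy, while values near 0 indicate that the pattern requires as many changes as it possibly could. Any cutoff for calling a fold a "reliable" synapomorphy would be a further choice that the abstract does not specify.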