Baltzis Athanasios, Santus Luisa, Langer Björn E, Magis Cedrik, de Vienne Damien M, Gascuel Olivier, Mansouri Leila, Notredame Cedric
Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain.
Universitat Pompeu Fabra (UPF), Barcelona, Spain.
Nat Commun. 2025 Jan 15;16(1):293. doi: 10.1038/s41467-024-55264-0.
In a phylogeny, trustworthy branch support estimates are as important as the tree itself. We show that bootstrap-based support values can be improved by combining sequence and structural information from proteins. Our approach relies on the systematic comparison of homologous intra-molecular structural distances. Variations in these distances exhibit less saturation than sequence-based Hamming distances and support the computation of tree-like distance matrices that can be resolved into phylogenetic trees using distance-based methods such as minimum evolution. The resulting trees bear strong similarities to their sequence-based counterparts and allow the estimation of bootstrap support values, yet they are sufficiently distinct that their information content can be combined. The combined sequence and structure bootstrap support values discriminate better between correct and incorrect branches. We show that this approach, named multistrap, can improve bootstrap branch support values using either predicted or experimental 3D structures.
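To make the two kinds of distance mentioned in the abstract concrete, the following minimal Python sketch contrasts a normalized Hamming distance on aligned sequences with one possible dissimilarity built from intra-molecular distances. It is an illustration under stated assumptions, not the multistrap implementation: the function names `p_distance` and `imd_dissimilarity`, the use of one 3D coordinate per aligned residue, and the choice of the mean absolute difference are all hypothetical, and the metric used in the paper may differ.

```python
import math
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Normalized Hamming distance between two aligned sequences.

    Only columns where neither sequence has a gap ('-') are compared.
    """
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped aligned columns to compare")
    return sum(a != b for a, b in pairs) / len(pairs)


def imd_dissimilarity(coords_a, coords_b):
    """Illustrative structural dissimilarity (an assumption, not the paper's formula).

    coords_a and coords_b hold one (x, y, z) tuple per alignment column,
    or None where that protein has a gap. For every pair of columns that
    is ungapped in both proteins, the distance between the two residues
    is measured inside each structure, and the absolute differences of
    these intra-molecular distances are averaged.
    """
    cols = [
        i
        for i, (a, b) in enumerate(zip(coords_a, coords_b))
        if a is not None and b is not None
    ]
    if len(cols) < 2:
        raise ValueError("need at least two aligned residues")
    diffs = [
        abs(
            math.dist(coords_a[i], coords_a[j])
            - math.dist(coords_b[i], coords_b[j])
        )
        for i, j in combinations(cols, 2)
    ]
    return sum(diffs) / len(diffs)
```

Because only distances within each structure are compared, a measure of this kind needs no superposition of the two structures: rotating or translating either protein leaves the value unchanged. Computing such a value for every pair of homologs would give a distance matrix that a distance-based tree builder could take as input.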