Valdez Cabrera Maria Alejandra, Willis Amy D
IEEE Trans Comput Biol Bioinform. 2025 Mar-Apr;22(2):614-627. doi: 10.1109/TCBBIO.2025.3526422.
Phylogenetic trees summarize evolutionary relationships between organisms, and tools to analyze collections of phylogenetic trees enable contrasts between different genes' ancestry. The Billera-Holmes-Vogtmann (BHV) metric space has enabled the analysis of collections of trees that share a common set of leaves, but in practice, many genes are not shared, even between closely related species. BHV extension spaces represent trees with non-identical leaf sets in a common BHV space, but limited analytical tools exist for extension spaces. We define the distance between two phylogenetic trees with non-identical leaf sets as the shortest BHV distance between their extension spaces, and develop a reduced gradient algorithm to compute this distance. We study the scalability of our algorithm and apply it to analyze gene trees spanning multiple domains of life. Our distance and algorithm offer a fully general, interpretable approach to analyzing both ancient and recent evolutionary divergence.
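The distance defined above can be written compactly. The notation below is an assumption introduced here for illustration and is not taken from the abstract: T_1 and T_2 are the two trees, with leaf sets L_1 and L_2; E_1 and E_2 are their extension spaces inside the BHV space on the combined leaf set L_1 ∪ L_2; and d_BHV is the usual BHV geodesic distance on that space. The minimum reflects the abstract's wording ("shortest BHV distance").

```latex
% Sketch of the definition stated in the abstract; all symbols are illustrative.
% E_i : extension space of T_i in the BHV space on leaf set L_1 \cup L_2
\[
  d(T_1, T_2) \;=\; \min_{x \in E_1,\; y \in E_2} d_{\mathrm{BHV}}(x, y)
\]
```

When L_1 = L_2, each extension space contains only the tree itself, so this expression reduces to the ordinary BHV distance between T_1 and T_2.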