Spectral cluster supertree: fast and statistically robust merging of rooted phylogenetic trees.

Author information

McArthur Robert N, Zehmakan Ahad N, Charleston Michael A, Lin Yu, Huttley Gavin

Affiliations

Research School of Biology, The Australian National University, Canberra, ACT, Australia.

School of Computing, The Australian National University, Canberra, ACT, Australia.

Publication information

Front Mol Biosci. 2024 Oct 30;11:1432495. doi: 10.3389/fmolb.2024.1432495. eCollection 2024.

Abstract

Algorithms for phylogenetic reconstruction are central to computational molecular evolution. The relentless pace of data acquisition has exposed their poor scalability and led to the conclusion that the conventional application of these methods is impractical and not justifiable from an energy usage perspective. Furthermore, the drive to improve the statistical performance of phylogenetic methods produces increasingly parameter-rich models of sequence evolution, which worsens computational performance. Established theoretical and algorithmic results identify supertree methods as critical to divide-and-conquer strategies for improving the scalability of phylogenetic reconstruction. Of particular importance is the ability to explicitly accommodate rooted topologies, which can arise from the more biologically plausible non-stationary models of sequence evolution. We contribute to addressing this challenge with Spectral Cluster Supertree, a novel supertree method for merging a set of overlapping rooted phylogenetic trees. It offers significant improvements over Min-Cut supertree and previous state-of-the-art methods in both time complexity and overall topological accuracy, particularly for large problems. We compare it against Min-Cut supertree and Bad Clade Deletion. Using two tree topology distance metrics, we demonstrate that while Bad Clade Deletion generates more correct clades in its resulting supertree, the tree generated by Spectral Cluster Supertree is generally topologically closer to the true model tree. On large datasets containing 10,000 taxa and 500 source trees, where Bad Clade Deletion usually takes 2 h to run, our method generates a supertree in 20 s on average. Spectral Cluster Supertree is released under an open source license and is available on the Python Package Index as sc-supertree.
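To make the idea of merging overlapping rooted trees by spectral clustering more concrete, the following is a minimal sketch of a single partitioning step. It is not the sc-supertree implementation and does not use its API. It assumes, in the spirit of Min-Cut supertree, that two taxa are linked whenever they sit below the same child of the root in a source tree, that edge weights are simple counts, and that the taxa are split by the sign of the Fiedler vector of the graph Laplacian. The nested-tuple tree encoding and the function names leaves, co_cluster_weights and root_split are hypothetical and chosen only for illustration.

```python
import numpy as np


def leaves(tree):
    """Return the set of taxon names in a nested-tuple rooted tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def co_cluster_weights(trees, taxa):
    """Count, for each pair of taxa, the source trees in which both taxa
    sit below the same child of the root (assumed weighting scheme)."""
    index = {name: i for i, name in enumerate(taxa)}
    weights = np.zeros((len(taxa), len(taxa)))
    for tree in trees:
        if isinstance(tree, str):
            continue  # a single-leaf tree carries no grouping information
        for child in tree:
            ids = [index[name] for name in leaves(child)]
            for a in ids:
                for b in ids:
                    if a != b:
                        weights[a, b] += 1.0
    return weights


def root_split(trees):
    """Propose a two-way split of all taxa from the sign pattern of the
    Fiedler vector. Assumes the taxon graph is connected."""
    taxa = sorted(set().union(*(leaves(tree) for tree in trees)))
    weights = co_cluster_weights(trees, taxa)
    laplacian = np.diag(weights.sum(axis=1)) - weights
    # eigh returns eigenvalues in ascending order; column 1 is the
    # eigenvector of the second-smallest eigenvalue (the Fiedler vector).
    _, vectors = np.linalg.eigh(laplacian)
    fiedler = vectors[:, 1]
    side_a = [name for name, value in zip(taxa, fiedler) if value >= 0]
    side_b = [name for name, value in zip(taxa, fiedler) if value < 0]
    # Sort the two sides so the result does not depend on eigenvector sign.
    return tuple(sorted([side_a, side_b]))


# Two overlapping rooted source trees that disagree on where c belongs.
source_trees = [
    ((("a", "b"), "c"), ("d", "e")),
    (("a", "b"), ("c", "d")),
]
print(root_split(source_trees))  # (['a', 'b', 'c'], ['d', 'e'])
```

A complete supertree method would apply such a split recursively, restricting the source trees to each side, and would treat disconnected components of the graph as separate children rather than relying on the Fiedler vector. The weighting used in the published method may differ from the plain counts assumed here.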
