Reliable estimation of tree branch lengths using deep neural networks.

Affiliations

Department of Biological Sciences, Virginia Tech, Blacksburg, Virginia, United States of America.

Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America.

Publication information

PLoS Comput Biol. 2024 Aug 5;20(8):e1012337. doi: 10.1371/journal.pcbi.1012337. eCollection 2024 Aug.

Abstract

A phylogenetic tree represents a hypothesized evolutionary history for a set of taxa. Besides the branching pattern (i.e., tree topology), a phylogeny contains information about the evolutionary distances (i.e., branch lengths) between all taxa in the tree, which include extant taxa (external nodes) and their last common ancestors (internal nodes). During phylogenetic tree inference, branch lengths are typically co-estimated with other phylogenetic parameters while the space of tree topologies is explored. There are well-known regions of the branch length parameter space where accurate estimation of phylogenetic trees is especially difficult. Several recent studies have demonstrated that machine learning approaches have the potential to help solve phylogenetic problems with greater accuracy and computational efficiency. In this study, as a proof of concept, we sought to explore whether machine learning models can predict branch lengths. To that end, we designed several deep learning frameworks to estimate branch lengths on fixed tree topologies from multiple sequence alignments or their representations. Our results show that deep learning methods can exhibit superior performance in some difficult regions of branch length parameter space. For example, in contrast to maximum likelihood inference, which is typically used for estimating branch lengths, deep learning methods are more efficient and accurate. In general, we find that our neural networks achieve similar accuracy to a Bayesian approach and are the best-performing methods when inferring long branches that are associated with distantly related taxa. Together, our findings represent a next step toward accurate, fast, and reliable phylogenetic inference with machine learning approaches.
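The abstract does not say which alignment representations the networks consume, so the following is only an illustrative sketch of one common choice: summarizing a gap-free DNA alignment as a fixed-length vector of site-pattern frequencies. The function names are hypothetical and are not taken from the paper. A classical Jukes-Cantor (JC69) pairwise distance is included purely as a point of comparison, since its correction becomes undefined at saturation, which hints at why long branches are a hard regime.

```python
from collections import Counter
from itertools import product
import math

NUCLEOTIDES = "ACGT"


def site_pattern_frequencies(alignment):
    """Relative frequency of every possible alignment column.

    `alignment` is a list of equal-length, gap-free DNA strings, one per
    taxon. The result has 4**n_taxa entries in a fixed lexicographic order,
    so it could serve as a fixed-size input vector (an assumed encoding,
    not necessarily the one used in the paper).
    """
    if not alignment or not alignment[0]:
        raise ValueError("alignment must contain at least one site")
    n_sites = len(alignment[0])
    if any(len(seq) != n_sites for seq in alignment):
        raise ValueError("all sequences must have the same length")
    # zip(*alignment) yields one tuple of characters per alignment column.
    counts = Counter(zip(*alignment))
    patterns = product(NUCLEOTIDES, repeat=len(alignment))
    return [counts[p] / n_sites for p in patterns]


def jc69_distance(seq_a, seq_b):
    """Jukes-Cantor corrected distance between two aligned sequences.

    Returns infinity once the proportion of differing sites reaches 0.75,
    where the correction is no longer defined (saturation).
    """
    p = sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    toy = ["ACGT", "ACGA", "ACGT", "ACTT"]  # four taxa, four sites
    freqs = site_pattern_frequencies(toy)
    print(len(freqs), sum(freqs))
    print(jc69_distance(toy[0], toy[1]))
```

For four taxa the vector has 256 entries regardless of alignment length, which is what makes this kind of summary convenient for a fixed topology; it discards the order of sites, so it is only one of several representations one might try.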

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f19e/11326709/daa5f7edc347/pcbi.1012337.g001.jpg