Suppr超能文献

从不一致的基因树推断物种树的精确解。

Exact solutions for species tree inference from discordant gene trees.

作者信息

Chang Wen-Chieh, Górecki Paweł, Eulenstein Oliver

机构信息

DuPont Pioneer, Johnston, Iowa 50131, USA.

出版信息

J Bioinform Comput Biol. 2013 Oct;11(5):1342005. doi: 10.1142/S0219720013420055. Epub 2013 Oct 2.

Abstract

Phylogenetic analysis has to overcome the grant challenge of inferring accurate species trees from evolutionary histories of gene families (gene trees) that are discordant with the species tree along whose branches they have evolved. Two well studied approaches to cope with this challenge are to solve either biologically informed gene tree parsimony (GTP) problems under gene duplication, gene loss, and deep coalescence, or the classic RF supertree problem that does not rely on any biological model. Despite the potential of these problems to infer credible species trees, they are NP-hard. Therefore, these problems are addressed by heuristics that typically lack any provable accuracy and precision. We describe fast dynamic programming algorithms that solve the GTP problems and the RF supertree problem exactly, and demonstrate that our algorithms can solve instances with data sets consisting of as many as 22 taxa. Extensions of our algorithms can also report the number of all optimal species trees, as well as the trees themselves. To better asses the quality of the resulting species trees that best fit the given gene trees, we also compute the worst case species trees, their numbers, and optimization score for each of the computational problems. Finally, we demonstrate the performance of our exact algorithms using empirical and simulated data sets, and analyze the quality of heuristic solutions for the studied problems by contrasting them with our exact solutions.

摘要

系统发育分析必须克服一项重大挑战,即从基因家族的进化历史(基因树)中推断准确的物种树,而这些基因家族的进化历史与它们所沿其分支进化的物种树不一致。两种经过充分研究以应对这一挑战的方法,一是在基因复制、基因丢失和深度合并的情况下解决具有生物学依据的基因树简约(GTP)问题,二是解决不依赖任何生物学模型的经典RF超级树问题。尽管这些问题有潜力推断出可信的物种树,但它们是NP难问题。因此,这些问题通过启发式方法来解决,而这些方法通常缺乏任何可证明的准确性和精确性。我们描述了能够精确解决GTP问题和RF超级树问题的快速动态规划算法,并证明我们的算法能够解决由多达22个分类单元组成的数据集的实例。我们算法的扩展还可以报告所有最优物种树的数量以及这些树本身。为了更好地评估最适合给定基因树的所得物种树的质量,我们还计算了每个计算问题的最坏情况物种树、它们的数量以及优化得分。最后,我们使用经验数据集和模拟数据集展示了我们精确算法的性能,并通过将启发式解决方案与我们的精确解决方案进行对比,分析了所研究问题的启发式解决方案的质量。

文献AI研究员

20分钟写一篇综述,助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型,支持多种主流文档格式。

立即体验