A practical algorithm for estimation of the maximum likelihood ancestral reconstruction error.

Authors

Hickey Glenn, Blanchette Mathieu

Affiliation

McGill Centre for Bioinformatics and School of Computer Science, McGill University, 3480 University St., Montréal, Québec, H3A 2B4, Canada.

Publication information

Pac Symp Biocomput. 2010:31-42. doi: 10.1142/9789814295291_0005.

Abstract

The ancestral sequence reconstruction problem asks for a prediction of the DNA or protein sequence of an ancestral species, given the sequences of extant species. Such reconstructions are fundamental to comparative genomics, as they provide information about extant genomes and the process of evolution that gave rise to them. Arguably the best method for ancestral reconstruction is maximum likelihood estimation. Many effective algorithms for accurately computing the most likely ancestral sequence have been proposed. We consider the less-studied problem of computing the expected reconstruction error of a maximum likelihood reconstruction, given the phylogenetic tree and model of evolution, but not the extant sequences. This situation can arise, for example, when deciding which genomes to sequence for a reconstruction project given a gene-tree phylogeny (the taxon selection problem). In most applications, the reconstruction error is necessarily very small, making Monte Carlo simulations very inefficient for accurate estimation. We present the first practical algorithm for this problem and demonstrate how it can be used to quickly and accurately estimate the reconstruction accuracy. We then use our method as a kernel in a heuristic algorithm for the taxon selection problem. The implementation is available at http://www.mcb.mcgill.ca/~blanchem/mlerror.
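To make the quantity in the abstract concrete, the sketch below computes the expected maximum likelihood reconstruction error on a toy problem and compares it with a Monte Carlo estimate. It is not the authors' algorithm. Everything in it is an illustrative assumption: a star tree whose leaves attach directly to the ancestor, a two-state symmetric substitution model with a uniform prior at the root, per-branch flip probabilities `flip_probs`, and all function names. The exact value is obtained by brute-force enumeration of every leaf pattern, which is exponential in the number of leaves and so only feasible for very small trees.

```python
import itertools
import random


def pattern_likelihood(pattern, root_state, flip_probs):
    """P(leaf pattern | root state) on a star tree with independent branches."""
    prob = 1.0
    for leaf_state, p in zip(pattern, flip_probs):
        prob *= p if leaf_state != root_state else 1.0 - p
    return prob


def exact_expected_error(flip_probs):
    """Expected error of the ML root reconstruction, by enumerating leaf patterns.

    For each pattern, the ML reconstruction picks the root state with the larger
    likelihood, so the error mass contributed by that pattern is the joint
    probability of the pattern with the other (less likely) root state.
    """
    total = 0.0
    for pattern in itertools.product((0, 1), repeat=len(flip_probs)):
        joint0 = 0.5 * pattern_likelihood(pattern, 0, flip_probs)
        joint1 = 0.5 * pattern_likelihood(pattern, 1, flip_probs)
        total += min(joint0, joint1)
    return total


def monte_carlo_error(flip_probs, trials, seed=0):
    """Estimate the same quantity by simulating evolution and reconstructing."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        root = rng.randint(0, 1)
        # Each leaf copies the root state and flips with its branch probability.
        pattern = tuple(root ^ (rng.random() < p) for p in flip_probs)
        like0 = pattern_likelihood(pattern, 0, flip_probs)
        like1 = pattern_likelihood(pattern, 1, flip_probs)
        guess = 0 if like0 >= like1 else 1
        errors += guess != root
    return errors / trials
```

For three leaves with a flip probability of 0.1 on each branch, the enumeration gives an expected error of 0.028, which is the probability that at least two of the three leaves have flipped. As branches shorten or leaves are added, the true error shrinks quickly, so a simulation observes few or no errors unless the number of trials grows accordingly. This is the inefficiency of Monte Carlo estimation that the abstract refers to.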
