Phylogeny reconstruction: increasing the accuracy of pairwise distance estimation using Bayesian inference of evolutionary rates.

Author information

Ninio Matan, Privman Eyal, Pupko Tal, Friedman Nir

Affiliation

The Selim and Rachel Benin School of Computer Science and Engineering, Hebrew University Jerusalem 91904, Israel.

Publication information

Bioinformatics. 2007 Jan 15;23(2):e136-41. doi: 10.1093/bioinformatics/btl304.

Abstract

Distance-based methods for phylogeny reconstruction are the fastest and easiest to use, and their popularity is accordingly high. They are also the only known methods that can cope with huge datasets of thousands of sequences. These methods rely on evolutionary distance estimation and are sensitive to errors in such estimations. In this study, a novel Bayesian method for estimation of evolutionary distances is developed. The proposed method enables the use of a sophisticated evolutionary model that better accounts for among-site rate variation (ASRV), thereby improving the accuracy of distance estimation. Rate variations are estimated within a Bayesian framework by extracting information from the entire dataset of sequences, unlike standard methods that can only use one pair of sequences at a time. We compare the accuracy of a cascade of distance estimation methods, starting from commonly used methods and moving towards the more sophisticated novel method. Simulation studies show significant improvements in the accuracy of distance estimation by the novel method over the commonly used ones. We demonstrate the effect of the improved accuracy on tree reconstruction using both real and simulated protein sequence alignments. An implementation of this method is available as part of the SEMPHY package.
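
To illustrate why accounting for among-site rate variation matters for distance estimation, the sketch below compares two classical closed-form corrections for protein p-distances: the Poisson correction, which assumes every site evolves at the same rate, and its gamma-corrected counterpart, which assumes site rates follow a gamma distribution with mean 1 and shape `alpha`. This is not the Bayesian method described in the abstract, and it is not taken from SEMPHY; the function names and the example values of `p` and `alpha` are assumptions chosen for illustration only.

```python
import math


def poisson_distance(p, n_states=20):
    """Poisson-corrected distance, assuming one shared rate for all sites.

    p is the observed fraction of differing sites between two sequences.
    """
    b = (n_states - 1) / n_states  # saturation level of p (0.95 for amino acids)
    if not 0 <= p < b:
        raise ValueError("p must lie in [0, (n_states - 1) / n_states)")
    return -b * math.log(1 - p / b)


def gamma_distance(p, alpha, n_states=20):
    """Gamma-corrected distance, assuming gamma-distributed site rates.

    alpha is the gamma shape parameter; a small alpha means strong
    rate variation among sites.
    """
    b = (n_states - 1) / n_states
    if not 0 <= p < b:
        raise ValueError("p must lie in [0, (n_states - 1) / n_states)")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return b * alpha * ((1 - p / b) ** (-1 / alpha) - 1)


if __name__ == "__main__":
    p = 0.5  # hypothetical observed proportion of differing sites
    print("Poisson:", round(poisson_distance(p), 4))
    for alpha in (0.5, 1.0, 2.0):  # hypothetical shape values
        print(f"Gamma (alpha={alpha}):", round(gamma_distance(p, alpha), 4))
```

For the same observed difference, the gamma-corrected distance is larger than the Poisson distance, and the gap widens as `alpha` decreases. Ignoring rate variation therefore tends to underestimate distances between divergent sequences, which is the kind of error that distance-based tree reconstruction is sensitive to.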
