

Evolutionary distances between nucleotide sequences based on the distribution of substitution rates among sites as estimated by parsimony.

Author information

Tourasse N J, Gouy M

Affiliation

Laboratoire de Biométrie, Génétique et Biologie des Populations, Université Claude Bernard, France.

Publication information

Mol Biol Evol. 1997 Mar;14(3):287-98. doi: 10.1093/oxfordjournals.molbev.a025764.

Abstract

The rate of evolution of macromolecules such as ribosomal RNAs and proteins varies along the molecule because structural and functional constraints differ between sites. Many studies have shown that ignoring this variation in computing evolutionary distances leads to severe underestimation of sequence divergences, and thus can lead to misleading evolutionary tree inferences. We propose here a new parsimony-based method for computing evolutionary distances between pairs of sequences that takes into account this variation and estimates it from the data. This method applies to the number of substitutions per site in ribosomal RNA genes as well as to the number of nonsynonymous substitutions per codon for protein-coding genes and is especially suitable when large data sets (≥ 100 sequences) are analyzed. First, starting from a phylogeny constructed with usual distances, the maximum-parsimony method is used to infer the distribution of the number of substitutions that have occurred at each site (or codon) along this tree. This distribution is then fitted to an "invariant + truncated negative binomial" distribution that allows for invariant sites. Maximum-likelihood fitting of this distribution to different data sets showed that it agreed very well with real data. Notably, allowing for invariant sites seemed to be very important. Finally, two distance estimates were developed by introducing the distribution of site variability into the substitution models of Jukes and Cantor and of Kimura. The use of different numbers of aligned sequences (up to 1,000 rRNA sequences) showed that the parameters of the model are very sensitive to the number of sequences used to estimate them. However, if at least 100 sequences are considered, the two new distance estimates are quite stable with respect to the number of sequences used to fit the distribution. This stability is true for low as well as for high evolutionary distances. These new distances appeared to be much better estimates of the number of substitutions per site than the classical distances of Jukes and Cantor and of Kimura, which both greatly underestimate this number, so that they can serve as indexes to detect saturation. We conclude that the new distances are particularly suitable for phylogenetic analysis when very distantly related species and relatively large data sets are considered. Trees reconstructed using these distances are generally different from those constructed by means of the classical estimates. Using this new method, we showed that the mean evolutionary distance between Prokaryotes and Eukaryotes is substantially higher for the small-subunit than for the large-subunit rRNAs. This suggests that the former might have experienced a drastic change during the early evolution of Eukaryotes.
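
The underestimation described in the abstract can be illustrated with a small sketch. The abstract does not give the authors' distance formulas, so the code below is not their estimator. It uses instead the standard gamma-plus-invariant correction to the Jukes-Cantor formula, relying on the fact that a negative binomial distribution of substitution counts across sites corresponds to gamma-distributed rates. The function names and the parameters `shape` and `inv_frac` are assumptions introduced for illustration only.

```python
import math


def jc_distance(p):
    """Classical Jukes-Cantor distance from the observed proportion p of differing sites."""
    x = 1.0 - 4.0 * p / 3.0
    if x <= 0.0:
        raise ValueError("p is too large: sequences are saturated under Jukes-Cantor")
    return -0.75 * math.log(x)


def jc_distance_gamma_inv(p, shape, inv_frac=0.0):
    """Jukes-Cantor distance with rate variation among sites (illustrative, not the paper's formula).

    Assumes a fraction inv_frac of sites never changes and that rates at the
    remaining sites follow a gamma distribution with the given shape and mean 1.
    The result is the mean number of substitutions per site over all sites.
    """
    variable = 1.0 - inv_frac
    x = 1.0 - 4.0 * p / (3.0 * variable)
    if x <= 0.0:
        raise ValueError("p is too large for the assumed fraction of variable sites")
    return 0.75 * shape * variable * (x ** (-1.0 / shape) - 1.0)


if __name__ == "__main__":
    p = 0.3  # hypothetical observed proportion of differing sites
    print("classical JC:          ", jc_distance(p))
    print("gamma, shape 0.5:      ", jc_distance_gamma_inv(p, shape=0.5))
    print("gamma + 20% invariant: ", jc_distance_gamma_inv(p, shape=0.5, inv_frac=0.2))
```

For the same observed difference, the corrected estimates are larger than the classical one, and the gap widens as the gamma shape decreases or the invariant fraction grows. As the shape grows large with no invariant sites, the corrected formula converges to the classical Jukes-Cantor distance.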

