Reconstructing evolutionary trees from DNA and protein sequences: paralinear distances.

Author information

Lake J A

Affiliation

Molecular Biology Institute, University of California, Los Angeles 90024.

Publication information

Proc Natl Acad Sci U S A. 1994 Feb 15;91(4):1455-9. doi: 10.1073/pnas.91.4.1455.

Abstract

The reconstruction of phylogenetic trees from DNA and protein sequences is confounded by unequal rate effects. These effects can group rapidly evolving taxa with other rapidly evolving taxa, whether or not they are genealogically related. All algorithms are sensitive to these effects whenever the assumptions on which they are based are not met. The algorithm presented here, called paralinear distances, is valid for a much broader class of substitution processes than previous algorithms and is accordingly less affected by unequal rate effects. It may be used with all nucleic acid, protein, or other sequences, provided that their evolution may be modeled as a succession of Markov processes. The properties of the method have been proven both analytically and by computer simulations. Like all other methods, paralinear distances can fail when sequences are misaligned or when site-to-site sequence variation of rates is extensive. To examine the usefulness of paralinear distances, the "origin of the eukaryotes" has been investigated by the analysis of elongation factor Tu sequences with a variety of sequence alignments. It has been found that the order in which sequences are pairwise aligned strongly determines the topology which is reconstructed by paralinear distances (as it does for all other reconstruction methods tested). When the parts of the alignment that are unaffected by alignment order are analyzed, paralinear distances strongly select the eocyte topology. This provides evidence that the eocyte prokaryotes are the closest prokaryotic relatives of the eukaryotes.
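
The abstract names the paralinear distance but does not state its formula. As the measure is commonly given, the distance between two aligned sequences x and y is d = -ln( det J / sqrt(det D_x · det D_y) ), where J is the matrix of paired-symbol counts and D_x, D_y are diagonal matrices of each sequence's symbol counts. The Python sketch below illustrates that formula under those assumptions; the function names, the `ACGT` alphabet, and the choice to skip sites containing gaps or other symbols are illustrative and are not taken from the paper.

```python
import math

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [list(map(float, row)) for row in matrix]
    n = len(m)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def paralinear_distance(seq_x, seq_y, alphabet="ACGT"):
    """Paralinear distance between two aligned, equal-length sequences.

    Assumption: sites where either symbol is outside `alphabet`
    (gaps, ambiguity codes) are skipped.
    """
    if len(seq_x) != len(seq_y):
        raise ValueError("sequences must be aligned to equal length")
    index = {symbol: i for i, symbol in enumerate(alphabet)}
    k = len(alphabet)

    # J[a][b] = number of sites with symbol a in x and symbol b in y.
    joint = [[0] * k for _ in range(k)]
    for a, b in zip(seq_x, seq_y):
        if a in index and b in index:
            joint[index[a]][index[b]] += 1

    # Diagonals of D_x and D_y: per-sequence symbol counts.
    counts_x = [sum(row) for row in joint]
    counts_y = [sum(joint[a][b] for a in range(k)) for b in range(k)]

    det_joint = determinant(joint)
    norm = math.sqrt(math.prod(counts_x) * math.prod(counts_y))
    if det_joint <= 0.0 or norm == 0.0:
        raise ValueError("distance undefined for these sequences")
    return -math.log(det_joint / norm)
```

Two identical sequences that contain every symbol of the alphabet give a distance of zero, and the value is symmetric in its two arguments. A matrix of such pairwise distances could then be passed to any distance-based tree-building method.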

Similar articles

7
Calculation of evolutionary trees from sequence data.
Proc Natl Acad Sci U S A. 1979 Sep;76(9):4516-20. doi: 10.1073/pnas.76.9.4516.
9
On the quality of tree-based protein classification.
Bioinformatics. 2005 May 1;21(9):1876-90. doi: 10.1093/bioinformatics/bti244. Epub 2005 Jan 12.
10
Comparing evolutionary distances via adaptive distance functions.
J Theor Biol. 2018 Mar 7;440:88-99. doi: 10.1016/j.jtbi.2017.12.022. Epub 2017 Dec 23.

Cited by

1
The evolution of the tree of life.
Philos Trans R Soc Lond B Biol Sci. 2025 Aug 7;380(1931):20240091. doi: 10.1098/rstb.2024.0091.
5
Spectral neighbor joining for reconstruction of latent tree models.
SIAM J Math Data Sci. 2021;3(1):113-141. doi: 10.1137/20m1365715. Epub 2021 Feb 1.
10
Coronavirus phylogeny based on triplets of nucleic acids bases.
Chem Phys Lett. 2006 Apr 15;421(4):313-318. doi: 10.1016/j.cplett.2006.01.030. Epub 2006 Feb 20.

References

1
Optimal sequence alignments.
Proc Natl Acad Sci U S A. 1983 Mar;80(5):1382-6. doi: 10.1073/pnas.80.5.1382.
4
Establishing homologies in protein sequences.
Methods Enzymol. 1983;91:524-45. doi: 10.1016/s0076-6879(83)91049-2.
