PhyPA: Phylogenetic method with pairwise sequence alignment outperforms likelihood methods in phylogenetics involving highly diverged sequences.

Author information

Xia Xuhua

Affiliations

Department of Biology, University of Ottawa, 30 Marie Curie, Ottawa K1N 6N5, Canada; Ottawa Institute of Systems Biology, 451 Smyth Road, Ottawa, ON K1H 8M5, Canada.

Publication information

Mol Phylogenet Evol. 2016 Sep;102:331-43. doi: 10.1016/j.ympev.2016.07.001. Epub 2016 Jul 1.

DOI:10.1016/j.ympev.2016.07.001
PMID:27377322
Abstract

While pairwise sequence alignment (PSA) by dynamic programming is guaranteed to generate one of the optimal alignments, multiple sequence alignment (MSA) of highly divergent sequences often results in poorly aligned sequences, plaguing all subsequent phylogenetic analysis. One way to avoid this problem is to use only PSA to reconstruct phylogenetic trees, which can only be done with distance-based methods. I compared the accuracy of this new computational approach (named PhyPA for phylogenetics by pairwise alignment) against the maximum likelihood method using MSA (the ML+MSA approach), based on nucleotide, amino acid and codon sequences simulated with different topologies and tree lengths. I present a surprising discovery that the fast PhyPA method consistently outperforms the slow ML+MSA approach for highly diverged sequences even when all optimization options are turned on for the ML+MSA approach. Only when sequences are not highly diverged (i.e., when a reliable MSA can be obtained) does the ML+MSA approach outperform PhyPA. The true topologies are always recovered by ML with the true alignment from the simulation. However, with MSA derived from alignment programs such as MAFFT or MUSCLE, the recovered topology consistently has higher likelihood than that for the true topology. Thus, the failure of the ML+MSA approach to recover the true topology is not because of insufficient search of tree space, but because of the distortion of phylogenetic signal by MSA methods. I have implemented in DAMBE PhyPA and two approaches making use of multi-gene data sets to derive phylogenetic support for subtrees equivalent to resampling techniques such as bootstrapping and jackknifing.
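
The core idea in the abstract (align each pair of sequences independently by dynamic programming, then feed the resulting distances to a distance-based tree method) can be illustrated with a small Python sketch. This is not the DAMBE implementation: the function names, the linear gap scoring (match 1, mismatch -1, gap -2), and the choice of the Jukes-Cantor (JC69) distance are all assumptions made for illustration only.

```python
import math
from itertools import combinations


def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global pairwise alignment by dynamic programming (linear gap penalty).

    The scoring values are illustrative assumptions, not those used by PhyPA.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append('-')
            i -= 1
        else:
            out_a.append('-')
            out_b.append(b[j - 1])
            j -= 1
    return ''.join(reversed(out_a)), ''.join(reversed(out_b))


def jc69_distance(aln_a, aln_b):
    """JC69 distance from one pairwise alignment, ignoring gapped columns."""
    pairs = [(x, y) for x, y in zip(aln_a, aln_b) if x != '-' and y != '-']
    if not pairs:
        return float('inf')
    p = sum(x != y for x, y in pairs) / len(pairs)
    if p >= 0.75:  # the JC69 correction is undefined at saturation
        return float('inf')
    return -0.75 * math.log(1 - 4 * p / 3)


def pairwise_distance_matrix(seqs):
    """Build a symmetric distance matrix using only pairwise alignments."""
    names = list(seqs)
    dist = {name: {name: 0.0} for name in names}
    for s, t in combinations(names, 2):
        d = jc69_distance(*needleman_wunsch(seqs[s], seqs[t]))
        dist[s][t] = dist[t][s] = d
    return dist
```

The resulting matrix would then be passed to a distance-based tree-building method such as neighbor-joining. Because each pair is aligned on its own, no distance depends on a multiple alignment of all sequences, which is the property the abstract relies on for highly diverged data.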

Similar articles

1
PhyPA: Phylogenetic method with pairwise sequence alignment outperforms likelihood methods in phylogenetics involving highly diverged sequences.
Mol Phylogenet Evol. 2016 Sep;102:331-43. doi: 10.1016/j.ympev.2016.07.001. Epub 2016 Jul 1.
2
Ancestral sequence alignment under optimal conditions.
BMC Bioinformatics. 2005 Nov 17;6:273. doi: 10.1186/1471-2105-6-273.
3
Characterization of pairwise and multiple sequence alignment errors.
Gene. 2009 Jul 15;441(1-2):141-7. doi: 10.1016/j.gene.2008.05.016. Epub 2008 Jun 3.
4
SATe-II: very fast and accurate simultaneous estimation of multiple sequence alignments and phylogenetic trees.
Syst Biol. 2012 Jan;61(1):90-106. doi: 10.1093/sysbio/syr095. Epub 2011 Dec 1.
5
The effect of the guide tree on multiple sequence alignments and subsequent phylogenetic analyses.
Pac Symp Biocomput. 2008:25-36. doi: 10.1142/9789812776136_0004.
6
Evidence of Statistical Inconsistency of Phylogenetic Methods in the Presence of Multiple Sequence Alignment Uncertainty.
Genome Biol Evol. 2015 Jul 1;7(8):2102-16. doi: 10.1093/gbe/evv127.
7
Alignment of, and phylogenetic inference from, random sequences: the susceptibility of alternative alignment methods to creating artifactual resolution and support.
Mol Phylogenet Evol. 2010 Dec;57(3):1004-16. doi: 10.1016/j.ympev.2010.09.004. Epub 2010 Sep 16.
8
Mind the gaps: evidence of bias in estimates of multiple sequence alignments.
Mol Biol Evol. 2007 Nov;24(11):2433-42. doi: 10.1093/molbev/msm176. Epub 2007 Aug 20.
9
Using confidence set heuristics during topology search improves the robustness of phylogenetic inference.
J Mol Evol. 2007 Jan;64(1):80-9. doi: 10.1007/s00239-006-0072-4. Epub 2006 Dec 9.
10
Characterization of multiple sequence alignment errors using complete-likelihood score and position-shift map.
BMC Bioinformatics. 2016 Mar 18;17:133. doi: 10.1186/s12859-016-0945-5.

Cited by

1
A sequence-based evolutionary distance method for Phylogenetic analysis of highly divergent proteins.
Sci Rep. 2023 Nov 20;13(1):20304. doi: 10.1038/s41598-023-47496-9.
2
Data on the solution and processing time reached when constructing a phylogenetic tree using a quantum-inspired computer.
Data Brief. 2023 Feb 13;47:108970. doi: 10.1016/j.dib.2023.108970. eCollection 2023 Apr.
3
Post-Alignment Adjustment and Its Automation.
Genes (Basel). 2021 Nov 18;12(11):1809. doi: 10.3390/genes12111809.
4
Major Revisions in Arthropod Phylogeny Through Improved Supermatrix, With Support for Two Possible Waves of Land Invasion by Chelicerates.
Evol Bioinform Online. 2020 Feb 5;16:1176934320903735. doi: 10.1177/1176934320903735. eCollection 2020.
5
DAMBE7: New and Improved Tools for Data Analysis in Molecular Biology and Evolution.
Mol Biol Evol. 2018 Jun 1;35(6):1550-1552. doi: 10.1093/molbev/msy073.
6
DAMBE6: New Tools for Microbial Genomics, Phylogenetics, and Molecular Evolution.
J Hered. 2017 Jun 1;108(4):431-437. doi: 10.1093/jhered/esx033.
7
The Role of +4U as an Extended Translation Termination Signal in Bacteria.
Genetics. 2017 Feb;205(2):539-549. doi: 10.1534/genetics.116.193961. Epub 2016 Nov 30.
8
Bioinformatics and Drug Discovery.
Curr Top Med Chem. 2017;17(15):1709-1726. doi: 10.2174/1568026617666161116143440.