Suppr超能文献

芒果:一种多序列比对的新方法。

MANGO: a new approach to multiple sequence alignment.

作者信息

Zhang Zefeng, Lin Hao, Li Ming

机构信息

Computational Biology Research Group, Division of Intelligent Software Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China.

出版信息

Comput Syst Bioinformatics Conf. 2007;6:237-47.

Abstract

Multiple sequence alignment is a classical and challenging task for biological sequence analysis. The problem is NP-hard. The full dynamic programming takes too much time. The progressive alignment heuristics adopted by most state of the art multiple sequence alignment programs suffer from the 'once a gap, always a gap' phenomenon. Is there a radically new way to do multiple sequence alignment? This paper introduces a novel and orthogonal multiple sequence alignment method, using multiple optimized spaced seeds and new algorithms to handle these seeds efficiently. Our new algorithm processes information of all sequences as a whole, avoiding problems caused by the popular progressive approaches. Because the optimized spaced seeds are provably significantly more sensitive than the consecutive k-mers, the new approach promises to be more accurate and reliable. To validate our new approach, we have implemented MANGO: Multiple Alignment with N Gapped Oligos. Experiments were carried out on large 16S RNA benchmarks showing that MANGO compares favorably, in both accuracy and speed, against state-of-art multiple sequence alignment methods, including ClustalW 1.83, MUSCLE 3.6, MAFFT 5.861, Prob-ConsRNA 1.11, Dialign 2.2.1, DIALIGN-T 0.2.1, T-Coffee 4.85, POA 2.0 and Kalign 2.0.

摘要

多序列比对是生物序列分析中的一项经典且具有挑战性的任务。该问题是NP难问题。完整的动态规划方法耗时过长。大多数先进的多序列比对程序所采用的渐进比对启发式算法存在“一旦有缺口,永远有缺口”的现象。是否存在一种全新的多序列比对方法呢?本文介绍了一种新颖且正交的多序列比对方法,该方法使用多个优化的间隔种子以及新算法来高效处理这些种子。我们的新算法将所有序列的信息作为一个整体进行处理,避免了流行的渐进方法所导致的问题。由于优化的间隔种子经证明比连续的k-mer序列明显更具敏感性,因此新方法有望更加准确和可靠。为了验证我们的新方法,我们实现了MANGO:带N个缺口寡核苷酸的多序列比对工具。在大型16S RNA基准数据集上进行的实验表明,MANGO在准确性和速度方面均优于包括ClustalW 1.83、MUSCLE 3.6、MAFFT 5.861、Prob-ConsRNA 1.11、Dialign 2.2.1、DIALIGN-T 0.2.1、T-Coffee 4.85、POA 2.0和Kalign 2.0在内的现有多序列比对方法。

文献AI研究员

20分钟写一篇综述,助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型,支持多种主流文档格式。

立即体验