Wang Yi, Li Kuo-Bin
Bioinformatics Institute, 30 Biopolis Street, Singapore 138671, Singapore.
Comput Biol Chem. 2004 Apr;28(2):141-8. doi: 10.1016/j.compbiolchem.2004.02.001.
Multiple sequence alignment is a basic tool in computational genomics, and the art of multiple sequence alignment lies in placing gaps. This paper presents a heuristic algorithm that iteratively improves multiple protein sequence alignments. A consistency-based objective function is used to evaluate candidate moves. During the iterative optimization, well-aligned regions can be detected and kept intact, and columns of gaps are inserted to help the algorithm escape from locally optimal alignments. The algorithm has been evaluated on the BAliBASE benchmark alignment database. The results show that its performance depends little on the initial (seed) alignment. Given a perfect consistency library, the algorithm is able to produce alignments that are close to the global optimum. We demonstrate that the algorithm can refine alignments produced by other software, including ClustalW, SAGA and T-COFFEE. The program is available upon request.
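The abstract does not spell out the objective function or the move set, so the following is only a minimal sketch of what a consistency-based score and a gap-column insertion might look like. It assumes a hypothetical library format, a dictionary mapping (sequence i, residue a, sequence j, residue b) to a weight, loosely in the spirit of T-COFFEE-style pair libraries. The function names consistency_score and insert_gap_column are illustrative and are not taken from the authors' program.

```python
from itertools import combinations


def residue_indices(row):
    """For each column of a gapped row, give the index of the residue in the
    ungapped sequence, or None where the column holds a gap ('-')."""
    indices, k = [], 0
    for ch in row:
        if ch == "-":
            indices.append(None)
        else:
            indices.append(k)
            k += 1
    return indices


def consistency_score(alignment, library):
    """Sum the library weights of every residue pair that shares a column.

    alignment: list of equal-length gapped strings.
    library:   dict {(i, a, j, b): weight} with i < j (assumed format).
    """
    idx = [residue_indices(row) for row in alignment]
    total = 0.0
    for col in range(len(alignment[0])):
        for i, j in combinations(range(len(alignment)), 2):
            a, b = idx[i][col], idx[j][col]
            if a is not None and b is not None:
                total += library.get((i, a, j, b), 0.0)
    return total


def insert_gap_column(alignment, col):
    """Return a new alignment with an all-gap column inserted before `col`."""
    return [row[:col] + "-" + row[col:] for row in alignment]


# Toy library: residue 0 of sequence 0 pairs with residue 0 of sequence 1, etc.
library = {(0, 0, 1, 0): 1.0, (0, 1, 1, 1): 1.0}
good = ["AC", "AC"]
shifted = ["AC-", "-AC"]
padded = insert_gap_column(good, 1)
```

In this toy setting the all-gap column leaves the score unchanged. In an iterative refinement loop such a column would only provide room for later moves to shift residues into, and each candidate move would be accepted or rejected by comparing scores before and after it.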