Combining many multiple alignments in one improved alignment.

Author information

Bucka-Lassen K, Caprani O, Hein J

Affiliations

Object Oriented Ltd, 6004 Luzern, Switzerland, Department of Computer Science and Department of Ecology and Genetics, University of Aarhus, 8000 Aarhus C, Denmark.

Publication information

Bioinformatics. 1999 Feb;15(2):122-30. doi: 10.1093/bioinformatics/15.2.122.

Abstract

MOTIVATION

The high complexity of the multiple sequence alignment problem has led to many different heuristic algorithms that attempt to find a solution within a reasonable amount of computation time and space. Very few of these heuristics produce results that are guaranteed always to lie within a certain distance of an optimal solution (given a measure of quality, e.g. parsimony). Most practical heuristics cannot guarantee this, but nevertheless perform well in certain cases. An alignment obtained with one of these heuristics is not unusable just because its overall score is bad: it may still contain important information on how substrings should be aligned. This paper presents a method that extracts qualitatively good sub-alignments from a set of multiple alignments and combines them into a new, often improved alignment. The algorithm is implemented as a variant of the traditional dynamic programming technique.
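The idea of combining sub-alignments by dynamic programming can be illustrated with a minimal sketch. It is not the ComAlign implementation; it assumes the simplest reading of the abstract: each input alignment traces a path through the lattice of sequence-prefix positions, paths may be switched wherever two alignments reach the same position vector, and the cheapest path through the union of all paths is found by dynamic programming. The sum-of-pairs cost with unit mismatch and gap penalties, and the names `column_cost` and `combine_alignments`, are assumptions made for illustration only.

```python
from itertools import combinations


def column_cost(col, mismatch=1, gap=2):
    """Sum-of-pairs cost of one alignment column (penalties are assumed values)."""
    cost = 0
    for a, b in combinations(col, 2):
        if a == "-" and b == "-":
            continue  # a gap paired with a gap costs nothing
        if a == "-" or b == "-":
            cost += gap
        elif a != b:
            cost += mismatch
    return cost


def combine_alignments(alignments):
    """Return (cost, rows) of the cheapest alignment assembled from the inputs.

    Each alignment is a list of equal-length gapped strings over the same
    sequences, in the same order. A node is the tuple of residues consumed
    so far in every sequence; each column is an edge between two nodes.
    """
    k = len(alignments[0])
    start = (0,) * k
    end = tuple(len(row.replace("-", "")) for row in alignments[0])

    # Collect the union of all paths as a graph: node -> {(next_node, column)}.
    edges = {}
    for aln in alignments:
        pos = start
        for col in zip(*aln):
            nxt = tuple(p + (c != "-") for p, c in zip(pos, col))
            if nxt != pos:  # ignore columns that contain only gaps
                edges.setdefault(pos, set()).add((nxt, col))
            pos = nxt

    # Every edge increases the coordinate sum, so sorting by sum gives a
    # valid topological order for the dynamic programming pass.
    best = {start: (0, None, None)}  # node -> (cost, previous node, column)
    for node in sorted(edges, key=sum):
        cost_here = best[node][0]
        for nxt, col in edges[node]:
            cand = cost_here + column_cost(col)
            if nxt not in best or cand < best[nxt][0]:
                best[nxt] = (cand, node, col)

    # Trace back from the end node to recover the chosen columns.
    cols = []
    node = end
    while node != start:
        _, prev, col = best[node]
        cols.append(col)
        node = prev
    cols.reverse()
    return best[end][0], ["".join(row) for row in zip(*cols)]


# Two poor alignments of the same pair of sequences ("AACC" and "AACC"):
# the first is good on the left, the second is good on the right.
a = ["AA-CC", "AAC-C"]
b = ["-AACC", "A-ACC"]
cost, rows = combine_alignments([a, b])
print(cost, rows)
```

In this toy case each input alignment costs 4 under the assumed penalties, while the combined result takes the left part of the first alignment and the right part of the second. A real implementation would need a richer scoring scheme and a way to cope with alignments that rarely share a position vector.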

RESULTS

An implementation of ComAlign (the algorithm that combines multiple alignments) has been run on several sets of artificially generated sequences and a set of 5S RNA sequences. To assess the quality of the alignments obtained, the results have been compared with the output of MSA 2.1 (Gupta et al., Proceedings of the Sixth Annual Symposium on Combinatorial Pattern Matching, 1995; Kececioglu et al., http://www.techfak.uni-bielefeld.de/bcd/Lectures/kececioglu.html, 1995). In all cases, ComAlign was able to produce a solution with a score comparable to the solution obtained by MSA. The results also show that ComAlign actually does combine parts from different alignments and not just select the best of them.

AVAILABILITY

The C source code of ComAlign (a Smalltalk version is being worked on) and of the other programs implemented in this context is freely available on the WWW (http://www.daimi.au.dk/~ocaprani).

CONTACT

klaus@bucka-lassen.dk; jotun@pop.bio.au.dk; ocaprani@daimi.au.dk

