Georg-August-Universität, Institut für Mikrobiologie und Genetik, Goldschmidtstrasse 1, 37077 Göttingen, Germany.
Bioinformatics. 2010 Apr 15;26(8):1015-21. doi: 10.1093/bioinformatics/btq082. Epub 2010 Feb 25.
Multiple sequence alignments can be constructed on the basis of pairwise local sequence similarities. This approach is flexible and can combine the advantages of global and local alignment methods. The restriction to pairwise alignments as building blocks, however, can lead to misalignments, since weak homologies may be missed when only pairs of sequences are compared.
Herein, we propose a graph-theoretical approach to find local multiple sequence similarities. Starting with pairwise alignments produced by DIALIGN, we use a min-cut algorithm to find potential (partial) alignment columns, which we then use to construct a final multiple alignment. On real and simulated benchmark data, our approach consistently outperforms the standard version of DIALIGN, in which local pairwise alignments are greedily incorporated into a multiple alignment.
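The abstract does not spell out how the min-cut step is set up, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each residue is a vertex labelled (sequence, position), that each pairwise-aligned residue pair contributes a weighted edge, and that a connected group containing two residues from the same sequence cannot be a single alignment column and is therefore split along a minimum cut between those two residues. The function name `min_cut_split`, the edge weights, and the toy data are all hypothetical; the cut itself is computed with a standard Edmonds-Karp maximum flow.

```python
from collections import defaultdict, deque


def min_cut_split(edges, s, t):
    """Return the (s-side, t-side) vertex sets of a minimum s-t cut.

    `edges` is a list of (u, v, weight) tuples of an undirected graph.
    Uses Edmonds-Karp (BFS augmenting paths) on the residual capacities.
    """
    cap = defaultdict(lambda: defaultdict(float))
    for u, v, w in edges:
        cap[u][v] += w
        cap[v][u] += w  # undirected edge: capacity in both directions

    def reachable_with_parents():
        # BFS over edges that still have residual capacity.
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = reachable_with_parents()
        if t not in parent:
            break  # no augmenting path left: the flow is maximal
        # Walk back from t to s to recover the augmenting path.
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck

    s_side = set(parent)          # vertices still reachable from s
    t_side = set(cap) - s_side    # everything else
    return s_side, t_side


# Hypothetical toy data: two well-supported residue groups joined by one
# weak, spurious pairwise match between ("C", 6) and ("A", 9).
edges = [
    (("A", 5), ("B", 7), 3.0),
    (("B", 7), ("C", 6), 3.0),
    (("A", 5), ("C", 6), 3.0),
    (("C", 6), ("A", 9), 1.0),   # weak link creating the conflict
    (("A", 9), ("B", 11), 3.0),
]

# ("A", 5) and ("A", 9) come from the same sequence, so they cannot share
# a column; separate them along the cheapest set of edges.
left, right = min_cut_split(edges, ("A", 5), ("A", 9))
```

In this toy case the cut removes only the weak edge, leaving two candidate partial columns, each with at most one residue per sequence. How the resulting columns are scored and assembled into the final alignment is not covered by this sketch.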
The prototype is freely available under the GNU Public Licence from E.C.