Hernández-Guía M, Mulet R, Rodríguez-Pérez S
Henri-Poincaré Group of Complex Systems, Physics Faculty, University of Havana, La Habana, CP 10400, Cuba.
Phys Rev E Stat Nonlin Soft Matter Phys. 2005 Sep;72(3 Pt 1):031915. doi: 10.1103/PhysRevE.72.031915. Epub 2005 Sep 27.
We propose a probabilistic algorithm to solve the multiple sequence alignment problem. The algorithm is a simulated annealing scheme that exploits the representation of the multiple alignment of D sequences as a directed polymer in D dimensions. Within this representation, the evolution of the alignment can be tracked easily through local moves of low computational cost. In contrast with other probabilistic algorithms proposed for this problem, our approach allows gaps to be created and deleted at no extra computational cost. The algorithm was tested by aligning proteins from the kinase family. For D=3, the results are consistent with those obtained using a complete algorithm. For D>3, where the complete algorithm fails, we show that our algorithm still converges to reasonable alignments. We also study the space of solutions obtained and show that, depending on the number of sequences aligned, the solutions are organized in different ways, which suggests a possible source of errors for progressive algorithms. Finally, we test our algorithm on artificially generated sequences and show that it can perform better than progressive algorithms. Moreover, in those cases in which a progressive algorithm works better, its solution can be taken as the initial condition of our algorithm, and we again obtain alignments with lower scores that are more relevant from the biological point of view.
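The directed-path picture described above can be illustrated with a small Python sketch. It is not the authors' implementation: the sum-of-pairs cost, the penalty values, the move set (`resplit`), and the geometric cooling schedule are all assumptions chosen for brevity, and the proposal is not tuned to satisfy detailed balance. An alignment of D sequences is stored as a list of steps, each a nonzero 0/1 vector of length D. A 1 advances that sequence by one residue and a 0 places a gap. A local move rewrites one or two consecutive steps while keeping their total displacement fixed, so gaps appear or disappear as a side effect, and only the affected columns are rescored.

```python
import math
import random

GAP = "-"

def column_cost(col, mismatch=1.0, gap=2.0):
    """Sum-of-pairs cost of one column (lower is better); penalties are assumed values."""
    cost = 0.0
    for i in range(len(col)):
        for j in range(i + 1, len(col)):
            a, b = col[i], col[j]
            if a == GAP and b == GAP:
                continue
            cost += gap if GAP in (a, b) else (mismatch if a != b else 0.0)
    return cost

def columns(steps, seqs, start):
    """Turn path steps into alignment columns, starting at lattice point `start`."""
    pos, cols = list(start), []
    for step in steps:
        col = []
        for k, s in enumerate(step):
            if s:
                col.append(seqs[k][pos[k]])
                pos[k] += 1
            else:
                col.append(GAP)
        cols.append(col)
    return cols

def cost(steps, seqs, start):
    return sum(column_cost(c) for c in columns(steps, seqs, start))

def resplit(total, rng):
    """Rewrite a displacement (entries 0, 1 or 2) as one or two nonzero 0/1 steps."""
    if sum(total) == 1 or (max(total) <= 1 and rng.random() < 0.5):
        return [tuple(total)]
    while True:
        first = tuple(1 if t == 2 else (rng.randint(0, 1) if t == 1 else 0)
                      for t in total)
        second = tuple(t - f for t, f in zip(total, first))
        if any(first) and any(second):
            return [first, second]

def anneal(seqs, sweeps=5000, t_start=2.0, t_end=0.05, seed=0):
    """Simulated annealing over directed paths; returns the best path visited."""
    rng = random.Random(seed)
    d = len(seqs)
    origin = [0] * d
    # Initial path: advance one sequence at a time (a fully gapped alignment).
    path = [tuple(int(k == j) for j in range(d))
            for k in range(d) for _ in seqs[k]]
    current = cost(path, seqs, origin)
    best, best_cost = list(path), current
    for n in range(sweeps):
        temp = t_start * (t_end / t_start) ** (n / max(1, sweeps - 1))
        i = rng.randrange(len(path))
        width = rng.choice((1, 2))
        old = path[i:i + width]
        total = [sum(c) for c in zip(*old)]
        new = resplit(total, rng)
        start = [sum(c) for c in zip(*path[:i])] if i else origin
        # Only the rewritten columns change, so the cost difference is local.
        delta = cost(new, seqs, start) - cost(old, seqs, start)
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            path[i:i + width] = new
            current += delta
            if current < best_cost:
                best, best_cost = list(path), current
    return best

if __name__ == "__main__":
    seqs = ["ACGT", "AGT", "ACT"]
    for row in zip(*columns(anneal(seqs), seqs, [0, 0, 0])):
        print("".join(row))
```

In this sketch the starting lattice point of the rewritten window is recomputed from the path prefix for simplicity. Caching the cumulative positions would make each move truly constant-cost in the alignment length. A progressive alignment could also be converted to a path and passed in as the initial condition, in the spirit of the final remark of the abstract.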