Godzik A, Skolnick J
Department of Molecular Biology, Scripps Research Institute, La Jolla, CA 92037, USA.
Comput Appl Biosci. 1994 Dec;10(6):587-96. doi: 10.1093/bioinformatics/10.6.587.
The recently described equivalence between the alignment of two proteins and a conformation of a lattice chain on a two-dimensional square lattice is extended to multiple alignments. The search for the optimal multiple alignment between several proteins, which is equivalent to finding the energy minimum in the conformational space of a multi-dimensional lattice chain, is studied by the Monte Carlo approach. This method, while not deterministic, and for two-dimensional problems slower than dynamic programming, can accept arbitrary scoring functions, including non-local ones, and its speed decreases slowly with increasing number of dimensions. For the local scoring functions, the MC algorithm can also reproduce known exact solutions for the direct multiple alignments. As illustrated by examples, both for structure- and sequence-based alignments, direct multi-dimensional alignments are able to capture weak similarities between divergent families much better than ones built from pairwise alignments by a hierarchical approach.
最近描述的两种蛋白质的比对与二维方格上晶格链构象之间的等效性被扩展到了多重比对。通过蒙特卡罗方法研究了寻找几种蛋白质之间的最优多重比对问题,这等同于在多维晶格链的构象空间中寻找能量最小值。该方法虽然不是确定性的,并且对于二维问题比动态规划慢,但它可以接受任意的评分函数,包括非局部评分函数,并且其速度随着维度数量的增加而缓慢下降。对于局部评分函数,蒙特卡罗算法也能够重现直接多重比对的已知精确解。如示例所示,无论是基于结构还是基于序列的比对,直接多维比对都比通过分层方法从两两比对构建的比对能够更好地捕捉不同家族之间的微弱相似性。