Chen Yu, Crippen Gordon M
Bioinformatics Program, University of Michigan, Ann Arbor, MI 48109, USA.
Bioinformatics. 2006 Sep 1;22(17):2087-93. doi: 10.1093/bioinformatics/btl351. Epub 2006 Jun 29.
Multiple STructural Alignment (MSTA) provides valuable information for solving problems such as fold recognition. The consistency-based approach tries to find conflict-free subsets of alignments in a pre-computed all-to-all Pairwise Alignment Library (PAL). If the library contains a large proportion of conflicts, consistency is hard to achieve. Many MSTA methods instead use multiple structural superposition to refine alignments. However, a multiple structural superposition depends on the alignments it is built from, and one generated from erroneous alignments is not guaranteed to be optimal. Avoiding errors from the beginning is better than correcting them afterwards. Hence it is important to refine the pairwise library to reduce the number of conflicts before any consistency-based assembly.
We present an algorithm, Iterative Refinement of Induced Structural alignment (IRIS), to refine the PAL. We also propose a new measure of the consistency of a library. Experiments show that our algorithm can greatly improve T-COFFEE performance for less consistent pairwise alignment libraries. The final multiple alignment outperforms most state-of-the-art MSTA algorithms at assembling 15 transglycosidases. Results on three other benchmarks show that the algorithm consistently improves multiple alignment performance.
The C++ code of the algorithm is available upon request.
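To make the notion of a conflict in a pairwise alignment library concrete, the following is a minimal Python sketch of one common way to score consistency: for every ordered triple of structures, it checks whether the alignment induced through an intermediate structure agrees with the direct pairwise alignment. This is an illustration only. It is neither the IRIS algorithm nor the consistency measure proposed in the paper, whose definitions are not given in the abstract. The library layout (a dict keyed by structure-index pairs, mapping residue indices to residue indices) and the function names `get_map` and `triplet_consistency` are assumptions made for this example.

```python
from itertools import permutations


def get_map(pal, i, j):
    """Return the residue map from structure i to structure j.

    Assumes the library stores each unordered pair once, under either
    (i, j) or (j, i); the stored map is inverted when needed.
    """
    if (i, j) in pal:
        return pal[(i, j)]
    return {b: a for a, b in pal[(j, i)].items()}


def triplet_consistency(pal, n):
    """Fraction of induced alignments i -> j -> k confirmed by the direct i -> k map.

    pal: hypothetical all-to-all library, {(i, j): {residue_in_i: residue_in_j}}
    n:   number of structures, indexed 0 .. n-1
    """
    supported = 0
    total = 0
    for i, j, k in permutations(range(n), 3):
        ij = get_map(pal, i, j)
        jk = get_map(pal, j, k)
        ik = get_map(pal, i, k)
        for a, b in ij.items():
            c = jk.get(b)
            if c is None:
                continue  # residue b is unaligned in k, so nothing is induced
            total += 1
            if ik.get(a) == c:
                supported += 1
    # With no induced pairs there is nothing to contradict.
    return supported / total if total else 1.0


# Toy library for three structures with two aligned residues per pair.
consistent_pal = {
    (0, 1): {0: 0, 1: 1},
    (1, 2): {0: 0, 1: 1},
    (0, 2): {0: 0, 1: 1},
}
# Same library, but the direct 0-2 alignment disagrees with the path via 1.
conflicting_pal = dict(consistent_pal)
conflicting_pal[(0, 2)] = {0: 0, 1: 2}

print(triplet_consistency(consistent_pal, 3))
print(triplet_consistency(conflicting_pal, 3))
```

A score of 1.0 means every induced alignment is confirmed by the library; lower scores indicate more conflicts, which is the situation in which refining the library before consistency-based assembly is expected to matter most.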