Dong Elizabeth, Smith Jarrod, Heinze Sten, Alexander Nathan, Meiler Jens
Department of Chemistry and Center for Structural Biology, Vanderbilt University, Nashville, TN, USA.
Gene. 2008 Oct 1;422(1-2):41-6. doi: 10.1016/j.gene.2008.06.006. Epub 2008 Jun 7.
BCL::Align is a multiple sequence alignment tool that utilizes the dynamic programming method in combination with a customizable scoring function for sequence alignment and fold recognition. The scoring function is a weighted sum of the traditional PAM and BLOSUM scoring matrices, position-specific scoring matrices output by PSI-BLAST, secondary structure predicted by a variety of methods, chemical properties, and gap penalties. By adjusting the weights, the method can be tailored for fold recognition or sequence alignment tasks at different levels of sequence identity. A Monte Carlo algorithm was used to determine optimized weight sets for sequence alignment and fold recognition that most accurately reproduced the SABmark reference alignment test set. In an evaluation of sequence alignment performance, BCL::Align ranked best in alignment accuracy (Cline score of 22.90 for sequences in the Twilight Zone) when compared with Align-m, ClustalW, T-Coffee, and MUSCLE. ROC curve analysis indicates that BCL::Align correctly recognizes protein folds with over 80% accuracy. The flexibility of the program allows it to be optimized for specific classes of proteins (e.g. membrane proteins) or fold families (e.g. TIM-barrel proteins). BCL::Align is free for academic use and available online at http://www.meilerlab.org/.
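The idea of a weighted-sum scoring function driving a dynamic programming alignment can be illustrated with a minimal Python sketch. This is not the BCL::Align implementation: the term functions, the weight names, the hydrophobic residue grouping, and the linear gap penalty are all assumptions chosen for illustration, and the identity term is only a toy stand-in for a PAM or BLOSUM lookup. PSI-BLAST profile terms are omitted.

```python
from typing import Callable, Dict, List, Tuple

# One sequence position: (residue, predicted secondary-structure state H/E/C).
Position = Tuple[str, str]
Term = Callable[[Position, Position], float]

# Coarse hydrophobic grouping, an illustrative assumption (not from the paper).
HYDROPHOBIC = set("AVILMFWC")


def identity_term(a: Position, b: Position) -> float:
    """Toy stand-in for a PAM/BLOSUM lookup: +1 if residues match, else -1."""
    return 1.0 if a[0] == b[0] else -1.0


def secondary_structure_term(a: Position, b: Position) -> float:
    """+1 when the predicted secondary-structure states agree, else 0."""
    return 1.0 if a[1] == b[1] else 0.0


def chemistry_term(a: Position, b: Position) -> float:
    """+1 when both residues fall in the same hydrophobic/polar class, else 0."""
    return 1.0 if (a[0] in HYDROPHOBIC) == (b[0] in HYDROPHOBIC) else 0.0


# Hypothetical term names; the weights dict must use the same keys.
TERMS: Dict[str, Term] = {
    "substitution": identity_term,
    "secondary_structure": secondary_structure_term,
    "chemistry": chemistry_term,
}


def pair_score(a: Position, b: Position, weights: Dict[str, float]) -> float:
    """Weighted sum of all scoring terms for one aligned pair of positions."""
    return sum(weights[name] * term(a, b) for name, term in TERMS.items())


def positions(residues: str, states: str) -> List[Position]:
    """Zip a residue string with its secondary-structure string."""
    return list(zip(residues, states))


def align(seq_a: List[Position], seq_b: List[Position],
          weights: Dict[str, float], gap: float = -2.0) -> Tuple[float, str, str]:
    """Global (Needleman-Wunsch) alignment with a linear gap penalty."""
    n, m = len(seq_a), len(seq_b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [[""] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0], move[i][0] = i * gap, "up"
    for j in range(1, m + 1):
        score[0][j], move[0][j] = j * gap, "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = [
                (score[i - 1][j - 1] + pair_score(seq_a[i - 1], seq_b[j - 1], weights), "diag"),
                (score[i - 1][j] + gap, "up"),     # gap in seq_b
                (score[i][j - 1] + gap, "left"),   # gap in seq_a
            ]
            score[i][j], move[i][j] = max(candidates, key=lambda c: c[0])
    # Trace back from the bottom-right corner to recover the alignment.
    i, j, row_a, row_b = n, m, [], []
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "diag":
            row_a.append(seq_a[i - 1][0]); row_b.append(seq_b[j - 1][0]); i -= 1; j -= 1
        elif step == "up":
            row_a.append(seq_a[i - 1][0]); row_b.append("-"); i -= 1
        else:
            row_a.append("-"); row_b.append(seq_b[j - 1][0]); j -= 1
    return score[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))
```

Changing the entries of `weights` shifts the balance between residue similarity and structural agreement, which is the mechanism the abstract describes for tailoring the method to fold recognition or to sequence alignment. In this sketch the weights are passed in by hand; the Monte Carlo optimization of weight sets against a reference alignment set is not shown.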