Madhusudhan M S, Webb Benjamin M, Marti-Renom Marc A, Eswar Narayanan, Sali Andrej
Department of Bioengineering and Therapeutic Sciences, University of California at San Francisco, San Francisco, CA 94158, USA.
Protein Eng Des Sel. 2009 Sep;22(9):569-74. doi: 10.1093/protein/gzp040. Epub 2009 Jul 8.
Comparing the structures of proteins is crucial to gaining insight into protein evolution and function. Here, we align the sequences of multiple protein structures by a dynamic programming optimization of a scoring function that is a sum of an affine gap penalty and terms dependent on various sequence and structure features (SALIGN). The features include amino acid residue type, residue position, residue accessible surface area, residue secondary structure state and the conformation of a short segment centered on the residue. The multiple alignment is built by following the 'guide' tree constructed from the matrix of all pairwise protein alignment scores. Importantly, the method does not depend on the exact values of various parameters, such as feature weights and gap penalties, because the optimal alignment across a range of parameter values is found. Using multiple structure alignments in the HOMSTRAD database, SALIGN was benchmarked against MUSTANG for multiple alignments as well as against TM-align and CE for pairwise alignments. On average, SALIGN produces a 15% improvement in structural overlap over HOMSTRAD and 14% over MUSTANG, and yields more equivalent structural positions than TM-align and CE in 90% and 95% of cases, respectively. The utility of accurate multiple structure alignment is illustrated by its application to comparative protein structure modeling.
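The pairwise step described above (dynamic programming over a score that combines an affine gap penalty with weighted feature terms) can be illustrated with a minimal sketch. This is not the SALIGN implementation: the `Residue` fields, the toy per-feature similarity terms in `pair_score`, the weight names, and the gap values are all hypothetical choices made for illustration, and the recursion is the standard three-state affine-gap scheme, which returns only the optimal score.

```python
from dataclasses import dataclass

NEG_INF = float("-inf")


@dataclass(frozen=True)
class Residue:
    # Hypothetical per-residue features; the abstract lists more than these.
    aa: str               # one-letter residue type
    ss: str               # secondary structure state, e.g. "H", "E", "C"
    accessibility: float  # relative accessible surface area in [0, 1]


def pair_score(a, b, weights):
    """Weighted sum of toy feature similarities (assumed forms, not SALIGN's)."""
    s_type = 1.0 if a.aa == b.aa else -1.0
    s_ss = 1.0 if a.ss == b.ss else -1.0
    s_acc = 1.0 - 2.0 * abs(a.accessibility - b.accessibility)
    return weights["type"] * s_type + weights["ss"] * s_ss + weights["acc"] * s_acc


def affine_align_score(x, y, weights, gap_open=-2.0, gap_extend=-0.5):
    """Optimal global alignment score; a gap of length k costs
    gap_open + (k - 1) * gap_extend."""
    n, m = len(x), len(y)
    # M: x[i-1] aligned to y[j-1]; X: x[i-1] aligned to a gap;
    # Y: y[j-1] aligned to a gap.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        X[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + (j - 1) * gap_extend
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = pair_score(x[i - 1], y[j - 1], weights)
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = max(M[i - 1][j] + gap_open,
                          X[i - 1][j] + gap_extend,
                          Y[i - 1][j] + gap_open)
            Y[i][j] = max(M[i][j - 1] + gap_open,
                          Y[i][j - 1] + gap_extend,
                          X[i][j - 1] + gap_open)
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    helix = [Residue("A", "H", 0.2), Residue("L", "H", 0.1), Residue("K", "H", 0.7)]
    weights = {"type": 1.0, "ss": 1.0, "acc": 1.0}
    print(affine_align_score(helix, helix, weights))
```

Because the weights and gap penalties are plain arguments, calling `affine_align_score` over a grid of values is one simple way to explore how the result changes with the parameters; the abstract's statement that an optimal alignment is found across a range of parameter values refers to the authors' own procedure, which this sketch does not reproduce.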