Mukherjee Srayanta, Zhang Yang
Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Dr, Lawrence, KS 66047, USA.
Nucleic Acids Res. 2009 Jun;37(11):e83. doi: 10.1093/nar/gkp318. Epub 2009 May 14.
Structural comparison of multiple-chain protein complexes is essential in many studies of protein-protein interactions. We develop a new algorithm, MM-align, for sequence-independent alignment of protein complex structures. The algorithm is built on a heuristic iteration of a modified Needleman-Wunsch dynamic programming (DP) algorithm, with the alignment score specified by the inter-complex residue distances. The multiple chains in each complex are first joined, in every possible order, and then aligned simultaneously, with cross-chain alignments prevented. The alignment of interface residues is enhanced by an interface-specific weighting factor. MM-align is tested on a large-scale benchmark set of 205 × 3897 non-homologous multiple-chain complex pairs. Compared with a naïve extension of the monomer alignment program TM-align, MM-align achieves significantly higher alignment accuracy, as judged by the average TM-score of the physically aligned residues. MM-align is about two times faster than TM-align because it omits the cross-alignment zone of the DP matrix. The results also show that the enhanced alignment of the interfaces helps in identifying biologically relevant protein complex pairs.
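The chain-joining and restricted DP steps described in the abstract can be illustrated with a minimal Python sketch. This is not the MM-align implementation. It assumes the two complexes are already superposed, uses no gap penalty, and leaves out the heuristic iteration, the interface weighting factor, and TM-score normalization. The names `pair_score`, `join`, `blocked_dp`, `best_chain_order`, and the distance scale `D0` are all hypothetical and chosen only for illustration. The sketch also assumes that complex A has no more chains than complex B, so that permuting the chains of B alone covers every chain pairing.

```python
from itertools import permutations
from math import dist

D0 = 4.0  # hypothetical distance scale in angstroms (an assumption, not from the paper)


def pair_score(p, q, d0=D0):
    """Distance-based similarity in (0, 1]; closer residues score higher."""
    d = dist(p, q)
    return 1.0 / (1.0 + (d / d0) ** 2)


def join(chains, order):
    """Concatenate chains in the given order.

    Returns the joined coordinate list and, for each residue, the position
    (block index) of its chain within the joined order.
    """
    coords, block = [], []
    for k, idx in enumerate(order):
        coords.extend(chains[idx])
        block.extend([k] * len(chains[idx]))
    return coords, block


def blocked_dp(coords_a, block_a, coords_b, block_b):
    """Needleman-Wunsch-style DP without gap penalties.

    Two residues may be paired only if they sit in the same block, so the
    cross-chain zone of the DP matrix never contributes a match.
    """
    n, m = len(coords_a), len(coords_b)
    f = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = max(f[i - 1][j], f[i][j - 1])
            if block_a[i - 1] == block_b[j - 1]:
                s = pair_score(coords_a[i - 1], coords_b[j - 1])
                best = max(best, f[i - 1][j - 1] + s)
            f[i][j] = best
    return f[n][m]


def best_chain_order(chains_a, chains_b):
    """Try every joining order of complex B against a fixed order of complex A."""
    coords_a, block_a = join(chains_a, range(len(chains_a)))
    best_score, best_order = -1.0, None
    for order in permutations(range(len(chains_b))):
        coords_b, block_b = join(chains_b, order)
        score = blocked_dp(coords_a, block_a, coords_b, block_b)
        if score > best_score:
            best_score, best_order = score, order
    return best_score, best_order
```

Because matches are allowed only inside the diagonal blocks, the inner loop could skip the off-diagonal cells' match evaluation entirely, which is in the spirit of the speed-up the abstract attributes to omitting the cross-alignment zone. The number of joining orders grows factorially with the chain count, so this brute-force enumeration is only practical for small complexes.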