Salem Saeed, Zaki Mohammed J
Computer Science Department, Rensselaer Polytechnic Institute, 110 8th St. Troy, NY 12180, USA.
Comput Syst Bioinformatics Conf. 2008;7:183-94.
Structural similarity between proteins gives insight into the evolutionary relationships between proteins that have low sequence similarity. In this paper, we present a novel approach called STSA for non-sequential pairwise structural alignment. Starting from an initial alignment, our approach iterates over a two-step process, a superposition step followed by an alignment step, until convergence. Given two superposed structures, we propose a novel greedy algorithm to construct both sequential and non-sequential alignments. The quality of STSA alignments is evident in their high agreement with the reference alignments in the challenging-to-align RPIC set. Moreover, on a dataset of 4410 protein pairs selected from the CATH database, STSA achieves high sensitivity and high specificity, is competitive with state-of-the-art alignment methods, and gives longer alignments with lower RMSD. The STSA software along with the data sets will be made available online at http://www.cs.rpi.edu/-zaki/software/STSA.
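To make the alignment step more concrete, the following is a minimal sketch of one way a greedy non-sequential alignment could be built from two structures that have already been superposed. It is not the STSA algorithm itself, whose details the abstract does not give. The function names, the representation of each structure as a list of (x, y, z) coordinates, and the 5.0 distance cutoff are all assumptions made for illustration.

```python
import math


def greedy_nonsequential_alignment(coords_a, coords_b, max_dist=5.0):
    """Greedily pair points of two already-superposed structures.

    coords_a, coords_b: lists of (x, y, z) tuples, e.g. one point per residue.
    max_dist: hypothetical distance cutoff (an assumption, not from the paper).
    Returns (i, j) index pairs, closest pairs first; sequence order is ignored.
    """
    # Collect every cross-structure pair that lies within the cutoff.
    candidates = []
    for i, a in enumerate(coords_a):
        for j, b in enumerate(coords_b):
            d = math.dist(a, b)
            if d <= max_dist:
                candidates.append((d, i, j))

    # Take the closest pairs first, using each point at most once.
    candidates.sort()
    used_a, used_b, pairs = set(), set(), []
    for _, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            pairs.append((i, j))
    return pairs


def rmsd(coords_a, coords_b, pairs):
    """Root-mean-square deviation over the aligned pairs."""
    if not pairs:
        raise ValueError("alignment is empty")
    total = sum(math.dist(coords_a[i], coords_b[j]) ** 2 for i, j in pairs)
    return math.sqrt(total / len(pairs))


# Toy input: the matching points appear in reversed order in structure B.
a = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0)]
b = [(8.5, 0.0, 0.0), (0.5, 0.0, 0.0), (30.0, 0.0, 0.0)]
pairs = greedy_nonsequential_alignment(a, b)
print(pairs)              # [(0, 1), (2, 0)]
print(rmsd(a, b, pairs))  # 0.5
```

In the toy input the first point of A pairs with the second point of B and the third point of A pairs with the first point of B, so the alignment does not preserve sequence order. In an iterative scheme like the one the abstract describes, pairs of this kind would feed the next superposition step, and the loop would repeat until the alignment stops changing.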