Ye Yuzhen, Godzik Adam
Program in Bioinformatics and Systems Biology, The Burnham Institute, 10901 N. Torrey Pines Road, La Jolla, CA 92037, USA.
Protein Sci. 2004 Jul;13(7):1841-50. doi: 10.1110/ps.03602304.
We have recently developed a flexible protein structure alignment program (FATCAT) that identifies structural similarity while accounting for the flexibility of protein structures. One of the most important applications of a structure alignment method is to aid functional annotation by identifying similar structures in large structural databases. However, none of the flexible structure alignment methods has been applied to this task, because no estimate of the statistical significance of flexible alignments was available. In this paper, we develop an estimate of the statistical significance of the FATCAT alignment score, allowing FATCAT to be used as a database-searching tool. The results reported here show that (1) the distribution of the FATCAT similarity score between two unrelated protein structures follows an extreme value distribution (EVD), adding one more example to the current collection of EVDs of sequence and structure similarities; (2) introducing flexibility into structure comparison only slightly influences the sensitivity and specificity of identifying similar structures; and (3) the overall performance of FATCAT as a database-searching tool is comparable to that of the widely used rigid-body structure comparison programs DALI and CE. Two examples illustrating the advantages of using flexible structure alignments in database searching are also presented. The conformational flexibility detected in the first example may be involved in substrate specificity, and the conformational flexibility detected in the second example may reflect the evolution of structures by block building.
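To make the significance estimate in point (1) concrete, the following is a minimal sketch of how a P-value can be read off an extreme value distribution fitted to scores from unrelated structure pairs. It is an illustration under stated assumptions, not the procedure used in the paper: the abstract does not say how the EVD parameters are estimated, so the method-of-moments fit, the function names (`fit_gumbel_moments`, `gumbel_p_value`, `sample_gumbel`), and the synthetic background scores are all hypothetical.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(scores):
    """Method-of-moments fit of a Gumbel (maximum) EVD.

    Assumption: this simple estimator stands in for whatever fitting
    procedure the paper actually uses. It relies on
    mean = mu + gamma * beta and variance = pi^2 * beta^2 / 6.
    """
    mean = statistics.fmean(scores)
    std = statistics.stdev(scores)
    beta = std * math.sqrt(6.0) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta


def gumbel_p_value(score, mu, beta):
    """P(S >= score) for an unrelated pair under the fitted EVD."""
    z = (score - mu) / beta
    if z < -50.0:
        return 1.0  # far below the location parameter; avoids overflow in exp
    # 1 - exp(-exp(-z)), written with expm1 for accuracy at small P-values
    return -math.expm1(-math.exp(-z))


def sample_gumbel(rng, mu, beta):
    """Draw one synthetic background score by inverting the Gumbel CDF."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep both logs finite
    return mu - beta * math.log(-math.log(u))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical background: scores between unrelated structures.
    background = [sample_gumbel(rng, 100.0, 20.0) for _ in range(20000)]
    mu, beta = fit_gumbel_moments(background)
    for s in (100.0, 200.0, 300.0):
        print(f"score={s:.0f}  P-value={gumbel_p_value(s, mu, beta):.3g}")
```

In a database search, a P-value of this kind would be computed for each hit so that hits can be ranked and thresholded on a common scale, whether or not the underlying alignment introduces flexibility.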