Fischer D, Bachar O, Nussinov R, Wolfson H
Computer Science Department, School of Mathematical Sciences, Tel Aviv University, Israel.
J Biomol Struct Dyn. 1992 Feb;9(4):769-89. doi: 10.1080/07391102.1992.10507955.
As the number of available three-dimensional protein coordinate sets increases, it has become recognized that proteins from different families and topologies are constructed from independent motifs. Detecting specific structural motifs within proteins aids in understanding their role and mechanism of operation. To aid in identifying and using these motifs, efficient methods for systematically scanning structural databases have become necessary. To date, methods for structural comparison of proteins suffer from at least one of the following limitations: they (1) are not fully automated (they require human intervention), (2) are limited to relatively similar structures, (3) are constrained to linear alignments of the structures, (4) are sensitive to insertions, deletions or gaps in the sequences, or (5) are very time-consuming. We present a method that overcomes these limitations. The method discovers and ranks every piece of structural similarity between the structures compared, thus allowing the simultaneous detection of real 3-D motifs in different domains, between domains, in active sites, on surfaces, etc. The method uses the Geometric Hashing Paradigm, an efficient technique originally developed for Computer Vision. The algorithm exploits the geometrical constraints of rigid objects, is especially geared towards recognizing partial structures in rigid objects belonging to large databases, and is straightforwardly parallelizable. Here, Computer Vision techniques are applied to molecular structure comparison for the first time, resulting in an efficient, fully automated tool. The method has been tested in a number of cases, including comparisons of the haemoglobins, immunoglobulins, serine proteinases, calcium-binding proteins, DNA-binding proteins and others. In all examples our results were equivalent to the published results from previous methods, and in some cases our method yielded additional structural information.
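The abstract names the Geometric Hashing Paradigm but gives no implementation details. The following is a minimal sketch of generic geometric hashing for rigid 3-D point sets, not the authors' implementation. It assumes each structure is reduced to a list of points (for example, one point per residue), that an ordered triplet of non-collinear points defines a reference frame, and that coordinates are quantized with a hypothetical cell size `cell`. All function names and the toy data are illustrative.

```python
from collections import defaultdict
from itertools import permutations
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(a):
    n = math.sqrt(_dot(a, a))
    return None if n < 1e-9 else (a[0] / n, a[1] / n, a[2] / n)

def frame(p0, p1, p2):
    """Orthonormal frame from an ordered triplet; None if the triplet is degenerate."""
    x = _unit(_sub(p1, p0))
    if x is None:
        return None
    z = _unit(_cross(x, _sub(p2, p0)))
    if z is None:  # collinear points define no plane
        return None
    return p0, (x, _cross(z, x), z)

def local_key(point, fr, cell):
    """Quantized coordinates of `point` in frame `fr`; unchanged by rigid motion."""
    origin, axes = fr
    d = _sub(point, origin)
    return tuple(round(_dot(d, ax) / cell) for ax in axes)

def build_table(models, cell=1.0):
    """Preprocessing: hash every non-basis point under every ordered basis triplet."""
    table = defaultdict(list)
    for name, pts in models.items():
        for basis in permutations(range(len(pts)), 3):
            fr = frame(*(pts[b] for b in basis))
            if fr is None:
                continue
            for m, p in enumerate(pts):
                if m not in basis:
                    table[local_key(p, fr, cell)].append((name, basis))
    return table

def recognize(table, target, basis, cell=1.0):
    """Recognition: one target basis votes for matching (model, basis) entries."""
    votes = defaultdict(int)
    fr = frame(*(target[b] for b in basis))
    if fr is None:
        return votes
    for m, p in enumerate(target):
        if m not in basis:
            for entry in table.get(local_key(p, fr, cell), ()):
                votes[entry] += 1
    return votes

# Toy data (hypothetical): the target is the model rotated 90 degrees about z,
# shifted, and extended with one extra point, so only part of it matches.
model = [(0, 0, 0), (4, 0, 0), (0, 5, 0), (3, 3, 6), (-4, 2, 3)]
target = [(-y + 10, x - 2, z + 7) for x, y, z in model] + [(20, 20, 20)]

table = build_table({"model_A": model})
votes = recognize(table, target, basis=(0, 1, 2))
best = max(votes, key=votes.get)
print(best, votes[best])
```

In this sketch the table is built once per database and each target basis is processed independently, which is one way to read the abstract's remark that the approach is straightforwardly parallelizable. A fuller version would try many target bases and verify high-vote candidates with a least-squares superposition.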