Kikuchi T, Némethy G, Scheraga H A
Baker Laboratory of Chemistry, Cornell University, Ithaca, New York 14853-1301.
J Protein Chem. 1988 Aug;7(4):427-71. doi: 10.1007/BF01024890.
The location of structural domains in proteins is predicted from the amino acid sequence, based on the analysis of a computed contact map for the protein, the average distance map (ADM). Interactions between residues i and j in a protein are subdivided into several ranges, according to the separation |i-j| in the amino acid sequence. Within each range, average spatial distances between every pair of amino acid residues are computed from a database of known protein structures. Infrequently occurring pairs are omitted as being statistically insignificant. The average distances are used to construct a predicted ADM. The ADM is analyzed for the occurrence of regions with high densities of contacts (compact regions). Locations of rapid changes of density between various parts of the map are determined by the use of scanning plots of contact densities. These locations serve to pinpoint the distribution of compact regions. This distribution, in turn, is used to predict boundaries of domains in the protein. The technique provides an objective method for the location of domains both on a contact map derived from a known three-dimensional protein structure, the real distance map (RDM), and on an ADM. While most other published methods for the identification of domains locate them in the known three-dimensional structure of a protein, the technique presented here also permits the prediction of domains in proteins of unknown spatial structure, as the construction of the ADM for a given protein requires knowledge of only its amino acid sequence.
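The pipeline described in the abstract (range-dependent average distances, omission of rare pairs, a predicted contact map, and a density scan) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The separation ranges in `RANGES`, the `min_count` threshold, the distance `cutoff`, and all function names are hypothetical, since the abstract does not specify them. The function `scan_density` is a simplified stand-in for the paper's scanning plots: it reports the contact density between the two segments on either side of each possible cut point, so a low value hints at a candidate domain boundary.

```python
import math

# Hypothetical separation ranges (inclusive bounds on |i - j|).
# The abstract says only that "several ranges" are used.
RANGES = [(1, 8), (9, 20), (21, 30)]


def range_index(sep):
    """Return the index of the range containing separation `sep`, or None."""
    for k, (lo, hi) in enumerate(RANGES):
        if lo <= sep <= hi:
            return k
    return None


def pair_key(k, a, b):
    """Order-independent key for a residue-type pair within range k."""
    return (k, *sorted((a, b)))


def average_distances(structures, min_count=2):
    """Average spatial distance per (range, residue-type pair).

    `structures` is a list of (sequence, coords) tuples, where coords holds
    one (x, y, z) point per residue. Pairs observed fewer than `min_count`
    times are dropped as statistically insignificant (threshold assumed).
    """
    sums, counts = {}, {}
    for seq, coords in structures:
        for i in range(len(seq)):
            for j in range(i + 1, len(seq)):
                k = range_index(j - i)
                if k is None:
                    continue
                key = pair_key(k, seq[i], seq[j])
                sums[key] = sums.get(key, 0.0) + math.dist(coords[i], coords[j])
                counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key]
            for key in sums if counts[key] >= min_count}


def predicted_adm(seq, avg, cutoff):
    """Binary map: 1 where the tabulated average distance is within `cutoff`."""
    n = len(seq)
    adm = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            k = range_index(j - i)
            if k is None:
                continue
            key = pair_key(k, seq[i], seq[j])
            if key in avg and avg[key] <= cutoff:
                adm[i][j] = adm[j][i] = 1
    return adm


def scan_density(adm):
    """Contact density between segments [0, b) and [b, n) for each cut b."""
    n = len(adm)
    densities = []
    for b in range(1, n):
        contacts = sum(adm[i][j] for i in range(b) for j in range(b, n))
        densities.append(contacts / (b * (n - b)))
    return densities
```

Because `predicted_adm` needs only a sequence and the precomputed table, the same scan applies to a protein of unknown structure. A map derived from known coordinates (the RDM) could be passed to `scan_density` in the same way.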