Holm L, Sander C
European Molecular Biology Laboratory, Heidelberg, Germany.
Proteins. 1994 Jul;19(3):256-68. doi: 10.1002/prot.340190309.
General patterns of protein structural organization have emerged from studies of hundreds of structures elucidated by X-ray crystallography and nuclear magnetic resonance. Structural units are commonly identified by visual inspection of molecular models using qualitative criteria. Here, we propose an algorithm for identification of structural units by objective, quantitative criteria based on atomic interactions. The underlying physical concept is maximal interactions within each unit and minimal interaction between units (domains). In a simple harmonic approximation, interdomain dynamics is determined by the strength of the interface and the distribution of masses. The most likely domain decomposition involves units with the most correlated motion, or largest interdomain fluctuation time. The decomposition of a convoluted 3-D structure is complicated by the possibility that the chain can cross over several times between units. Grouping the residues by solving an eigenvalue problem for the contact matrix reduces the problem to a one-dimensional search for all reasonable trial bisections. Recursive bisection yields a tree of putative folding units. Simple physical criteria are used to identify units that could exist by themselves. The units so defined closely correspond to crystallographers' notion of structural domains. The results are useful for the analysis of folding principles, for modular protein design and for protein engineering.
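The bisection procedure described above can be illustrated with a minimal sketch; it is not the authors' implementation, and the abstract does not give their exact formulas. It assumes unit mass per residue, a binary contact matrix built with a hypothetical 8 Å cutoff, and residue ordering by the second eigenvector (Fiedler vector) of the graph Laplacian of that matrix. It also assumes the textbook two-body harmonic estimate, in which the fluctuation time is proportional to sqrt(reduced mass / interface strength). The names `contact_matrix`, `best_bisection`, `parse_units`, `min_size` and `tau_min` are illustrative only.

```python
import numpy as np


def contact_matrix(coords, cutoff=8.0):
    """Binary residue-residue contact matrix from an (N, 3) coordinate array.

    The cutoff (in the units of coords) is an assumed value.
    """
    coords = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    contacts = (dist < cutoff).astype(float)
    np.fill_diagonal(contacts, 0.0)  # a residue is not in contact with itself
    return contacts


def best_bisection(contacts, min_size=1):
    """Order residues along one eigenvector, then scan all split points.

    Returns (tau, left, right), where left/right are index arrays into
    `contacts` and tau is the assumed harmonic fluctuation-time score.
    """
    n = contacts.shape[0]
    laplacian = np.diag(contacts.sum(axis=1)) - contacts
    _, vectors = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    order = np.argsort(vectors[:, 1])       # sort residues along the Fiedler vector
    best_tau, best_k = -1.0, None
    for k in range(min_size, n - min_size + 1):  # one-dimensional search
        interface = contacts[np.ix_(order[:k], order[k:])].sum()
        reduced_mass = k * (n - k) / n           # unit mass per residue (assumption)
        tau = np.inf if interface == 0 else np.sqrt(reduced_mass / interface)
        if tau > best_tau:
            best_tau, best_k = tau, k
    return best_tau, order[:best_k], order[best_k:]


def parse_units(contacts, residues=None, min_size=4, tau_min=0.5):
    """Recursive bisection; returns the leaves of the bisection tree.

    A unit is kept whole when it is too small to split or when its best
    split scores below tau_min (both thresholds are hypothetical).
    """
    if residues is None:
        residues = np.arange(contacts.shape[0])
    if len(residues) < 2 * min_size:
        return [sorted(residues.tolist())]
    sub = contacts[np.ix_(residues, residues)]
    tau, left, right = best_bisection(sub, min_size)
    if tau < tau_min:
        return [sorted(residues.tolist())]
    return (parse_units(contacts, residues[left], min_size, tau_min)
            + parse_units(contacts, residues[right], min_size, tau_min))
```

Calling `parse_units(contact_matrix(coords))` on an array of C-alpha coordinates returns a list of residue-index lists, one per putative unit. Because residues are grouped by eigenvector value and not by sequence position, a unit may consist of several discontinuous chain segments, which is how the chain can cross between units more than once.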