Krishnan Arun, Giuliani Alessandro, Zbilut Joseph P, Tomita Masaru
Institute for Advanced Biosciences, Keio University, Tsuruoka, Japan.
J Proteome Res. 2007 Oct;6(10):3924-34. doi: 10.1021/pr070162v. Epub 2007 Sep 12.
The structural architecture of proteins continues to be an area of active research. Although models differ on how proteins fold into their tertiary structures, it is recognized that small regions of proteins tend to fold independently and are then stabilized by interactions between these distinct subunits. However, there are a number of different definitions of what constitutes an independent subunit. In the belief that an unequivocal definition of a domain must be based on the most fundamental property of protein 3D structure, namely, the adjacency matrix of inter-residue contacts, we adopt a network representation of the protein. In this work, we used a well-established, global method for identifying modules in networks, one that makes no specific reference to the kind of network being analyzed. The algorithm converges toward the maximization of the modularity of the given protein network and, in doing so, allows the residues of the protein to be represented in terms of their intramodule degree, z, and participation coefficient, P. We demonstrate that labeling residues in terms of these invariants yields information-rich representations of the studied proteins and sketches a new way to link the sequence, structure, and dynamical properties of proteins. We discovered a strong invariant character of protein molecules in terms of their P/z characterization, pointing to a common topological design of all protein structures. This invariant representation, applied to different protein systems, enabled us to identify the possible functional role of high P/z residues during the folding process. Additionally, we observe a hierarchical behavior of protein structural organization that provides a link between sequence, secondary structure, and tertiary structure. The discovery of similar and repeatable scaling laws at different levels of definition, from hydrophobicity patterning along the sequence, through the size of an autonomous folding unit (AFU), up to the general contact distribution of the entire molecule, suggests a hierarchical behavior of protein architecture. This implies that different privileged scales of observation can be selected for deriving useful information on protein systems.
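To make the quantities named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the definitions commonly used in the network-modularity literature, since the abstract does not spell out the formulas: modularity Q = (1/2m) Σ_ij [A_ij − k_i k_j / 2m] δ(s_i, s_j), intramodule degree z_i = (κ_i − mean(κ)) / sd(κ) taken over the residues of the same module, and participation coefficient P_i = 1 − Σ_s (k_is / k_i)². The function names (`contact_adjacency`, `modularity`, `z_and_p`) and the 8 Å contact cutoff are illustrative assumptions. The module search itself (maximizing Q) is not implemented here; the module labels are taken as given.

```python
import numpy as np

def contact_adjacency(coords, cutoff=8.0):
    """Binary residue contact matrix from an (N, 3) coordinate array.

    The cutoff (in the same units as coords) is an assumed value,
    not one stated in the abstract.
    """
    coords = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    adj = (dist <= cutoff).astype(int)
    np.fill_diagonal(adj, 0)  # a residue is not in contact with itself
    return adj

def modularity(adj, modules):
    """Modularity Q of a given partition of an undirected network."""
    adj = np.asarray(adj, dtype=float)
    modules = np.asarray(modules)
    k = adj.sum(axis=1)                       # degree of each node
    two_m = k.sum()                           # twice the number of edges
    same = modules[:, None] == modules[None, :]
    return float(((adj - np.outer(k, k) / two_m) * same).sum() / two_m)

def z_and_p(adj, modules):
    """Intramodule degree z-score and participation coefficient P per node."""
    adj = np.asarray(adj, dtype=float)
    modules = np.asarray(modules)
    k = adj.sum(axis=1)
    labels = np.unique(modules)               # sorted unique module labels
    # k_to[i, s]: number of links from node i to nodes in module labels[s]
    k_to = np.stack([adj[:, modules == s].sum(axis=1) for s in labels], axis=1)
    own = np.searchsorted(labels, modules)    # column index of each node's module
    k_in = k_to[np.arange(len(k)), own]       # links inside the node's own module

    z = np.zeros(len(k))
    for s in labels:
        members = modules == s
        sd = k_in[members].std()
        if sd > 0:                            # z is left at 0 when all members tie
            z[members] = (k_in[members] - k_in[members].mean()) / sd

    safe_k = np.where(k > 0, k, 1.0)          # avoid division by zero
    p = 1.0 - ((k_to / safe_k[:, None]) ** 2).sum(axis=1)
    p = np.where(k > 0, p, 0.0)               # isolated nodes get P = 0
    return z, p

# Toy network: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
toy = np.zeros((6, 6), dtype=int)
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    toy[i, j] = toy[j, i] = 1
toy_modules = np.array([0, 0, 0, 1, 1, 1])
toy_q = modularity(toy, toy_modules)
toy_z, toy_p = z_and_p(toy, toy_modules)
```

In this toy case only the two bridging nodes (2 and 3) have a nonzero participation coefficient, because they are the only nodes with links outside their own module. In the sense used in the abstract, a high-P residue is one whose contacts are spread across several modules, while a high-z residue is unusually well connected within its own module.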