Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India.
J Biomol Struct Dyn. 2012;29(6):606-22. doi: 10.1080/07391102.2011.672625.
Convergence of the vast sequence space of proteins into a highly restricted fold/conformational space suggests a simple yet unique underlying mechanism of protein folding, which has been the subject of much debate over the last several decades. One of the major challenges in understanding protein folding, and in in silico protein structure prediction, is discriminating non-native structures (decoys) from the native structure. Applications of knowledge-based potentials to this goal have been extensively reported in the literature. Scoring functions based on accessible surface area and amino acid neighbourhood considerations have also been used to discriminate decoys from native structures. In this article, we explore the potential of protein structure network (PSN) parameters to validate native proteins against a large number of decoy structures generated by diverse methods. We are guided by two principles: (a) PSNs capture local properties from a global perspective, and (b) including non-covalent interactions at the all-atom level, side-chain atoms included, in the network construction accommodates sequence-dependent features. Several network parameters, such as the size of the largest cluster, community size, and clustering coefficient, are evaluated and scored on the basis of the rank of the native structure and the Z-scores. The network analysis of decoy structures highlights the importance of global properties in contributing to the uniqueness of native structures. The analysis also shows that the network parameters can be used as metrics to identify native structures and filter out non-native structures/decoys in a large number of data-sets; the approach thus has the potential to be used in the protein 'structure prediction' problem.
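The scoring idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the PSN edges are built or how the Z-score is defined, so the code assumes each structure is already reduced to a list of residue-residue contact pairs, and it takes the Z-score to be the native value minus the decoy mean, divided by the decoy standard deviation. All function names are hypothetical, and community size is left out for brevity.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_adjacency(edges):
    """Undirected adjacency map from (residue_i, residue_j) contact pairs.

    Assumption: the edges were derived elsewhere, e.g. from all-atom
    non-covalent contacts above some interaction cutoff.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-contacts
            adj[u].add(v)
            adj[v].add(u)
    return adj


def largest_cluster_size(adj):
    """Number of nodes in the largest connected component."""
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # iterative depth-first search
            node = stack.pop()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        best = max(best, size)
    return best


def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    coeffs = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        ordered = sorted(nbrs)
        # Count the edges that connect pairs of neighbours of this node.
        links = sum(
            1
            for i, a in enumerate(ordered)
            for b in ordered[i + 1:]
            if b in adj[a]
        )
        coeffs.append(2.0 * links / (k * (k - 1)))
    return mean(coeffs) if coeffs else 0.0


def native_rank_and_z(native_value, decoy_values):
    """Rank of the native (1 = best, higher value assumed better) and Z-score.

    Assumed definition: Z = (native - mean(decoys)) / std(decoys).
    """
    rank = 1 + sum(1 for d in decoy_values if d > native_value)
    sd = pstdev(decoy_values)
    z = (native_value - mean(decoy_values)) / sd if sd > 0 else float("nan")
    return rank, z
```

For example, computing `largest_cluster_size(build_adjacency(edges))` for the native structure and for every decoy, then passing the values to `native_rank_and_z`, gives the rank and Z-score of the native for that one parameter. Whether a higher or lower value should count as "more native-like" depends on the parameter, which is why the direction is stated as an assumption in the sketch.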