Laboratoire Interdisciplinaire Carnot de Bourgogne, UMR CNRS 6303, Université de Bourgogne, 21078 Dijon CEDEX, France.
Baker Laboratory of Chemistry and Chemical Biology, Cornell University, Ithaca, NY 14853, USA.
Molecules. 2023 Sep 16;28(18):6659. doi: 10.3390/molecules28186659.
The folded structures of proteins can be accurately predicted by deep learning algorithms from their amino-acid sequences. By contrast, despite decades of research, predicting folding pathways and the unfolded and misfolded states of proteins, which are closely linked to disease, remains challenging. A two-state (folded/unfolded) description of protein folding dynamics hides the complexity of the unfolded and misfolded microstates. Here, we focus on the development of simplified order parameters to decipher the complexity of disordered protein structures. First, we show that any connected, undirected, and simple graph can be associated with a linear chain of atoms in thermal equilibrium. This analogy provides an interpretation of the usual topological descriptors of a graph, namely the Kirchhoff index and Randić resistance, in terms of effective force constants of a linear chain. We derive an exact relation between the Kirchhoff index and the average shortest path length for a linear graph and define the free energies of a graph using an Einstein model. Second, we represent three-dimensional protein structures as connected, undirected, and simple graphs. As a proof of concept, we compute the topological descriptors and the graph free energies for an all-atom molecular dynamics trajectory of folding/unfolding events of the proteins Trp-cage and HP-36 and for the ensemble of experimental NMR models of Trp-cage. The present work shows that the local, nonlocal, and global force constants and free energies of a graph are promising tools to quantify unfolded/disordered protein states and folding/unfolding dynamics. In particular, they allow the detection of transient misfolded rigid states.
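As an illustration of the kind of relation mentioned above for a linear graph, the following minimal sketch computes the Kirchhoff index of a path graph from its Laplacian spectrum and compares it with the average shortest path length. It relies on two standard graph-theory facts rather than on the paper's own derivation: the Kirchhoff index of a connected graph with N nodes equals N times the sum of the inverse nonzero Laplacian eigenvalues, and on a linear graph the resistance distance coincides with the shortest path length, so the Kirchhoff index equals N(N-1)/2 times the average shortest path length. The function names are hypothetical, and the exact form of the relation derived in the paper may be written differently.

```python
import numpy as np

def path_laplacian(n):
    """Laplacian L = D - A of the linear (path) graph on n nodes."""
    adj = np.zeros((n, n))
    idx = np.arange(n - 1)
    adj[idx, idx + 1] = 1.0  # edge between node i and node i+1
    adj[idx + 1, idx] = 1.0  # undirected graph: symmetric adjacency
    return np.diag(adj.sum(axis=1)) - adj

def kirchhoff_index(lap):
    """N * sum(1/lambda) over the nonzero Laplacian eigenvalues.

    Assumes a connected graph, so exactly one eigenvalue is zero.
    """
    eig = np.linalg.eigvalsh(lap)  # ascending order; eig[0] is ~0
    return lap.shape[0] * np.sum(1.0 / eig[1:])

def mean_shortest_path_linear(n):
    """Average of |i - j| over all node pairs of the path graph."""
    i, j = np.triu_indices(n, k=1)
    return np.mean(j - i)

for n in (5, 10, 20):
    kf = kirchhoff_index(path_laplacian(n))
    rhs = 0.5 * n * (n - 1) * mean_shortest_path_linear(n)
    print(n, kf, rhs)  # the two values should agree up to round-off
```

For a protein, the same `kirchhoff_index` function could in principle be applied to the Laplacian of a contact graph built from atomic coordinates, although how the paper constructs that graph is not specified here.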