Department of Chemistry, The Pennsylvania State University, University Park, PA 16802.
Department of Physics, The Pennsylvania State University, University Park, PA 16802.
Proc Natl Acad Sci U S A. 2020 Sep 29;117(39):24061-24068. doi: 10.1073/pnas.2000098117. Epub 2020 Sep 14.
The success of any physical model critically depends upon adopting an appropriate representation for the phenomenon of interest. Unfortunately, it remains generally challenging to identify the essential degrees of freedom or, equivalently, the proper order parameters for describing complex phenomena. Here we develop a statistical physics framework for exploring and quantitatively characterizing the space of order parameters for representing physical systems. Specifically, we examine the space of low-resolution representations that correspond to particle-based coarse-grained (CG) models for a simple microscopic model of protein fluctuations. We employ Monte Carlo (MC) methods to sample this space and determine the density of states for CG representations as a function of their ability to preserve the configurational information, I, and large-scale fluctuations, Q, of the microscopic model. These two metrics are uncorrelated in high-resolution representations but become anticorrelated at lower resolutions. Moreover, our MC simulations suggest an emergent length scale for coarse-graining proteins, as well as a qualitative distinction between good and bad representations of proteins. Finally, we relate our work to recent approaches for clustering graphs and detecting communities in networks.
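As a rough illustration of the Monte Carlo idea described above, the following minimal sketch samples mappings that assign the atoms of a toy linear chain to a fixed number of CG sites and histograms a score for each sampled mapping. Everything here is an assumption made for illustration: the function names `intra_site_bonds` and `sample_mappings`, the chain model, and in particular the score, which is a simple stand-in (the number of chain bonds kept inside a single site) and not the I or Q metrics of the paper. With `beta = 0` every valid move is accepted, so the histogram is proportional to the density of states of this toy score over labeled mappings.

```python
import math
import random
from collections import Counter


def intra_site_bonds(assignment):
    """Toy score: count chain bonds (i, i+1) whose two atoms share a CG site."""
    return sum(1 for a, b in zip(assignment, assignment[1:]) if a == b)


def sample_mappings(n_atoms=12, n_sites=3, n_steps=20000, beta=0.0, seed=0):
    """Metropolis sampling over assignments of atoms to non-empty CG sites.

    Returns a Counter mapping each observed score to its visit count.
    beta = 0 samples valid mappings uniformly; beta > 0 favors high scores.
    """
    rng = random.Random(seed)
    # Start from a valid mapping in which every site holds at least one atom.
    assignment = [i % n_sites for i in range(n_atoms)]
    score = intra_site_bonds(assignment)
    histogram = Counter()

    for _ in range(n_steps):
        # Symmetric proposal: move one randomly chosen atom to a random site.
        atom = rng.randrange(n_atoms)
        old_site = assignment[atom]
        assignment[atom] = rng.randrange(n_sites)

        if old_site not in assignment:
            # The move would empty a site, so reject it.
            assignment[atom] = old_site
        else:
            new_score = intra_site_bonds(assignment)
            # Metropolis rule: accept with probability min(1, exp(beta * delta)).
            accept_prob = math.exp(min(0.0, beta * (new_score - score)))
            if rng.random() < accept_prob:
                score = new_score
            else:
                assignment[atom] = old_site

        # Record the current state, whether or not the move was accepted.
        histogram[score] += 1

    return histogram


if __name__ == "__main__":
    hist = sample_mappings()
    total = sum(hist.values())
    for s in sorted(hist):
        print(f"score={s:2d}  fraction={hist[s] / total:.4f}")
```

Raising `beta` biases the walk toward mappings that keep bonded atoms together, which is one simple way to probe the high-score tail that uniform sampling rarely visits. A two-dimensional density of states, as in the abstract, would follow the same pattern with a pair of scores as the histogram key.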