Department of Computer Science, Rice University, Houston, Texas 77005, USA.
Proteins. 2010 Feb 1;78(2):223-35. doi: 10.1002/prot.22526.
A recurring problem in computational molecular biophysics is the automatic classification of the wealth of molecular configurations gathered in simulation in terms of a few coordinates that help to explain the main states and transitions of the system. We use the recently proposed ScIMAP algorithm to automatically extract motion parameters from simulation data. The procedure uses only molecular shape similarity and topology information inferred directly from the simulated conformations, and is not biased by a priori known information. The automatically recovered coordinates prove to be excellent reaction coordinates for the molecules studied and can be used to identify stable states and transitions, and as a basis for building free-energy surfaces. They describe the free-energy landscape better than coordinates computed using principal component analysis, the most popular linear dimensionality reduction technique. The method is first validated on the dynamics of an all-atom model of alanine dipeptide, where it successfully recovers all previously known metastable states. When applied to the simulated folding of a coarse-grained model of a beta-hairpin, it reveals, in addition to the folded and unfolded states, two symmetric misfolding crossings of the hairpin strands, together with the most likely transitions from one to the other.
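As an illustration of how a low-dimensional coordinate can serve as a basis for a free-energy surface, the sketch below applies the standard Boltzmann inversion, F(x) = -kT ln(P(x)/P_max), to a histogram of samples along one coordinate. It is not the authors' implementation: the function name `free_energy_profile`, the fixed `bin_width`, and the reduced units (`kT = 1.0`) are assumptions made for this example, and the input is assumed to be the value of an already extracted coordinate for each simulation frame.

```python
import math
from collections import Counter


def free_energy_profile(samples, bin_width, kT=1.0):
    """Estimate F(x) = -kT * ln(P(x) / P_max) along one coordinate.

    samples   : iterable of coordinate values, one per simulation frame
                (hypothetical input; any 1D reduced coordinate will do)
    bin_width : histogram bin width, in the units of the coordinate
    kT        : thermal energy; 1.0 means F is reported in units of kT

    Returns a dict mapping bin centers to free energies. The most
    populated bin has F = 0; bins with no samples are omitted.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")

    # Count how many frames fall into each bin.
    counts = Counter(math.floor(x / bin_width) for x in samples)
    if not counts:
        raise ValueError("samples must not be empty")

    max_count = max(counts.values())
    profile = {}
    for bin_index, count in sorted(counts.items()):
        center = (bin_index + 0.5) * bin_width
        # Boltzmann inversion relative to the most populated bin.
        profile[center] = -kT * math.log(count / max_count)
    return profile


# Toy data: four frames near x = 0.2 and two frames near x = 1.1.
profile = free_energy_profile([0.1, 0.2, 0.3, 0.15, 1.1, 1.2], bin_width=1.0)
for center, energy in profile.items():
    print(f"x = {center:.2f}  F = {energy:.3f} kT")
```

In this toy input the less populated bin sits kT ln 2 above the more populated one. Minima of such a profile correspond to candidate stable states, and the same inversion extends to two coordinates by histogramming over a 2D grid.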