Liu Bojun, Cao Siqin, Boysen Jordan G, Xue Mingyi, Huang Xuhui
Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI, USA.
Nat Comput Sci. 2025 Jul;5(7):562-571. doi: 10.1038/s43588-025-00815-8. Epub 2025 Jun 10.
Identifying collective variables (CVs) that accurately capture the slowest timescales of protein conformational changes is crucial for understanding numerous biological processes. Here we introduce memory kernel minimization-based neural networks (MEMnets), a deep learning framework that accurately identifies the slow CVs of biomolecular dynamics. Unlike popular CV-identification methods, which typically assume Markovian dynamics, MEMnets is built on the integrative generalized master equation theory, which incorporates non-Markovian dynamics by encoding them in a memory kernel for continuous CVs. The key innovation of MEMnets is the identification of optimal CVs by minimizing the upper bound for the time-integrated memory kernels through parallel encoder networks. We demonstrate that MEMnets effectively identifies slow CVs involved in the folding of the FIP35 WW domain, revealing two parallel folding pathways. In addition, we illustrate the robust numerical stability of MEMnets in identifying meaningful CVs in large biomolecular systems with limited sampling by applying it to the clamp opening of bacterial RNA polymerase, a much more complex conformational change.
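For orientation, the memory kernel mentioned in the abstract can be written in the standard generalized master equation form. This is only a sketch in the discrete-state notation commonly used for non-Markovian kinetic models, not the continuous-CV formulation of the paper itself; the symbols \mathbf{T}, \mathbf{K} and \mathbf{M}_0 are assumptions introduced here for illustration.

```latex
% Assumed notation (not taken from the abstract):
%   T(t)  transition probability matrix at lag time t (row-stochastic convention)
%   K(s)  memory kernel
%   M_0   time-integrated memory kernel
\[
\frac{d\mathbf{T}(t)}{dt}
  = \mathbf{T}(t)\,\dot{\mathbf{T}}(0)
  - \int_{0}^{t} \mathbf{T}(t-s)\,\mathbf{K}(s)\,ds,
\qquad
\mathbf{M}_0 = \int_{0}^{\infty} \mathbf{K}(s)\,ds .
\]
```

When the kernel \mathbf{K}(s) vanishes, the integral term drops out and the equation reduces to a Markovian master equation. On this reading, a small time-integrated kernel suggests that the chosen CVs leave little slow dynamics unaccounted for, which is presumably the intuition behind minimizing an upper bound on it.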