Division of Chemical Engineering, Department of Materials Engineering Science, Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka 560-8531, Japan.
Institute for Molecular Science, Okazaki, Aichi 444-8585, Japan.
J Chem Phys. 2020 Aug 7;153(5):054115. doi: 10.1063/5.0009066.
We propose a cross-entropy minimization method for finding the reaction coordinate from a large number of collective variables in complex molecular systems. This method extends the likelihood maximization approach, which describes the committor function with a sigmoid. By design, the reaction coordinate, expressed as a function of various collective variables, is optimized so that the distribution of committor values p generated from molecular dynamics simulations is described by a sigmoid. We also introduce the L2-norm regularization used in the machine learning field to prevent overfitting when the number of collective variables considered is large. The method is applied to the isomerization of alanine dipeptide in vacuum, where 45 dihedral angles are used as candidate variables. The regularization parameter is determined by cross-validation using training and test datasets. The optimal reaction coordinate is shown to involve important dihedral angles, consistent with previously reported results. Furthermore, the points with p ∼ 0.5 clearly indicate a separatrix distinguishing the reactant and product states on the potential of mean force computed using the extracted dihedral angles.
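The fitting procedure summarized above can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the reaction coordinate is taken to be linear in the collective variables, the sigmoid is written as (1 + tanh r)/2, the penalty is a squared-norm (L2) term with weight `lam`, the optimizer is plain gradient descent, and the data are synthetic rather than taken from alanine dipeptide simulations. All function and variable names are hypothetical.

```python
import numpy as np

def committor_model(alpha, X):
    """Sigmoidal committor q = (1 + tanh r) / 2, with r = alpha[0] + X @ alpha[1:] (assumed linear form)."""
    r = alpha[0] + X @ alpha[1:]
    return 0.5 * (1.0 + np.tanh(r))

def cross_entropy(alpha, X, p):
    """Mean cross-entropy between sampled committor values p and the model q (no penalty term)."""
    q = np.clip(committor_model(alpha, X), 1e-12, 1.0 - 1e-12)
    return -np.mean(p * np.log(q) + (1.0 - p) * np.log(1.0 - q))

def fit(X, p, lam, lr=0.2, n_steps=5000):
    """Minimize cross-entropy + lam * sum(alpha[1:]**2) by gradient descent."""
    alpha = np.zeros(X.shape[1] + 1)
    for _ in range(n_steps):
        # For q = (1 + tanh r)/2, the derivative of the mean cross-entropy
        # with respect to r is 2 (q - p) / n.
        g_r = 2.0 * (committor_model(alpha, X) - p) / len(p)
        grad = np.concatenate(([g_r.sum()], X.T @ g_r + 2.0 * lam * alpha[1:]))
        alpha -= lr * grad
    return alpha

# Synthetic stand-in for committor data: 5 candidate variables, only 2 of which matter.
rng = np.random.default_rng(0)
n_points, n_cv, n_shots = 600, 5, 100
X = rng.normal(size=(n_points, n_cv))
alpha_true = np.array([0.0, 1.5, -1.0, 0.0, 0.0, 0.0])
# Each committor value is estimated from n_shots trial trajectories (assumed).
p = rng.binomial(n_shots, committor_model(alpha_true, X)) / n_shots

# Split into training and test sets and scan the regularization parameter.
X_train, p_train = X[:400], p[:400]
X_test, p_test = X[400:], p[400:]
lams = [0.0, 1e-3, 1e-2, 1e-1, 1.0]
fits = {lam: fit(X_train, p_train, lam) for lam in lams}
test_loss = {lam: cross_entropy(a, X_test, p_test) for lam, a in fits.items()}
best_lam = min(test_loss, key=test_loss.get)

print("selected lam:", best_lam)
print("coefficients:", np.round(fits[best_lam], 3))
```

In this sketch, the regularization weight is chosen as the value giving the lowest cross-entropy on the held-out test set, and the magnitudes of the fitted coefficients indicate which candidate variables contribute to the reaction coordinate. Points where the model gives q close to 0.5 would correspond to the separatrix.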