Oliveira Felipe L, Luan Binquan, Esteves Pierre M, Steiner Mathias, Neumann Barros Ferreira Rodrigo
IBM Research, Av. República do Chile, 330, Rio de Janeiro, Rio de Janeiro CEP 20031-170, Brazil.
Instituto de Química, Universidade Federal do Rio de Janeiro, Av. Athos da Silveira Ramos, 149, CT A-622, Cid. Univ., Rio de Janeiro, Rio de Janeiro 21941-909, Brazil.
J Chem Theory Comput. 2024 Oct 8;20(19):8559-8568. doi: 10.1021/acs.jctc.4c00417. Epub 2024 Sep 18.
Automated molecular simulations are used extensively for predicting material properties. Typically, these simulations exhibit two regimes: a dynamic equilibration phase, followed by a steady state. To extract observable properties, a simulation must first reach the steady state so that thermodynamic averages can be taken. However, because equilibration depends on the simulation conditions, the optimal number of simulation steps cannot be predicted in advance. Here, we demonstrate the application of the Marginal Standard Error Rule (MSER) for automatically identifying the optimal truncation point in Grand Canonical Monte Carlo (GCMC) simulations. This novel automatic procedure determines the point at which a steady state is reached, ensuring that figures of merit are extracted in an objective, accurate, and reproducible fashion. In the case of GCMC simulations of gas adsorption in metal-organic frameworks, we find that this methodology reduces the computational cost by up to 90%. As MSER statistics are independent of the simulation method that creates the data, the method is, in principle, applicable to any time series analysis in which equilibration truncation is required. The open-source Python implementation of our method, pyMSER, is publicly available for reuse and validation at https://github.com/IBM/pymser.
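To make the truncation rule concrete, the following is a minimal NumPy sketch of the MSER statistic as it is commonly defined: for each candidate truncation point d, the sum of squared deviations of the retained samples from their own mean is divided by the square of the number of retained samples, and the d that minimizes this value is selected. This is not the pyMSER API. The function name `mser_truncation`, the parameter `max_fraction`, and the choice to restrict the search to the first half of the series (a common convention to avoid the noisy tail of the statistic) are assumptions made for illustration only.

```python
import numpy as np


def mser_truncation(series, max_fraction=0.5):
    """Illustrative MSER sketch (hypothetical helper, not the pyMSER API).

    Returns (d, truncated_mean, mser_curve), where d is the index of the
    first sample kept after discarding the equilibration phase.
    """
    x = np.asarray(series, dtype=float)
    n = x.size
    if n < 4:
        raise ValueError("series is too short for a meaningful MSER estimate")

    # Centre the data first to limit cancellation in the sum-of-squares trick.
    centred = x - x.mean()

    # Reversed cumulative sums give, for every d, the sum over samples d..n-1.
    tail_sum = np.cumsum(centred[::-1])[::-1]
    tail_sq_sum = np.cumsum((centred ** 2)[::-1])[::-1]
    kept = n - np.arange(n)  # number of samples retained for each d

    # Sum of squared deviations from the tail mean: sum(x^2) - m * mean^2.
    tail_mean = tail_sum / kept
    ssd = np.clip(tail_sq_sum - kept * tail_mean ** 2, 0.0, None)

    # MSER(d) = ssd(d) / (n - d)^2
    mser_curve = ssd / kept ** 2

    # Assumption: only search the first `max_fraction` of the series.
    d_max = max(1, int(n * max_fraction))
    d = int(np.argmin(mser_curve[:d_max]))
    return d, float(x[d:].mean()), mser_curve
```

Applied to a series such as the number of adsorbed molecules per GCMC cycle, the returned index marks where averaging would start, and the truncated mean is the steady-state estimate. The sketch works on raw samples; batching the series before computing the statistic is a common variant that is not shown here.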