Thürlemann Moritz, Riniker Sereina
Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
Chem Sci. 2023 Oct 31;14(44):12661-12675. doi: 10.1039/d3sc04317g. eCollection 2023 Nov 15.
Electronic structure methods offer, in principle, accurate predictions of molecular properties; however, their applicability is limited by computational cost. Empirical methods are cheaper but come with inherent approximations and depend on the quality and quantity of training data. The rise of machine learning (ML) force fields (FFs) exacerbates the limitations related to training data even further, especially for condensed-phase systems, for which generating large, high-quality training datasets is difficult. Here, we propose a hybrid ML/classical FF model that is parametrized exclusively on high-quality data of dimers and monomers in vacuum but is transferable to condensed-phase systems. The proposed hybrid model combines our previous ML-parametrized classical model with ML corrections for situations where classical approximations break down, thus combining the robustness and efficiency of classical FFs with the flexibility of ML. Extensive validation on benchmarking datasets and experimental condensed-phase data, including organic liquids and small-molecule crystal structures, showcases how the proposed approach may promote FF development and unlock the full potential of classical FFs.