Synthetic Force-Field Database for Training Machine Learning Models to Predict Mobility-Preserving Coarse-Grained Molecular-Simulation Potentials.

Author information

Bag Saientan, Meinel Melissa K, Müller-Plathe Florian

Affiliation

Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Peter-Grünberg-Str. 8, 64287 Darmstadt, Germany.

Publication information

J Chem Theory Comput. 2024 Apr 23;20(8):3046-3060. doi: 10.1021/acs.jctc.4c00242. Epub 2024 Apr 9.

Abstract

Balancing accuracy and efficiency is a common problem in molecular simulation. This tradeoff is evident in coarse-grained molecular dynamics simulation, which prioritizes efficiency, and all-atom molecular simulation, which prioritizes accuracy. Despite continuous efforts, creating a coarse-grained model that accurately captures both the system's structure and dynamics remains elusive. In this article, we present a data-driven approach for constructing coarse-grained models that aim to describe both the structure and dynamics of the system equally well. While the development of machine learning models is well-received in the scientific community, the significance of dataset creation for these models is often overlooked. However, data-driven approaches cannot progress without a robust dataset. To address this, we construct a database of synthetic coarse-grained potentials generated from unphysical all-atom models. A neural network is trained with the generated database to predict the coarse-grained potentials of real liquids. We evaluate their quality by calculating the combined loss of structural and dynamical accuracy upon coarse-graining. When we compare our machine learning-based coarse-grained potential with the one from iterative Boltzmann inversion, the machine learning prediction turns out better for all eight hydrocarbon liquids we studied. As all-atom surfaces turn more nonspherical, both ways of coarse-graining degrade. Still, the neural network outperforms iterative Boltzmann inversion in constructing good quality coarse-grained models for such cases. The synthetic database and the developed machine learning models are freely available to the community, and we believe that our approach will generate interest in efficiently deriving accurate coarse-grained models for liquids.
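
For readers unfamiliar with the baseline method named in the abstract, the following is a minimal sketch of one textbook iterative Boltzmann inversion update, V_new(r) = V_old(r) + alpha * k_B * T * ln(g_current(r) / g_target(r)), applied to a tabulated pair potential. It is not the authors' code. The function name `ibi_update`, the damping factor `alpha`, the cutoff `eps`, and the choice of kJ/mol units are assumptions made for illustration only.

```python
import numpy as np

# Molar gas constant in kJ/(mol K); the unit system is an assumption of this sketch.
K_B = 0.0083144626


def ibi_update(v_current, g_current, g_target, temperature, alpha=1.0, eps=1e-12):
    """Apply one iterative Boltzmann inversion step to a tabulated potential.

    v_current   : current coarse-grained pair potential on a radial grid
    g_current   : radial distribution function produced by v_current
    g_target    : reference (for example all-atom) radial distribution function
    temperature : temperature in K
    alpha       : hypothetical damping factor between 0 and 1
    eps         : threshold below which an RDF value is treated as zero
    """
    v_current = np.asarray(v_current, dtype=float)
    g_current = np.asarray(g_current, dtype=float)
    g_target = np.asarray(g_target, dtype=float)

    # The logarithm is undefined where either RDF vanishes (inside the core),
    # so the potential is left unchanged at those grid points.
    mask = (g_current > eps) & (g_target > eps)
    correction = np.zeros_like(v_current)
    correction[mask] = K_B * temperature * np.log(g_current[mask] / g_target[mask])

    return v_current + alpha * correction
```

Where the current model places too many neighbors at a distance (g_current above g_target), the correction is positive and the potential becomes more repulsive there; when the two distributions agree, the potential is left as it is. Because this update targets structure only, it gives no direct control over dynamics, which is the gap the abstract's combined structural and dynamical loss is meant to address.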

