Evidence That Less Can Be More for Transferable Force Fields.

Author information

Davidson School of Chemical Engineering, Purdue University, West Lafayette, Indiana 47906, United States.

Publication information

J Chem Inf Model. 2023 Feb 27;63(4):1188-1195. doi: 10.1021/acs.jcim.2c01163. Epub 2023 Feb 6.

Abstract

Graph-based parameter assignment has been the basis for developing transferable force fields for molecular dynamics simulations for decades. Nevertheless, transferable force fields vary in how specifically terms are defined with respect to the molecular graph and the procedures for generating parametrization data. More-specific force-field terms increase the complexity of the force field, theoretically increasing accuracy but also increasing training data requirements. In contrast, less-specific force fields can be reused across larger regions of chemical space, theoretically reducing accuracy but also reducing the number of parameters and training data requirements. Here, the tradeoffs between force-field specificity and accuracy are quantified by parametrizing three new sets of force fields with varying levels of graph specificity, using a shared procedure for generating training data. These force fields are benchmarked for their ability to reproduce the structural features and liquid properties of 87 organic molecules at 146 distinct state points. The overall accuracy for properties that were directly trained on rapidly saturates as the graph specificity of the force field increases. From this, we conclude there is at best a marginal benefit of using less transferable and more complex force fields with common sources of quantum-chemically derived training data. When looking at properties unseen during training, there is some evidence that the more-complex force fields even perform slightly worse. These results are rationalized by the fortuitous regularization of force fields based on less-specific and more-transferable atom types. Both the saturation in the accuracy of training properties and the marginally worse performance on off-target properties fundamentally contradict the expectation that bespoke force fields are generally more accurate, given their larger number of parameters, and suggest that increasing force-field complexity should be carefully justified against performance gains and balanced against available training data.

