Department of Pharmaceutical Science, University of California, Irvine, California 92697, United States.
Department of Chemistry, University of California, Irvine, California 92697, United States.
J Chem Theory Comput. 2018 Nov 13;14(11):6076-6092. doi: 10.1021/acs.jctc.8b00640. Epub 2018 Oct 30.
Traditional approaches to specifying a molecular mechanics force field encode all the information needed to assign force field parameters to a given molecule into a discrete set of atom types. This is equivalent to a representation consisting of a molecular graph comprising a set of vertices, which represent atoms labeled by atom type, and unlabeled edges, which represent chemical bonds. Bond stretch, angle bend, and dihedral parameters are then assigned by looking up bonded pairs, triplets, and quartets of atom types in parameter tables to assign valence terms and using the atom types themselves to assign nonbonded parameters. This approach, which we call indirect chemical perception because it operates on the intermediate graph of atom-typed nodes, creates a number of technical problems. For example, atom types must be sufficiently complex to encode all necessary information about the molecular environment, making it difficult to extend force fields encoded this way. Atom typing also results in a proliferation of redundant parameters applied to chemically equivalent classes of valence terms, needlessly increasing force field complexity. Here, we describe a new approach to assigning force field parameters via direct chemical perception. Rather than working through the intermediary of the atom-typed graph, direct chemical perception operates directly on the unmodified chemical graph of the molecule to assign parameters. In particular, parameters are assigned to each type of force field term (e.g., bond stretch, angle bend, torsion, and Lennard-Jones) based on standard chemical substructure queries implemented via the industry-standard SMARTS chemical perception language, using SMIRKS extensions that permit labeling of specific atoms within a chemical pattern. We use this to implement a new force field format, called the SMIRKS Native Open Force Field (SMIRNOFF) format. 
We demonstrate the power and generality of this approach using examples of specific molecules that pose problems for indirect chemical perception and construct and validate a minimalist yet very general force field, SMIRNOFF99Frosst. We find that a parameter definition file only ∼300 lines long provides coverage of all but <0.02% of a 5 million molecule drug-like test set. Despite its simplicity, the accuracy of SMIRNOFF99Frosst for small molecule hydration free energies and selected properties of pure organic liquids is similar to that of the General Amber Force Field, whose specification requires thousands of parameters. This force field provides a starting point for further optimization and refitting work to follow.
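As a rough illustration of direct chemical perception, the sketch below assigns bond parameter labels by matching rules against the chemical graph itself, with no atom types in between. It is a toy under stated assumptions and not the SMIRNOFF implementation: the `BondRule` class, the rule labels, and the acetaldehyde example are hypothetical; the SMIRKS-like strings are shown for readability only and are not parsed; and the convention that the last matching rule wins (rules ordered from general to specific) is assumed here for illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class BondRule:
    """A hypothetical bond rule that inspects the chemical graph directly."""
    label: str                           # made-up parameter identifier
    smirks_like: str                     # display only; this toy does not parse it
    elements: Optional[FrozenSet[str]]   # required element pair, None matches any
    order: Optional[int]                 # required bond order, None matches any

    def matches(self, elem_a: str, elem_b: str, order: int) -> bool:
        if self.elements is not None and frozenset((elem_a, elem_b)) != self.elements:
            return False
        return self.order is None or self.order == order


# Rules run from general to specific; the last match wins (an assumption of this sketch).
RULES: List[BondRule] = [
    BondRule("b_generic", "[*:1]~[*:2]", None, None),
    BondRule("b_C_C", "[#6:1]-[#6:2]", frozenset({"C"}), 1),
    BondRule("b_C_H", "[#6:1]-[#1:2]", frozenset({"C", "H"}), 1),
    BondRule("b_C_O_double", "[#6:1]=[#8:2]", frozenset({"C", "O"}), 2),
]


def assign_bond_labels(
    elements: List[str], bonds: List[Tuple[int, int, int]]
) -> Dict[Tuple[int, int], Optional[str]]:
    """Return {(atom_a, atom_b): label} using the last matching rule per bond."""
    assigned: Dict[Tuple[int, int], Optional[str]] = {}
    for a, b, order in bonds:
        label = None
        for rule in RULES:
            if rule.matches(elements[a], elements[b], order):
                label = rule.label  # later, more specific rules override earlier ones
        assigned[(a, b)] = label
    return assigned


# Acetaldehyde, CH3-CH=O, as a plain chemical graph: elements plus (a, b, order) bonds.
elements = ["C", "C", "O", "H", "H", "H", "H"]
bonds = [(0, 1, 1), (1, 2, 2), (0, 3, 1), (0, 4, 1), (0, 5, 1), (1, 6, 1)]

labels = assign_bond_labels(elements, bonds)
for bond, label in labels.items():
    print(bond, label)
```

Because the generic rule comes first, a bond with no specific rule still receives a label. Covering a new chemistry then means appending one more specific rule rather than defining new atom types and the parameter-table entries that depend on them.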