Department of Mathematics, Michigan State University, Michigan, 48824.
School of Statistics and Mathematics, Central University of Finance and Economics, Beijing, 100081, China.
J Comput Chem. 2018 Feb 5;39(4):217-233. doi: 10.1002/jcc.25107. Epub 2017 Nov 11.
Implicit solvent models divide solvation free energies into additive polar and nonpolar contributions, even though polar and nonpolar interactions are inseparable and nonadditive. We present a feature functional theory (FFT) framework to break this ad hoc division. The essential ideas of FFT are as follows: (i) representability assumption: there exists a microscopic feature vector that can uniquely characterize a molecule and distinguish it from any other; (ii) feature-function relationship assumption: the macroscopic features of a molecule, including solvation free energy, are functionals of microscopic feature vectors; and (iii) similarity assumption: molecules with similar microscopic features have similar macroscopic properties, such as solvation free energies. Based on these assumptions, solvation free energy prediction is carried out with the following protocol. First, we use quantum mechanics and Poisson-Boltzmann theory to construct a molecular microscopic feature vector that efficiently characterizes the solvation process. Microscopic feature vectors are combined with macroscopic features, that is, physical observables, to form extended feature vectors. Second, we partition a solvation dataset into queries according to molecular composition. Third, for each target molecule, we use a machine learning algorithm to search for its nearest neighbors, based on the selected microscopic feature vectors. Finally, from the extended feature vectors of the nearest neighbors obtained, we construct a functional of solvation free energy, which is employed to predict the solvation free energy of the target molecule. The proposed FFT model has been extensively validated on a large dataset of 668 molecules. The leave-one-out test gives an optimal root-mean-square error (RMSE) of 1.05 kcal/mol. FFT predictions for the SAMPL0, SAMPL1, SAMPL2, SAMPL3, and SAMPL4 challenge sets yield RMSEs of 0.61, 1.86, 1.64, 0.86, and 1.14 kcal/mol, respectively. Using a test set of 94 molecules and its associated training set, the present approach was carefully compared with a classic solvation model based on weighted solvent accessible surface area. © 2017 Wiley Periodicals, Inc.
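The nearest-neighbor step and the leave-one-out test described above can be illustrated with a minimal sketch. This is not the FFT model itself: the abstract does not specify the learning algorithm or the form of the functional, so the sketch assumes a plain Euclidean nearest-neighbor search and an inverse-distance-weighted average in place of the solvation free energy functional. The function names (`knn_predict`, `leave_one_out_rmse`), the two-component feature vectors, and the energy values are all hypothetical toy data, not taken from the paper.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(target, features, energies, k=3):
    """Predict a property of `target` from its k nearest neighbors.

    Assumption: an inverse-distance-weighted mean stands in for the
    functional that the paper builds from extended feature vectors.
    """
    ranked = sorted(range(len(features)),
                    key=lambda i: euclidean(target, features[i]))
    nearest = ranked[:k]
    # A small epsilon keeps the weight finite when a distance is zero.
    weights = [1.0 / (euclidean(target, features[i]) + 1e-9) for i in nearest]
    weighted_sum = sum(w * energies[i] for w, i in zip(weights, nearest))
    return weighted_sum / sum(weights)


def leave_one_out_rmse(features, energies, k=3):
    """Hold out each molecule in turn, predict it from the rest, return RMSE."""
    squared_errors = []
    for i in range(len(features)):
        rest_features = features[:i] + features[i + 1:]
        rest_energies = energies[:i] + energies[i + 1:]
        predicted = knn_predict(features[i], rest_features, rest_energies, k)
        squared_errors.append((predicted - energies[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy data: two clusters of "molecules" with made-up
# microscopic features and made-up solvation free energies (kcal/mol).
features = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
            [1.0, 1.0], [1.1, 1.0], [1.0, 1.1]]
energies = [-1.0, -1.1, -0.9, -5.0, -5.2, -4.8]

rmse = leave_one_out_rmse(features, energies, k=2)
print(f"leave-one-out RMSE on toy data: {rmse:.3f} kcal/mol")
```

Because molecules with similar features are given similar energies in the toy data, the similarity assumption holds by construction and the leave-one-out error stays small; on real data the error depends on how well the chosen features capture the solvation process.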