UFZ Department of Ecological Chemistry, Helmholtz Centre for Environmental Research, Leipzig, Germany.
J Chem Inf Model. 2011 Sep 26;51(9):2336-44. doi: 10.1021/ci200233s. Epub 2011 Aug 11.
A quantum chemical method has been developed to estimate the dissociation constant pKa of organic acids from their neutral molecular structures by employing electronic structure properties. The data set covers 219 phenols (including 29 phenols with intramolecular H-bonding), 150 aromatic carboxylic acids, 190 aliphatic carboxylic acids, and 138 alcohols, with pKa values spanning about 16 units (0.38-16.80). Ground-state geometries optimized with the semiempirical AM1 Hamiltonian have been used to quantify the site-specific molecular readiness to donate or accept electron charge in terms of both charge-associated energies and energy-associated charges, augmented by an ortho substitution indicator for aromatic compounds. The resulting regression models yield squared correlation coefficients (r²) from 0.82 to 0.90 and root-mean-square errors (rms) from 0.39 to 0.70 pKa units, corresponding to an overall (subset-weighted) r² of 0.86. Simulated external validation, leave-10%-out cross-validation, and target value scrambling demonstrate the statistical robustness and predictive power of the derived model suite. The prediction errors show low intercorrelation with those of the commercial ACD package, which provides an opportunity for a consensus model approach, a pragmatic way to significantly increase confidence in the predictions. Interestingly, including calculated free energies of aqueous solvation does not improve the prediction performance, probably because of the limited precision of the available continuum-solvation models.
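The value of low error intercorrelation for a consensus model can be illustrated with a short calculation. The following is a minimal sketch, not the authors' procedure: it assumes the consensus is an unweighted mean of two predictions and that both models' errors have zero mean, so each rms equals the error standard deviation. The function name `consensus_rms` and the numeric inputs are hypothetical and are not taken from the paper.

```python
import math


def consensus_rms(rms_a: float, rms_b: float, rho: float) -> float:
    """RMS error of the unweighted mean of two predictions.

    Assumption: both error distributions have zero mean, so each RMS
    equals the error standard deviation; rho is the Pearson correlation
    between the two models' prediction errors.
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    # Var((e_a + e_b) / 2) = (s_a^2 + s_b^2 + 2 * rho * s_a * s_b) / 4
    variance = (rms_a**2 + rms_b**2 + 2.0 * rho * rms_a * rms_b) / 4.0
    return math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical inputs: two models with equal rms of 0.5 pKa units.
    for rho in (0.0, 0.5, 1.0):
        print(f"rho={rho:.1f}  consensus rms={consensus_rms(0.5, 0.5, rho):.3f}")
```

Under these assumptions, fully correlated errors (rho = 1) leave the rms unchanged, while lower correlation reduces it, which is why weakly correlated errors make averaging worthwhile.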