Kromann Jimmy C, Larsen Frej, Moustafa Hadeel, Jensen Jan H
Department of Chemistry, University of Copenhagen, Copenhagen, Denmark.
PeerJ. 2016 Aug 11;4:e2335. doi: 10.7717/peerj.2335. eCollection 2016.
The PM6 semiempirical method and the dispersion and hydrogen bond-corrected PM6-D3H+ method are used together with the SMD and COSMO continuum solvation models to predict pKa values of pyridines, alcohols, phenols, benzoic acids, carboxylic acids, and amines using isodesmic reactions, and the results are compared to published ab initio results. The pKa values of the pyridines, alcohols, phenols, and benzoic acids considered in this study can generally be predicted with PM6 and ab initio methods to within the same overall accuracy, with mean absolute differences (MADs) of 0.6-0.7 pH units. For carboxylic acids, the accuracy (0.7-1.0 pH units) is also comparable to ab initio results if a single outlier is removed. For primary, secondary, and tertiary amines the accuracy is, respectively, similar (0.5-0.6), slightly worse (0.5-1.0), and worse (1.0-2.5), provided that di- and tri-ethylamine are used as reference molecules for secondary and tertiary amines. When applied to a drug-like molecule for which an empirical pKa predictor exhibits a large (4.9 pH unit) error, the errors of the PM6-based predictions are roughly the same in magnitude but opposite in sign. As a result, most of the PM6-based methods predict the correct protonation state at physiological pH, while the empirical predictor does not. The computational cost is around 2-5 min per conformer per processor core, making PM6-based pKa prediction efficient enough for high-throughput screening on the order of 100 processor cores.
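The isodesmic approach and the cost estimate above can be illustrated with a minimal Python sketch. It assumes the standard isodesmic relation, in which the acid AH exchanges a proton with the conjugate base of a reference molecule (AH + Ref- <=> A- + RefH) and the pKa follows from the reaction free energy as pKa(AH) = pKa(RefH) + ΔG/(RT ln 10). The function names, the kcal/mol units, and the default temperature of 298.15 K are illustrative assumptions, not taken from the paper; the free energies would come from PM6 or PM6-D3H+ calculations with a continuum solvent model.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def isodesmic_pka(g_acid, g_base, g_ref_acid, g_ref_base, pka_ref,
                  temperature=298.15):
    """Estimate a pKa from the isodesmic reaction AH + Ref- <=> A- + RefH.

    All free energies are solution-phase values in kcal/mol (assumed units).
    pka_ref is the experimental pKa of the reference acid RefH.
    """
    # Reaction free energy: products minus reactants.
    delta_g = (g_base + g_ref_acid) - (g_acid + g_ref_base)
    # Convert the free energy difference to pH units and shift the reference.
    return pka_ref + delta_g / (R_KCAL * temperature * math.log(10))


def conformer_jobs_per_day(n_cores=100, minutes_per_conformer=5.0):
    """Rough throughput: single-conformer calculations completed per day,
    assuming perfect scaling across cores (a simplifying assumption)."""
    return n_cores * 24 * 60 / minutes_per_conformer
```

Under these assumptions, a reaction free energy of zero returns the reference pKa unchanged, and every 1.36 kcal/mol of reaction free energy shifts the prediction by about one pH unit at 298.15 K. With 100 cores and 2-5 min per conformer, the throughput estimate comes to roughly 28,800 to 72,000 single-conformer calculations per day.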