Extended solvent-contact model approach to blind SAMPL5 prediction challenge for the distribution coefficients of drug-like molecules.

Author information

Chung Kee-Choo, Park Hwangseo

Affiliation

Department of Bioscience and Biotechnology, Sejong University, 209 Neungdong-ro, Kwangjin-gu, Seoul, 143-747, Republic of Korea.

Publication information

J Comput Aided Mol Des. 2016 Nov;30(11):1019-1033. doi: 10.1007/s10822-016-9928-x. Epub 2016 Jul 23.

Abstract

The performance of the extended solvent-contact model has been addressed in the SAMPL5 blind prediction challenge for the distribution coefficients (LogD) of drug-like molecules with respect to the cyclohexane/water partitioning system. All the atomic parameters defined for 41 atom types in the solvation free energy function were optimized by operating a standard genetic algorithm with respect to water and cyclohexane solvents. In the parameterizations for cyclohexane, the experimental solvation free energy (ΔG) data of 15 molecules for 1-octanol were combined with those of 77 molecules for cyclohexane to construct a training set because ΔG values of the former were unavailable for cyclohexane in publicly accessible databases. Using this hybrid training set, we established the LogD prediction model with the correlation coefficient (R), average error (AE), and root mean square error (RMSE) of 0.55, 1.53, and 3.03, respectively, for the comparison of experimental and computational results for 53 SAMPL5 molecules. The modest accuracy in LogD prediction could be attributed to the incomplete optimization of atomic solvation parameters for cyclohexane. With respect to 31 SAMPL5 molecules containing the atom types for which experimental reference data for ΔG were available for both water and cyclohexane, the accuracy in LogD prediction increased remarkably with the R, AE, and RMSE values of 0.82, 0.89, and 1.60, respectively. This significant enhancement in performance stemmed from the better optimization of atomic solvation parameters achieved by limiting the training set to molecules with experimental ΔG data for cyclohexane. Owing to the simplicity of model building and the low computational cost of parameterization, the extended solvent-contact model is anticipated to serve as a valuable computational tool for LogD prediction upon the enrichment of experimental ΔG data for organic solvents.
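The abstract reports R, AE, and RMSE for predicted versus experimental LogD but does not spell out how solvation free energies are turned into a partition estimate. The sketch below is one plausible reading, not the authors' code. It assumes the standard thermodynamic relation log P = (ΔG_water − ΔG_cyclohexane) / (RT ln 10) for a neutral solute with ionization neglected, free energies in kcal/mol, and T = 298.15 K. It also assumes that AE means the mean absolute deviation, which the abstract does not define. All function names are hypothetical.

```python
import math

# Gas constant in kcal/(mol*K); the unit choice is an assumption of this sketch.
R_KCAL = 1.987204e-3


def log_p_from_solvation(dg_water, dg_cyclohexane, temperature=298.15):
    """Estimate cyclohexane/water log P from two solvation free energies.

    Assumes a neutral solute (no ionization correction), so this is log P
    rather than a pH-dependent log D. A more negative free energy in
    cyclohexane than in water yields a positive value.
    """
    return (dg_water - dg_cyclohexane) / (R_KCAL * temperature * math.log(10))


def pearson_r(predicted, experimental):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    return cov / math.sqrt(var_p * var_e)


def average_error(predicted, experimental):
    """Mean absolute deviation (assumed meaning of AE in this sketch)."""
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


def rmse(predicted, experimental):
    """Root mean square error between predictions and experiment."""
    squared = sum((p - e) ** 2 for p, e in zip(predicted, experimental))
    return math.sqrt(squared / len(predicted))
```

Under these assumptions, a solute with ΔG of −2 kcal/mol in water and −4 kcal/mol in cyclohexane gets a positive log P, meaning it favors the cyclohexane phase. The three statistics can then be evaluated over any list of predicted and measured values.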
