Uniting cheminformatics and chemical theory to predict the intrinsic aqueous solubility of crystalline druglike molecules.

Authors

McDonagh James L, Nath Neetika, De Ferrari Luna, van Mourik Tanja, Mitchell John B O

Affiliations

Biomedical Sciences Research Complex and EaStCHEM, School of Chemistry, Purdie Building, University of St. Andrews, North Haugh, St. Andrews, Scotland, KY16 9ST, United Kingdom.

Publication information

J Chem Inf Model. 2014 Mar 24;54(3):844-56. doi: 10.1021/ci4005805. Epub 2014 Mar 11.

Abstract

We present four models of solution free-energy prediction for druglike molecules utilizing cheminformatics descriptors and theoretically calculated thermodynamic values. We make predictions of solution free energy using physics-based theory alone and using machine learning/quantitative structure-property relationship (QSPR) models. We also develop machine learning models where the theoretical energies and cheminformatics descriptors are used as combined input. These models are used to predict solvation free energy. While direct theoretical calculation does not give accurate results in this approach, machine learning is able to give predictions with a root mean squared error (RMSE) of ~1.1 log S units in a 10-fold cross-validation for our Drug-Like-Solubility-100 (DLS-100) dataset of 100 druglike molecules. We find that a model built using energy terms from our theoretical methodology as descriptors is marginally less predictive than one built on Chemistry Development Kit (CDK) descriptors. Combining both sets of descriptors allows a further but very modest improvement in the predictions. However, in some cases, this is a statistically significant enhancement. These results suggest that there is little complementarity between the chemical information provided by these two sets of descriptors, despite their different sources and methods of calculation. Our machine learning models are also able to predict the well-known Solubility Challenge dataset with an RMSE value of 0.9-1.0 log S units.
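
To make the validation metric concrete, the following is a minimal sketch of how a pooled RMSE over a 10-fold cross-validation can be computed. It is not the authors' pipeline: the data are synthetic, a one-descriptor least-squares line stands in for the machine learning models, and the names `fit_line` and `kfold_rmse` are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x for a single descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def kfold_rmse(xs, ys, k=10, seed=0):
    """Pooled RMSE over k folds: every sample is predicted exactly once,
    by a model that never saw it during fitting."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal, disjoint folds
    squared_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared_errors.extend((ys[i] - (a + b * xs[i])) ** 2 for i in held_out)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Synthetic stand-in for a 100-molecule dataset (NOT the DLS-100 data):
# one made-up descriptor and a made-up "log S" with Gaussian noise.
rng = random.Random(42)
descriptor = [rng.uniform(-3.0, 3.0) for _ in range(100)]
log_s = [-2.0 - 0.8 * x + rng.gauss(0.0, 1.0) for x in descriptor]

cv_rmse = kfold_rmse(descriptor, log_s, k=10)
print(f"10-fold CV RMSE: {cv_rmse:.2f} log S units")
```

Because each molecule is scored only by a model fitted without it, the pooled RMSE estimates error on unseen compounds rather than goodness of fit. Replacing `fit_line` with any other regressor leaves the validation loop unchanged.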

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b95f/3966526/9a626d1d9c63/ci-2013-005805_0009.jpg
