Avdeef Alex
in-ADME Research, New York, NY 10128 USA.
ADMET DMPK. 2023 Aug 21;11(3):317-330. doi: 10.5599/admet.1879. eCollection 2023.
Yalkowsky's General Solubility Equation (GSE), with its three fixed constants, is popular and easy to apply, but is not very accurate for polar, zwitterionic, or flexible molecules. This review examines the findings of a series of studies in which we sought a better prediction model by comparing the performance of the GSE with that of Abraham's Solvation Equation (ABSOLV) and the Random Forest regression (RFR) machine-learning (ML) method. Large, well-curated aqueous intrinsic solubility databases are available. However, drugs may be sparsely distributed in chemical space and concentrated in clusters, so even a large database might overlook some regions. Test compounds from under-represented portions of that space may be poorly predicted, as might be the case with the 'loose' set of 32 drugs in the Second Solubility Challenge (2020). There still appears to be a need for better coverage of drug space. Solubility predictions increasingly use calculated input descriptors, which may be an advantage for exploring the properties of molecules yet to be synthesized. The risk may be that such prediction approaches accumulate uncertainty from the calculated inputs. The increasing use of ML/AI methods can lead to accurate predictions, but such predictions may not readily suggest which strategies to pursue in selecting yet-to-be-synthesized compounds. Based on our latest findings, we recommend predictions based on both the 'grouped' ABSOLV(GRP) and the 'Flexible Acceptor' GSE(Φ,B) models with the provided best-fit parameters, where Φ is the Kier molecular flexibility index and B is the Abraham H-bond acceptor strength. For molecules with Φ < 11, the prudent choice is the Consensus Model, the average of ABSOLV(GRP) and GSE(Φ,B). For more flexible molecules, GSE(Φ,B) is recommended.
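The selection rule in the final recommendation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and argument names are hypothetical, the two model predictions are taken as already-computed log S values (the best-fit parameters of ABSOLV(GRP) and GSE(Φ,B) are not given in this abstract and are not reproduced here), and the first function shows the classic three-constant GSE in its commonly cited form, log S0 = 0.5 - 0.01(mp - 25) - log P, rather than the modified model.

```python
def gse_log_s(log_p: float, melting_point_c: float) -> float:
    """Classic GSE in its commonly cited form (log molar intrinsic solubility).

    log S0 = 0.5 - 0.01 * (mp - 25) - log P
    The three fixed constants are 0.5, -0.01 and the implicit -1 on log P.
    """
    # By the usual convention, compounds that are liquid at 25 C
    # carry no melting-point penalty.
    mp_term = max(melting_point_c - 25.0, 0.0)
    return 0.5 - 0.01 * mp_term - log_p


def recommended_log_s(
    log_s_absolv_grp: float,
    log_s_gse_phi_b: float,
    phi: float,
    phi_cutoff: float = 11.0,
) -> float:
    """Pick the prediction suggested in the abstract (hypothetical helper).

    Inputs are precomputed log S values from the ABSOLV(GRP) and GSE(Phi,B)
    models; phi is the Kier molecular flexibility index.
    """
    if phi < phi_cutoff:
        # Consensus Model: plain average of the two predictions.
        return 0.5 * (log_s_absolv_grp + log_s_gse_phi_b)
    # More flexible molecules: rely on GSE(Phi,B) alone.
    return log_s_gse_phi_b
```

For example, with hypothetical predictions of -3.0 from ABSOLV(GRP) and -4.0 from GSE(Φ,B), a molecule with Φ = 5 would be assigned the consensus value, while one with Φ = 12 would be assigned the GSE(Φ,B) value.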