Roche Pharmaceutical Research & Early Development, Pre-Clinical CMC, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, 4000 Basel, Switzerland.
University of Applied Sciences and Arts Northwestern Switzerland, Institute of Pharma Technology, Hofackerstr. 30, CH-4132 Muttenz, Switzerland.
Mol Pharm. 2020 Jul 6;17(7):2660-2671. doi: 10.1021/acs.molpharmaceut.0c00355. Epub 2020 Jun 19.
There has been much recent interest in machine learning (ML) and molecular quantitative structure-property relationships (QSPR). The present research evaluated modern ML-based methods implemented in commercial software (COSMOquick and Molecular Modeling Pro) against a classical group contribution approach (the Joback and Reid method) for estimating melting points and enthalpies of fusion. A broad data set of marketed compounds was gathered from the literature, together with new data measured by differential scanning calorimetry for drug candidates. The highest prediction accuracy was achieved by QSPR using stochastic gradient boosting. The model deviations were discussed, particularly their implications for thermodynamic solubility modeling, which typically requires estimates of both melting point and enthalpy of fusion. The results suggested that, despite considerable advances in prediction accuracy, limitations remain, especially for complex drug candidates. In such cases, predicted melting properties should be used with caution as input data for thermodynamic solubility modeling. Future research will show how the prediction limits of thermophysical drug properties can be further advanced by even larger data sets, other ML algorithms, or molecular simulations.
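To illustrate why errors in predicted melting properties matter for thermodynamic solubility modeling, the following minimal sketch uses the common simplified ideal-solubility relation, ln x = -(ΔH_fus / R)(1/T - 1/T_m), which neglects heat-capacity terms and activity coefficients. This relation is an assumption chosen for illustration and is not necessarily the model used in the study. The function name and all numerical inputs are hypothetical.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)


def ideal_mole_fraction_solubility(t_kelvin, t_melt_kelvin, dh_fus_j_per_mol):
    """Ideal mole-fraction solubility from the simplified relation
    ln x = -(dH_fus / R) * (1/T - 1/T_m).

    Heat-capacity corrections and non-ideality are ignored (assumption).
    """
    if t_kelvin >= t_melt_kelvin:
        # At or above the melting point the solute is liquid; the ideal
        # solid-liquid relation no longer limits the mole fraction.
        return 1.0
    ln_x = -(dh_fus_j_per_mol / R) * (1.0 / t_kelvin - 1.0 / t_melt_kelvin)
    return math.exp(ln_x)


# Hypothetical inputs, not taken from the study.
T = 298.15          # temperature of interest, K
DH_FUS = 30_000.0   # enthalpy of fusion, J/mol
TM_REFERENCE = 450.0   # assumed "measured" melting point, K
TM_PREDICTED = 480.0   # assumed prediction that is 30 K too high

x_reference = ideal_mole_fraction_solubility(T, TM_REFERENCE, DH_FUS)
x_predicted = ideal_mole_fraction_solubility(T, TM_PREDICTED, DH_FUS)

# Factor by which the over-predicted melting point underestimates solubility.
underestimation_factor = x_reference / x_predicted

print(f"x (reference Tm): {x_reference:.4g}")
print(f"x (predicted Tm): {x_predicted:.4g}")
print(f"underestimation factor: {underestimation_factor:.3g}")
```

Because the melting point and the enthalpy of fusion both enter the exponent, deviations in either predicted property propagate exponentially into the estimated solubility, which is consistent with the caution recommended above for complex drug candidates.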