How accurately can we predict the melting points of drug-like compounds?

Affiliation

Helmholtz-Zentrum München - German Research Centre for Environmental Health (GmbH), Institute of Structural Biology, Munich 85764, Germany.

Publication information

J Chem Inf Model. 2014 Dec 22;54(12):3320-9. doi: 10.1021/ci5005288. Epub 2014 Dec 9.

Abstract

This article contributes a highly accurate model for predicting the melting points (MPs) of medicinal chemistry compounds. The model was developed using the largest published data set, comprising more than 47k compounds. The distributions of MPs in drug-like and drug lead sets showed that >90% of molecules melt within [50, 250] °C. The final model achieved an RMSE of less than 33 °C for molecules in this temperature interval, which is the most important one for medicinal chemistry users. This performance was achieved using a consensus model, which was significantly more accurate than the individual models. We found that compounds with reactive and unstable groups were overrepresented among outlying compounds. These compounds could decompose during storage or measurement, thus introducing experimental errors. While filtering the data by removing outliers generally increased the accuracy of individual models, it did not significantly affect the results of the consensus models. The three distance-to-model measures that we analyzed did not allow us to flag molecules whose MP values fell outside the applicability domain of the model. We believe that this negative result and the public availability of data from this article will encourage future studies to develop better approaches to define the applicability domain of models. The final model, MP data, and identified reactive groups are available online at http://ochem.eu/article/55638.
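The two quantities the abstract relies on, a consensus prediction and an RMSE restricted to a temperature interval, can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: the consensus is taken here as an unweighted mean of the individual models' predictions, the interval filter is applied to the experimental values, and all numbers are invented toy data. The function names `consensus` and `rmse_in_interval` are hypothetical and are not part of the published model.

```python
import math
from statistics import mean


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def consensus(predictions):
    """Per-molecule consensus prediction.

    Assumption: an unweighted mean over the individual models.
    `predictions` is a list of per-model prediction lists.
    """
    return [mean(column) for column in zip(*predictions)]


def rmse_in_interval(y_true, y_pred, low=50.0, high=250.0):
    """RMSE over molecules whose experimental MP lies in [low, high] (deg C)."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if low <= t <= high]
    if not pairs:
        raise ValueError("no molecules fall inside the interval")
    true_vals, pred_vals = zip(*pairs)
    return rmse(true_vals, pred_vals)


# Invented toy data (deg C); the last molecule melts outside [50, 250].
experimental = [120.0, 80.0, 200.0, 300.0]
model_a = [130.0, 70.0, 190.0, 250.0]
model_b = [110.0, 100.0, 220.0, 270.0]

cons = consensus([model_a, model_b])

for name, pred in [("model_a", model_a), ("model_b", model_b), ("consensus", cons)]:
    print(f"{name}: RMSE in [50, 250] = {rmse_in_interval(experimental, pred):.2f}")
```

In this toy case the errors of the two models partly cancel, so the averaged prediction has a lower in-interval RMSE than either model alone. That is the general motivation for consensus modeling and says nothing about the size of the effect reported in the article.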

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f99c/4702524/716cf7096872/ci-2014-005288_0001.jpg
