Best of both worlds: combining pharma data and state of the art modeling technology to improve in Silico pKa prediction.

Author information

Fraczkiewicz Robert, Lobell Mario, Göller Andreas H, Krenz Ursula, Schoenneis Rolf, Clark Robert D, Hillisch Alexander

Affiliations

Simulations Plus, Inc. 42505 10th Street West, Lancaster, California 93534, United States.

Publication information

J Chem Inf Model. 2015 Feb 23;55(2):389-97. doi: 10.1021/ci500585w. Epub 2014 Dec 16.

Abstract

In a unique collaboration between a software company and a pharmaceutical company, we were able to develop a new in silico pKa prediction tool with outstanding prediction quality. An existing pKa prediction method from Simulations Plus based on artificial neural network ensembles (ANNE), microstates analysis, and literature data was retrained with a large homogeneous data set of drug-like molecules from Bayer. The new model was thus built with curated sets of ∼14,000 literature pKa values (∼11,000 compounds, representing literature chemical space) and ∼19,500 pKa values experimentally determined at Bayer Pharma (∼16,000 compounds, representing industry chemical space). Model validation was performed with several test sets consisting of a total of ∼31,000 new pKa values measured at Bayer. For the largest and most difficult test set with >16,000 pKa values that were not used for training, the original model achieved a mean absolute error (MAE) of 0.72, root-mean-square error (RMSE) of 0.94, and squared correlation coefficient (R²) of 0.87. The new model achieves significantly improved prediction statistics, with MAE = 0.50, RMSE = 0.67, and R² = 0.93. It is commercially available as part of the Simulations Plus ADMET Predictor release 7.0. Good predictions are only of value when delivered effectively to those who can use them. The new pKa prediction model has been integrated into Pipeline Pilot and the PharmacophorInformatics (PIx) platform used by scientists at Bayer Pharma. Different output formats allow customized application by medicinal chemists, physical chemists, and computational chemists.
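
The three statistics quoted above can be illustrated with a minimal sketch. It assumes the standard definitions of MAE and RMSE and takes R² to be the squared Pearson correlation between measured and predicted pKa values, as the phrase "squared correlation coefficient" suggests. The function names and the sample values are hypothetical and are not taken from the paper or from ADMET Predictor.

```python
import math


def mae(observed, predicted):
    """Mean absolute error between measured and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root-mean-square error between measured and predicted values."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def r_squared(observed, predicted):
    """Squared Pearson correlation coefficient (assumed meaning of R^2 here)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


if __name__ == "__main__":
    # Hypothetical pKa values for illustration only, not data from the study.
    measured = [4.0, 7.0, 9.0]
    predicted = [4.5, 6.5, 9.5]
    print("MAE :", mae(measured, predicted))
    print("RMSE:", rmse(measured, predicted))
    print("R^2 :", r_squared(measured, predicted))
```

Because RMSE squares each residual before averaging, it weights large errors more heavily than MAE does, which is consistent with the reported RMSE values exceeding the corresponding MAE values for both models.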
