

ADME evaluation in drug discovery. 8. The prediction of human intestinal absorption by a support vector machine.

Author information

Hou Tingjun, Wang Junmei, Li Youyong

Affiliation

Department of Chemistry and Biochemistry, Center for Theoretical Biological Physics, University of California at San Diego, La Jolla, CA 92093, USA.

Publication information

J Chem Inf Model. 2007 Nov-Dec;47(6):2408-15. doi: 10.1021/ci7002076. Epub 2007 Oct 12.

Abstract

Human intestinal absorption (HIA) is an important roadblock in the formulation of new drug substances. In silico models that predict the percentage of HIA from calculated molecular descriptors are much needed for the rapid estimation of this property. Here, we have studied the performance of a support vector machine (SVM) in classifying compounds as having high or low fractional absorption (%FA > 30% or %FA ≤ 30%). The analyzed data set consists of 578 structurally diverse druglike molecules, which have been divided into a 480-molecule training set and a 98-molecule test set. Ten SVM classification models have been generated to investigate the impact of different individual molecular properties on %FA. Among the molecular descriptors studied, topological polar surface area (TPSA) and the predicted apparent octanol-water distribution coefficient at pH 6.5 (logD6.5) show better classification performance than the others. To obtain the best SVM classifier, the influences of different kernel functions and different combinations of molecular descriptors were investigated using a rigorous training-validation procedure. The best SVM classifier gives satisfactory predictions for the training set (97.8% correct for the poor-absorption class and 94.5% for the good-absorption class). Moreover, 100% of the poor-absorption class and 97.8% of the good-absorption class in the external test set could be correctly classified. Finally, the influences of the size of the training set and the unbalanced nature of the data set have been studied. The analysis demonstrates that a large data set is necessary for the stability of the classification models. Furthermore, the weights for the poor-absorption class and the good-absorption class should be properly balanced to generate unbiased classification models. Our work illustrates that SVMs used in combination with simple molecular descriptors can provide an extremely reliable assessment of intestinal absorption in an early in silico filtering process.
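
As an illustration of the labeling and class-balancing ideas in the abstract, the following is a minimal sketch in plain Python. It applies the 30% cutoff stated above, computes inverse-frequency class weights, and reports accuracy separately for each class, as the abstract does. The weighting formula is a common heuristic assumed here for illustration, not necessarily the scheme used in the paper, and the %FA values, predictions, and function names are all hypothetical.

```python
from collections import Counter

# %FA cutoff from the abstract: > 30% is "good", <= 30% is "poor".
FA_THRESHOLD = 30.0


def label_absorption(percent_fa):
    """Assign the absorption class for one compound from its %FA value."""
    return "good" if percent_fa > FA_THRESHOLD else "poor"


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: a widely used heuristic for unbalanced data, chosen here
    for illustration; the paper's actual weights are not given in the abstract.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly classified compounds within each true class."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {cls: correct[cls] / totals[cls] for cls in totals}


# Hypothetical toy data, not taken from the paper's 578-molecule set.
toy_fa = [95.0, 80.0, 30.0, 5.0, 60.0, 99.0, 45.0, 12.0]
labels = [label_absorption(v) for v in toy_fa]
weights = balanced_class_weights(labels)

# Hypothetical classifier output for the same eight compounds.
toy_pred = ["good", "good", "poor", "good", "good", "good", "good", "poor"]
accuracy = per_class_accuracy(labels, toy_pred)

print(weights)
print(accuracy)
```

Reporting accuracy per class rather than overall matters for a data set like this one: when most compounds are well absorbed, a classifier that always predicts "good" would score well overall while failing on every poorly absorbed compound. Weights of this kind could be passed to an SVM implementation that supports per-class penalties.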

