Evaluation of virtual screening performance of support vector machines trained by sparsely distributed active compounds.

Authors

Ma X H, Wang R, Yang S Y, Li Z R, Xue Y, Wei Y C, Low B C, Chen Y Z

Affiliation

Centre for Computational Science and Engineering, National University of Singapore, Singapore.

Publication information

J Chem Inf Model. 2008 Jun;48(6):1227-37. doi: 10.1021/ci800022e. Epub 2008 Jun 6.

DOI: 10.1021/ci800022e
PMID: 18533644

Abstract

Virtual screening performance of support vector machines (SVM) depends on the diversity of training active and inactive compounds. While diverse inactive compounds can be routinely generated, the number and diversity of known actives are typically low. We evaluated the performance of SVM trained by sparsely distributed actives in six MDDR biological target classes composed of a high number of known actives (983-1645) of high, intermediate, and low structural diversity (muscarinic M1 receptor agonists, NMDA receptor antagonists, thrombin inhibitors, HIV protease inhibitors, cephalosporins, and renin inhibitors). SVM trained by regularly sparse data sets of 100 actives show improved yields at substantially reduced false-hit rates compared to those of published studies and those of Tanimoto-based similarity searching method based on the same data sets and molecular descriptors. SVM trained by very sparse data sets of 40 actives (2.4%-4.1% of the known actives) predicted 17.5-39.5%, 23.0-48.1%, and 70.2-92.4% of the remaining 943-1605 actives in the high, intermediate, and low diversity classes, respectively, 13.8-68.7% of which are outside the training compound families. SVM predicted 99.97% and 97.1% of the 9.997 M PUBCHEM and 167K remaining MDDR compounds as inactive and 2.6%-8.3% of the 19,495-38,483 MDDR compounds similar to the known actives as active. These suggest that SVM has substantial capability in identifying novel active compounds from sparse active data sets at low false-hit rates.
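
The Tanimoto-based similarity searching that the abstract uses as a baseline can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each compound is a vector of non-negative numeric descriptors and uses the continuous form of the Tanimoto coefficient, T = A·B / (|A|² + |B|² − A·B). The function names, the toy vectors, and the `threshold` value are hypothetical.

```python
from typing import List, Sequence, Tuple


def tanimoto(a: Sequence[float], b: Sequence[float]) -> float:
    """Continuous Tanimoto coefficient between two descriptor vectors."""
    if len(a) != len(b):
        raise ValueError("descriptor vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    # Two all-zero vectors have no defined similarity; treat it as 0.
    return dot / denom if denom else 0.0


def similarity_search(
    actives: Sequence[Sequence[float]],
    library: Sequence[Sequence[float]],
    threshold: float,  # hypothetical cutoff, not a value from the paper
) -> List[Tuple[int, float]]:
    """Return (index, best similarity) for each library compound whose
    highest similarity to any known active reaches the threshold."""
    hits = []
    for idx, compound in enumerate(library):
        best = max((tanimoto(compound, act) for act in actives), default=0.0)
        if best >= threshold:
            hits.append((idx, best))
    return hits


if __name__ == "__main__":
    known_actives = [[1.0, 1.0, 0.0]]   # toy descriptor vectors
    screening_library = [
        [1.0, 1.0, 0.0],                # identical to the active
        [0.0, 0.0, 1.0],                # shares nothing with it
        [1.0, 0.0, 0.0],                # partial overlap
    ]
    print(similarity_search(known_actives, screening_library, threshold=0.5))
```

A search of this kind can only retrieve compounds that lie close to a known active in descriptor space, which is why the abstract contrasts it with an SVM that learns a decision boundary from both actives and inactives.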

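The percentages quoted in the abstract can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the helper names are hypothetical, and the library sizes are the rounded figures given there (9.997 M and 167K), so the flagged counts are approximate.

```python
def percent(part: float, whole: float) -> float:
    """Share of `part` in `whole`, expressed as a percentage."""
    return 100.0 * part / whole


def flagged_count(library_size: int, pct_predicted_inactive: float) -> int:
    """Approximate number of compounds NOT predicted inactive,
    i.e. the compounds the model flags as active."""
    return round(library_size * (100.0 - pct_predicted_inactive) / 100.0)


TRAINING_ACTIVES = 40  # size of the "very sparse" training sets

# Smallest and largest classes of known actives reported in the abstract.
for known in (983, 1645):
    share = percent(TRAINING_ACTIVES, known)
    held_out = known - TRAINING_ACTIVES
    print(f"{known} known actives: {share:.1f}% used for training, "
          f"{held_out} left to predict")

# Compounds flagged as active in the two large screening libraries.
print("PubChem flagged:", flagged_count(9_997_000, 99.97))
print("MDDR flagged:", flagged_count(167_000, 97.1))
```

The loop reproduces the 2.4%-4.1% training share and the 943-1605 remaining actives. The last two lines give roughly 2,999 PubChem and 4,843 MDDR compounds flagged as active; these count as false hits only under the assumption that essentially none of those compounds is truly active.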

Similar articles

1
Evaluation of virtual screening performance of support vector machines trained by sparsely distributed active compounds.
J Chem Inf Model. 2008 Jun;48(6):1227-37. doi: 10.1021/ci800022e. Epub 2008 Jun 6.
2
Virtual screening of Abl inhibitors from large compound libraries by support vector machines.
J Chem Inf Model. 2009 Sep;49(9):2101-10. doi: 10.1021/ci900135u.
3
Identification of small molecule aggregators from large compound libraries by support vector machines.
J Comput Chem. 2010 Mar;31(4):752-63. doi: 10.1002/jcc.21347.
4
Virtual screening of selective multitarget kinase inhibitors by combinatorial support vector machines.
Mol Pharm. 2010 Oct 4;7(5):1545-60. doi: 10.1021/mp100179t. Epub 2010 Aug 26.
5
SVM model for virtual screening of Lck inhibitors.
J Chem Inf Model. 2009 Apr;49(4):877-85. doi: 10.1021/ci800387z.
6
Screening for new antidepressant leads of multiple activities by support vector machines.
J Chem Inf Model. 2006 Jan-Feb;46(1):158-67. doi: 10.1021/ci050301y.
7
A support vector machines approach for virtual screening of active compounds of single and multiple mechanisms from large libraries at an improved hit-rate and enrichment factor.
J Mol Graph Model. 2008 Jun;26(8):1276-86. doi: 10.1016/j.jmgm.2007.12.002. Epub 2007 Dec 15.
8
Data shaving: a focused screening approach.
J Chem Inf Comput Sci. 2004 Mar-Apr;44(2):470-9. doi: 10.1021/ci030025s.
9
Extraction and visualization of potential pharmacophore points using support vector machines: application to ligand-based virtual screening for COX-2 inhibitors.
J Med Chem. 2005 Nov 3;48(22):6997-7004. doi: 10.1021/jm050619h.
10
Effect of training data size and noise level on support vector machines virtual screening of genotoxic compounds from large compound libraries.
J Comput Aided Mol Des. 2011 May;25(5):455-67. doi: 10.1007/s10822-011-9431-3. Epub 2011 May 10.

Cited by

1
InertDB as a generative AI-expanded resource of biologically inactive small molecules from PubChem.
J Cheminform. 2025 Apr 10;17(1):49. doi: 10.1186/s13321-025-00999-1.
2
Evolution of Support Vector Machine and Regression Modeling in Chemoinformatics and Drug Discovery.
J Comput Aided Mol Des. 2022 May;36(5):355-362. doi: 10.1007/s10822-022-00442-9. Epub 2022 Mar 19.
3
Introduction to the BioChemical Library (BCL): An Application-Based Open-Source Toolkit for Integrated Cheminformatics and Machine Learning in Computer-Aided Drug Discovery.
Front Pharmacol. 2022 Feb 21;13:833099. doi: 10.3389/fphar.2022.833099. eCollection 2022.
4
Mutual Support of Ligand- and Structure-Based Approaches-To What Extent We Can Optimize the Power of Predictive Model? Case Study of Opioid Receptors.
Molecules. 2021 Mar 14;26(6):1607. doi: 10.3390/molecules26061607.
5
The influence of the negative-positive ratio and screening database size on the performance of machine learning-based virtual screening.
PLoS One. 2017 Apr 6;12(4):e0175410. doi: 10.1371/journal.pone.0175410. eCollection 2017.
6
Discovery of Influenza A virus neuraminidase inhibitors using support vector machine and Naïve Bayesian models.
Mol Divers. 2016 May;20(2):439-51. doi: 10.1007/s11030-015-9641-z. Epub 2015 Dec 21.
7
The influence of negative training set size on machine learning-based virtual screening.
J Cheminform. 2014 Jun 11;6:32. doi: 10.1186/1758-2946-6-32. eCollection 2014.
8
Fast rule-based bioactivity prediction using associative classification mining.
J Cheminform. 2012 Nov 23;4(1):29. doi: 10.1186/1758-2946-4-29.
9
Development and experimental test of support vector machines virtual screening method for searching Src inhibitors from large compound libraries.
Chem Cent J. 2012 Nov 23;6(1):139. doi: 10.1186/1752-153X-6-139.
10
Exploiting PubChem for Virtual Screening.
Expert Opin Drug Discov. 2010 Dec;5(12):1205-1220. doi: 10.1517/17460441.2010.524924.