Comparison of confirmed inactive and randomly selected compounds as negative training examples in support vector machine-based virtual screening.

Affiliation

LIMES Program Unit, Chemical Biology and Medicinal Chemistry, Department of Life Science Informatics, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany.

Publication information

J Chem Inf Model. 2013 Jul 22;53(7):1595-601. doi: 10.1021/ci4002712. Epub 2013 Jul 3.

DOI: 10.1021/ci4002712
PMID: 23799269

Abstract

The choice of negative training data for machine learning is a little-explored issue in chemoinformatics. In this study, the influence of alternative sets of negative training data and different background databases on support vector machine (SVM) modeling and virtual screening has been investigated. Target-directed SVM models have been derived on the basis of differently composed training sets containing confirmed inactive molecules or randomly selected database compounds as negative training instances. These models were then applied to search background databases consisting of biological screening data or randomly assembled compounds for available hits. Negative training data were found to systematically influence compound recall in virtual screening. In addition, different background databases had a strong influence on the search results. Our findings also indicated that typical benchmark settings lead to an overestimation of SVM-based virtual screening performance compared to search conditions that are more relevant for practical applications.
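
The experimental design summarized in the abstract (same actives, alternative negative training sets, recall measured on a background database) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not state the descriptors, kernel, or data sources, so everything below is an assumption made for illustration. The fingerprints are synthetic, the bit-probability profiles `ACTIVE`, `INACTIVE`, and `RANDOM` are hypothetical, and a linear SVM trained with Pegasos-style stochastic sub-gradient descent stands in for whatever SVM setup the study used.

```python
import random

def make_compounds(rng, n, bit_probs):
    """Draw n synthetic binary fingerprints; bit j is set with probability bit_probs[j]."""
    return [[1.0 if rng.random() < p else 0.0 for p in bit_probs] for _ in range(n)]

def train_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    """Linear SVM (hinge loss, L2 penalty) via Pegasos-style sub-gradient descent.
    A constant feature is appended so the last weight acts as a bias term."""
    rng = random.Random(seed)
    Xa = [x + [1.0] for x in X]
    w = [0.0] * len(Xa[0])
    order = list(range(len(Xa)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, Xa[i]))
            w = [wj * (1.0 - eta * lam) for wj in w]  # shrink (regularization)
            if margin < 1.0:  # margin violated: step toward the example
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, Xa[i])]
    return w

def score(w, x):
    """Signed distance-like score; higher means 'more likely active'."""
    return sum(wj * xj for wj, xj in zip(w, x + [1.0]))

def recall_at_k(scores, labels, k):
    """Fraction of all actives (label 1) found among the k top-scored compounds."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(labels[i] for i in ranked[:k]) / sum(labels)

rng = random.Random(42)
# Hypothetical bit-probability profiles (illustrative assumptions only).
ACTIVE = [0.9] * 8 + [0.5] * 8 + [0.1] * 16    # activity motif + shared scaffold
INACTIVE = [0.2] * 8 + [0.5] * 8 + [0.1] * 16  # same scaffold, motif mostly absent
RANDOM = [0.2] * 32                            # generic database compound

actives_train = make_compounds(rng, 50, ACTIVE)
negative_sets = {
    "confirmed_inactive": make_compounds(rng, 200, INACTIVE),
    "random": make_compounds(rng, 200, RANDOM),
}
# Background database: 20 held-out actives hidden among screening-like inactives.
background = make_compounds(rng, 20, ACTIVE) + make_compounds(rng, 980, INACTIVE)
labels = [1] * 20 + [0] * 980

recalls = {}
for name, negatives in negative_sets.items():
    X = actives_train + negatives
    y = [1] * len(actives_train) + [-1] * len(negatives)
    w = train_linear_svm(X, y)
    scores = [score(w, x) for x in background]
    recalls[name] = recall_at_k(scores, labels, k=50)
    print(f"{name}: recall@50 = {recalls[name]:.2f}")
```

Replacing the `INACTIVE` profile with `RANDOM` when building `background` gives a rough analogue of searching a randomly assembled database instead of screening data, the second factor the abstract reports as strongly affecting results. The numbers produced here depend entirely on the synthetic profiles and say nothing about the paper's actual findings.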

Similar articles

1. Comparison of confirmed inactive and randomly selected compounds as negative training examples in support vector machine-based virtual screening.
   J Chem Inf Model. 2013 Jul 22;53(7):1595-601. doi: 10.1021/ci4002712. Epub 2013 Jul 3.
2. REPROVIS-DB: a benchmark system for ligand-based virtual screening derived from reproducible prospective applications.
   J Chem Inf Model. 2011 Oct 24;51(10):2467-73. doi: 10.1021/ci200309j. Epub 2011 Sep 26.
3. Potency-directed similarity searching using support vector machines.
   Chem Biol Drug Des. 2011 Jan;77(1):30-8. doi: 10.1111/j.1747-0285.2010.01059.x. Epub 2010 Nov 29.
4. Influence of Varying Training Set Composition and Size on Support Vector Machine-Based Prediction of Active Compounds.
   J Chem Inf Model. 2017 Apr 24;57(4):710-716. doi: 10.1021/acs.jcim.7b00088. Epub 2017 Apr 10.
5. Critical comparison of virtual screening methods against the MUV data set.
   J Chem Inf Model. 2009 Oct;49(10):2168-78. doi: 10.1021/ci900249b.
6. Similarity searching for potent compounds using feature selection.
   J Chem Inf Model. 2013 Jul 22;53(7):1613-9. doi: 10.1021/ci4003206. Epub 2013 Jul 9.
7. Evaluation of virtual screening performance of support vector machines trained by sparsely distributed active compounds.
   J Chem Inf Model. 2008 Jun;48(6):1227-37. doi: 10.1021/ci800022e. Epub 2008 Jun 6.
8. Virtual screening of Abl inhibitors from large compound libraries by support vector machines.
   J Chem Inf Model. 2009 Sep;49(9):2101-10. doi: 10.1021/ci900135u.
9. Training based on ligand efficiency improves prediction of bioactivities of ligands and drug target proteins in a machine learning approach.
   J Chem Inf Model. 2013 Oct 28;53(10):2525-37. doi: 10.1021/ci400240u. Epub 2013 Sep 24.
10. Virtual screening of selective multitarget kinase inhibitors by combinatorial support vector machines.
    Mol Pharm. 2010 Oct 4;7(5):1545-60. doi: 10.1021/mp100179t. Epub 2010 Aug 26.

Cited by

1. InertDB as a generative AI-expanded resource of biologically inactive small molecules from PubChem.
   J Cheminform. 2025 Apr 10;17(1):49. doi: 10.1186/s13321-025-00999-1.
2. Identification of novel inhibitors of Keap1/Nrf2 by a promising method combining protein-protein interaction-oriented library and machine learning.
   Sci Rep. 2021 Apr 1;11(1):7420. doi: 10.1038/s41598-021-86616-1.
3. How to Achieve Better Results Using PASS-Based Virtual Screening: Case Study for Kinase Inhibitors.
   Front Chem. 2018 Apr 26;6:133. doi: 10.3389/fchem.2018.00133. eCollection 2018.
4. Nature is the best source of anticancer drugs: Indexing natural products for their anticancer bioactivity.
   PLoS One. 2017 Nov 9;12(11):e0187925. doi: 10.1371/journal.pone.0187925. eCollection 2017.
5. Predicting novel substrates for enzymes with minimal experimental effort with active learning.
   Metab Eng. 2017 Nov;44:171-181. doi: 10.1016/j.ymben.2017.09.016. Epub 2017 Oct 10.
6. Nature is the best source of anti-inflammatory drugs: indexing natural products for their anti-inflammatory bioactivity.
   Inflamm Res. 2018 Jan;67(1):67-75. doi: 10.1007/s00011-017-1096-5. Epub 2017 Sep 27.
7. Indexing Natural Products for Their Potential Anti-Diabetic Activity: Filtering and Mapping Discriminative Physicochemical Properties.
   Molecules. 2017 Sep 17;22(9):1563. doi: 10.3390/molecules22091563.
8. The influence of the negative-positive ratio and screening database size on the performance of machine learning-based virtual screening.
   PLoS One. 2017 Apr 6;12(4):e0175410. doi: 10.1371/journal.pone.0175410. eCollection 2017.
9. Influence of Varying Training Set Composition and Size on Support Vector Machine-Based Prediction of Active Compounds.
   J Chem Inf Model. 2017 Apr 24;57(4):710-716. doi: 10.1021/acs.jcim.7b00088. Epub 2017 Apr 10.
10. Collaborative drug discovery for More Medicines for Tuberculosis (MM4TB).
    Drug Discov Today. 2017 Mar;22(3):555-565. doi: 10.1016/j.drudis.2016.10.009. Epub 2016 Nov 22.