Evaluation of machine-learning methods for ligand-based virtual screening.

Authors

Chen Beining, Harrison Robert F, Papadatos George, Willett Peter, Wood David J, Lewell Xiao Qing, Greenidge Paulette, Stiefl Nikolaus

Affiliation

Department of Chemistry, University of Sheffield, Western Bank, Sheffield, UK.

Publication details

J Comput Aided Mol Des. 2007 Jan-Mar;21(1-3):53-62. doi: 10.1007/s10822-006-9096-5. Epub 2007 Jan 5.

DOI: 10.1007/s10822-006-9096-5
PMID: 17205373
Abstract

Machine-learning methods can be used for virtual screening by analysing the structural characteristics of molecules of known (in)activity, and we here discuss the use of kernel discrimination and naive Bayesian classifier (NBC) methods for this purpose. We report a kernel method that allows the processing of molecules represented by binary, integer and real-valued descriptors, and show that it is little different in screening performance from a previously described kernel that had been developed specifically for the analysis of binary fingerprint representations of molecular structure. We then evaluate the performance of an NBC when the training-set contains only a very few active molecules. In such cases, a simpler approach based on group fusion would appear to provide superior screening performance, especially when structurally heterogeneous datasets are to be processed.
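
The abstract names two scoring schemes without giving formulas, so the following is only a minimal sketch under stated assumptions. It assumes group fusion means taking the maximum Tanimoto similarity between a candidate and each known active, and it assumes the binomial form of the binary kernel that is commonly used for binary kernel discrimination. The function names, the fingerprint representation (a set of "on" bit positions), and the default values of `lam` and `beta` are illustrative choices, not details taken from the paper.

```python
from typing import FrozenSet, List, Sequence

# Assumption: a binary fingerprint is stored as the set of its "on" bit positions.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient between two binary fingerprints."""
    common = len(a & b)
    union = len(a) + len(b) - common
    return common / union if union else 0.0


def group_fusion_score(candidate: Fingerprint,
                       actives: Sequence[Fingerprint]) -> float:
    """Fuse the similarities to several reference actives with the MAX rule
    (assumed here; other fusion rules exist)."""
    return max(tanimoto(candidate, ref) for ref in actives)


def binary_kernel(a: Fingerprint, b: Fingerprint, n_bits: int,
                  lam: float = 0.75, beta: float = 1.0) -> float:
    """Binomial kernel on the Hamming distance d between two fingerprints:
    K = (lam^(n - d) * (1 - lam)^d)^(beta / n).
    The exponents are scaled first to avoid underflow for long fingerprints."""
    d = len(a ^ b)  # Hamming distance = number of differing bits
    scale = beta / n_bits
    return lam ** ((n_bits - d) * scale) * (1.0 - lam) ** (d * scale)


def bkd_score(candidate: Fingerprint, actives: Sequence[Fingerprint],
              inactives: Sequence[Fingerprint], n_bits: int,
              lam: float = 0.75) -> float:
    """Ratio of summed kernel values over training actives and inactives;
    larger values suggest the candidate resembles the actives more."""
    num = sum(binary_kernel(candidate, ref, n_bits, lam) for ref in actives)
    den = sum(binary_kernel(candidate, ref, n_bits, lam) for ref in inactives)
    return num / den


def rank_by_group_fusion(database: Sequence[Fingerprint],
                         actives: Sequence[Fingerprint]) -> List[int]:
    """Return database indices sorted from most to least similar."""
    scores = [group_fusion_score(fp, actives) for fp in database]
    return sorted(range(len(database)), key=lambda i: scores[i], reverse=True)
```

Group fusion needs only a handful of known actives and no inactives, which is consistent with the abstract's remark that it suits training sets containing very few active molecules; the kernel score, by contrast, depends on both classes and on the smoothing parameter `lam`, which would normally be tuned on held-out data.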

