
Beware of the generic machine learning-based scoring functions in structure-based virtual screening.

Affiliation

Central South University, China.

Publication information

Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa070.

DOI: 10.1093/bib/bbaa070
PMID: 32484221

Abstract

Machine learning-based scoring functions (MLSFs) have attracted extensive attention recently and are expected to serve as rescoring tools for structure-based virtual screening (SBVS). However, a major concern is whether MLSFs trained for generic use, rather than for a given target, are consistently applicable to virtual screening (VS). In this study, a systematic assessment was carried out to re-evaluate the effectiveness of 14 reported MLSFs in VS. Overall, most of these MLSFs could hardly achieve satisfactory results on any dataset, and they could not even outperform the baseline of classical scoring functions (SFs) such as Glide SP. An exception was RFscore-VS trained on the Directory of Useful Decoys-Enhanced dataset, which was superior for most targets. However, in most cases it showed rather limited performance on targets that were dissimilar to the proteins in the corresponding training set. We also used the top three docking poses rather than only the top one for rescoring, and retrained the models with updated versions of the training set, but only minor improvements were observed. Taken together, generic MLSFs may generalize too poorly to be applicable to real VS campaigns. Therefore, this type of method should be used for VS with considerable caution.
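
The comparison between rescoring only the top docking pose and rescoring the top three poses can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the helper names `best_of_top_k` and `enrichment_factor`, the "higher score is better" convention, the choice of enrichment factor as the metric, and the made-up toy scores.

```python
from typing import Dict, List, Set


def best_of_top_k(pose_scores: Dict[str, List[float]], k: int) -> Dict[str, float]:
    """Keep, for each ligand, the best rescoring value among its first k poses.

    pose_scores maps a ligand id to rescoring values ordered by docking rank.
    Assumption: a higher rescoring value means a more likely active.
    """
    return {lig: max(scores[:k]) for lig, scores in pose_scores.items()}


def enrichment_factor(scores: Dict[str, float], actives: Set[str], fraction: float) -> float:
    """Ratio of the active rate in the top `fraction` to the active rate overall."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    hits_top = sum(1 for lig in ranked[:n_top] if lig in actives)
    return (hits_top / n_top) / (len(actives) / len(ranked))


# Hypothetical toy library: 2 actives (a1, a2) and 8 decoys (d1..d8).
# a2 has a poorly scored top pose but a well-scored second pose.
toy_scores = {
    "a1": [0.90, 0.50, 0.40],
    "a2": [0.20, 0.80, 0.30],
    "d1": [0.60, 0.10, 0.10],
    "d2": [0.50, 0.40, 0.30],
    "d3": [0.40, 0.30, 0.20],
    "d4": [0.30, 0.20, 0.10],
    "d5": [0.25, 0.20, 0.10],
    "d6": [0.15, 0.10, 0.05],
    "d7": [0.10, 0.10, 0.10],
    "d8": [0.05, 0.00, 0.00],
}
toy_actives = {"a1", "a2"}

ef_top1 = enrichment_factor(best_of_top_k(toy_scores, 1), toy_actives, 0.2)
ef_top3 = enrichment_factor(best_of_top_k(toy_scores, 3), toy_actives, 0.2)
print("EF(20%) with top-1 pose: ", ef_top1)
print("EF(20%) with top-3 poses:", ef_top3)
```

In this toy case the second active is recovered only when more than one pose is rescored. The abstract reports that on real data the same change brought only minor improvements, so the sketch shows the mechanics of the comparison, not the expected size of the effect.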

Similar articles

1
Beware of the generic machine learning-based scoring functions in structure-based virtual screening.
Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa070.
2
Accuracy or novelty: what can we gain from target-specific machine-learning-based scoring functions in virtual screening?
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbaa410.
3
Topology-Based and Conformation-Based Decoys Database: An Unbiased Online Database for Training and Benchmarking Machine-Learning Scoring Functions.
J Med Chem. 2023 Jul 13;66(13):9174-9183. doi: 10.1021/acs.jmedchem.3c00801. Epub 2023 Jun 14.
4
Data-augmented machine learning scoring functions for virtual screening of YTHDF1 m6A reader protein.
Comput Biol Med. 2024 Dec;183:109268. doi: 10.1016/j.compbiomed.2024.109268. Epub 2024 Oct 12.
5
Improving structure-based virtual screening performance via learning from scoring function components.
Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa094.
6
TocoDecoy: A New Approach to Design Unbiased Datasets for Training and Benchmarking Machine-Learning Scoring Functions.
J Med Chem. 2022 Jun 9;65(11):7918-7932. doi: 10.1021/acs.jmedchem.2c00460. Epub 2022 Jun 1.
7
SCORCH: Improving structure-based virtual screening with machine learning classifiers, data augmentation, and uncertainty estimation.
J Adv Res. 2023 Apr;46:135-147. doi: 10.1016/j.jare.2022.07.001. Epub 2022 Jul 25.
8
Assessment of the Generalization Abilities of Machine-Learning Scoring Functions for Structure-Based Virtual Screening.
J Chem Inf Model. 2022 Nov 28;62(22):5485-5502. doi: 10.1021/acs.jcim.2c01149. Epub 2022 Oct 21.
9
ML-PLIC: a web platform for characterizing protein-ligand interactions and developing machine learning-based scoring functions.
Brief Bioinform. 2023 Sep 20;24(5). doi: 10.1093/bib/bbad295.
10
The impact of compound library size on the performance of scoring functions for structure-based virtual screening.
Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa095.

Cited by

1
Decoding the limits of deep learning in molecular docking for drug discovery.
Chem Sci. 2025 Aug 19. doi: 10.1039/d5sc05395a.
2
ColdstartCPI: Induced-fit theory-guided DTI predictive model with improved generalization performance.
Nat Commun. 2025 Jul 11;16(1):6436. doi: 10.1038/s41467-025-61745-7.
3
SPLIF-Enhanced Attention-Driven 3D CNNs for Precise and Reliable Protein-Ligand Interaction Modeling for METTL3.
ACS Omega. 2025 Apr 16;10(16):16748-16761. doi: 10.1021/acsomega.5c00538. eCollection 2025 Apr 29.
4
Exploring the potential of compound-protein complex structure-free models in virtual screening using BlendNet.
Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae712.
5
Accurate prediction of protein-ligand interactions by combining physical energy functions and graph-neural networks.
J Cheminform. 2024 Nov 4;16(1):121. doi: 10.1186/s13321-024-00912-2.
6
A comprehensive review of artificial intelligence for pharmacology research.
Front Genet. 2024 Sep 3;15:1450529. doi: 10.3389/fgene.2024.1450529. eCollection 2024.
7
Comprehensive machine learning boosts structure-based virtual screening for PARP1 inhibitors.
J Cheminform. 2024 Apr 7;16(1):40. doi: 10.1186/s13321-024-00832-1.
8
A new paradigm for applying deep learning to protein-ligand interaction prediction.
Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae145.
9
CarsiDock: a deep learning paradigm for accurate protein-ligand docking and screening based on large-scale pre-training.
Chem Sci. 2023 Dec 19;15(4):1449-1471. doi: 10.1039/d3sc05552c. eCollection 2024 Jan 24.
10
Using macromolecular electron densities to improve the enrichment of active compounds in virtual screening.
Commun Chem. 2023 Aug 22;6(1):173. doi: 10.1038/s42004-023-00984-5.