FRAGSITE2: A structure and fragment-based approach for virtual ligand screening.

Affiliation

Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia, USA.

Publication information

Protein Sci. 2024 Jan;33(1):e4869. doi: 10.1002/pro.4869.

Abstract

Protein function annotation and drug discovery often involve finding small-molecule binders. In the early stages of drug discovery, virtual ligand screening (VLS) is frequently applied to identify possible hits before experimental testing. While our recent ligand homology modeling (LHM)-machine learning VLS method FRAGSITE outperformed approaches that combined traditional docking to generate protein-ligand poses with deep learning scoring functions to rank ligands, a more robust approach that could identify a more diverse set of binding ligands is needed. Here, we describe FRAGSITE2, which shows significant improvement on protein targets that lack known small-molecule binders and have no confident LHM-identified template ligands when benchmarked on two commonly used VLS datasets: for both the DUD-E set and the DEKOIS2.0 set, and for ligands having a Tanimoto coefficient (TC) < 0.7 to the template ligands, the 1% enrichment factor (EF1%) of FRAGSITE2 is significantly better than that of FINDSITE, an earlier LHM algorithm. For the DUD-E set, FRAGSITE2 also shows a better ROC enrichment factor and AUPR (area under the precision-recall curve) than the deep learning DenseFS scoring function. Compared with RF-score-VS on the 76-target subset of DEKOIS2.0, for ligands with a TC < 0.99 to the training DUD-E ligands, FRAGSITE2 has double the EF1%. Its boosted tree regression method provides more robust performance than a deep learning multilayer perceptron method. When compared with a pretrained language model for protein target features, FRAGSITE2 also shows much better performance. Thus, FRAGSITE2 is a promising approach that can discover novel hits for protein targets. FRAGSITE2's web service is freely available to academic users at http://sites.gatech.edu/cssb/FRAGSITE2.
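
The two screening metrics named in the abstract, the 1% enrichment factor and the Tanimoto coefficient, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the rounding rule for the size of the top slice, and the representation of a fingerprint as a set of on-bit indices are all assumptions made for illustration, and the abstract does not state which fingerprint FRAGSITE2 uses.

```python
from typing import Sequence, Set


def enrichment_factor(scores: Sequence[float],
                      is_active: Sequence[bool],
                      fraction: float = 0.01) -> float:
    """Enrichment factor at a top fraction of the ranked library.

    EF = (actives in top slice / size of top slice)
         / (all actives / library size).
    Higher scores are assumed to mean "more likely to bind".
    """
    if len(scores) != len(is_active) or len(scores) == 0:
        raise ValueError("scores and is_active must be non-empty and equal in length")
    n_total = len(scores)
    n_actives = sum(is_active)
    if n_actives == 0:
        raise ValueError("at least one active is required")
    # Assumption: the top slice size is rounded to the nearest integer, minimum 1.
    n_top = max(1, int(round(n_total * fraction)))
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(active for _, active in ranked[:n_top])
    return (hits_top / n_top) / (n_actives / n_total)


def tanimoto(fp_a: Set[int], fp_b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = fp_a | fp_b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / len(union)


if __name__ == "__main__":
    # Hypothetical library: 1000 compounds, 10 actives, 5 of them ranked in the top 10.
    scores = list(range(1000, 0, -1))
    labels = [i < 5 or 500 <= i < 505 for i in range(1000)]
    print("EF1%:", enrichment_factor(scores, labels, 0.01))
    print("TC:", tanimoto({1, 2, 3, 4}, {3, 4, 5, 6}))
```

With random ranking the enrichment factor is about 1, and its ceiling at 1% is 100 (or lower when actives outnumber the top slice). A filter such as TC < 0.7 to template ligands would, under these assumptions, keep only library ligands whose `tanimoto` value against every template fingerprint falls below 0.7.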
