Multimodal data fusion for supervised learning-based identification of USP7 inhibitors: a systematic comparison.

Authors

Shen Wen-Feng, Tang He-Wei, Li Jia-Bo, Li Xiang, Chen Si

Affiliations

School of Medicine & School of Computer Engineering and Science, Shanghai University, Shanghai, 200444, China.

School of Pharmacy, Second Military Medical University, Shanghai, 200433, China.

Publication information

J Cheminform. 2023 Jan 11;15(1):5. doi: 10.1186/s13321-022-00675-8.

DOI: 10.1186/s13321-022-00675-8
PMID: 36631899
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9835315/

Abstract

Ubiquitin-specific-processing protease 7 (USP7) is a promising target protein for cancer therapy, and great attention has been given to the identification of USP7 inhibitors. Traditional virtual screening methods have now been successfully applied to discover USP7 inhibitors aiming at reducing costs and speeding up time in several studies. However, due to their unsatisfactory accuracy, it is still a difficult task to develop USP7 inhibitors. In this study, multiple supervised learning classifiers were built to distinguish active USP7 inhibitors from inactive ligands. Physicochemical descriptors, MACCS keys, ECFP4 fingerprints and SMILES were first calculated to represent the compounds in our in-house dataset. Two deep learning (DL) models and nine classical machine learning (ML) models were then constructed based on different combinations of the above molecular representations under three activity cutoff values, and a total of 15 groups of experiments (75 experiments) were implemented. The performance of the models in these experiments was evaluated, compared and discussed using a variety of metrics. The optimal models are ensemble learning models when the dataset is balanced or severely imbalanced, and SMILES-based DL performs the best when the dataset is slightly imbalanced. Meanwhile, multimodal data fusion in some cases can improve the performance of ML and DL models. In addition, SMOTE, unbiased decoy selection and SMILES enumeration can improve the performance of ML and DL models when the dataset is severely imbalanced, and SMOTE works the best. Our study established highly accurate supervised learning classification models, which would accelerate the development of USP7 inhibitors. Some guidance was also provided for drug researchers in selecting supervised models and molecular representations as well as handling imbalanced datasets.
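
The abstract names three steps that a small example can make concrete: labelling compounds against an activity cutoff, fusing several molecular representations, and rebalancing the classes with SMOTE. The sketch below is illustrative only and is not the authors' code. It assumes activity is an IC50 in µM, that fusion is plain concatenation of feature vectors, and that the toy compounds, the `CUTOFF_UM` value, and the helper names `label_by_cutoff`, `fuse`, and `smote` are all hypothetical. The abstract confirms none of these details.

```python
import math
import random

def label_by_cutoff(ic50_um, cutoff_um):
    """Return 1 (active) if IC50 is at or below the cutoff, else 0 (inactive)."""
    return 1 if ic50_um <= cutoff_um else 0

def fuse(*feature_blocks):
    """Early fusion (an assumption): concatenate feature vectors into one vector."""
    fused = []
    for block in feature_blocks:
        fused.extend(block)
    return fused

def smote(minority, n_new, k=2, seed=0):
    """SMOTE-style oversampling: each synthetic sample is a random point on the
    segment between a minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (x for x in minority if x is not base),
            key=lambda x: math.dist(base, x),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Made-up compounds: (IC50 in µM, two descriptor values, three fingerprint bits).
compounds = [
    (0.5, [0.20, 1.1], [1, 0, 1]),
    (0.8, [0.30, 0.9], [1, 1, 1]),
    (0.9, [0.25, 1.0], [1, 0, 1]),
    (5.0, [1.4, 0.2], [0, 0, 1]),
    (12.0, [1.5, 0.1], [0, 1, 0]),
    (30.0, [1.2, 0.3], [0, 0, 0]),
    (45.0, [1.6, 0.2], [0, 1, 1]),
    (80.0, [1.3, 0.4], [0, 0, 1]),
]
CUTOFF_UM = 1.0  # hypothetical activity cutoff, not taken from the paper

X = [fuse(desc, fp) for _, desc, fp in compounds]
y = [label_by_cutoff(ic50, CUTOFF_UM) for ic50, _, _ in compounds]

actives = [x for x, label in zip(X, y) if label == 1]
inactives = [x for x, label in zip(X, y) if label == 0]

# Oversample the minority (active) class until both classes are the same size.
new_actives = smote(actives, n_new=len(inactives) - len(actives))
X_bal = X + new_actives
y_bal = y + [1] * len(new_actives)

print(sum(y_bal), len(y_bal) - sum(y_bal))  # prints: 5 5
```

Interpolating between binary fingerprint bits yields fractional values, which is a known limitation of applying SMOTE to binary features. In practice the representations would come from a cheminformatics toolkit and the oversampling from an established library rather than from hand-written helpers like these.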

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/75d5/9835315/a9fc8fbeb69f/13321_2022_675_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/75d5/9835315/12261d83054f/13321_2022_675_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/75d5/9835315/c8a43eef5764/13321_2022_675_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/75d5/9835315/51659fdb0392/13321_2022_675_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/75d5/9835315/df855bb57c8f/13321_2022_675_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/75d5/9835315/bc09a2279e97/13321_2022_675_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/75d5/9835315/68ab8835e4d3/13321_2022_675_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/75d5/9835315/0e646db6bc22/13321_2022_675_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/75d5/9835315/42700b1594df/13321_2022_675_Fig9_HTML.jpg
