Developing a SARS-CoV-2 main protease binding prediction random forest model for drug repurposing for COVID-19 treatment.

Affiliation

National Center for Toxicological Research, U.S. Food & Drug Administration, Jefferson, AR 72079, USA.

Publication information

Exp Biol Med (Maywood). 2023 Nov;248(21):1927-1936. doi: 10.1177/15353702231209413. Epub 2023 Nov 24.

DOI: 10.1177/15353702231209413

PMID: 37997891

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10798185/

Abstract

The coronavirus disease 2019 (COVID-19) global pandemic resulted in millions of people becoming infected with the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus and close to seven million deaths worldwide. It is essential to further explore and design effective COVID-19 treatment drugs that target the main protease of SARS-CoV-2, a major target for COVID-19 drugs. In this study, machine learning was applied for predicting the SARS-CoV-2 main protease binding of Food and Drug Administration (FDA)-approved drugs to assist in the identification of potential repurposing candidates for COVID-19 treatment. Ligands bound to the SARS-CoV-2 main protease in the Protein Data Bank and compounds experimentally tested in SARS-CoV-2 main protease binding assays in the literature were curated. These chemicals were divided into training (516 chemicals) and testing (360 chemicals) data sets. To identify SARS-CoV-2 main protease binders as potential candidates for repurposing to treat COVID-19, 1188 FDA-approved drugs from the Liver Toxicity Knowledge Base were obtained. A random forest algorithm was used for constructing predictive models based on molecular descriptors calculated using Mold2 software. Model performance was evaluated using 100 iterations of fivefold cross-validations which resulted in 78.8% balanced accuracy. The random forest model that was constructed from the whole training dataset was used to predict SARS-CoV-2 main protease binding on the testing set and the FDA-approved drugs. Model applicability domain and prediction confidence on drugs predicted as the main protease binders discovered 10 FDA-approved drugs as potential candidates for repurposing to treat COVID-19. Our results demonstrate that machine learning is an efficient method for drug repurposing and, thus, may accelerate drug development targeting SARS-CoV-2.
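The evaluation protocol named in the abstract (100 iterations of fivefold cross-validation, scored by balanced accuracy) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code. The function names (`balanced_accuracy`, `stratified_folds`, `repeated_cv`, `threshold_fit_predict`) are hypothetical, the data are synthetic, and a one-feature threshold classifier stands in for the random forest trained on Mold2 descriptors. Stratifying the folds by class is also an assumption, since the abstract does not say how folds were drawn.

```python
import random
from statistics import mean

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = binder, 0 = non-binder)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)

def stratified_folds(labels, k, rng):
    """Split sample indices into k folds with similar class proportions (assumed scheme)."""
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for position, i in enumerate(idx):
            folds[position % k].append(i)
    return folds

def repeated_cv(X, y, fit_predict, k=5, iterations=100, seed=0):
    """Repeat k-fold cross-validation with fresh random splits; return the mean balanced accuracy."""
    rng = random.Random(seed)
    scores = []
    for _ in range(iterations):
        y_pred = [None] * len(y)
        for held_out in stratified_folds(y, k, rng):
            held = set(held_out)
            train = [i for i in range(len(y)) if i not in held]
            preds = fit_predict([X[i] for i in train],
                                [y[i] for i in train],
                                [X[i] for i in held_out])
            for i, p in zip(held_out, preds):
                y_pred[i] = p
        # Every sample is predicted exactly once per iteration, by a model that never saw it.
        scores.append(balanced_accuracy(y, y_pred))
    return mean(scores)

def threshold_fit_predict(X_train, y_train, X_test):
    """Hypothetical stand-in model: cut at the midpoint between the two class means."""
    mean1 = mean(x for x, t in zip(X_train, y_train) if t == 1)
    mean0 = mean(x for x, t in zip(X_train, y_train) if t == 0)
    cut = (mean0 + mean1) / 2
    return [1 if (x > cut) == (mean1 > mean0) else 0 for x in X_test]

# Synthetic one-descriptor data set: 40 "binders" and 60 "non-binders".
data_rng = random.Random(42)
y = [1] * 40 + [0] * 60
X = [data_rng.gauss(1.0 if t == 1 else -1.0, 1.0) for t in y]

score = repeated_cv(X, y, threshold_fit_predict, k=5, iterations=100)
print(f"mean balanced accuracy over 100 x 5-fold CV: {score:.3f}")
```

Because `repeated_cv` only needs a callable with the `fit_predict(X_train, y_train, X_test)` signature, any classifier, including a random forest from a library of choice, could be substituted for the stand-in without changing the evaluation loop.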
