Transfer learning towards predicting viral missense mutations: A case study on SARS-CoV-2.

Author information

Govender Shaylyn, Morgan Emily, Ramahala Rabelani, Lobb Kevin, Bishop Nigel T, Tastan Bishop Özlem

Affiliations

Research Unit in Bioinformatics (RUBi), Department of Biochemistry, Microbiology and Bioinformatics, Rhodes University, Makhanda 6139, South Africa.

Department of Chemistry, Rhodes University, Makhanda 6139, South Africa.

Publication information

Comput Struct Biotechnol J. 2025 Apr 22;27:1686-1692. doi: 10.1016/j.csbj.2025.04.029. eCollection 2025.

DOI:10.1016/j.csbj.2025.04.029

PMID:40352476

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12063013/

Abstract

Understanding viral evolution and predicting future mutations are crucial for overcoming drug resistance and developing long-lasting treatments. Previously, we established machine learning (ML) models using dynamic residue network (DRN) metric data and a large body of existing mutation data from the SARS-CoV-2 main protease (Mpro). Here, we sought to assess the generalizability and robustness of these models across other SARS-CoV-2 proteins. To achieve this, for the first time, we employed a transfer learning (TL) approach, allowing us to determine the extent to which Mpro-trained models could be applied to other SARS-CoV-2 proteins. The TL results were highly promising, with artificial neural network (ANN) and random forest (RF) correlation coefficients for Mpro closely matching those of NSP10, NSP16, and the papain-like protease (PLpro). The ANN |R| value for Mpro was 0.564, while NSP10, NSP16, and PLpro had values of 0.533, 0.527, and 0.464, respectively. Similarly, the RF |R| value for Mpro was 0.673, compared to 0.457, 0.460, and 0.437 for NSP10, NSP16, and PLpro, respectively. Interestingly, we did not observe a strong correlation for the spike (S) protein monomer and its domains. The low p-values associated with the |R| values show that the linear correlations between predicted and actual mutation frequencies are statistically significant. This indicates that TL may generalize well across structurally related viral proteins when using DRN-derived ML models trained on Mpro. Overall, we aim to develop a universal ML model for predicting missense mutation frequencies in viral proteins, and this study lays the foundation for that goal.
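The evaluation described in the abstract (fit on one protein, predict on another, then report |R| and a p-value between predicted and actual mutation frequencies) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, a one-feature least-squares line stands in for the ANN and RF models, the single "metric" is a made-up placeholder for the DRN metrics, and the p-value comes from a permutation test because the abstract does not say how its p-values were computed.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares slope and intercept (toy stand-in for the ANN/RF models)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def permutation_p_value(predicted, actual, n_permutations=2000, seed=0):
    """Share of label shuffles whose |R| is at least the observed |R|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(predicted, actual))
    shuffled = list(actual)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(predicted, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


def make_synthetic_protein(n_residues, noise, seed):
    """Hypothetical per-residue metric and a noisy mutation frequency."""
    rng = random.Random(seed)
    metric = [rng.random() for _ in range(n_residues)]
    freq = [2.0 * m + rng.gauss(0.0, noise) for m in metric]
    return metric, freq


# "Source" protein used for fitting, "target" protein used only for evaluation.
source_metric, source_freq = make_synthetic_protein(300, 0.5, seed=1)
target_metric, target_freq = make_synthetic_protein(120, 0.8, seed=2)

slope, intercept = fit_line(source_metric, source_freq)
target_pred = [slope * m + intercept for m in target_metric]

abs_r = abs(pearson_r(target_pred, target_freq))
p_value = permutation_p_value(target_pred, target_freq)
print(f"|R| on target protein: {abs_r:.3f}, permutation p-value: {p_value:.4f}")
```

The point of the sketch is the separation of roles: the target protein's frequencies never enter the fit, so |R| on the target measures how well the source-trained model transfers rather than how well it memorizes.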

Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cb02/12063013/7a9967e25e54/ga1.jpg
