Paying attention to the SARS-CoV-2 dialect: a deep neural network approach to predicting novel protein mutations.

Authors

Elkin Magdalyn E, Zhu Xingquan

Affiliation

Dept. Electrical Engineering and Computer Science, Florida Atlantic University, 777 Glades Road, Boca Raton, FL, 33431, USA.

Publication

Commun Biol. 2025 Jan 21;8(1):98. doi: 10.1038/s42003-024-07262-7.

DOI:10.1038/s42003-024-07262-7
PMID:39838059
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11751191/
Abstract

Predicting novel mutations has long-lasting impacts on life science research. Traditionally, this problem is addressed through wet-lab experiments, which are often expensive and time consuming. The recent advancement in neural language models has provided stunning results in modeling and deciphering sequences. In this paper, we propose a Deep Novel Mutation Search (DNMS) method, using deep neural networks, to model protein sequence for mutation prediction. We use SARS-CoV-2 spike protein as the target and use a protein language model to predict novel mutations. Different from existing research which is often limited to mutating the reference sequence for prediction, we propose a parent-child mutation prediction paradigm where a parent sequence is modeled for mutation prediction. Because mutations introduce changing context to the underlying sequence, DNMS models three aspects of the protein sequences: semantic changes, grammatical changes, and attention changes, each modeling protein sequence aspects from shifting of semantics, grammar coherence, and amino-acid interactions in latent space. A ranking approach is proposed to combine all three aspects to capture mutations demonstrating evolving traits, in accordance with real-world SARS-CoV-2 spike protein sequence evolution. DNMS can be adopted for an early warning variant detection system, creating public health awareness of future SARS-CoV-2 mutations.
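
The ranking step described in the abstract can be illustrated with a minimal sketch. It assumes each candidate mutation already has three scores (semantic change, grammaticality, attention change), ranks the candidates separately on each score, and orders them by summed rank. The function names, the mutation labels, the toy scores, and the choice of which direction counts as favorable for each score are all illustrative assumptions; the abstract does not specify them, and this is not the authors' implementation.

```python
from typing import Dict, List


def rank_positions(scores: Dict[str, float], lower_is_better: bool) -> Dict[str, int]:
    """Map each mutation to a 1-based rank; rank 1 is the most favorable."""
    ordered = sorted(scores, key=lambda m: scores[m], reverse=not lower_is_better)
    return {mutation: i + 1 for i, mutation in enumerate(ordered)}


def combine_by_rank(
    semantic: Dict[str, float],
    grammar: Dict[str, float],
    attention: Dict[str, float],
    lower_is_better: Dict[str, bool],
) -> List[str]:
    """Order mutations by the sum of their per-aspect ranks (smallest sum first)."""
    ranks = [
        rank_positions(semantic, lower_is_better["semantic"]),
        rank_positions(grammar, lower_is_better["grammar"]),
        rank_positions(attention, lower_is_better["attention"]),
    ]
    total = {m: sum(r[m] for r in ranks) for m in semantic}
    # Break ties on the label so the ordering is deterministic.
    return sorted(total, key=lambda m: (total[m], m))


# Toy scores for three hypothetical candidates (not real data).
semantic = {"mut_a": 0.2, "mut_b": 0.9, "mut_c": 0.5}
grammar = {"mut_a": 0.8, "mut_b": 0.4, "mut_c": 0.6}
attention = {"mut_a": 0.1, "mut_b": 0.7, "mut_c": 0.3}

# Assumed directions: small semantic and attention change, high grammaticality.
directions = {"semantic": True, "grammar": False, "attention": True}

ranking = combine_by_rank(semantic, grammar, attention, directions)
print(ranking)
```

Summing ranks rather than raw scores keeps the three aspects comparable even when they live on different scales; flipping an entry in `directions` changes which end of that score is treated as favorable.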

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a54d/11751191/07d1952230a5/42003_2024_7262_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a54d/11751191/9c86652d9813/42003_2024_7262_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a54d/11751191/6d778ae9aabb/42003_2024_7262_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a54d/11751191/5e718dc9be7c/42003_2024_7262_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a54d/11751191/ea05c6948635/42003_2024_7262_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a54d/11751191/3b5b29bfc513/42003_2024_7262_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a54d/11751191/d326d393ed72/42003_2024_7262_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a54d/11751191/26f9f869efe4/42003_2024_7262_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a54d/11751191/fa6c2736d979/42003_2024_7262_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a54d/11751191/89df179a3d88/42003_2024_7262_Fig10_HTML.jpg

Similar articles

1. A predictive language model for SARS-CoV-2 evolution.
   Signal Transduct Target Ther. 2024 Dec 23;9(1):353. doi: 10.1038/s41392-024-02066-x.
2. A Bayesian walker coupled with a computational workflow that generates the micro-evolution of SARS-CoV-2 and makes predictions of new mutations that can emerge.
   J Biomol Struct Dyn. 2024;42(21):11603-11611. doi: 10.1080/07391102.2023.2263798. Epub 2023 Sep 28.
3. An in silico deep learning approach to multi-epitope vaccine design: a SARS-CoV-2 case study.
   Sci Rep. 2021 Feb 5;11(1):3238. doi: 10.1038/s41598-021-81749-9.
4. Deep learning for discriminating non-trivial conformational changes in molecular dynamics simulations of SARS-CoV-2 spike-ACE2.
   Sci Rep. 2024 Sep 30;14(1):22639. doi: 10.1038/s41598-024-72842-w.
5. Epistatic models predict mutable sites in SARS-CoV-2 proteins and epitopes.
   Proc Natl Acad Sci U S A. 2022 Jan 25;119(4). doi: 10.1073/pnas.2113118119.
6. Semi-Supervised Pipeline for Autonomous Annotation of SARS-CoV-2 Genomes.
   Viruses. 2021 Dec 3;13(12):2426. doi: 10.3390/v13122426.
7. COVID-19 CG enables SARS-CoV-2 mutation and lineage tracking by locations and dates of interest.
   Elife. 2021 Feb 23;10:e63409. doi: 10.7554/eLife.63409.
8. Mapping the Evolutionary Space of SARS-CoV-2 Variants to Anticipate Emergence of Subvariants Resistant to COVID-19 Therapeutics.
   PLoS Comput Biol. 2024 Jun 10;20(6):e1012215. doi: 10.1371/journal.pcbi.1012215. eCollection 2024 Jun.
9. Role of glycosylation mutations at the N-terminal domain of SARS-CoV-2 XEC variant in immune evasion, cell-cell fusion, and spike stability.
   J Virol. 2025 Apr 15;99(4):e0024225. doi: 10.1128/jvi.00242-25. Epub 2025 Mar 26.

Cited by

1. Evolving fitness and immune escape: a retrospective analysis of SARS-CoV-2 spike protein (2020-2024) using protein language model.
   Front Immunol. 2025 Jun 18;16:1576414. doi: 10.3389/fimmu.2025.1576414. eCollection 2025.
2. AI-driven techniques for detection and mitigation of SARS-CoV-2 spread: a review, taxonomy, and trends.
   Clin Exp Med. 2025 Jun 14;25(1):204. doi: 10.1007/s10238-025-01753-5.
