Few-Shot Viral Variant Detection via Bayesian Active Learning and Biophysics.

Authors

Huot Marian, Wang Dianzhuo, Liu Jiacheng, Shakhnovich Eugene

Affiliations

Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA.

Laboratory of Physics of the Ecole Normale Supérieure, CNRS UMR 8023 and PSL Research, Sorbonne Université.

Publication

bioRxiv. 2025 Mar 13:2025.03.12.642881. doi: 10.1101/2025.03.12.642881.

DOI:10.1101/2025.03.12.642881
PMID:40161822
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11952382/
Abstract

The early detection of high-fitness viral variants is critical for pandemic response, yet limited experimental resources at the onset of variant emergence hinder effective identification. To address this, we introduce an active learning framework that integrates the protein language model ESM3, a Gaussian process with uncertainty estimation, and a biophysical model to predict the fitness of novel variants in a few-shot learning setting. By benchmarking on past SARS-CoV-2 data, we demonstrate that our method accelerates the identification of high-fitness variants by up to fivefold compared to random sampling while requiring experimental characterization of fewer than 1% of possible variants. We also demonstrate that our framework, benchmarked on deep mutational scans, effectively identifies sites that are frequently mutated during natural viral evolution, with a predictive advantage of up to two years compared to baseline strategies, particularly sites that enable antibody escape while preserving ACE2 binding. Through systematic analysis of different acquisition strategies, we show that incorporating uncertainty in variant selection enables broader exploration of the sequence landscape, leading to the discovery of evolutionarily distant but potentially dangerous variants. Our results suggest that this framework could serve as an effective early warning system for identifying concerning SARS-CoV-2 variants, and potentially emerging viruses with pandemic potential, before they achieve widespread circulation.
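To make the active learning loop in the abstract concrete, here is a minimal sketch of Gaussian process regression with an uncertainty-aware acquisition rule. It is not the authors' implementation. The feature vectors stand in for ESM3 embeddings, the `oracle` callback stands in for an experimental fitness measurement, the biophysical model is omitted, and the upper-confidence-bound score (`mean + beta * std`) is only one assumed example of an acquisition strategy that incorporates uncertainty. All function names and parameters are hypothetical.

```python
import math
import random

def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two feature vectors."""
    d2 = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * d2 / length ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A, for symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i, row in enumerate(L):
        y.append((b[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y

def solve_upper_t(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(L)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(X_train, y_train, X_query, noise=1e-2):
    """Zero-mean GP posterior (mean, std) at each query point."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(X_train)]
         for i, a in enumerate(X_train)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, y_train))  # (K + noise*I)^-1 y
    out = []
    for q in X_query:
        k = [rbf(q, a) for a in X_train]
        mean = sum(ki * ai for ki, ai in zip(k, alpha))
        v = solve_lower(L, k)
        var = max(rbf(q, q) - sum(vi * vi for vi in v), 0.0)
        out.append((mean, math.sqrt(var)))
    return out

def active_learning(pool, oracle, n_init=3, n_rounds=10, beta=2.0, seed=0):
    """Label n_init random pool items, then n_rounds items chosen by UCB."""
    rng = random.Random(seed)
    labeled = rng.sample(range(len(pool)), n_init)
    y = [oracle(pool[i]) for i in labeled]
    for _ in range(n_rounds):
        candidates = [i for i in range(len(pool)) if i not in labeled]
        post = gp_posterior([pool[i] for i in labeled], y,
                            [pool[i] for i in candidates])
        # Upper confidence bound: favor high predicted fitness AND high uncertainty.
        scores = [m + beta * s for m, s in post]
        best = candidates[max(range(len(candidates)), key=scores.__getitem__)]
        labeled.append(best)
        y.append(oracle(pool[best]))
    return labeled, y
```

For example, `active_learning([[x / 10.0] for x in range(50)], lambda v: math.sin(v[0]))` labels 13 of the 50 toy "variants". Setting `beta=0` reduces the rule to greedy selection on the predicted mean, which is one way to compare exploitation against the broader exploration the abstract attributes to uncertainty-aware selection.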

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aee2/11952382/1074d530e377/nihpp-2025.03.12.642881v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aee2/11952382/eeb4e4186e29/nihpp-2025.03.12.642881v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aee2/11952382/0aad6d11c9e2/nihpp-2025.03.12.642881v1-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aee2/11952382/16120e207341/nihpp-2025.03.12.642881v1-f0004.jpg
