Hidden challenges in evaluating spillover risk of zoonotic viruses using machine learning models.

Authors

Kawasaki Junna, Suzuki Tadaki, Hamada Michiaki

Affiliations

Faculty of Science and Engineering, Waseda University, Tokyo, Japan.

Department of Infectious Disease Pathobiology, Graduate School of Medicine, Chiba University, Chiba, Japan.

Publication information

Commun Med (Lond). 2025 May 20;5(1):187. doi: 10.1038/s43856-025-00903-w.

DOI:10.1038/s43856-025-00903-w
PMID:40394176
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12092720/
Abstract

BACKGROUND

Machine learning models have been deployed to assess the zoonotic spillover risk of viruses by identifying their potential for human infectivity. However, the lack of comprehensive datasets for viral infectivity poses a major challenge, limiting the predictable range of viruses.

METHODS

In this study, we address this limitation through two key strategies: constructing expansive datasets across 26 viral families and developing the BERT-infect model, which leverages large language models pre-trained on extensive nucleotide sequences.
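
As a rough illustration of how a nucleotide sequence might be turned into tokens for a BERT-style language model, the sketch below splits a sequence into overlapping k-mers. This is an assumption for illustration only: the abstract does not state how BERT-infect tokenizes its input, and the function name `kmer_tokens` and the default `k=6` are hypothetical.

```python
def kmer_tokens(sequence: str, k: int = 6) -> list[str]:
    """Split a nucleotide sequence into overlapping k-mers (stride 1).

    Illustrative assumption only: the abstract does not say which
    tokenization BERT-infect uses.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    seq = sequence.upper()
    # A sequence shorter than k yields no tokens.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


if __name__ == "__main__":
    print(kmer_tokens("atgcgt", k=3))
```

A sequence of length n yields n - k + 1 tokens, so the token count grows almost linearly with sequence length. That matters for models with a fixed maximum input length.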

RESULTS

Here we show that our approach substantially boosts model performance. This enhancement is particularly notable in segmented RNA viruses, which are involved with severe zoonoses but have been overlooked due to limited data availability. Our model also exhibits high predictive performance even with partial viral sequences, such as high-throughput sequencing reads or contig sequences from de novo sequence assemblies, indicating the model's applicability for mining zoonotic viruses from virus metagenomic data. Furthermore, models trained on data up to 2018 demonstrate robust predictive capability for most viruses identified post-2018. Nonetheless, high-resolution evaluation based on phylogenetic analysis reveals general limitations in current machine learning models: the difficulty in alerting the human infectious risk in specific zoonotic viral lineages, including SARS-CoV-2.
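
Two of the evaluation settings described above, training on data up to 2018 and testing on partial sequences, can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' pipeline: the `Record` fields, the function names, and the fragment length and step are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One labelled viral sequence (all field names are hypothetical)."""
    accession: str
    sequence: str
    year: int            # year the virus was identified
    infects_human: bool  # label to be predicted


def temporal_split(records, cutoff_year=2018):
    """Train on records up to and including cutoff_year, test on later ones."""
    train = [r for r in records if r.year <= cutoff_year]
    test = [r for r in records if r.year > cutoff_year]
    return train, test


def fragments(sequence, length, step):
    """Cut a sequence into fixed-length windows to mimic reads or contigs."""
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(sequence) <= length:
        return [sequence]
    return [sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, step)]


if __name__ == "__main__":
    data = [
        Record("A1", "ACGTACGTAC", 2015, True),
        Record("B2", "TTGACCGTAA", 2018, False),
        Record("C3", "GGCATCGATC", 2021, True),
    ]
    train, test = temporal_split(data)
    print([r.accession for r in train], [r.accession for r in test])
    print(fragments(data[0].sequence, length=4, step=3))
```

Splitting by year rather than at random keeps every post-cutoff virus out of training, which is what makes the test a check on prediction for viruses identified later.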

CONCLUSIONS

Our study provides a comprehensive benchmark for viral infectivity prediction models and highlights unresolved issues in fully exploiting machine learning to prepare for future zoonotic threats.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b258/12092720/0d28ae1fce9e/43856_2025_903_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b258/12092720/9ebdc42b7af0/43856_2025_903_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b258/12092720/b0f15a0d7074/43856_2025_903_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b258/12092720/005ba6d28a45/43856_2025_903_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b258/12092720/1146aa51bba4/43856_2025_903_Fig5_HTML.jpg
