Unsupervised identification of significant lineages of SARS-CoV-2 through scalable machine learning methods.

Affiliations

Department of Mathematics, The University of Manchester, Manchester M13 9PL, United Kingdom.

United Kingdom Health Security Agency, University of Oxford, Oxford OX3 7LF, United Kingdom.

Publication information

Proc Natl Acad Sci U S A. 2024 Mar 19;121(12):e2317284121. doi: 10.1073/pnas.2317284121. Epub 2024 Mar 13.

DOI: 10.1073/pnas.2317284121
PMID: 38478692
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10962941/

Abstract

Since its emergence in late 2019, SARS-CoV-2 has diversified into a large number of lineages and caused multiple waves of infection globally. Novel lineages have the potential to spread rapidly and internationally if they have higher intrinsic transmissibility and/or can evade host immune responses, as has been seen with the Alpha, Delta, and Omicron variants of concern. They can also cause increased mortality and morbidity if they have increased virulence, as was seen for Alpha and Delta. Phylogenetic methods provide the "gold standard" for representing the global diversity of SARS-CoV-2 and to identify newly emerging lineages. However, these methods are computationally expensive, struggle when datasets get too large, and require manual curation to designate new lineages. These challenges provide a motivation to develop complementary methods that can incorporate all of the genetic data available without down-sampling to extract meaningful information rapidly and with minimal curation. In this paper, we demonstrate the utility of using algorithmic approaches based on word-statistics to represent whole sequences, bringing speed, scalability, and interpretability to the construction of genetic topologies. While not serving as a substitute for current phylogenetic analyses, the proposed methods can be used as a complementary, and fully automatable, approach to identify and confirm new emerging variants.
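
As a rough illustration of what a word-statistics representation of a whole sequence can look like, the sketch below maps each sequence to a vector of normalised k-mer ("word") frequencies and compares sequences by Euclidean distance. This is an assumption made for illustration only: the abstract does not specify the word length, the normalisation, or the distance used, and the function names `kmer_vector` and `euclidean` and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product
import math


def kmer_vector(seq, k=3, alphabet="ACGT"):
    """Return normalised k-mer frequencies of seq in a fixed word order.

    Illustrative only: k, the alphabet, and the normalisation are assumptions,
    not the settings used in the paper.
    """
    # Fixed ordering of all possible words so vectors are comparable.
    words = ["".join(p) for p in product(alphabet, repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Words containing ambiguous bases (for example N) are not in `words`
    # and are therefore ignored in the total.
    total = sum(counts[w] for w in words)
    if total == 0:
        return [0.0] * len(words)
    return [counts[w] / total for w in words]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Toy sequences (hypothetical, far shorter than a real genome).
ref = "ACGTACGTACGTACGT"
near = "ACGTACGTACGAACGT"   # one substitution relative to ref
far = "TTTTGGGGCCCCAAAA"    # unrelated composition

d_near = euclidean(kmer_vector(ref), kmer_vector(near))
d_far = euclidean(kmer_vector(ref), kmer_vector(far))
print(d_near < d_far)
```

Because every sequence becomes a fixed-length vector computed in a single pass, a representation of this kind needs no alignment and scales linearly with sequence length, which is the sort of property the abstract points to when it mentions speed and scalability.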

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1e32/10962941/87b3cb1240cc/pnas.2317284121fig01.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1e32/10962941/a2946083f72a/pnas.2317284121fig02.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1e32/10962941/51fcda877197/pnas.2317284121fig03.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1e32/10962941/a08d9abe1e99/pnas.2317284121fig04.jpg

Similar articles

1
Unsupervised identification of significant lineages of SARS-CoV-2 through scalable machine learning methods.
Proc Natl Acad Sci U S A. 2024 Mar 19;121(12):e2317284121. doi: 10.1073/pnas.2317284121. Epub 2024 Mar 13.
2
Emerging Variants of SARS-CoV-2 and Novel Therapeutics Against Coronavirus (COVID-19)
3
Evolutionary and Phylogenetic Dynamics of SARS-CoV-2 Variants: A Genetic Comparative Study of Taiyuan and Wuhan Cities of China.
Viruses. 2024 Jun 3;16(6):907. doi: 10.3390/v16060907.
4
Phenotyping the virulence of SARS-CoV-2 variants in hamsters by digital pathology and machine learning.
PLoS Pathog. 2023 Nov 7;19(11):e1011589. doi: 10.1371/journal.ppat.1011589. eCollection 2023 Nov.
5
Unraveling the Dynamics of Omicron (BA.1, BA.2, and BA.5) Waves and Emergence of the Deltacton Variant: Genomic Epidemiology of the SARS-CoV-2 Epidemic in Cyprus (Oct 2021-Oct 2022).
Viruses. 2023 Sep 15;15(9):1933. doi: 10.3390/v15091933.
6
Taxonium, a web-based tool for exploring large phylogenetic trees.
Elife. 2022 Nov 15;11:e82392. doi: 10.7554/eLife.82392.
7
Prospective clinical performance of CoVarScan in identifying SARS-CoV-2 Omicron subvariants.
Microbiol Spectr. 2025 Jan 7;13(1):e0138524. doi: 10.1128/spectrum.01385-24. Epub 2024 Dec 11.
8
Emergency SARS-CoV-2 Variants of Concern: Novel Multiplex Real-Time RT-PCR Assay for Rapid Detection and Surveillance.
Microbiol Spectr. 2022 Feb 23;10(1):e0251321. doi: 10.1128/spectrum.02513-21.
9
Wastewater-Based Epidemiology to Describe the Evolution of SARS-CoV-2 in the South-East of Spain, and Application of Phylogenetic Analysis and a Machine Learning Approach.
Viruses. 2023 Jul 3;15(7):1499. doi: 10.3390/v15071499.
10
Rapid spread of the SARS-CoV-2 Omicron XDR lineage derived from recombination between XBB and BA.2.86 subvariants circulating in Brazil in late 2023.
Microbiol Spectr. 2025 Jan 7;13(1):e0119324. doi: 10.1128/spectrum.01193-24. Epub 2024 Nov 29.

Cited by

1
Machine Learning and Artificial Intelligence for Infectious Disease Surveillance, Diagnosis, and Prognosis.
Viruses. 2025 Jun 23;17(7):882. doi: 10.3390/v17070882.
2
SARS-CoV-2 epidemiology, kinetics, and evolution: A narrative review.
Virulence. 2025 Dec;16(1):2480633. doi: 10.1080/21505594.2025.2480633. Epub 2025 Apr 8.
