
Unsupervised clustering analysis of SARS-Cov-2 population structure reveals six major subtypes at early stage across the world.

Authors

Li Yawei, Liu Qingyun, Zeng Zexian, Luo Yuan

Affiliations

Department of Preventive Medicine, Northwestern University, Feinberg School of Medicine, Chicago, IL 60611, USA.

Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Boston, MA 02115, USA.

Publication information

bioRxiv. 2021 Nov 24:2020.09.04.283358. doi: 10.1101/2020.09.04.283358.

DOI: 10.1101/2020.09.04.283358
PMID: 34845455
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8629198/
Abstract

Identifying the population structure of the newly emerged coronavirus SARS-CoV-2 has significant potential to inform public health management and diagnosis. As SARS-CoV-2 sequencing data accrued, grouping them into clusters is important for organizing the landscape of the population structure of the virus. Due to the limited prior information on the newly emerged coronavirus, we utilized four different clustering algorithms to group 16,873 SARS-CoV-2 strains, which automatically enables the identification of spatial structure for SARS-CoV-2. A total of six distinct genomic clusters were identified using mutation profiles as input features. Comparison of the clustering results reveals that the four algorithms produced highly consistent results, but the state-of-the-art unsupervised deep learning clustering algorithm performed best and produced the smallest intra-cluster pairwise genetic distances. The varied proportions of the six clusters within different continents revealed specific geographical distributions. In particular, our analysis found that Oceania was the only continent on which the strains were dispersively distributed into six clusters. In summary, this study provides a concrete framework for the use of clustering methods to study the global population structure of SARS-CoV-2. In addition, clustering methods can be used for future studies of variant population structures in specific regions of these fast-growing viruses.
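The abstract compares clusterings by their intra-cluster pairwise genetic distances, with smaller values indicating tighter clusters. The sketch below illustrates that comparison on toy data. It assumes binary mutation profiles (1 = mutation present at a site) and uses Hamming distance as the genetic distance; the abstract does not state the distance measure the authors used, and the function names and data here are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of sites at which two equal-length mutation profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_intra_cluster_distance(profiles, labels):
    """Average pairwise Hamming distance over all pairs sharing a cluster label."""
    clusters = {}
    for profile, label in zip(profiles, labels):
        clusters.setdefault(label, []).append(profile)
    distances = [
        hamming(a, b)
        for members in clusters.values()
        for a, b in combinations(members, 2)
    ]
    # Clusters of size one contribute no pairs; return 0.0 if no pairs exist.
    return sum(distances) / len(distances) if distances else 0.0


# Toy binary mutation profiles (illustrative only, not real sequence data).
profiles = [
    (1, 1, 0, 0),
    (1, 1, 0, 1),
    (0, 0, 1, 1),
    (0, 0, 1, 0),
]

# Two hypothetical cluster assignments for the same four strains.
good_labels = [0, 0, 1, 1]  # groups similar profiles together
poor_labels = [0, 1, 0, 1]  # groups dissimilar profiles together

good_score = mean_intra_cluster_distance(profiles, good_labels)
poor_score = mean_intra_cluster_distance(profiles, poor_labels)
print(good_score, poor_score)
```

Under this criterion the first assignment would be preferred, since its strains lie closer to their cluster mates. The same function could score the output of any clustering algorithm, as long as each yields one label per strain.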


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a1d/8629198/26fb37132099/nihpp-2020.09.04.283358v4-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a1d/8629198/688e26e940b9/nihpp-2020.09.04.283358v4-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a1d/8629198/6e069585efca/nihpp-2020.09.04.283358v4-f0002.jpg
