Quasispecies Analysis of SARS-CoV-2 of 15 Different Lineages during the First Year of the Pandemic Prompts Scratching under the Surface of Consensus Genome Sequences.

Affiliations

IHU Méditerranée Infection, 19-21 Boulevard Jean Moulin, 13005 Marseille, France.

Microbes Evolution Phylogeny and Infections (MEPHI), Institut de Recherche pour le Développement (IRD), Aix-Marseille University, 27 Boulevard Jean Moulin, 13005 Marseille, France.

Publication information

Int J Mol Sci. 2022 Dec 10;23(24):15658. doi: 10.3390/ijms232415658.

Abstract

The vast majority of SARS-CoV-2 genomic data have so far neglected intra-host genetic diversity. Here, we studied SARS-CoV-2 quasispecies based on data generated by next-generation sequencing (NGS) of complete genomes. SARS-CoV-2 raw NGS data had been generated for nasopharyngeal samples collected between March 2020 and February 2021 by the Illumina technology on a MiSeq instrument, without prior PCR amplification. To analyze viral quasispecies, we designed and implemented an in-house Excel file (“QuasiS”) that characterizes intra-sample nucleotide diversity along the genomes using NGS read mapping data. We compared intra-sample genetic diversity with the global genetic diversity available from Nextstrain. Hierarchical clustering of all samples based on intra-sample genetic diversity was performed and visualized with the Morpheus web application. We analyzed NGS mapping data from 110 SARS-CoV-2-positive respiratory samples, characterized by a mean depth of 169 NGS reads/nucleotide position, whose consensus genomes had been classified into 15 viral lineages. Mean intra-sample nucleotide diversity was 0.21 ± 0.65%, and 5357 positions (17.9%) exhibited significant (>4%) diversity; 1730 of them (5.8%) did so in ≥2 genomes. ORF10, spike, and N genes had the highest number of positions exhibiting diversity (0.56%, 0.34%, and 0.24%, respectively). Nine hot spots of intra-sample diversity were identified in the SARS-CoV-2 NSP6, NSP12, ORF8, and N genes. Hierarchical clustering delineated a set of six genomes of different lineages characterized by 920 positions exhibiting intra-sample diversity. In addition, 118 nucleotide positions (0.4%) exhibited diversity at both intra- and inter-patient levels. Overall, the present study illustrates that SARS-CoV-2 consensus genome sequences are only an incomplete and imperfect representation of the entire viral population infecting a patient, and that quasispecies analysis may allow viral evolutionary pathways to be deciphered more accurately.
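
As an illustration only, per-position intra-sample diversity can be sketched from read-mapping base counts. The abstract does not give the formula used in QuasiS, so the definition below (the fraction of reads at a position that do not carry the majority base) is an assumption. The names `position_diversity` and `diverse_positions` and the toy pileup counts are hypothetical; only the >4% cutoff and the mean depth of 169 reads come from the abstract.

```python
# Minimal sketch, NOT the QuasiS implementation (QuasiS is an Excel file
# whose exact formula is not given in the abstract).
# ASSUMPTION: diversity at a position = share of reads that differ from
# the majority base at that position.

DIVERSITY_THRESHOLD = 0.04  # the ">4%" cutoff quoted in the abstract


def position_diversity(base_counts):
    """Return the fraction of reads not carrying the majority base."""
    depth = sum(base_counts.values())
    if depth == 0:
        return 0.0  # no coverage: report no measurable diversity
    return 1.0 - max(base_counts.values()) / depth


def diverse_positions(pileup, threshold=DIVERSITY_THRESHOLD):
    """Map each position whose diversity exceeds the threshold to its value."""
    flagged = {}
    for position, base_counts in pileup.items():
        diversity = position_diversity(base_counts)
        if diversity > threshold:
            flagged[position] = diversity
    return flagged


# Hypothetical base counts at three genome positions, each with a depth
# of 169 reads (the mean depth reported in the abstract).
toy_pileup = {
    241: {"C": 2, "T": 167},    # 2 of 169 minority reads: below the cutoff
    3037: {"C": 20, "T": 149},  # 20 of 169 minority reads: above the cutoff
    23403: {"G": 169},          # no minority reads
}

flagged = diverse_positions(toy_pileup)
for position, diversity in sorted(flagged.items()):
    print(f"position {position}: {diversity:.1%} intra-sample diversity")
```

Under this assumed definition, a consensus sequence would record only the majority base at each position, which is why positions such as the second toy example would be invisible in consensus-level data.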


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f693/9779826/989ef95ceeee/ijms-23-15658-g001.jpg
