Chamblee Charter High School, Chamblee, GA 30341, USA.
AdventHealth Cancer Institute, Orlando, FL 32804, USA.
Gene. 2022 Oct 20;841:146774. doi: 10.1016/j.gene.2022.146774. Epub 2022 Jul 26.
COVID-19 is caused by the novel coronavirus SARS-CoV-2, which first emerged in China. It spread rapidly throughout the world and was later declared a pandemic by the WHO. Over time, SARS-CoV-2 has accumulated mutations that confer survival advantages, giving rise to multiple variants. Multiple studies identifying mutations in SARS-CoV-2 have been published, covering extensive sample areas. The purpose of this study was to limit the sample area to the U.S. state of Georgia and to analyze the genome sequences for mutation profiling across the genome and for the origin of variants.
The genome sequences (n = 3,970) were obtained from the NCBI database as of June 12, 2021, filtered to completely sequenced genomes with a Homo sapiens host, collected only in the U.S. state of Georgia. Nextclade, an online tool, was used to analyze the sequences with Wuhan-Hu-1/2019 as the reference genome. Its workflow comprises sequence alignment, translation, mutation calling, phylogenetic placement, clade assignment, and quality control (QC). Thirty-six samples with bad QC were removed from the mutational analysis.
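The mutation-calling step can be illustrated with a minimal Python sketch. This is not Nextclade's implementation. It assumes the query has already been aligned to the reference so that both strings share reference coordinates (no insertions), and the function name `call_substitutions` is hypothetical. It reports substitutions in the same REF-position-ALT notation used in this abstract (for example, A23403G).

```python
def call_substitutions(reference: str, aligned_query: str) -> list[str]:
    """Return nucleotide substitutions as REF + 1-based position + ALT.

    Assumption: aligned_query is in reference coordinates, with '-' marking
    deleted positions and 'N' marking ambiguous bases.
    """
    if len(reference) != len(aligned_query):
        raise ValueError("sequences must be aligned to the same length")

    calls = []
    pairs = zip(reference.upper(), aligned_query.upper())
    for position, (ref_base, alt_base) in enumerate(pairs, start=1):
        # Gaps and ambiguous bases are not counted as substitutions.
        if alt_base in ("-", "N"):
            continue
        if ref_base != alt_base:
            calls.append(f"{ref_base}{position}{alt_base}")
    return calls


if __name__ == "__main__":
    # Toy sequences for illustration only, not real SARS-CoV-2 data.
    print(call_substitutions("ACGTAC", "ACATNC"))
```

Skipping gaps and ambiguous bases keeps low-coverage positions from inflating the substitution count; deletions would need to be tallied separately.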
A total of 117,743 nucleotide mutations were identified (averaging 31.5 mutations per sample). The mutations A23403G, C3037T, C241T, and C14408T were detected in 98% of the samples. In addition, a total of 75,517 amino acid mutations were identified (averaging 20.2 mutations per sample). The mutations D614G and P314L were identified in >97% of the samples, whereas R203K, G204R, P681H, and N501Y were detected in >50% of the samples. The analysis also revealed 16 different clades, with 20I (49.6%), 20G (24.2%), and 20A (5.5%) being the most abundant, which showed that SARS-CoV-2 in Georgia originated mainly from Southeast England, other parts of the U.S., and several countries in Western Europe.
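Per-mutation prevalence and clade shares of this kind can be tallied from a per-sample results table. The sketch below is illustrative rather than the authors' pipeline. It assumes each row is a dict with `clade`, `substitutions` (comma-separated), and `qc.overallStatus` fields, modeled on Nextclade's tabular output; the column names and the `summarize` helper are assumptions and should be checked against the actual export.

```python
from collections import Counter


def summarize(rows, qc_column="qc.overallStatus", bad_status="bad"):
    """Drop bad-QC samples, then compute mutation prevalence and clade shares."""
    kept = [row for row in rows if row.get(qc_column) != bad_status]
    n_samples = len(kept)
    if n_samples == 0:
        raise ValueError("no samples left after QC filtering")

    mutation_counts = Counter()
    clade_counts = Counter()
    total_substitutions = 0
    for row in kept:
        clade_counts[row["clade"]] += 1
        substitutions = [s for s in row.get("substitutions", "").split(",") if s]
        total_substitutions += len(substitutions)
        # A set ensures each mutation is counted once per sample.
        mutation_counts.update(set(substitutions))

    return {
        "n_samples": n_samples,
        "mean_substitutions": total_substitutions / n_samples,
        "mutation_prevalence": {m: c / n_samples for m, c in mutation_counts.items()},
        "clade_share": {k: c / n_samples for k, c in clade_counts.items()},
    }


if __name__ == "__main__":
    # Toy rows for illustration only, not the study data.
    toy_rows = [
        {"clade": "20I", "qc.overallStatus": "good", "substitutions": "C241T,A23403G"},
        {"clade": "20G", "qc.overallStatus": "good", "substitutions": "A23403G"},
        {"clade": "20I", "qc.overallStatus": "bad", "substitutions": "C241T"},
    ]
    print(summarize(toy_rows))
```

With real data, the rows could be read with `csv.DictReader` from the exported table, using whichever delimiter the export uses.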
By examining the three most common variants in the U.S. state of Georgia, we could determine the primary locations of transmission or origin of the virus. Our analyses indicate that the majority of cases originated from Southeast England (Clade 20I), the U.S. itself (Clade 20G), and Western Europe (Clade 20C).