Department of Medical Genetics, National Institute for Genetic Engineering and Biotechnology, Tehran, Iran.
Department of Quantitative Health Sciences, John A. Burns School of Medicine, University of Hawaii at Manoa, Honolulu, HI, 96813, USA.
J Transl Med. 2023 Feb 25;21(1):152. doi: 10.1186/s12967-023-03996-w.
BACKGROUND: At the end of December 2019, coronavirus disease 2019 (COVID-19), caused by the novel Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), was identified in Wuhan, a city in central China, and subsequently spread worldwide. As of October 8, 2022, the total number of COVID-19 cases had exceeded 621 million worldwide, with more than 6.56 million confirmed deaths. Because SARS-CoV-2 genome sequences change through mutation and recombination, surveillance of emerging variants and monitoring of these changes are pivotal for improving pandemic management.
METHODS: A total of 10,287,271 SARS-CoV-2 genome sequence samples were downloaded in FASTA format from the GISAID database from February 24, 2020, to April 2022. Python (version 3.8.0) was used to process the FASTA files to identify variants and sequence conservation. The NCBI RefSeq SARS-CoV-2 genome (accession no. NC_045512.2) served as the reference sequence.
RESULTS: Six mutations had a frequency above 50% in global SARS-CoV-2 sequences: P323L (99.3%) in NSP12, D614G (97.6%) in S, T492I (70.4%) in NSP4, R203M (62.8%) in N, T60A (61.4%) in Orf9b, and P1228L (50.0%) in NSP3. No mutation was observed in more than 90% of the nsp11, nsp7, nsp10, nsp9, nsp8, and nsp16 regions. In contrast, N, nsp3, S, nsp4, nsp12, and M had the highest mutation rates. In the S protein, the highest mutation frequencies were observed in aa 508-635 (0.77%) and aa 381-508 (0.43%). In the M, E, and N proteins, the highest mutation frequencies were observed in aa 66-88 (2.19%), aa 7-14, and aa 164-246 (2.92%), respectively.
CONCLUSION: Monitoring SARS-CoV-2 proteomic changes and detecting mutation hot spots and conserved regions could therefore help improve SARS-CoV-2 diagnostic efficiency and guide the design of safe and effective vaccines against emerging variants.
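The abstract does not describe the authors' pipeline beyond "Python processing of FASTA files against a reference", so the following is only a minimal sketch of how per-mutation frequencies of the kind reported (for example, P323L at 99.3%) might be computed. It assumes protein sequences that are already aligned to the reference and of equal length; the function names `read_fasta` and `substitution_frequencies` and the toy sequences are hypothetical and are not taken from the study.

```python
from collections import Counter
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def substitution_frequencies(reference, samples):
    """Return {mutation: percent of samples} relative to the reference.

    Assumption: each sample is pre-aligned to the reference and has the
    same length. Samples of a different length are skipped, and the
    ambiguous/gap symbols 'X' and '-' are not counted as substitutions.
    """
    counts = Counter()
    total = 0
    for seq in samples:
        if len(seq) != len(reference):
            continue  # not comparable position by position
        total += 1
        for pos, (ref_aa, aa) in enumerate(zip(reference, seq), start=1):
            if aa != ref_aa and aa not in "X-":
                counts[f"{ref_aa}{pos}{aa}"] += 1
    if total == 0:
        return {}
    return {mut: 100.0 * n / total for mut, n in counts.items()}


# Toy data (hypothetical): a 4-residue "reference" and four samples,
# the last of which is too short and is therefore skipped.
reference = "MPDT"
fasta_text = ">s1\nMPDT\n>s2\nMLGT\n>s3\nMLDT\n>s4\nMLD\n"
samples = [seq for _, seq in read_fasta(StringIO(fasta_text))]
freqs = substitution_frequencies(reference, samples)
```

Here `freqs` maps labels in the same `<reference residue><position><new residue>` notation used in the RESULTS to the percentage of usable samples carrying that substitution. A real analysis at the scale of millions of genomes would additionally need alignment, translation of nucleotide sequences, and quality filtering, none of which this sketch attempts.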