Wu Jiaqi, Kryukov Kirill, Takeuchi Junko S, Nakagawa So
Department of Molecular Life Science, Tokai University School of Medicine, Isehara, Kanagawa, Japan.
Bioinformation and DDBJ Center, National Institute of Genetics, Mishima, Shizuoka, Japan.
Heliyon. 2025 Feb 12;11(4):e42613. doi: 10.1016/j.heliyon.2025.e42613. eCollection 2025 Feb 28.
Given the pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), continuous analysis of its genomic variations at the nucleotide level is imperative to monitor the emergence of novel variants of concern. The Global Initiative on Sharing All Influenza Data (GISAID) serves as the standard database for the genomic information of SARS-CoV-2. However, limitations of its data-sharing policy hinder the comprehensive analysis of genomic variations. To address this problem, we developed SGV-caller, a bioinformatics pipeline for analyzing the frequently updated GISAID database. SGV-caller compares input datasets with pre-existing databases and generates local databases encompassing nucleotide, amino acid, and codon-level genomic variations for each SARS-CoV-2 genome. Furthermore, SGV-caller accommodates SARS-CoV-2 genomes from non-GISAID sources as well as other viral genomes. SGV-caller source code and test data are available at https://github.com/wujiaqi06/SGV-caller.
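To make the three levels of variation concrete, the following is a minimal Python sketch of nucleotide-, codon-, and amino acid-level calling against a reference. It is not SGV-caller's implementation: the function name `call_variations`, its arguments, and the toy sequences are hypothetical. It assumes the query is already aligned to the reference with no gaps and that a single coding region is given by 0-based, half-open coordinates.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), built in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def call_variations(ref, query, cds_start, cds_end):
    """Hypothetical helper: list differences between two aligned sequences.

    Assumes `ref` and `query` have equal length and contain no gaps, and that
    [cds_start, cds_end) is one coding region whose length is a multiple of 3.
    Positions in the results are 1-based.
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    if (cds_end - cds_start) % 3 != 0:
        raise ValueError("coding region length must be a multiple of 3")

    # Nucleotide level: every site where the two sequences differ.
    nucleotide = [(i + 1, r, q)
                  for i, (r, q) in enumerate(zip(ref, query)) if r != q]

    codon, amino_acid = [], []
    for pos in range(cds_start, cds_end, 3):
        ref_codon, query_codon = ref[pos:pos + 3], query[pos:pos + 3]
        if ref_codon == query_codon:
            continue
        number = (pos - cds_start) // 3 + 1
        # Codon level: any change within the triplet, synonymous or not.
        codon.append((number, ref_codon, query_codon))
        # Amino acid level: only changes that alter the encoded residue.
        # Codons with ambiguous bases translate to "X".
        ref_aa = CODON_TABLE.get(ref_codon, "X")
        query_aa = CODON_TABLE.get(query_codon, "X")
        if ref_aa != query_aa:
            amino_acid.append((number, ref_aa, query_aa))
    return nucleotide, codon, amino_acid


# Toy sequences (not real SARS-CoV-2 data).
nucleotide, codon, amino_acid = call_variations("ATGGATTTT", "ATGGGTTTC", 0, 9)
print(nucleotide)
print(codon)
print(amino_acid)
```

In this toy case the second codon change alters the amino acid while the third is synonymous, so it is reported at the nucleotide and codon levels only. A real pipeline would additionally need to handle alignment, insertions and deletions, and multiple genes.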