Nasir Jalees A, Maguire Finlay, Smith Kendrick M, Panousis Emily M, Baker Sheridan J C, Aftanas Patryk, Raphenya Amogelang R, Alcock Brian P, Maan Hassaan, Knox Natalie C, Banerjee Arinjay, Mossman Karen, Wang Bo, Simpson Jared T, Kozak Robert A, Mubareka Samira, McArthur Andrew G
M.G. DeGroote Institute for Infectious Disease Research, McMaster University, 1280 Main Street West, Hamilton, Ontario, L8S 4K1, Canada.
Department of Biochemistry and Biomedical Sciences, McMaster University, 1280 Main Street West, Hamilton, Ontario, L8S 4K1, Canada.
NAR Genom Bioinform. 2024 Dec 18;6(4):lqae176. doi: 10.1093/nargab/lqae176. eCollection 2024 Dec.
The incorporation of sequencing technologies into frontline healthcare and public health settings was vital to developing virus surveillance programs during the Coronavirus Disease 2019 (COVID-19) pandemic, caused by transmission of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). However, increased data acquisition poses challenges for analyses that must be both rapid and accurate. To overcome these hurdles, we developed the SARS-CoV-2 Illumina GeNome Assembly Line (SIGNAL) for quick bulk analyses of Illumina short-read sequencing data. SIGNAL is a Snakemake workflow that manages parallel tasks to process large volumes of sequencing data. It generates a series of outputs, including consensus genomes, variant calls, lineage assessments and identified variants of concern (VOCs). Compared to other existing SARS-CoV-2 sequencing workflows, SIGNAL is one of the fastest-performing analysis tools while maintaining high accuracy. The source code is publicly available (github.com/jaleezyy/covid-19-signal) and is optimized to run on various systems, with software compatibility and resource management handled within the workflow. Overall, SIGNAL has demonstrated its capacity for high-volume analyses through several contributions to publicly funded government public health surveillance programs. It can be a valuable tool for continuing SARS-CoV-2 Illumina sequencing efforts and will inform the development of similar strategies for rapid viral sequence assessment.