Colquhoun Rachel, O'Toole Áine, Hill Verity, McCrone J T, Yu Xiaoyu, Nicholls Samuel M, Poplawski Radoslaw, Whalley Thomas, Groves Natalie, Ellaby Nicholas, Loman Nick, Connor Tom, Rambaut Andrew
Institute of Ecology and Evolution, University of Edinburgh, Ashworth Laboratories, Charlotte Auerbach Rd, Edinburgh EH9 3FL, United Kingdom.
Department of Epidemiology of Microbial Diseases, Yale School of Public Health, 60 College St, New Haven, CT 06510, United States.
Virus Evol. 2024 Oct 17;10(1):veae083. doi: 10.1093/ve/veae083. eCollection 2024.
In response to the escalating SARS-CoV-2 pandemic, in March 2020 the COVID-19 Genomics UK (COG-UK) consortium was established to enable national-scale genomic surveillance in the UK. By the end of 2020, 49% of all SARS-CoV-2 genome sequences globally had been generated as part of the COG-UK programme, and to date, this system has generated >3 million SARS-CoV-2 genomes. Rapidly and reliably analysing this unprecedented number of genomes was an enormous challenge. To fulfil this need and to inform public health decision-making, we developed a centralized pipeline that performs quality control, alignment, and variant calling and provides the global phylogenetic context of sequences. We present this pipeline and describe how we tailored it as the pandemic progressed to scale with the increasing amounts of data and to provide the most relevant analyses on a daily basis.