Gill Erin E, Jia Baofeng, Murall Carmen Lia, Poujol Raphaël, Anwar Muhammad Zohaib, John Nithu Sara, Richardsson Justin, Hobb Ashley, Olabode Abayomi S, Lepsa Alexandru, Duggan Ana T, Tyler Andrea D, N'Guessan Arnaud, Kachru Atul, Chan Brandon, Yoshida Catherine, Yung Christina K, Bujold David, Andric Dusan, Su Edmund, Griffiths Emma J, Van Domselaar Gary, Jolly Gordon W, Ward Heather K E, Feher Henrich, Baker Jared, Simpson Jared T, Uddin Jaser, Ragoussis Jiannis, Eubank Jon, Fritz Jörg H, Gálvez José Héctor, Fang Karen, Cullion Kim, Rivera Leonardo, Xiang Linda, Croxen Matthew A, Shiell Mitchell, Prystajecky Natalie, Quirion Pierre-Olivier, Bajari Rosita, Rich Samantha, Mubareka Samira, Moreira Sandrine, Cain Scott, Sutcliffe Steven G, Kraemer Susanne A, Joly Yann, Alturmessov Yelizar, CPHLN Consortium, CanCOGeN Consortium, Fiume Marc, Snutch Terrance P, Bell Cindy, Lopez-Correa Catalina, Hussin Julie G, Joy Jeffrey B, Colijn Caroline, Gordon Paul M K, Hsiao William W L, Poon Art F Y, Knox Natalie C, Courtot Mélanie, Stein Lincoln, Otto Sarah P, Bourque Guillaume, Shapiro B Jesse, Brinkman Fiona S L
Department of Molecular Biology and Biochemistry, Simon Fraser University, Burnaby, BC, Canada.
Department of Microbiology and Immunology, McGill University, Montreal, QC, Canada.
ArXiv. 2024 May 8:arXiv:2405.04734v1.
The COVID-19 pandemic led to a large global effort to sequence SARS-CoV-2 genomes from patient samples to track viral evolution and inform the public health response. Millions of SARS-CoV-2 genome sequences have been deposited in global public repositories. The Canadian COVID-19 Genomics Network (CanCOGeN - VirusSeq), a consortium tasked with coordinating expanded sequencing of SARS-CoV-2 genomes across Canada early in the pandemic, created the Canadian VirusSeq Data Portal, with associated data pipelines and procedures, to support these efforts. The goal of VirusSeq was to allow open access to Canadian SARS-CoV-2 genomic sequences and to enhanced, standardized contextual data that were unavailable in other repositories and that meet the FAIR standards (Findable, Accessible, Interoperable and Reusable). In addition, the Portal's data submission pipeline includes data quality checks and appropriately acknowledges data generators, which encourages collaboration. From inception to execution, the Portal was developed with a conscientious focus on strong data governance principles and practices. Extensive efforts ensured adherence to Canadian privacy laws, data security standards, and organizational processes. The Portal has been coupled with other resources such as Viral AI and was further leveraged by the Coronavirus Variants Rapid Response Network (CoVaRR-Net) to produce a suite of continually updated analytical tools and notebooks. Here we highlight the Portal, including its contextual data not available elsewhere, and 'Duotang', a web platform that presents key genomic epidemiology and modeling analyses of circulating and emerging SARS-CoV-2 variants in Canada. Duotang presents dynamic changes in the variant composition of SARS-CoV-2 in Canada and by province, estimates variant growth, and displays complementary interactive visualizations alongside a text overview of the current situation.
The VirusSeq Data Portal and Duotang resources, alongside additional analyses and resources computed from the Portal (COVID-MVP, CoVizu), are all open-source and freely available. Together, they provide an updated picture of SARS-CoV-2 evolution to spur scientific discussions, inform public discourse, and support communication with and within public health authorities. They also serve as a framework for other jurisdictions interested in open, collaborative sequence data sharing and analyses.
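As an illustration of what estimating variant growth can involve, the following is a minimal sketch under stated assumptions; it is not Duotang's actual method, which the abstract does not describe. It assumes that the log-ratio of variant to reference sequence counts changes linearly over time, and takes the least-squares slope as the relative growth rate per day. The function name `estimate_relative_growth` and the example counts are hypothetical and invented for illustration only.

```python
import math


def estimate_relative_growth(days, variant_counts, reference_counts):
    """Return the least-squares slope of log(variant/reference) over time.

    Assumption: the log-ratio of counts changes linearly with time, so the
    slope approximates the variant's growth advantage per day relative to
    the reference lineage.
    """
    points = [
        (t, math.log(v / r))
        for t, v, r in zip(days, variant_counts, reference_counts)
        if v > 0 and r > 0  # the log-ratio is undefined if either count is zero
    ]
    if len(points) < 2:
        raise ValueError("need at least two time points with non-zero counts")

    mean_t = sum(t for t, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    return sxy / sxx


# Invented illustrative counts (not real surveillance data): the variant
# doubles every 7 days while the reference count stays constant.
days = [0, 7, 14, 21]
variant = [10, 20, 40, 80]
reference = [100, 100, 100, 100]
growth = estimate_relative_growth(days, variant, reference)
print(f"relative growth per day: {growth:.4f}")
```

A plain least-squares fit ignores sampling noise in the counts; a real analysis would more likely fit a binomial or multinomial model to the counts and report uncertainty intervals.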