Validation of a Bioinformatics Workflow for Routine Analysis of Whole-Genome Sequencing Data and Related Challenges for Pathogen Typing in a European National Reference Center: Neisseria meningitidis as a Proof-of-Concept.

Author information

Bogaerts Bert, Winand Raf, Fu Qiang, Van Braekel Julien, Ceyssens Pieter-Jan, Mattheus Wesley, Bertrand Sophie, De Keersmaecker Sigrid C J, Roosens Nancy H C, Vanneste Kevin

Affiliations

Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium.

Bacterial Diseases, Sciensano, Brussels, Belgium.

Publication information

Front Microbiol. 2019 Mar 6;10:362. doi: 10.3389/fmicb.2019.00362. eCollection 2019.

DOI:10.3389/fmicb.2019.00362

PMID:30894839

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6414443/

Abstract

Despite being a well-established research method, the use of whole-genome sequencing (WGS) for routine molecular typing and pathogen characterization remains a substantial challenge due to the required bioinformatics resources and/or expertise. Moreover, many national reference laboratories and centers, as well as other laboratories working under a quality system, require extensive validation to demonstrate that employed methods are "fit-for-purpose" and provide high-quality results. A harmonized framework with guidelines for the validation of WGS workflows does currently, however, not exist yet, despite several recent case studies highlighting the urgent need thereof. We present a validation strategy focusing specifically on the exhaustive characterization of the bioinformatics analysis of a WGS workflow designed to replace conventionally employed molecular typing methods for microbial isolates in a representative small-scale laboratory, using the pathogen Neisseria meningitidis as a proof-of-concept. We adapted several classically employed performance metrics specifically toward three different bioinformatics assays: resistance gene characterization (based on the ARG-ANNOT, ResFinder, CARD, and NDARO databases), several commonly employed typing schemas (including, among others, core genome multilocus sequence typing), and serogroup determination. We analyzed a core validation dataset of 67 well-characterized samples typed by means of classical genotypic and/or phenotypic methods that were sequenced in-house, allowing to evaluate repeatability, reproducibility, accuracy, precision, sensitivity, and specificity of the different bioinformatics assays. We also analyzed an extended validation dataset composed of publicly available WGS data for 64 samples by comparing results of the different bioinformatics assays against results obtained from commonly used bioinformatics tools. 
We demonstrate high performance, with values for all performance metrics >87%, >97%, and >90% for the resistance gene characterization, sequence typing, and serogroup determination assays, respectively, for both validation datasets. Our WGS workflow has been made publicly available as a "push-button" pipeline for Illumina data at https://galaxy.sciensano.be to showcase its implementation for non-profit and/or academic usage. Our validation strategy can be adapted to other WGS workflows for other pathogens of interest and demonstrates the added value and feasibility of employing WGS with the aim of being integrated into routine use in an applied public health setting.
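
The abstract names six performance metrics but does not give their formulas. As a rough illustration only, the sketch below uses the textbook confusion-matrix definitions of accuracy, precision, sensitivity, and specificity, and treats repeatability and reproducibility as simple concordance between two runs. These definitions, the names `Counts`, `performance_metrics`, and `concordance`, and all numbers are assumptions made for illustration; the study adapted its metrics per assay and its exact definitions may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Confusion-matrix counts for one assay (illustrative, not study data)."""
    tp: int  # feature detected by the workflow and present in the reference
    fp: int  # feature detected by the workflow but absent in the reference
    tn: int  # feature not detected and absent in the reference
    fn: int  # feature not detected but present in the reference


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a metric is undefined.
    return numerator / denominator if denominator else float("nan")


def performance_metrics(c: Counts) -> dict[str, float]:
    """Textbook definitions; the paper's assay-specific variants may differ."""
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": _ratio(c.tp, c.tp + c.fp),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
    }


def concordance(run_a: list[str], run_b: list[str]) -> float:
    """Fraction of samples with identical results in two runs.

    Assumption: comparing two runs on the same data approximates
    repeatability, and two independently sequenced runs of the same
    samples approximates reproducibility.
    """
    if len(run_a) != len(run_b):
        raise ValueError("runs must cover the same samples")
    matches = sum(a == b for a, b in zip(run_a, run_b))
    return _ratio(matches, len(run_a))


if __name__ == "__main__":
    example = Counts(tp=90, fp=5, tn=100, fn=5)  # made-up counts
    for name, value in performance_metrics(example).items():
        print(f"{name}: {value:.3f}")
    print(concordance(["B", "C", "W", "Y"], ["B", "C", "W", "X"]))
```

Returning NaN for an undefined ratio keeps a report from failing when an assay has, for example, no negative reference results.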

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/98ab/6414443/84a2b6c4112e/fmicb-10-00362-g001.jpg

Similar articles

Validation strategy of a bioinformatics whole genome sequencing workflow for Shiga toxin-producing Escherichia coli using a reference collection extensively characterized with conventional methods.

Microb Genom. 2021 Mar;7(3). doi: 10.1099/mgen.0.000531. Epub 2021 Mar 3.

A Bioinformatics Whole-Genome Sequencing Workflow for Clinical Mycobacterium tuberculosis Complex Isolate Analysis, Validated Using a Reference Collection Extensively Characterized with Conventional Methods and Approaches.

J Clin Microbiol. 2021 May 19;59(6). doi: 10.1128/JCM.00202-21.

Validation and Implementation of Clinical Laboratory Improvements Act-Compliant Whole-Genome Sequencing in the Public Health Microbiology Laboratory.

J Clin Microbiol. 2017 Aug;55(8):2502-2520. doi: 10.1128/JCM.00361-17. Epub 2017 Jun 7.

Survey on the Use of Whole-Genome Sequencing for Infectious Diseases Surveillance: Rapid Expansion of European National Capacities, 2015-2016.

Front Public Health. 2017 Dec 18;5:347. doi: 10.3389/fpubh.2017.00347. eCollection 2017.

Analytical Performance Validation of Next-Generation Sequencing Based Clinical Microbiology Assays Using a K-mer Analysis Workflow.

Front Microbiol. 2020 Aug 5;11:1883. doi: 10.3389/fmicb.2020.01883. eCollection 2020.

Validation of Whole-Genome Sequencing for Identification and Characterization of Shiga Toxin-Producing Escherichia coli To Produce Standardized Data To Enable Data Sharing.

J Clin Microbiol. 2018 Feb 22;56(3). doi: 10.1128/JCM.01388-17. Print 2018 Mar.

A Practical Bioinformatics Workflow for Routine Analysis of Bacterial WGS Data.

Microorganisms. 2022 Nov 29;10(12):2364. doi: 10.3390/microorganisms10122364.

Whole genome sequencing of Streptococcus pneumoniae: development, evaluation and verification of targets for serogroup and serotype prediction using an automated pipeline.

PeerJ. 2016 Sep 14;4:e2477. doi: 10.7717/peerj.2477. eCollection 2016.

Serotyping Based on Whole-Genome Sequencing Improves the Accuracy of Identification.

Appl Environ Microbiol. 2019 Mar 22;85(7). doi: 10.1128/AEM.00165-19. Print 2019 Apr 1.

Cited by

Formal verification of bioinformatics software using model checking and theorem proving.

Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf383.

Genomics for antimicrobial resistance-progress and future directions.

Antimicrob Agents Chemother. 2025 May 7;69(5):e0108224. doi: 10.1128/aac.01082-24. Epub 2025 Apr 14.

Detection of antimicrobial resistance via state-of-the-art technologies versus conventional methods.

Front Microbiol. 2025 Feb 25;16:1549044. doi: 10.3389/fmicb.2025.1549044. eCollection 2025.

Galaxy @Sciensano: a comprehensive bioinformatics portal for genomics-based microbial typing, characterization, and outbreak detection.

BMC Genomics. 2025 Jan 8;26(1):20. doi: 10.1186/s12864-024-11182-5.

Towards facilitated interpretation of shotgun metagenomics long-read sequencing data analyzed with KMA for the detection of bacterial pathogens and their antimicrobial resistance genes.

Front Microbiol. 2024 Apr 4;15:1336532. doi: 10.3389/fmicb.2024.1336532. eCollection 2024.

Genomic characterization of a WHO critical priority isolate ST2070 harboring OXA-10, KPC-2, and CTX-M-12 recovered from a water irrigation channel in Ecuador.

Heliyon. 2024 Feb 16;10(5):e26379. doi: 10.1016/j.heliyon.2024.e26379. eCollection 2024 Mar 15.

Pipeline validation for the identification of antimicrobial-resistant genes in carbapenem-resistant Klebsiella pneumoniae.

Sci Rep. 2023 Sep 14;13(1):15189. doi: 10.1038/s41598-023-42154-6.

Public health implementation of pathogen genomics: the role for accreditation and application of ISO standards.

Microb Genom. 2023 Aug;9(8). doi: 10.1099/mgen.0.001097.

Genomic insight into isolated from commercial turkey flocks in Germany using whole-genome sequencing analysis.

Front Vet Sci. 2023 Feb 16;10:1092179. doi: 10.3389/fvets.2023.1092179. eCollection 2023.

Genomic insights of harboring by geographical region and a One-Health perspective.

Front Microbiol. 2023 Jan 16;13:1032753. doi: 10.3389/fmicb.2022.1032753. eCollection 2022.

References

The challenges of designing a benchmark strategy for bioinformatics pipelines in the identification of antimicrobial resistance determinants using next generation sequencing technologies.

F1000Res. 2018 Apr 13;7. doi: 10.12688/f1000research.14509.2. eCollection 2018.

Public health surveillance of multidrug-resistant clones of Neisseria gonorrhoeae in Europe: a genomic survey.

Lancet Infect Dis. 2018 Jul;18(7):758-768. doi: 10.1016/S1473-3099(18)30225-1. Epub 2018 May 15.

A Validation Approach of an End-to-End Whole Genome Sequencing Workflow for Source Tracking of and .

Front Microbiol. 2018 Mar 14;9:446. doi: 10.3389/fmicb.2018.00446. eCollection 2018.

Benchmark datasets for phylogenomic pipeline validation, applications for foodborne pathogen surveillance.

PeerJ. 2017 Oct 6;5:e3893. doi: 10.7717/peerj.3893. eCollection 2017.

Survey on the Use of Whole-Genome Sequencing for Infectious Diseases Surveillance: Rapid Expansion of European National Capacities, 2015-2016.

Front Public Health. 2017 Dec 18;5:347. doi: 10.3389/fpubh.2017.00347. eCollection 2017.

Validation of Whole-Genome Sequencing for Identification and Characterization of Shiga Toxin-Producing Escherichia coli To Produce Standardized Data To Enable Data Sharing.

J Clin Microbiol. 2018 Feb 22;56(3). doi: 10.1128/JCM.01388-17. Print 2018 Mar.

A RESTful application programming interface for the PubMLST molecular typing and genome databases.

Database (Oxford). 2017 Jan 1;2017. doi: 10.1093/database/bax060.

Practical issues in implementing whole-genome-sequencing in routine diagnostic microbiology.

Clin Microbiol Infect. 2018 Apr;24(4):355-360. doi: 10.1016/j.cmi.2017.11.001. Epub 2017 Nov 5.

Added Value of Next-Generation Sequencing for Multilocus Sequence Typing Analysis of a Pneumocystis jirovecii Pneumonia Outbreak.

Emerg Infect Dis. 2017 Aug;23(8):1237-1245. doi: 10.3201/eid2308.161295.

The Validation and Implications of Using Whole Genome Sequencing as a Replacement for Traditional Serotyping for a National Reference Laboratory.

Front Microbiol. 2017 Jun 9;8:1044. doi: 10.3389/fmicb.2017.01044. eCollection 2017.
